Vault is a secrets management tool that helps organizations store, access, and manage sensitive data such as API keys, passwords, certificates, and other secrets. It provides a centralized platform for managing secrets across applications and infrastructure.

Configuring Vault as a Source

In the Sources tab, click on the “Add source” button located on the top right of your screen. Then, select the Vault option from the list of connectors. Click Next and you’ll be prompted to add your access.

1. Add account access

You’ll need to provide your Vault instance details and authentication token to access your secrets data. The following configurations are available:
  • Vault Address: The URL of your Vault instance (e.g., https://vault.<your-company>.com.br/). The connector will automatically extract the base URL and use the /v1 API endpoint.
  • Auth Token: Your Vault authentication token (Bearer token) used for API access. This token should have appropriate permissions to read secrets from the KV v2 secrets engine.
Once you’re done, click Next.
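As an illustration of the base URL extraction described above, here is a minimal sketch of how an address could be reduced to its `/v1` API root and paired with a Bearer token header. The function names and the `vault.example.com` host are hypothetical; the connector's actual implementation is not shown in this guide.

```python
from urllib.parse import urlsplit


def vault_api_base(address: str) -> str:
    """Reduce a user-supplied Vault address to scheme://host[:port]/v1."""
    parts = urlsplit(address.strip())
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {address!r}")
    # Any path, query, or fragment in the address is dropped on purpose.
    return f"{parts.scheme}://{parts.netloc}/v1"


def auth_headers(token: str) -> dict[str, str]:
    """Build the header that carries the Vault token as a Bearer token."""
    return {"Authorization": f"Bearer {token}"}


# Hypothetical address that includes a UI path the API does not need.
print(vault_api_base("https://vault.example.com/ui/vault/secrets"))
```

Pasting a URL copied from the Vault UI should therefore still resolve to the same API root as the bare host.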

2. Select streams

Choose which data streams you want to sync. The connector provides three streams:
  • kv_list_secrets: Lists all secret paths and keys recursively from your Vault instance
  • kv_secrets: Retrieves the actual secret data (key-value pairs) for each secret
  • kv_subkeys: Retrieves the subkey structure for each secret (keys only, values are null)
Tip: The stream can be found more easily by typing its name.
Select the streams and click Next.

3. Configure data streams

Customize how you want your data to appear in your catalog. Select the desired layer where the data will be placed, a folder to organize it inside the layer, a name for each table (which will effectively contain the fetched data) and the type of sync.
  • Layer: choose one of the existing layers in your catalog. This is where you will find the extracted tables once the extraction runs successfully.
  • Folder: a folder can be created inside the selected layer to group all tables being created from this new data source.
  • Table name: we suggest a name, but feel free to customize it. You have the option to add a prefix to all tables at once and make this process faster!
  • Sync Type: you can choose between INCREMENTAL and FULL_TABLE.
    • Incremental: every time the extraction happens, we’ll get only the new data - which is good if, for example, you want to keep every record ever fetched.
    • Full table: every time the extraction happens, we’ll get the current state of the data - which is good if, for example, you don’t want to have deleted data in your catalog.
Once you are done configuring, click Next.

4. Configure data source

Describe your data source in up to 140 characters so it is easy to identify within your organization. To define your Trigger, consider how often you want data to be extracted from this source. This decision usually depends on how frequently you need the new table data updated (every day, once a week, or only at specific times). Optionally, you can define some additional settings:
  • Configure Delta Log Retention and determine for how long we should store old states of this table as it gets updated. Read more about this resource here.
  • Determine when to execute an Additional Full Sync. This will complement the incremental data extractions, ensuring that your data is completely synchronized with your source every once in a while.
Once you are ready, click Next to finalize the setup.

5. Check your new source

You can view your new source on the Sources page. If needed, manually trigger the source extraction by clicking on the arrow button. Once executed, your data will appear in your Catalog.
For you to be able to see it on your Catalog, you need at least one successful source run.

Streams and Fields

Below you’ll find all available data streams from Vault and their corresponding fields:

kv_list_secrets
Parent stream that recursively lists all secret paths and keys from your Vault KV v2 secrets engine. This stream traverses folders and subfolders to discover all available secrets.
Key Fields:
  • path - The folder path where the secret or key was found
  • key - The name of the key or secret (folders end with /)
  • is_folder - Boolean indicating whether the key is a folder (true) or an actual secret (false)
How it works:
  • Starts from the root of the KV v2 secrets engine (secret/)
  • Recursively traverses all folders and subfolders
  • Lists both folders and individual secrets
  • Child streams (kv_secrets and kv_subkeys) use this stream’s output to fetch detailed data for each secret
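The traversal described above can be sketched as follows. This is an illustration under stated assumptions, not the connector's code: `walk_secrets`, `list_keys`, and the `TREE` data are hypothetical names, and the listing call is injected so the example runs without a Vault instance. Against a real instance, `list_keys` would wrap a LIST request to the KV v2 metadata endpoint (`/v1/secret/metadata/<folder>`).

```python
from typing import Callable, Iterator

# list_keys(folder) returns the keys directly under `folder`.
# Folder names end with "/", as in the `key` field described above.
ListKeys = Callable[[str], list[str]]


def walk_secrets(list_keys: ListKeys, root: str = "") -> Iterator[dict]:
    """Yield one record per key, descending into every folder once."""
    visited: set[str] = set()
    stack = [root]
    while stack:
        folder = stack.pop()
        if folder in visited:
            continue  # never list the same folder twice
        visited.add(folder)
        for key in list_keys(folder):
            is_folder = key.endswith("/")
            yield {"path": folder.rstrip("/"), "key": key, "is_folder": is_folder}
            if is_folder:
                stack.append(folder + key)


# Hypothetical in-memory tree standing in for a Vault instance.
TREE = {
    "": ["default", "Data Ops/"],
    "Data Ops/": ["airbnb-contato", "default"],
}

records = list(walk_secrets(lambda folder: TREE.get(folder, [])))
for record in records:
    print(record)
```

An explicit stack is used instead of recursion so that deeply nested folder trees cannot hit Python's recursion limit.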

kv_secrets
Child stream that retrieves the actual secret data (key-value pairs) for each secret discovered by kv_list_secrets. This stream fetches the complete secret data including all key-value pairs stored in each secret.
Key Fields:
  • path - The full path to the secret (e.g., Data Ops/airbnb-contato)
  • version - The version number of the secret
  • data - JSON field containing all key-value pairs stored in the secret. Values can be strings, numbers, booleans, or other JSON types.
  • metadata - Object containing version-specific metadata:
    • created_time - ISO 8601 timestamp when the secret version was created
    • custom_metadata - JSON field with custom metadata key-value pairs (string values)
    • deletion_time - ISO 8601 timestamp when the secret was deleted (empty string if not deleted)
    • destroyed - Boolean indicating if the secret version has been destroyed
    • version - The version number
Notes:
  • Returns one record per secret version
  • The data field can contain any JSON-serializable values (not just strings)
  • Secrets that don’t exist or return 404 are gracefully skipped
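To make the record shape concrete, the sketch below maps a KV v2 read response (the JSON returned by `GET /v1/secret/data/<path>`, which nests the secret under `data.data` and the version details under `data.metadata`) to the fields listed above. The function name `secret_record` and the sample payload are hypothetical, and treating a 404 as "skip" mirrors the note above rather than any code shown in this guide.

```python
from typing import Optional


def secret_record(path: str, status: int, body: Optional[dict]) -> Optional[dict]:
    """Map a KV v2 read response to one record, or None when the secret is missing."""
    if status == 404:
        return None  # missing secrets are skipped, not raised as errors
    payload = body["data"]
    metadata = payload["metadata"]
    return {
        "path": path,
        "version": metadata["version"],
        "data": payload["data"],  # values may be any JSON type
        "metadata": metadata,
    }


# Hypothetical response body in the KV v2 read format.
SAMPLE = {
    "data": {
        "data": {"DB_HOST": "db.example.com", "DB_PORT": 5432},
        "metadata": {
            "created_time": "2024-04-12T13:12:10Z",
            "custom_metadata": None,
            "deletion_time": "",
            "destroyed": False,
            "version": 1,
        },
    }
}

record = secret_record("Data Solutions/DataResourcesPostgreSQL", 200, SAMPLE)
print(record["version"], record["data"])
```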

kv_subkeys
Child stream that retrieves the subkey structure for each secret discovered by kv_list_secrets. This stream provides the hierarchical structure of keys within each secret, with values set to null at leaf nodes.
Key Fields:
  • path - The full path to the secret (e.g., Data Ops/airbnb-contato)
  • version - The version number of the secret
  • subkeys - JSON field representing the hierarchical structure of keys within the secret. Leaf nodes have null values, while nested objects represent subdirectories with the same structure.
  • metadata - Object containing version-specific metadata (same structure as kv_secrets):
    • created_time - ISO 8601 timestamp when the secret version was created
    • custom_metadata - JSON field with custom metadata key-value pairs (string values)
    • deletion_time - ISO 8601 timestamp when the secret was deleted (empty string if not deleted)
    • destroyed - Boolean indicating if the secret version has been destroyed
    • version - The version number
Notes:
  • Returns the structure of keys without their values (values are null)
  • Useful for understanding the schema and organization of secrets
  • Secrets that don’t exist or return 404 are gracefully skipped
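The shape of the subkeys field can be illustrated locally. Vault computes this structure on the server; the hypothetical helper `subkeys_of` below only reproduces the idea (nested objects kept, every other value replaced with null) so the relationship between a secret and its subkeys is easy to see.

```python
from typing import Any


def subkeys_of(value: Any) -> Any:
    """Keep nested objects as dicts and replace every other value with None."""
    if isinstance(value, dict):
        return {key: subkeys_of(child) for key, child in value.items()}
    return None  # strings, numbers, booleans, and lists all become null


# Hypothetical secret data, including one nested object.
secret = {
    "DB_HOST": "db.example.com",
    "DB_PORT": 5432,
    "replica": {"host": "replica.example.com", "ports": [5432, 5433]},
}

print(subkeys_of(secret))
```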

Data Model

The following diagram illustrates the relationships between the data streams in Vault. The arrows indicate how child streams depend on the parent stream for context.
Relationship Details:
  • kv_list_secrets is the parent stream that discovers all secrets recursively
  • Both kv_secrets and kv_subkeys are child streams that depend on kv_list_secrets
  • Each record from kv_list_secrets provides a secret_path context that child streams use to fetch detailed data
  • Child streams automatically skip folder records (where is_folder is true)
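The hand-off from parent to child streams can be sketched as a small filter. This is an assumption-laden illustration: the name `child_contexts` is hypothetical, and joining `path` and `key` with a slash is inferred from example paths such as Data Ops/airbnb-contato, not taken from the connector's source.

```python
from typing import Iterable, Iterator


def child_contexts(list_records: Iterable[dict]) -> Iterator[dict]:
    """Yield a secret_path context for every record that is not a folder."""
    for record in list_records:
        if record["is_folder"]:
            continue  # folders carry no secret data of their own
        prefix = record["path"]
        # Assumption: secrets at the root have an empty path.
        secret_path = f"{prefix}/{record['key']}" if prefix else record["key"]
        yield {"secret_path": secret_path}


# Hypothetical parent records in the kv_list_secrets shape.
parent_records = [
    {"path": "", "key": "Data Ops/", "is_folder": True},
    {"path": "", "key": "default", "is_folder": False},
    {"path": "Data Ops", "key": "airbnb-contato", "is_folder": False},
]

contexts = list(child_contexts(parent_records))
print(contexts)
```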

Use Cases for Data Analysis

This guide outlines valuable business intelligence use cases when consolidating Vault data, along with ready-to-use SQL queries that you can run on Explorer.

1. Secret Inventory and Organization

Get a comprehensive overview of all secrets stored in your Vault instance, organized by path and type.
Business Value:
  • Maintain an inventory of all secrets across your organization
  • Identify secrets by their location and path structure
  • Track which secrets are folders vs. actual secret data
  • Audit secret organization and naming conventions
SELECT
   path,
   key,
   is_folder,
   CASE
      WHEN is_folder THEN 'Folder'
      ELSE 'Secret'
   END AS type
FROM
   nekt_raw.vault_kv_list_secrets
ORDER BY
   path,
   key

| path | key | is_folder | type |
|---|---|---|---|
| (root) | airbnb-contato | false | Secret |
| (root) | default | false | Secret |
| Data Ops | airbnb-contato | false | Secret |
| Data Ops | default | false | Secret |
| Data Solutions | DataResourcesPostgreSQL | false | Secret |
| Administrativo Serviços | 360Imprimir | false | Secret |
| Administrativo Serviços | ACATE | false | Secret |

2. Secret Data Analysis

Analyze the actual secret data stored in your Vault, including all key-value pairs and their metadata.
Business Value:
  • Understand what data is stored in each secret
  • Track secret versions and creation times
  • Identify secrets that have been deleted or destroyed
  • Monitor custom metadata associated with secrets
SELECT
   path,
   version,
   data,
   metadata.created_time,
   metadata.destroyed,
   metadata.deletion_time,
   metadata.custom_metadata
FROM
   nekt_raw.vault_kv_secrets
WHERE
   metadata.destroyed = false
   AND metadata.deletion_time = ''
ORDER BY
   path,
   version DESC

| path | version | data | created_time | destroyed | deletion_time | custom_metadata |
|---|---|---|---|---|---|---|
| Data Ops/airbnb-contato | 1 | {"Login": "[email protected]", "Password": "***"} | 2024-09-23T14:27:07Z | false |  | null |
| Data Solutions/DataResourcesPostgreSQL | 1 | {"DB_HOST": "db.example.com", "DB_PORT": 5432, "DB_NAME": "mydb"} | 2024-04-12T13:12:10Z | false |  | null |

3. Secret Structure Analysis

Analyze the hierarchical structure of keys within secrets to understand their organization and schema.
Business Value:
  • Understand the schema and structure of secrets
  • Identify common key patterns across secrets
  • Track changes in secret structure over time
  • Plan migrations or reorganizations based on structure
SELECT
   path,
   version,
   subkeys,
   metadata.created_time
FROM
   nekt_raw.vault_kv_subkeys
WHERE
   metadata.destroyed = false
ORDER BY
   path,
   version DESC

| path | version | subkeys | created_time |
|---|---|---|---|
| Data Solutions/DataResourcesPostgreSQL | 1 | {"DB_HOST": null, "DB_PORT": null, "DB_NAME": null} | 2024-04-12T13:12:10Z |
| Administrativo Serviços/360Imprimir | 1 | {"Senha": null, "Usuário": null} | 2024-04-12T13:12:10Z |

4. Secret Version Tracking

Track secret versions and their lifecycle to understand when secrets were created, modified, or deleted.
Business Value:
  • Monitor secret version history
  • Identify recently created or modified secrets
  • Track secret lifecycle and changes over time
  • Audit secret management practices
-- Flatten the metadata fields once so the aggregation can use plain column names
WITH secret_versions AS (
   SELECT
      path,
      version,
      metadata.created_time AS created_time,
      metadata.destroyed AS destroyed,
      metadata.deletion_time AS deletion_time
   FROM
      nekt_raw.vault_kv_secrets
)
-- One row per secret path, summarizing all of its versions
SELECT
   path,
   MAX(version) AS latest_version,
   COUNT(*) AS total_versions,
   MIN(created_time) AS first_created,
   MAX(created_time) AS last_modified,
   SUM(CASE WHEN destroyed THEN 1 ELSE 0 END) AS destroyed_versions,
   SUM(CASE WHEN deletion_time != '' THEN 1 ELSE 0 END) AS deleted_versions
FROM
   secret_versions
GROUP BY
   path
ORDER BY
   last_modified DESC

| path | latest_version | total_versions | first_created | last_modified | destroyed_versions | deleted_versions |
|---|---|---|---|---|---|---|
| Administrativo Serviços/Amazon | 3 | 3 | 2023-08-22T19:56:51Z | 2024-01-15T10:30:22Z | 0 | 0 |
| Data Ops/airbnb-contato | 1 | 1 | 2024-09-23T14:27:07Z | 2024-09-23T14:27:07Z | 0 | 0 |
| Data Solutions/DataResourcesPostgreSQL | 1 | 1 | 2024-04-12T13:12:10Z | 2024-04-12T13:12:10Z | 0 | 0 |

Implementation Notes

Data Quality Considerations

  • The connector reads from the KV v2 secrets engine at the secret/ mount path
  • Paths with spaces and special characters are automatically URL-encoded
  • Secrets that return 404 are gracefully skipped (not treated as errors)
  • Folder records are automatically filtered out from child streams
  • The connector recursively traverses all folders and subfolders
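As an example of the URL encoding mentioned above, the sketch below percent-encodes a secret path while leaving the folder separators intact. The helper name `secret_url` and the `vault.example.com` host are hypothetical; the `/data/` segment is the KV v2 read endpoint for the secret/ mount.

```python
from urllib.parse import quote


def secret_url(api_base: str, secret_path: str, mount: str = "secret") -> str:
    """Build a KV v2 read URL, percent-encoding everything except '/'."""
    encoded = quote(secret_path, safe="/")
    return f"{api_base}/{mount}/data/{encoded}"


url = secret_url("https://vault.example.com/v1", "Administrativo Serviços/360Imprimir")
print(url)
# https://vault.example.com/v1/secret/data/Administrativo%20Servi%C3%A7os/360Imprimir
```

Keeping `/` in the safe set matters: encoding it as `%2F` would turn a nested path into a single key name.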

API Limits & Performance

  • A 1-second delay is applied between requests to avoid rate limiting
  • For large Vault instances with many secrets, extraction may take some time
  • The connector handles recursive folder traversal efficiently by tracking visited folders
  • Child streams only process individual secrets (folders are automatically skipped)
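One way a fixed delay between requests could be enforced is a small throttle that waits only for the time still remaining since the previous call. This is a sketch, not the connector's implementation: the `Throttle` class is hypothetical, and the clock and sleep functions are injectable so the behavior can be checked without real waiting.

```python
import time


class Throttle:
    """Ensure at least `interval` seconds pass between consecutive calls to wait()."""

    def __init__(self, interval: float = 1.0, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self) -> None:
        now = self._clock()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Usage: call wait() before each request.
throttle = Throttle(interval=1.0)
throttle.wait()  # first call returns immediately
```

At one request per second, the lower bound on extraction time is roughly the number of folders listed plus the number of secrets read, in seconds, for each stream that issues requests.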

Security Considerations

  • Authentication tokens are stored securely and marked as secret fields
  • The connector only reads secrets (does not modify or delete them)
  • Ensure your authentication token has appropriate read permissions for the KV v2 secrets engine
  • Consider using Vault policies to restrict access to only necessary paths