Configuring Vault as a Source
In the Sources tab, click on the “Add source” button located on the top right of your screen. Then, select the Vault option from the list of connectors. Click Next and you’ll be prompted to add your access.
1. Add account access
You’ll need to provide your Vault instance details and authentication token to access your secrets data. The following configurations are available:
- Vault Address: The URL of your Vault instance (e.g., https://vault.<your-company>.com.br/). The connector will automatically extract the base URL and use the /v1 API endpoint.
- Auth Token: Your Vault authentication token (Bearer token) used for API access. This token should have appropriate permissions to read secrets from the KV v2 secrets engine.
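As a rough illustration of the two settings above, the following Python sketch reduces a Vault address to its base URL, appends the /v1 prefix, and builds a Bearer header. The helper names `vault_api_base` and `auth_headers` and the example address are hypothetical; the connector's actual implementation is not shown in this guide.

```python
from urllib.parse import urlsplit


def vault_api_base(vault_address: str) -> str:
    """Keep only scheme and host (plus port) and append the /v1 API prefix."""
    parts = urlsplit(vault_address)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {vault_address!r}")
    return f"{parts.scheme}://{parts.netloc}/v1"


def auth_headers(token: str) -> dict:
    """Build request headers that carry the token as a Bearer token."""
    return {"Authorization": f"Bearer {token}"}


# Hypothetical address: any path after the host is dropped.
print(vault_api_base("https://vault.example.com/ui/vault/secrets"))
# prints https://vault.example.com/v1
```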
2. Select streams
Choose which data streams you want to sync. The connector provides three streams:
- kv_list_secrets: Lists all secret paths and keys recursively from your Vault instance
- kv_secrets: Retrieves the actual secret data (key-value pairs) for each secret
- kv_subkeys: Retrieves the subkey structure for each secret (keys only, values are null)
Tip: The stream can be found more easily by typing its name.
Select the streams and click Next.
3. Configure data streams
Customize how you want your data to appear in your catalog. Select the desired layer where the data will be placed, a folder to organize it inside the layer, a name for each table (which will effectively contain the fetched data) and the type of sync.
- Layer: choose between the existing layers on your catalog. This is where you will find your new extracted tables once the extraction runs successfully.
- Folder: a folder can be created inside the selected layer to group all tables being created from this new data source.
- Table name: we suggest a name, but feel free to customize it. You have the option to add a prefix to all tables at once and make this process faster!
- Sync Type: you can choose between INCREMENTAL and FULL_TABLE.
  - Incremental: every time the extraction happens, we’ll get only the new data - which is good if, for example, you want to keep every record ever fetched.
  - Full table: every time the extraction happens, we’ll get the current state of the data - which is good if, for example, you don’t want to have deleted data in your catalog.
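To make the difference between the two sync types concrete, here is a simplified Python model. It is only a sketch of the behavior described above, not the platform's storage logic; the function names and rows are made up for illustration.

```python
def apply_incremental(table: list, new_rows: list) -> list:
    """Append newly fetched rows, keeping everything stored earlier."""
    return table + new_rows


def apply_full_table(table: list, current_state: list) -> list:
    """Replace the stored rows with the current state of the source."""
    return list(current_state)


stored = [{"path": "Data Ops", "key": "default"}]
fetched = [{"path": "Data Ops", "key": "airbnb-contato"}]

# Incremental keeps the earlier row even though it was not fetched again.
print(len(apply_incremental(stored, fetched)))  # prints 2
# Full table keeps only what the source returns now.
print(len(apply_full_table(stored, fetched)))  # prints 1
```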
4. Configure data source
Describe your data source for easy identification within your organization, not exceeding 140 characters. To define your Trigger, consider how often you want data to be extracted from this source. This decision usually depends on how frequently you need the new table data updated (every day, once a week, or only at specific times). Optionally, you can define some additional settings:
- Configure Delta Log Retention and determine for how long we should store old states of this table as it gets updated. Read more about this resource here.
- Determine when to execute an Additional Full Sync. This will complement the incremental data extractions, ensuring that your data is completely synchronized with your source every once in a while.
5. Check your new source
You can view your new source on the Sources page. If needed, manually trigger the source extraction by clicking on the arrow button. Once executed, your data will appear in your Catalog.
Streams and Fields
Below you’ll find all available data streams from Vault and their corresponding fields:
kv_list_secrets
Parent stream that recursively lists all secret paths and keys from your Vault KV v2 secrets engine. This stream traverses folders and subfolders to discover all available secrets.
Key Fields:
- path: The folder path where the secret or key was found
- key: The name of the key or secret (folders end with /)
- is_folder: Boolean indicating whether the key is a folder (true) or an actual secret (false)

- Starts from the root of the KV v2 secrets engine (secret/)
- Recursively traverses all folders and subfolders
- Lists both folders and individual secrets
- Child streams (kv_secrets and kv_subkeys) use this stream’s output to fetch detailed data for each secret
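The recursive listing can be sketched in Python as a folder walk that emits one record per key. This is a minimal sketch under assumptions: `list_keys` is a caller-supplied stand-in for a Vault list call, the root folder is represented as an empty string, and the in-memory tree is made-up data. The connector's real traversal code is not documented here.

```python
from typing import Callable, Iterator


def list_secrets(list_keys: Callable[[str], list], root: str = "") -> Iterator[dict]:
    """Walk folders and yield one record per key found.

    list_keys(folder) returns the keys directly under a folder;
    folder keys end with "/".
    """
    visited = set()
    stack = [root]
    while stack:
        folder = stack.pop()
        if folder in visited:
            continue  # guard against listing the same folder twice
        visited.add(folder)
        for key in list_keys(folder):
            is_folder = key.endswith("/")
            yield {"path": folder.rstrip("/"), "key": key, "is_folder": is_folder}
            if is_folder:
                stack.append(folder + key)


# In-memory stand-in for the Vault list call (hypothetical data).
tree = {
    "": ["Data Ops/", "default"],
    "Data Ops/": ["airbnb-contato", "default"],
}
records = list(list_secrets(lambda folder: tree.get(folder, [])))
for record in records:
    print(record)
```

Passing the lister in as a function keeps the traversal independent of any HTTP client, so the same walk can be exercised against test data.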
kv_secrets
Child stream that retrieves the actual secret data (key-value pairs) for each secret discovered by kv_list_secrets. This stream fetches the complete secret data including all key-value pairs stored in each secret.
Key Fields:
- path: The full path to the secret (e.g., Data Ops/airbnb-contato)
- version: The version number of the secret
- data: JSON field containing all key-value pairs stored in the secret. Values can be strings, numbers, booleans, or other JSON types.
- metadata: Object containing version-specific metadata:
  - created_time: ISO 8601 timestamp when the secret version was created
  - custom_metadata: JSON field with custom metadata key-value pairs (string values)
  - deletion_time: ISO 8601 timestamp when the secret was deleted (empty string if not deleted)
  - destroyed: Boolean indicating if the secret version has been destroyed
  - version: The version number

- Returns one record per secret version
- The data field can contain any JSON-serializable values (not just strings)
- Secrets that don’t exist or return 404 are gracefully skipped
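The mapping from a Vault read response to a kv_secrets record, including the 404 handling, might look like the Python sketch below. It assumes the KV v2 read response shape (`data.data` and `data.metadata`); `fetch` and `fake_fetch` are hypothetical stand-ins for an HTTP call, and the values are made up.

```python
from typing import Callable, Optional, Tuple


def read_secret(fetch: Callable[[str], Tuple[int, dict]], path: str) -> Optional[dict]:
    """Return a kv_secrets-style record, or None when the secret is missing.

    fetch(path) returns (status_code, body).
    """
    status, body = fetch(path)
    if status == 404:
        return None  # missing secrets are skipped, not raised as errors
    if status != 200:
        raise RuntimeError(f"unexpected status {status} for {path!r}")
    payload = body["data"]
    metadata = payload["metadata"]
    return {
        "path": path,
        "version": metadata["version"],
        "data": payload["data"],
        "metadata": metadata,
    }


def fake_fetch(path: str) -> Tuple[int, dict]:
    """Stand-in for an HTTP request, returning made-up values."""
    if path != "Data Solutions/DataResourcesPostgreSQL":
        return 404, {"errors": []}
    return 200, {
        "data": {
            "data": {"DB_HOST": "db.example.com", "DB_PORT": 5432},
            "metadata": {
                "created_time": "2024-04-12T13:12:10Z",
                "custom_metadata": None,
                "deletion_time": "",
                "destroyed": False,
                "version": 1,
            },
        }
    }


found = read_secret(fake_fetch, "Data Solutions/DataResourcesPostgreSQL")
missing = read_secret(fake_fetch, "no/such/secret")
print(found["version"], missing)  # prints 1 None
```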
kv_subkeys
Child stream that retrieves the subkey structure for each secret discovered by kv_list_secrets. This stream provides the hierarchical structure of keys within each secret, with values set to null at leaf nodes.
Key Fields:
- path: The full path to the secret (e.g., Data Ops/airbnb-contato)
- version: The version number of the secret
- subkeys: JSON field representing the hierarchical structure of keys within the secret. Leaf nodes have null values, while nested objects represent subdirectories with the same structure.
- metadata: Object containing version-specific metadata (same structure as kv_secrets):
  - created_time: ISO 8601 timestamp when the secret version was created
  - custom_metadata: JSON field with custom metadata key-value pairs (string values)
  - deletion_time: ISO 8601 timestamp when the secret was deleted (empty string if not deleted)
  - destroyed: Boolean indicating if the secret version has been destroyed
  - version: The version number

- Returns the structure of keys without their values (values are null)
- Useful for understanding the schema and organization of secrets
- Secrets that don’t exist or return 404 are gracefully skipped
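To show what a subkeys value looks like, this Python sketch derives the same kind of structure locally from a secret's data. It is an illustration only: the stream receives the structure from Vault rather than computing it this way, and treating empty objects as leaves is an assumption. Python's `None` corresponds to JSON null.

```python
def to_subkeys(data: dict) -> dict:
    """Copy the key hierarchy of a secret, replacing every leaf value with None."""
    return {
        # Only non-empty nested objects are descended into; all else is a leaf.
        key: to_subkeys(value) if isinstance(value, dict) and value else None
        for key, value in data.items()
    }


# Made-up secret contents.
secret_data = {
    "DB_HOST": "db.example.com",
    "DB_PORT": 5432,
    "tls": {"cert": "...", "key": "..."},
}
print(to_subkeys(secret_data))
# prints {'DB_HOST': None, 'DB_PORT': None, 'tls': {'cert': None, 'key': None}}
```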
Data Model
The following diagram illustrates the relationships between the data streams in Vault. The arrows indicate how child streams depend on the parent stream for context.
Relationship Details:
- kv_list_secrets is the parent stream that discovers all secrets recursively
- Both kv_secrets and kv_subkeys are child streams that depend on kv_list_secrets
- Each record from kv_list_secrets provides a secret_path context that child streams use to fetch detailed data
- Child streams automatically skip folder records (where is_folder is true)
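The parent-to-child handoff described above can be sketched in Python as follows. How the connector actually builds secret_path is not documented here; joining path and key with "/" and using an empty string for the root are assumptions, chosen to match the example paths shown elsewhere in this guide.

```python
def child_contexts(parent_records):
    """Yield a secret_path context for every non-folder parent record."""
    for record in parent_records:
        if record["is_folder"]:
            continue  # folders carry no secret data of their own
        folder = record["path"]
        secret_path = f"{folder}/{record['key']}" if folder else record["key"]
        yield {"secret_path": secret_path}


# Made-up kv_list_secrets records.
parent_records = [
    {"path": "", "key": "Data Ops/", "is_folder": True},
    {"path": "", "key": "default", "is_folder": False},
    {"path": "Data Ops", "key": "airbnb-contato", "is_folder": False},
]
contexts = list(child_contexts(parent_records))
print(contexts)
# prints [{'secret_path': 'default'}, {'secret_path': 'Data Ops/airbnb-contato'}]
```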
Use Cases for Data Analysis
This guide outlines valuable business intelligence use cases when consolidating Vault data, along with ready-to-use SQL queries that you can run on Explorer.
1. Secret Inventory and Organization
Get a comprehensive overview of all secrets stored in your Vault instance, organized by path and type.
Business Value:
- Maintain an inventory of all secrets across your organization
- Identify secrets by their location and path structure
- Track which secrets are folders vs. actual secret data
- Audit secret organization and naming conventions
SQL query
- AWS
- GCP
Sample Result
| path | key | is_folder | type |
|---|---|---|---|
| (root) | airbnb-contato | false | Secret |
| (root) | default | false | Secret |
| Data Ops | airbnb-contato | false | Secret |
| Data Ops | default | false | Secret |
| Data Solutions | DataResourcesPostgreSQL | false | Secret |
| Administrativo Serviços | 360Imprimir | false | Secret |
| Administrativo Serviços | ACATE | false | Secret |
2. Secret Data Analysis
Analyze the actual secret data stored in your Vault, including all key-value pairs and their metadata.
Business Value:
- Understand what data is stored in each secret
- Track secret versions and creation times
- Identify secrets that have been deleted or destroyed
- Monitor custom metadata associated with secrets
SQL query
- AWS
- GCP
Sample Result
| path | version | data | created_time | destroyed | deletion_time | custom_metadata |
|---|---|---|---|---|---|---|
| Data Ops/airbnb-contato | 1 | {"Login": "[email protected]", "Password": "***"} | 2024-09-23T14:27:07Z | false | null | |
| Data Solutions/DataResourcesPostgreSQL | 1 | {"DB_HOST": "db.example.com", "DB_PORT": 5432, "DB_NAME": "mydb"} | 2024-04-12T13:12:10Z | false | null | |
3. Secret Structure Analysis
Analyze the hierarchical structure of keys within secrets to understand their organization and schema.
Business Value:
- Understand the schema and structure of secrets
- Identify common key patterns across secrets
- Track changes in secret structure over time
- Plan migrations or reorganizations based on structure
SQL query
- AWS
- GCP
Sample Result
| path | version | subkeys | created_time |
|---|---|---|---|
| Data Solutions/DataResourcesPostgreSQL | 1 | {"DB_HOST": null, "DB_PORT": null, "DB_NAME": null} | 2024-04-12T13:12:10Z |
| Administrativo Serviços/360Imprimir | 1 | {"Senha": null, "Usuário": null} | 2024-04-12T13:12:10Z |
4. Secret Version Tracking
Track secret versions and their lifecycle to understand when secrets were created, modified, or deleted.
Business Value:
- Monitor secret version history
- Identify recently created or modified secrets
- Track secret lifecycle and changes over time
- Audit secret management practices
SQL query
- AWS
- GCP
Sample Result
| path | latest_version | total_versions | first_created | last_modified | destroyed_versions | deleted_versions |
|---|---|---|---|---|---|---|
| Administrativo Serviços/Amazon | 3 | 3 | 2023-08-22T19:56:51Z | 2024-01-15T10:30:22Z | 0 | 0 |
| Data Ops/airbnb-contato | 1 | 1 | 2024-09-23T14:27:07Z | 2024-09-23T14:27:07Z | 0 | 0 |
| Data Solutions/DataResourcesPostgreSQL | 1 | 1 | 2024-04-12T13:12:10Z | 2024-04-12T13:12:10Z | 0 | 0 |
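For readers who want to see how columns like the ones in this sample result could be derived, here is a small Python sketch that aggregates kv_secrets-style rows per path. It is not the SQL query for this use case. The function name and rows are made up, and treating the latest created_time as last_modified is an assumption.

```python
from collections import defaultdict


def summarize_versions(rows):
    """Group kv_secrets-style rows by path and compute lifecycle counters."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["path"]].append(row)

    summary = []
    for path, versions in sorted(grouped.items()):
        created = [v["metadata"]["created_time"] for v in versions]
        summary.append({
            "path": path,
            "latest_version": max(v["version"] for v in versions),
            "total_versions": len(versions),
            # Same-format ISO 8601 UTC timestamps sort correctly as strings.
            "first_created": min(created),
            "last_modified": max(created),
            "destroyed_versions": sum(1 for v in versions if v["metadata"]["destroyed"]),
            # An empty deletion_time means the version was not deleted.
            "deleted_versions": sum(1 for v in versions if v["metadata"]["deletion_time"]),
        })
    return summary


# Made-up rows for a hypothetical secret with two versions.
rows = [
    {"path": "team/app", "version": 1, "metadata": {
        "created_time": "2023-08-22T19:56:51Z", "deletion_time": "", "destroyed": False}},
    {"path": "team/app", "version": 2, "metadata": {
        "created_time": "2024-01-15T10:30:22Z",
        "deletion_time": "2024-02-01T00:00:00Z", "destroyed": False}},
]
report = summarize_versions(rows)
print(report[0])
```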
Implementation Notes
Data Quality Considerations
- The connector uses the KV v2 secrets engine (secret/ mount path)
- Paths with spaces and special characters are automatically URL-encoded
- Secrets that return 404 are gracefully skipped (not treated as errors)
- Folder records are automatically filtered out from child streams
- The connector recursively traverses all folders and subfolders
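The URL-encoding of paths mentioned above can be illustrated with Python's standard library. This is a sketch of the general technique, assuming "/" is kept as the path separator; the helper name `encode_secret_path` is hypothetical and the connector may encode paths differently.

```python
from urllib.parse import quote


def encode_secret_path(path: str) -> str:
    """Percent-encode a secret path while keeping "/" as the segment separator."""
    return quote(path, safe="/")


print(encode_secret_path("Data Ops/airbnb-contato"))
# prints Data%20Ops/airbnb-contato
print(encode_secret_path("Administrativo Serviços/360Imprimir"))
# prints Administrativo%20Servi%C3%A7os/360Imprimir
```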
API Limits & Performance
- A 1-second delay is applied between requests to avoid rate limiting
- For large Vault instances with many secrets, extraction may take some time, since the 1-second delay applies to every request
- The connector handles recursive folder traversal efficiently by tracking visited folders
- Child streams only process individual secrets (folders are automatically skipped)
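A delay between requests can be sketched in Python as a small throttle that enforces a minimum spacing. The `Throttle` class is hypothetical and only illustrates the idea; the connector may simply sleep for a fixed second after each request. The clock and sleep functions are injectable so the behavior can be checked without actually waiting.

```python
import time


class Throttle:
    """Ensure at least `delay` seconds pass between consecutive calls to wait()."""

    def __init__(self, delay: float = 1.0, clock=time.monotonic, sleep=time.sleep):
        self.delay = delay
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self) -> None:
        now = self._clock()
        if self._last is not None:
            remaining = self.delay - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


throttle = Throttle(delay=1.0)
# Call throttle.wait() right before each request to the Vault API.
```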
Security Considerations
- Authentication tokens are stored securely and marked as secret fields
- The connector only reads secrets (does not modify or delete them)
- Ensure your authentication token has appropriate read permissions for the KV v2 secrets engine
- Consider using Vault policies to restrict access to only necessary paths