Datadog is a monitoring and analytics platform for cloud-scale applications, providing monitoring of servers, databases, tools, and services through a SaaS-based data analytics platform. It offers comprehensive logging, metrics, and tracing capabilities to help organizations monitor their infrastructure and applications in real-time.

1. Add your Datadog access

  1. In the Sources tab, click on the “Add source” button located on the top right of your screen. Then, select the Datadog option from the list of connectors. Click Next and you’ll be prompted to add your access.

  2. API Key: Enter your Datadog API key. You can find this in your Datadog account under Organization Settings > API Keys.

  3. App Key: Enter your Datadog Application key. You can find this in your Datadog account under Organization Settings > Application Keys.

  4. Start date: Define the date from which you want to start extracting data. This will be the earliest date for which log data will be retrieved.

  5. Query (optional): Specify a custom query to filter the logs you want to extract; if none is provided, all logs are retrieved. You can use Datadog’s query syntax to filter by service, host, tags, or any other attribute (example queries appear in the sketch after this list).

  6. Click Next.
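
Before clicking Next, you may want to sanity-check the credentials and preview what a query matches. The sketch below is a minimal, illustrative Python script against Datadog’s public API, not part of this connector: the DD_API_KEY/DD_APP_KEY environment variable names and the status:error query are stand-ins, and the api.datadoghq.com base URL assumes the US1 site (use your region’s domain, e.g. api.datadoghq.eu, if it differs).

```python
# Minimal sanity check for Datadog credentials and query syntax.
# Assumes keys are exported as DD_API_KEY / DD_APP_KEY (stand-in names)
# and a US1 account; adjust the domain for other Datadog sites.
import os

import requests

BASE_URL = "https://api.datadoghq.com"
api_key = os.environ["DD_API_KEY"]
app_key = os.environ["DD_APP_KEY"]

# 1. Validate the API key.
resp = requests.get(f"{BASE_URL}/api/v1/validate",
                    headers={"DD-API-KEY": api_key})
print("API key valid:", resp.ok and resp.json().get("valid", False))

# 2. Fetch one recent log event. This also exercises the Application key,
#    and filter[query] takes the same syntax as the optional Query field
#    in step 5 (e.g. "service:payments status:error").
resp = requests.get(
    f"{BASE_URL}/api/v2/logs/events",
    headers={"DD-API-KEY": api_key, "DD-APPLICATION-KEY": app_key},
    params={
        "filter[query]": "status:error",  # sample filter; adjust or drop
        "filter[from]": "now-1h",
        "filter[to]": "now",
        "page[limit]": 1,
    },
)
resp.raise_for_status()
print(resp.json().get("data", []))
```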

2. Select your Datadog streams

  1. The next step is letting us know which streams you want to bring in.

  2. The following stream is available:

    | Stream Name | Description | Primary Key | Replication Key |
    | --- | --- | --- | --- |
    | logs | Log events from Datadog | id | timestamp |

    The logs stream provides the following data (a sample record is sketched after this list):

    • id: Unique identifier for each log event
    • timestamp: When the log event occurred
    • type: Type of log event
    • attributes: Additional log metadata including:
      • host: The host where the log originated
      • service: The service name
      • message: The log message content
      • status: Status of the log event
      • tags: Array of tags associated with the log
      • attributes: Array of custom attributes with key-value pairs
      • timestamp: Timestamp within the attributes object
  3. Select the stream and click Next.
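
For reference, a single logs record shaped like the schema above might look like the sample below. Every value is invented for illustration; the custom attributes you see will depend on your own log pipeline.

```json
{
  "id": "AQAAAYh2example",
  "timestamp": "2024-05-01T12:34:56Z",
  "type": "log",
  "attributes": {
    "host": "i-0abc123def",
    "service": "payments",
    "message": "Charge failed: card declined",
    "status": "error",
    "tags": ["env:prod", "team:billing"],
    "attributes": [
      {"key": "customer_id", "value": "cus_123"},
      {"key": "http.status_code", "value": "402"}
    ],
    "timestamp": "2024-05-01T12:34:56.789Z"
  }
}
```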

3. Configure your Datadog data streams

  1. Customize how you want your data to appear in your catalog. Select the layer where the data will be placed, a folder to organize it inside that layer, a name for each table (the table that will hold the fetched data), and the type of sync.
  • Layer: choose between the existing layers on your catalog. This is where you will find your new extracted tables once the extraction runs successfully.
  • Folder: a folder can be created inside the selected layer to group all tables being created from this new data source.
  • Table name: we suggest a name, but feel free to customize it. You have the option to add a prefix to all tables at once and make this process faster!
  • Sync Type: depending on the data you are bringing to the lake, you can choose between INCREMENTAL and FULL_TABLE (a conceptual sketch follows this list). Read more about Sync Types here.
  2. Click Next.
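
To picture the difference between the two sync types: FULL_TABLE re-extracts the entire stream on every run, while INCREMENTAL uses the stream’s replication key (timestamp for logs) to fetch only records newer than the last successful run. The sketch below is purely conceptual, with hypothetical helper names, and is not the platform’s actual implementation.

```python
# Conceptual contrast between the two sync types; extract_logs,
# destination, and state are hypothetical stand-ins.

def full_table_sync(extract_logs, destination):
    # Re-fetch the whole history and replace the existing table.
    destination.overwrite(extract_logs(since=None))

def incremental_sync(extract_logs, destination, state):
    # Fetch only events newer than the saved bookmark (the replication
    # key from the last run), append them, then advance the bookmark.
    new_records = extract_logs(since=state.get("bookmark"))
    destination.append(new_records)
    if new_records:
        state["bookmark"] = max(r["timestamp"] for r in new_records)
```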

4. Configure your Datadog data source

  1. Describe your data source for easy identification within your organization. You can include details such as what data it brings, which team owns it, and so on.

  2. To define your Trigger, consider how often you want data to be extracted from this source. This decision usually depends on how frequently you need the new table data updated (every day, once a week, or only at specific times).

  3. Optionally, you can define some additional settings (if available).

  • Configure Delta Log Retention to determine how long we should store old states of this table as it gets updated. Read more about this resource here.
  • Determine when to execute an Additional Full Sync. This complements the incremental extractions by periodically ensuring that your data is fully synchronized with your source.

Check your new source!

  1. Click Next to finalize the setup. Once completed, you’ll receive confirmation that your new source is set up!

  2. You can view your new source on the Sources page. To see its data in your Catalog, wait for the pipeline to run; you can monitor its execution and completion on the Sources page. If needed, manually trigger the pipeline by clicking the refresh icon. Once it has run, your new table will appear in the Catalog section.

If you encounter any issues, reach out to us, and we’ll gladly assist you!