ClickHouse is a high-performance, column-oriented SQL database management system designed for online analytical processing (OLAP). It’s known for its blazing-fast query performance on large datasets, making it ideal for real-time analytics, log processing, and business intelligence applications.

Configuring ClickHouse as a Source

In the Sources tab, click on the “Add source” button located on the top right of your screen. Then, select the ClickHouse option from the list of connectors. Click Next and you’ll be prompted to add your database access.

1. Add database access

You’ll need the following connection details to connect to your ClickHouse database:
  • Host: The host address of your ClickHouse database. Do not include the protocol (http:// or https://).
    • For ClickHouse Cloud: your-instance.region.provider.clickhouse.cloud
    • For self-hosted: Your server hostname or IP address
  • Port: The port used for connecting to your ClickHouse database.
    • Default for HTTPS: 8443
    • Default for HTTP: 8123
    • Default for Native protocol: 9440 (secure) or 9000 (insecure)
  • Database: The name of the database you want to extract data from. Default is default.
  • Username: The username for accessing your ClickHouse database. Default is default.
  • Password: The password for the specified user.
  • Batch Size: The number of rows to fetch per batch during extraction. Default is 50000. Adjust based on your table row sizes and memory constraints.
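As a rough illustration of how these fields fit together, the sketch below normalizes a host value and fills in the default ports listed above. The names `ClickHouseSettings`, `normalize_host`, and `build_settings` are hypothetical helpers for this example and are not part of the connector.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# Default ports from the list above, keyed by (interface, secure).
DEFAULT_PORTS = {
    ("http", True): 8443,
    ("http", False): 8123,
    ("native", True): 9440,
    ("native", False): 9000,
}


@dataclass
class ClickHouseSettings:
    """Hypothetical container mirroring the connection form fields."""
    host: str
    port: int
    database: str = "default"
    username: str = "default"
    password: str = ""
    batch_size: int = 50000


def normalize_host(raw: str) -> str:
    """Strip an accidental protocol prefix, port, or path from a host value."""
    raw = raw.strip()
    # urlparse only recognizes the host when a "//" separator is present.
    parsed = urlparse(raw if "://" in raw else f"//{raw}")
    return parsed.hostname or raw


def build_settings(host: str, secure: bool = True, interface: str = "http",
                   **overrides) -> ClickHouseSettings:
    """Combine a cleaned host with the default port unless one is given."""
    port = overrides.pop("port", DEFAULT_PORTS[(interface, secure)])
    return ClickHouseSettings(host=normalize_host(host), port=port, **overrides)


settings = build_settings("https://abc123.us-east1.gcp.clickhouse.cloud")
print(settings.host, settings.port)
```

Stripping the protocol in code mirrors the rule that the Host field must not include `http://` or `https://`.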
If you’re using ClickHouse Cloud:
  1. Log in to your ClickHouse Cloud Console
  2. Select your service from the dashboard
  3. Click on Connect in the left sidebar
  4. Choose HTTPS as the connection method
  5. Copy the following details:
    • Host: The hostname shown (e.g., abc123.us-east1.gcp.clickhouse.cloud)
    • Port: Usually 8443 for HTTPS
    • Username: Your configured username (default is default)
    • Password: The password you set when creating the service
Make sure your ClickHouse Cloud service allows connections from Nekt’s IP addresses. You may need to configure the IP Access List in your service settings.
Once you’re done, click Next.

2. Select streams

Choose which data streams (tables/views) you want to sync. You can select entire databases or pick specific tables. The connector will automatically discover all available tables and views in the specified database. System tables are excluded by default.
Tip: Type a stream's name to find it faster.
Select the streams and click Next.

3. Configure data streams

Customize how you want your data to appear in your catalog. Select a name for each table (which will contain the fetched data) and the type of sync.
  • Table name: We suggest a name, but feel free to customize it. You have the option to add a prefix and make this process faster!
  • Sync Type: You can choose between INCREMENTAL and FULL_TABLE.
    • Incremental: Every time the extraction happens, we’ll get only the new data since the last sync. This is efficient for large tables with a reliable timestamp or incrementing ID column.
    • Full table: Every time the extraction happens, we’ll get the current state of the data. This is useful for dimension tables or when you need complete accuracy.
For incremental syncs, you’ll need to select a Replication Key - a column that indicates when a row was created or modified (e.g., created_at, updated_at, or an auto-incrementing id).
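To make the difference between the two sync types concrete, here is a minimal sketch of the kind of query each one implies. It is an illustration under stated assumptions, not the connector's actual implementation: `build_extract_query` and its helpers are hypothetical, and the strict `>` comparison is one possible choice (a real connector might use `>=` to avoid missing rows that share the bookmark value).

```python
import re
from typing import Optional, Union

_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def quote_ident(name: str) -> str:
    """Backtick-quote a simple identifier; reject anything unusual."""
    if not _IDENT.fullmatch(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return f"`{name}`"


def render_literal(value: Union[int, str]) -> str:
    """Render a bookmark as a SQL literal (integers bare, strings quoted)."""
    if isinstance(value, int):
        return str(value)
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def build_extract_query(table: str, sync_type: str,
                        replication_key: Optional[str] = None,
                        bookmark: Union[int, str, None] = None) -> str:
    query = f"SELECT * FROM {quote_ident(table)}"
    if sync_type == "FULL_TABLE":
        # Full table: read the current state of every row.
        return query
    if sync_type != "INCREMENTAL":
        raise ValueError(f"unknown sync type: {sync_type!r}")
    if not replication_key:
        raise ValueError("incremental syncs need a replication key")
    key = quote_ident(replication_key)
    if bookmark is not None:
        # Incremental: only rows beyond the last saved bookmark.
        query += f" WHERE {key} > {render_literal(bookmark)}"
    # Ordering by the key lets the bookmark advance monotonically.
    return f"{query} ORDER BY {key}"


print(build_extract_query("events", "INCREMENTAL", "updated_at",
                          "2024-01-01 00:00:00"))
```

On the first incremental run there is no bookmark yet, so the sketch reads the whole table and only later runs add the `WHERE` filter.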
Once you are done configuring, click Next.

4. Configure data source

Describe your data source in 140 characters or fewer so it is easy to identify within your organization. To define your Trigger, consider how often you want data to be extracted from this source. This usually depends on how frequently your ClickHouse data is updated and how current you need your analytics to be. Optionally, you can schedule a periodic full sync. It complements the incremental extractions by bringing your data fully back in line with the source at regular intervals. Once you are ready, click Next to finalize the setup.

5. Check your new source

You can view your new source on the Sources page. If needed, manually trigger the source extraction by clicking on the arrow button. Once executed, your data will appear in your Catalog.
Your data appears in the Catalog only after at least one successful source run.

Implementation Notes

By default, the connector uses secure HTTPS connections to your ClickHouse database, so your data is encrypted in transit.
  • ClickHouse Cloud: Always uses HTTPS on port 8443.
  • Self-hosted: Make sure your ClickHouse server is configured with SSL/TLS certificates for secure connections.
ClickHouse is optimized for reading large amounts of data quickly. To get the best performance:
  • Batch Size: The default batch size of 50,000 rows works well for most use cases. If you have very wide tables (many columns), consider reducing this value.
  • Incremental Syncs: Use incremental syncs whenever possible. ClickHouse excels at filtering data by sorted columns (typically date-based).
  • Replication Key Selection: Choose a column that ClickHouse can filter on efficiently, such as updated_at or another column that changes whenever a record is modified.
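The batch size trade-off can be sketched as follows. This is a simplified illustration, not the connector's code: `iter_batches` and `suggest_batch_size` are hypothetical helpers, and the memory budget and average row size are values you would have to estimate for your own tables.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_batches(rows: Iterable[T], batch_size: int = 50000) -> Iterator[List[T]]:
    """Yield rows in lists of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def suggest_batch_size(avg_row_bytes: int, memory_budget_bytes: int,
                       default: int = 50000) -> int:
    """Cap the default batch size so one batch fits the memory budget."""
    return max(1, min(default, memory_budget_bytes // avg_row_bytes))


# A wide table with ~2 KB rows and a 64 MiB budget gets a smaller batch.
print(suggest_batch_size(2000, 64 * 1024 * 1024))
print([len(b) for b in iter_batches(range(120000), 50000)])
```

Only one batch is held in memory at a time, which is why wider rows call for a smaller batch size.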
ClickHouse DateTime columns don't accept ISO format strings with timezone suffixes in comparisons. The connector handles this automatically by converting bookmark values to a ClickHouse-compatible format (YYYY-MM-DD HH:MM:SS), so you don't need to handle timezone formatting yourself.
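A conversion of this kind might look like the sketch below. It assumes that timezone-aware bookmarks are normalized to UTC before the offset is dropped; the source does not state which timezone the connector uses, so treat that step, and the function name `to_clickhouse_datetime`, as assumptions.

```python
from datetime import datetime, timezone


def to_clickhouse_datetime(bookmark: str) -> str:
    """Convert an ISO 8601 timestamp to 'YYYY-MM-DD HH:MM:SS'."""
    text = bookmark.strip()
    # Older Python versions do not parse a trailing "Z", so spell it out.
    if text.endswith(("Z", "z")):
        text = text[:-1] + "+00:00"
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is not None:
        # Assumption: normalize to UTC, then drop the offset.
        parsed = parsed.astimezone(timezone.utc).replace(tzinfo=None)
    # Fractional seconds are discarded by this format string.
    return parsed.strftime("%Y-%m-%d %H:%M:%S")


print(to_clickhouse_datetime("2024-03-01T12:30:45+02:00"))
```

Values that are already in the target format pass through unchanged.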
The following system databases are automatically excluded from discovery:
  • system
  • INFORMATION_SCHEMA
  • information_schema
Only user-created databases and tables will be available for extraction.
ClickHouse data types are mapped to standard types for compatibility:
ClickHouse Type                   Output Type
Int8, Int16, Int32, Int64         Integer
UInt8, UInt16, UInt32, UInt64     Integer
Float32, Float64                  Number
Decimal                           Number
String, FixedString               String
Date, Date32                      Date
DateTime, DateTime64              DateTime
Bool                              Boolean
UUID                              String
Array                             Array
Nullable(T)                       Nullable of T
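The mapping above can be expressed as a small lookup, sketched here for illustration. `map_clickhouse_type` is a hypothetical helper rather than the connector's code; it ignores type parameters such as the precision in `DateTime64(3)` and raises an error for types that are not in the table, since the source does not say how those are handled.

```python
import re

# Base type names from the table above and their output types.
BASE_TYPE_MAP = {
    **dict.fromkeys(["Int8", "Int16", "Int32", "Int64"], "Integer"),
    **dict.fromkeys(["UInt8", "UInt16", "UInt32", "UInt64"], "Integer"),
    **dict.fromkeys(["Float32", "Float64", "Decimal"], "Number"),
    **dict.fromkeys(["String", "FixedString", "UUID"], "String"),
    **dict.fromkeys(["Date", "Date32"], "Date"),
    **dict.fromkeys(["DateTime", "DateTime64"], "DateTime"),
    "Bool": "Boolean",
    "Array": "Array",
}


def map_clickhouse_type(ch_type: str) -> str:
    """Map a ClickHouse type string to the output type from the table."""
    ch_type = ch_type.strip()
    nullable = re.fullmatch(r"Nullable\((.+)\)", ch_type)
    if nullable:
        # Nullable(T) wraps the mapping of its inner type.
        return f"Nullable of {map_clickhouse_type(nullable.group(1))}"
    # Drop parameters, e.g. "Decimal(10, 2)" -> "Decimal".
    base = ch_type.split("(", 1)[0]
    if base not in BASE_TYPE_MAP:
        raise ValueError(f"type not covered by the table: {ch_type!r}")
    return BASE_TYPE_MAP[base]


for name in ["UInt64", "Nullable(DateTime64(3))", "Decimal(10, 2)"]:
    print(name, "->", map_clickhouse_type(name))
```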