
Configuring Nekt as a Source
In the Sources tab, click the “Add source” button at the top right of your screen. Then select the Nekt option from the list of connectors. Click Next and you’ll be prompted to add your access.

1. Add account access
You’ll need the following credentials from your Nekt account:
- API Key: Your personal API key for accessing the Nekt API. You can generate one in your Nekt workspace settings under the “API Keys” section. Make sure the key has the necessary permissions to read the data you want to sync.
- Start Date: The earliest record date to sync.
2. Select streams
Choose which data streams you want to sync - you can select all streams or pick the specific ones that matter most to you. Tip: you can find a stream more easily by typing its name. Select the streams and click Next.
3. Configure data streams
Customize how you want your data to appear in your catalog. Select a name for each table (which will contain the fetched data) and the type of sync.
- Table name: we suggest a name, but feel free to customize it. You also have the option to add a prefix to make this process faster!
- Sync type: you can choose between INCREMENTAL and FULL_TABLE (see the sketch after this list).
  - Incremental: every time the extraction happens, we’ll get only the new data - which is good if, for example, you want to keep every record ever fetched.
  - Full table: every time the extraction happens, we’ll get the current state of the data - which is good if, for example, you don’t want to keep deleted data in your catalog.
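To illustrate the difference: an incremental sync keeps every version of every record, so you can always recover the "current state" view that a full-table sync would give you by deduplicating on the primary key. A minimal sketch you could run on Explorer, with illustrative table and column names:

```sql
-- Collapse an incrementally synced table to its current state by
-- keeping only the latest version of each record.
-- "my_table", "id", and "updated_at" are illustrative names.
SELECT *
FROM (
    SELECT
        t.*,
        ROW_NUMBER() OVER (
            PARTITION BY id           -- the record's primary key
            ORDER BY updated_at DESC  -- most recent version first
        ) AS rn
    FROM my_table t
) versions
WHERE rn = 1
```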
4. Configure data source
Describe your data source for easy identification within your organization, using no more than 140 characters. To define your Trigger, consider how often you want data to be extracted from this source. This decision usually depends on how frequently you need the new table data updated (every day, once a week, or only at specific times). Optionally, you can determine when to execute a full sync. This complements the incremental data extractions, ensuring that your data is completely synchronized with your source every once in a while. Once you are ready, click Next to finalize the setup.

5. Check your new source
You can view your new source on the Sources page. If needed, trigger the extraction manually by clicking the arrow button. Once it has executed successfully, your data will appear in your Catalog - at least one successful source run is required before the data shows up there.
Streams and Fields
Below you’ll find all available data streams from Nekt and their corresponding fields.

Sources
Stream containing all data sources configured in your Nekt workspace.

Key Fields:
- id: Unique identifier for the source
- slug: URL-friendly identifier for the source
- active: Whether the source is active
- status: Current status of the source
- description: Description of the source
- table_prefix: Prefix applied to tables created by this source
- created_at: When the source was created
- updated_at: When the source was last updated
- connector_config: Configuration settings for the connector (JSON string)
- connector_version: Version of the connector being used
- last_run: Timestamp of the last execution
- python_execution_cpu: CPU allocation for Python execution
- python_execution_memory: Memory allocation for Python execution
- spark_driver_cores: Number of cores for the Spark driver
- spark_driver_memory: Memory allocation for the Spark driver
- spark_executor_cores: Number of cores per Spark executor
- spark_executor_memory: Memory allocation per Spark executor
- spark_executor_instances: Number of Spark executor instances
- spark_executor_disk: Disk space allocation per Spark executor
- spark_execution_timeout_minutes: Timeout for Spark execution, in minutes
- settings_number_of_retries: Number of retry attempts on failure
- settings_retry_delay_seconds: Delay between retry attempts, in seconds
- settings_max_consecutive_failures: Maximum consecutive failures before pausing
- settings_full_sync_cron: Cron expression for the full sync schedule
- settings_full_sync_cron_timezone: Timezone for the full sync schedule
- output_layer: Target layer information (id, name, slug, description, database_name)
- created_by: User who created the source (id, name, email, role, permissions)
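Once the stream has synced, you can sanity-check it directly in Explorer. A minimal sketch, assuming the stream landed as a table named sources (adjust for any table prefix you configured):

```sql
-- List active sources whose last execution is more than a day old.
-- Assumes the stream landed as a table named "sources"; adjust the
-- name for any table prefix you configured.
SELECT
    slug,
    status,
    connector_version,
    last_run
FROM sources
WHERE active = TRUE
  AND last_run < CURRENT_TIMESTAMP - INTERVAL '1' DAY
ORDER BY last_run ASC
```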
Runs

Stream containing all pipeline execution runs in your Nekt workspace.

Key Fields:
- id: Unique identifier for the run
- number: Sequential run number
- created_at: When the run was created
- updated_at: When the run was last updated
- started_at: When the run execution started
- ended_at: When the run execution ended
- duration_seconds: Total execution time in seconds
- status: Current status of the run (e.g., running, completed, failed)
- full_sync: Whether this was a full synchronization run
- credit_charge_status: Status of credit charging for this run
- triggered_by_token: Token used to trigger the run (if applicable)
- trigger: Trigger configuration (type, cron_expression, cron_timezone, event_rule)
- triggered_by: User who triggered the run
- source: Source information (id, slug, active, status, description)
- destination: Destination information (id, slug, active, status, description)
- transformation: Transformation information (id, slug, type, active, status, description)
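For a quick look at recent failures, a similar sketch over this stream (again assuming the table is named runs, and that status uses the documented 'failed' value):

```sql
-- Runs that failed in the last 7 days, most recent first.
-- Assumes a table named "runs" and the status value 'failed'
-- from the documented examples.
SELECT
    id,
    number,
    started_at,
    duration_seconds
FROM runs
WHERE status = 'failed'
  AND created_at >= CURRENT_TIMESTAMP - INTERVAL '7' DAY
ORDER BY created_at DESC
```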
Destinations

Stream containing all data destinations configured in your Nekt workspace.

Key Fields:
- id: Unique identifier for the destination
- slug: URL-friendly identifier for the destination
- active: Whether the destination is active
- status: Current status of the destination
- description: Description of the destination
- is_ml_ai_connector: Whether this is an ML/AI connector
- connector_config: Configuration settings for the connector (JSON string)
- connector_version: Version of the connector being used
- created_at: When the destination was created
- updated_at: When the destination was last updated
- last_run: Timestamp of the last execution
- python_execution_cpu: CPU allocation for Python execution
- python_execution_memory: Memory allocation for Python execution
- settings_number_of_retries: Number of retry attempts on failure
- settings_retry_delay_seconds: Delay between retry attempts, in seconds
- settings_max_consecutive_failures: Maximum consecutive failures before pausing
- input_tables: Array of input table configurations (table, primary_keys, fields, name, layer)
- created_by: User who created the destination
Transformations

Stream containing all data transformations configured in your Nekt workspace.

Key Fields:
- id: Unique identifier for the transformation
- slug: URL-friendly identifier for the transformation
- type: Type of transformation (e.g., pyspark, sql)
- active: Whether the transformation is active
- status: Current status of the transformation
- description: Description of the transformation
- code: Transformation code/script
- created_at: When the transformation was created
- updated_at: When the transformation was last updated
- last_run: Timestamp of the last execution
- dependencies: Array of Python dependencies
- add_apache_sedona: Whether Apache Sedona is enabled
- spark_driver_cores: Number of cores for the Spark driver
- spark_driver_memory: Memory allocation for the Spark driver
- spark_executor_cores: Number of cores per Spark executor
- spark_executor_memory: Memory allocation per Spark executor
- spark_executor_instances: Number of Spark executor instances
- spark_executor_disk: Disk space allocation per Spark executor
- spark_execution_timeout_minutes: Timeout for Spark execution, in minutes
- settings_number_of_retries: Number of retry attempts on failure
- settings_retry_delay_seconds: Delay between retry attempts, in seconds
- settings_max_consecutive_failures: Maximum consecutive failures before pausing
- settings_timezone: Timezone for transformation execution
- delta_log_retention_duration: Delta log retention period
- input_tables: Array of input tables (id, name_reference, table, timestamps)
- output_tables: Array of output tables (id, name_reference, table, timestamps)
- created_by: User who created the transformation
- input_volumes: Array of input volume references
Activities

Stream containing all audit activities and changes in your Nekt workspace.

Key Fields:
- id: Unique identifier for the activity
- activity_type: Type of activity performed
- changed_fields: Fields that were changed (JSON string)
- created_at: When the activity occurred
- created_by_system: Whether the activity was performed by the system
- pipeline_automatically_paused_after_x_failures: Auto-pause threshold
- source: Related source ID (if applicable)
- destination: Related destination ID (if applicable)
- transformation: Related transformation ID (if applicable)
- table: Related table ID (if applicable)
- volume: Related volume ID (if applicable)
- run: Related run ID (if applicable)
- visualization: Related visualization ID (if applicable)
- created_by: User who performed the activity
Source Triggers

Stream containing trigger configurations for data sources.

Key Fields:
- id: Unique identifier for the trigger
- type: Type of trigger (e.g., cron, event)
- created_at: When the trigger was created
- updated_at: When the trigger was last updated
- cron_expression: Cron expression for scheduled triggers
- cron_timezone: Timezone for cron execution
- events: Array of events that trigger execution
- event_rule: Rule for event-based triggers
Destination Triggers

Stream containing trigger configurations for data destinations.

Key Fields:
- id: Unique identifier for the trigger
- type: Type of trigger (e.g., cron, event)
- created_at: When the trigger was created
- updated_at: When the trigger was last updated
- cron_expression: Cron expression for scheduled triggers
- cron_timezone: Timezone for cron execution
- events: Array of events that trigger execution
- event_rule: Rule for event-based triggers
Data Model
The following diagram illustrates the relationships between the core data streams in Nekt. The arrows indicate the join keys that link the different entities, providing a clear overview of the data platform structure.
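For example, each run references the entity that produced it, so the runs stream can be joined back to the sources stream on the documented id. A minimal sketch (the referenced entities are nested objects, so the exact field-access syntax may depend on how Explorer flattens them):

```sql
-- Join runs back to their source via the join key
-- runs.source.id -> sources.id. Nested-field access syntax may
-- vary depending on how Explorer flattens the structs.
SELECT
    s.slug AS source_slug,
    r.status,
    r.started_at,
    r.duration_seconds
FROM runs r
JOIN sources s
  ON r.source.id = s.id
ORDER BY r.started_at DESC
```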
Use Cases for Data Analysis

Here are some valuable business intelligence use cases for consolidated Nekt platform data, along with ready-to-use SQL queries that you can run on Explorer.

1. Daily Credits Consumption per Pipeline
Monitor credit consumption for each pipeline on a daily basis.

Business Value:
- Track pipeline credit consumption
- Analyze resource utilization and make future projections
- Identify pipelines that can be optimized
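The sketch below approximates this analysis using documented fields from the runs stream. The streams above expose credit_charge_status but not a per-run credit amount, so run counts and total duration are used as a proxy; the nested slugs identify the pipeline, and the field-access syntax is an assumption:

```sql
-- Approximate daily consumption per pipeline from the runs stream.
-- No per-run credit amount is documented, so run counts and duration
-- serve as a proxy; nested-field access syntax may vary.
SELECT
    DATE_TRUNC('day', created_at) AS run_day,
    COALESCE(source.slug, destination.slug, transformation.slug) AS pipeline,
    COUNT(*)                      AS runs,
    SUM(duration_seconds) / 60.0  AS total_minutes
FROM runs
WHERE credit_charge_status IS NOT NULL
GROUP BY 1, 2
ORDER BY run_day DESC, total_minutes DESC
```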
2. Data Source Performance and Reliability Analysis
Monitor data pipeline performance, execution times, and failure rates across your data sources. Business Value:- Track pipeline execution trends and identify performance bottlenecks
- Monitor data pipeline reliability and uptime
- Analyze resource utilization and optimize infrastructure costs
- Identify sources that need attention
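Again a sketch rather than the exact query, built on documented runs and sources fields; the 'failed' status value follows the documented example:

```sql
-- Per-source reliability over the last 30 days: run volume,
-- average duration, and failure rate. The 'failed' status value
-- follows the documented example values.
SELECT
    s.slug                  AS source_slug,
    COUNT(*)                AS total_runs,
    AVG(r.duration_seconds) AS avg_duration_seconds,
    SUM(CASE WHEN r.status = 'failed' THEN 1 ELSE 0 END) * 100.0
        / COUNT(*)          AS failure_rate_pct
FROM runs r
JOIN sources s ON r.source.id = s.id
WHERE r.created_at >= CURRENT_TIMESTAMP - INTERVAL '30' DAY
GROUP BY s.slug
ORDER BY failure_rate_pct DESC
```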
3. User Activity and Platform Usage Analysis
Track user engagement, platform adoption, and operational activities across your Nekt workspace. Business Value:- Monitor platform adoption and user engagement
- Identify power users and training needs
- Track operational patterns and peak usage times
- Optimize team collaboration and workflow efficiency
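A sketch over the activities stream, excluding system-generated events; it assumes created_by exposes the user's name as a nested field:

```sql
-- Activity volume per user and activity type over the last 30 days,
-- excluding system-generated events. Assumes created_by exposes a
-- nested name field; adjust if Explorer flattens it differently.
SELECT
    created_by.name AS user_name,
    activity_type,
    COUNT(*)        AS actions
FROM activities
WHERE created_by_system = FALSE
  AND created_at >= CURRENT_TIMESTAMP - INTERVAL '30' DAY
GROUP BY created_by.name, activity_type
ORDER BY actions DESC
```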