Nekt is a modern data platform that provides comprehensive data engineering solutions for companies. It enables organizations to extract, transform, and load data from various external sources through custom connectors, manage data pipelines, and create powerful analytics workflows.

Configuring Nekt as a Source

In the Sources tab, click the “Add source” button at the top right of your screen. Then select the Nekt option from the list of connectors. Click Next and you’ll be prompted to add your access credentials.

1. Add account access

You’ll need the following credentials from your Nekt account:
  • API Key: Your personal API key for accessing the Nekt API. You can generate one in your Nekt workspace settings under the “API Keys” section. Make sure the key has the necessary permissions to read the data you want to sync.
  • Start Date: The earliest record date to sync.
Once you have the required credentials, add the account access and click Next.

2. Select streams

Choose which data streams you want to sync - you can select all streams or pick specific ones that matter most to you.
Tip: You can find a stream more easily by typing its name.
Select the streams and click Next.

3. Configure data streams

Customize how you want your data to appear in your catalog. Select a name for each table (which will contain the fetched data) and the type of sync.
  • Table name: we suggest a name, but feel free to customize it. You have the option to add a prefix and make this process faster!
  • Sync Type: you can choose between INCREMENTAL and FULL_TABLE.
    • Incremental: each extraction fetches only the new data since the last run - useful if, for example, you want to keep every record ever fetched (see the deduplication example after this step).
    • Full table: each extraction fetches the current state of the data - useful if, for example, you don’t want deleted records lingering in your catalog.
Once you are done configuring, click Next.
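
If a stream is set to Incremental and records can change at the source, the same id may appear more than once in the resulting table. Below is a minimal sketch of how you could keep only the latest version of each record in Explorer; it assumes the Sources stream landed as "nekt_raw"."nekt_sources" (the naming used in the use cases further down) and that updated_at marks the most recent version, so adjust both to your own table names.

-- keep only the most recent version of each record in an incrementally synced table
SELECT *
FROM (
	SELECT
		s.*,
		ROW_NUMBER() OVER (PARTITION BY s.id ORDER BY s.updated_at DESC) AS row_rank
	FROM "nekt_raw"."nekt_sources" s
) latest
WHERE row_rank = 1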

4. Configure data source

Describe your data source for easy identification within your organization (up to 140 characters). To define your Trigger, consider how often you want data to be extracted from this source. This usually depends on how frequently you need the table data refreshed (every day, once a week, or only at specific times). Optionally, you can schedule a periodic full sync to complement the incremental extractions, ensuring your data is fully resynchronized with the source from time to time. Once you are ready, click Next to finalize the setup.
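
The trigger and full-sync schedule you define here also become visible in the Sources stream described below (settings_full_sync_cron and settings_full_sync_cron_timezone). As a sketch, assuming the table name "nekt_raw"."nekt_sources" used in the use-case queries, you could later review the schedules of your active sources with:

-- review full-sync schedules configured for active sources
SELECT
	slug,
	settings_full_sync_cron,
	settings_full_sync_cron_timezone
FROM "nekt_raw"."nekt_sources"
WHERE active = true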

5. Check your new source

You can view your new source on the Sources page. If needed, manually trigger the source extraction by clicking on the arrow button. Once executed, your data will appear in your Catalog.
Note that data appears in your Catalog only after at least one successful source run.
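
As a quick sanity check after the first successful run, you can count the rows that landed in each table from Explorer. A minimal sketch, assuming the default nekt_raw schema and the table names used in the use cases below:

-- confirm that data landed after the first successful run
SELECT 'nekt_sources' AS stream, count(*) AS rows_synced FROM "nekt_raw"."nekt_sources"
UNION ALL
SELECT 'nekt_runs', count(*) FROM "nekt_raw"."nekt_runs"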

Streams and Fields

Below you’ll find all available data streams from Nekt and their corresponding fields:

Sources
Stream containing all data sources configured in your Nekt workspace.
Key Fields:
  • id - Unique identifier for the source
  • slug - URL-friendly identifier for the source
  • active - Whether the source is active
  • status - Current status of the source
  • description - Description of the source
  • table_prefix - Prefix applied to tables created by this source
  • created_at - When the source was created
  • updated_at - When the source was last updated
  • connector_config - Configuration settings for the connector (JSON string)
  • connector_version - Version of the connector being used
  • last_run - Timestamp of the last execution
Resource Configuration:
  • python_execution_cpu - CPU allocation for Python execution
  • python_execution_memory - Memory allocation for Python execution
  • spark_driver_cores - Number of cores for Spark driver
  • spark_driver_memory - Memory allocation for Spark driver
  • spark_executor_cores - Number of cores per Spark executor
  • spark_executor_memory - Memory allocation per Spark executor
  • spark_executor_instances - Number of Spark executor instances
  • spark_executor_disk - Disk space allocation per Spark executor
  • spark_execution_timeout_minutes - Timeout for Spark execution
Settings:
  • settings_number_of_retries - Number of retry attempts on failure
  • settings_retry_delay_seconds - Delay between retry attempts
  • settings_max_consecutive_failures - Maximum consecutive failures before pausing
  • settings_full_sync_cron - Cron expression for full sync schedule
  • settings_full_sync_cron_timezone - Timezone for full sync schedule
Nested Objects:
  • output_layer - Target layer information (id, name, slug, description, database_name)
  • created_by - User who created the source (id, name, email, role, permissions)
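
For example, to list active sources together with their connector version and last execution (a sketch assuming the stream lands as "nekt_raw"."nekt_sources", the table name used in the use cases below):

-- active sources and when they last ran
SELECT
	slug,
	status,
	connector_version,
	last_run
FROM "nekt_raw"."nekt_sources"
WHERE active = true
ORDER BY last_run DESC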

Runs
Stream containing all pipeline execution runs in your Nekt workspace.
Key Fields:
  • id - Unique identifier for the run
  • number - Sequential run number
  • created_at - When the run was created
  • updated_at - When the run was last updated
  • started_at - When the run execution started
  • ended_at - When the run execution ended
  • duration_seconds - Total execution time in seconds
  • status - Current status of the run (e.g., running, completed, failed)
  • full_sync - Whether this was a full synchronization run
  • credit_charge_status - Status of credit charging for this run
  • triggered_by_token - Token used to trigger the run (if applicable)
Nested Objects:
  • trigger - Trigger configuration (type, cron_expression, cron_timezone, event_rule)
  • triggered_by - User who triggered the run
  • source - Source information (id, slug, active, status, description)
  • destination - Destination information (id, slug, active, status, description)
  • transformation - Transformation information (id, slug, type, active, status, description)
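
For example, to surface the most recent failed runs and the pipeline they belong to (a sketch assuming the stream lands as "nekt_raw"."nekt_runs" and that failed runs carry the status value 'failed', as in use case 2):

-- latest failed runs and the pipeline (source, transformation, or destination) they belong to
SELECT
	r.id,
	r.started_at,
	r.duration_seconds,
	r.full_sync,
	coalesce(r.source.slug, r.transformation.slug, r.destination.slug) AS pipeline
FROM "nekt_raw"."nekt_runs" r
WHERE r.status = 'failed'
ORDER BY r.started_at DESC
LIMIT 50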

Destinations
Stream containing all data destinations configured in your Nekt workspace.
Key Fields:
  • id - Unique identifier for the destination
  • slug - URL-friendly identifier for the destination
  • active - Whether the destination is active
  • status - Current status of the destination
  • description - Description of the destination
  • is_ml_ai_connector - Whether this is an ML/AI connector
  • connector_config - Configuration settings for the connector (JSON string)
  • connector_version - Version of the connector being used
  • created_at - When the destination was created
  • updated_at - When the destination was last updated
  • last_run - Timestamp of the last execution
Resource Configuration:
  • python_execution_cpu - CPU allocation for Python execution
  • python_execution_memory - Memory allocation for Python execution
Settings:
  • settings_number_of_retries - Number of retry attempts on failure
  • settings_retry_delay_seconds - Delay between retry attempts
  • settings_max_consecutive_failures - Maximum consecutive failures before pausing
Nested Objects:
  • input_tables - Array of input table configurations (table, primary_keys, fields, name, layer)
  • created_by - User who created the destination
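
For example, to review the retry behavior of your active destinations (a sketch assuming the stream lands as "nekt_raw"."nekt_destinations", as in use case 1):

-- retry settings and last execution of active destinations
SELECT
	slug,
	status,
	is_ml_ai_connector,
	settings_number_of_retries,
	settings_max_consecutive_failures,
	last_run
FROM "nekt_raw"."nekt_destinations"
WHERE active = true
ORDER BY last_run DESC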

Transformations
Stream containing all data transformations configured in your Nekt workspace.
Key Fields:
  • id - Unique identifier for the transformation
  • slug - URL-friendly identifier for the transformation
  • type - Type of transformation (e.g., pyspark, sql)
  • active - Whether the transformation is active
  • status - Current status of the transformation
  • description - Description of the transformation
  • code - Transformation code/script
  • created_at - When the transformation was created
  • updated_at - When the transformation was last updated
  • last_run - Timestamp of the last execution
  • dependencies - Array of Python dependencies
  • add_apache_sedona - Whether Apache Sedona is enabled
Resource Configuration:
  • spark_driver_cores - Number of cores for Spark driver
  • spark_driver_memory - Memory allocation for Spark driver
  • spark_executor_cores - Number of cores per Spark executor
  • spark_executor_memory - Memory allocation per Spark executor
  • spark_executor_instances - Number of Spark executor instances
  • spark_executor_disk - Disk space allocation per Spark executor
  • spark_execution_timeout_minutes - Timeout for Spark execution
Settings:
  • settings_number_of_retries - Number of retry attempts on failure
  • settings_retry_delay_seconds - Delay between retry attempts
  • settings_max_consecutive_failures - Maximum consecutive failures before pausing
  • settings_timezone - Timezone for transformation execution
  • delta_log_retention_duration - Delta log retention period
Nested Objects:
  • input_tables - Array of input tables (id, name_reference, table, timestamps)
  • output_tables - Array of output tables (id, name_reference, table, timestamps)
  • created_by - User who created the transformation
  • input_volumes - Array of input volume references
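
For example, to take stock of transformations by type and their average Spark sizing (a sketch assuming the stream lands as "nekt_raw"."nekt_transformations", as in use case 1):

-- transformation counts and average executor allocation per type
SELECT
	type,
	count(*) AS total_transformations,
	sum(CASE WHEN active THEN 1 ELSE 0 END) AS active_transformations,
	avg(spark_executor_instances) AS avg_executor_instances
FROM "nekt_raw"."nekt_transformations"
GROUP BY type
ORDER BY total_transformations DESC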

Activities
Stream containing all audit activities and changes in your Nekt workspace.
Key Fields:
  • id - Unique identifier for the activity
  • activity_type - Type of activity performed
  • changed_fields - Fields that were changed (JSON string)
  • created_at - When the activity occurred
  • created_by_system - Whether the activity was performed by the system
  • pipeline_automatically_paused_after_x_failures - Auto-pause threshold
Entity References:
  • source - Related source ID (if applicable)
  • destination - Related destination ID (if applicable)
  • transformation - Related transformation ID (if applicable)
  • table - Related table ID (if applicable)
  • volume - Related volume ID (if applicable)
  • run - Related run ID (if applicable)
  • visualization - Related visualization ID (if applicable)
Nested Objects:
  • created_by - User who performed the activity
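
For example, to see which kinds of changes happened most often over the last week, and how many people made them (a sketch assuming the stream lands as "nekt_raw"."nekt_activities", as in use case 3):

-- most frequent human activity types over the last 7 days
SELECT
	a.activity_type,
	count(*) AS occurrences,
	count(DISTINCT a.created_by.email) AS distinct_users
FROM "nekt_raw"."nekt_activities" a
WHERE a.created_at >= CURRENT_DATE - INTERVAL '7' DAY
	AND a.created_by_system = false
GROUP BY a.activity_type
ORDER BY occurrences DESC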

Source Triggers
Stream containing trigger configurations for data sources.
Key Fields:
  • id - Unique identifier for the trigger
  • type - Type of trigger (e.g., cron, event)
  • created_at - When the trigger was created
  • updated_at - When the trigger was last updated
  • cron_expression - Cron expression for scheduled triggers
  • cron_timezone - Timezone for cron execution
  • events - Array of events that trigger execution
  • event_rule - Rule for event-based triggers

Destination Triggers
Stream containing trigger configurations for data destinations.
Key Fields:
  • id - Unique identifier for the trigger
  • type - Type of trigger (e.g., cron, event)
  • created_at - When the trigger was created
  • updated_at - When the trigger was last updated
  • cron_expression - Cron expression for scheduled triggers
  • cron_timezone - Timezone for cron execution
  • events - Array of events that trigger execution
  • event_rule - Rule for event-based triggers
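
The two trigger streams make it easy to audit schedules in one place. A sketch of such a query; note that the table name nekt_raw.nekt_source_triggers below is hypothetical, so replace it with the table name you chose when configuring the stream:

-- cron-based source triggers, most recently updated first
-- (table name is hypothetical; use the name configured during stream setup)
SELECT
	id,
	type,
	cron_expression,
	cron_timezone,
	updated_at
FROM "nekt_raw"."nekt_source_triggers"
WHERE type = 'cron'
ORDER BY updated_at DESC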

Data Model

The following diagram illustrates the relationships between the core data streams in Nekt. The arrows indicate the join keys that link the different entities, providing a clear overview of the data platform structure.

Use Cases for Data Analysis

Here are some valuable business intelligence use cases when consolidating Nekt platform data, along with ready-to-use SQL queries that you can run on Explorer.

1. Daily Credits Consumption per Pipeline

Monitor credit consumption for each pipeline on a daily basis.
Business Value:
  • Track pipeline credit consumption
  • Analyze resource utilization and make future projections
  • Identify pipelines that can be optimized

SQL code

-- Runs per pipeline per day, with total duration in minutes (a proxy for credit usage).
-- COALESCE resolves whichever entity the run belongs to: source, transformation, or destination.
SELECT
	coalesce(s.slug, t.slug, d.slug) AS identifier,
	DATE(r.ended_at) AS date,
	count(r.id) AS total_runs,
	sum(r.duration_seconds) / 60 AS duration_minutes
FROM
	"nekt_raw"."nekt_runs" r
	LEFT JOIN "nekt_raw"."nekt_sources" s ON r.source.id = s.id
	LEFT JOIN "nekt_raw"."nekt_destinations" d ON r.destination.id = d.id
	LEFT JOIN "nekt_raw"."nekt_transformations" t ON r.transformation.id = t.id
WHERE
	r.credit_charge_status = 'charged'
GROUP BY
	coalesce(s.slug, t.slug, d.slug),
	DATE(r.ended_at)
ORDER BY
	DATE(r.ended_at) DESC

2. Data Source Performance and Reliability Analysis

Monitor data pipeline performance, execution times, and failure rates across your data sources.
Business Value:
  • Track pipeline execution trends and identify performance bottlenecks
  • Monitor data pipeline reliability and uptime
  • Analyze resource utilization and optimize infrastructure costs
  • Identify sources that need attention

SQL code

-- 30-day reliability and performance per source: run counts, success rate, and durations.
WITH
	-- one row per run in the last 30 days, flagged as success or failure
	run_metrics AS (
		SELECT
			r.source.slug AS source_name,
			r.status,
			r.full_sync,
			r.duration_seconds,
			r.started_at,
			r.ended_at,
			r.credit_charge_status,
			DATE_TRUNC ('day', r.started_at) AS run_date,
			CASE
				WHEN r.status = 'success' THEN 1
				ELSE 0
			END AS success_flag,
			CASE
				WHEN r.status = 'failed' THEN 1
				ELSE 0
			END AS failure_flag
		FROM
			nekt_raw.nekt_runs r
		WHERE
			r.started_at >= CURRENT_DATE - INTERVAL '30' DAY
			AND r.started_at IS NOT NULL
	),
	-- aggregate runs per source per day
	daily_summary AS (
		SELECT
			run_date,
			source_name,
			COUNT(*) AS total_runs,
			SUM(success_flag) AS successful_runs,
			SUM(failure_flag) AS failed_runs,
			AVG(duration_seconds) AS avg_duration_seconds,
			APPROX_PERCENTILE (duration_seconds, 0.5) AS median_duration_seconds,
			MAX(duration_seconds) AS max_duration_seconds,
			COUNT(
				CASE
					WHEN full_sync = TRUE THEN 1
				END
			) AS full_sync_runs
		FROM
			run_metrics
		WHERE
			source_name IS NOT NULL
		GROUP BY
			run_date,
			source_name
	),
	-- roll the daily summaries up to a single 30-day row per source
	source_reliability AS (
		SELECT
			source_name,
			SUM(total_runs) AS total_runs_30d,
			SUM(successful_runs) AS total_successful_runs,
			SUM(failed_runs) AS total_failed_runs,
			ROUND(
				CAST(SUM(successful_runs) AS DOUBLE) / NULLIF(SUM(total_runs), 0) * 100,
				2
			) AS success_rate_percentage,
			ROUND(AVG(avg_duration_seconds) / 60.0, 2) AS avg_duration_minutes,
			ROUND(AVG(median_duration_seconds) / 60.0, 2) AS median_duration_minutes,
			COUNT(DISTINCT run_date) AS active_days
		FROM
			daily_summary
		GROUP BY
			source_name
	)
SELECT
	source_name,
	total_runs_30d,
	total_successful_runs,
	total_failed_runs,
	success_rate_percentage,
	avg_duration_minutes,
	median_duration_minutes,
	active_days,
	CASE
		WHEN success_rate_percentage >= 95 THEN 'Excellent'
		WHEN success_rate_percentage >= 90 THEN 'Good'
		WHEN success_rate_percentage >= 80 THEN 'Needs Attention'
		ELSE 'Critical'
	END AS reliability_status,
	CASE
		WHEN avg_duration_minutes <= 5 THEN 'Fast'
		WHEN avg_duration_minutes <= 15 THEN 'Normal'
		WHEN avg_duration_minutes <= 60 THEN 'Slow'
		ELSE 'Very Slow'
	END AS performance_status
FROM
	source_reliability
WHERE
	total_runs_30d > 0
ORDER BY
	total_runs_30d DESC,
	success_rate_percentage DESC

3. User Activity and Platform Usage Analysis

Track user engagement, platform adoption, and operational activities across your Nekt workspace.
Business Value:
  • Monitor platform adoption and user engagement
  • Identify power users and training needs
  • Track operational patterns and peak usage times
  • Optimize team collaboration and workflow efficiency

SQL code

-- 30-day user activity summary: engagement, peak hours, and team-level patterns.
WITH
  -- human (non-system) activities from the last 30 days, tagged with the entity they touched
  user_activities AS (
    SELECT
      a.created_by.email AS user_email,
      a.created_by.first_name || ' ' || a.created_by.last_name AS user_name,
      a.created_by.role AS user_role,
      a.created_by.functional_area AS functional_area,
      a.activity_type,
      a.created_at,
      a.created_by_system,
      DATE_TRUNC('day', a.created_at) AS activity_date,
      DATE_TRUNC('hour', a.created_at) AS activity_hour,
      CASE
        WHEN a.source IS NOT NULL THEN 'source'
        WHEN a.destination IS NOT NULL THEN 'destination'
        WHEN a.transformation IS NOT NULL THEN 'transformation'
        WHEN a.visualization IS NOT NULL THEN 'visualization'
        WHEN a.run IS NOT NULL THEN 'run'
        ELSE 'other'
      END AS entity_type
    FROM nekt_raw.nekt_activities a
    WHERE a.created_at >= CURRENT_DATE - INTERVAL '30' DAY
      AND a.created_by_system = false
      AND a.created_by.email IS NOT NULL
  ),
  -- per-user totals over the 30-day window
  user_summary AS (
    SELECT
      user_email,
      user_name,
      user_role,
      functional_area,
      COUNT(*) AS total_activities,
      COUNT(DISTINCT activity_date) AS active_days,
      COUNT(DISTINCT activity_type) AS unique_activity_types,
      COUNT(DISTINCT entity_type) AS entity_types_used,
      MIN(created_at) AS first_activity,
      MAX(created_at) AS last_activity,
      -- ARBITRARY returns one value per group, not a true mode;
      -- swap in a frequency-based calculation if you need the genuinely most common values
      ARBITRARY(activity_type) AS most_common_activity,
      ARBITRARY(entity_type) AS most_used_entity_type
    FROM user_activities
    GROUP BY user_email, user_name, user_role, functional_area
  ),
  hourly_patterns AS (
    SELECT
      user_email,
      EXTRACT(hour FROM activity_hour) AS hour_of_day,
      COUNT(*) AS activities_count
    FROM user_activities
    GROUP BY user_email, EXTRACT(hour FROM activity_hour)
  ),
  peak_hours AS (
    SELECT
      user_email,
      hour_of_day,
      activities_count,
      ROW_NUMBER() OVER (PARTITION BY user_email ORDER BY activities_count DESC) AS hour_rank
    FROM hourly_patterns
  ),
  -- aggregate user summaries per functional area
  team_collaboration AS (
    SELECT
      functional_area,
      COUNT(DISTINCT user_email) AS team_size,
      SUM(total_activities) AS team_total_activities,
      AVG(total_activities) AS avg_activities_per_user,
      AVG(active_days) AS avg_active_days_per_user,
      SUM(CASE WHEN active_days >= 20 THEN 1 ELSE 0 END) AS highly_active_users
    FROM user_summary
    WHERE functional_area IS NOT NULL
    GROUP BY functional_area
  )
SELECT
  us.user_name,
  us.user_email,
  us.user_role,
  us.functional_area,
  us.total_activities,
  us.active_days,
  ROUND(
    CAST(us.active_days AS DOUBLE) / 30.0 * 100,
    1
  ) AS engagement_percentage,
  us.unique_activity_types,
  us.entity_types_used,
  us.most_common_activity,
  us.most_used_entity_type,
  ph.hour_of_day AS peak_activity_hour,
  us.first_activity,
  us.last_activity,
  tc.team_size,
  ROUND(tc.avg_activities_per_user, 1) AS team_avg_activities,
  CASE
    WHEN us.active_days >= 25 THEN 'Highly Active'
    WHEN us.active_days >= 15 THEN 'Active'
    WHEN us.active_days >= 5 THEN 'Moderate'
    ELSE 'Low Activity'
  END AS activity_level,
  CASE
    WHEN us.entity_types_used >= 4 THEN 'Power User'
    WHEN us.entity_types_used >= 2 THEN 'Regular User'
    ELSE 'Basic User'
  END AS user_type
FROM user_summary us
LEFT JOIN peak_hours ph ON us.user_email = ph.user_email AND ph.hour_rank = 1
LEFT JOIN team_collaboration tc ON us.functional_area = tc.functional_area
ORDER BY 
  us.total_activities DESC,
  us.active_days DESC