Mixpanel is a product analytics platform that helps teams understand how users interact with their products. It tracks events, user properties, and funnels so you can analyze behavior and improve conversion. The Nekt connector uses the Mixpanel Data Export API to replicate raw event data into your Lakehouse.

Configuring Mixpanel as a Source

In the Sources tab, click the “Add source” button in the top right of your screen. Then select Mixpanel from the list of connectors. Click Next and you’ll be prompted to add your account access.

1. Add account access

You need Mixpanel API credentials with access to the Data Export API. Use a service account (username) and its secret for authentication. The following configurations are available:
  • Project ID: The Mixpanel project ID you want to extract events from. You can find it in your Mixpanel project settings (e.g. Settings > Project Settings > Project ID).
  • Username: The username used to authenticate against the API. This is typically a service account username. Mixpanel recommends using a service account for programmatic access.
  • Secret: The secret (password) for the service account or user. This value is stored securely and is never displayed after saving.
  • Start date: The earliest date from which records will be synced. Used for the first full sync and when no previous state exists. Format: YYYY-MM-DD.
Once you’re done, click Next.
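For orientation, the credentials above map onto a Mixpanel export request roughly as sketched below. This is an illustration under stated assumptions, not the connector's actual implementation: the helper name build_export_request is hypothetical, the endpoint and query parameter names follow Mixpanel's public Raw Event Export API, and the code only builds the request without sending it.

```python
import base64
from urllib.parse import urlencode

# Assumed endpoint of Mixpanel's Raw Event Export API (US data residency).
EXPORT_URL = "https://data.mixpanel.com/api/2.0/export"


def build_export_request(project_id, username, secret, from_date, to_date):
    """Return (url, headers) for an export call; nothing is sent over the network."""
    query = urlencode({
        "project_id": project_id,
        "from_date": from_date,   # YYYY-MM-DD
        "to_date": to_date,       # YYYY-MM-DD
        "time_in_ms": "true",     # ask for millisecond timestamps
    })
    # Service accounts authenticate with HTTP Basic auth: username:secret.
    token = base64.b64encode(f"{username}:{secret}".encode()).decode()
    headers = {"Authorization": f"Basic {token}"}
    return f"{EXPORT_URL}?{query}", headers
```

If a request built this way is rejected with an authentication error, the username, secret, or project ID is the first thing to check.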

2. Select streams

The Mixpanel connector exposes a single stream: events. Choose whether to sync it. Select the stream and click Next.

3. Configure data streams

Customize how you want your data to appear in your catalog. Select the desired layer where the data will be placed, a folder to organize it inside the layer, a name for the table (which will contain the fetched events), and the type of sync.
  • Layer: Choose between the existing layers on your catalog. This is where you will find your new extracted table once the extraction runs successfully.
  • Folder: A folder can be created inside the selected layer to group all tables from this data source.
  • Table name: A default name is suggested; you can change it. You can add a prefix to all tables at once to speed up configuration.
  • Sync Type: You can choose between INCREMENTAL and FULL_TABLE.
    • Incremental: Each run fetches only events since the last replicated timestamp. Recommended for ongoing syncs so you keep every event without re-reading full history.
    • Full table: Each run re-exports events from the configured start date (or from the beginning). Use when you need to backfill or fully refresh the dataset.
Once you are done configuring, click Next.
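The difference between the two sync types comes down to where each run starts reading. The sketch below illustrates that decision; resolve_start_ms is a hypothetical helper, and treating the Start date as midnight UTC is an assumption (Mixpanel projects have their own time zone setting).

```python
from datetime import datetime, timezone


def resolve_start_ms(sync_type, start_date, last_replicated_ms=None):
    """Return the lower bound (epoch milliseconds) for the next run."""
    configured = datetime.strptime(start_date, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    configured_ms = int(configured.timestamp() * 1000)
    if sync_type == "INCREMENTAL" and last_replicated_ms is not None:
        # Resume from saved state instead of re-reading history.
        return max(last_replicated_ms, configured_ms)
    # FULL_TABLE, or a first incremental run with no saved state yet.
    return configured_ms
```

A first incremental run behaves like a full sync because no state exists yet; later runs pick up from the saved timestamp.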

4. Configure data source

Add a description of your data source (up to 140 characters) so it is easy to identify within your organization. To define your Trigger, consider how often you want data to be extracted. For event data, common choices are:
  • Daily: Typical for analytics and reporting.
  • Every 12 hours or hourly: For more up-to-date event pipelines.
  • Weekly: For lighter reporting needs.
Optionally, you can define:
  • Delta Log Retention: How long to keep old states of the table. See Resource control.
  • Additional Full Sync: Run a full export periodically in addition to incremental syncs.
When you are ready, click Next to finalize the setup.

5. Check your new source

You can view your new source on the Sources page. If needed, manually trigger the source extraction by clicking on the arrow button. Once the run completes successfully, data will appear in your Catalog.
You need at least one successful source run to see the table in your Catalog.

Streams and Fields

The Mixpanel connector exposes one stream: events. It corresponds to the Mixpanel Export API and returns one row per event.
Stream of raw analytics events from your Mixpanel project. Each record is a single event (e.g. page view, signup, purchase) with a unique insert ID, event name, timestamp, and a JSON blob of event properties.
Key fields:
| Field | Type | Description |
| --- | --- | --- |
| id | String | Unique identifier for the event. Sourced from Mixpanel’s $insert_id property. |
| event | String | The event name (e.g. "Page View", "Sign Up", "Purchase"). |
| time | Integer | Event timestamp. Stored in milliseconds (API is called with time_in_ms: true). |
| properties | String | Full set of event properties as a JSON string. Includes Mixpanel reserved properties (e.g. $insert_id, time, distinct_id, $user_id) and any custom properties you send. |
Notes:
  • Primary key: id (from $insert_id). Ensures one row per event in the Export API response.
  • Replication: The stream uses time as the replication key. Incremental syncs request only events from the last replicated timestamp onward.
  • Pagination: The connector requests data in monthly date ranges (from start_date or state to “now”) to respect API behavior and avoid timeouts on large projects.
  • Properties: All event properties (including nested objects) are serialized into the properties JSON string. To use them in SQL or BI tools, parse the JSON in your transformations (e.g. with json_extract_scalar) or flatten it into columns.
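The month-by-month pagination described in the notes can be pictured as splitting the range between the start point and today into windows. The sketch below is one way to do that, assuming calendar-month boundaries; whether the connector cuts on calendar months or on fixed-length windows is not stated here, and monthly_windows is a hypothetical helper.

```python
from datetime import date, timedelta


def monthly_windows(start, end):
    """Yield (from_date, to_date) pairs, one per calendar month, both inclusive."""
    cursor = start
    while cursor <= end:
        # First day of the month after the cursor.
        if cursor.month == 12:
            next_month = date(cursor.year + 1, 1, 1)
        else:
            next_month = date(cursor.year, cursor.month + 1, 1)
        # Clip the last window so it never runs past the end date.
        window_end = min(next_month - timedelta(days=1), end)
        yield cursor, window_end
        cursor = next_month
```

Each window would then become one export request, which keeps individual responses bounded even for projects with long histories.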
Example properties (conceptual):
{
  "$insert_id": "abc-123",
  "time": 1704067200000,
  "distinct_id": "user_456",
  "$user_id": "user_456",
  "$device_id": "device_789",
  "utm_source": "google",
  "utm_campaign": "winter_sale",
  "revenue": 99.99
}
In the stream, id will be "abc-123", event will be the event name, time will be 1704067200000, and the full object above will appear in properties as a string.
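The mapping just described can be sketched in a few lines. This assumes the export returns one JSON object per line with an event name and a properties object, which is how Mixpanel's raw export is commonly documented; to_row is a hypothetical helper, not connector code.

```python
import json


def to_row(raw_line):
    """Map one exported JSON line to the four stream fields."""
    record = json.loads(raw_line)
    props = record["properties"]
    return {
        "id": props["$insert_id"],
        "event": record["event"],
        "time": props["time"],
        # The whole properties object is kept as a JSON string.
        "properties": json.dumps(props),
    }
```

Note that id and time are copied out of the properties object, so they appear both as top-level columns and inside the properties string.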

Data Model

The connector has a single stream. Events are identified by id and ordered by time.

Transformation example: pivoting properties to columns

The properties column stores all event attributes as a single JSON string. To analyze dimensions like user ID, UTM fields, or revenue in Explorer or downstream models, parse the JSON and expose the keys as separate columns. The example below selects common Mixpanel fields (distinct_id, $user_id, utm_source, utm_campaign, revenue) and the event timestamp; adjust the keys to match your own properties.
SELECT
   id,
   event,
   time,
   from_unixtime(time / 1000) AS event_time,
   json_extract_scalar(properties, '$.distinct_id')     AS distinct_id,
   json_extract_scalar(properties, '$.$user_id')        AS user_id,
   json_extract_scalar(properties, '$.utm_source')      AS utm_source,
   json_extract_scalar(properties, '$.utm_campaign')     AS utm_campaign,
   CAST(json_extract_scalar(properties, '$.revenue') AS DOUBLE) AS revenue
FROM
   nekt_raw.mixpanel_events
WHERE
   time >= to_unixtime(current_date - interval '30' day) * 1000
Replace nekt_raw.mixpanel_events with your actual layer and table name. Use json_extract_scalar() for string properties; for numeric properties use CAST(json_extract_scalar(properties, '$.key') AS DOUBLE). Property names that start with $ (e.g. $user_id) are referenced as '$.$user_id' in the JSON path; if your query engine rejects that form, bracket notation such as '$["$user_id"]' is the usual alternative.
You can run this as an ad-hoc query in Explorer or turn it into a transformation that writes to a new table so you have a flattened view of Mixpanel events for reporting and joins.
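If you prefer a notebook over SQL, the same flattening can be sketched in plain Python. The helper flatten_event and the KEYS list are hypothetical names chosen for illustration, and the input is assumed to be a dict with the four stream fields; adapt it to however your notebook loads the table.

```python
import json
from datetime import datetime, timezone

# Properties to promote to columns; adjust to match your own events.
KEYS = ["distinct_id", "$user_id", "utm_source", "utm_campaign", "revenue"]


def flatten_event(row, keys=KEYS):
    """Return a flat dict with selected properties promoted to columns."""
    props = json.loads(row["properties"])
    flat = {
        "id": row["id"],
        "event": row["event"],
        "time": row["time"],
        # time is in milliseconds, so divide by 1000 before converting.
        "event_time": datetime.fromtimestamp(row["time"] / 1000, tz=timezone.utc),
    }
    for key in keys:
        # Drop the leading "$" for friendlier column names; missing keys become None.
        flat[key.lstrip("$")] = props.get(key)
    return flat
```

Missing properties come back as None rather than raising, which mirrors how json_extract_scalar returns NULL for absent keys.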

Troubleshooting

| Issue | Possible cause | Solution |
| --- | --- | --- |
| Authentication errors | Wrong username or secret | Verify service account (or user) and secret in Mixpanel (e.g. Project Settings > Service Accounts). Ensure the account has Data Export access. |
| No data / empty table | Start date in the future or no events in range | Check Start date and the project’s time zone. Confirm that the project has events in the requested date range in Mixpanel. |
| Duplicate or missing events | State or date range issue | For incremental syncs, check that state is being saved (e.g. successful run completion). For full syncs, ensure Start date covers the desired period. |
| Large extractions slow or timing out | Very high event volume in one project | The connector already paginates by month. If needed, reduce the sync frequency or split the project (e.g. by data pipeline) in Mixpanel. |
| Need custom properties in columns | Properties are in JSON string | Add a transformation (SQL or notebook) that parses properties and selects or flattens the fields you need. |
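If duplicates still reach a downstream table, for example when overlapping date ranges were re-read, a transformation can collapse them on the primary key. The sketch below shows the idea on plain dicts; dedupe_by_id is a hypothetical helper, and keeping the last occurrence is an arbitrary choice made for illustration.

```python
def dedupe_by_id(rows):
    """Keep one row per id, preferring the last occurrence seen."""
    latest = {}
    for row in rows:
        # Later rows overwrite earlier ones that share the same id.
        latest[row["id"]] = row
    return list(latest.values())
```

In SQL the equivalent is usually a ROW_NUMBER() window partitioned by id, keeping only the first row of each partition.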