
Configuring Mixpanel as a Source
In the Sources tab, click the “Add source” button at the top right of your screen. Then select the Mixpanel option from the list of connectors. Click Next and you’ll be prompted to add your access.
1. Add account access
You need Mixpanel API credentials with access to the Data Export API. Use a service account (username) and its secret for authentication. The following configurations are available:
- Project ID: The Mixpanel project ID you want to extract events from. You can find it in your Mixpanel project settings (e.g. Settings > Project Settings > Project ID).
- Username: The username used to authenticate against the API. This is typically a service account username. Mixpanel recommends using a service account for programmatic access.
- Secret: The secret (password) for the service account or user. This value is stored securely and is never displayed after saving.
- Start date: The earliest date from which records will be synced. Used for the first full sync and when no previous state exists. Format: YYYY-MM-DD.
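As a rough illustration of how these four fields fit together, the sketch below validates the Start date format and builds an HTTP Basic Authorization header from the service account username and secret. The class and method names are hypothetical and not part of the connector, the sample values are placeholders, and Basic authentication is an assumption about how the credentials are sent; the connector's internals may differ.

```python
import base64
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class MixpanelSourceConfig:
    """Hypothetical container mirroring the four fields of the access form."""

    project_id: str
    username: str    # service account username
    secret: str      # service account secret
    start_date: str  # expected as YYYY-MM-DD

    def parsed_start_date(self) -> date:
        # strptime raises ValueError when the value is not a valid
        # year-month-day date, e.g. "15/01/2024".
        return datetime.strptime(self.start_date, "%Y-%m-%d").date()

    def basic_auth_header(self) -> str:
        # Assumption: credentials are sent as HTTP Basic auth (username:secret).
        token = base64.b64encode(f"{self.username}:{self.secret}".encode("utf-8"))
        return "Basic " + token.decode("ascii")


# Placeholder values for illustration only.
config = MixpanelSourceConfig(
    project_id="1234567",
    username="example-service-account",
    secret="example-secret",
    start_date="2024-01-15",
)
start = config.parsed_start_date()
```

Checking the date locally before saving the source is a cheap way to catch a day-first or slash-separated value that would otherwise surface as an empty first sync.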
2. Select streams
The Mixpanel connector exposes a single stream: events. Choose whether to sync it. Select the stream and click Next.
3. Configure data streams
Customize how you want your data to appear in your catalog. Select the desired layer where the data will be placed, a folder to organize it inside the layer, a name for the table (which will contain the fetched events), and the type of sync.
- Layer: Choose between the existing layers on your catalog. This is where you will find your new extracted table once the extraction runs successfully.
- Folder: A folder can be created inside the selected layer to group all tables from this data source.
- Table name: A default name is suggested; you can change it. You can add a prefix to all tables at once to speed up configuration.
- Sync Type: You can choose between INCREMENTAL and FULL_TABLE.
  - Incremental: Each run fetches only events since the last replicated timestamp. Recommended for ongoing syncs so you keep every event without re-reading full history.
  - Full table: Each run re-exports events from the configured start date (or from the beginning). Use when you need to backfill or fully refresh the dataset.
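The difference between the two sync types comes down to where each run starts reading. The sketch below shows one way that decision could look; the function name and arguments are hypothetical, and converting the saved timestamp to a UTC date is an assumption (your project's time zone may shift the boundary).

```python
from datetime import date, datetime, timezone
from typing import Optional


def resolve_sync_start(
    sync_type: str, start_date: date, last_time_ms: Optional[int]
) -> date:
    """Return the first day a run should request (illustrative only)."""
    if sync_type not in ("INCREMENTAL", "FULL_TABLE"):
        raise ValueError(f"Unknown sync type: {sync_type}")
    # Full table runs, and first runs without saved state, start from Start date.
    if sync_type == "FULL_TABLE" or last_time_ms is None:
        return start_date
    # Incremental runs resume from the last replicated timestamp (milliseconds).
    # Assumption: the timestamp is interpreted in UTC.
    return datetime.fromtimestamp(last_time_ms / 1000, tz=timezone.utc).date()


first_run = resolve_sync_start("INCREMENTAL", date(2023, 1, 1), None)
later_run = resolve_sync_start("INCREMENTAL", date(2023, 1, 1), 1704067200000)
```

Here `first_run` falls back to the configured Start date, while `later_run` resumes from the day of the saved timestamp.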
4. Configure data source
Describe your data source for easy identification within your organization, in no more than 140 characters. To define your Trigger, consider how often you want data to be extracted. For event data, common choices are:
- Daily: Typical for analytics and reporting.
- Every 12 hours or hourly: For more up-to-date event pipelines.
- Weekly: For lighter reporting needs.
You can also set:
- Delta Log Retention: How long to keep old states of the table. See Resource control.
- Additional Full Sync: Run a full export periodically in addition to incremental syncs.
5. Check your new source
You can view your new source on the Sources page. If needed, manually trigger the source extraction by clicking on the arrow button. Once the run completes successfully, data will appear in your Catalog.
Streams and Fields
The Mixpanel connector exposes one stream: events. It corresponds to the Mixpanel Export API and returns one row per event.
Events
Stream of raw analytics events from your Mixpanel project. Each record is a single event (e.g. page view, signup, purchase) with a unique insert ID, event name, timestamp, and a JSON blob of event properties.
Key fields:
| Field | Type | Description |
|---|---|---|
| `id` | String | Unique identifier for the event. Sourced from Mixpanel’s `$insert_id` property. |
| `event` | String | The event name (e.g. "Page View", "Sign Up", "Purchase"). |
| `time` | Integer | Event timestamp. Stored in milliseconds (API is called with `time_in_ms: true`). |
| `properties` | String | Full set of event properties as a JSON string. Includes Mixpanel reserved properties (e.g. `$insert_id`, `time`, `distinct_id`, `$user_id`) and any custom properties you send. |

Notes:
- Primary key: `id` (from `$insert_id`). Ensures one row per event in the Export API response.
- Replication: The stream uses `time` as the replication key. Incremental syncs request only events from the last replicated timestamp onward.
- Pagination: The connector requests data in monthly date ranges (from `start_date` or state to “now”) to respect API behavior and avoid timeouts on large projects.
- Properties: All event properties (including nested objects) are serialized into the `properties` JSON string. To use them in SQL or BI tools, parse JSON in your transformations (e.g. `JSON_EXTRACT` / `json_extract`) or flatten into columns.
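To picture the pagination note above, the following sketch splits a date range into month-sized windows. It is only an illustration: the function name is hypothetical, and aligning windows to calendar months (rather than, say, fixed 30-day chunks) is an assumption about how the connector slices the range.

```python
from datetime import date, timedelta
from typing import Iterator, Tuple


def monthly_windows(start: date, end: date) -> Iterator[Tuple[date, date]]:
    """Yield inclusive (from_date, to_date) pairs, one per calendar month."""
    cursor = start
    while cursor <= end:
        # First day of the month after the cursor's month.
        if cursor.month == 12:
            next_month = date(cursor.year + 1, 1, 1)
        else:
            next_month = date(cursor.year, cursor.month + 1, 1)
        # Each window ends on the last day of the month, or on `end` if earlier.
        yield cursor, min(next_month - timedelta(days=1), end)
        cursor = next_month


windows = list(monthly_windows(date(2023, 11, 15), date(2024, 1, 10)))
# windows:
#   (2023-11-15, 2023-11-30)
#   (2023-12-01, 2023-12-31)
#   (2024-01-01, 2024-01-10)
```

Requesting one window at a time keeps each export response bounded, which is the motivation the note gives for paginating on large projects.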
Example record (conceptual): in the stream, `id` will be "abc-123", `event` will be the event name, `time` will be 1704067200000, and the full set of event properties will appear in `properties` as a string.
Data Model
The connector has a single stream. Events are identified by `id` and ordered by `time`.
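To make the record shape concrete, here is a minimal sketch of how one raw event could be mapped to the four stream fields. The function name and the sample event are hypothetical (only "abc-123" and 1704067200000 come from this page), and the input layout, an event name plus a properties object, is an assumption about what the Export API returns.

```python
import json
from typing import Any, Dict


def to_stream_record(raw_event: Dict[str, Any]) -> Dict[str, Any]:
    """Map one raw event to the id / event / time / properties row."""
    props = raw_event["properties"]
    return {
        "id": props["$insert_id"],        # primary key
        "event": raw_event["event"],      # event name
        "time": props["time"],            # replication key, in milliseconds
        "properties": json.dumps(props),  # full properties as a JSON string
    }


# Hypothetical sample event for illustration.
sample = {
    "event": "Sign Up",
    "properties": {
        "$insert_id": "abc-123",
        "time": 1704067200000,
        "distinct_id": "user-1",
        "plan": {"name": "pro", "seats": 3},
    },
}
record = to_stream_record(sample)
```

Note that nested values such as `plan` survive only inside the `properties` string, so they need to be parsed again downstream.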
Transformation example: pivoting properties to columns
The `properties` column stores all event attributes as a single JSON string. To analyze dimensions like user ID, UTM fields, or revenue in Explorer or downstream models, parse the JSON and expose the keys as separate columns. The example below selects common Mixpanel fields (`distinct_id`, `$user_id`, `utm_source`, `utm_campaign`, `revenue`) and the event timestamp; adjust the keys to match your own properties.
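For notebook-style transformations, the same pivot can be sketched in plain Python. This is an illustrative sketch, not the connector's or the platform's own code: the function name, the choice to strip the leading `$` from column names, and the sample row are all assumptions.

```python
import json
from typing import Any, Dict, Iterable, List, Sequence

# Keys to expose as columns; adjust to match your own properties.
PROPERTY_KEYS = ("distinct_id", "$user_id", "utm_source", "utm_campaign", "revenue")


def pivot_properties(
    rows: Iterable[Dict[str, Any]], keys: Sequence[str] = PROPERTY_KEYS
) -> List[Dict[str, Any]]:
    """Parse the properties JSON string and expose selected keys as columns."""
    flattened = []
    for row in rows:
        props = json.loads(row["properties"])
        flat = {"id": row["id"], "event": row["event"], "time": row["time"]}
        for key in keys:
            # "$user_id" becomes the column "user_id"; missing keys become None.
            flat[key.lstrip("$")] = props.get(key)
        # Numeric properties may arrive as strings, so cast revenue explicitly.
        if flat.get("revenue") is not None:
            flat["revenue"] = float(flat["revenue"])
        flattened.append(flat)
    return flattened


# Hypothetical sample row for illustration.
rows = [{
    "id": "abc-123",
    "event": "Purchase",
    "time": 1704067200000,
    "properties": json.dumps({
        "distinct_id": "user-1",
        "$user_id": "u-42",
        "utm_source": "newsletter",
        "revenue": "19.9",
    }),
}]
table = pivot_properties(rows)
```

Keys that an event does not carry (here `utm_campaign`) come back as `None`, which keeps every output row on the same set of columns.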
SQL transformation (AWS Athena / GCP BigQuery)
Replace `nekt_raw.mixpanel_events` with your actual layer and table name. Use `json_extract_scalar()` for string properties; for numeric properties use `CAST(json_extract_scalar(properties, '$.key') AS DOUBLE)`. Property names with `$` (e.g. `$user_id`) are referenced as `'$.$user_id'` in the JSON path.
Troubleshooting
| Issue | Possible cause | Solution |
|---|---|---|
| Authentication errors | Wrong username or secret | Verify service account (or user) and secret in Mixpanel (e.g. Project Settings > Service Accounts). Ensure the account has Data Export access. |
| No data / empty table | Start date in the future or no events in range | Check Start date and the project’s time zone. Confirm that the project has events in the requested date range in Mixpanel. |
| Duplicate or missing events | State or date range issue | For incremental syncs, check that state is being saved (e.g. successful run completion). For full syncs, ensure Start date covers the desired period. |
| Large extractions slow or timing out | Very high event volume in one project | The connector already paginates by month. If needed, reduce the sync frequency or split the project (e.g. by data pipeline) in Mixpanel. |
| Need custom properties in columns | Properties are in JSON string | Add a transformation (SQL or notebook) that parses properties and selects or flattens the fields you need. |