When to use this
Many API sources return columns that contain arrays — for example, a CRM that stores deal labels as["hot", "enterprise", "renewal"] in a single column, or an e-commerce platform that nests line items inside each order.
To filter, aggregate, or join on these values you first need to unnest (or explode) the array so that each element becomes its own row.
Sample input
Imagine acrm_deals table ingested from your CRM into the Raw layer:
| deal_id | deal_name | amount | labels |
|---|---|---|---|
| 1 | Acme Corp | 50000 | [“hot”, “enterprise”] |
| 2 | Globex Inc | 12000 | [“renewal”] |
| 3 | Initech | 8500 | [“hot”, “smb”, “trial”] |
labels column that is an array of strings. We want to produce one row per deal-label pair while keeping the other columns intact.
Implementation
- Nekt Express / BigQuery
- Athena SQL
- Python (Nekt SDK)
BigQuery uses
CROSS JOIN UNNEST directly on the array column. The alias becomes the new column name.Expected output
After unnesting, the table contains one row for every deal-label combination:| deal_id | deal_name | amount | label |
|---|---|---|---|
| 1 | Acme Corp | 50000 | hot |
| 1 | Acme Corp | 50000 | enterprise |
| 2 | Globex Inc | 12000 | renewal |
| 3 | Initech | 8500 | hot |
| 3 | Initech | 8500 | smb |
| 3 | Initech | 8500 | trial |
Tips and gotchas
Unnesting multiplies the row count — if a deal has 3 labels, it becomes 3 rows. Keep this in mind when calculating aggregates like
SUM(amount), as you’ll need to deduplicate or aggregate at the right grain to avoid inflated numbers.