Documentation Index
Fetch the complete documentation index at: https://docs.nekt.com/llms.txt
Use this file to discover all available pages before exploring further.
When to use this
Data from a single source is often split across multiple tables — a CRM hasdeals and companies, an e-commerce platform has orders and customers. Joining these tables lets you enrich one entity with data from another (e.g., adding the company name to each deal, or the customer email to each order).
Sample input
Two tables in the Raw layer:orders
| order_id | customer_id | amount | order_date |
|---|---|---|---|
| 1001 | 201 | 249.99 | 2024-03-15 |
| 1002 | 202 | 89.50 | 2024-03-16 |
| 1003 | 999 | 150.00 | 2024-03-17 |
customers
| customer_id | name | country | |
|---|---|---|---|
| 201 | Alice Johnson | alice@acme.com | BR |
| 202 | Bob Smith | bob@globex.com | US |
| 203 | Carol Lee | carol@initech.com | GB |
1003 references customer_id = 999, which doesn’t exist in the customers table. Customer 203 has no orders. This makes it a good scenario to compare join types.
Implementation
- Nekt Express / BigQuery
- Athena SQL
- Python (Nekt SDK)
The same
LEFT JOIN syntax applies in BigQuery.Expected output (LEFT JOIN)
| order_id | amount | order_date | customer_name | customer_email | customer_country |
|---|---|---|---|---|---|
| 1001 | 249.99 | 2024-03-15 | Alice Johnson | alice@acme.com | BR |
| 1002 | 89.50 | 2024-03-16 | Bob Smith | bob@globex.com | US |
| 1003 | 150.00 | 2024-03-17 | NULL | NULL | NULL |
1003 is kept with NULL customer fields. Customer 203 (Carol) doesn’t appear because she has no matching orders.
Tips and gotchas
When joining large tables, performance matters. Both Athena and BigQuery optimize joins automatically, but you can help by:
- Filtering each table before joining (use CTEs or subqueries to reduce row counts)
- Joining on indexed or partitioned columns when possible
- Avoiding
SELECT *— only select the columns you need