Skip to main content

When to use this

API sources often return data with multiple levels of nesting — a user object that contains an address object which itself contains a coordinates object. While Parse JSON fields works for a single level, deeply nested structures require a systematic approach to flatten everything into a single, query-friendly row.

Sample input

An events table in the Raw layer where each row has nested JSON:
event_idevent_typepayload
1purchase{“user”: {“name”: “Alice”, “address”: {“city”: “São Paulo”, “country”: “BR”}}, “amount”: 250.00}
2signup{“user”: {“name”: “Bob”, “address”: {“city”: “New York”, “country”: “US”}}, “amount”: null}
3purchase{“user”: {“name”: “Carol”, “address”: {“city”: “London”, “country”: “GB”}}, “amount”: 180.50}
We want to flatten all nested fields into top-level columns.

Implementation

Use dot-notation with JSON_VALUE to traverse the nested paths.
SELECT
  event_id,
  event_type,
  JSON_VALUE(payload, '$.user.name')            AS user_name,
  JSON_VALUE(payload, '$.user.address.city')    AS user_city,
  JSON_VALUE(payload, '$.user.address.country') AS user_country,
  CAST(JSON_VALUE(payload, '$.amount') AS FLOAT64) AS amount
FROM `raw.events`
If payload is stored as a native JSON type (not a string), you can access fields directly: payload.user.name. For string columns, JSON_VALUE is required.

Expected output

event_idevent_typeuser_nameuser_cityuser_countryamount
1purchaseAliceSão PauloBR250.00
2signupBobNew YorkUSNULL
3purchaseCarolLondonGB180.50

Tips and gotchas

When flattening, naming collisions can happen. If both the root and a nested object have a field called name, alias them explicitly (e.g., user_name vs event_name) to avoid ambiguity.
For very wide or deeply nested JSON (10+ fields, 3+ levels), consider flattening incrementally — create an intermediate table with the first level flattened, then flatten further in a second transformation. This makes debugging and maintenance much easier.