When to use this
Sometimes your data is in the wrong shape for the analysis you need. Pivoting turns distinct row values into columns (e.g., turning monthly rows into one column per month). Unpivoting does the reverse — turning columns into rows (e.g., turning q1_revenue, q2_revenue, q3_revenue columns into quarter and revenue rows).
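To make the two reshapes concrete before the SQL and SDK implementations below, here is a minimal pure-Python sketch on a list of dicts. This is only an illustration of what pivot and unpivot do to row-shaped data — the sample rows are invented, and none of this is Nekt SDK code.

```python
# Sample long-format rows (invented for illustration).
rows = [
    {"product": "Widget A", "month": "January", "revenue": 5000},
    {"product": "Widget A", "month": "February", "revenue": 7200},
    {"product": "Widget B", "month": "January", "revenue": 3200},
]

# Pivot: one row per product, one column per month value.
by_product = {}
for r in rows:
    by_product.setdefault(r["product"], {"product": r["product"]})[r["month"]] = r["revenue"]
pivoted = list(by_product.values())
# -> [{"product": "Widget A", "January": 5000, "February": 7200},
#     {"product": "Widget B", "January": 3200}]

# Unpivot: turn the month columns back into (month, revenue) rows.
unpivoted = [
    {"product": p["product"], "month": m, "revenue": v}
    for p in pivoted
    for m, v in p.items()
    if m != "product"
]
```

Note that pivoting can lose row order and introduce gaps (Widget B has no February column here), which is why the SQL versions below aggregate with `SUM` and default missing values explicitly.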
Suppose you have a monthly_sales table in the Raw layer, with one row per product per month:
| product | month | revenue |
|---|---|---|
| Widget A | January | 5000 |
| Widget A | February | 7200 |
| Widget A | March | 6100 |
| Widget B | January | 3200 |
| Widget B | February | 4100 |
| Widget B | March | 3800 |
We want one row per product with a column for each month.
For the unpivot case, suppose a quarterly_revenue table where each quarter is a separate column:
| product | q1_revenue | q2_revenue | q3_revenue |
|---|---|---|---|
| Widget A | 18300 | 22100 | 19500 |
| Widget B | 11100 | 13200 | 12800 |
We want to normalize this into product, quarter, and revenue columns.
Implementation — Pivot
Nekt Express / BigQuery
Athena SQL
Python (Nekt SDK)
BigQuery supports a native `PIVOT` clause for cleaner syntax:

```sql
SELECT *
FROM (
  SELECT product, month, revenue
  FROM `raw.monthly_sales`
)
PIVOT (
  SUM(revenue)
  FOR month IN ('January', 'February', 'March')
)
```
The values in the `IN` clause must be known at query time. If your months are dynamic, use the conditional aggregation approach shown in the Athena tab.
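When the pivot values are dynamic, one common pattern is to generate the conditional-aggregation SQL programmatically from whatever distinct values you find in the data. The helper below is a hypothetical sketch (the function name and parameters are not part of any Nekt API), and it assumes the values are trusted internal strings — don't interpolate untrusted input into SQL like this.

```python
# Hypothetical helper: build the conditional-aggregation query from a
# runtime list of pivot values. Assumes values are trusted (no escaping).
def build_pivot_sql(table, key_col, pivot_col, value_col, values):
    cases = ",\n  ".join(
        f"SUM(CASE WHEN {pivot_col} = '{v}' THEN {value_col} ELSE 0 END) AS {v.lower()}"
        for v in values
    )
    return (
        f"SELECT\n  {key_col},\n  {cases}\n"
        f"FROM {table}\nGROUP BY {key_col}"
    )

sql = build_pivot_sql(
    "raw.monthly_sales", "product", "month", "revenue",
    ["January", "February", "March"],
)
print(sql)
```

The generated string matches the Athena query shown in the next tab, so the same approach works whenever new months appear in the source data.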
Athena doesn’t have a native `PIVOT` keyword. Use conditional aggregation instead:

```sql
SELECT
  product,
  SUM(CASE WHEN month = 'January' THEN revenue ELSE 0 END) AS january,
  SUM(CASE WHEN month = 'February' THEN revenue ELSE 0 END) AS february,
  SUM(CASE WHEN month = 'March' THEN revenue ELSE 0 END) AS march
FROM raw.monthly_sales
GROUP BY product
```
In PySpark, `groupBy` + `pivot` handles this natively:

```python
import nekt

df = nekt.load_table(layer_name="Raw", table_name="monthly_sales")

pivoted_df = (
    df
    .groupBy("product")
    .pivot("month", ["January", "February", "March"])
    .sum("revenue")
)

nekt.save_table(
    df=pivoted_df,
    layer_name="Trusted",
    table_name="sales_by_month",
)
```
Passing the list of values to pivot() is optional but strongly recommended for performance — without it, Spark must scan the data to discover all distinct values first.
Pivot — Expected output
| product | January | February | March |
|---|---|---|---|
| Widget A | 5000 | 7200 | 6100 |
| Widget B | 3200 | 4100 | 3800 |
Implementation — Unpivot
Nekt Express / BigQuery
Athena SQL
Python (Nekt SDK)
BigQuery supports a native `UNPIVOT` clause:

```sql
SELECT *
FROM `raw.quarterly_revenue`
UNPIVOT (
  revenue FOR quarter IN (
    q1_revenue AS 'Q1',
    q2_revenue AS 'Q2',
    q3_revenue AS 'Q3'
  )
)
```
Athena doesn’t have a native `UNPIVOT`. Use `CROSS JOIN UNNEST` with parallel arrays of labels and values:

```sql
SELECT
  product,
  quarter,
  revenue
FROM raw.quarterly_revenue
CROSS JOIN UNNEST(
  ARRAY['Q1', 'Q2', 'Q3'],
  ARRAY[q1_revenue, q2_revenue, q3_revenue]
) AS t(quarter, revenue)
```
In PySpark, use `stack` to unpivot columns into rows:

```python
import nekt
from pyspark.sql import functions as F

df = nekt.load_table(layer_name="Raw", table_name="quarterly_revenue")

unpivoted_df = df.select(
    "product",
    F.expr("""
        stack(3,
            'Q1', q1_revenue,
            'Q2', q2_revenue,
            'Q3', q3_revenue
        ) AS (quarter, revenue)
    """),
)

nekt.save_table(
    df=unpivoted_df,
    layer_name="Trusted",
    table_name="revenue_by_quarter",
)
```
Unpivot — Expected output
| product | quarter | revenue |
|---|---|---|
| Widget A | Q1 | 18300 |
| Widget A | Q2 | 22100 |
| Widget A | Q3 | 19500 |
| Widget B | Q1 | 11100 |
| Widget B | Q2 | 13200 |
| Widget B | Q3 | 12800 |
Tips and gotchas
Pivoting requires knowing the distinct values upfront (months, quarters, etc.). If the values change over time, you’ll need to update the query. For fully dynamic pivots, the Python approach is the most flexible — you can read distinct values from the data and build the pivot programmatically.
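The "read distinct values, then pivot" idea can be sketched in plain Python; with the SDK's PySpark DataFrames the same pattern would be collecting `df.select("month").distinct()` and passing the result to `.pivot()`. The rows below are invented sample data, not SDK calls.

```python
# Fully dynamic pivot sketch: discover the distinct pivot values at run
# time instead of hard-coding them, then build one column per value.
rows = [
    {"product": "Widget A", "month": "January", "revenue": 5000},
    {"product": "Widget A", "month": "February", "revenue": 7200},
    {"product": "Widget B", "month": "January", "revenue": 3200},
]

months = sorted({r["month"] for r in rows})  # discovered, not hard-coded

wide = {}
for r in rows:
    # Default every discovered month to 0 so each output row is complete.
    row = wide.setdefault(
        r["product"], {"product": r["product"], **{m: 0 for m in months}}
    )
    row[r["month"]] += r["revenue"]

result = list(wide.values())
```

Because the column set is derived from the data, a new month appearing upstream produces a new column automatically instead of silently dropping rows.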
After pivoting, column names come from data values and may contain spaces, special characters, or start with numbers. Clean them up with Rename columns if needed.
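If you want to clean the names in code rather than with a rename step, a small sanitizer like the hypothetical one below covers the common cases (this function is an illustration, not part of the Nekt SDK):

```python
import re

# Hypothetical column-name cleanup: collapse anything that is not a
# letter, digit, or underscore, lowercase the result, and prefix names
# that would otherwise start with a digit.
def sanitize_column(name: str) -> str:
    cleaned = re.sub(r"\W+", "_", name.strip()).strip("_").lower()
    if cleaned and cleaned[0].isdigit():
        cleaned = "c_" + cleaned
    return cleaned

sanitize_column("2024 Revenue ($)")  # -> "c_2024_revenue"
sanitize_column("Widget A")          # -> "widget_a"
```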