Rename columns

When to use this

Different data sources follow different naming conventions. A CRM might return ContactFirstName, an advertising platform might use campaign-id, and a database export might use TOTAL AMT. Before joining or analyzing data from multiple sources, renaming columns to a consistent convention (typically snake_case) makes everything easier to work with downstream.

Sample input

A hubspot_contacts table in the Raw layer with mixed naming conventions:

ContactId	FirstName	LastName	emailAddress	Phone Number
101	Alice	Johnson	alice@acme.com	+1-555-0101
102	Bob	Smith	bob@globex.com	+1-555-0102
103	Carol	Lee	carol@initech.com	+1-555-0103

We want to rename all columns to snake_case:

Implementation

Nekt Express / BigQuery
Athena SQL
Python (Nekt SDK)

BigQuery also uses column aliases. Use backticks for columns with spaces or special characters.

SELECT
  ContactId      AS contact_id,
  FirstName      AS first_name,
  LastName       AS last_name,
  emailAddress   AS email_address,
  `Phone Number` AS phone_number
FROM `raw.hubspot_contacts`

BigQuery is case-insensitive for column names by default, but using explicit aliases ensures consistent naming for BI tools and downstream consumers.

In Athena, use column aliases in a SELECT statement to rename each column explicitly.

SELECT
  "ContactId"      AS contact_id,
  "FirstName"      AS first_name,
  "LastName"       AS last_name,
  "emailAddress"   AS email_address,
  "Phone Number"   AS phone_number
FROM raw.hubspot_contacts

Athena requires double quotes around column names that contain spaces, special characters, or are case-sensitive. Once aliased, downstream queries can use the clean snake_case names.

In PySpark, use withColumnRenamed for individual columns or toDF to rename all columns at once.

import nekt

df = nekt.load_table(layer_name="Raw", table_name="hubspot_contacts")

renamed_df = (
    df
    .withColumnRenamed("ContactId", "contact_id")
    .withColumnRenamed("FirstName", "first_name")
    .withColumnRenamed("LastName", "last_name")
    .withColumnRenamed("emailAddress", "email_address")
    .withColumnRenamed("Phone Number", "phone_number")
)

nekt.save_table(
    df=renamed_df,
    layer_name="Trusted",
    table_name="hubspot_contacts_clean"
)

For bulk renaming to snake_case, you can automate it with a helper that converts all column names at once:

import re

def to_snake_case(name):
    s = re.sub(r'([A-Z])', r'_\1', name).lower()
    s = re.sub(r'[\s\-]+', '_', s)
    return re.sub(r'_+', '_', s).strip('_')

renamed_df = df.toDF(*[to_snake_case(c) for c in df.columns])

Expected output

contact_id	first_name	last_name	email_address	phone_number
101	Alice	Johnson	alice@acme.com	+1-555-0101
102	Bob	Smith	bob@globex.com	+1-555-0102
103	Carol	Lee	carol@initech.com	+1-555-0103

Tips and gotchas

Renaming columns doesn’t change the underlying data — it only affects how the column is referenced in the output. Always save the renamed table to a new layer (e.g., Trusted) so the raw data remains untouched.

Some BI tools are case-sensitive when referencing column names. Sticking to snake_case with all lowercase avoids surprises when connecting Looker Studio, Power BI, or Metabase to your data.

​When to use this

​Sample input

​Implementation

​Expected output

​Tips and gotchas

When to use this

Sample input

Implementation

Expected output

Tips and gotchas