Schema and column operations
Rename columns
Standardize column names across sources — apply snake_case, remove prefixes, or align naming conventions.
Cast or convert data types
Convert strings to dates, numbers, or booleans and handle format differences across sources.
Handle NULL values
Replace NULLs with defaults using COALESCE and clean up missing data for downstream consumers.
Structured data handling
Unnest arrays
Explode array columns into individual rows — essential for tags, labels, and line items from APIs.
Parse JSON fields
Extract values from JSON string columns into proper, typed columns ready for analysis.
Flatten nested structures
Normalize deeply nested objects into flat, query-friendly rows.
Expand custom fields
Pivot a key-value custom fields table into columns on the main entity — common for CRMs like Pipedrive and HubSpot.
Date parsing and calculations
Extract date parts, calculate differences, and format dates consistently across SQL engines.
Data quality
Deduplicate rows
Remove exact or near-duplicate records using ROW_NUMBER, DISTINCT, and other dedup strategies.
Filter invalid rows
Remove or flag rows that fail validation rules before they reach your trusted layer.