Google Drive is Google’s cloud storage and collaboration platform. This connector supports two operating modes:

Records mode (default): Downloads Microsoft Excel files (.xlsx and .xls) from Google Drive and extracts each worksheet as a separate data stream, so you can load spreadsheet data from Drive, including files in Shared Drives, into Nekt.
Unstructured mode: Downloads files of any type (PDFs, images, XML, etc.) from a Google Drive folder and uploads them to a Nekt volume, emitting one metadata record per file. Useful for materializing a catalog of unstructured assets into your warehouse.

For native Google Sheets (not Excel files), use the Google Sheets connector instead.

Records mode (Excel extraction)

Configuring Google Drive Excel as a Source

In the Sources tab, click the Add source button in the top right, then select Google Drive (Excel) from the list of connectors. Click Next and you’ll be prompted to grant access.

1. Add account access

You need to authorize Nekt to read files from Google Drive and, optionally, pick the file or folder to extract from.

Authentication: Use the Google Authorization flow. Sign in with a Google account that has access to the Drive (and, if applicable, the Shared Drive) where your Excel files are stored. The connector uses OAuth and stores a refresh token so it can keep accessing Drive without re-authorizing.
File or folder: Use the in-app picker to select the file or folder; the connector will resolve the correct ID.

Once you’re done, click Next.

2. Select streams

The connector discovers streams dynamically from the file or folder you selected:

One stream per sheet: Each worksheet in each Excel file becomes one stream.
Stream names: {file_name}_{sheet_name} (lowercase, special characters replaced with underscores), e.g. monthly_sales_january, budget_2024_summary.
Tab config: You can optionally configure per-stream behavior for specific sheets:
Range: A sheet range (e.g. A:D or A1:E100) so only part of the sheet is read. If not set, the full sheet is used.
Skip rows: Number of rows to skip from the top before treating the next row as the header. Use this when the sheet has title or empty rows above the data. If headers are on the first row, leave this at 0.

When tab config is provided, only the streams you configure use their custom range/skip_rows; all other streams read the full sheet with no rows skipped.
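The naming rule above can be illustrated with a minimal sketch. The exact sanitization rules are assumptions here (collapsing runs of special characters into one underscore, trimming leading and trailing underscores, treating non-ASCII letters as special characters), and `sanitize` and `stream_name` are hypothetical helper names, not the connector's own functions.

```python
import re
from pathlib import PurePath


def sanitize(text: str) -> str:
    """Lowercase, then replace each run of non-alphanumeric characters with one underscore."""
    return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")


def stream_name(file_name: str, sheet_name: str) -> str:
    """Build {file_name}_{sheet_name} from the file name (without extension) and the sheet name."""
    stem = PurePath(file_name).stem  # "Sales Report 2024.xlsx" -> "Sales Report 2024"
    return f"{sanitize(stem)}_{sanitize(sheet_name)}"


print(stream_name("Sales Report 2024.xlsx", "Sheet1"))  # sales_report_2024_sheet1
print(stream_name("Monthly Sales.xlsx", "January"))     # monthly_sales_january
```

Predicting stream names this way can help when writing tab config keys, but the names shown in the stream selection step remain the source of truth.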

Choose which streams you want to sync. You can select all or a subset.

Tip: You can search for a stream by typing its name.

Select the streams and click Next.

3. Configure data streams

Customize how you want the data to appear in your catalog: layer, folder, table names, and sync type.

Layer: Choose the layer where the new tables will live.
Folder: Optionally create or select a folder inside the layer to group tables from this source.
Table name: A default name is suggested per stream; you can change it or add a prefix for all tables at once.
Sync type: Only FULL_TABLE is supported. Each run re-downloads the Excel file(s) and re-reads the selected sheets, so your tables always reflect the current content of the files.

Click Next when done.

4. Configure data source

Add a short description of the source (e.g. what data it brings or which team owns it), and define your Trigger (how often the extraction runs). Optionally:

Configure Delta Log Retention for how long old table states are kept. See Resource control.
Schedule an Additional Full Sync if you want periodic full refreshes in addition to your normal schedule.

Click Next to finish.

5. Check your new source

Your new source appears on the Sources page. Trigger a run manually if needed; after a successful run, the tables will appear in your Catalog.

You need at least one successful source run to see the tables in your Catalog.

Unstructured mode (raw file extraction)

Unstructured mode downloads files from a Google Drive folder, uploads them to a Nekt volume, and emits one metadata record per file through a google_drive_files stream. This is useful for extracting PDFs, images, XML files, or any other file type that doesn’t fit a tabular format.

Configuring unstructured mode

In the Sources tab, click Add source and select Google Drive from the list of connectors.

1. Add account access

Same as Records mode: authorize Nekt via Google Authorization and select the folder containing your files.

2. Select mode

Under Advanced Settings, set the Mode to Unstructured.

3. Configure volume

When Unstructured mode is selected, the Attachments Volume setting becomes available.

Attachments Volume: Use the in-app volume picker to directly select the destination Nekt volume where you want the downloaded Google Drive files to be uploaded. This is required when operating in Unstructured mode.

4. Optional: filter files

You can optionally set a File name filter (under Advanced Settings) to only process files matching a wildcard pattern (e.g. *.pdf, invoice_*). Files that don’t match the pattern are skipped.
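To preview which files a wildcard pattern would keep, the sketch below uses Python's standard `fnmatch` module. It assumes shell-style, case-sensitive matching against the file name only; the connector's actual matching rules may differ, and `filter_files` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def filter_files(names, pattern=None):
    """Return the names matching a shell-style wildcard pattern; no pattern keeps everything."""
    if not pattern:
        return list(names)
    return [name for name in names if fnmatchcase(name, pattern)]


names = ["report.pdf", "invoice_001.xml", "photo.png"]
print(filter_files(names, "*.pdf"))      # ['report.pdf']
print(filter_files(names, "invoice_*"))  # ['invoice_001.xml']
```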

5. Configure and finish

Complete the remaining steps (data source description, trigger schedule) as with any other source.

How it works

The connector lists all files in the selected Google Drive folder (with pagination for large folders).
If a File name filter is set, only matching files are processed.
Each file is downloaded and uploaded to the configured Nekt volume.
On subsequent runs, only files modified after the last extraction are processed (incremental replication via modified_at).
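The incremental step can be sketched as a comparison between each file's modified_at and a bookmark saved from the previous run. This is an illustration under assumptions: the bookmark handling, the strict "greater than" comparison, and the `select_modified` helper are hypothetical, not the connector's internals.

```python
from datetime import datetime, timezone


def select_modified(files, last_bookmark):
    """Return the files modified after the bookmark, plus the bookmark to store for the next run."""
    if last_bookmark is None:
        pending = list(files)  # first run: everything is new
    else:
        pending = [f for f in files if f["modified_at"] > last_bookmark]
    new_bookmark = max((f["modified_at"] for f in pending), default=last_bookmark)
    return pending, new_bookmark


t1 = datetime(2024, 1, 1, tzinfo=timezone.utc)
t2 = datetime(2024, 2, 1, tzinfo=timezone.utc)
files = [
    {"file_id": "a", "modified_at": t1},
    {"file_id": "b", "modified_at": t2},
]
pending, bookmark = select_modified(files, t1)
print([f["file_id"] for f in pending])  # ['b']
```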

Streams and Fields

Sheet streams (Records mode)

Each selected Excel file contributes one stream per sheet (tab). The stream name is built from the file name (without extension) and the sheet name, sanitized (e.g. revenue_2024_q1).

Schema: Column names and types are inferred from the first 1,000 rows of the sheet (after applying skip_rows and any range, if configured). Column headers in the file are slugified: spaces and special characters become underscores, and names are lowercased (e.g. Revenue (USD) → revenue_usd).

Field types: The connector maps Excel/pandas types to schema types:

Inferred type	Schema type
datetime	DateTime
number	Number
integer	Integer
boolean	Boolean
other	String
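As a rough illustration of sample-based inference, the sketch below slugifies headers and picks a schema type per column from the first rows of plain Python values. It is not the connector's pandas-based implementation: the helper names, the handling of mixed columns (falling back to String), and the treatment of all-empty columns are assumptions.

```python
import re
from datetime import datetime


def slugify(header: str) -> str:
    """Lowercase and replace runs of spaces/special characters with a single underscore."""
    return re.sub(r"[^a-z0-9]+", "_", header.lower()).strip("_")


def schema_type(values) -> str:
    """Map a column's sampled values to one of the schema types in the table above."""
    present = [v for v in values if v is not None]
    if not present:
        return "String"  # assumption: an all-empty sample falls back to String
    # bool is a subclass of int in Python, so check it before the numeric types.
    if all(isinstance(v, bool) for v in present):
        return "Boolean"
    if any(isinstance(v, bool) for v in present):
        return "String"
    if all(isinstance(v, datetime) for v in present):
        return "DateTime"
    if all(isinstance(v, int) for v in present):
        return "Integer"
    if all(isinstance(v, (int, float)) for v in present):
        return "Number"
    return "String"


def infer_schema(headers, rows, sample_size=1000):
    """Infer {column_name: schema_type} from the first sample_size rows only."""
    sample = rows[:sample_size]
    return {slugify(h): schema_type([r[i] for r in sample]) for i, h in enumerate(headers)}


rows = [[1, 10.5, datetime(2024, 1, 1)], [2, None, datetime(2024, 1, 2)]]
print(infer_schema(["ID", "Revenue (USD)", "Order Date"], rows))
```

Because only the sampled rows are inspected, a column that changes format after the sample window would keep the type inferred from the earlier rows.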

Data behavior:

Rows where every cell is empty are dropped.
Excel blanks become null in the output.
Records are cleansed (e.g. invalid values normalized) before being written.

Large files: Sheets are read in memory and emitted in chunks (e.g. 5,000 rows per chunk) for efficiency. The file is only parsed once per stream.
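The row cleanup and chunked emission described above can be sketched in plain Python. This assumes a blank cell arrives as either None or an empty string, and uses the 5,000-row chunk size only as the example default; `clean_rows` and `chunked` are hypothetical helpers.

```python
from itertools import islice


def clean_rows(rows):
    """Turn blank cells into None and drop rows where every cell is empty."""
    for row in rows:
        cleaned = [None if cell is None or cell == "" else cell for cell in row]
        if any(cell is not None for cell in cleaned):
            yield cleaned


def chunked(iterable, size=5000):
    """Yield successive lists of at most size items."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


rows = [[1, ""], [None, ""], ["a", None]]
print(list(clean_rows(rows)))  # [[1, None], ['a', None]]
```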

Google Drive Files (Unstructured mode)

The unstructured mode emits a single stream google_drive_files with one metadata record per uploaded file.

Field	Type	Description
`file_id`	String	Google Drive file ID (primary key)
`file_name`	String	Original file name
`file_size`	Integer	File size in bytes
`mime_type`	String	MIME type (e.g. `application/pdf`, `image/png`)
`google_drive_url`	String	Web view URL for the file in Google Drive
`created_at`	DateTime	File creation timestamp
`modified_at`	DateTime	Last modification timestamp (replication key)
`uploaded_at`	DateTime	Timestamp of when the file was uploaded to the Nekt volume
`nekt_file_id`	String	ID assigned by Nekt after upload
`nekt_layer`	String	Destination layer slug
`nekt_volume`	String	Destination volume slug

Implementation notes

Authentication

Google OAuth: The connector uses Google OAuth (client ID, client secret, refresh token) to obtain access tokens for the Google Drive API. Credentials are stored securely.
Scopes: The connected account must have read access to the chosen file or folder (and to the Shared Drive, if applicable).
Shared Drives: Supported via supportsAllDrives and includeItemsFromAllDrives when resolving the item and listing folder contents.
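For reference, a folder listing that includes Shared Drive items could be built along these lines. `supportsAllDrives`, `includeItemsFromAllDrives`, `pageToken`, and `nextPageToken` are Google Drive API v3 `files.list` names; the query string, the requested fields, the page size, and the injected `fetch_page` callable (a stand-in for the authenticated HTTP call) are assumptions made for this sketch rather than the connector's actual code.

```python
def list_params(folder_id, page_token=None):
    """Build query parameters for a Drive API v3 files.list call on one folder."""
    params = {
        "q": f"'{folder_id}' in parents and trashed = false",
        "supportsAllDrives": "true",
        "includeItemsFromAllDrives": "true",
        "fields": "nextPageToken, files(id, name, mimeType, modifiedTime)",
        "pageSize": 100,  # assumed page size
    }
    if page_token:
        params["pageToken"] = page_token
    return params


def list_folder(folder_id, fetch_page):
    """Yield every file in the folder, following nextPageToken until it runs out.

    fetch_page is a hypothetical callable: it takes the params dict and returns
    the decoded JSON response of files.list.
    """
    token = None
    while True:
        page = fetch_page(list_params(folder_id, token))
        yield from page.get("files", [])
        token = page.get("nextPageToken")
        if not token:
            break
```

Injecting `fetch_page` keeps the pagination logic testable without network access or credentials.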

File and folder behavior

Records mode: Only Excel files are processed: .xlsx and .xls. Other files in a selected folder are skipped. If item_id points to a single file, it must be one of the supported Excel types. If it points to a folder, the connector lists its contents (including from Shared Drives), filters to Excel files only, and for each file discovers one stream per sheet.
Unstructured mode: Any file type is accepted. The connector lists all files in the folder (with pagination), optionally filters by search_pattern, downloads each file, uploads it to a Nekt volume via the SDK’s Volume Attachment system, and emits one metadata record per file. Google Workspace files (Docs, Sheets, Slides) that cannot be downloaded as binary are skipped.

Schema and tab config

Discovery: Schema is built from the first 1,000 rows (after skip_rows and optional range). If your header row is not in the first row, set Skip rows so the correct row is used as the header.
Tab config: Optional. Keys are the stream names (e.g. my_file_my_sheet). For each key you can set range and/or skip_rows. Streams not present in tab config use the full sheet and no skip.
Column names: All column names are slugified (lowercase, underscores) for consistency and compatibility with the catalog.
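A small sketch of how tab config defaults and skip rows fit together is shown below. The `range` and `skip_rows` keys and the stream-name keys come from the description above; the dictionary layout and the `settings_for` and `split_header` helpers are assumptions for illustration.

```python
# Defaults used for streams that are not listed in tab config: full sheet, no skipped rows.
DEFAULTS = {"range": None, "skip_rows": 0}

# Hypothetical tab config keyed by stream name.
tab_config = {
    "my_file_my_sheet": {"range": "A:D", "skip_rows": 2},
}


def settings_for(stream, config):
    """Merge a stream's tab config entry (if any) over the defaults."""
    return {**DEFAULTS, **config.get(stream, {})}


def split_header(rows, skip_rows=0):
    """Skip the first skip_rows rows, then treat the next row as the header."""
    return rows[skip_rows], rows[skip_rows + 1:]


rows = [["Quarterly report"], [], ["id", "amount"], [1, 10]]
header, data = split_header(rows, settings_for("my_file_my_sheet", tab_config)["skip_rows"])
print(header)  # ['id', 'amount']
print(data)    # [[1, 10]]
```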

Sync type

Records mode (FULL_TABLE): There is no replication key. Every run re-downloads the file(s) and re-reads the sheets, so sync type is effectively full table.
Unstructured mode (INCREMENTAL): Uses modified_at as the replication key. Only files modified after the last successful run are downloaded and uploaded.

Best practices

Use a dedicated account or folder: Prefer a Google account or folder used only for this integration, so permissions are clear and revocable.
Set skip rows when needed: If the first rows of the sheet are empty or contain titles, set Skip rows so the header row is detected correctly.
Use range for large sheets: If only a subset of columns or rows is needed, set Range in tab config to reduce payload and improve performance.
Pick only needed streams: Selecting only the sheets you need keeps runs faster and the catalog simpler.
Schedule according to updates: Run the source as often as your Excel files are updated (e.g. daily or weekly).

Troubleshooting

Issue	Possible cause	Solution
Auth or token errors	Invalid or expired OAuth credentials	Re-run Google Authorization and ensure the account has access to the file or folder.
“Cannot use a file that is not xlsx extension”	`item_id` points to a non-Excel file	Select an `.xlsx` or `.xls` file, or select a folder instead (non-Excel files in a folder are skipped).
Wrong columns or headers	Header not in first row or wrong range	Set Skip rows and/or Range in tab config for that stream.
Missing or wrong types	Schema inferred from first 1,000 rows	Ensure the first 1,000 rows are representative; mixed types or format changes further down the sheet can cause type mismatches.
Slow or timeout	Very large Excel file or many sheets	Reduce the number of streams (sheets) or use Range to limit data; ensure the Drive account has good network access.
Empty or partial data	Range too narrow or skip_rows too high	Check Range and Skip rows for the stream; verify the sheet has data in the selected area.
No files processed in unstructured mode	File name filter too restrictive	Check the File name filter pattern; use `*` to match all files.

Skills for agents

Download Google Drive Excel skills file

Google Drive Excel connector documentation as plain markdown, for use in AI agent contexts.
