.xlsx and .xls) stored in Google Drive. The connector downloads these files via the Google Drive API and extracts each worksheet as a separate data stream, so you can load spreadsheet data from Drive — including files in Shared Drives — into Nekt. For native Google Sheets (not Excel files), use the Google Sheets connector instead.
Configuring Google Drive Excel as a Source
In the Sources tab, click the Add source button at the top right, then select Google Drive (Excel) from the list of connectors. Click Next and you’ll be prompted to add your access.
1. Add account access
You need to authorize Nekt to read files from Google Drive and, optionally, pick the file or folder to extract from.
- Authentication: Use the Google Authorization flow. Sign in with a Google account that has access to the Drive (and, if applicable, the Shared Drive) where your Excel files are stored. The connector uses OAuth and stores a refresh token so it can keep accessing Drive without re-authorizing.
- File or folder: Use the in-app picker to select the file or folder; the connector will resolve the correct ID.
2. Select streams
The connector discovers streams dynamically from the file or folder you selected:
- One stream per sheet: Each worksheet in each Excel file becomes one stream.
- Stream names: {file_name}_{sheet_name} (lowercase, special characters replaced with underscores), e.g. monthly_sales_january, budget_2024_summary.
- Tab config: You can optionally configure per-stream behavior for specific sheets:
  - Range: A sheet range (e.g. A:D or A1:E100) so only part of the sheet is read. If not set, the full sheet is used.
  - Skip rows: Number of rows to skip from the top before treating the next row as the header. Use this when the sheet has title or empty rows above the data. If headers are on the first row, leave this at 0.
Stream names are generated as {sanitized_file_name}_{sanitized_sheet_name} (e.g. sales_report_2024_sheet1). When tab config is provided, only the streams you configure use their custom range/skip_rows; others use the full sheet and no skip.
Tip: You can search for a stream by typing its name.
Select the streams and click Next.
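To make the naming rule concrete, here is a minimal Python sketch of how a stream name could be derived from a file name and a sheet name. The helper names `sanitize` and `stream_name` are hypothetical, and the exact sanitization rule (collapsing runs of special characters into a single underscore and trimming underscores at the ends) is an assumption chosen to match the examples on this page, not the connector's confirmed implementation.

```python
import re
from pathlib import PurePath


def sanitize(name: str) -> str:
    """Lowercase the name and replace each run of non-alphanumeric
    characters with one underscore (assumed rule, see lead-in)."""
    return re.sub(r"[^0-9a-z]+", "_", name.lower()).strip("_")


def stream_name(file_name: str, sheet_name: str) -> str:
    """Build {sanitized_file_name}_{sanitized_sheet_name}."""
    stem = PurePath(file_name).stem  # drop the .xlsx / .xls extension
    return f"{sanitize(stem)}_{sanitize(sheet_name)}"


example = stream_name("Sales Report 2024.xlsx", "Sheet1")
print(example)
```

Under these assumptions, a file named Sales Report 2024.xlsx with a sheet named Sheet1 yields sales_report_2024_sheet1, which is the name you would look for in the stream list.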
3. Configure data streams
Customize how you want the data to appear in your catalog: layer, folder, table names, and sync type.
- Layer: Choose the layer where the new tables will live.
- Folder: Optionally create or select a folder inside the layer to group tables from this source.
- Table name: A default name is suggested per stream; you can change it or add a prefix for all tables at once.
- Sync type: Only FULL_TABLE is supported. Each run re-downloads the Excel file(s) and re-reads the selected sheets, so your tables always reflect the current content of the files.
4. Configure data source
Add a short description of the source (e.g. what data it brings or which team owns it), and define your Trigger (how often the extraction runs). Optionally:
- Configure Delta Log Retention to control how long old table states are kept. See Resource control.
- Schedule an Additional Full Sync if you want periodic full refreshes in addition to your normal schedule.
5. Check your new source
Your new source appears on the Sources page. Trigger a run manually if needed; after a successful run, the tables will appear in your Catalog.
Streams and Fields
Streams are discovered from your chosen file or folder. There is no fixed list of streams or fields: each stream corresponds to one worksheet, and its columns are inferred from the Excel file.
Sheet streams (one per worksheet)
Each selected Excel file contributes one stream per sheet (tab). The stream name is built from the file name (without extension) and the sheet name, sanitized (e.g. revenue_2024_q1).
Schema: Column names and types are inferred from the first 1,000 rows of the sheet (after applying skip_rows and any range, if configured). Column headers in the file are slugified: spaces and special characters become underscores, and names are lowercased (e.g. Revenue (USD) → revenue_usd).
Field types: The connector maps Excel/pandas types to schema types:
| Inferred type | Schema type |
|---|---|
| datetime | DateTime |
| number | Number |
| integer | Integer |
| boolean | Boolean |
| other | String |
Data behavior:
- Rows where every cell is empty are dropped.
- Excel blanks become null in the output.
- Records are cleansed (e.g. invalid values normalized) before being written.
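The schema and data rules above can be illustrated with a small standard-library sketch. It is not the connector's code: the connector works with Excel/pandas types, while this example uses plain Python values as a stand-in. The function names are hypothetical, and treating a column with no values as String is an assumption.

```python
import re
from datetime import datetime

SAMPLE_ROWS = 1000  # discovery samples the first 1,000 rows


def slugify(header) -> str:
    """Lowercase; runs of spaces/special characters become one underscore."""
    return re.sub(r"[^0-9a-z]+", "_", str(header).lower()).strip("_")


def schema_type(values) -> str:
    """Map a column's sampled values to a schema type from the table above."""
    present = [v for v in values if v is not None]
    if not present:
        return "String"  # assumption: an all-blank column falls under "other"
    if all(isinstance(v, bool) for v in present):
        return "Boolean"
    if all(isinstance(v, datetime) for v in present):
        return "DateTime"
    # bool is a subclass of int in Python, so exclude it explicitly
    if all(isinstance(v, int) and not isinstance(v, bool) for v in present):
        return "Integer"
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in present):
        return "Number"
    return "String"


def clean_rows(rows):
    """Turn blank cells into None and drop rows where every cell is empty."""
    cleaned = []
    for row in rows:
        cells = [None if cell == "" else cell for cell in row]
        if any(cell is not None for cell in cells):
            cleaned.append(cells)
    return cleaned


def infer_schema(headers, rows):
    """Infer {slugified_header: schema_type} from the sampled rows."""
    sample = rows[:SAMPLE_ROWS]
    return {slugify(h): schema_type([r[i] for r in sample])
            for i, h in enumerate(headers)}


headers = ["Order ID", "Revenue (USD)", "Paid?", "Ordered At", "Notes"]
rows = [
    [1, 10.5, True, datetime(2024, 1, 5), "first"],
    ["", "", "", "", ""],  # fully empty row: dropped
    [2, 20, False, datetime(2024, 1, 6), ""],  # blank note becomes None
]
cleaned = clean_rows(rows)
schema = infer_schema(headers, cleaned)
print(schema)
```

Because only the sampled rows decide the type, a column that holds integers in the sample and text further down would still be declared Integer in a sketch like this, which is why the sample needs to be representative.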
Implementation notes
Authentication
- Google OAuth: The connector uses Google OAuth (client ID, client secret, refresh token) to obtain access tokens for the Google Drive API. Credentials are stored securely.
- Scopes: The connected account must have read access to the chosen file or folder (and to the Shared Drive, if applicable).
- Shared Drives: Supported via supportsAllDrives and includeItemsFromAllDrives when resolving the item and listing folder contents.
File and folder behavior
- Supported formats: Only Excel files are processed: .xlsx (application/vnd.openxmlformats-officedocument.spreadsheetml.sheet) and .xls (application/vnd.ms-excel). Other files in a selected folder are skipped.
- Single file: If item_id points to a file, it must be one of the supported Excel types; otherwise the connector exits with an error.
- Folder: If item_id points to a folder, the connector lists its contents (including from Shared Drives), filters to Excel files only, and for each file discovers one stream per sheet. Stream names are unique across files and sheets.
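As an illustration of the folder behavior, the sketch below builds the query parameters for a Google Drive API v3 files.list request that lists Excel files in a folder with Shared Drive support enabled. It only constructs the request URL and does not call the API. Filtering by MIME type on the server through the q parameter and excluding trashed files are assumptions; the connector may instead filter after listing. The function name and the FOLDER_ID placeholder are hypothetical.

```python
from urllib.parse import urlencode

# MIME types the page lists as supported
EXCEL_MIME_TYPES = (
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",  # .xlsx
    "application/vnd.ms-excel",  # .xls
)


def folder_listing_params(folder_id: str) -> dict:
    """Query parameters for listing Excel files directly inside a folder."""
    mime_filter = " or ".join(f"mimeType = '{m}'" for m in EXCEL_MIME_TYPES)
    return {
        # assumption: filter on the server and skip trashed files
        "q": f"'{folder_id}' in parents and trashed = false and ({mime_filter})",
        # both flags are needed so items in Shared Drives are returned
        "supportsAllDrives": "true",
        "includeItemsFromAllDrives": "true",
        "fields": "nextPageToken, files(id, name, mimeType)",
    }


params = folder_listing_params("FOLDER_ID")  # placeholder ID
url = "https://www.googleapis.com/drive/v3/files?" + urlencode(params)
print(url)
```

A real request would also carry an OAuth access token in the Authorization header and follow nextPageToken until the listing is exhausted.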
Schema and tab config
- Discovery: Schema is built from the first 1,000 rows (after skip_rows and optional range). If your header row is not in the first row, set Skip rows so the correct row is used as the header.
- Tab config: Optional. Keys are the stream names (e.g. my_file_my_sheet). For each key you can set range and/or skip_rows. Streams not present in tab config use the full sheet and no skip.
- Column names: All column names are slugified (lowercase, underscores) for consistency and compatibility with the catalog.
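The tab config rules can be pictured as a mapping keyed by stream name. The sketch below is an assumption about shape only: the range and skip_rows keys come from this page, but the exact serialized format the connector expects is not shown here, and the helper names are hypothetical.

```python
# Hypothetical tab config: keys are stream names, values are per-stream options.
tab_config = {
    "monthly_sales_january": {"range": "A:D", "skip_rows": 2},
    "budget_2024_summary": {"range": "A1:E100"},
}

# Streams missing from tab config use the full sheet and no skip.
DEFAULTS = {"range": None, "skip_rows": 0}


def settings_for(stream: str, config: dict) -> dict:
    """Merge a stream's options over the defaults."""
    return {**DEFAULTS, **config.get(stream, {})}


def split_header(rows: list, skip_rows: int):
    """Skip the leading rows, then treat the next row as the header."""
    remaining = rows[skip_rows:]
    if not remaining:
        return [], []
    return remaining[0], remaining[1:]


sheet_rows = [
    ["Monthly sales report"],  # title row
    [],                        # empty row
    ["Region", "Units"],       # header row
    ["North", 12],
]
options = settings_for("monthly_sales_january", tab_config)
header, data = split_header(sheet_rows, options["skip_rows"])
print(header, data)
```

With skip_rows set to 2, the title and the empty row are ignored and Region, Units is read as the header; with the default of 0 the title row would be taken as the header instead.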
Sync type
- FULL_TABLE only: There is no replication key. Every run re-downloads the file(s) and re-reads the sheets, so sync type is effectively full table. Incremental is not available.
Best practices
- Use a dedicated account or folder: Prefer a Google account or folder used only for this integration, so permissions are clear and revocable.
- Set skip rows when needed: If the first rows of the sheet are empty or contain titles, set Skip rows so the header row is detected correctly.
- Use range for large sheets: If only a subset of columns or rows is needed, set Range in tab config to reduce payload and improve performance.
- Pick only needed streams: Selecting only the sheets you need keeps runs faster and the catalog simpler.
- Schedule according to updates: Run the source as often as your Excel files are updated (e.g. daily or weekly).
Troubleshooting
| Issue | Possible cause | Solution |
|---|---|---|
| Auth or token errors | Invalid or expired OAuth credentials | Re-run Google Authorization and ensure the account has access to the file or folder. |
| "Cannot use a file that is not xlsx extension" | item_id points to a non-Excel file | Select an .xlsx or .xls file, or select a folder instead (non-Excel files in a folder are skipped). |
| Wrong columns or headers | Header not in first row or wrong range | Set Skip rows and/or Range in tab config for that stream. |
| Missing or wrong types | Schema inferred from first 1,000 rows | Ensure the first 1,000 rows are representative; mixed or later-format changes can cause type mismatches. |
| Slow or timeout | Very large Excel file or many sheets | Reduce the number of streams (sheets) or use Range to limit data; ensure the Drive account has good network access. |
| Empty or partial data | Range too narrow or skip_rows too high | Check Range and Skip rows for the stream; verify the sheet has data in the selected area. |