Google Sheets is a cloud-based spreadsheet application that allows users to create, edit, and collaborate on spreadsheets in real time. It’s part of Google Workspace and provides tools for data organization, analysis, and sharing, making it popular for business data management and team collaboration.

1. Add your Google Sheets access

  1. In the Sources tab, click the “Add source” button at the top right of your screen. Then, select the Google Sheets option from the list of connectors.
  2. Click Next and you’ll be prompted to add your access. First, authorize Nekt using the Google Authorization button.
  3. Then, paste the Spreadsheet link.
Ensure that all columns you want to extract contain at least one alphanumeric character in their names. A column named _, for example, will cause an error.
  4. In the Advanced Configs section, you can filter by specific tab names. This is useful when a spreadsheet has multiple tabs but you only want to extract a small subset of them. This config is optional: if not provided, all tabs will be available for extraction.
  5. Click Next.
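
The column naming rule above can be checked before you connect the spreadsheet. The following is a minimal sketch, not part of Nekt: the function name `invalid_column_names` is hypothetical, and it assumes "alphanumeric" means any letter or digit as defined by Python's `str.isalnum`. The connector's exact rule may be stricter.

```python
def invalid_column_names(headers):
    """Return the header names that contain no alphanumeric character.

    Assumption: a character counts as alphanumeric when str.isalnum()
    is true for it; the connector's own check may differ.
    """
    return [name for name in headers if not any(ch.isalnum() for ch in name)]


# Example header row copied from a spreadsheet tab (illustrative values).
headers = ["order_id", "_", "Total (USD)", "", "--"]
print(invalid_column_names(headers))  # ['_', '', '--']
```

Any name returned by this check should be renamed in the spreadsheet before running the extraction.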

2. Select your Google Sheets streams

  1. The next step is letting us know which streams you want to bring in. In this case, you can select which tabs of your spreadsheet will be extracted (and later become your tables!) and which columns to include.
You can select entire groups of streams or only a subset of them.
Tip: The stream can be found more easily by typing its name.
  2. Click Next.

3. Configure your Google Sheets data streams

  1. Customize how you want your data to appear in your catalog. Select a name for each table (which will contain the fetched data).
  • Layer: choose one of the existing layers in your catalog. This is where you will find your newly extracted tables once the extraction runs successfully.
  • Folder: a folder can be created inside the selected layer to group all tables being created from this new data source.
  • Table name: we suggest a name, but feel free to customize it. You have the option to add a prefix to all tables at once and make this process faster!
  • Sync Type: depending on the data you are bringing to the lake, you can choose between INCREMENTAL and FULL_TABLE sync. Read more about Sync Types here.
  • Cell range: if your tab contains cells that are not relevant, specify the cell range that should be considered for extraction.
  2. Click Next.
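
To double-check a cell range before entering it, a small helper can show how many rows and columns it covers. This is only a sketch under an assumption: it treats the range as Google Sheets A1 notation (for example `B2:E101`), which this guide does not confirm as the expected format. The names `column_index` and `parse_range` are hypothetical.

```python
import re

# Matches bounded A1-style ranges such as "A1:D50" (assumed format).
_A1_RANGE = re.compile(r"^([A-Z]+)([1-9][0-9]*):([A-Z]+)([1-9][0-9]*)$")


def column_index(letters):
    """Convert column letters to a 1-based index (A -> 1, Z -> 26, AA -> 27)."""
    index = 0
    for ch in letters:
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


def parse_range(cell_range):
    """Return how many columns and rows an A1-style range spans."""
    match = _A1_RANGE.match(cell_range.strip().upper())
    if match is None:
        raise ValueError(f"not a bounded A1 range: {cell_range!r}")
    start_col, start_row, end_col, end_row = match.groups()
    first_col, last_col = column_index(start_col), column_index(end_col)
    first_row, last_row = int(start_row), int(end_row)
    if first_col > last_col or first_row > last_row:
        raise ValueError(f"range end precedes range start: {cell_range!r}")
    return {
        "columns": last_col - first_col + 1,
        "rows": last_row - first_row + 1,
    }


print(parse_range("B2:E101"))  # {'columns': 4, 'rows': 100}
```

If the counts do not match the table you expect to extract, adjust the range before continuing.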

4. Configure your Google Sheets data source

  1. Describe your data source for easy identification within your organization. You can include details such as what data it brings in and which team it belongs to.
  2. To define your Trigger, consider how often you want data to be extracted from this source. This decision usually depends on how frequently you need the new table data updated (every day, once a week, or only at specific times).

Check your new source!

  1. Click Next to finalize the setup. Once completed, you’ll receive confirmation that your new source is set up!
  2. You can view your new source on the Sources page. It will only appear in your Catalog after the pipeline has run, so use the Sources page to monitor its execution and completion. If needed, trigger the pipeline manually by clicking the refresh icon. Once it has executed, your new table will appear in the Catalog section.
If you encounter any issues, reach out to us via Slack, and we’ll gladly assist you!