Bring data from Firestore to your Lakehouse.
If you use the updated_at field as incremental key, you must create an index for that same key to allow Collection Group queries.
More information about this can be found on the official documentation.
How you list subcollections depends on the Subcollection extraction mode chosen.
In Nested documents mode, you should use the notation collection.sub_collection, for example conversations.messages if you want to extract the subcollection messages from the top-level collection conversations. A wildcard is also accepted if you want to get all nested subcollections: for example, conversations.messages.* will extract all nested subcollections under conversations -> messages.
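To make the notation concrete, here is a minimal sketch of how such an entry can be read. It is not the connector's implementation: the names SubcollectionSelector and parse_selector are hypothetical, and rejecting entries with fewer than two segments is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SubcollectionSelector:
    """Hypothetical parsed form of a Nested documents entry."""
    root: str               # top-level collection, e.g. "conversations"
    path: Tuple[str, ...]   # subcollection names below the root
    include_nested: bool    # True when the entry ends with ".*"


def parse_selector(selector: str) -> SubcollectionSelector:
    """Split 'collection.sub_collection[.*]' into its parts."""
    parts = selector.split(".")
    include_nested = parts[-1] == "*"
    if include_nested:
        parts = parts[:-1]  # drop the trailing wildcard
    # Assumption: at least one subcollection must follow the root collection,
    # and a wildcard is only valid as the last segment.
    if len(parts) < 2 or any(not p or p == "*" for p in parts):
        raise ValueError(
            f"expected collection.sub_collection[.*], got {selector!r}"
        )
    return SubcollectionSelector(parts[0], tuple(parts[1:]), include_nested)


print(parse_selector("conversations.messages"))
print(parse_selector("conversations.messages.*"))
```

The first call describes only the messages subcollection; the second also flags everything nested under it.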
In Collection group mode, you can simply enter the name of each subcollection you want to extract. Please note that subcollections with the same name under different root-level collections will be mapped to the same stream. It’s a good practice to use unique names for subcollections to avoid this behavior.
Use Collection group queries for subcollections whenever possible. This significantly improves extraction time and saves resources.
To create the index:
1. Open the Indexes tab.
2. Select the Single field tab.
3. In the Exemptions section, click on Add exemption.
4. In the Collection ID field, enter the name of the subcollection.
5. In the Field path field, enter the name of the attribute you want to use for the incremental index (generally a timestamp or date field).
6. Under Query scope, mark the Collection group checkbox.
7. Click Save.
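If you manage indexes as code rather than through the console, the same exemption can likely be expressed as a field override. The sketch below assumes the Firebase CLI's firestore.indexes.json format. The helper name field_override, the example values messages and updated_at, and the exact set of index entries are assumptions, so adjust them to the indexes you actually need.

```python
import json


def field_override(collection_id: str, field_path: str) -> dict:
    """Build one fieldOverrides entry (hypothetical helper).

    An override replaces the default single-field indexes for the field,
    so the collection-scope entries are listed again next to the
    collection-group one.
    """
    return {
        "collectionGroup": collection_id,   # "Collection ID" in the console
        "fieldPath": field_path,            # "Field path" in the console
        "indexes": [
            {"order": "ASCENDING", "queryScope": "COLLECTION"},
            {"order": "DESCENDING", "queryScope": "COLLECTION"},
            # Corresponds to marking the Collection group checkbox.
            {"order": "ASCENDING", "queryScope": "COLLECTION_GROUP"},
        ],
    }


config = {
    "indexes": [],
    "fieldOverrides": [field_override("messages", "updated_at")],
}

# Print the JSON that would go into firestore.indexes.json.
print(json.dumps(config, indent=2))
```

With the Firebase CLI, a file like this is typically deployed with firebase deploy --only firestore:indexes.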
It takes a while for the changes to propagate, but once it’s done you’re good to go.
Tip: The stream can be found more easily by typing its name.
If you encounter any issues, reach out to us via Slack, and we’ll gladly assist you!