Google Drive connector

Important

This feature is in Beta. Workspace admins can control access to this feature from the Previews page. See Manage Azure Databricks previews.

The managed Google Drive connector in Lakeflow Connect lets you ingest files from Google Drive into Azure Databricks. You can ingest unstructured files as binary data, parse structured formats (CSV, JSON, XML, Excel, and more) into Delta tables, or capture file metadata without loading file contents.

For the standard Google Drive connector that uses Spark reader APIs (read_files, spark.read, Auto Loader), see Ingest files from Google Drive.

What to know before you start

Topic | Why it matters
Azure Databricks user persona | The workflow depends on your Azure Databricks user persona:
  • Single-user: An administrator creates a Unity Catalog connection and an ingestion pipeline.
  • Multi-user: An administrator creates a connection that non-administrator users then use to create pipelines.
Authentication method | The steps to create a connection depend on the authentication method you select.
Interface | The steps to create a pipeline depend on the interface.
Ingestion frequency | The pipeline schedule depends on your latency and cost requirements.
Common patterns | Depending on your ingestion needs, the pipeline might use configurations like history tracking, column selection, and row filtering. Supported configurations vary by connector. See Feature availability.

Start ingesting from Google Drive

The following table summarizes the end-to-end Google Drive ingestion flow for each user type:

User Steps
Administrator
Non-administrator Use any supported interface to create a pipeline from an existing connection. See Ingest data from Google Drive.

Feature availability

Feature | Availability
UI-based pipeline authoring | Supported
API-based pipeline authoring | Supported
Declarative Automation Bundles | Supported
Incremental ingestion | Supported
Unity Catalog governance | Supported
Orchestration using Databricks Workflows | Supported
SCD type 2 | Not supported
Schema evolution | Supported. Configurable via schema_evolution_mode. See Google Drive connector reference.
API-based column selection and deselection | Not supported
API-based row filtering | Not supported
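
When planning a pipeline, it can help to check the intended configuration against the table above before you start authoring. The following is a minimal sketch of such a check, not part of any Databricks API: the dictionary keys and the function name unsupported_features are hypothetical labels chosen for this example, and the True/False values are transcribed from the table.

```python
# Support matrix transcribed from the "Feature availability" table.
# The keys are hypothetical labels for this sketch, not Databricks identifiers.
FEATURE_SUPPORT = {
    "ui_authoring": True,
    "api_authoring": True,
    "declarative_automation_bundles": True,
    "incremental_ingestion": True,
    "unity_catalog_governance": True,
    "workflows_orchestration": True,
    "scd_type_2": False,
    "schema_evolution": True,
    "column_selection": False,
    "row_filtering": False,
}


def unsupported_features(requested):
    """Return the requested features that the table marks as not supported.

    Raises KeyError for a name that does not appear in the table, so a typo
    is not silently treated as "supported".
    """
    unknown = sorted(f for f in requested if f not in FEATURE_SUPPORT)
    if unknown:
        raise KeyError(f"Unknown feature name(s): {unknown}")
    return sorted(f for f in requested if not FEATURE_SUPPORT[f])


# Example: a plan that relies on two features the connector does not offer.
planned = ["incremental_ingestion", "schema_evolution", "scd_type_2", "row_filtering"]
print(unsupported_features(planned))  # ['row_filtering', 'scd_type_2']
```

An empty result means every requested feature is listed as supported. Because availability can change while the connector is in Beta, treat the table as the source of truth and update the dictionary when it changes.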

Authentication methods

Authentication method | Availability
OAuth U2M | Supported
OAuth M2M | Not supported
OAuth (manual refresh token) | Not supported
Basic authentication (username/password) | Not supported
Basic authentication (API key) | Not supported
Basic authentication (service account JSON key) | Not supported
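
Automation that provisions connections could fail fast when it is asked for an authentication method the connector does not accept. The sketch below shows one way to do that under stated assumptions: the method labels and the helper names supported_auth_methods and require_supported are hypothetical and do not correspond to Databricks identifiers. Only the availability values come from the table above.

```python
# Availability transcribed from the "Authentication methods" table.
# The keys are hypothetical labels for this sketch, not Databricks identifiers.
AUTH_SUPPORT = {
    "oauth_u2m": True,
    "oauth_m2m": False,
    "oauth_manual_refresh_token": False,
    "basic_username_password": False,
    "basic_api_key": False,
    "basic_service_account_json_key": False,
}


def supported_auth_methods():
    """Return the labels of all methods the table marks as supported."""
    return sorted(name for name, ok in AUTH_SUPPORT.items() if ok)


def require_supported(method):
    """Return the method unchanged if supported, otherwise raise ValueError."""
    if method not in AUTH_SUPPORT:
        raise ValueError(f"Unknown authentication method: {method!r}")
    if not AUTH_SUPPORT[method]:
        raise ValueError(
            f"{method!r} is not supported; use one of {supported_auth_methods()}"
        )
    return method


print(supported_auth_methods())  # ['oauth_u2m']
```

Calling require_supported("oauth_m2m") raises a ValueError that names OAuth U2M as the supported option.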