GitHub connector

Important

This feature is in Beta. Workspace admins can control access to this feature from the Previews page. See Manage Azure Databricks previews.

The managed GitHub connector in Lakeflow Connect allows you to ingest data from GitHub into Azure Databricks.

What to know before you start

Topic | Why it matters
Azure Databricks user persona | The workflow depends on your Azure Databricks user persona:
  • Single-user: An admin user creates both a Unity Catalog connection and an ingestion pipeline.
  • Multi-user: An admin user creates a connection that non-admin users then use to create pipelines.
Authentication method | The steps to create a connection depend on the authentication method you choose.
Interface | The steps to create a pipeline depend on the interface you use.
Ingestion frequency | The pipeline schedule depends on your latency and cost requirements.
Common patterns | Depending on your ingestion needs, the pipeline might use configurations such as history tracking, column selection, and row filtering. Supported configurations vary by connector. See Feature availability.
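
To illustrate the latency and cost trade-off behind ingestion frequency, the following is a minimal sketch of a scheduled job payload that triggers an ingestion pipeline. It assumes the general shape of a Lakeflow Jobs definition (a `schedule` with a Quartz cron expression and a `pipeline_task`). The job name, task key, and `<pipeline-id>` placeholder are hypothetical, and the sketch only builds the payload; it does not call any API.

```python
import json


def build_scheduled_job(pipeline_id: str, cron: str, timezone: str = "UTC") -> dict:
    """Build a job payload that runs an ingestion pipeline on a schedule.

    The field names are an assumption based on the general Jobs payload shape;
    verify them against the Jobs API reference before use.
    """
    return {
        "name": "github-ingestion-schedule",  # hypothetical job name
        "schedule": {
            # Quartz format: seconds minutes hours day-of-month month day-of-week
            "quartz_cron_expression": cron,
            "timezone_id": timezone,
        },
        "tasks": [
            {
                "task_key": "run_github_ingestion",  # hypothetical task key
                "pipeline_task": {"pipeline_id": pipeline_id},
            }
        ],
    }


# Lower latency, higher cost: run at the top of every hour.
hourly = build_scheduled_job("<pipeline-id>", "0 0 * * * ?")

# Higher latency, lower cost: run once a day at 06:00.
daily = build_scheduled_job("<pipeline-id>", "0 0 6 * * ?")

print(json.dumps(daily, indent=2))
```

A more frequent schedule reduces data staleness but runs the pipeline more often, so choose the cron expression that matches how fresh the GitHub data needs to be.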

Start ingesting from GitHub

The following table summarizes the end-to-end GitHub ingestion flow, based on user type:

User | Steps
Admin |
Non-admin | Use any supported interface to create a pipeline from an existing connection. See Ingest data from GitHub.

Feature availability

Feature | Availability
UI-based pipeline authoring | Supported
API-based pipeline authoring | Supported
Declarative Automation Bundles | Supported
Incremental ingestion | Partially supported. Some tables support incremental ingestion. Other tables require a full refresh. See Supported data.
Unity Catalog governance | Supported
Lakeflow Jobs | Supported
SCD type 2 | Supported
Column selection and deselection | Supported
API-based row filtering | Not supported
Automated schema evolution: New and deleted columns | Not supported
Automated schema evolution: Data type changes | Not supported
Automated schema evolution: Column renames | Not supported
Automated schema evolution: New tables | Not supported
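
As a rough illustration of how two of the supported features (SCD type 2 and column selection) might appear in an API-authored pipeline, here is a minimal sketch that builds a pipeline specification as a Python dictionary. The field names (`ingestion_definition`, `table_configuration`, `scd_type`, `include_columns`) are assumptions based on the general shape of Lakeflow Connect pipeline definitions and are not confirmed here for the GitHub connector. The connection name, catalog, schema, source table, and column names are all hypothetical.

```python
import json
from typing import Optional, Sequence


def build_pipeline_spec(
    connection_name: str,
    catalog: str,
    schema: str,
    source_table: str,
    include_columns: Optional[Sequence[str]] = None,
    history_tracking: bool = False,
) -> dict:
    """Build an ingestion pipeline spec for one source table.

    Field names are assumed, not taken from the GitHub connector reference.
    """
    # SCD type 2 keeps a history of row changes; type 1 overwrites in place.
    table_config = {"scd_type": "SCD_TYPE_2" if history_tracking else "SCD_TYPE_1"}

    # Column selection: ingest only the listed columns when provided.
    if include_columns:
        table_config["include_columns"] = list(include_columns)

    return {
        "name": f"github-{source_table}-ingestion",  # hypothetical naming scheme
        "ingestion_definition": {
            "connection_name": connection_name,
            "objects": [
                {
                    "table": {
                        "source_table": source_table,
                        "destination_catalog": catalog,
                        "destination_schema": schema,
                        "table_configuration": table_config,
                    }
                }
            ],
        },
    }


spec = build_pipeline_spec(
    connection_name="my_github_connection",  # hypothetical connection
    catalog="main",
    schema="github_raw",
    source_table="issues",  # hypothetical source table name
    include_columns=["id", "title", "state"],  # hypothetical columns
    history_tracking=True,
)
print(json.dumps(spec, indent=2))
```

Because row filtering and automated schema evolution are listed as not supported, the sketch deliberately omits any filter or schema-evolution settings.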

Authentication methods

Authentication method | Availability
OAuth U2M | Supported
OAuth M2M | Not supported
OAuth (manual refresh token) | Not supported
Basic authentication (username/password) | Not supported
Basic authentication (API key) | Not supported
Basic authentication (service account JSON key) | Not supported