GitHub Actions

GitHub Actions triggers runs of your CI/CD flows from your GitHub repositories and lets you automate your build, test, and deployment CI/CD pipeline.

This page provides information about the GitHub Actions developed by Databricks and examples for common use cases. For information about other CI/CD features and best practices on Databricks, see CI/CD on Azure Databricks and Best practices and recommended CI/CD workflows on Databricks.

Databricks GitHub Actions

Databricks has developed the following GitHub Actions for your CI/CD workflows on GitHub. Add GitHub Actions YAML files to your repository's .github/workflows directory.

Note

This article covers GitHub Actions, which is developed by a third party. To contact the provider, see GitHub Actions Support.

GitHub Action	Description
databricks/setup-cli	A composite action that sets up the Databricks CLI in a GitHub Actions workflow.
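
As a quick illustration of the action in the table, the following minimal sketch installs the CLI and prints its version. The file name .github/workflows/cli_check.yml, the workflow name, and the job name are hypothetical; the manual workflow_dispatch trigger is chosen only so the check runs on demand.

```yaml
# Hypothetical file: .github/workflows/cli_check.yml
name: Databricks CLI check

# Run only when started manually from the Actions tab.
on: workflow_dispatch

jobs:
  cli-version:
    runs-on: ubuntu-latest
    steps:
      # Install the Databricks CLI on the runner.
      - uses: databricks/setup-cli@main

      # Print the installed CLI version. No authentication is needed for this.
      - run: databricks --version
```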

Run a CI/CD workflow that updates a Git folder

The following example GitHub Actions YAML file updates a workspace Git folder when a remote branch is updated. For information about the Git folder approach for CI/CD, see Other tools for source control.

Requirements

This example uses workload identity federation for GitHub Actions for enhanced security, and requires that you add a service principal to your account with a GitHub Actions federation policy. See Enable workload identity federation for GitHub Actions.

Important

The subject of the federation policy (the identity of the federated token) must exactly match the expected token subject. For this example, the entity type and name are Environment and Prod. The constructed subject must be of the form repo:my-github-org-or-user/my-repo:environment:Prod.
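
In a workflow, the entity type and name in the subject come from the job's environment key. The fragment below is a sketch of the two settings that determine the token; the job name deploy is illustrative, and the placeholders in the comment stand for your own organization and repository.

```yaml
permissions:
  id-token: write   # lets the job request a GitHub OIDC token
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    # With an environment set, GitHub issues a token whose subject is
    #   repo:<org-or-user>/<repo>:environment:<environment-name>
    # so "Prod" here must match the federation policy exactly, including case.
    environment: Prod
```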

After you create the service principal with the federation policy, set the DATABRICKS_HOST environment variable to your Azure Databricks workspace host and the DATABRICKS_CLIENT_ID environment variable to the UUID of the service principal. The workflow that follows reads DATABRICKS_HOST from a GitHub Actions variable (vars) and DATABRICKS_CLIENT_ID from a GitHub Actions secret (secrets), so define them there. The DATABRICKS_AUTH_TYPE environment variable is set in the action. For information about Databricks environment variables, see Environment variables and fields for unified authentication.
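
To confirm that these variables are wired up correctly, a step like the following could be placed after the CLI setup step in the workflow that follows. This is a sketch rather than part of the original example: the step name is illustrative, and the step relies on the job-level env block to supply the three variables.

```yaml
# Hypothetical extra step: print the identity the CLI authenticated as.
# With federation configured correctly, this should be the service principal.
- name: Verify Databricks authentication
  run: databricks current-user me
```

If the federation policy subject does not match the token, this step fails before any workspace change is attempted.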

Create the Action

Now add a file .github/workflows/sync_git_folder.yml to your repository with the following YAML:

name: Sync Git Folder

concurrency: prod_environment

on:
  push:
    branches:
      # Set your base branch name here
      - git-folder-cicd-example

permissions:
  # These permissions are required for workload identity federation.
  id-token: write
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    name: 'Update git folder'
    environment: Prod
    env:
      DATABRICKS_AUTH_TYPE: github-oidc
      DATABRICKS_HOST: ${{ vars.DATABRICKS_HOST }}
      DATABRICKS_CLIENT_ID: ${{ secrets.DATABRICKS_CLIENT_ID }} # This is the service principal UUID.

    steps:
      - uses: actions/checkout@v3
      - uses: databricks/setup-cli@main
      - name: Update git folder
        # Set your workspace path and branch name here
        run: databricks repos update /Workspace/<git-folder-path> --branch git-folder-cicd-example

Run a CI/CD workflow with a bundle that runs a pipeline update

The following example GitHub Actions YAML file triggers a test deployment that validates, deploys, and runs the specified job in the bundle within a pre-production target named dev, as defined in a bundle configuration file.

Requirements

This example requires:

The user-defined environment variable DATABRICKS_BUNDLE_ENV.

A bundle configuration file at the root of the repository, which is explicitly declared through the GitHub Actions YAML file's setting working-directory: . This bundle configuration file should define an Azure Databricks workflow named sample_job and a target named dev. For example:

# This is a Databricks asset bundle definition for pipeline_update.
bundle:
  name: pipeline_update

include:
  - resources/*.yml

variables:
  catalog:
    description: The catalog to use
  schema:
    description: The schema to use

resources:
  jobs:
    sample_job:
      name: sample_job

      parameters:
        - name: catalog
          default: ${var.catalog}
        - name: schema
          default: ${var.schema}

      tasks:
        - task_key: refresh_pipeline
          pipeline_task:
            pipeline_id: ${resources.pipelines.sample_pipeline.id}

      environments:
        - environment_key: default
          spec:
            environment_version: '4'

  pipelines:
    sample_pipeline:
      name: sample_pipeline
      catalog: ${var.catalog}
      schema: ${var.schema}
      serverless: true
      root_path: '../src/sample_pipeline'

      libraries:
        - glob:
            include: ../src/sample_pipeline/transformations/**

      environment:
        dependencies:
          - --editable ${workspace.file_path}

targets:
  dev:
    mode: development
    default: true
    workspace:
      host: <dev-workspace-url>
    variables:
      catalog: my_catalog
      schema: ${workspace.current_user.short_name}
  prod:
    mode: production
    workspace:
      host: <production-workspace-url>
      root_path: /Workspace/Users/someone@example.com/.bundle/${bundle.name}/${bundle.target}
    variables:
      catalog: my_catalog
      schema: prod
    permissions:
      - user_name: someone@example.com
        level: CAN_MANAGE

For more information about bundle configuration, see Databricks Asset Bundle configuration.

A GitHub secret named SP_TOKEN, representing the Azure Databricks access token for an Azure Databricks service principal that is associated with the Azure Databricks workspace to which this bundle is deployed and run. To create the token:
1. Create a Databricks service principal. See Add service principals to your account.
2. Generate a secret for the service principal. See Step 1: Create an OAuth secret. Copy the secret and client ID values.
3. Manually generate a Databricks access token (account or workspace) using the copied secret and client ID values. See Generate an account-level access token.
4. Copy the access_token value from the JSON response. Add a GitHub secret named SP_TOKEN to Actions in your repository and use the Databricks access token as the secret value. See Encrypted secrets.
The unified authentication environment variable DATABRICKS_TOKEN is set in the action to the SP_TOKEN that you configured.
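
With the bundle configuration file and the SP_TOKEN secret in place, a validation-only step can catch configuration errors without deploying anything. The following is a sketch, not part of the original workflow: the step name is illustrative, and it selects the target with the --target flag instead of DATABRICKS_BUNDLE_ENV.

```yaml
# Hypothetical step: check the bundle configuration for the "dev" target.
# The workspace host comes from the target definition in the bundle file.
- name: Validate bundle
  run: databricks bundle validate --target dev
  working-directory: .
  env:
    DATABRICKS_TOKEN: ${{ secrets.SP_TOKEN }}
```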

Create the Action

Now add a file .github/workflows/pipeline_update.yml to your repository with the following YAML:

# This workflow validates, deploys, and runs the specified bundle
# within a pre-production target named "dev".
name: 'Dev deployment'

# Ensure that only a single job or workflow using the same concurrency group
# runs at a time.
concurrency: 1

# Trigger this workflow whenever a pull request is opened against the repo's
# main branch or an existing pull request's head branch is updated.
on:
  pull_request:
    types:
      - opened
      - synchronize
    branches:
      - main

jobs:
  # Used by the "pipeline_update" job to deploy the bundle.
  # Bundle validation is automatically performed as part of this deployment.
  # If validation fails, this workflow fails.
  deploy:
    name: 'Deploy bundle'
    runs-on: ubuntu-latest

    steps:
      # Check out this repo, so that this workflow can access it.
      - uses: actions/checkout@v3

      # Download the Databricks CLI.
      # See https://github.com/databricks/setup-cli
      - uses: databricks/setup-cli@main

      # Deploy the bundle to the "dev" target as defined
      # in the bundle's settings file.
      - run: databricks bundle deploy
        working-directory: .
        env:
          DATABRICKS_TOKEN: ${{ secrets.SP_TOKEN }}
          DATABRICKS_BUNDLE_ENV: dev

  # Validate, deploy, and then run the bundle.
  pipeline_update:
    name: 'Run pipeline update'
    runs-on: ubuntu-latest

    # Run the "deploy" job first.
    needs:
      - deploy

    steps:
      # Check out this repo, so that this workflow can access it.
      - uses: actions/checkout@v3

      # Use the downloaded Databricks CLI.
      - uses: databricks/setup-cli@main

      # Run the Databricks workflow named "sample_job" as defined in the
      # bundle that was just deployed.
      - run: databricks bundle run sample_job --refresh-all
        working-directory: .
        env:
          DATABRICKS_TOKEN: ${{ secrets.SP_TOKEN }}
          DATABRICKS_BUNDLE_ENV: dev
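
The concurrency setting in this workflow can also be written with a named group, as the Git folder example on this page does with prod_environment. The sketch below assumes a hypothetical group name, dev_environment; any string works, and workflows that share the same group name are serialized against each other.

```yaml
# Hypothetical named group for everything that deploys to the "dev" target.
concurrency:
  group: dev_environment
  # Let a run that is already deploying finish instead of cancelling it.
  cancel-in-progress: false
```

A named group makes it easier to see which workflows are intended to share a lock when a repository contains several deployment workflows.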

You might also want to trigger production deployments. The following GitHub Actions YAML file can exist in the same repository as the preceding file. This file validates, deploys, and runs the specified bundle within a production target named "prod", as defined in a bundle configuration file.

# This workflow validates, deploys, and runs the specified bundle
# within a production target named "prod".
name: 'Production deployment'

# Ensure that only a single job or workflow using the same concurrency group
# runs at a time.
concurrency: 1

# Trigger this workflow whenever changes are pushed to the repo's
# main branch, for example when a pull request is merged.
on:
  push:
    branches:
      - main

jobs:
  deploy:
    name: 'Deploy bundle'
    runs-on: ubuntu-latest

    steps:
      # Check out this repo, so that this workflow can access it.
      - uses: actions/checkout@v3

      # Download the Databricks CLI.
      # See https://github.com/databricks/setup-cli
      - uses: databricks/setup-cli@main

      # Deploy the bundle to the "prod" target as defined
      # in the bundle's settings file.
      - run: databricks bundle deploy
        working-directory: .
        env:
          DATABRICKS_TOKEN: ${{ secrets.SP_TOKEN }}
          DATABRICKS_BUNDLE_ENV: prod

  # Validate, deploy, and then run the bundle.
  pipeline_update:
    name: 'Run pipeline update'
    runs-on: ubuntu-latest

    # Run the "deploy" job first.
    needs:
      - deploy

    steps:
      # Check out this repo, so that this workflow can access it.
      - uses: actions/checkout@v3

      # Use the downloaded Databricks CLI.
      - uses: databricks/setup-cli@main

      # Run the Databricks workflow named "sample_job" as defined in the
      # bundle that was just deployed.
      - run: databricks bundle run sample_job --refresh-all
        working-directory: .
        env:
          DATABRICKS_TOKEN: ${{ secrets.SP_TOKEN }}
          DATABRICKS_BUNDLE_ENV: prod
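
The target can also be selected on the command line rather than through DATABRICKS_BUNDLE_ENV, which is how the JAR example later on this page deploys. A sketch of the deploy step written that way, with everything else unchanged:

```yaml
# Same deploy step, with the target passed as a flag instead of an
# environment variable.
- run: databricks bundle deploy --target prod
  working-directory: .
  env:
    DATABRICKS_TOKEN: ${{ secrets.SP_TOKEN }}
```

Passing the flag keeps the target visible in the command itself, which can make workflow logs easier to read.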

Run a CI/CD workflow that builds a JAR and deploys a bundle

If you have a Java-based ecosystem, your GitHub Action needs to build and upload a JAR before deploying the bundle. The following example GitHub Actions YAML file triggers a deployment that builds and uploads a JAR to a volume, then validates and deploys the bundle to a production target named "prod", as defined in the bundle configuration file. It compiles a Java-based JAR, but the compilation steps for a Scala-based project are similar.

Requirements

This example requires:

A bundle configuration file at the root of the repository. The workflow in this section does not set working-directory, so the bundle commands run from the repository root, which is the default working directory after checkout.
A DATABRICKS_TOKEN environment variable that represents the Azure Databricks access token associated with the Azure Databricks workspace to which this bundle is deployed and run.
A DATABRICKS_HOST environment variable that represents the Azure Databricks workspace host.

Create the Action

Now add a file .github/workflows/build_jar.yml to your repository with the following YAML:

name: Build JAR and deploy with bundles

on:
  pull_request:
    branches:
      - main
  push:
    branches:
      - main

jobs:
  build-test-upload:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Java
        uses: actions/setup-java@v4
        with:
          java-version: '17' # Specify the Java version used by your project
          distribution: 'temurin' # Use a reliable JDK distribution

      - name: Cache Maven dependencies
        uses: actions/cache@v4
        with:
          path: ~/.m2/repository
          key: ${{ runner.os }}-maven-${{ hashFiles('**/pom.xml') }}
          restore-keys: |
            ${{ runner.os }}-maven-

      - name: Build and test JAR with Maven
        run: mvn clean verify # Use verify to ensure tests are run

      - name: Databricks CLI Setup
        uses: databricks/setup-cli@v0.9.0 # Pin to a specific version

      - name: Upload JAR to a volume
        env:
          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }} # Add host for clarity
        run: |
          databricks fs cp target/my-app-1.0.jar dbfs:/Volumes/artifacts/my-app-${{ github.sha }}.jar --overwrite

  validate:
    needs: build-test-upload
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Databricks CLI Setup
        uses: databricks/setup-cli@v0.9.0

      - name: Validate bundle
        env:
          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
        run: databricks bundle validate

  deploy:
    needs: validate
    if: github.event_name == 'push' && github.ref == 'refs/heads/main' # Only deploy on push to main
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Databricks CLI Setup
        uses: databricks/setup-cli@v0.9.0

      - name: Deploy bundle
        env:
          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
        run: databricks bundle deploy --target prod
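
The workflow uploads the JAR but does not show how the bundle picks it up. One possible approach, sketched here under assumptions, is to expose the JAR location as a bundle variable and reference it from a JAR task. The names jar_path, jar_job, run_jar, and com.example.Main are hypothetical, the volume path is a placeholder, and compute settings are omitted, so this fragment is not deployable as-is.

```yaml
# Hypothetical bundle fragment: a job whose task runs a class from the
# uploaded JAR. Compute settings are intentionally left out of this sketch.
variables:
  jar_path:
    description: Volume path of the JAR uploaded by the workflow

resources:
  jobs:
    jar_job:
      name: jar_job
      tasks:
        - task_key: run_jar
          spark_jar_task:
            main_class_name: com.example.Main
          libraries:
            - jar: ${var.jar_path}

# The deploy step could then supply the path at deploy time, for example:
#   databricks bundle deploy --target prod --var="jar_path=<volume-path-to-uploaded-jar>"
```

Keeping the path in a variable means the bundle file does not need to change when the workflow uploads a JAR under a new commit-specific name.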

Additional resources

Last updated on 2026-02-01
