Upload .txt and .sh files to Databricks Workspace folder.

Shriram Sethi 25 Reputation points
2025-03-27T06:53:40.9333333+00:00

I am trying to upload .txt and .sh files to a Databricks workspace folder so that the init script can be referenced from the workspace folder when starting a cluster. Manually I am able to create the files, but from the Azure DevOps pipeline they are not uploading.

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.

Accepted answer
  1. Chandra Boorla 11,745 Reputation points Microsoft External Staff
    2025-04-02T17:59:51.79+00:00

    @Shriram Sethi

I'm glad that you were able to resolve your issue, and thank you for posting your solution so that others experiencing the same thing can easily reference it! Since the Microsoft Q&A community has a policy that "The question author cannot accept their own answer. They can only accept answers by others", I'll repost your solution in case you'd like to accept the answer.

    Issue:

Upload .txt and .sh files to Databricks Workspace folder.

    Solution:

    "We can utilize the Databricks API to upload .txt, .sh, or other file formats to the Databricks Workspace. This can be achieved through an inline script in the Azure pipeline or by referencing an external script file.

    # Variables
    $DATABRICKS_URL = "$(hostName)" #referred from environment variables
    $TOKEN =  "$(PAT)" #referred from environment variables
    $SOURCE_PATH_SCRIPT = "source_path/file_name.sh"
    $DEST_PATH_SCRIPT = "/Workspace/folder/file_name.sh"
    $SOURCE_PATH_TXT = "source_path/file_name.txt"
    $DEST_PATH_TXT = "/Workspace/folder/file_name.txt"
    $FILE_TYPE = "AUTO"  # AUTO will detect the file type
    
    # Function to upload a file to Databricks workspace
    function Upload-FileToDatabricks {
        param (
            [string]$sourcePath,
            [string]$destPath
        )
    
        # Encode the file content to base64
        $content = [Convert]::ToBase64String([System.IO.File]::ReadAllBytes($sourcePath))

        # Create the JSON payload
        $jsonPayload = @{
            path = $destPath
            content = $content
            format = $FILE_TYPE
            overwrite = $true
        } | ConvertTo-Json

        # Make the API request
        Invoke-RestMethod -Method Post -Uri "$DATABRICKS_URL/api/2.0/workspace/import" `
            -Headers @{Authorization = "Bearer $TOKEN"; "Content-Type" = "application/json"} `
            -Body $jsonPayload
    }
    
    # Upload the script file
    Upload-FileToDatabricks -sourcePath $SOURCE_PATH_SCRIPT -destPath $DEST_PATH_SCRIPT
    
    # Upload the txt file
    Upload-FileToDatabricks -sourcePath $SOURCE_PATH_TXT -destPath $DEST_PATH_TXT
    

    The script mentioned above uploads .sh and .txt files to the Databricks Workspace. After uploading, you can reference the script in Databricks Compute and Workflow Compute. To ensure the init script runs when the cluster starts, update the Workflow JSON file accordingly."

    "init_scripts": [
                {
                  "workspace": {
                    "destination": "/folder/file_name.sh"
                  }
                }
              ]
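
    To run this upload from Azure DevOps, the PowerShell above can be used either as an inline script or referenced from a file in the repository. A minimal sketch of the inline variant, assuming hostName and PAT are defined as pipeline variables (PAT marked as secret) so that $(hostName) and $(PAT) are substituted before the script runs:

    # Minimal sketch of an Azure Pipelines step wrapping the upload script (inline variant).
    # Assumes pipeline variables: hostName (workspace URL) and PAT (secret personal access token).
    - task: PowerShell@2
      displayName: 'Upload files to Databricks Workspace'
      inputs:
        targetType: 'inline'
        script: |
          # ... paste the upload script from above here ...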
    

    If I missed anything please let me know and I'd be happy to add it to my answer, or feel free to comment below with any additional information.

    Hope this helps. If this answers your query, please click Accept Answer and Yes for "Was this answer helpful". If you have any further queries, do let us know.


2 additional answers

  1. Vinodh247 32,361 Reputation points MVP
    2025-03-27T16:59:18.2633333+00:00

    Hi,

    Thanks for reaching out to Microsoft Q&A.

    To upload .txt and .sh files to a Databricks Workspace folder from an Azure DevOps pipeline, you cannot use the DBFS CLI, because the Workspace is not DBFS. Instead, use the Databricks REST API (workspace/import) or the Databricks CLI workspace commands (available in both the legacy pip-installed CLI and the newer v0.205+ CLI).

    Preferred Approach: Use Databricks CLI with workspace API

    1. Install the Databricks CLI in your pipeline:

    Make sure your pipeline agent installs the CLI and sets up authentication, using a Databricks PAT (Personal Access Token) as DATABRICKS_TOKEN. The step below uses the pip-installed CLI.

    - script: |
        pip install databricks-cli --upgrade
        databricks configure --token <<EOF
        $(DATABRICKS_HOST)
        $(DATABRICKS_TOKEN)
        EOF
      displayName: 'Install and Configure Databricks CLI'
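
    The heredoc-style configure works, but an alternative that avoids writing a config file is to rely on the DATABRICKS_HOST and DATABRICKS_TOKEN environment variables, which the CLI reads directly. A minimal sketch, assuming those names are defined as pipeline variables (the token as a secret):

    # Sketch: authenticate the CLI through environment variables instead of 'configure'.
    - script: |
        pip install databricks-cli --upgrade
        databricks workspace ls /   # quick sanity check that authentication works
      displayName: 'Install Databricks CLI (env-var auth)'
      env:
        DATABRICKS_HOST: $(DATABRICKS_HOST)
        DATABRICKS_TOKEN: $(DATABRICKS_TOKEN)   # secrets must be mapped explicitly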

    2. Upload files to the Workspace path

    Example: Upload a shell script to /Workspace/Shared/init/init.sh. Make sure init.sh and myfile.txt are in your repository or available in the working directory.

    - script: |
        databricks workspace mkdirs /Shared/init
        databricks workspace import init.sh /Shared/init/init.sh --format AUTO --language SHELL
        databricks workspace import myfile.txt /Shared/init/myfile.txt --format AUTO --language TEXT
      displayName: 'Upload files to Databricks Workspace'
    • The Workspace path is not the same as DBFS. Init scripts in the Workspace can be referenced using the path /Workspace/Shared/init/init.sh.
    • Use the Workspace path when the cluster init script source is set to Workspace in the UI or JSON.
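
    Before pointing a cluster at the script, it can help to verify the upload from the pipeline itself. A minimal sketch using the same CLI (the step fails if the target folder is missing):

    # Sketch: list the target folder so a missing upload surfaces in the pipeline run.
    - script: |
        databricks workspace ls /Shared/init
      displayName: 'Verify files exist in the Databricks Workspace'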

    Validate with the cluster JSON snippet below (Workspace-based init script). You can use this in your cluster definition (via the REST API or Terraform) to attach the init script.

    "init_scripts": [
      {
        "workspace": {
          "destination": "/Shared/init/init.sh"
        }
      }
    ]
    
    

    Please feel free to click the 'Upvote' (Thumbs-up) button and 'Accept as Answer'. This helps the community by allowing others with similar queries to easily find the solution.

    1 person found this answer helpful.

  2. Shriram Sethi 25 Reputation points
    2025-04-02T10:04:21.0366667+00:00

    By using the Databricks API, we are able to upload .txt, .sh, or any other format to the Databricks Workspace.
    Create an inline script in the Azure pipeline, or reference the script from an external script file.

    # Variables
    $DATABRICKS_URL = "$(hostName)" #referred from environment variables
    $TOKEN =  "$(PAT)" #referred from environment variables
    $SOURCE_PATH_SCRIPT = "source_path/file_name.sh"
    $DEST_PATH_SCRIPT = "/Workspace/folder/file_name.sh"
    $SOURCE_PATH_TXT = "source_path/file_name.txt"
    $DEST_PATH_TXT = "/Workspace/folder/file_name.txt"
    $FILE_TYPE = "AUTO"  # AUTO will detect the file type
    
    # Function to upload a file to Databricks workspace
    function Upload-FileToDatabricks {
        param (
            [string]$sourcePath,
            [string]$destPath
        )
    
        # Encode the file content to base64
        $content = [Convert]::ToBase64String([System.IO.File]::ReadAllBytes($sourcePath))
        
        # Create the JSON payload
        $jsonPayload = @{
            path = $destPath
            content = $content
            format = $FILE_TYPE
            overwrite = $true
        } | ConvertTo-Json
        
        # Make the API request
        Invoke-RestMethod -Method Post -Uri "$DATABRICKS_URL/api/2.0/workspace/import" `
            -Headers @{Authorization = "Bearer $TOKEN"; "Content-Type" = "application/json"} `
            -Body $jsonPayload
    }
    
    # Upload the script file
    Upload-FileToDatabricks -sourcePath $SOURCE_PATH_SCRIPT -destPath $DEST_PATH_SCRIPT
    
    # Upload the txt file
    Upload-FileToDatabricks -sourcePath $SOURCE_PATH_TXT -destPath $DEST_PATH_TXT
    

    The above script will upload the .sh and .txt files.
    After that, you can reference the uploaded script in the Databricks compute and in the workflow compute.
    To make the workflow job's compute run the init script when the cluster is initialised, update the workflow JSON file:

    
    "init_scripts": [
                {
                  "workspace": {
                    "destination": "/folder/file_name.sh"
                  }
                }
              ]
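
    If the workflow definition is kept in the repository, the updated JSON can also be applied from the same pipeline. A minimal sketch using the Jobs 2.1 reset endpoint, assuming a workflow/job.json file that contains the job_id and new_settings (with the init_scripts block above inside the cluster definition), and the same hostName and PAT pipeline variables:

    # Sketch: push the updated workflow JSON with the Jobs 2.1 reset API.
    # Assumes workflow/job.json holds {"job_id": ..., "new_settings": {...}}.
    - pwsh: |
        $body = Get-Content -Raw "workflow/job.json"
        Invoke-RestMethod -Method Post -Uri "$(hostName)/api/2.1/jobs/reset" `
            -Headers @{ Authorization = "Bearer $(PAT)"; "Content-Type" = "application/json" } `
            -Body $body
      displayName: 'Apply workflow JSON via Jobs API'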
    
    1 person found this answer helpful.
