How to save a PySpark dataframe created from an Azure Cosmos DB document to an ADB mount point?

Arindam Pandit 1 Reputation point

I am using an Azure Cosmos DB collection. I have a complex nested document and I am loading it using the Azure Cosmos DB SQL API client.

I am doing something like this:

from azure.cosmos import CosmosClient
import os

url = os.environ['ACCOUNT_URI']
key = os.environ['ACCOUNT_KEY']
client = CosmosClient(url, credential=key)
database_name = 'testDatabase'
database = client.get_database_client(database_name)
container_name = 'products'
container = database.get_container_client(container_name)

Then I enumerate the returned items:

import json
for item in container.query_items(
        query='SELECT * FROM mycontainer r WHERE r.id="item3"',
        enable_cross_partition_query=True):
    print(json.dumps(item, indent=True))

My problem is converting the returned items into a JSON file and saving it to an ADB (Azure Databricks) mount point.


1 answer

  1. HimanshuSinha-msft 19,381 Reputation points Microsoft Employee

    Hello @Arindam Pandit ,
    Thanks for the question and using MS Q&A platform.

    As we understand it, the ask here is to read the records from Cosmos DB and write them out as JSON; please do let us know if that is not accurate.
    To write the data back to cloud storage, you can follow the document here.
    << extract from the link >>

    Upload a file to a directory

    First, create a file reference in the target directory by creating an instance of the DataLakeFileClient class. Upload a file by calling the DataLakeFileClient.append_data method. Make sure to complete the upload by calling the DataLakeFileClient.flush_data method.

    This example uploads a text file to a directory named my-directory.

    def upload_file_to_directory():
        try:
            file_system_client = service_client.get_file_system_client(file_system="my-file-system")
            directory_client = file_system_client.get_directory_client("my-directory")
            file_client = directory_client.create_file("uploaded-file.txt")
            local_file = open("C:\\file-to-upload.txt", 'r')
            file_contents = local_file.read()
            file_client.append_data(data=file_contents, offset=0, length=len(file_contents))
            file_client.flush_data(len(file_contents))
        except Exception as e:
            print(e)
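    The snippet above assumes a `service_client` already exists. A minimal sketch of constructing one is below; the storage account name and key are placeholders, not values from this thread, and the `DataLakeServiceClient` usage is shown commented out since it requires the `azure-storage-file-datalake` package:

    ```python
    def account_url(account_name):
        # ADLS Gen2 (DFS) endpoint for a storage account
        return f"https://{account_name}.dfs.core.windows.net"

    # Assuming azure-storage-file-datalake is installed:
    # from azure.storage.filedatalake import DataLakeServiceClient
    # service_client = DataLakeServiceClient(
    #     account_url=account_url("mystorageaccount"),  # placeholder account name
    #     credential="<account-key>")                   # placeholder key
    ```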

    Please do let me know if you have any further questions.

    For the benefit of the rest of the community, I am sharing below the code that worked for me to enumerate the records in a container.

    from azure.cosmos import CosmosClient
    import json

    url = ""
    key = "accesskey"
    client = CosmosClient(url, credential=key)
    database_name = 'himashu'
    database = client.get_database_client(database_name)
    container_name = 'container1'
    container = database.get_container_client(container_name)
    for item in container.query_items(
            query='SELECT * FROM c',
            enable_cross_partition_query=True):
        print(json.dumps(item, indent=True))
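
    Since the original ask was to land the documents as a JSON file on a Databricks mount point, here is a minimal sketch. On Databricks, any path under `/dbfs/mnt/...` writes through to the mounted cloud storage; the mount name `mydata` and file name `products.json` below are hypothetical:

    ```python
    import json

    def save_items_as_json(items, path):
        # Serialize a list of Cosmos DB documents to a single JSON file.
        # On Databricks, a path under /dbfs/mnt/... lands on the mounted storage.
        with open(path, "w") as f:
            json.dump(items, f, indent=2)

    # Hypothetical usage on a cluster with the mount in place:
    # items = list(container.query_items(query='SELECT * FROM c',
    #                                    enable_cross_partition_query=True))
    # save_items_as_json(items, "/dbfs/mnt/mydata/products.json")
    ```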


    Please do let me know if you have any queries.
