Load multiple JSON files through an API URL

Shambhu Rai
2023-07-07T08:16:47.0866667+00:00

Hi Expert,

How can I load multiple JSON files using an API URL into Blob storage or Databricks?


1 answer

  1. PRADEEPCHEEKATLA (Moderator)
    2023-07-07T08:46:08.9033333+00:00

    @Shambhu Rai - Thanks for the question and for using the MS Q&A platform.

    To load multiple JSON files using an API URL in Azure Databricks, you can use the dbutils.fs.mount command to mount the Blob storage container that contains the JSON files, and then use the spark.read.json command to read the files into a DataFrame.

    Here's an example code snippet that demonstrates how to do this:

    # Mount the Blob storage container
    storage_account_name = "<your-storage-account-name>"
    container_name = "<your-container-name>"
    storage_account_access_key = "<your-storage-account-access-key>"
    mount_point = "/mnt/<your-mount-point>"
    dbutils.fs.mount(
      source=f"wasbs://{container_name}@{storage_account_name}.blob.core.windows.net",
      mount_point=mount_point,
      extra_configs={
        f"fs.azure.account.key.{storage_account_name}.blob.core.windows.net": storage_account_access_key
      }
    )
    
    # Read the JSON files into a DataFrame
    json_files_path = f"{mount_point}/path/to/json/files/*.json"
    df = spark.read.json(json_files_path)
    

    In this example, you'll need to replace the placeholders <your-storage-account-name>, <your-container-name>, <your-storage-account-access-key>, <your-mount-point>, and /path/to/json/files/ with your own values.
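
    If the notebook may run more than once, note that dbutils.fs.mount raises an error when the mount point is already in use. A minimal sketch of an idempotent mount check, reusing the placeholder names above:

    # Mount only if the mount point is not already in use;
    # dbutils.fs.mount raises an error when it is
    if not any(m.mountPoint == mount_point for m in dbutils.fs.mounts()):
        dbutils.fs.mount(
            source=f"wasbs://{container_name}@{storage_account_name}.blob.core.windows.net",
            mount_point=mount_point,
            extra_configs={
                f"fs.azure.account.key.{storage_account_name}.blob.core.windows.net": storage_account_access_key
            }
        )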

    Alternatively, if you don't want to mount the Blob storage container, you can pass a storage URL that Spark supports (such as a wasbs:// or abfss:// path) directly to spark.read.json. Note that spark.read.json reads from filesystems Spark can access; it can't fetch from a plain HTTP endpoint. Here's an example code snippet that demonstrates how to do this:

    # Read the JSON files into a DataFrame directly from a storage URL
    json_files_url = "<your-storage-url>"
    df = spark.read.json(json_files_url)


    In this example, you'll need to replace the placeholder <your-storage-url> with the URL of the storage location that contains the JSON files.
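
    If the JSON really does come from an HTTP API endpoint, one option is to fetch the payload on the driver and let Spark parse it. A minimal sketch, assuming the placeholder endpoint <your-api-url> returns a JSON array of records:

    import requests

    # Fetch the JSON payload from the API endpoint (placeholder URL)
    response = requests.get("<your-api-url>")
    response.raise_for_status()

    # spark.read.json also accepts an RDD of JSON strings, so the
    # response body can be parsed without writing it to storage first
    df = spark.read.json(spark.sparkContext.parallelize([response.text]))
    df.show()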

    As a repro, I tested this in my environment.

    I have 3 JSON blob files inside a subfolder of my container in the storage account, and I am able to read all of them into a single DataFrame.

    You can use the code below to load all the JSON files from the subfolder into a single DataFrame and display it:

    df = spark.read.json("wasbs://container_name@blob_storage_account.blob.core.windows.net/sub_folder/*.json")
    df.show()
    
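    Note that reading the wasbs:// path directly (without a mount) requires the storage account key to be set in the Spark session configuration first. A minimal sketch, reusing the placeholder names from the mount example above:

    # Provide the account key so Spark can read the wasbs:// path directly
    spark.conf.set(
        f"fs.azure.account.key.{storage_account_name}.blob.core.windows.net",
        storage_account_access_key
    )
    df = spark.read.json(f"wasbs://{container_name}@{storage_account_name}.blob.core.windows.net/sub_folder/*.json")
    df.show()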

    For more details, refer to Azure Databricks - JSON File.

    Hope this helps. Do let us know if you have any further queries.


    If this answers your query, do click Accept Answer and Yes for "Was this answer helpful?".

