Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
Manage ingestion jobs in Agentic Retrieval in Foundry Local. The ingestion API ingests documents from NFS or SharePoint data sources into a collection. The API parses, chunks, embeds, and stores documents in the collection's vector collections and database tables.
Important
Agentic Retrieval in Foundry Local is currently in PREVIEW. See the Supplemental Terms of Use for Microsoft Azure Previews for legal terms that apply to Azure features that are in beta, preview, or otherwise not yet released into general availability.
API information
All endpoints require the api-version=2024-10-01-preview query parameter.
| Property | Value |
|---|---|
| API version | 2024-10-01-preview |
| Service | Ingestion API (ingestionapi) |
| Port | 8000 |
| Dapr App ID | ingestionapi |
Access Methods
You can reach the Ingestion API through three access methods. The access method determines whether authentication is required.
External access (via ingress)
https://<cluster-domain>/edgeai/ingestion/...
Requires a valid JWT token (see Authentication). The ingress adds the X-External-Request header which triggers Entra Auth sidecar validation.
Port forwarding (for development and testing)
kubectl port-forward deployment/ingestionapi-deployment 8000:8000 -n arc-rag
Then call http://localhost:8000/edgeai/ingestion/.... No Authorization header needed.
Internal access (from a pod via Dapr)
curl -H "dapr-app-id: ingestionapi" http://localhost:3500/edgeai/ingestion/...
No Authorization header needed for internal Dapr calls.
Authentication
Required only for external access (via ingress). Port forwarding and internal Dapr calls bypass authentication.
All external API calls require a valid Entra ID JWT token with the EdgeRAGDeveloper app role.
az account clear
az login --tenant <your-tenant-id> --output none
TOKEN=$(az account get-access-token \
--resource "api://<your-app-client-id>" \
--query accessToken -o tsv)
Tokens expire after about one hour. If you receive a 401 Unauthorized, acquire a fresh token.
Create
Starts an ingestion job that parses, chunks, embeds, and stores documents from an NFS or SharePoint data source into a collection.
Request
POST /edgeai/ingestion/jobs/{job_id}?api-version=2024-10-01-preview
URI parameters
| Name | In | Required | Type | Description |
|---|---|---|---|---|
job_id |
path | True | String | A user-chosen name for the ingestion job (for example, nfs-quarterly-reports). Reusing the same job_id creates a new "run" under that job. |
api-version |
query | True | String | The API version to use. |
Request headers
| Name | Required | Type | Description |
|---|---|---|---|
Content-Type |
True | String | application/json |
Authorization |
Conditional | String | Bearer {token}. Required for external access only. |
Request body
| Name | Required | Type | Description |
|---|---|---|---|
datasource.kind |
True | String | "nfsv3" or "sharepoint". |
datasource.connection.nfsServerPath |
Conditional | String | NFS server path in format server_ip:/path. Required when kind is "nfsv3". |
datasource.connection.nfsUid |
False | Integer | NFS user ID. Default: 0. |
datasource.connection.nfsGid |
False | Integer | NFS group ID. Default: 0. |
datasource.connection.sharePointUrl |
Conditional | String | SharePoint site URL. Required when kind is "sharepoint". |
datasource.connection.sharePointFolderPath |
Conditional | String | SharePoint folder path. Required when kind is "sharepoint". |
datasource.connection.sharePointUsername |
Conditional | String | SharePoint username in DOMAIN\username format. Required when kind is "sharepoint". |
datasource.connection.sharePointPassword |
Conditional | String | RSA-encrypted base64-encoded password. Required when kind is "sharepoint". |
datasource.chunking.chunkSize |
True | String | Chunk size in characters. Must be a string, not an integer. |
datasource.chunking.chunkOverlap |
True | String | Overlap between chunks in characters. Must be a string, not an integer. |
collectionName |
False | String | Target collection name. Must exist before ingestion (create via Collections API). Default: "edgeragapp". The alias embeddingIndexName is also accepted for backward compatibility. |
dataRefreshIntervalInHours |
False | Integer | -1 = one-time ingestion, 1/4/24 = recurring sync. Default: -1. |
parsingMode |
False | String | "basic" (LangChain) or "advanced" (Docling). Default: "basic". Use "advanced" for better results. |
excludeLgrFailureUpdate |
False | Boolean | If true, don't mark job as FAILED when LazyGraphRAG build fails. Default: false. |
Example: NFS data source
Request
POST /edgeai/ingestion/jobs/q1-ingest?api-version=2024-10-01-preview
Content-Type: application/json
Authorization: Bearer eyJ0eX...FWSXfwtQ
{
"datasource": {
"kind": "nfsv3",
"connection": {
"nfsServerPath": "10.244.3.70:/exports/my-data"
},
"chunking": {
"chunkSize": "2000",
"chunkOverlap": "200"
}
},
"collectionName": "my-collection",
"parsingMode": "advanced"
}
Response
Status code: 200 OK
Response headers:
operation-location: /edgeai/ingestion/jobs/q1-ingest/runs/{run_id}?api-version=2024-10-01-preview
Response body:
{
"status": "SUCCESS",
"message": {
"datasource": { "..." : "..." },
"dataRefreshIntervalInHours": -1,
"embeddingIndexName": "my_collection",
"originalCollectionName": "my-collection",
"ingestionGroupName": "",
"parsingMode": "advanced",
"excludeLgrFailureUpdate": false
}
}
The embeddingIndexName in the response shows the sanitized storage prefix with underscores. The originalCollectionName shows the original name you provided.
Example: SharePoint data source
Request
POST /edgeai/ingestion/jobs/sp-ingest?api-version=2024-10-01-preview
Content-Type: application/json
Authorization: Bearer eyJ0eX...FWSXfwtQ
{
"datasource": {
"kind": "sharepoint",
"connection": {
"sharePointUrl": "https://company.sharepoint.com/sites/docs",
"sharePointFolderPath": "/sites/site-name/Shared Documents",
"sharePointUsername": "DOMAIN\\username",
"sharePointPassword": "<RSA-encrypted-base64>"
},
"chunking": {
"chunkSize": "2000",
"chunkOverlap": "200"
}
},
"collectionName": "my-collection",
"parsingMode": "advanced"
}
Response codes
| Code | Description |
|---|---|
| 200 | OK. The service accepts the ingestion job. The operation-location response header contains the URI to monitor the run. |
| 400 | Bad Request. Collection not found (COLLECTION_NOT_FOUND) or invalid NFS path. |
| 422 | Unprocessable Entity. Missing required fields (Pydantic validation error). |
List
Returns all ingestion job runs.
Request
GET /edgeai/ingestion/jobs?api-version=2024-10-01-preview
URI parameters
| Name | In | Required | Type | Description |
|---|---|---|---|---|
api-version |
query | True | String | The API version to use. |
Request headers
| Name | Required | Type | Description |
|---|---|---|---|
Authorization |
Conditional | String | Bearer {token}. Required for external access only. |
Example
Request
GET /edgeai/ingestion/jobs?api-version=2024-10-01-preview
Authorization: Bearer eyJ0eX...FWSXfwtQ
Response
Status code: 200 OK
{
"jobRuns": [
{
"jobId": "my-job",
"runId": "abc123def456",
"status": "COMPLETED",
"startTime": "2026-03-04T12:00:00+00:00",
"endTime": "2026-03-04T12:05:00+00:00",
"createdBy": "<your-display-name>",
"collectionName": "my-collection"
}
]
}
Response codes
| Code | Description |
|---|---|
| 200 | OK. The response contains the list of job runs. |
Job status values
| Status | Description |
|---|---|
PENDING |
The job is queued and hasn't started. |
RUNNING |
The job is currently processing documents. |
COMPLETED |
The job finished successfully. |
FAILED |
The job encountered an error. |
Get Run
Gets detailed information about a specific ingestion run, including progress and error information.
Request
GET /edgeai/ingestion/jobs/{job_id}/runs/{run_id}?api-version=2024-10-01-preview
URI parameters
| Name | In | Required | Type | Description |
|---|---|---|---|---|
job_id |
path | True | String | The ingestion job name. |
run_id |
path | True | String | The auto-generated run identifier (returned in the operation-location header). |
api-version |
query | True | String | The API version to use. |
Request headers
| Name | Required | Type | Description |
|---|---|---|---|
Authorization |
Conditional | String | Bearer {token}. Required for external access only. |
Response codes
| Code | Description |
|---|---|
| 200 | OK. The response contains detailed run info including totalItems, processedItems, failedItems, and errorInfo. |
Delete Runs
Deletes all runs for a given job. You can't delete runs while jobs are in PENDING or RUNNING status.
Request
DELETE /edgeai/ingestion/jobs/{job_id}/runs?api-version=2024-10-01-preview
URI parameters
| Name | In | Required | Type | Description |
|---|---|---|---|---|
job_id |
path | True | String | The ingestion job name. |
api-version |
query | True | String | The API version to use. |
Request headers
| Name | Required | Type | Description |
|---|---|---|---|
Authorization |
Conditional | String | Bearer {token}. Required for external access only. |
Response codes
| Code | Description |
|---|---|
| 200 | OK. The job runs were deleted. |
| 409 | Conflict. Cannot delete while jobs are in PENDING or RUNNING status. |
Related content
- Collections API Reference — Create and manage collections before ingesting
- Inference API Reference — Query collections using RAG after ingestion