Containerizing your tool with Docker ensures it runs consistently across different hardware, compute pools, and environments within Microsoft Discovery. This article walks through organizing your tool's project, writing a Dockerfile, and validating the container image locally.
Note
This article assumes you understand your tool's requirements and have written any action scripts. See Plan tool requirements and Write action scripts.
Prerequisites
Before creating a Dockerfile, ensure the following are installed and configured on your local machine:
- Docker Desktop (version 20.10.0 or later) or Docker Engine. Download from the Docker official website. Verify that the Docker daemon is running: docker info
- Azure CLI (version 2.0.80 or later), configured with an active login: az login. Required when pushing to Azure Container Registry in a later step.
- A text editor or IDE with Dockerfile support. VS Code with the Docker extension is recommended.
Step 1: Organize your project directory
Organize your tool's files before writing the Dockerfile. A consistent structure makes the Dockerfile easier to write and maintain.
Recommended structure:
<tool-name>/
├── Dockerfile # Container definition (created in this article)
├── README.md # Documentation
├── tool.yaml # Tool definition (created in a later step)
└── app/                   # Core implementation scripts
    ├── entrypoint.py      # Main entrypoint for all actions
    ├── io_utils.py        # I/O utilities and logging helpers
    └── <action_module>.py # One module per action
Step 2: Create the Dockerfile
Create a file named Dockerfile (no extension) in the root of your tool directory.
The structure of your Dockerfile depends on your tool's language runtime and dependencies.
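As a starting point, a minimal sketch for a Python-based tool might look like the following. The python:3.11-slim base image and the requirements.txt path are illustrative assumptions; adapt them to your tool's runtime and dependencies.

```dockerfile
# Minimal sketch for a Python-based tool; adjust versions and paths to your project.
FROM python:3.11-slim

# Install Python dependencies first so this layer is cached across code changes.
COPY requirements.txt /tmp/requirements.txt
RUN pip install --no-cache-dir -r /tmp/requirements.txt

# Copy the action scripts into the image.
COPY app/ /app/

WORKDIR /app
```

This layout leaves the entrypoint unspecified so that the docker run commands in a later step can invoke python3 /app/entrypoint.py explicitly.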
Step 3: Adapt for GPU or MPI tools
If your tool requires GPU acceleration or MPI-based distributed computing, the base image and system dependencies change accordingly.
GPU tools (CUDA):
FROM nvidia/cuda:12.2.0-runtime-ubuntu22.04
# Install CUDA-compatible Python environment
RUN apt-get update && apt-get install -y python3 python3-pip && \
    rm -rf /var/lib/apt/lists/*
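To run a CUDA image locally with GPU access, the host needs the NVIDIA Container Toolkit installed; the --gpus flag then exposes host GPUs to the container. For example, a quick check (assuming the image name used in this article):

```shell
# Requires the NVIDIA Container Toolkit on the host.
docker run --rm --gpus all <tool-name>:latest nvidia-smi
```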
MPI tools (tightly coupled parallel workloads):
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y \
    openmpi-bin \
    libopenmpi-dev \
    python3 python3-pip \
    && rm -rf /var/lib/apt/lists/*
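A quick local smoke test for an MPI image can launch a trivial command under mpirun. Open MPI refuses to run as root unless --allow-run-as-root is passed, and containers typically run as root by default, so the flag is usually needed here:

```shell
docker run --rm <tool-name>:latest mpirun --allow-run-as-root -np 2 hostname
```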
Step 4: Build the container image locally
Navigate to your tool directory and build the image:
cd <tool-name>
docker build -t <tool-name>:latest .
Build time varies with the number and size of your dependencies. Subsequent builds are faster because Docker caches unchanged layers.
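To keep the build context small and the layer cache effective, you can add a .dockerignore file next to the Dockerfile. A minimal sketch (the entries are illustrative; match them to your project):

```
# .dockerignore — exclude files that don't belong in the build context
.git
__pycache__/
*.pyc
input/
output/
```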
Step 5: Test the container image locally
Before publishing, verify the container image runs correctly with test inputs.
Verify the environment:
docker run --rm <tool-name>:latest python3 -c "import rdkit; print('RDKit OK')"
Run an action with mounted test data:
docker run --rm \
  -v "$(pwd)/input:/input" \
  -v "$(pwd)/output:/output" \
  <tool-name>:latest \
  python3 /app/entrypoint.py \
    --action <action-name> \
    --input /input \
    --output /output
Inspect the output:
ls ./output/
cat ./output/results.json
Confirm that:
- The container exits with code 0
- The expected output files are present in /output/
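Both checks can be scripted. The helper below is a sketch: run_and_check is a hypothetical name, and the ./output/results.json path assumes the output layout used in this article.

```shell
# run_and_check CMD... : run CMD, then assert a zero exit code and that
# ./output/results.json was produced. Returns non-zero on any failure.
run_and_check() {
  "$@"
  status=$?
  if [ "$status" -ne 0 ]; then
    echo "command failed with exit code $status" >&2
    return 1
  fi
  if [ ! -f ./output/results.json ]; then
    echo "expected output file ./output/results.json is missing" >&2
    return 1
  fi
  echo "local validation passed"
}

# Example (substitute your docker run invocation):
# run_and_check docker run --rm -v "$(pwd)/output:/output" <tool-name>:latest \
#   python3 /app/entrypoint.py --action <action-name> --output /output
```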
Next steps
Once your container image builds and passes local tests, publish it to Azure Container Registry.