Sandboxes
Applies to: ✅ Azure Data Explorer
Kusto can run sandboxes for specific flows that must be run in a secure and isolated environment. Examples of these flows are user-defined scripts that run using the Python plugin or the R plugin.
Sandboxes are run locally (meaning, processing is done close to the data), with no extra latency for remote calls.
Prerequisites and limitations
- Sandboxes must run on VM sizes supporting nested virtualization, which implemented using Hyper-V technology and have no limitations.
- The image for running the sandboxes is deployed to every cluster node and requires dedicated SSD space to run.
- The estimated size is between 10-20 GB.
- This affects the cluster's data capacity, and may affect the cost of the cluster.
Runtime
- A sandboxed query operator may use one or more sandboxes for its execution.
- A sandbox is only used for a single query and is disposed of once that query completes.
- When a node is restarted, for example, as part of a service upgrade, all running sandboxes on it are disposed of.
- Each node maintains a predefined number of sandboxes that are ready for running incoming requests.
- Once a sandbox is used, a new one is automatically made available to replace it.
- If there are no pre-allocated sandboxes available to serve a query operator, it will be throttled until new sandboxes are available. For more information, see Errors. New sandbox allocation could take up to 10-15 seconds per sandbox, depending on the SKU and available resources on the data node.
Sandbox parameters
Some of the parameters can be controlled using a cluster-level sandbox policy, for each kind of sandbox.
- Number of sandboxes per node: The number of sandboxes per node is limited.
- Requests that are made when there's no available sandbox will be throttled.
- Initialize on startup: if set to
false
(default), sandboxes are lazily initialized on a node, the first time a query requires a sandbox for its execution. Otherwise, if set totrue
, sandboxes are initialized as part of service startup.- This means that the first execution of a plugin that uses sandboxes on a node will include a short warm-up period.
- CPU: The maximum rate of CPU a sandbox can consume of its host's processors is limited (default is
50%
).- When the limit is reached, the sandbox's CPU use is throttled, but execution continues.
- Memory: The maximum amount of RAM a sandbox can consume of its host's RAM is limited.
- Default memory for Hyper-V technology is 1 GB, and for legacy sandboxes 20 GB.
- Reaching the limit results in termination of the sandbox, and a query execution error.
Sandbox limitations
- Network: A sandbox can't interact with any resource on the virtual machine (VM) or outside of it.
- A sandbox can't interact with another sandbox.
Note
The resources used with sandbox depend not only on the size of the data being processed as part of the request,
but also on the logic that runs in the sandbox, and the implementation of libraries being used by it.
For example, for the python
and r
plugins, the latter means the user-provided script and the Python or R libraries it consumes at runtime.
Errors
ErrorCode | Status | Message | Potential reason |
---|---|---|---|
E_SB_QUERY_THROTTLED_ERROR | TooManyRequests (429) | The sandboxed query was aborted because of throttling. Retrying after some backoff might succeed | There are no available sandboxes on the target node. New sandboxes should become available in a few seconds |
E_SB_QUERY_THROTTLED_ERROR | TooManyRequests (429) | Sandboxes of kind '{kind}' haven't yet been initialized | The sandbox policy has recently changed. New sandboxes obeying the new policy will become available in a few seconds |
InternalServiceError (520) | The sandboxed query was aborted due to a failure in initializing sandboxes | An unexpected infrastructure failure. |
VM Sizes supporting nested virtualization
The following table lists all modern VM sizes that support Hyper-V sandbox technology.
Name | Category |
---|---|
Standard_L8s_v3 | storage-optimized |
Standard_L16s_v3 | storage-optimized |
Standard_L8as_v3 | storage-optimized |
Standard_L16as_v3 | storage-optimized |
Standard_E8as_v5 | storage-optimized |
Standard_E16as_v5 | storage-optimized |
Standard_E8s_v4 | storage-optimized |
Standard_E16s_v4 | storage-optimized |
Standard_E8s_v5 | storage-optimized |
Standard_E16s_v5 | storage-optimized |
Standard_E2ads_v5 | compute-optimized |
Standard_E4ads_v5 | compute-optimized |
Standard_E8ads_v5 | compute-optimized |
Standard_E16ads_v5 | compute-optimized |
Standard_E2d_v4 | compute-optimized |
Standard_E4d_v4 | compute-optimized |
Standard_E8d_v4 | compute-optimized |
Standard_E16d_v4 | compute-optimized |
Standard_E2d_v5 | compute-optimized |
Standard_E4d_v5 | compute-optimized |
Standard_E8d_v5 | compute-optimized |
Standard_E16d_v5 | compute-optimized |
Standard_D32d_v4 | compute-optimized |