> [!NOTE]
> Apache Airflow job is powered by Apache Airflow.
You can configure and manage the Apache Airflow runtime settings in an Apache Airflow job, as well as the default Apache Airflow runtime for the workspace. Apache Airflow job offers two types of environment settings: starter pools and custom pools. You can use the starter pool, which is configured by default, or create custom pools for your workspace. If the workspace setting for customizing compute configurations for items is disabled, the starter pool is used for all environments in the workspace. Starter pools offer an instant Apache Airflow runtime that is automatically deprovisioned when not in use, while custom pools provide more flexibility with an always-on Apache Airflow runtime. This article describes each setting and suggests scenarios for its use.
Starter Pool and Custom Pool
The following table lists the properties of both pools.

| Property | Starter pool (default) | Custom pool |
|---|---|---|
| Size | Compute node size: Large | Configurable: you can set 'Compute node size', 'Extra nodes', and 'Enable autoscale' |
| Startup latency | Instantaneous | Starts in the stopped state |
| Resume latency | Up to 5 minutes | Up to 5 minutes |
| Pool uptime behavior | Shuts down after 20 minutes of inactivity in the Airflow environment | Always on until manually paused |
| Suggested environments | Development | Production |
Configure Custom Pool
Go to your workspace settings.
In the 'Data Factory' section, select 'Data Workflow Settings.'
The Default Data Workflow Setting is set to Starter Pool by default. To switch to a custom pool, expand the dropdown menu labeled 'Default Data Workflow Setting' and select 'New Pool.'
Customize the following properties according to your needs:
- Name: Give your pool a suitable name.
- Compute node size: The size of the compute node to run your environment on. Choose 'Large' for running complex or production DAGs and 'Small' for running simpler Directed Acyclic Graphs (DAGs).
- Enable autoscale: Allows your Apache Airflow pool to scale nodes up or down as needed.
- Extra nodes: Extra nodes let the pool run more DAGs concurrently. Each node provides the capacity to run three more workers.
Select 'Create' to finalize your configuration.
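To make the 'Extra nodes' capacity math concrete, here is a minimal sketch of how extra nodes translate into worker capacity, using the figure above of three workers per node. The helper name `total_worker_capacity` and the assumption that a pool starts with one base node are illustrative, not part of the product documentation.

```python
# Sketch of the pool capacity math: each node can run three Airflow workers
# (per the property list above). Assumes one base node per pool, which is an
# illustrative assumption, not a documented value.
WORKERS_PER_NODE = 3

def total_worker_capacity(extra_nodes: int) -> int:
    """Return the worker capacity of a pool with one base node plus extra nodes."""
    if extra_nodes < 0:
        raise ValueError("extra_nodes must be non-negative")
    return (1 + extra_nodes) * WORKERS_PER_NODE

# Example: a pool configured with 2 extra nodes.
print(total_worker_capacity(2))  # 9 workers available for concurrent DAG runs
```

Under these assumptions, adding extra nodes scales worker capacity linearly, which is why extra nodes are the lever for running more DAGs concurrently.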