Jobs system table reference

आलेख
07/29/2024

Important

This article provides an reference of how to use the workflow system tables to monitor jobs in your account. These tables include records from all workspaces in your account deployed within the same cloud region. To see records from another regions, you need to view the tables from a workspace deployed in that region.

system.workflow must be enabled by an account admin. You can enable it using the SystemSchemas API.

For examples on using these tables for job cost and health observability, see Monitor job costs with system tables.

Available job tables

All jobs-related system tables live in the system.workflow schema. Currently, the schema hosts four tables:

jobs: Tracks creation, deletion, and basic information of jobs.
job_tasks: Tracks creation, deletion, and basic information of job tasks.
job_run_timeline: Records the start, end, and resulting state of job runs.
job_task_run_timeline: Records the start, end, and resulting state of job tasks.

Job table schema

The jobs table is a slowly changing dimension table. When a row changes, a new row is emitted, logically replacing the previous one.

The table uses the following schema:

Column name	Data type	Description
`account_id`	string	The ID of the account this job belongs to.
`workspace_id`	string	The ID of the workspace this job belongs to.
`job_id`	string	The ID of the job. This key is only unique within a single workspace.
`name`	string	The user-supplied name of the job.
`creator_id`	string	The ID of the principal who created the job.
`tags`	string	The user-supplied custom tags associated with this job.
`change_time`	timestamp	The time when the job was last modified. Timezone information is recorded at the end of the value with `+00:00` representing UTC.
`delete_time`	timestamp	The time when the job was deleted by the user. Timezone information is recorded at the end of the value with `+00:00` representing UTC.
`run_as`	string	The ID of the user or service principal whose permissions are used for the job run.

Job task table schema

The job_tasks table is a slowly changing dimension table. When a row changes, a new row is emitted, logically replacing the previous one.

The table uses the following schema:

Column name	Data type	Description
`account_id`	string	The ID of the account this job belongs to.
`workspace_id`	string	The ID of the workspace this job belongs to.
`job_id`	string	The ID of the job. This key is only unique within a single workspace.
`task_key`	string	The reference key for a task in a job. This key is only unique within a single job.
`depends_on_keys`	array	The task keys of all upstream dependencies of this task.
`change_time`	timestamp	The time when the task was last modified. Timezone information is recorded at the end of the value with `+00:00` representing UTC.
`delete_time`	timestamp	The time when a task was deleted by the user. Timezone information is recorded at the end of the value with `+00:00` representing UTC.

Job run timeline table schema

The job_run_timeline table is immutable and complete at the time it is produced.

The table uses the following schema:

Column name	Data type	Description
`account_id`	string	The ID of the account this job belongs to.
`workspace_id`	string	The ID of the workspace this job belongs to.
`job_id`	string	The ID of the job. This key is only unique within a single workspace.
`run_id`	string	The ID of the job run.
`period_start_time`	timestamp	The start time for the run or for the time period. Timezone information is recorded at the end of the value with `+00:00` representing UTC.
`period_end_time`	timestamp	The end time for the run or for the time period. Timezone information is recorded at the end of the value with `+00:00` representing UTC.
`trigger_type`	string	The type of trigger that can fire a run.
`result_state`	string	The outcome of the job run. See the possible values below this table.

The possible values for the result_state column are:

SUCCEEDED
FAILED
SKIPPED
CANCELLED
TIMED_OUT
ERROR
BLOCKED

The possible values for the trigger_type column are:

CONTINUOUS
CRON
FILE_ARRIVAL
ONETIME
ONETIME_RETRY

Job task run timeline table schema

The job_task_run_timeline table is immutable and complete at the time it is produced.

The table uses the following schema:

Column name	Data type	Description
`account_id`	string	The ID of the account this job belongs to.
`workspace_id`	string	The ID of the workspace this job belongs to.
`job_id`	string	The ID of the job. This key is only unique within a single workspace.
`run_id`	string	The ID of the task run.
`period_start_time`	timestamp	The start time for the task or for the time period. Timezone information is recorded at the end of the value with `+00:00` representing UTC.
`period_end_time`	timestamp	The end time for the task or for the time period. Timezone information is recorded at the end of the value with `+00:00` representing UTC.
`task_key`	string	The reference key for a task in a job. This key is only unique within a single job.
`compute_ids`	array	Array containing the IDs of the non-serverless jobs compute and non-serverless SQL warehouses used by the job task.
`result_state`	string	The outcome of the job task run.

The possible values for the result_state column are:

SUCCEEDED
FAILED
SKIPPED
CANCELLED
TIMED_OUT
ERROR
BLOCKED

Example queries

This section includes sample queries you can use to get the most out of the workflow tables.

Get the most recent version of jobs
Daily job count by workspace
Daily job status distribution by workspace
Longest running jobs overview
Job run analysis
Jobs running on all-purpose compute
Retried job runs

Get the most recent version of jobs

Because the jobs and job_tasks tables are slowly changing dimension tables, a new record is created every time a change is made. To get the most recent version of a job, you can order by the change_time column.

SELECT
  *,
  ROW_NUMBER() OVER(PARTITION BY workspace_id, job_id ORDER BY change_time DESC) as rn
FROM
  system.workflow.jobs QUALIFY rn=1

Daily job count by workspace

This query gets the daily job count by workspace for the last 7 days:

SELECT
  workspace_id,
  COUNT(DISTINCT run_id) as job_count,
  to_date(period_start_time) as date
FROM system.workflow.job_run_timeline
WHERE
  period_start_time > CURRENT_TIMESTAMP() - INTERVAL 7 DAYS
GROUP BY ALL

Daily job status distribution by workspace

This query returns the daily job count by workspace for the last 7 days, distributed by the outcome of the job run. The query removes any records where the jobs are in a pending or running state.

SELECT
  workspace_id,
  COUNT(DISTINCT run_id) as job_count,
  result_state,
  to_date(period_start_time) as date
FROM system.workflow.job_run_timeline
WHERE
  period_start_time > CURRENT_TIMESTAMP() - INTERVAL 7 DAYS
  AND result_state IS NOT NULL
GROUP BY ALL

Longest running jobs overview

This query returns the average time of job runs, measured in seconds. The records are organized by job. A top 90 and a 95 percentile column show the average lengths of the job’s longest runs.

with job_run_duration as (
    SELECT
        workspace_id,
        job_id,
        run_id,
        CAST(SUM(period_end_time - period_start_time) AS LONG) as duration
    FROM
        system.workflow.job_run_timeline
    WHERE
      period_start_time > CURRENT_TIMESTAMP() - INTERVAL 7 DAYS
    GROUP BY ALL
),
most_recent_jobs as (
  SELECT
    *,
    ROW_NUMBER() OVER(PARTITION BY workspace_id, job_id ORDER BY change_time DESC) as rn
  FROM
    system.workflow.jobs QUALIFY rn=1
)
SELECT
    t1.workspace_id,
    t1.job_id,
    first(t2.name, TRUE) as name,
    COUNT(DISTINCT t1.run_id) as runs,
    MEAN(t1.duration) as mean_seconds,
    AVG(t1.duration) as avg_seconds,
    PERCENTILE(t1.duration, 0.9) as p90_seconds,
    PERCENTILE(t1.duration, 0.95) as p95_seconds
FROM
    job_run_duration t1
    LEFT OUTER JOIN most_recent_jobs t2 USING (workspace_id, job_id)
GROUP BY ALL
ORDER BY mean_seconds DESC
LIMIT 100

Job run analysis

This query provides a historical runtime for a specific job. For the query to work, you must set a workspace_id and job_id.

You can also edit the length of time for analysis by updating the amount of days in the INTERVAL 60 DAYS section.

with job_run_duration as (
    SELECT
        workspace_id,
        job_id,
        run_id,
        min(period_start_time) as run_start,
        max(period_start_time) as run_end,
        CAST(SUM(period_end_time - period_start_time) AS LONG) as duration,
        FIRST(result_state, TRUE) as result_state
    FROM
        system.workflow.job_run_timeline
    WHERE
      period_start_time > CURRENT_TIMESTAMP() - INTERVAL 60 DAYS
      AND workspace_id={workspace_id}
      AND job_id={job_id}
    GROUP BY ALL
    ORDER BY run_start DESC
),
most_recent_jobs as (
  SELECT
    *,
    ROW_NUMBER() OVER(PARTITION BY workspace_id, job_id ORDER BY change_time DESC) as rn
  FROM
    system.workflow.jobs QUALIFY rn=1
)
SELECT
    t1.workspace_id,
    t1.job_id,
    t2.name,
    t1.run_id,
    t1.run_start,
    t1.run_end,
    t1.duration,
    t1.result_state
FROM job_run_duration t1
    LEFT OUTER JOIN most_recent_jobs t2 USING (workspace_id, job_id)

Jobs running on all-purpose compute

This query joins with the compute.clusters system table to return recent jobs that are running on all-purpose compute instead of jobs compute.

with clusters AS (
  SELECT
    *,
    ROW_NUMBER() OVER(PARTITION BY workspace_id, cluster_id ORDER BY change_time DESC) as rn
  FROM system.compute.clusters
  WHERE cluster_source="UI" OR cluster_source="API"
  QUALIFY rn=1
),
job_tasks_exploded AS (
  SELECT
    workspace_id,
    job_id,
    EXPLODE(compute_ids) as cluster_id
  FROM system.workflow.job_task_run_timeline
  WHERE period_start_time >= CURRENT_DATE() - INTERVAL 30 DAY
),
all_purpose_cluster_jobs AS (
  SELECT
    t1.*,
    t2.cluster_name,
    t2.owned_by,
    t2.dbr_version
  FROM job_tasks_exploded t1
    INNER JOIN clusters t2 USING (workspace_id, cluster_id)
)
SELECT * FROM all_purpose_cluster_jobs LIMIT 10;

Retried job runs

This query collects a list of retried job runs with the number of retries for each run.

with repaired_runs as (
  SELECT
    workspace_id, job_id, run_id, COUNT(*) - 1 as retries_count
  FROM system.workflow.job_run_timeline
  WHERE result_state IS NOT NULL
  GROUP BY ALL
  HAVING retries_count > 0
)
SELECT
  *
FROM repaired_runs
ORDER BY retries_count DESC
LIMIT 10;

इसके माध्यम से साझा किया गया

Jobs system table reference

Available job tables

Job table schema

Job task table schema

Job run timeline table schema

Job task run timeline table schema

Example queries

Get the most recent version of jobs

Daily job count by workspace

Daily job status distribution by workspace

Longest running jobs overview

Job run analysis

Jobs running on all-purpose compute

Retried job runs

प्रतिक्रिया

प्रतिक्रिया

अतिरिक्त संसाधन