Task Pipeline Execution

In summary

  • Executing a Schedule amounts to starting the execution of each of the tasks in the Pipeline

  • DC-Maestro does not execute tasks itself but sends execution requests

  • Only the Schedule’s Owner and contributors can manage its execution

  • It is not possible to run an ina

  • DataChain tasks are always executed on behalf of the Schedule’s Owner, and therefore according to their DataChain rights and permissions

  • Each execution updates global execution status and task status

  • The execution log details each task execution and the errors potentially encountered

  • The properties of the Workflow, including tasks and their connections, define the order of execution of Pipeline tasks

  • A failed task blocks the execution of the following tasks

  • Execution can be manual or automatic

  • The execution history lets you consult past executions

  • There is a difference between the "logical" order of execution, visible in the Workflow and the list of executions, and the actual order of execution of tasks, visible in the execution history.

When a Schedule is created, the orchestrator must synchronize the tasks in the Pipeline. The Schedule cannot be run during this phase.

Synchronizing the schedule with the orchestrator

[Figure: planning-synchronization.svg, "Planning being synchronized"]

If a synchronization error occurs, try saving the Schedule again or wait a few minutes.

Execution status

Execution statuses describe the state of task and Schedule executions.

There are two kinds of execution status:

  • An execution status specific to each task

  • A global execution status, carried by the Schedule, which depends on the statuses of all its tasks

Statuses are updated with each execution.
The status of past executions can be viewed from the Schedule execution history tab.

Global execution status

The execution status of the Schedule is updated with each execution, depending on the execution statuses of its tasks.
This status is visible from the Schedules list and the Schedule execution history.
It always reflects the most significant task status: for example, a single failed task marks the whole Schedule as failed.

Table 1. Global execution status

Status                         Meaning
status schedule waiting        Schedule was never executed
status schedule in progress    At least one task is running
status schedule success        All tasks completed successfully
status schedule failed         At least one task failed
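The rules in the table above can be sketched as a small function. This is only an illustration, not DC-Maestro's implementation: the `TaskStatus` enum and the `global_status` function are hypothetical names, and the precedence between statuses (a failure outranks a running task, and a task in preparation counts as "in progress") is an assumption that the documentation does not confirm.

```python
from enum import Enum


class TaskStatus(Enum):
    # Hypothetical names mirroring the task statuses described in this page.
    NONE = "none"
    WAITING = "waiting"
    IN_PROGRESS = "in progress"
    SUCCESS = "success"
    FAILED = "failed"
    BLOCKED = "blocked"


def global_status(task_statuses):
    """Derive a Schedule-level status from the statuses of its tasks."""
    statuses = set(task_statuses)
    # Assumption: a failure is the most significant status.
    if TaskStatus.FAILED in statuses:
        return "failed"
    # Assumption: a task being prepared counts as the Schedule being in progress.
    if statuses & {TaskStatus.IN_PROGRESS, TaskStatus.WAITING}:
        return "in progress"
    # Success requires every task to have succeeded.
    if statuses == {TaskStatus.SUCCESS}:
        return "success"
    # Anything else (no tasks, or tasks never executed) is treated as waiting.
    return "waiting"
```

For example, under these assumptions, `global_status([TaskStatus.SUCCESS, TaskStatus.FAILED, TaskStatus.BLOCKED])` returns `"failed"`.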

Task Execution Status

Each task has its own execution status, updated at each execution.

Specific Execution Status    Meaning
status none                  The task has not yet been executed
status waiting               Task is being prepared for execution
status in progress           Task is running
status success               Task completed successfully
status failed                The task encountered a fatal error and was not executed
status blocked               The task was not executed because the previous task failed
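The relationship between the "failed" and "blocked" statuses can be illustrated with a minimal sketch of a linear Pipeline. The function `run_pipeline` and the `execute` callback are hypothetical and only model the behavior described above; they are not part of DC-Maestro.

```python
def run_pipeline(tasks, execute):
    """Run linked tasks in order and return a {task: status} mapping.

    `execute` is a hypothetical callable that returns True when the
    task succeeds and False when it fails.
    """
    statuses = {task: "none" for task in tasks}
    failed = False
    for task in tasks:
        if failed:
            # A previous task failed: this task is not executed.
            statuses[task] = "blocked"
            continue
        statuses[task] = "success" if execute(task) else "failed"
        failed = statuses[task] == "failed"
    return statuses


# Hypothetical three-task Pipeline in which the second task fails.
result = run_pipeline(["extract", "transform", "load"], lambda t: t != "transform")
print(result)
# {'extract': 'success', 'transform': 'failed', 'load': 'blocked'}
```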

Execution Log

Each task execution generates a file called the execution log.
To consult it, click on the status of a task, either from the Schedules list or from the history tab of a Schedule.
The log provides detailed information about the execution, including any problems encountered during the run.

The log may take a moment to become available, for example if the Schedule has just been executed. In this case, wait a few minutes.
It is not possible to display the log of a task that has been deleted.

Execution history

The execution history lets you track how executions evolve over time and consult the log of a task for a past execution.

Execution history.

Order of execution

It is important to note that DC-Maestro does not execute the tasks itself but sends execution requests (e.g. to DataChain).
Therefore, the task execution order does not depend entirely on DC-Maestro.

If tasks are not related to each other, execution requests are sent in parallel and in a random order.
For example, when DC-Maestro sends 3 DataChain tasks to be executed in parallel, the order of execution is defined by DataChain, at the time of receipt of the request and according to DataChain server availability.

To define a precise sending order, it is necessary to link the tasks together using the Workflow.
If the Workflow includes both linked and unlinked tasks, DC-Maestro sends the requests for the unlinked tasks in parallel with the linked tasks, and the linked tasks are executed in the order defined in the Workflow.

In order to offer a harmonized and shared view for all the Schedules, the tasks are automatically reorganized each time you save, or when you click on the Reorganize button.

In the example below, the execution of the last task was blocked due to the failure of one of the preceding tasks.

workflow task order.

The order is defined by the entry and exit links (left and right of a task), not by the vertical position of the tasks (top or bottom).
Two tasks linked to the same previous task are executed independently of each other: either may run before, after, or at the same time as the other.
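The effect of links on the sending order can be sketched with Python's standard `graphlib` module. This is a simplified model under stated assumptions, not DC-Maestro's scheduler: the function `dispatch_waves`, the task names, and the grouping into "waves" are all hypothetical. Each wave contains tasks whose predecessors have all completed; within a wave, requests can be sent in any order, and the actual execution order is decided by the system that receives them.

```python
from graphlib import TopologicalSorter


def dispatch_waves(tasks, links):
    """Group tasks into successive waves of requests that can be sent in parallel.

    `links` maps a task to the set of tasks that must complete before it.
    Tasks absent from `links` are unlinked and can be sent immediately.
    """
    graph = {task: set(links.get(task, ())) for task in tasks}
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        # Sorted only to make the output stable; the real order is not guaranteed.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical Workflow: B and C both follow A, D follows B and C, E is unlinked.
waves = dispatch_waves(
    ["A", "B", "C", "D", "E"],
    {"B": {"A"}, "C": {"A"}, "D": {"B", "C"}},
)
print(waves)  # [['A', 'E'], ['B', 'C'], ['D']]
```

In this sketch, B and C land in the same wave because they share the same predecessor, which matches the rule that two tasks linked to the same previous task run independently of each other.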

Logical execution order and effective execution order

The logical execution order visible in the execution list is calculated from the scheduling of existing tasks defined in the workflow.

This order may differ from the actual order of a given execution, which is visible in the history. The history provides precise information about the order in which the tasks scheduled for a given execution were actually executed by the orchestrator.

Execution type

The execution type can be:

  • manual only: the user must be logged in and click on Play.

  • automatic: the schedule runs independently of logged-in users, at the defined frequency, and can also be run manually.

The type can be changed in the "Settings" tab.

Manual Execution

A manual Schedule can be run freely, whenever needed.
This is the ideal choice for running Schedules occasionally or when the frequency cannot be predicted in advance.

We recommend choosing this mode when creating a Schedule in order to test the task Pipeline freely.

Autorun

An active automatic Schedule runs according to the set frequency, regardless of logged-in users.
It is the right choice when a stable Pipeline moves into production data processing.

From the list of executions, the CRON frequency and the date of the next scheduled execution are indicated in local time and UTC.
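To illustrate how a CRON frequency translates into a next execution date shown in both UTC and local time, here is a deliberately limited sketch. It only handles daily expressions of the form `MINUTE HOUR * * *`, and it assumes the expression is evaluated in UTC, which this documentation does not state. The function `next_daily_run` and the UTC+1 local zone are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def next_daily_run(cron, now_utc):
    """Return the next run (in UTC) of a 'MINUTE HOUR * * *' CRON expression."""
    minute, hour, day_of_month, month, day_of_week = cron.split()
    if (day_of_month, month, day_of_week) != ("*", "*", "*"):
        raise ValueError("this sketch only handles daily expressions")
    candidate = now_utc.replace(
        hour=int(hour), minute=int(minute), second=0, microsecond=0
    )
    # If today's slot has already passed, the next run is tomorrow.
    if candidate <= now_utc:
        candidate += timedelta(days=1)
    return candidate


now = datetime(2024, 1, 15, 7, 0, tzinfo=timezone.utc)
next_utc = next_daily_run("30 6 * * *", now)
# Hypothetical local zone at UTC+1, used only for display.
next_local = next_utc.astimezone(timezone(timedelta(hours=1)))
print(next_utc.isoformat())    # 2024-01-16T06:30:00+00:00
print(next_local.isoformat())  # 2024-01-16T07:30:00+01:00
```

A complete CRON parser (ranges, lists, steps, days of the week) is out of scope here; a dedicated library would be needed for real expressions.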

[Figure: interface/execution-next-run-detail.svg, "Detail of CRON and next scheduled execution"]

If scheduling is active, you can disable it and view the date of the next run.