Home
- Clone this repository -
git clone git@github.com:snowplow/factotum.git
cd factotum
- Set up a vagrant box and ssh into it -
vagrant up && vagrant ssh
- This will take a few minutes
cd /vagrant
- Compile and run a demo -
cargo run -- samples/echo.factotum
- Install Rust -
- on Linux/Mac -
curl -sSf https://static.rust-lang.org/rustup.sh | sh
- Clone this repository -
git clone git@github.com:snowplow/factotum.git
cd factotum
- Compile and run a demo -
cargo run -- samples/echo.factotum
Factotum factfiles must adhere to this self-describing JSON schema, which defines jobs in Factotum.
Field | Description |
---|---|
name | A user-defined title for the job |
tasks | An array of the tasks that make up this job. A task is a single unit of work that the job must complete (a single node in the DAG) |
tasks/*/name | A title for the task |
tasks/*/executor | A method of execution for the task. This is reserved as "shell" for now |
tasks/*/command | The command to invoke. For example, with the executor "shell" this can be a bash script, or an executable on your path (e.g. echo) |
tasks/*/arguments | Arguments to pass to your command |
tasks/*/dependsOn | A list of the tasks (by their name) that this task depends on (and so must be executed after). For example, if task B depends on task A, task A will always be executed before task B |
tasks/*/onResult/terminateJobWithSuccess | A list of return codes that, if returned by this task, will cause any running tasks to finish and Factotum to stop processing the rest of the job. This is sometimes described as a "no-op" |
tasks/*/onResult/continueJob | A list of expected return codes for the task. If the task returns a code in this set, the job will continue normally |
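The dependsOn field is what turns a flat list of tasks into a DAG. As a rough illustration only (this is not Factotum's own scheduler, and the function name `execution_order` is made up for this sketch), the ordering guarantee can be modelled in Python with the standard library's topological sorter:

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def execution_order(tasks):
    """Return task names in an order that respects every dependsOn entry.

    Hypothetical helper for illustration; Factotum's real ordering may differ.
    """
    # Map each task name to the set of task names it must run after.
    graph = {task["name"]: set(task.get("dependsOn", [])) for task in tasks}
    return list(TopologicalSorter(graph).static_order())


# Task B depends on task A, so A must come first.
tasks = [
    {"name": "A", "dependsOn": []},
    {"name": "B", "dependsOn": ["A"]},
]
order = execution_order(tasks)
print(order)
```

A cycle in dependsOn would make `static_order()` raise `graphlib.CycleError`, which matches the requirement that a job forms a DAG.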
tasks/*/onResult/terminateJobWithSuccess and tasks/*/onResult/continueJob are mutually exclusive. If a task returns a code in the list specified by "continueJob", the task is considered a success and the job will continue. If a task returns a code in the list specified by "terminateJobWithSuccess", the job will finish the tasks it is currently running and then exit early, without considering the job a failure. If a return code is encountered that is not present in either "continueJob" or "terminateJobWithSuccess", the task is regarded as having failed, and the job will exit with an error after any other running tasks complete.
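The three outcomes described above can be sketched as a small decision function. This is a minimal sketch of the documented behaviour, not Factotum's implementation: the function name and the outcome labels ("continue", "terminate_with_success", "error") are assumptions chosen for this example, as is treating an overlapping code as a configuration error.

```python
def classify_return_code(code, on_result):
    """Decide what a task's return code means for the job.

    Hypothetical helper; the outcome labels are invented for illustration.
    """
    continue_codes = set(on_result.get("continueJob", []))
    terminate_codes = set(on_result.get("terminateJobWithSuccess", []))

    # The two lists are mutually exclusive, so a shared code is rejected here
    # (an assumption about how to handle it in this sketch).
    overlap = continue_codes & terminate_codes
    if overlap:
        raise ValueError(f"codes listed in both onResult lists: {sorted(overlap)}")

    if code in continue_codes:
        return "continue"                # task succeeded, job carries on
    if code in terminate_codes:
        return "terminate_with_success"  # job stops early, not a failure
    return "error"                       # job fails once running tasks finish


on_result = {"terminateJobWithSuccess": [3], "continueJob": [0]}
outcomes = {code: classify_return_code(code, on_result) for code in (0, 3, 1)}
print(outcomes)
```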
Factotum supports variables in the majority of field values (for example, in task arguments). Variables take the form {{ variable_name }}, as in this example. Nested variables are also possible, for example the following:
"arguments": "{{ snowplow.message }}"
will replace the value of "arguments" with whatever is defined in the "snowplow" object as "message". If no JSON is supplied, the task is assumed to contain no variables.
These variables are given to Factotum via the --env JSON option. In the nested example, we could run Factotum with the option --env '{"snowplow":{ "message":"hello world" }}' to pass "hello world" as the argument to the task.
Variable substitution uses mustache: any valid mustache is valid in a Factotum job, provided it appears inside a task's fields (the file as a whole is not templated).
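To make the nested lookup concrete, here is a deliberately tiny Python sketch that resolves dotted names such as snowplow.message against the --env JSON. It only imitates the simplest mustache case (a plain variable tag) and is not a mustache implementation; the names `VARIABLE` and `render` are assumptions for this example.

```python
import json
import re

# Matches a simple tag such as {{ snowplow.message }} (plain variables only).
VARIABLE = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def render(value, env):
    """Replace each {{ dotted.name }} in a field value with its value from env.

    Hypothetical helper; real mustache supports far more than this.
    """
    def lookup(match):
        node = env
        # Walk the nested objects one key at a time.
        for key in match.group(1).split("."):
            node = node[key]
        return str(node)

    return VARIABLE.sub(lookup, value)


# The JSON that would be passed with --env in the nested example.
env = json.loads('{"snowplow":{ "message":"hello world" }}')
rendered = render("{{ snowplow.message }}", env)
print(rendered)  # hello world
```

Only task field values are passed through `render` in this sketch, mirroring the point that the whole file is not templated.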
Factotum 0.2.0 includes functionality to (re)start a job from a given point, allowing you to skip tasks that have already been run.
This functionality is provided using the "--start" (or "-s") command line option. Given the Factfile below:
{
  "schema": "iglu:com.snowplowanalytics.factotum/factfile/jsonschema/1-0-0",
  "data": {
    "name": "echo order demo",
    "tasks": [
      {
        "name": "echo alpha",
        "executor": "shell",
        "command": "echo",
        "arguments": [
          "alpha"
        ],
        "dependsOn": [],
        "onResult": {
          "terminateJobWithSuccess": [],
          "continueJob": [
            0
          ]
        }
      },
      {
        "name": "echo beta",
        "executor": "shell",
        "command": "echo",
        "arguments": [
          "beta"
        ],
        "dependsOn": [
          "echo alpha"
        ],
        "onResult": {
          "terminateJobWithSuccess": [],
          "continueJob": [
            0
          ]
        }
      },
      {
        "name": "echo omega",
        "executor": "shell",
        "command": "echo",
        "arguments": [
          "and omega!"
        ],
        "dependsOn": [
          "echo beta"
        ],
        "onResult": {
          "terminateJobWithSuccess": [],
          "continueJob": [
            0
          ]
        }
      }
    ]
  }
}
You can start from the "echo beta" task using the following:
$ factotum samples/echo.factotum --start "echo beta"
This skips the task "echo alpha" and starts from "echo beta".
In more complicated DAGs, there are some tasks which cannot currently be the starting point for jobs. Resuming a job from such tasks would be ambiguous, typically because the DAG has parallel execution branches and a single start point does not tell Factotum enough about the start state of all of the branches.
This edge case is discussed in https://github.com/snowplow/factotum/issues/54
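To illustrate why parallel branches make a start point ambiguous, the sketch below computes which tasks would run from a given start task, and which of their dependencies would be left outside that set. This is one possible way to reason about the problem, not the rule Factotum actually applies (see the linked issue for that); both function names are hypothetical.

```python
def tasks_from(start, tasks):
    """Names of the start task and everything that transitively depends on it."""
    dependents = {}
    for task in tasks:
        for dep in task["dependsOn"]:
            dependents.setdefault(dep, []).append(task["name"])

    seen, stack = set(), [start]
    while stack:
        name = stack.pop()
        if name not in seen:
            seen.add(name)
            stack.extend(dependents.get(name, []))
    # Preserve the order in which tasks are listed in the factfile.
    return [task["name"] for task in tasks if task["name"] in seen]


def unresolved_dependencies(start, tasks):
    """Dependencies of downstream tasks that would not run (assumed ambiguity signal)."""
    running = set(tasks_from(start, tasks))
    return sorted({
        dep
        for task in tasks
        if task["name"] in running and task["name"] != start
        for dep in task["dependsOn"]
        if dep not in running
    })


# The linear job from the Factfile above: starting at "echo beta" is unambiguous.
linear = [
    {"name": "echo alpha", "dependsOn": []},
    {"name": "echo beta", "dependsOn": ["echo alpha"]},
    {"name": "echo omega", "dependsOn": ["echo beta"]},
]
print(tasks_from("echo beta", linear))
print(unresolved_dependencies("echo beta", linear))

# A made-up diamond: D needs both B and C, so starting at B says nothing about C.
diamond = [
    {"name": "A", "dependsOn": []},
    {"name": "B", "dependsOn": ["A"]},
    {"name": "C", "dependsOn": ["A"]},
    {"name": "D", "dependsOn": ["B", "C"]},
]
print(unresolved_dependencies("B", diamond))
```

In the diamond, task D would run after B, but nothing states whether C has already completed, which is the kind of missing start state described above.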
A more verbose log is created under .factotum/factotum.log in the working directory in which factotum was invoked. If this directory is not writable, factotum will fail to run.
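Because an unwritable working directory stops Factotum from running, a wrapper script may want to check this before invoking it. The following is a small pre-flight sketch under that assumption; `can_write_log` is a hypothetical helper and is not part of Factotum.

```python
import os


def can_write_log(working_dir="."):
    """Return True if files can be created inside working_dir.

    Hypothetical pre-flight check; Factotum performs its own handling.
    """
    # Creating .factotum/factotum.log needs write and search permission
    # on the directory Factotum is invoked from.
    return os.path.isdir(working_dir) and os.access(working_dir, os.W_OK | os.X_OK)


if not can_write_log():
    print("working directory is not writable; the log cannot be created")
```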
This list is not exhaustive, but everything listed here will be fixed soon.
- Tasks cannot "forward reference" each other with dependencies - #31
- In some cases, factotum will execute jobs with a sub-optimal order - #30
- Factotum only really supports Linux - #29