Features
- Support for "generic" Airflow operators: you can now use regular Python
operators as part of your config files.
- Support for the “dbt docs” command to generate documentation for all dbt
tasks: users can now add “docs generate” as a target in their DOP
configuration and additionally specify a GCS bucket, using the --bucket and
--bucket-path options, to which the documents are copied.
- Serve dbt docs: documents generated by dbt can be served as a web page by
deploying the provided app on GAE. Note that deploying is an additional step
that needs to be carried out after the docs have been generated. See
infrastructure/dbt-docs/README.md for details.
- dbt run_results artifacts saved to BigQuery: the run_results JSON file
created by dbt tasks contains information on completed dbt invocations and is
saved in the BQ table “run_results” for analysis and debugging.
- Add support for Airflow v1.10.14 and v1.10.15 local environments: users can
specify which version they want to use by setting the AIRFLOW_VERSION
environment variable.
- Pre-commit linters: added pre-commit hooks that enforce consistent formatting
and style for Python and YAML files, with some support for plain-text files,
throughout the DOP codebase.
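
As an illustration of the kind of analysis the run_results artifact allows, here is a minimal Python sketch that flattens a run_results JSON document into one row per executed node. It is not DOP's own loader. The field names (metadata.invocation_id, results[].unique_id, status, execution_time, message) are assumed from the dbt artifact format and should be checked against a real file; the function name flatten_run_results and the sample document are hypothetical. Loading the rows into BigQuery is left out.

```python
import json


def flatten_run_results(raw):
    """Return one flat dict per node found in a run_results JSON string."""
    doc = json.loads(raw)
    metadata = doc.get("metadata", {})
    rows = []
    for result in doc.get("results", []):
        rows.append(
            {
                # Invocation-level fields are repeated on every row so that
                # rows from different dbt runs can be told apart later.
                "invocation_id": metadata.get("invocation_id"),
                "generated_at": metadata.get("generated_at"),
                # Node-level fields (names assumed, see the note above).
                "unique_id": result.get("unique_id"),
                "status": result.get("status"),
                "execution_time": result.get("execution_time"),
                "message": result.get("message"),
            }
        )
    return rows


# Hand-written sample for illustration only, not real dbt output.
SAMPLE = json.dumps(
    {
        "metadata": {
            "invocation_id": "hypothetical-id",
            "generated_at": "2021-01-01T00:00:00Z",
        },
        "results": [
            {
                "unique_id": "model.demo.orders",
                "status": "success",
                "execution_time": 1.5,
                "message": "OK",
            },
            {
                "unique_id": "model.demo.customers",
                "status": "error",
                "execution_time": 0.2,
                "message": "Table not found",
            },
        ],
    }
)

rows = flatten_run_results(SAMPLE)
# Nodes that did not finish successfully are the usual starting point
# when debugging a failed invocation.
failed = [row["unique_id"] for row in rows if row["status"] != "success"]
print(failed)
```

Repeating the invocation fields on every row keeps the structure flat, which makes it easy to filter by status or aggregate execution time per run once the rows are in a table.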
Changes
- Ensure DAGs using the same dbt project do not run concurrently: a safety
feature that allows selective execution of workflows by calling specific
commands or tags (e.g. dbt run --m) within a single dbt project. Without it,
workflows would have to be made interdependent to keep them from overwriting
each other's artifacts, since they share the same target location (within the
dbt container).
- Test time-partitioning: time-partitioning on the datetime type is now
properly validated as part of schema validation.
- Use Python 3.7 and dbt 0.19.1 in the Composer K8s Operator.
- Add Dataflow example task: with the introduction of "regular" Airflow
operators in the yaml config, it is now possible to run compute-intensive
Dataflow jobs. See example_dataflow_template for an example of how to
implement a Dataflow pipeline.