Branchpythonoperator Concepts
BranchPythonOperator
There isn't very good documentation for this feature yet.
Why?
The BranchPythonOperator
is used to create control flow in a DAG
.
Towards Automating Changes to Data
If there is unexpected input from a data source, then automate preprocessing the new data format so the automation can be improved.
NB: This type of branch would probably trigger another DAG.
Inter-task communication
There are two options:
variables
XComs
Variables sit in a global variables
object, where as XComs
are pushed and pulled between tasks.
For task-to-task communication, we will use XComs
.
Passing Context to BranchPythonOperator
BranchPythonOperator
inherits from PythonOperator
and is defined in the same module. Both share use of provide_context=True
as a keyword argument. If this is passed, the python_callable
function must receive **kwargs
. The context
includes task_instance.xcom_pull
which pulls information from other tasks.
Notes
Each branch must point to a task; if there isn't anything to do, use a DummyOperator
.
References
General
- apache-airflow on git
- Airflow 'gitter' chat
- Common Pitfalls (Official)
- medium/handy-tech: Airflow Tips, Tricks, and Pitfalls
- Branching & XComs
- NB:
subdags
are being deprecated in the future in favor of triggering another dag
- NB:
Examples
Concepts & Tutorials
- Concepts
- Concepts: Branching
- Concepts: XComs
- Tutorial: Instantiate a DAG
- Default Arguments
- Template Variables & Macros
- ETL with Airflow: Principles
- Airflow CLI - Command Line Interface
API Reference
- BaseOperator
- TaskInstance.xcom_pull
BaseOperator.xcom_pull
importsTaskInstance.xcom_pull
fromcontext
- You must pass
key=None
or any desired key/value to get tasks that are fromxcom_push
- By default tasks from
xcom_push
are ignored
Source Code
- BashOperator
- BaseOperator
- PythonOperator & BranchPythonOperator
- SSHExecuteOperator
xcom_push
inherited fromBaseOperator
BaseOperator
importsTaskInstance.xcom_push
fromcontext