BranchPythonOperator Concepts
BranchPythonOperator
There isn't very good documentation for this feature yet.
Why?
The BranchPythonOperator is used to create control flow in a DAG.
Towards Automating Changes to Data
If a data source delivers input in an unexpected format, a branch can route the run to a task that preprocesses the new format, so the pipeline adapts to the change instead of failing.
NB: This type of branch would probably trigger another DAG.
Inter-task communication
There are two options:
- Variables
- XComs
Variables sit in a global variables object, whereas XComs are pushed and pulled between tasks.
For task-to-task communication, we will use XComs.
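As a minimal sketch of XCom-based communication, the two callables below push a value in one task and pull it in another. The task id `push_row_count`, the key `row_count`, and the value 42 are hypothetical names chosen for illustration; the callables assume they are run by a PythonOperator that passes the context in as keyword arguments.

```python
def push_row_count(**kwargs):
    """Upstream callable: push a value under an explicit key."""
    ti = kwargs["task_instance"]
    # "row_count" and 42 are illustrative placeholders, not from the original notes.
    ti.xcom_push(key="row_count", value=42)


def pull_row_count(**kwargs):
    """Downstream callable: pull the value pushed by the upstream task."""
    ti = kwargs["task_instance"]
    # task_ids must match the task_id given to the operator that ran push_row_count.
    return ti.xcom_pull(task_ids="push_row_count", key="row_count")
```

A value returned from a python_callable is also pushed as an XCom automatically, under the default key `return_value`, so an explicit `xcom_push` is only needed for additional keys.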
Passing Context to BranchPythonOperator
BranchPythonOperator inherits from PythonOperator and is defined in the same module. Both accept provide_context=True as a keyword argument. If it is passed, the python_callable function must accept **kwargs. The context passed in includes task_instance, whose xcom_pull method pulls information from other tasks.
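A branch callable written this way might look like the sketch below. It returns the task_id of the branch to follow. The task ids `inspect_input`, `load_data`, and `preprocess_new_format`, and the value `"expected"`, are hypothetical and only serve the example.

```python
def choose_branch(**kwargs):
    """Return the task_id of the branch that should run next."""
    ti = kwargs["task_instance"]
    # Pull the return value of a hypothetical upstream task named "inspect_input".
    data_format = ti.xcom_pull(task_ids="inspect_input")
    if data_format == "expected":
        return "load_data"
    return "preprocess_new_format"


# Wiring, shown as a comment so the snippet runs without Airflow installed
# (assumes an existing `dag` object and the provide_context style described above):
# branch = BranchPythonOperator(
#     task_id="choose_branch",
#     python_callable=choose_branch,
#     provide_context=True,
#     dag=dag,
# )
```

Keeping the decision in a plain function like this also makes it easy to unit test with a stub task instance, without starting a scheduler.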
Notes
Each branch must point to a task; if there isn't anything to do, use a DummyOperator.
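The sketch below shows one way to give the "nothing to do" branch a DummyOperator target. It assumes the older Airflow 1.x module paths (`airflow.operators.python_operator`, `airflow.operators.dummy_operator`), which match the provide_context style used in these notes; newer releases moved these modules. The DAG id, task ids, and the weekday rule are hypothetical.

```python
from datetime import datetime


def weekday_or_skip(**kwargs):
    """Branch callable: process on weekdays, otherwise do nothing."""
    # weekday() returns 0-4 for Monday-Friday and 5-6 for the weekend.
    if kwargs["execution_date"].weekday() < 5:
        return "process_data"
    return "nothing_to_do"


def process_data():
    """Placeholder for real work."""
    return "processed"


def build_dag():
    """Build the DAG; Airflow is imported here so the callables stay testable without it."""
    # Assumed Airflow 1.x import paths.
    from airflow import DAG
    from airflow.operators.dummy_operator import DummyOperator
    from airflow.operators.python_operator import BranchPythonOperator, PythonOperator

    dag = DAG("branch_sketch", start_date=datetime(2018, 1, 1), schedule_interval=None)
    branch = BranchPythonOperator(
        task_id="weekday_or_skip",
        python_callable=weekday_or_skip,
        provide_context=True,
        dag=dag,
    )
    process = PythonOperator(task_id="process_data", python_callable=process_data, dag=dag)
    # The empty branch still needs a task to point at.
    nothing = DummyOperator(task_id="nothing_to_do", dag=dag)
    branch.set_downstream([process, nothing])
    return dag


# In a real DAG file the result must be a module-level name so Airflow can find it:
# dag = build_dag()
```

The task_id strings returned by the callable must match the task_id values of the downstream operators exactly, since that is how the branch target is identified.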
References
General
- apache-airflow on git
- Airflow 'gitter' chat
- Common Pitfalls (Official)
- medium/handy-tech: Airflow Tips, Tricks, and Pitfalls
- Branching & XComs
  - NB: subdags are being deprecated in the future in favor of triggering another DAG
- NB:
Examples
Concepts & Tutorials
- Concepts
- Concepts: Branching
- Concepts: XComs
- Tutorial: Instantiate a DAG
- Default Arguments
- Template Variables & Macros
- ETL with Airflow: Principles
- Airflow CLI - Command Line Interface
API Reference
- BaseOperator
- TaskInstance.xcom_pull
  - BaseOperator.xcom_pull calls TaskInstance.xcom_pull from the context
  - You must pass key=None, or the specific key, to get values that were pushed with xcom_push
  - By default only values under the key return_value are pulled, so values pushed with xcom_push under other keys are ignored
Source Code
- BashOperator
- BaseOperator
- PythonOperator & BranchPythonOperator
- SSHExecuteOperator
  - xcom_push is inherited from BaseOperator
  - BaseOperator imports TaskInstance.xcom_push from context