5. Authoring Your First DAG
  5.1. DAG File Structure and Organization
    5.1.1. File Naming Conventions
    5.1.2. Directory Structure
    5.1.3. Python Module Organization
  5.2. Basic DAG Definition
    5.2.1. Required Imports
    5.2.2. Instantiating the DAG Object
    5.2.3. DAG Parameters
    5.2.4. Setting Default Arguments
      5.2.4.1. owner
      5.2.4.2. start_date
      5.2.4.3. retries
      5.2.4.4. retry_delay
      5.2.4.5. email_on_failure
      5.2.4.6. email_on_retry
  5.3. Defining Tasks with Operators
    5.3.1. BashOperator
      5.3.1.1. Syntax and Parameters
      5.3.1.2. Command Execution
      5.3.1.3. Environment Variables
    5.3.2. PythonOperator
      5.3.2.1. Syntax and Parameters
      5.3.2.2. Function Definition
      5.3.2.3. Passing Arguments
    5.3.3. Task Naming Conventions
    5.3.4. Task Configuration
  5.4. Setting Task Dependencies
    5.4.1. Using set_upstream and set_downstream
    5.4.2. Using Bitshift Operators
    5.4.3. Chaining Dependencies for Linear Workflows
    5.4.4. Complex Dependency Patterns
    5.4.5. Fan-out and Fan-in Patterns
  5.5. DAG Testing and Validation
    5.5.1. Syntax Validation
    5.5.2. Import Testing
    5.5.3. Dependency Validation
  5.6. Running Your First DAG
    5.6.1. Manual Triggers via UI
    5.6.2. Manual Triggers via CLI
    5.6.3. Scheduled Triggers
    5.6.4. Monitoring DAG Runs
    5.6.5. Interpreting Results