Creating a DAG to Schedule a Report to Google Sheets via Airflow
A step-by-step guide
This tutorial walks through building an Airflow DAG that automates report generation by querying Amazon Redshift, unioning week-to-date and month-to-date results, and uploading the combined table to Google Sheets — using Astronomer, Docker, and Airflow running locally.
What You'll Learn
- Setting up Docker and Astronomer locally
- Configuring Airflow to query Amazon Redshift
- Setting up Google Cloud Console for Sheets API access
- Writing and scheduling an Airflow DAG to automate report generation
Prerequisites
- Docker installed
- Astronomer CLI installed
- Access to an Amazon Redshift cluster with read, write, and create table privileges
- Google account with Google Sheets access
- Basic Python and SQL familiarity
Step 1: Setting Up Amazon Redshift
Create a schema for staging tables, then prepare three SQL queries:
- company_metrics_WTD.sql — Generates Week-to-Date metrics
- company_metrics_MTD.sql — Generates Month-to-Date metrics
- company_metrics_Union.sql — Unions WTD and MTD results into a single table
Test all queries in your Redshift cluster before proceeding.
Step 2: Setting Up Google Cloud Console
- Access Google Cloud Console
- Create or select a project
- Enable the Google Sheets API
- Generate a Service Account Key and save the JSON file locally
- Share your Google Sheet with the Service Account email
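The Service Account email you need for that last step is the client_email field inside the downloaded JSON key. A small stdlib helper (the key path is an assumption) pulls it out so you can copy it into the sheet's Share dialog:

```python
import json

def service_account_email(key_path: str) -> str:
    """Read client_email from a downloaded Service Account key file.

    This is the address to share the Google Sheet with.
    """
    with open(key_path) as f:
        return json.load(f)["client_email"]

# Example (path is a placeholder):
# service_account_email("include/service-account.json")
# -> "my-bot@my-project.iam.gserviceaccount.com"
```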
Step 3: Installing Docker and Astronomer Locally
Download Docker from Docker's website and verify your installation with:
docker --version
Install the Astronomer CLI via Homebrew:
brew install astro
Verify with astro version.
Step 4: Setting Up Airflow Locally
Initialize an Astronomer project, which scaffolds the essential directories and files:
astro dev init
This creates dags/, include/, plugins/, and requirements.txt.
Add the following to requirements.txt:
pandas
apache-airflow-providers-amazon
apache-airflow-providers-google
google-api-python-client
google-auth
Start Airflow (Docker must be running):
astro dev start
Access the Airflow UI at localhost:8080.
Then create two connections in the Airflow UI:
- redshift_default — your Redshift cluster details
- google_sheets_default — a Google Cloud connection whose key path points to your Service Account JSON file
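As an alternative to clicking through the UI, Airflow also reads connections from AIRFLOW_CONN_* environment variables in the project's .env file. A sketch, where the host, credentials, and key path are all placeholders (the Google connection URI format may vary slightly across Airflow versions):

```shell
# .env at the project root -- Astronomer injects these into the containers
AIRFLOW_CONN_REDSHIFT_DEFAULT='redshift://awsuser:password@my-cluster.abc123.us-east-1.redshift.amazonaws.com:5439/dev'
AIRFLOW_CONN_GOOGLE_SHEETS_DEFAULT='google-cloud-platform://?key_path=%2Fusr%2Flocal%2Fairflow%2Finclude%2Fservice-account.json'
```

Note the key path is URL-encoded because the whole connection is expressed as a URI.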
Step 5: Writing the DAG
Create a Python script in the dags/ folder defining four tasks:
- Load WTD metrics via SQLExecuteQueryOperator
- Load MTD metrics via SQLExecuteQueryOperator
- Union results via SQLExecuteQueryOperator
- Upload to Google Sheets via SQLToGoogleSheetsOperator
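The four tasks above can be sketched as a single DAG file. This is a sketch, not a drop-in script: the dag_id, schedule, spreadsheet ID, staging table name, and template search path are assumptions, and operator parameters may differ slightly across provider versions — check the provider documentation for your installed release.

```python
# Sketch of the four-task report DAG described above.
from datetime import datetime

from airflow import DAG
from airflow.providers.common.sql.operators.sql import SQLExecuteQueryOperator
from airflow.providers.google.suite.transfers.sql_to_sheets import (
    SQLToGoogleSheetsOperator,
)

with DAG(
    dag_id="company_metrics_report",
    start_date=datetime(2024, 1, 1),
    schedule="0 12 * * 1",  # Mondays 12:00 UTC -- adjust for your timezone
    catchup=False,
    # Where the three .sql files from Step 1 live (path is an assumption)
    template_searchpath="/usr/local/airflow/include",
) as dag:
    load_wtd = SQLExecuteQueryOperator(
        task_id="load_wtd",
        conn_id="redshift_default",
        sql="company_metrics_WTD.sql",
    )
    load_mtd = SQLExecuteQueryOperator(
        task_id="load_mtd",
        conn_id="redshift_default",
        sql="company_metrics_MTD.sql",
    )
    union_metrics = SQLExecuteQueryOperator(
        task_id="union_metrics",
        conn_id="redshift_default",
        sql="company_metrics_Union.sql",
    )
    upload_to_sheets = SQLToGoogleSheetsOperator(
        task_id="upload_to_sheets",
        sql="SELECT * FROM staging.company_metrics_union",  # table name assumed
        sql_conn_id="redshift_default",
        gcp_conn_id="google_sheets_default",
        spreadsheet_id="YOUR_SPREADSHEET_ID",
    )

    # WTD and MTD load in parallel, then union, then upload
    [load_wtd, load_mtd] >> union_metrics >> upload_to_sheets
```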
Schedule runs using a CRON expression. Note that Airflow schedules in UTC, so account for your timezone offset when setting the schedule.
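A quick way to get the offset right is to convert your local target time to UTC with the standard library. The timezone and 8 AM target below are example assumptions; note the answer shifts with daylight saving time, which is why checking against a concrete date matters:

```python
from datetime import datetime
from zoneinfo import ZoneInfo

def utc_hour_for(local_hour: int, tz: str, on: datetime) -> int:
    """Return the UTC hour matching `local_hour` in timezone `tz` on date `on`."""
    local = on.replace(hour=local_hour, minute=0, second=0, microsecond=0,
                       tzinfo=ZoneInfo(tz))
    return local.astimezone(ZoneInfo("UTC")).hour

# 8 AM Eastern in July (EDT, UTC-4) is 12:00 UTC...
print(utc_hour_for(8, "America/New_York", datetime(2024, 7, 1)))   # 12
# ...but in January (EST, UTC-5) it is 13:00 UTC
print(utc_hour_for(8, "America/New_York", datetime(2024, 1, 15)))  # 13
```

So a cron expression like 0 12 * * 1 fires at 8 AM Eastern only during daylight saving time.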
Step 6: Running the DAG
- Save your DAG script in the dags/ folder
- Restart Astronomer: astro dev restart
- Enable the DAG in the Airflow UI
- Trigger manually to test, or await the scheduled execution
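You can also trigger the test run from the terminal: the Astro CLI's astro dev run proxies commands to the Airflow CLI inside the local containers (the dag_id here is a placeholder):

```shell
# One-off manual run, equivalent to clicking "Trigger DAG" in the UI
astro dev run dags trigger company_metrics_report
```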
Conclusion
That's it — a complete local Airflow environment automating a data pipeline from Redshift to Google Sheets. Once the DAG is running on a schedule, your reports update themselves. No more manual exports, no more stale spreadsheets.