Skip to content

SarahSchoonmaker/customer-analytics-bigquery-dbt-looker

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Customer Analytics Pipeline

Looker Dashboard – Top Customers by Sales

Top Customers by Sales

Tech Stack: BigQuery • dbt • Looker Studio

Overview

This project demonstrates an end-to-end analytics engineering pipeline using Google BigQuery, dbt, and Looker Studio.
Raw sales data is ingested into BigQuery, transformed through bronze / silver / gold layers using dbt, and visualized in Looker to surface business insights such as Top Customers by Total Sales.

Tech Stack

  • Python (pandas)
  • Google BigQuery
  • dbt
  • Looker Studio
  • GitHub

Architecture


CSV → BigQuery (Bronze)
→ dbt (Silver)
→ dbt (Gold)
→ Looker Studio Dashboard


Prerequisites

  • Python 3.10+
  • Google Cloud project with BigQuery enabled
  • dbt (pip install dbt-bigquery)
  • Looker Studio access

1. Environment Setup

Create and activate virtual environment

python -m venv .venv
source .venv/bin/activate     # Mac/Linux
.venv\Scripts\activate        # Windows

Install dependencies

pip install -r requirements.txt

2. Configure Environment Variables

Create a .env file in the project root:

GCP_PROJECT_ID= insert your project id
BQ_DATASET_BRONZE=bronze
BQ_DATASET_SILVER=silver
BQ_DATASET_GOLD=gold
SOURCE_CSV=storeanalytics.csv

.env is gitignored and should not be committed.


3. Load Raw Data into BigQuery (Bronze)

python -m ingestion.load_superstore

This step:

  • Creates the bronze table if it does not exist
  • Loads ~9,800 rows of sales data
  • Partitions the table by ingestion date

4. Run dbt Models (Silver & Gold)

cd dbt
dbt debug --profiles-dir .
dbt run   --profiles-dir .
dbt test  --profiles-dir .

Data Layers

  • Bronze: Raw ingested data
  • Silver: Cleaned and standardized data
  • Gold: Analytics-ready tables

Example gold models:

  • top_customers
  • sales_by_category
  • daily_sales

5. Visualize in Looker Studio

  1. Open Looker Studio

  2. Add data source → BigQuery

  3. Select:

    • Project: dataengineering-387413
    • Dataset: gold
    • Table: top_customers
  4. Create a horizontal bar chart

    • Dimension: customer_name
    • Metric: total_sales
    • Sort: Descending
    • Limit: Top 10

The resulting dashboard highlights the highest revenue-generating customers.

Screenshots are available in the /dashboards folder.


Example Insight

A small number of customers account for a disproportionate share of total revenue, indicating opportunities for targeted retention and account growth strategies.


About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages