This repository contains code and datasets for analyzing restaurant data sourced from Google BigQuery. The goal is to provide insights into customer check-in behavior, restaurant popularity, and market trends across geographic and temporal dimensions.
- Data extraction from Google BigQuery.
- Data preprocessing, including cleaning and transformation.
- Analytical visualizations of check-in data across different cities and price ranges.
- Data aggregation and analysis with Python (pandas), and visualization with Tableau.
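As a rough illustration of the aggregation step listed above, the following is a minimal sketch of grouping check-ins by city and price range with pandas. The column names (`city`, `price_range`, `checkin_count`), the sample rows, and the helper `summarize_checkins` are hypothetical and are not taken from the project's notebook or schema.

```python
import pandas as pd

# Hypothetical sample data; column names are assumptions, not the project's schema.
checkins = pd.DataFrame(
    {
        "city": ["Philadelphia", "Philadelphia", "Tampa", "Tampa", "New Orleans"],
        "price_range": [1, 2, 2, 2, 3],
        "checkin_count": [120, 80, 60, 40, 200],
    }
)


def summarize_checkins(df: pd.DataFrame) -> pd.DataFrame:
    """Return total check-ins and restaurant count per (city, price_range)."""
    return (
        df.groupby(["city", "price_range"], as_index=False)
        .agg(
            total_checkins=("checkin_count", "sum"),   # sum of check-ins in the group
            restaurants=("checkin_count", "size"),     # number of rows in the group
        )
        .sort_values("total_checkins", ascending=False, ignore_index=True)
    )


summary = summarize_checkins(checkins)
print(summary)
```

A table in this shape (one row per city and price range) is typically straightforward to export as CSV and load into Tableau.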
To set up this project locally, follow these steps:
- Clone the repository:
  git clone https://github.com/Eins51/RestaurantAnalytics.git
- Navigate to the project directory:
  cd RestaurantAnalytics
- Install required Python packages:
  pip install -r requirements.txt
- Run the Jupyter Notebook for data preprocessing and analysis:
  jupyter notebook notebooks/restaurant_data_analysis.ipynb
- Explore the Tableau dashboards for interactive visualizations:
  - Dashboard 1: Market Trends and Spending Insights
    - Analyze restaurant distribution, spending trends, and high-potential regions.
    - Video Demo
  - Dashboard 2: Peak and Off-Peak Customer Behavior
    - Understand peak dining hours, months, and consumer trends.
    - Video Demo
- Data Acquisition: Data was sourced from public Google Cloud Storage buckets and loaded into Google BigQuery.
- Data Cleaning: Removed duplicates, handled missing values, and conducted feature engineering (e.g., geographic classification, temporal data enrichment).
- Data Transformation: Time data was binned for granular analysis, and datasets were simplified for focused analysis.
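The cleaning and transformation steps above could look roughly like the sketch below. It is not the notebook's actual code: the column names (`business_id`, `state`, `checkin_time`), the `"Unknown"` fill value, and the four day-part bins are assumptions made for illustration, since the source only states that duplicates were removed, missing values handled, and time data binned.

```python
import pandas as pd

# Hypothetical raw rows; column names are assumptions, not the project's schema.
raw = pd.DataFrame(
    {
        "business_id": ["a1", "a1", "b2", "c3", "d4"],
        "state": ["PA", "PA", "FL", None, "LA"],
        "checkin_time": [
            "2019-03-02 23:15:00",
            "2019-03-02 23:15:00",  # exact duplicate of the previous row
            "2019-05-10 08:40:00",
            "2019-07-04 12:05:00",
            "not a timestamp",      # unparseable value
        ],
    }
)


def clean_and_bin(df: pd.DataFrame) -> pd.DataFrame:
    """Drop duplicates, handle missing values, and add binned time features."""
    out = df.drop_duplicates().copy()
    # Unparseable timestamps become NaT and are dropped below.
    out["checkin_time"] = pd.to_datetime(out["checkin_time"], errors="coerce")
    out = out.dropna(subset=["checkin_time"]).copy()
    # Assumed policy: keep rows with a missing state under a placeholder label.
    out["state"] = out["state"].fillna("Unknown")
    # Temporal enrichment: month and hour of each check-in.
    out["month"] = out["checkin_time"].dt.month
    out["hour"] = out["checkin_time"].dt.hour
    # Assumed day-part bins: [0, 6), [6, 12), [12, 18), [18, 24).
    out["day_part"] = pd.cut(
        out["hour"],
        bins=[0, 6, 12, 18, 24],
        right=False,
        labels=["night", "morning", "afternoon", "evening"],
    )
    return out.reset_index(drop=True)


cleaned = clean_and_bin(raw)
print(cleaned)
```

Whether rows with missing fields should be dropped or filled depends on the analysis; the choices here are only one reasonable option.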
- Key Metrics:
  - Total Restaurants: 8.5 million
  - Operational Restaurants: 6.8 million
  - Total Check-ins: 12 billion
  - Total Reviews: 4.5 billion
  - Average Rating: 4/5
- Visualizations:
  - Geographic distribution of restaurant activity.
  - Spending trends by state and city.
  - Seasonal consumer behavior insights.
- Insights:
  - High-potential regions include Pennsylvania, Florida, and Louisiana.
  - Seasonal trends show spending peaks in March, May, and July.
- Link: https://public.tableau.com/app/profile/yi.wang4922/viz/MarketTrendsandSpendingInsights/Overview
- Key Metrics:
  - Peak Months: March, May, July
  - Off-Peak Months: January, September, November
  - Peak Hours: 11 PM to 1 AM
  - Off-Peak Hours: 8 AM to 10 AM
- Visualizations:
  - Hourly and monthly check-in patterns.
  - Peak traffic periods by location.
- Insights:
  - Late-night dining trends in urban areas.
  - Inventory and staffing optimization based on peak/off-peak patterns.
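Peak and off-peak periods like those listed above can be derived by counting check-ins per hour or per month. The sketch below shows one way to do that with pandas; the function `peak_and_off_peak` and the sample timestamps are hypothetical, and the dashboard's actual calculation may differ.

```python
import pandas as pd


def peak_and_off_peak(timestamps, unit="hour"):
    """Return (peak, off_peak) for the given unit, based on check-in counts.

    unit is either "hour" (0-23) or "month" (1-12). Only values that occur
    at least once in the data are considered.
    """
    ts = pd.to_datetime(pd.Series(timestamps))
    values = ts.dt.hour if unit == "hour" else ts.dt.month
    counts = values.value_counts().sort_index()
    # idxmax / idxmin return the first label with the highest / lowest count.
    return int(counts.idxmax()), int(counts.idxmin())


# Hypothetical check-in timestamps, for illustration only.
sample = [
    "2019-03-01 23:10",
    "2019-03-02 23:45",
    "2019-03-15 23:05",
    "2019-05-03 00:20",
    "2019-05-04 00:50",
    "2019-09-07 09:30",
]

peak_hour, off_peak_hour = peak_and_off_peak(sample, unit="hour")
peak_month, off_peak_month = peak_and_off_peak(sample, unit="month")
print(peak_hour, off_peak_hour, peak_month, off_peak_month)
```

On real data, ties between hours or months are likely, so ranking the full count table and taking the top and bottom few entries may be more robust than a single maximum and minimum.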
Special thanks to:
- The Google Cloud Platform team for providing the data.
- Tableau for powerful visualization tools.
- Contributors and collaborators who supported this project.