This repository contains my final project for the MAT234 course, a statistical analysis of video game sales data from 1997 to 2019. The project uses hypothesis testing, confidence interval construction, and regression analysis to explore trends and relationships in the sales data.
This project analyzes video game sales data using statistical techniques. The dataset includes:
- Game titles
- Platforms (e.g., Xbox, PlayStation)
- Genres (e.g., Racing, Role-Playing, Platform)
- Regional sales (North America, Europe, Japan, and Others)
- Global sales figures
The analysis covers:
- Random sampling of video games by genre and region.
- Calculation of sample and population statistics (mean, variance, standard deviation).
- Construction of confidence intervals.
- Hypothesis testing to evaluate sample mean estimations.
- Regression analysis to identify relationships between sales in different regions.
Random Sampling:
- Used Excel's RAND function to sample games from specific genres and the global dataset.
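The project does its sampling in Excel, but the same idea can be sketched in Python for readers who want to reproduce it outside a spreadsheet. The sketch below assumes the usual RAND workflow (give every row a random key, sort by that key, keep the first n rows); the function name `sample_by_random_key` and the example rows are hypothetical and are not taken from the workbook.

```python
import random


def sample_by_random_key(rows, n, seed=None):
    """Give each row a random key in [0, 1), sort by key, keep the first n rows."""
    rng = random.Random(seed)  # a fixed seed makes the sample repeatable
    keyed = [(rng.random(), row) for row in rows]
    keyed.sort(key=lambda pair: pair[0])
    return [row for _, row in keyed[:n]]


# Hypothetical rows for illustration only; the real data is in MAT234_Final.xlsx.
games = [
    {"title": "Game A", "genre": "Racing"},
    {"title": "Game B", "genre": "Racing"},
    {"title": "Game C", "genre": "Role-Playing"},
    {"title": "Game D", "genre": "Racing"},
    {"title": "Game E", "genre": "Platform"},
]

# Sample within one genre, then from the whole (global) list.
racing_only = [g for g in games if g["genre"] == "Racing"]
racing_sample = sample_by_random_key(racing_only, n=2, seed=42)
global_sample = sample_by_random_key(games, n=3, seed=42)
print(racing_sample)
print(global_sample)
```

Sorting by a random key yields a sample without replacement, which is why no game can appear twice in one sample.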
Statistical Analysis:
- Computed sample and population statistics (mean, variance, standard deviation).
- Constructed 95% and 97% confidence intervals.
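As an illustration of these calculations, here is a minimal Python sketch using only the standard library. It assumes a z-based interval (normal critical value with the sample standard deviation); the README does not say whether the workbook uses z or t critical values, and for small samples a t critical value would be the more usual choice. The sales figures are made up for the example.

```python
from math import sqrt
from statistics import NormalDist, mean, pstdev, pvariance, stdev, variance


def z_confidence_interval(sample, confidence):
    """Return (lower, upper) for the mean, assuming a normal critical value."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)  # sample standard deviation (divides by n - 1)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    margin = z * s / sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical sales figures (millions of units), not taken from the dataset.
sales = [0.5, 1.2, 0.8, 2.1, 0.3, 1.7, 0.9, 1.1]

# Sample statistics divide by n - 1; population statistics divide by n.
print("mean:", mean(sales))
print("sample variance / sd:", variance(sales), stdev(sales))
print("population variance / sd:", pvariance(sales), pstdev(sales))

print("95% CI:", z_confidence_interval(sales, 0.95))
print("97% CI:", z_confidence_interval(sales, 0.97))
```

The 97% interval is always wider than the 95% interval for the same sample, because a higher confidence level requires a larger critical value.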
Hypothesis Testing:
- Formulated null and alternative hypotheses.
- Calculated test statistics, critical values, and p-values.
- Interpreted results to reject or fail to reject the null hypothesis.
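The steps above can be sketched in Python as a two-tailed, one-sample z-test of a mean. This is an assumption about the test form: it treats the population standard deviation as known, which seems plausible since the project also computes population statistics, but the workbook may use a different test. The function name `z_test_mean` and all numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_test_mean(sample, mu0, sigma, alpha=0.05):
    """Two-tailed z-test of H0: mu = mu0 against H1: mu != mu0."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / sqrt(n))        # test statistic
    critical = NormalDist().inv_cdf(1 - alpha / 2)      # two-sided critical value
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))        # two-sided p-value
    return {
        "z": z,
        "critical": critical,
        "p_value": p_value,
        "reject_h0": abs(z) > critical,
    }


# Hypothetical sample (millions of units) and hypothesized population values.
sample = [0.5, 1.2, 0.8, 2.1, 0.3, 1.7, 0.9, 1.1]
result = z_test_mean(sample, mu0=1.0, sigma=0.6, alpha=0.05)
print(result)
```

Comparing |z| with the critical value and comparing the p-value with alpha lead to the same decision; when the null hypothesis is not rejected, the conclusion is "fail to reject", not "accept".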
Regression Analysis:
- Analyzed relationships between sales in different regions (e.g., North America vs. Europe).
- Used scatter plots, regression lines, and r-values to determine trends.
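For readers who prefer code to a chart trendline, the least-squares line and the correlation coefficient r can be computed as in the following Python sketch. It assumes simple linear regression of one region's sales on another's; the function name `linear_fit` and the paired sales values are hypothetical and do not come from the workbook.

```python
from math import sqrt
from statistics import mean


def linear_fit(x, y):
    """Return (slope, intercept, r) for the least-squares line y = slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r


# Hypothetical paired sales (millions of units) for the same games.
na_sales = [0.4, 1.1, 2.0, 2.9, 4.2]
eu_sales = [0.3, 0.7, 1.1, 1.9, 2.4]

slope, intercept, r = linear_fit(na_sales, eu_sales)
print(f"EU = {slope:.3f} * NA + {intercept:.3f}, r = {r:.3f}")
```

An r close to 1 or -1 indicates a strong linear relationship, while an r near 0 indicates little linear association between the two regions.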
MAT234_Final.xlsx: Contains all computations, visualizations, and analyses for the project. Key sheets include:
- Global Video Game Sales
- Platform Video Games
- Racing Video Games
- Role Playing Video Games
Results:
- Comprehensive statistical analysis for global video game sales and specific genres.
- Insights into sales trends by region and genre.
- Regression analysis reveals significant relationships between regional sales data.
Tools:
- Microsoft Excel: Data manipulation, computations, and visualizations.
- Statistical formulas: Used for hypothesis testing and confidence intervals.
This project is licensed under the GNU General Public License v3.0 and is intended primarily for educational use. The license keeps the work open-source: commercial use is permitted, but anyone who distributes a derivative work must release it under the same GPL v3.0 terms.
This project was completed as part of the MAT234 course.