This is the repository for the HKU COMP2501 Spring 2024 project.
This project examines the costs of pursuing higher education and compares them with the gains to draw a conclusion about its economic value.
- 2024-03-28: Start writing the proposal
- 2024-04-11: Proposal deadline and start researching
- 2024-05-09: Presentation video deadline
Economic Impact Analysis of Higher Education: A Data Science Approach to Understanding Costs and Returns
In an era where the cost of higher education is escalating, it's crucial to analyze the economic returns of investing in higher degrees quantitatively. This project aims to explore the financial implications of pursuing bachelor's, master's, and doctoral degrees, juxtaposing the rising costs against the potential gains in earning capacity and career advancement opportunities.
Despite the acknowledged value of higher education, its increasing financial burden on individuals raises questions about its economic viability. This project seeks to investigate how the costs of obtaining higher education degrees have evolved over time and how these costs compare with the long-term financial benefits.
Primary Objective: To quantify the financial costs and gains associated with pursuing higher education degrees over the last few decades.
- Data on tuition fees and related costs from educational institutions, government databases, and educational research organizations.
- Income data for individuals with varying levels of education, obtained from labor statistics bureaus, salary survey databases, and census data.
- Data on factors that might affect education outcomes, such as field of study, geographic location, and type of institution.
- ROI Calculation: Develop models to calculate the return on investment for different degrees.
- Direct Costs: Tuition fees, books and materials, and living expenses.
- Indirect Costs: Opportunity costs, such as loss of income during study periods.
- Loan-related Costs: Interest rates on student loans, average debt at graduation.
- Immediate Financial Gains: Starting salary post-graduation, salary progression over 5, 10, and 20 years.
- Long-term Financial Gains: Lifetime earnings differential compared with holding a lower-level degree.
- Non-financial Gains: Employment rate, job satisfaction, health benefits, and other quality-of-life indicators associated with higher education levels.
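
As a rough illustration of how the direct, indirect, and loan-related costs listed above could be combined into one figure, here is a minimal Python sketch. The function names (`annual_loan_payment`, `total_degree_cost`), the assumption of a level annual repayment schedule, and all example figures are hypothetical and not taken from the project's data.

```python
def annual_loan_payment(principal, annual_rate, years):
    """Level annual payment on a fully amortizing loan (assumed repayment scheme)."""
    if annual_rate == 0:
        return principal / years
    return principal * annual_rate / (1 - (1 + annual_rate) ** -years)


def total_degree_cost(direct_per_year, forgone_income_per_year, study_years,
                      loan_principal=0.0, loan_rate=0.0, loan_years=10):
    """Sum direct costs, opportunity costs, and loan interest for one degree."""
    # Direct costs: tuition, books and materials, living expenses.
    direct = direct_per_year * study_years
    # Indirect costs: income not earned while studying.
    indirect = forgone_income_per_year * study_years
    # Loan-related costs: total repayments minus the amount borrowed.
    interest = 0.0
    if loan_principal > 0:
        payment = annual_loan_payment(loan_principal, loan_rate, loan_years)
        interest = payment * loan_years - loan_principal
    return {
        "direct": direct,
        "indirect": indirect,
        "loan_interest": interest,
        "total": direct + indirect + interest,
    }


# Hypothetical figures for a four-year degree, for illustration only.
costs = total_degree_cost(direct_per_year=20_000,
                          forgone_income_per_year=30_000,
                          study_years=4,
                          loan_principal=40_000,
                          loan_rate=0.05,
                          loan_years=10)
print(costs)
```

Returning the components separately, rather than only the total, keeps it possible to compare how much each cost category contributes across degree levels.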
- Data Collection and Preprocessing:
- Gather detailed cost and gain metrics from a range of sources, ensuring data is standardized and comparable across different time periods and demographics.
- Exploratory Data Analysis (EDA):
- Visualize trends in both costs and gains, identifying patterns and outliers.
- Conduct preliminary comparisons across degree levels, fields of study, and other key factors.
- Cost Analysis:
- Calculate total costs of education, incorporating both direct and indirect costs, for different degree paths. Analyze trends in student loan debt and its impact on financial outcomes post-graduation.
- Gain Analysis:
- Analyze salary data to calculate immediate and long-term financial gains, adjusting for inflation and economic conditions.
- Evaluate non-financial gains using available metrics, assessing the broader impacts of higher education on individual well-being and societal benefits.
- Cost-Benefit Analysis (CBA):
- Employ statistical models to compare costs and gains, determining the net economic value of different higher education degrees.
- Calculate the return on investment (ROI) for various degrees, considering both financial and non-financial benefits over time.
- Sensitivity Analysis:
- Test the robustness of findings by varying key assumptions, such as interest rates, economic conditions, and job market trends.
- Final Report and Recommendations:
- Synthesize findings into actionable insights, highlighting degrees and fields of study with the highest ROI and those that present challenges under current economic conditions.
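
The cost-benefit and sensitivity steps above could be prototyped along the following lines. This is only a sketch under simplifying assumptions: a constant annual earnings gain, a single discount rate standing in for inflation and economic conditions, and financial benefits only. The helper names (`present_value`, `roi`, `sensitivity`) and all numbers are hypothetical.

```python
def present_value(annual_amount, years, discount_rate):
    """Discount a constant end-of-year amount received for `years` years."""
    return sum(annual_amount / (1 + discount_rate) ** t
               for t in range(1, years + 1))


def roi(total_cost, annual_earnings_gain, working_years, discount_rate):
    """Return on investment: (discounted gains - cost) / cost."""
    gain = present_value(annual_earnings_gain, working_years, discount_rate)
    return (gain - total_cost) / total_cost


def sensitivity(total_cost, annual_earnings_gain, working_years, discount_rates):
    """Recompute ROI under each discount rate to see how robust it is."""
    return {rate: roi(total_cost, annual_earnings_gain, working_years, rate)
            for rate in discount_rates}


# Hypothetical inputs, for illustration only.
results = sensitivity(total_cost=200_000,
                      annual_earnings_gain=15_000,
                      working_years=40,
                      discount_rates=[0.02, 0.03, 0.05])
for rate, value in results.items():
    print(f"discount rate {rate:.0%}: ROI {value:.2f}")
```

The same loop could vary other assumptions, such as the earnings gain or the number of working years, one at a time to see which one the result depends on most.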
- Python for data processing and analysis, using libraries such as Pandas, NumPy, and Matplotlib for data manipulation and visualization.
- R for statistical analysis, using packages like ggplot2 for advanced visualizations and the built-in lm function for regression analysis.
This project aims to provide empirical evidence on the economic value of higher education, aid prospective students in making informed decisions, and contribute to policy discussions on education funding and structure.
- Data Privacy: Ensuring compliance with data protection regulations.
- Data Availability and Quality: Potential limitations in accessing comprehensive and reliable data across different regions and time periods.
- Model Complexity: Balancing the comprehensiveness of the analysis with interpretability and simplicity.
This project aspires to illuminate the economic value of higher education by systematically analyzing its costs and benefits, providing stakeholders with actionable insights to navigate the complexities of educational investment.