- Project Overview
- Objective
- Dataset
- Features
- Technologies
- Analysis Breakdown
- Results
- Tableau Dashboard
- Comprehensive Report
This project is a comprehensive analysis of an internship dataset obtained from Kaggle. It covers cleaning and preprocessing the dataset, exploratory data analysis (EDA), and statistical, sectoral, and geospatial analysis. The key insights are visualized in an interactive Tableau dashboard.
The aim of this project is to:
- Clean and preprocess the internship dataset.
- Perform detailed statistical, sectoral, and geospatial analysis.
- Derive actionable insights such as:
  - Top locations providing internships.
  - Highest-paying sectors in terms of stipends.
  - Average stipend trends across different locations and sectors.
  - Geospatial distribution of internships to identify regional hotspots.
These insights are presented through an interactive Tableau dashboard, enabling better visualization and decision-making.
The dataset was obtained from Kaggle and consists of the following fields:
- Organization
- Location
- Start Date
- Duration (in Months)
- Average Monthly Stipend
- Added Incentives
- Data Cleaning & Preprocessing
- Exploratory Data Analysis
- Statistical & Geospatial Analysis
- Interactive Tableau Dashboard
- Excel and SQL: Cleaning and preprocessing the dataset.
- Python: Scripting and data manipulation.
- Pandas: Data processing and analysis.
- Tableau: Interactive dashboard creation.
The data was cleaned to remove null values, duplicate records, and invalid entries, ensuring consistent data quality for the analysis. The data was then preprocessed, and insights were extracted, using Microsoft Excel and Google BigQuery SQL.
The original dataset had several discrepancies, which were addressed in Microsoft Excel by cleaning the data thoroughly, adding new fields, and removing redundant ones. Insights were derived using Excel pivot tables and Google BigQuery SQL. The updated Excel workbook presents these results in individual sheets.
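The project carried out this cleaning in Microsoft Excel. As an illustration only, the same three steps (removing nulls, duplicates, and invalid entries) could be sketched in Pandas as follows. The sample rows, the `clean_internships` helper, and the validity rule (positive duration, non-negative stipend) are assumptions made for this example, not part of the original workflow.

```python
import pandas as pd

# Hypothetical sample rows; column names follow the dataset fields listed above.
raw = pd.DataFrame(
    {
        "Organization": ["Acme", "Acme", "Beta Labs", "Gamma", None],
        "Location": ["Mumbai", "Mumbai", "Delhi", "Pune", "Delhi"],
        "Duration (in Months)": [3, 3, 6, -1, 2],
        "Average Monthly Stipend": [10000.0, 10000.0, 15000.0, 8000.0, 5000.0],
    }
)


def clean_internships(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows with missing key fields, duplicate rows, and invalid values."""
    key_fields = ["Organization", "Location", "Average Monthly Stipend"]
    cleaned = df.dropna(subset=key_fields)
    cleaned = cleaned.drop_duplicates()
    # Assumed validity rule: duration must be positive, stipend non-negative.
    valid = (cleaned["Duration (in Months)"] > 0) & (
        cleaned["Average Monthly Stipend"] >= 0
    )
    return cleaned[valid].reset_index(drop=True)


cleaned = clean_internships(raw)
print(cleaned)
```

Keeping the validity rule in one place makes it easy to adjust if the real data calls for different thresholds.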
EDA was conducted to identify trends, distributions, and relationships between various fields in the dataset, such as average stipend across sectors and locations.
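A minimal sketch of what such an EDA pass might look like in Pandas is shown here. The sample values are invented for illustration, and the choice of a Pearson correlation between duration and stipend is an assumption about which relationships are of interest.

```python
import pandas as pd

# Hypothetical cleaned data; values are invented for illustration.
df = pd.DataFrame(
    {
        "Location": ["Mumbai", "Mumbai", "Delhi", "Delhi", "Pune"],
        "Duration (in Months)": [3, 6, 2, 3, 6],
        "Average Monthly Stipend": [10000, 20000, 8000, 6000, 16000],
    }
)

# Distribution: count, mean, spread, and quartiles of the stipend column.
stipend_summary = df["Average Monthly Stipend"].describe()

# Trend: average stipend per location, highest first.
avg_by_location = (
    df.groupby("Location")["Average Monthly Stipend"]
    .mean()
    .sort_values(ascending=False)
)

# Relationship: Pearson correlation between duration and stipend.
duration_stipend_corr = df["Duration (in Months)"].corr(
    df["Average Monthly Stipend"]
)

print(stipend_summary)
print(avg_by_location)
print(f"Duration vs. stipend correlation: {duration_stipend_corr:.2f}")
```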
The following key analyses were performed:
- Statistical: Analyzing average monthly stipends and duration trends across various sectors.
- Sectoral: Identifying which sectors offer the most internships and which provide the highest stipends.
- Geospatial: Using maps to display the concentration of internship opportunities across different geographic locations.
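The sectoral and geospatial analyses above reduce to per-group aggregates that a mapping or dashboard tool can then plot. The sketch below shows one way to compute them in Pandas. The `Sector` column is hypothetical (it is not among the original dataset fields and is assumed here to be one of the fields added during preprocessing), and all sample values are invented.

```python
import pandas as pd

# Hypothetical data; "Sector" is an assumed derived field, values are invented.
df = pd.DataFrame(
    {
        "Organization": ["A", "B", "C", "D", "E", "F"],
        "Sector": ["IT", "IT", "Finance", "Finance", "Marketing", "IT"],
        "Location": ["Bangalore", "Bangalore", "Mumbai", "Delhi", "Bangalore", "Mumbai"],
        "Duration (in Months)": [3, 6, 6, 3, 2, 3],
        "Average Monthly Stipend": [12000, 18000, 20000, 25000, 5000, 15000],
    }
)

# Sectoral and statistical view: internship count, mean stipend, mean duration.
sector_stats = (
    df.groupby("Sector")
    .agg(
        internships=("Organization", "count"),
        avg_stipend=("Average Monthly Stipend", "mean"),
        avg_duration=("Duration (in Months)", "mean"),
    )
    .sort_values("avg_stipend", ascending=False)
)

# Geospatial input: internships per location, ready to join to map coordinates.
location_counts = df["Location"].value_counts()

print(sector_stats)
print(location_counts)
```

Note that the sector with the most internships and the sector with the highest average stipend need not be the same, which is why both columns are kept side by side.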
The analysis provided key insights into:
- Popular internship locations.
- High-paying sectors.
- Geographic distribution of internships.
The Tableau dashboard created for this project showcases the key insights:
- Top Locations for internships.
- Sector-based Stipend Comparison.
- Geospatial Heatmap showing regional distributions of internships.
- Sectoral Breakdown of internships based on stipend distribution.
To view the Tableau dashboard, visit: View Tableau Dashboard
To view the comprehensive analysis report, visit: Comprehensive Analysis Report