This data analysis project explores the Titanic dataset to learn about the passengers on board, including the distribution of ages, survival rates by passenger class, and the relationship between variables such as age and ticket fare.
- titanic.csv: The dataset used for analysis, containing information about the Titanic passengers.
  - PassengerId: Unique identifier for each passenger
  - Survived: Survival status (0 = Not survived, 1 = Survived)
  - Pclass: Passenger class (1 = 1st class, 2 = 2nd class, 3 = 3rd class)
  - Name: Passenger's name
  - Sex: Passenger's gender
  - Age: Passenger's age in years
  - SibSp: Number of siblings/spouses aboard
  - Parch: Number of parents/children aboard
  - Ticket: Ticket number
  - Fare: Fare paid for the ticket
  - Cabin: Cabin number
  - Embarked: Port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton)
- titanic_disaster_analysis.ipynb: The Jupyter notebook containing the Python code for data analysis.
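As a minimal sketch of how the dataset might be loaded, the snippet below reads the CSV with Pandas and checks that the columns described above are present. The function name `load_titanic` and the constant `EXPECTED_COLUMNS` are hypothetical and are not taken from the notebook.

```python
import pandas as pd

# Column names as listed in the description of titanic.csv.
EXPECTED_COLUMNS = [
    "PassengerId", "Survived", "Pclass", "Name", "Sex", "Age",
    "SibSp", "Parch", "Ticket", "Fare", "Cabin", "Embarked",
]


def load_titanic(source):
    """Read the CSV and fail early if a documented column is missing.

    `source` may be a file path or an open file-like object.
    (Hypothetical helper, not part of the original notebook.)
    """
    df = pd.read_csv(source)
    missing = [col for col in EXPECTED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"Missing expected columns: {missing}")
    return df
```

With the file in the working directory, `df = load_titanic("titanic.csv")` would return the full table as a DataFrame.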
The code was implemented using the Python programming language and the following libraries:
- Pandas: Used for data manipulation and analysis.
- Matplotlib: Used for creating charts and visualizations.
- Seaborn: Used for creating more aesthetically pleasing and detailed graphs.
The analysis explored several questions about the Titanic data, including the distribution of passenger ages, survival by passenger class, and the correlation between age and ticket fare. The results are presented as graphs, tables, and descriptive statistics.
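The questions named above can be answered numerically with a few Pandas calls. The sketch below is an assumption about how that might look rather than the notebook's own code; the function name `summarize` is hypothetical, and the correlation shown is the Pearson coefficient, which is the Pandas default.

```python
import pandas as pd


def summarize(df):
    """Return descriptive statistics for three of the questions explored.

    (Hypothetical helper, not part of the original notebook.)
    """
    # Count, mean, standard deviation, and quartiles of passenger ages.
    age_stats = df["Age"].describe()

    # Survived is coded 0/1, so the per-class mean is the survival rate.
    survival_by_class = df.groupby("Pclass")["Survived"].mean()

    # Pearson correlation between age and fare; rows with a missing
    # value in either column are excluded by pandas.
    age_fare_corr = df["Age"].corr(df["Fare"])

    return age_stats, survival_by_class, age_fare_corr
```

Given a loaded DataFrame `df`, `age_stats, survival_by_class, age_fare_corr = summarize(df)` yields a Series of age statistics, a Series of survival rates indexed by class, and a single correlation value.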