Data exploration and preprocessing on Netflix Dataset
- Pandas
- NumPy
- Matplotlib
- Seaborn
In this notebook we use the dataset 'Netflix titles' available at this link
This dataset contains unlabelled text data for around 9,000 Netflix shows and movies, along with details such as cast, release year, rating, and description.
Columns of the dataset:
- show_id: Unique ID for every movie / TV show
- type: Identifier of the entry, either Movie or TV Show
- title: Title of the movie / TV show
- director: Director of the movie
- cast: Actors involved in the movie / show
- country: Country where the movie / show was produced
- date_added: Date it was added on Netflix
- release_year: Actual release year of the movie / show
- rating: TV rating of the movie / show
- duration: Total duration, in minutes or number of seasons
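Since duration holds either minutes or a number of seasons, it usually has to be split into a number and a unit before any statistics make sense. The sketch below shows one possible way to do that with pandas. It assumes the values are strings such as "90 min" or "2 Seasons"; that format, the sample values, and the names `parts`, `minutes` and `seasons` are illustrative assumptions, not taken from the dataset description above.

```python
import pandas as pd

# Made-up sample values in the ASSUMED "<number> <unit>" format (not real data).
duration = pd.Series(["90 min", "2 Seasons", "1 Season", None])

# Split each entry into a numeric part and a unit part; missing entries stay NaN.
parts = duration.str.extract(r"(?P<value>\d+)\s*(?P<unit>\w+)")
parts["value"] = pd.to_numeric(parts["value"])

# Keep the two units apart so that minutes and seasons are never mixed.
minutes = parts.loc[parts["unit"] == "min", "value"]
seasons = parts.loc[parts["unit"].str.startswith("Season", na=False), "value"]

print(minutes.describe())
print(seasons.describe())
```

Keeping `minutes` and `seasons` as separate series avoids averaging two different units together.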
Examples of queries on the dataset
- Extract min, max, mean, median, std of a column
- Plot the top 10 genres and top 10 actors by number of appearances
- Rating comparison
- Duration statistics
- ...
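As a rough illustration of the first two queries, the sketch below computes the summary statistics of a numeric column and counts the most frequent actors. It runs on a tiny made-up DataFrame rather than the real file, and it assumes that `cast` stores names as a comma-separated string; the helper `top_n` is a hypothetical name introduced here for the example.

```python
import pandas as pd

# Tiny made-up sample that mimics some of the columns listed above (not real data).
df = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "type": ["Movie", "Movie", "TV Show", "Movie"],
    "release_year": [2010, 2015, 2020, 2015],
    # ASSUMPTION: several names are stored as one comma-separated string.
    "cast": ["Ann Lee, Bob Ray", "Ann Lee", None, "Bob Ray, Ann Lee, Cy Dow"],
})

# min, max, mean, median and std of a numeric column in a single call.
year_stats = df["release_year"].agg(["min", "max", "mean", "median", "std"])


def top_n(series, n=10):
    """Count individual values in a comma-separated column and keep the n most frequent."""
    values = series.dropna().str.split(",").explode().str.strip()
    return values.value_counts().head(n)


top_actors = top_n(df["cast"])

print(year_stats)
print(top_actors)
```

The same helper could be applied to the genre column, whose name does not appear in the column list above, and the resulting counts can be drawn with `top_actors.plot(kind="barh")` once Matplotlib is available.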