sbaglieri13/Netflix-EDA

Netflix EDA

Data exploration and preprocessing on Netflix Dataset

Libraries needed

  • Pandas
  • NumPy
  • Matplotlib
  • Seaborn

In this notebook we use the dataset 'Netflix titles' available at this link

About Dataset

This dataset contains unlabelled text data for around 9000 Netflix shows and movies, along with details such as cast, release year, rating, and description.

Columns of dataset:

  • show_id: Unique ID for every movie / TV show
  • type: Identifier, either Movie or TV Show
  • title: Title of the movie / TV show
  • director: Director of the movie / TV show
  • cast: Actors involved in the movie / TV show
  • country: Country where the movie / TV show was produced
  • date_added: Date it was added to Netflix
  • release_year: Actual release year of the movie / TV show
  • rating: TV rating of the movie / TV show
  • duration: Total duration, in minutes (movies) or number of seasons (TV shows)
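
The column descriptions above suggest two preprocessing steps: parsing date_added into real dates and turning duration into a number. The following is a minimal sketch, not the notebook's actual code. The sample rows are invented, and the "Month day, year" date format, the "90 min" / "2 Seasons" duration strings, and the helper name preprocess are assumptions.

```python
import pandas as pd

# Tiny hand-made sample that mimics some of the columns listed above (not real data).
sample = pd.DataFrame(
    {
        "show_id": ["s1", "s2", "s3"],
        "type": ["Movie", "TV Show", "Movie"],
        "date_added": ["September 25, 2021", " September 24, 2021", None],
        "duration": ["90 min", "2 Seasons", "125 min"],
    }
)


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with parsed dates and a numeric duration column."""
    out = df.copy()
    # Strip stray whitespace, then parse; values that do not match become NaT.
    out["date_added"] = pd.to_datetime(
        out["date_added"].str.strip(), format="%B %d, %Y", errors="coerce"
    )
    # Pull the leading integer out of strings such as "90 min" or "2 Seasons".
    out["duration_value"] = (
        out["duration"].str.extract(r"(\d+)", expand=False).astype(float)
    )
    # Record the unit separately: minutes for movies, seasons for TV shows.
    out["duration_unit"] = out["type"].map({"Movie": "min", "TV Show": "seasons"})
    return out


clean = preprocess(sample)
print(clean[["type", "date_added", "duration_value", "duration_unit"]])
```

Keeping the unit in its own column avoids mixing minutes and seasons in a single statistic; duration summaries are then best computed per type.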

Example queries on the dataset

  • Extract the min, max, mean, median, and standard deviation of a column
  • Plot the top 10 genres and top 10 actors by number of appearances
  • Rating comparison
  • Duration statistics
  • ...
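
The queries listed above could be sketched with pandas roughly as follows. This is an illustration under assumptions rather than the notebook's own code: the sample rows are invented, cast is assumed to be a comma-separated string, and the helper names column_stats, top_values, and plot_top are hypothetical. The column list above does not name a genre column, so only actors are counted here; the same helper would apply to a comma-separated genre column if the dataset has one.

```python
import pandas as pd

# Invented sample rows; a real run would load the dataset instead.
df = pd.DataFrame(
    {
        "type": ["Movie", "Movie", "TV Show", "Movie"],
        "cast": ["Ann Lee, Bob Roy", "Bob Roy", None, "Ann Lee, Bob Roy, Cy Poe"],
        "rating": ["PG-13", "TV-MA", "TV-MA", "PG-13"],
        "release_year": [2019, 2020, 2021, 2020],
    }
)


def column_stats(series: pd.Series) -> pd.Series:
    """Min, max, mean, median and standard deviation of a numeric column."""
    return series.agg(["min", "max", "mean", "median", "std"])


def top_values(series: pd.Series, n: int = 10) -> pd.Series:
    """Count individual items in a comma-separated column and keep the top n."""
    items = series.dropna().str.split(",").explode().str.strip()
    return items.value_counts().head(n)


def plot_top(counts: pd.Series, title: str) -> None:
    """Horizontal bar chart of the counts (needs Matplotlib and a display)."""
    import matplotlib.pyplot as plt  # imported here so the rest runs headless

    ax = counts.sort_values().plot.barh(title=title)
    ax.set_xlabel("Appearances")
    plt.tight_layout()
    plt.show()


stats = column_stats(df["release_year"])
top_actors = top_values(df["cast"])
# Rating comparison: how many titles of each type carry each rating.
rating_by_type = pd.crosstab(df["rating"], df["type"])

print(stats)
print(top_actors)
print(rating_by_type)
```

Calling plot_top(top_actors, "Top 10 actors by appearance") would draw the chart; it is left out of the script body so the example also runs without a display.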
