This program can:
- return random first name(s) from random years, with options to specify gender (m/f), the number of names to generate, and whether to generate a surname
- retrieve a specified name's origin and meaning, scraped from NameBerry.com, as well as its predicted nationality, age, and gender from the Nationalize.io, Genderize.io, and Agify.io APIs
- create data visualizations that show how a name's popularity changes over a specified range of years
The gather_data.py file downloads and scrapes the following data:
- Social Security Administration (SSA) annual name data from 1880 to the previous year, listed under 'National data'
  - more information about the data qualifications
- SSA number of Social Security card holders by year of birth and sex
  - note: not currently used in the code, but may be used in the future to visualize the relative popularity of names over time
- US Census Bureau Top 1000 surnames
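To illustrate the layout of the SSA national data, here is a minimal sketch that reads a zip archive of per-year files. It assumes the archive contains files named like yob1880.txt with one `name,sex,count` row per line. The function names `read_ssa_zip` and `make_sample_zip` and the sample rows are hypothetical and are not taken from gather_data.py.

```python
import csv
import io
import re
import zipfile


def read_ssa_zip(zip_bytes):
    """Return {year: [(name, sex, count), ...]} from an SSA-style names archive."""
    records = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for member in archive.namelist():
            # Assumption: data files are named yobYYYY.txt; skip anything else.
            match = re.fullmatch(r"yob(\d{4})\.txt", member)
            if match is None:
                continue
            year = int(match.group(1))
            with archive.open(member) as handle:
                text = io.TextIOWrapper(handle, encoding="utf-8", newline="")
                records[year] = [
                    (row[0], row[1], int(row[2]))
                    for row in csv.reader(text)
                    if row  # ignore blank lines
                ]
    return records


def make_sample_zip():
    """Build a tiny in-memory archive (made-up rows) so the sketch runs offline."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("yob1880.txt", "Mary,F,20\nJohn,M,10\n")
        archive.writestr("readme.txt", "not a data file")
    return buffer.getvalue()


sample_records = read_ssa_zip(make_sample_zip())
```

Keeping the parser separate from the download step makes it possible to test the parsing logic without network access.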
Return a specified number of random names with options to limit the gender to male or female and to generate random surnames.
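One way the random-name feature could work is sketched below. It assumes the name data is already loaded as a mapping from year to `(name, sex, count)` rows and that surnames are a plain list. The function `random_names` and the sample data are hypothetical illustrations, not the project's actual code.

```python
import random


def random_names(records, n=1, sex=None, surnames=None, rng=None):
    """Pick n random first names from random years.

    records:  {year: [(name, sex, count), ...]}
    sex:      "m", "f", or None for no restriction
    surnames: optional list; when given, a random surname is appended
    """
    rng = rng or random.Random()
    wanted = sex.upper() if sex else None

    # Keep only the years that have at least one matching name.
    pools = {}
    for year, rows in records.items():
        names = [name for name, row_sex, _count in rows
                 if wanted is None or row_sex == wanted]
        if names:
            pools[year] = names
    if not pools:
        raise ValueError(f"no names available for sex={sex!r}")

    years = sorted(pools)
    results = []
    for _ in range(n):
        year = rng.choice(years)           # random year first
        full = rng.choice(pools[year])     # then a random name from that year
        if surnames:
            full = f"{full} {rng.choice(surnames)}"
        results.append(full)
    return results


# Made-up sample data for illustration only.
sample = {
    1990: [("Ada", "F", 5), ("Alan", "M", 7)],
    1991: [("Grace", "F", 3)],
}
picked = random_names(sample, n=3, sex="f", surnames=["Smith", "Jones"])
```

Passing a seeded `random.Random` as `rng` makes the output reproducible, which is convenient for tests.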
Enter either a first name or a full name (first and last) to gather information about a name from Nameberry and retrieve the name's predicted nationalities, gender, and age using the Nationalize, Genderize, and Agify APIs. The predicted nationalities are highlighted on a world map.
note: Nationalize uses last names, while the other APIs use first names. If a last name is not provided, the code uses the first name instead, which may lead to inaccurate nationality predictions.
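The first-name and last-name handling described above can be sketched as follows. The endpoint hosts and the `name` query parameter are assumed from the public documentation of the three services, and the helpers `build_prediction_urls` and `fetch_json` are hypothetical names, not functions from this repo.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint hosts; check each service's documentation before relying on them.
API_HOSTS = {
    "gender": "https://api.genderize.io",
    "age": "https://api.agify.io",
    "nationality": "https://api.nationalize.io",
}


def build_prediction_urls(full_name):
    """Map each prediction type to a request URL for the given name."""
    parts = full_name.split()
    if not parts:
        raise ValueError("a name is required")
    first = parts[0]
    # Nationalize gets the last name; fall back to the first name if none is given.
    last = parts[-1] if len(parts) > 1 else first
    names = {"gender": first, "age": first, "nationality": last}
    return {
        kind: f"{API_HOSTS[kind]}/?{urlencode({'name': name})}"
        for kind, name in names.items()
    }


def fetch_json(url, timeout=10):
    """Fetch and decode a JSON response (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


urls = build_prediction_urls("Ada Lovelace")
```

Calling `fetch_json(urls["gender"])` would then perform the actual request. It is left out of the example so the sketch runs offline.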
Enter a first name, gender, start year, and end year to visualize the popularity (number of babies with that name each year in the given range) with a scatterplot and a heatmap.
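As a rough sketch of the data behind the popularity plots, the function below builds a per-year count series for one name. It assumes the same `{year: [(name, sex, count), ...]}` structure as the SSA files and treats a missing name as a count of 0. `popularity_by_year` and `plot_scatter` are hypothetical helpers, and the plotting step assumes matplotlib is installed.

```python
def popularity_by_year(records, name, sex, start_year, end_year):
    """Return [(year, count), ...] for one name and sex over an inclusive range."""
    if start_year > end_year:
        raise ValueError("start_year must not be after end_year")
    target_name = name.lower()
    target_sex = sex.upper()
    series = []
    for year in range(start_year, end_year + 1):
        count = 0  # assumption: a name absent from a year's data counts as 0
        for row_name, row_sex, row_count in records.get(year, []):
            if row_name.lower() == target_name and row_sex == target_sex:
                count = row_count
                break
        series.append((year, count))
    return series


def plot_scatter(series, name):
    """Draw a scatterplot of count per year (imports matplotlib lazily)."""
    import matplotlib.pyplot as plt

    years = [year for year, _count in series]
    counts = [count for _year, count in series]
    plt.scatter(years, counts)
    plt.xlabel("Year")
    plt.ylabel(f"Babies named {name}")
    plt.show()


# Made-up sample data for illustration only.
sample = {
    1990: [("Ada", "F", 5)],
    1991: [("Grace", "F", 3)],
    1992: [("Ada", "F", 8)],
}
ada_series = popularity_by_year(sample, "Ada", "f", 1990, 1992)
```

The same series could feed a heatmap. Only the scatterplot is shown here to keep the sketch short.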
- In your Terminal, run `git clone https://github.com/lk101101/Names` to clone this repo into your directory.
- Navigate to the new folder called Names.
- Create a Conda environment from the environment.yml file: `conda env create -f environment.yml`, then `conda activate abc`. A requirements.txt file is also provided.
- Download and unzip the required datasets: `python gather_data.py` (see 'Data' for more information).
- Run `flask run` to start the Flask server.
- Navigate to `http://127.0.0.1:5000` in your web browser.