GitHub - carliecode/census_bureau: Explore 2017 household demographics, income, and phone accessibility using Apache Spark & Python.

United States Census Bureau Data Analysis using Apache Spark

Project Overview

This project analyzes the United States Census Bureau's 2017 Basic Monthly Current Population Survey (CPS) data using Apache Spark with Python. It extracts a subset of household fields from the fixed-width public-use file and answers four questions about family income, location, race, and telephone access.

Dataset

Source: United States Census Bureau's 2017 Basic Monthly CPS
File: December DOS/Windows zip file (dec17pub.zip), extracted to a .dat file
Data Dictionary: Used to locate each field in the record and to decode its values
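
As a rough illustration of how a data dictionary can drive the extraction, the sketch below slices fixed-width records by column position. The field names and positions in LAYOUT are placeholders invented for this example, not the real CPS record layout, and read_fixed_width is a hypothetical helper rather than the code in main.ipynb; the actual positions should be taken from the data dictionary.

```python
# Hypothetical layout: field name -> (1-based start column, width).
# These positions are placeholders, NOT the real CPS record layout.
LAYOUT = {
    "household_id": (1, 15),
    "interview_month": (16, 2),
    "interview_year": (18, 4),
}


def select_exprs(layout):
    """Build one Spark SQL substring expression per field (1-based positions)."""
    return [
        f"trim(substring(value, {start}, {width})) AS {name}"
        for name, (start, width) in layout.items()
    ]


def slice_record(line, layout):
    """Plain-Python equivalent, handy for checking a layout on a single line."""
    return {
        name: line[start - 1 : start - 1 + width].strip()
        for name, (start, width) in layout.items()
    }


def read_fixed_width(spark, path, layout):
    """Read each record as one string column named 'value', then slice it."""
    return spark.read.text(path).selectExpr(*select_exprs(layout))
```

With an existing SparkSession, read_fixed_width(spark, "path/to/file.dat", LAYOUT) would return a DataFrame with one string column per layout entry. slice_record can be run on a single line first to confirm that the positions line up.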

Extracted Data

The following data was extracted:

Full household identifier
Time of interview in YYYY/MMM format
Final outcome of the survey
Type of housing unit
Household type
Apartment/Household has a telephone
Apartment/Household can access a telephone elsewhere
Whether a telephone interview is acceptable to the respondent
Type of interview
Family income range
Geographical division/location
Race
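
Two of the fields above need a transformation rather than a plain copy: the interview time is built from separate year and month values, and coded answers are mapped to labels. A minimal sketch of both follows, assuming that "MMM" means a three-letter English month abbreviation and using a made-up code table; the real code-to-label mappings come from the data dictionary and schema.py.

```python
MONTH_ABBR = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

# Hypothetical code table for illustration; check the data dictionary
# for the actual codes and labels.
TELEPHONE_CODES = {"1": "Yes", "2": "No"}


def interview_time(year, month):
    """Format year and month number as YYYY/MMM, e.g. 2017/Dec."""
    month = int(month)
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range: {month}")
    return f"{int(year):04d}/{MONTH_ABBR[month - 1]}"


def decode(code, table, default="Unknown"):
    """Map a raw code string to its label, tolerating padding spaces."""
    return table.get(code.strip(), default)
```

For example, interview_time("2017", "12") yields "2017/Dec", and decode(" 2 ", TELEPHONE_CODES) yields "No" under the assumed table.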

Analysis Questions

The following questions were answered:

Count of respondents per family income range
Count of respondents per geographical division/location and race (top 10)
Number of respondents with no telephone in the house who can access a telephone elsewhere and accept a telephone interview
Number of respondents who can access a telephone but do not accept a telephone interview
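
The four questions map onto grouped counts and filtered counts. The sketch below shows one way they might be phrased against a Spark DataFrame. The column names and the 'Yes'/'No' labels are assumptions for illustration, and "can access a telephone" is read here as having one at home or elsewhere; the notebook may define these differently.

```python
# Assumed column names: family_income, division, race,
# has_phone, phone_elsewhere, phone_interview_ok (decoded to 'Yes'/'No').

# Filter conditions written as SQL expressions accepted by DataFrame.filter.
NO_PHONE_BUT_REACHABLE = (
    "has_phone = 'No' AND phone_elsewhere = 'Yes' "
    "AND phone_interview_ok = 'Yes'"
)
PHONE_ACCESS_BUT_REFUSED = (
    "(has_phone = 'Yes' OR phone_elsewhere = 'Yes') "
    "AND phone_interview_ok = 'No'"
)


def count_by_income(df):
    """Number of respondents in each family income range."""
    return df.groupBy("family_income").count()


def top_division_race(df, n=10):
    """The n most frequent (division, race) combinations."""
    return (
        df.groupBy("division", "race")
        .count()
        .orderBy("count", ascending=False)
        .limit(n)
    )


def count_matching(df, condition):
    """Number of rows that satisfy a SQL filter condition."""
    return df.filter(condition).count()
```

Given a DataFrame df with those columns, count_matching(df, NO_PHONE_BUT_REACHABLE) would answer the third question and count_matching(df, PHONE_ACCESS_BUT_REFUSED) the fourth.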

Code

The code is written in Python using Apache Spark and is available in this repository.
The analysis is in the Jupyter notebook main.ipynb.
The schema information used to decode the data elements is in schema.py.

Requirements

Apache Spark
Python 3.8
pandas

Installation

Install Apache Spark
Clone this repository
Install the required Python libraries using pip install -r requirements.txt

