staaceyD/data_flow

Data Flow

An application that extracts JSON files from a public AWS S3 bucket, transforms them, and loads the results into a database.

Installation

To launch this project, first clone the repository:

git clone <repository link>

Create a virtual environment and activate it:

python3 -m venv <name of virtual env folder>
source <name of virtual env folder>/bin/activate

After that, install all dependencies:

pip3 install -r requirements.txt 

Install the AWS CLI and configure a user for further use:

pip3 install awscli --upgrade --user
aws configure

At this point you should be good to go. To start the project, run the db_wrapper.py file:

python3 db_wrapper.py

This will process the files in the bucket, create the database, and insert the records into it.
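
For orientation, the extract, transform, and load steps can be sketched roughly as below. This is only an illustration and not the code in db_wrapper.py: the record fields (id, title), the records table, the use of SQLite, and the fake_bucket stand-in for the S3 download are all assumptions made so the example runs without AWS access.

```python
import json
import sqlite3
from typing import Callable, Iterable, Tuple

Row = Tuple[str, str]


def transform(raw: dict) -> Row:
    """Flatten one JSON document into an (id, title) row.

    The field names are hypothetical; the real files may look different.
    """
    return (str(raw["id"]), str(raw.get("title", "")).strip())


def load(conn: sqlite3.Connection, rows: Iterable[Row]) -> int:
    """Create the table if missing, upsert the rows, return the table size."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records (id TEXT PRIMARY KEY, title TEXT)"
    )
    # INSERT OR REPLACE keeps re-runs from duplicating records.
    conn.executemany(
        "INSERT OR REPLACE INTO records (id, title) VALUES (?, ?)", rows
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM records").fetchone()[0]


def run_pipeline(
    fetch_documents: Callable[[], Iterable[str]], conn: sqlite3.Connection
) -> int:
    """Extract JSON bodies, transform each one, and load the results."""
    rows = [transform(json.loads(body)) for body in fetch_documents()]
    return load(conn, rows)


def fake_bucket() -> Iterable[str]:
    """Stand-in for the S3 download step (assumption: one JSON object per file)."""
    yield '{"id": 1, "title": " First "}'
    yield '{"id": 2, "title": "Second"}'


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(run_pipeline(fake_bucket, connection))
```

Passing the download step in as a function keeps the transform and load logic testable offline; in a real run it would be replaced by whatever reads the objects from the bucket.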
