A lightweight ETL data pipeline intended to support the operations of the Consumer Complaint Search application.
Description: The purpose of this code is to provide data for Consumer Complaint Search. The pipeline downloads scrubbed consumer complaint data and indexes it in Elasticsearch for the Complaint Search application to display and analyze.
Status: In Production
This pipeline indexes data in Elasticsearch, so it requires a running Elasticsearch instance to connect to.
Detailed instructions on how to install, configure, and get the project running are in the INSTALL document.
- Set environment variables
export ES_USERNAME=<foo>
export ES_PASSWORD=<bar>
export ENV=[ENVIRONMENT]
- where ENVIRONMENT is one of dev, staging, or prod
make from_public
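Because `ENV` accepts only the three values listed above, a small guard can catch typos before the pipeline starts. The following is a minimal sketch, not part of this repository: `validate_env` is a hypothetical helper name, and it assumes the only valid values are `dev`, `staging`, and `prod`.

```shell
# Hypothetical helper (an assumption, not shipped with this project).
# Returns 0 when the argument is one of the documented environments,
# otherwise prints a message to stderr and returns 1.
validate_env() {
  case "$1" in
    dev|staging|prod)
      return 0
      ;;
    *)
      echo "ENV must be one of: dev, staging, prod (got '$1')" >&2
      return 1
      ;;
  esac
}
```

One possible use is `validate_env "$ENV" && make from_public`, so that `make` runs only when the environment name is recognized.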
source ./activate-virtualenv.sh
- Set environment variables
export AWS_ACCESS_KEY_ID=<svc_account_access_key>
export AWS_SECRET_ACCESS_KEY=<svc_account_secret_access_key>
export ES_USERNAME=<foo>
export ES_PASSWORD=<bar>
export ENV=[ENVIRONMENT]
- where ENVIRONMENT is one of dev, staging, or prod
export INPUT_S3_BUCKET=<bucket-name>
export INPUT_S3_KEY=<path-to-csv>
export OUTPUT_S3_BUCKET=<bucket-name>
export OUTPUT_S3_FOLDER=ccdb/test/<your initials>
make
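The S3-based run depends on several variables being exported at once, and a missing one may only surface partway through the run. As a rough sketch, assuming a POSIX-compatible shell, a pre-flight check could look like the following. `require_vars` is a hypothetical helper and is not provided by this project.

```shell
# Hypothetical helper (an assumption, not shipped with this project).
# Reports every named variable that is unset or empty on stderr and
# returns non-zero if at least one is missing.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup via eval keeps this portable to POSIX sh;
    # pass only trusted, literal variable names.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `require_vars AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY ES_USERNAME ES_PASSWORD ENV INPUT_S3_BUCKET INPUT_S3_KEY OUTPUT_S3_BUCKET OUTPUT_S3_FOLDER && make` would run `make` only when all nine variables have values.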