Deep Learning and NLP in Cryptocurrency Forecasting: Integrating Financial, Blockchain, and Social Media Data
This repository contains the code for data acquisition and processing, exploratory data analysis, and model training and evaluation, together with related utilities and documentation, for forecasting cryptocurrency prices with deep learning and natural language processing techniques.
The repository contains the following directories:
- 1_data_acquisition: Interfacing with APIs to collect and preprocess data.
- 2_data_processing: Preprocessing and cleaning of the data acquired in the previous step.
- 3_nlp_models: Applying our NLP approaches and post-processing the acquired scores.
- 4_eda: Exploratory analysis of numeric and textual data, as well as Granger causality analysis.
- 5_time_series_models: Training and evaluation of the time series models.
- 6_feature_importance: Assessment of the feature importances using an XGBoost model trained on daily price fluctuations.
- utils: Utility scripts and helper functions used throughout the project.
The code in this repository was developed using Python 3.9.14. The required Python packages are listed in the requirements.txt file.