This project integrates Cloud Computing with the Internet of Things (IoT) for comprehensive environmental monitoring, with a focus on air quality analysis. Utilising Apache Spark and Amazon Timestream, it manages large volumes of data from IoT sensors to calculate the Air Quality Index (AQI) accurately, offering insights into environmental conditions. The file report.pdf provides a detailed overview of the project.
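
As a rough illustration of the AQI calculation mentioned above, the sketch below converts a PM2.5 concentration into an index value by piecewise linear interpolation between breakpoints. It is not taken from this repository: the function name `pm25_to_aqi` is hypothetical, and the breakpoint table is assumed to be the US EPA PM2.5 table as published before its 2024 revision. The project may use different pollutants or a different standard; see report.pdf for the actual method.

```python
import math

# PM2.5 breakpoints (24-hour mean, in µg/m³) mapped to AQI ranges.
# Assumption: US EPA table as published before the 2024 revision.
# Each row is (c_lo, c_hi, i_lo, i_hi).
PM25_BREAKPOINTS = [
    (0.0, 12.0, 0, 50),
    (12.1, 35.4, 51, 100),
    (35.5, 55.4, 101, 150),
    (55.5, 150.4, 151, 200),
    (150.5, 250.4, 201, 300),
    (250.5, 350.4, 301, 400),
    (350.5, 500.4, 401, 500),
]


def pm25_to_aqi(concentration: float) -> int:
    """Convert a PM2.5 concentration to an AQI value (hypothetical helper)."""
    if concentration < 0:
        raise ValueError("concentration must be non-negative")
    # Truncate to one decimal place so the value always falls inside a bracket.
    # The small epsilon guards against floating-point error in the product.
    c = math.floor(concentration * 10 + 1e-9) / 10
    for c_lo, c_hi, i_lo, i_hi in PM25_BREAKPOINTS:
        if c_lo <= c <= c_hi:
            # Linear interpolation within the matching bracket.
            return round((i_hi - i_lo) / (c_hi - c_lo) * (c - c_lo) + i_lo)
    raise ValueError("concentration is above the highest breakpoint")
```

In a full pipeline, a per-pollutant function like this would typically be applied to each averaged reading, with the overall AQI taken as the maximum across pollutants.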
- Real-time Data Collection: IoT sensors gather environmental data continuously.
- Efficient Data Processing: Leverages Apache Spark for effective data handling.
- Robust Data Storage: Uses Amazon Timestream for optimised time-series data management.
- Dynamic Data Visualisation: Features a Grafana dashboard for interactive and real-time data insights.
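
To make the storage step more concrete, here is a minimal sketch of how a single sensor reading could be shaped into a Timestream record and written with boto3. The helper names (`build_record`, `write_readings`), the `sensor_id` dimension, and the database, table, and region arguments are assumptions for illustration and are not taken from this repository; only the `timestream-write` client and its `write_records` call are part of boto3.

```python
import time


def build_record(sensor_id, measure_name, value, timestamp_ms=None):
    """Shape one sensor reading as a Timestream record (hypothetical helper)."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    return {
        # The dimension name "sensor_id" is an assumption, not the project's schema.
        "Dimensions": [{"Name": "sensor_id", "Value": str(sensor_id)}],
        "MeasureName": measure_name,
        # Timestream expects measure values and times as strings.
        "MeasureValue": str(value),
        "MeasureValueType": "DOUBLE",
        "Time": str(timestamp_ms),
        "TimeUnit": "MILLISECONDS",
    }


def write_readings(records, database, table, region):
    """Write records to Timestream in batches (requires AWS credentials)."""
    import boto3  # imported lazily so build_record works without boto3 installed

    client = boto3.client("timestream-write", region_name=region)
    # A single write_records call accepts at most 100 records.
    for start in range(0, len(records), 100):
        client.write_records(
            DatabaseName=database,
            TableName=table,
            Records=records[start:start + 100],
        )
```

Keeping record construction separate from the network call means the record format can be unit tested without AWS access.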

To set up and run the project:
- Clone the repository:
      git clone git@github.com:AlexisBalayre/environmental-monitoring-project.git
- Navigate to the project directory:
      cd environmental-monitoring-project
- Set up a virtual environment:
      python3 -m venv venv
- Activate the virtual environment:
  - For Windows:
        .\venv\Scripts\activate
  - For Unix or macOS:
        source venv/bin/activate
- Install dependencies:
      pip install -r requirements.txt
- Start the system: run main.py to begin data collection and processing:
      python main.py
- Visualise the data: Access the Grafana dashboard for real-time data analysis and visualisations.

Repository layout:
- lib/: Core library modules for data collection, processing, and storage.
- scripts/: Scripts for IAM credentials retrieval and Spark job initiation.
- services/: Service configurations for IAM and Spark.
- test/: Testing scripts and visualisation tools.
- main.py: Main executable script.
- requirements.txt: Project dependencies.

The project includes comprehensive testing:
- Load testing configurations and results.
- Unit testing for data collection, processing, and storage.
- Visualisation tools for data analysis.

Technologies used:
- Apache Spark
- Python 3.x
- Amazon Timestream
- Grafana