Skip to content

Ahmad-Akhtar/EnergyVision-Future-Forecasting-System-with-User-Interface-of-Energy-Consumption-

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 

Repository files navigation

Time Series Analysis and Prediction: A Comprehensive Report

  1. Introduction

1.1 Background Time series data is a sequence of data points collected or recorded at specific time intervals. It is widely used in various fields such as economics, finance, weather forecasting, and energy demand prediction. Accurate time series analysis and prediction are crucial for making informed decisions and optimizing resource allocation.

1.2 Objectives The primary objective of this report is to analyze and predict time series data using multiple modeling techniques. Specifically, we aim to: • Preprocess and clean the time series data. • Implement and evaluate various time series models. • Compare the performance of these models to identify the most accurate and robust method.

1.3 Scope This report covers the analysis of hourly PJME_MW data from December 31, 2002, to January 2, 2018. We will explore and implement the following models: ARIMA, LSTM, SARIMA, ETS, Prophet, SVR, and a hybrid ARIMA-ANN model.

  1. Data Description

2.1 Dataset Overview The dataset used in this report consists of hourly energy demand (PJME_MW) recorded from December 31, 2002, to January 2, 2018. The data contains 145,366 observations.

2.2 Data Preprocessing • Cleaning: Identify and handle missing values by imputation or removal. • Normalization/Standardization: Scale the data to a uniform range using techniques like MinMaxScaler. • Stationarization: Apply transformations such as differencing and logarithms to achieve stationarity.

2.3 Exploratory Data Analysis (EDA) • Summary Statistics: Provide descriptive statistics for the dataset. • Visualizations: Generate line plots, histograms, and other visualizations to identify trends, seasonality, and anomalies.

  1. Time Series Models 3.1 ARIMA Model 3.2 LSTM Model 3.3 Hybrid ARIMA-ANN Model 3.4 SARIMA Model 3.5 Exponential Smoothing (ETS) 3.6 Prophet 3.7 Support Vector Regression (SVR) 3.8 Hybrid Model (ARIMA & ANN)

  2. Model Comparison and Analysis: This section compares the performance of various time series models applied to our dataset, focusing on key accuracy metrics. The models evaluated include ARIMA, ANN, SARIMA, ETS, Prophet, SVR, LSTM, and a hybrid ARIMA-ANN model.

4.1 Performance Comparison

ARIMA Total fit time: 1890.549 seconds Mean Squared Error (MSE): 1.3428 Root Mean Squared Error (RMSE): 1.1588 Mean Absolute Error (MAE): 0.8628 R-squared (R² Score): -0.3165

ANN Mean Squared Error (MSE): 0.3135 Root Mean Squared Error (RMSE): 0.5600 Mean Absolute Error (MAE): 0.4189 R-squared (R² Score): 0.6823

SARIMA Mean Squared Error (MSE): 1.7879 Root Mean Squared Error (RMSE): 1.3371 Mean Absolute Error (MAE): 1.1675

ETS Mean Absolute Error (MAE): 0.8778 Root Mean Squared Error (RMSE): 1.1705 Mean Absolute Percentage Error (MAPE): 526.4495 Symmetric Mean Absolute Percentage Error (SMAPE): 116.7836

Prophet Mean Absolute Error (MAE): 0.7892 Mean Squared Error (MSE): 0.9964 Root Mean Squared Error (RMSE): 0.9982 Mean Absolute Percentage Error (MAPE): 7.9359

SVR Mean Absolute Error (MAE): 0.8992 Mean Squared Error (MSE): 1.2964 Root Mean Squared Error (RMSE): 0.7520 Mean Absolute Percentage Error (MAPE): 7.9359

LSTM Mean Absolute Error (MAE): 0.0495 Mean Squared Error (MSE): 0.0081

Hybrid ARIMA-ANN Mean Absolute Error (MAE) across all folds: 1.1125 Mean Squared Error (MSE) across all folds: 2.0118

4.2 Strengths and Weaknesses

ARIMA Strengths: Captures linear trends and seasonal patterns effectively; widely used and understood. Weaknesses: Poor performance on data with complex nonlinear patterns; negative R² score indicates poor fit.

ANN Strengths: Handles complex nonlinear relationships well; high R² score indicates good fit. Weaknesses: Requires more computational resources and time for training; sensitive to hyperparameter tuning.

SARIMA Strengths: Extends ARIMA to handle seasonality explicitly; effective for seasonal data. Weaknesses: Higher error metrics compared to other models; complex model tuning.

ETS Strengths: Simple and intuitive; handles trend and seasonality well. Weaknesses: High MAPE and SMAPE values indicate poor performance on this dataset; less flexible for complex patterns.

Prophet Strengths: Handles seasonality and holidays well; easy to use and interpret. Weaknesses: Higher MSE compared to ANN and LSTM; may require manual tuning for optimal performance.

SVR Strengths: Effective for small to medium-sized datasets; captures nonlinear patterns with appropriate kernel selection. Weaknesses: Performance highly dependent on parameter tuning; higher MAE and MSE compared to ANN and LSTM.

LSTM Strengths: Excellent at capturing long-term dependencies and nonlinear patterns; lowest MAE and MSE among all models. Weaknesses: Requires significant computational power; complex architecture and tuning.

Hybrid ARIMA-ANN Strengths: Combines strengths of ARIMA and ANN; handles both linear and nonlinear patterns. Weaknesses: Higher error metrics compared to standalone ANN and LSTM; more complex implementation and tuning.

4.3 Recommendations Based on the analysis, the following recommendations can be made: Best Overall Model: The LSTM model outperforms other models with the lowest MAE and MSE, making it the best choice for this dataset. For Linear and Seasonal Patterns: The ARIMA and SARIMA models are suitable for datasets with strong linear trends and seasonal patterns. For Ease of Use: Prophet provides a good balance of ease of use and performance, especially for datasets with seasonal and holiday effects. For Hybrid Approaches: Consider using hybrid models like ARIMA-ANN for datasets with both linear and complex nonlinear patterns, though they may require more effort in implementation and tuning.

Summary of Findings: • The LSTM model demonstrated the best performance, capturing complex patterns effectively. • ANN and Prophet models also performed well, with ANN showing a strong ability to handle nonlinear relationships. • Traditional models like ARIMA and SARIMA were less effective due to their inability to capture complex nonlinear patterns present in the data. • ETS showed the highest error metrics, indicating it was not well-suited for this dataset. • Hybrid models offer potential for improved accuracy but require careful implementation and tuning.

Conclusion: This comprehensive analysis demonstrates the varying strengths and weaknesses of different time series models. The LSTM model stands out for its superior accuracy, making it the preferred choice for this dataset. However, the choice of model should consider the specific characteristics of the dataset and the computational resources available.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published