Web traffic is essentially the number of sessions in a given time frame. It varies a lot with the time of day, the day of the week, and so on, and how much traffic a platform can withstand depends on the size of the servers supporting it.
If the traffic exceeds what the servers can handle, the website may return an error page (typically a 503 Service Unavailable), which is something we don't want to happen. It will drive visitors away.
- One solution to this problem is to increase the number of servers. However, the downside of that solution is that the cost goes up, which is again undesirable. So, what is the solution?
- You can dynamically allocate servers based on historical visitor-volume data, i.e., historical web traffic. That brings us to the data science problem: forecasting web traffic (the number of sessions) from historical data.
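A common way to frame this forecasting task for models like LSTMs and CNNs is the sliding-window setup: use the last `n` observations to predict the next one. The helper below is an illustrative sketch (the function name and window size are not from this repo):

```python
# Sketch: framing traffic forecasting as supervised learning with a
# sliding window. Names and window size are illustrative only.
import numpy as np

def make_windows(series, n_lags):
    """Turn a 1-D series into (X, y) pairs: n_lags past values -> next value."""
    X, y = [], []
    for i in range(len(series) - n_lags):
        X.append(series[i : i + n_lags])
        y.append(series[i + n_lags])
    return np.array(X), np.array(y)

sessions = np.array([120, 135, 150, 140, 160, 155, 170], dtype=float)
X, y = make_windows(sessions, n_lags=3)
# X has shape (4, 3): each row holds 3 past values; y holds the value that follows.
```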
- Run `pip install -r requirements.txt` to install the necessary dependencies.
- Run `jupyter notebook` and follow along in the commented Jupyter notebook.
We will work with a web traffic dataset: a six-month time series of session counts, which you can find here.
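To get a feel for the data's shape, here is a sketch using a synthetic stand-in for the six-month series (hourly session counts with a daily cycle); the real dataset's column names and granularity may differ:

```python
# Sketch: a synthetic six-month hourly traffic series; column names and
# frequency are assumptions, not taken from the actual dataset.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
idx = pd.date_range("2021-01-01", periods=24 * 180, freq="h")  # ~6 months, hourly
sessions = 200 + 50 * np.sin(2 * np.pi * idx.hour / 24) + rng.normal(0, 10, len(idx))
df = pd.DataFrame({"sessions": sessions}, index=idx)

daily = df["sessions"].resample("D").mean()  # daily averages for a quick look
```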
- Basic LSTM Model: (plot not shown)
- Basic CNN Model: (plot not shown)
| Metric | Baseline Model | Basic LSTM Model | Basic CNN Model |
|---|---|---|---|
| Mean Squared Error | 0.5546 | 0.01501 | 0.0138 |
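For context on the baseline column, a minimal sketch of a naive "repeat the last value" baseline and its MSE is shown below; the repo's exact baseline definition is an assumption here, and the numbers are illustrative:

```python
# Sketch: a naive persistence baseline (predict the previous value)
# and its mean squared error. Values are illustrative only.
import numpy as np

y_true = np.array([0.50, 0.55, 0.53, 0.60, 0.58])  # scaled session counts
y_pred = y_true[:-1]                               # prediction = previous value
mse = np.mean((y_true[1:] - y_pred) ** 2)
```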
- Basic LSTM Model: (plot not shown)
- Basic CNN Model: (plot not shown)
- Going through the results, we conclude that both models have almost identical performance. Feel free to fine-tune the hyperparameters (different batch sizes, optimizers, etc.) or customize the DL architectures.
- Transfer Learning, SOTA models? 😯
- Facebook Prophet? 😏