---
title: "BAN421-report"
author: "Group 2"
date: "November 22, 2019"
output: pdf_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
# 1. Introduction
In this paper, we investigate how the total supply (production) and total consumption of other countries influence electricity spot prices in Norway, using the prices for Bergen and Tromsø. All input data is taken from the Nord Pool website and covers 2014 to 2018.
## Economic reasoning
Comparing the production and consumption volumes of electricity for a country, one can easily calculate their difference (delta), which indicates periods of either overproduction (surplus) or underproduction (deficit). Given imperfect energy storage capacity, these peaks and lows in the nationwide electricity grid create both the opportunity and the need to trade electricity with economically and geographically close countries [citation].
Therefore, our research question is: how dependent is Norway, and its electricity price, on the other Nordic countries and their electricity production?
Before starting the analysis, we explore the most efficient way to handle multiple files when loading raw data into an R environment, specifically by comparing the processing time of using a dataframe and a tibble. After that, we plot the general consumption and production trends for each country over the past five years. In addition, we plot the price trend in Norway to see how the price fluctuated over the same period. These plots give us a better understanding of how each factor evolved. To investigate the relationship between consumption, production and price, we use a probabilistic approach, specifically the Hidden Markov Model (HMM). We implement it with a suitable R package and finally interpret the results.
For the purpose of our investigation, we decided to use packages from the so-called R "tidyverse". While the group is aware of the downsides of using piped statements, we decided to make use of them for this particular investigation.
```{r Libraries, warning = FALSE, message = FALSE, echo = FALSE}
# loading required packages
library(readxl)
library(tibble)
library(quantmod)
library(MASS)
library(reshape2)
library(scales)
library(dplyr)
library(ggplot2)
library(gridExtra)
library(magrittr)
library(depmixS4)
# rm( list=ls() )
```
# 2. Preprocessing and analysis
For our first step, we use `system.time` to compare the processing time of loading the cleaned data into a dataframe versus a tibble. We get the following result:
```{r Data cleaning, warning = FALSE, message = FALSE, echo = FALSE}
## 1. Data cleaning process ----
# find the file names of consumption, spot prices and production for each year
consum_names <- list.files(path = ".", pattern = "consumption-per.*\\.xlsx$", full.names = TRUE)
elspot_names <- list.files(path = ".", pattern = "elspot.*\\.xlsx$", full.names = TRUE)
production_names <- list.files(path = ".", pattern = "production-per.*\\.xlsx$", full.names = TRUE)
# try to load all 5 years of consumption data into one dataframe
consumption <- data.frame() # create one empty dataframe first
system.time(for(i in consum_names){
  # the raw files are in .xlsx format, so read_xlsx is the matching reader
  consumption <- read_xlsx(i)%>%
    bind_rows(consumption)%>% # combine each year's data together by rows
    na.omit()
})
```
We compare this against loading into a tibble:
```{r Time comparison, warning = FALSE, message = FALSE, echo = FALSE}
#use tibble to compare the processing time
consumption_tib<-tibble()
system.time(for(i in 1:length(consum_names)){
consumption_tib <- read_xlsx(consum_names[i])%>%
bind_rows(consumption_tib)
})
```
We cannot observe a significant difference in processing time between the tibble and the dataframe. This is probably because our raw data is too small for the two approaches to differ measurably.
To further compare dataframe and tibble, we also look at the structure of the result produced by each. We get the following result:
```{r Check data structure, warning = FALSE, message = FALSE, echo = FALSE}
#check the data structure in each
str(consumption)
```
```{r, warning = FALSE, message = FALSE, echo = FALSE}
str(consumption_tib)
```
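They have the same data structure, which is expected: `read_xlsx` returns a tibble, and `bind_rows` returns the type of its first input, so both loops end up with a tibble. We therefore continue with the dataframe approach to load production and price data.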
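The loading loops above grow the result one file at a time. As a rough sketch of an alternative, the files can be read first and bound once at the end. The example below is an illustration only: it uses base R and small CSV files written to a temporary directory as a hypothetical stand-in for the yearly `.xlsx` files, and the names `tmp_dir`, `toy_files`, `grown` and `bound` are not part of the report.

```r
# Hypothetical stand-in for the yearly files: three small CSV files
# written to a temporary directory (the report itself reads .xlsx files).
tmp_dir <- file.path(tempdir(), "toy_consumption")
dir.create(tmp_dir, showWarnings = FALSE)
for (yr in 2014:2016) {
  toy <- data.frame(Date = as.Date(paste0(yr, "-01-0", 1:3)),
                    NO = c(100, 110, 120) + (yr - 2014))
  write.csv(toy, file.path(tmp_dir, paste0("consumption-per-", yr, ".csv")),
            row.names = FALSE)
}

toy_files <- list.files(tmp_dir, pattern = "consumption-per.*\\.csv$",
                        full.names = TRUE)

# Variant 1: grow the result inside the loop (as in the chunks above)
grown <- data.frame()
t_loop <- system.time(for (f in toy_files) {
  grown <- rbind(read.csv(f), grown)
})

# Variant 2: read every file first, then bind once
t_once <- system.time({
  pieces <- lapply(toy_files, read.csv)
  bound <- do.call(rbind, pieces)
})

nrow(grown) == nrow(bound)  # both variants hold the same number of rows
```

With only a handful of small files, `t_loop` and `t_once` will be practically identical, which is consistent with the observation above. A difference would only be expected with many or large files.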
```{r Binding dataframes, warning = FALSE, message = FALSE, echo = FALSE}
production <-data.frame()
for(i in 1:length(production_names)){
production <- read_xlsx(production_names[i])%>%
bind_rows( production)%>%
na.omit()
}
price <-data.frame()
for(i in 1:length(elspot_names)){
price<- read_xlsx(elspot_names[i])%>%
bind_rows( price)
}
price<-price[,-5]
price<-na.omit(price)
```
Next, we load the forecast (prognosis) data for consumption and production in the different countries, so that we can later use it for predictions based on our model.
```{r Finalized R-data, warning = FALSE, message = FALSE, echo = FALSE}
consum_pre<-list.files(path = ".",
pattern = "consumption-pro.*.xlsx", full.names = T)
consum_pre<-read_xlsx(consum_pre)
produc_pre<-list.files(path = ".",
pattern = "production-pro.*.xlsx", full.names = T)
produc_pre<-read_xlsx(produc_pre)
# sum up all productions from small areas in each country and take the total only
produc_pre<- produc_pre%>%
mutate(NO=NO1+NO2+NO3+NO4+NO5)%>%
mutate(SE=SE1+SE2+SE3+SE4)%>%
mutate(DK=DK1+DK2) %>%
dplyr::select("Date","Hours","FI","DK","EE","LV","LT","NO","SE")
# specify to avoid overlap with "MASS" package
# save all these data as R.data
save(consum_pre,consumption,price,produc_pre,production,file="BAN421.Rdata")
```
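The zone aggregation in the chunk above (for example `NO = NO1 + ... + NO5`) can be illustrated on toy data. This is a minimal sketch in base R with made-up values. The helper `country_total` is hypothetical and only assumes that zone columns are named with a country prefix followed by a number.

```r
# Toy forecast table with hypothetical values; the column names follow the
# bidding-zone naming used in the chunk above (NO1..NO5, SE1..SE4, DK1, DK2).
zones <- data.frame(
  NO1 = c(10, 12), NO2 = c(8, 9), NO3 = c(5, 6), NO4 = c(4, 4), NO5 = c(7, 8),
  SE1 = c(3, 3), SE2 = c(6, 7), SE3 = c(15, 16), SE4 = c(2, 2),
  DK1 = c(4, 5), DK2 = c(3, 3)
)

# Sum all zone columns sharing a country prefix into one country total
country_total <- function(df, prefix) {
  zone_cols <- grep(paste0("^", prefix, "[0-9]+$"), names(df), value = TRUE)
  rowSums(df[zone_cols])
}

totals <- data.frame(
  NO = country_total(zones, "NO"),
  SE = country_total(zones, "SE"),
  DK = country_total(zones, "DK")
)
totals
```

Matching the columns by prefix avoids typing every zone name by hand, at the cost of relying on the naming convention.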
# Plotting
The next step in our research is to create plots to get a general picture of how price, production, and consumption changed over the five-year timeframe. After creating the yearly totals, we first look at the variation in consumption in Norway.
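As a small illustration of how such yearly totals are built, the following sketch derives a `year` column from the date and sums per year. It uses base R (`aggregate`) instead of the dplyr pipeline of the report, and all values are made up.

```r
# Toy records with hypothetical values
toy <- data.frame(
  Date = as.Date(c("2014-01-01", "2014-07-01", "2015-01-01", "2015-07-01")),
  NO = c(400, 250, 410, 260),
  SE = c(380, 240, 390, 245)
)
toy$year <- format(toy$Date, "%Y")

# Sum each country's consumption per year
yearly <- aggregate(cbind(NO, SE) ~ year, data = toy, FUN = sum)
yearly
```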
```{r Data Plotting, warning = FALSE, message = FALSE, echo = FALSE}
#load("BAN421.Rdata") # load data, if start only from here
# create another "year" variable
consumption<-consumption%>%
mutate(year = format(Date, "%Y"))
production<-production%>%
mutate(year = format(Date, "%Y"))
price<-price%>%
mutate(year = format(Date, "%Y"))
# i) consumption
# create the yearly plotting data
plot_c <- consumption %>%
  group_by(year) %>%
  summarise(across(c(NO, SE, DK, FI, LV, LT), sum)) # total consumption per country and year
# plot consumtion trend in Norway
g1 <- ggplot(data=consumption,mapping=aes(x=consumption$Date))+
geom_line(aes(y=consumption$NO, color="NO"), color="blue")+
ylab("Electricity consumption")+
xlab("")+
ggtitle("Daily Consumption in Norway")
#and all countires
g2 <- ggplot(data=plot_c,mapping=aes(x=year, group=1))+
geom_point(aes(y=plot_c$NO,color="NO"))+
geom_smooth(aes(y=plot_c$NO,color="NO"),method=lm, se=FALSE, fullrange=TRUE) +
geom_point(aes(y=plot_c$SE,color="SE"))+
geom_smooth(aes(y=plot_c$SE,color="SE"),method=lm, se=FALSE, fullrange=TRUE) +
geom_point(aes(y=plot_c$DK,color="DK"))+
geom_smooth(aes(y=plot_c$DK,color="DK"),method=lm, se=FALSE, fullrange=TRUE) +
geom_point(aes(y=plot_c$FI,color="FI"))+
geom_smooth(aes(y=plot_c$FI,color="FI"),method=lm, se=FALSE, fullrange=TRUE) +
geom_point(aes(y=plot_c$LV,color="LV"))+
geom_smooth(aes(y=plot_c$LV,color="LV"),method=lm, se=FALSE, fullrange=TRUE) +
geom_point(aes(y=plot_c$LT,color="LT"))+
geom_smooth(aes(y=plot_c$LT,color="LT"), method=lm, se=FALSE, fullrange=TRUE) +
labs(title = "Yearly consumption for all countries")+
scale_color_discrete(name="Country")+
ylab("Electricity consumption")+
xlab("Year")+
scale_y_log10(breaks = trans_breaks("log10", function(x) 10^x),
labels = trans_format("log10", math_format(10^.x)))
grid.arrange(g1, g2,
widths = 1,
nrow = 2)
```
The upper plot shows seasonal fluctuations, with higher electricity demand in the winter months than in the summer months. The lower plot compares the yearly consumption of all countries.
Norway and Sweden have the highest yearly consumption. Consumption is mostly constant for all countries, with a slight upward trend over the years. Next, we plot the same for production.
```{r, warning = FALSE, message = FALSE, echo = FALSE}
# 2. Production
# create the yearly plotting data
plot_p <- production %>%
  group_by(year) %>%
  summarise(across(c(NO, SE, DK, FI, LV, LT), sum)) # total production per country and year
# use ggplot to plot production trend in Norway
g1 <- ggplot(data=production,mapping=aes(x=production$Date))+
geom_line(aes(y=production$NO, color="NO"), color="blue")+
xlab("")+
ylab("Electricity production")+
ggtitle("Daily Production in Norway")
g2 <- ggplot(data=plot_p,mapping=aes(x=year, group=1))+
geom_point(aes(y=plot_p$NO,color="NO"))+
geom_smooth(aes(y=plot_p$NO,color="NO"),method=lm, se=FALSE, fullrange=TRUE) +
geom_point(aes(y=plot_p$SE,color="SE"))+
geom_smooth(aes(y=plot_p$SE,color="SE"),method=lm, se=FALSE, fullrange=TRUE) +
geom_point(aes(y=plot_p$DK,color="DK"))+
geom_smooth(aes(y=plot_p$DK,color="DK"),method=lm, se=FALSE, fullrange=TRUE) +
geom_point(aes(y=plot_p$FI,color="FI"))+
geom_smooth(aes(y=plot_p$FI,color="FI"),method=lm, se=FALSE, fullrange=TRUE) +
geom_point(aes(y=plot_p$LV,color="LV"))+
geom_smooth(aes(y=plot_p$LV,color="LV"),method=lm, se=FALSE, fullrange=TRUE) +
geom_point(aes(y=plot_p$LT,color="LT"))+
geom_smooth(aes(y=plot_p$LT,color="LT"), method=lm, se=FALSE, fullrange=TRUE) +
labs(title = "Yearly production for all countries")+
scale_color_discrete(name="Country")+
ylab("Electricity production")+
xlab("Year")+
scale_y_log10(breaks = trans_breaks("log10", function(x) 10^x),
labels = trans_format("log10", math_format(10^.x)))
grid.arrange(g1, g2,
widths = 1,
nrow = 2)
```
Here, too, we observe only slightly fluctuating volumes, with Norway and Sweden being the main producers. Next, we look at the development of the electricity price in Norway for two cities.
```{r Price, warning = FALSE, message = FALSE, echo = FALSE}
# Use ggplot to plot price trend in Norway
g1 <- ggplot(data=price,mapping=aes(x=price$Date))+
geom_line(aes(y=price$Tromsø, color="Tromsø"))+
geom_line(aes(y=price$Bergen, color="Bergen"))+
xlab("")+
ylab("Electricity price")+
ggtitle("Daily price in Norway")+
scale_color_discrete(name="City")
# Create the yearly plotting data
plot_price <- price %>%
  group_by(year) %>%
  summarise(across(c(Bergen, Tromsø), mean)) # mean price per city and year
# A general trend in each year
g2 <- ggplot(data=plot_price,mapping=aes(x=year, group=1))+
geom_point(aes(y=plot_price$Bergen, color="Bergen"))+
geom_line(aes(y=plot_price$Bergen, color="Bergen"))+
geom_point(aes(y=plot_price$Tromsø, color="Tromsø"))+
geom_line(aes(y=plot_price$Tromsø, color="Tromsø"))+
ggtitle("Yearly average price in Norway")+
scale_color_discrete(name="City")+
ylab("Mean elect. price")+
xlab("Year")
grid.arrange(g1, g2,
widths = 1,
nrow = 2)
```
Interestingly, the electricity price fluctuates heavily and shows a few strong spikes. Furthermore, prices in Bergen and Tromsø differ.
From the above figures, we can see that consumption and production in Norway differ quite a lot from each other, which means that Norway has been trading (importing and exporting) electricity with other countries. This supports our hypothesis that other countries' consumption and production may have an impact on the price in the Norwegian market. The spot price in Norway has been fluctuating since 2014 and shows an increasing trend since 2015. In particular, there are two peak periods in Tromsø. The reasons behind this are not entirely clear, and we believe that many more factors influence the price. In our model, however, we focus only on the impact of other countries' consumption and production.
# 3. Hidden Markov Model
## Theory
A Hidden Markov Model (HMM) is a statistical, probabilistic framework in which the observed data is modelled as a series of outputs generated by several hidden states (Franzese, 2019). The HMM is based on augmenting the Markov chain, a model that provides the probabilities of sequences of random variables (Jurafsky, 2019). In the applications of Markov models we reviewed, plain Markov chains are used more often for prediction than the standard HMM, while hidden models are applied to information filtering.
Unlike Markov chains, hidden Markov models estimate the probabilities of unobservable events, which are referred to as “hidden”. In our model, the hidden states are consumption-production states: the difference between the produced and the consumed amount of electricity yields two key states, deficit and glut (oversupply).
Nowadays, hidden Markov models are widely used in biology (DNA sequencing), natural language processing (POS tagging) and market analysis (defining market regimes, such as bullish or bearish trends). Apart from sequencing and prediction, HMMs can be used for information filtering: for instance, maritime systems apply HMMs to identify vessels that are using port services, providing real-time vessel identification. In these fields, HMMs have shown comparably high accuracy (Yeo, 2019).
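As a minimal formal sketch of the model described above (the symbols are our own notation, not taken from the cited sources), let $S_t \in \{1, 2\}$ denote the hidden state on day $t$ and $y_t$ the observed value. A two-state HMM with Gaussian responses is then specified by transition probabilities and state-dependent response distributions:

$$
P(S_t = j \mid S_{t-1} = i) = a_{ij}, \qquad y_t \mid (S_t = j) \sim \mathcal{N}(\mu_j, \sigma_j^2), \qquad i, j \in \{1, 2\}.
$$

The joint probability of a state sequence and the observations factorises as

$$
P(y_{1:T}, S_{1:T}) = \pi_{S_1} \, f_{S_1}(y_1) \prod_{t=2}^{T} a_{S_{t-1} S_t} \, f_{S_t}(y_t),
$$

where $\pi$ is the initial state distribution and $f_j$ is the density of $\mathcal{N}(\mu_j, \sigma_j^2)$. Fitting the model means estimating $\pi$, $a_{ij}$, $\mu_j$ and $\sigma_j$ from the observed series alone.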
## Packages
To identify suitable packages for implementing this method in R, we looked through the "Comprehensive R Archive Network Task View" and found three packages in total: "depmixS4", "HMM", and "HiddenMarkov". After comparing and testing them, we chose "depmixS4", which is regularly updated and often used for HMM implementations in R. The depmixS4 package provides functionality to fit dependent mixture models, i.e. HMMs whose responses follow GLMs or similar distributions.
The key functions used for the modelling are `depmix`, `fit` and `posterior`. `depmix` creates the "raw" (unfitted) hidden Markov model from the given settings, which are:
-- response: the response formula (interface to the GLM);
-- data: the data frame from which the responses are taken;
-- family: the type of response distribution, in our model the normal (Gaussian) distribution;
-- nstates: the number of states of the model (in our case just 2, deficit and oversupply).
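The three functions can be tried out on simulated data before applying them to the electricity series. The sketch below is an illustration under assumptions: the data frame `toy` is simulated, not taken from the report, and the `type = "viterbi"` argument of `posterior` is assumed to be available, as in recent versions of depmixS4.

```r
library(depmixS4)

set.seed(1)
# Simulated series (hypothetical data): 100 draws around -2, then 100 around 2
toy <- data.frame(y = c(rnorm(100, mean = -2), rnorm(100, mean = 2)))

# 1) specify a 2-state model with a Gaussian response and intercept-only mean
mod <- depmix(response = y ~ 1, data = toy, nstates = 2, family = gaussian())

# 2) estimate the parameters with the EM algorithm
fitted_mod <- fit(mod, verbose = FALSE)

# 3) decode the most likely state sequence (Viterbi)
est <- posterior(fitted_mod, type = "viterbi")
table(est$state)
```

The state numbers returned in `est$state` carry no meaning by themselves: which of the two regimes is called state 1 can change between runs, so the states have to be interpreted via their estimated means or via a comparison with known labels.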
## Approach
Before we get to the model itself, we need a bit more data preprocessing. Consumption and production data are joined on their common date and hour, and new columns are added as follows:
-- `NO`, `DK`, `SE`, `NORD` - the difference between production and consumption for Norway, Denmark, Sweden and the Nordic countries as a whole, respectively;
-- `NO_state`,`DK_state`,`SE_state`,`NORD_state` - the state (deficit or surplus), based on the sign of the value in the aforementioned columns.
Furthermore, before modelling we sum the data by day to remove intraday "noise" and irrelevant fluctuations, and convert the columns to suitable types: character, date and numeric.
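The preprocessing steps listed above can be sketched on a toy example. The values and the column names `production`, `consumption` and `balance` are hypothetical, and base R is used instead of the dplyr pipeline of the report.

```r
# Hypothetical hourly values for two days
toy <- data.frame(
  Date = as.Date(c("2018-01-01", "2018-01-01", "2018-01-02", "2018-01-02")),
  production = c(120, 130, 90, 95),
  consumption = c(100, 125, 110, 100)
)

# Hourly balance: production minus consumption
toy$balance <- toy$production - toy$consumption

# Sum to one value per day, then label the sign of the daily balance
daily <- aggregate(balance ~ Date, data = toy, FUN = sum)
daily$state <- ifelse(daily$balance > 0, "Surplus", "Deficit")
daily
```

Here the first day ends with a positive daily balance (surplus) and the second with a negative one (deficit), even though the hourly values vary within each day.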
```{r Statistical methods, message=FALSE, warning=FALSE, include=FALSE}
## 3.Statistical methods ----
# Combine tables with all the data and preproceess the data----
comb <- inner_join(consumption, production,
by = c("Date", "Hours"))
# Add new parameters to the combined table based on the production and consumption parameters
comb <- comb %>%
mutate(NO = NO.y - NO.x,
DK = DK.y - DK.x,
SE = SE.y - SE.x,
NORD = Nordic.y - Nordic.x)
# Clean the table
comb <- comb[c("Date", "Hours", "NO", "DK", "SE", "NORD")]
# Summarise by date to cancel minor chages and noise
comb <- comb %>%
group_by(Date) %>%
summarise(NO = sum(NO),
DK = sum(DK),
SE = sum(SE),
NORD = sum(NORD))
# Add parameters showing the state (surplus or deficit) of each country
comb <- comb %>%
  mutate(NO_state = ifelse(NO > 0, "Surplus", "Deficit"),
         DK_state = ifelse(DK > 0, "Surplus", "Deficit"),
         SE_state = ifelse(SE > 0, "Surplus", "Deficit"),
         NORD_state = ifelse(NORD > 0, "Surplus", "Deficit"))
# Convert to the proper formatting: states are characters, balances stay numeric
comb <- as.data.frame(comb)
comb$Date <- as.Date(comb$Date)
# Run and fit HMM----
hmm <- function (param, data){
  # We use the day-to-day changes of the balance as the response of the HMM
  # (padded with 0 so the length matches the number of rows)
  data$fluc <- c(0, diff(as.numeric(param)))
  # Set the model
  model <- depmix(fluc ~ 1,
                  family = gaussian(), # normal (gaussian) response distribution
                  nstates = 2, # number of states
                  data = data) # used data
  # Fit the model and get the estimated states as a data frame
  fit.hmm <- fit(model)
  est.states <- posterior(fit.hmm)
  # Cross-tabulate estimated and actual states (kept for inspection)
  tbl <- table(est.states$state, data$NO_state)
  colnames(est.states)[2:3] <- c("Deficit", "Surplus")
  est.states$Date <- data$Date
  # Name the states; the state numbering of a fitted HMM is arbitrary,
  # so the mapping 1 = Deficit, 2 = Surplus should be checked against tbl
  est.states$state <- gsub("1", "Deficit", est.states$state)
  est.states$state <- gsub("2", "Surplus", est.states$state)
  return(est.states)
}
# Plot the results----
display.hmm <- function(original,
model){
# Define colours
mycols <- c("darkmagenta", "turquoise")
# Plotting the actual states for the further comparison with the HMM estimates (we look only into Norwegian states)
g1 <- ggplot(original, aes(x = Date, y = NO_state, fill = NO_state, col = NO_state)) +
geom_bar(stat = "identity", alpha = I(0.75)) +
scale_fill_manual(values = mycols, name = "State:\nProduction of\nNorway", labels = c("Deficit", "Surplus")) +
scale_color_manual(values = mycols, name = "State:\nProduction of\nNorway", labels = c("Deficit", "Surplus")) +
theme(axis.ticks = element_blank(), axis.text.y = element_blank()) +
labs(y = "Actual State")
# Visualise the HMM model estimates
g2 <- ggplot(model, aes(x = Date, y = state, fill = state, col = state)) +
geom_bar(stat = "identity", alpha = I(0.75)) +
scale_fill_manual(values = mycols, name = "State:\nProduction of\nNorway", labels = c("Deficit", "Surplus")) +
scale_color_manual(values = mycols, name = "State:\nProduction of\nNorway", labels = c("Deficit", "Surplus")) +
theme(axis.ticks = element_blank(), axis.text.y = element_blank()) +
labs(y = "Estimated State")
return(grid.arrange(g1, g2,
widths = 1,
nrow = 2))
}
```
To develop the HMM, we go through the following stages:
-- HMM creation;
-- Fitting the model;
-- Presenting the output as a dataframe;
-- Plotting the results and comparing to the actual data.
As a final result, we want to find the interdependence between the consumption-production parameters of other countries and those of Norway. This way we can assess how much Norway depends on energy import/export and how it correlates with the neighbouring countries and the region as a whole, and determine which country's parameters describe the energy state of Norway best.
For this purpose, we developed two main functions. The first function (`hmm`) goes through the first three stages of the model creation using the key functions of the depmixS4 package (`depmix`, `fit` and `posterior`), while the second function visualises the results and plots them next to the actual state for further analysis and comparison. The `hmm` function takes two inputs, a parameter and data: the parameter is the variable used as the GLM response, while the data is the data frame that contains this variable. The function `display.hmm` requires the dataframe with the original data and the result of the `hmm` function.
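Besides the visual comparison, the estimated and actual states can also be compared numerically. The sketch below is not part of our functions; it is one possible way, shown on made-up sequences, to map the arbitrary state numbers of a fitted HMM to the actual labels and to compute the share of matching days.

```r
# Hypothetical actual and estimated state sequences
actual    <- c("Surplus", "Surplus", "Deficit", "Surplus", "Deficit", "Surplus")
estimated <- c(2, 2, 1, 2, 2, 2)  # raw state numbers as returned by a fitted HMM

# Cross-tabulate estimated state numbers against the actual labels
tbl <- table(estimated, actual)

# Map each state number to the actual label it co-occurs with most often
state_label <- colnames(tbl)[apply(tbl, 1, which.max)]
names(state_label) <- rownames(tbl)
estimated_label <- unname(state_label[as.character(estimated)])

# Share of days on which the relabelled estimate matches the actual state
agreement <- mean(estimated_label == actual)
agreement
```

Note that with this majority mapping both state numbers can end up with the same label if one actual state dominates the series, which is itself a sign that the fitted states do not separate deficit from surplus.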
## Results and interpretation
Both our functions are designed for the previously combined data frame `comb`, where we store production-consumption data for the Scandinavian countries and the Nordic region in total. It is important to note that our functions provide results specifically for Norway and compare them with the actual states for Norway that we derived during the pre-processing steps.
In order to understand the export/import interdependence, we ran the hidden Markov model on the parameters of the Nordic region, Denmark and Sweden. To capture the fluctuations rather than the level of the difference between production and consumption, we apply the `diff` function to the production-consumption differences over time. This is similar to applications of the HMM to market regime analysis, where `diff` is applied to the closing price.
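To make the differencing step concrete, here is a tiny example with made-up daily balances. The leading 0 keeps the result the same length as the input, as in our `hmm` function.

```r
balance <- c(25, -10, 5, 40)   # hypothetical daily production-consumption balance
fluc <- c(0, diff(balance))    # day-to-day change, padded with 0 for the first day
fluc
```

The model therefore sees how strongly the balance moved from one day to the next, not whether the balance itself was positive or negative.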
### Nordic region
```{r NORD, warning = FALSE, message = FALSE, echo = FALSE}
# Run functions for NORD
display.hmm(original = comb,
model = hmm(param = comb$NORD,
data = comb))
```
Above we can see the results based on the parameter of the Nordic countries, which is stored in the column `NORD` of the data frame `comb`. The upper graph shows the actual changes in state over time, while the bottom graph displays the states estimated by the HMM.
The model's estimates match the actual changes in state for Norway rather poorly. This can be explained by the fact that we use cumulative fluctuations (as mentioned previously, the fluctuations are calculated as differences of the production-consumption balance): with the cumulative Nordic parameter, the only dependency captured is the one on macro trends of the Nordic region. It is therefore logical that the model performs poorly, as the Norwegian production state is affected more by micro fluctuations that occur on a day-to-day basis, and the cumulative parameter flattens these out. Based on this graph, we can suppose that the models for the neighbouring countries should be more precise, as they carry both the regional micro and macro trends.
### Denmark
```{r DK, warning = FALSE, message = FALSE, echo = FALSE}
# Run functions for DK
display.hmm(original = comb,
model = hmm(param = comb$DK,
data = comb))
```
The model based on the single-country parameter for Denmark behaves differently: the overall estimation trend is kept, yet there are more "Surplus" states, showing that the model is closer to the actual results. The results are better distributed across the whole period than in the previous model, yet there is still a comparably high number of "Deficit" states, unlike in the actual data.
The second model supports the earlier hypothesis that single-country data performs better. Compared to the general model, "Surplus" states are "added" in the missing periods; these are connected to the national Danish consumption-production fluctuations, which capture the market behaviour better than the macro model. Yet Denmark's energy position differs from Norway's, so the performance can be explained by the two countries being neighbours that follow the same regional patterns (as in the NORD model) and smaller daily patterns.
### Sweden
```{r SE, echo= FALSE, message=FALSE, warning=FALSE}
# HMM based on SE
display.hmm(original = comb,
model = hmm(param = comb$SE,
data = comb))
```
The final graph shows good results, with the "Surplus" state dominating most of the time series while the deficit patterns stay quite similar to the actual results. The model for Sweden can be seen as a logical continuation of the previous two models: the first model provided the key regional patterns, the second model added micro parameters for a country with different energy policies, and the last model "fills in" the gaps.
Based on the models, we can conclude that the Norwegian production state has a higher dependence on, and correlation with, Sweden than with the other countries. Such a conclusion is logical, as Sweden is the second biggest energy supplier in the Nordic sector. Apart from that, Norway is comparably more independent within the energy sector than the other countries across the Nordic region, which can also explain the performance of the HMMs.
### Comparison of the actual models
```{r SE and NO, echo= FALSE, message=FALSE, warning=FALSE}
g1 <- ggplot(comb, aes(x = Date, y = SE_state, fill = SE_state, col = SE_state)) +
geom_bar(stat = "identity", alpha = I(0.75)) +
scale_fill_manual(values = c("darkmagenta", "turquoise"), name = "State:\nProduction of\nSweden", labels = c("Deficit", "Surplus")) +
scale_color_manual(values = c("darkmagenta", "turquoise"), name = "State:\nProduction of\nSweden", labels = c("Deficit", "Surplus")) +
theme(axis.ticks = element_blank(), axis.text.y = element_blank()) +
labs(y = "Actual State")
g2 <- ggplot(comb, aes(x = Date, y = NO_state, fill = NO_state, col = NO_state)) +
geom_bar(stat = "identity", alpha = I(0.75)) +
scale_fill_manual(values = c("darkmagenta", "turquoise"), name = "State:\nProduction of\nNorway", labels = c("Deficit", "Surplus")) +
scale_color_manual(values = c("darkmagenta", "turquoise"), name = "State:\nProduction of\nNorway", labels = c("Deficit", "Surplus")) +
theme(axis.ticks = element_blank(), axis.text.y = element_blank()) +
labs(y = "Actual State")
grid.arrange(g1, g2,
widths = 1,
nrow = 2)
```
The graph compares the actual states of Sweden (above) and Norway (below). Both countries have a production surplus most of the time in the given period, yet the distribution of the production deficit differs: in Norway the deficit state is spread across the whole time frame, while the Swedish production deficit is clearly periodic.
# 4. Conclusion
Taking everything into account, we conclude that Norway is one of the least dependent countries in the benchmark set when it comes to electricity. This might mainly be due to the vast amount of hydropower it generates, which is a flexible way of producing electricity and therefore well suited to balancing highs and lows in electricity demand without having to rely on neighbouring countries. Those neighbouring countries, for example Denmark, are on the other hand much more dependent on Norway and Sweden for importing electricity during their peak periods.
# References
Iuliano, A., & Franzese, M. 2019. Correlation Analysis. In S. Ranganathan, M. Gribskov, K. Nakai, & C. Schönbach (Eds.), Encyclopedia of Bioinformatics and Computational Biology. Elsevier.
Jurafsky, D., & Martin, J. H. 2019. Speech and Language Processing, 3rd ed. draft.
Hassan, M. R., & Nath, B. 2005. Stock market forecasting using hidden Markov model: a new approach. Proceedings of 5th International Conference on Intelligent Systems Design and Applications (ISDA’05), IEEE, pp. 192-196.
Zucchini, W., MacDonald, I. L., & Langrock, R. 2017. Hidden Markov models for time series: an introduction using R. Chapman and Hall/CRC.
Yeo, G., et al. 2019. MPA-IBM Project SAFER: Sense-making Analytics for Maritime Event Recognition. INFORMS Journal on Applied Analytics.
Teusch, J., Behrens, A., & Egenhofer, C. 2012. The benefits of investing in electricity transmission: lessons from Northern Europe. CEPS Special Reports, Forthcoming.
Christie, R. D., & Wangensteen, I. 1998. The energy market in Norway and Sweden: Introduction. IEEE Power Engineering Review, 18(2), 44-45.