{\rtf1\ansi\ansicpg1252\cocoartf1404\cocoasubrtf470
\cocoascreenfonts1{\fonttbl\f0\fswiss\fcharset0 Helvetica;\f1\fnil\fcharset128 HiraKakuProN-W3;\f2\froman\fcharset0 Times-Roman;
\f3\fnil\fcharset134 STHeitiSC-Light;\f4\froman\fcharset0 TimesNewRomanPSMT;\f5\fnil\fcharset0 Tahoma;
\f6\fswiss\fcharset0 ArialMT;\f7\fnil\fcharset0 Verdana;\f8\fmodern\fcharset0 Courier;
}
{\colortbl;\red255\green255\blue255;\red154\green154\blue154;\red179\green179\blue179;\red128\green128\blue128;
\red164\green8\blue0;\red255\green39\blue18;\red217\green11\blue0;\red63\green105\blue30;\red206\green59\blue0;
\red155\green44\blue1;\red223\green223\blue223;\red45\green109\blue141;\red0\green57\blue161;\red7\green78\blue230;
\red12\green82\blue6;\red195\green196\blue195;\red247\green247\blue247;\red73\green0\blue0;\red67\green10\blue31;
\red110\green5\blue0;\red85\green142\blue40;}
\paperw11900\paperh16840\margl1440\margr1440\vieww10800\viewh7720\viewkind0
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\pardirnatural\partightenfactor0
\f0\fs24 \cf0 Self-managed and fine grained SLA guarantee in the cloud\
adaptive fine grained SLA Pareto optimisation and economic equilibrium enforcement\
\
Problems list:\
1) no consideration of relationships/interference amongst different attributes and services\
Ignoring the relationships of attributes and services means the possible reactions to enforce the SLA need to be predefined. Which parameters to consider can be predefined (e.g. bandwidth, CPU), but predefining which actions to perform should be avoided.\
\
2) no consideration of the business reasons behind such attributes\
\
3) no consideration of consistency\
\
4) no prediction, thus how much resource do we need to add? Note that when adding/removing resource we still consider per-application granularity, since it is very difficult to ensure such resources are used by specific services\
\
\
5) focus only on a single/limited set of attributes, such as resource and performance\
\
6) not many attempts to define fine grained services; rather an application is seen as having a constant SLA, thus not fine grained.\
\
7) most approaches suffer from a look-backward issue - they configure the new interval based on past environmental conditions, expecting the same violation not to recur. This may consequently result in too many re-configurations over time. \
\
8) solutions for MOO may lack consideration of distributed architectures, while other solutions suffer from issue 7).\
\
9) market price fluctuations, such as with spot instances, can result in SLA violations, as providers cannot determine demand. Consumers need to manually estimate their needs, which is unrealistic; currently this is based only on the bid price, which cannot fully represent demand.\
\
10) in an on-demand manner, the IaaS SLA may not be needed, as there is no need to specify how much resource is required but only how much one is willing to pay (the exception would be reserved resources)\
\
11) no consideration of interference (such as static modelling of SLA), or no intention to handle such interference, instead trying to avoid it when modelling (such as in the dynamic modelling papers)\
\
12) resource allocation papers usually do not consider user-friendly SLA terms.\
\
ANN may be better at regression for long-term prediction; however, regression is more efficient and accurate enough over short time periods.\
\
Our approach considers both the consumer (Pareto optimality and minimum cost via changing control values) and the provider (maximum profit via changing prices)\
\
Helpful techniques from literature:\
use of the Exponentially Weighted Moving Average (EWMA) to avoid reacting to occasional peaks over a very short period of time.\
\
0596173, ccgrid, ec2price and CIT2009 could be helpful for question 4)\
\
AutonomicSLA-Cloud could be helpful to determine the monitoring interval\
\
HPCS - IWCMC Vincent, Compsac 2010 I. Brandic and Emeakaroha_CloudComp2010 could be useful for the overall architecture of self-managing SLA\
\
HPCS - IWCMC Vincent can also help to translate SLA parameters into monetary cost\
\
ccgrid could be useful for adapting a coefficient strategy for load balancing. This may be useful when deciding which node to resize\
\
Cloud11_Autoscaling provides a good prediction model: the autoregressive moving average (ARMA) method can be used to predict workload. 6119065 also uses queueing theory to predict latency.\
\
iwcs, 6119065 can be used for the multi-objective optimisation problem.\
\
the service demand law may be helpful for determining the resource used per service; this is mentioned in ICPE11_MAQPRO, which can also be useful for MVA\
\
various papers on dynamic modelling can be used to create a dynamic model of relations, such as q-cloud.\
\
GECON10, TR10 can be useful for unifying resources such as cpu\
\
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\pardirnatural\partightenfactor0
\fs30 \cf0 Potential theory involved (each subject to minor modifications):\
\
relations of attributes: (for the fitness function)
\fs24 \
Mean Value Analysis, queueing theory, birth-death process, Erlang
\fs38 \
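As a concrete sketch of how Mean Value Analysis relates attributes (load vs. response time) for the fitness function, here is exact MVA for a closed single-class queueing network; the per-station `demands` and `num_customers` are illustrative assumptions:

```python
# Exact Mean Value Analysis (sketch) for a closed queueing network.
# demands: mean service demand (sec) at each station; num_customers >= 1.
def mva(demands, num_customers):
    queue = [0.0] * len(demands)            # mean queue length per station
    for n in range(1, num_customers + 1):
        # arrival theorem: response time at each station with n customers
        resp = [d * (1 + q) for d, q in zip(demands, queue)]
        total_resp = sum(resp)
        throughput = n / total_resp         # system throughput for n customers
        queue = [throughput * r for r in resp]
    return throughput, total_resp
```

With a single station of demand 0.5 s the throughput saturates at 1/0.5 = 2 req/s, the bottleneck bound one would feed into the SLA fitness function.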
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\pardirnatural\partightenfactor0
\f1\fs24 \cf0 \
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\pardirnatural\partightenfactor0
\f0 \cf0 \
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\pardirnatural\partightenfactor0
\fs30 \cf0 Finding optimal config of services and attributes:
\fs24 \
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\pardirnatural\partightenfactor0
\fs26 \cf0 Multi-objective optimization (popular), genetic algorithm
\fs24 \
\
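The Pareto-dominance core of such a GA-based multi-objective optimiser can be sketched as follows; objectives are assumed to be minimised (e.g. cost, response time):

```python
# Pareto dominance and non-dominated filtering (sketch), the building block of
# NSGA-style genetic multi-objective optimisation.
def dominates(a, b):
    # a dominates b: no worse in every objective, strictly better in one
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    return [p for p in points if not any(dominates(q, p) for q in points)]
```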
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\pardirnatural\partightenfactor0
\fs30 \cf0 resource efficiency:
\fs24 \
supply demand theory, partial equilibrium \
\
\fs30 Adaptive systems:\
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\pardirnatural\partightenfactor0
\fs24 \cf0 adaptive MAPE loop, MPC, CBR, prediction theory,\
\
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\pardirnatural\partightenfactor0
\fs30 \cf0 Managing SLA in cloud:
\fs24 \
Heuristics-based:\
Market based theory, CBR, queue theory\
\
Control-theoretic:\
control theory\
\
\
\
\
\
\
(fine grained services could be helpful to determine what the consumer really demands)\
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\pardirnatural\partightenfactor0
\fs28 \cf0 Current thoughts:
\fs24 \
\
predict or not or hybrid? estimate != predict\
\
0. The SLA model could be an exposed model that handles \{SLA-infrastructure, SLA-platform\} - (model) - SLA-software (to the end user). It can be extended to form an end-to-end SLA as well, but each type of user group (i.e. gold, silver) needs to use a different instance of a service (2 different instances of service A)\
\
Pareto optimal fine grained SLA means each service contains conflicting objectives, such as the least adj to achieve the highest mon, or the strictest consistency together with the best performance, etc.\
\
1. (application of queueing theory to model the consistency queue) we can use a list of global workload time intervals, e.g. a day or a year, for each region queue; also record occurrences for the whole region and the percentage of occurrences of each related service. On adaptation, get the expected number of requests to that region (using a non-homogeneous Poisson process), then make a trade-off so that the SLA >= the response time of the mixture of consistency levels of the related services with the expected number - 1 (assume the request to A occurs last, the worst case)\
\
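A sketch of getting the expected number of requests from a non-homogeneous Poisson process: the expected count over an interval is the integral of the time-varying rate. The rate function passed in is an assumed workload profile:

```python
# Expected request count of a non-homogeneous Poisson process over [t0, t1]
# (sketch): numerically integrate the rate function with the midpoint rule.
def expected_requests(rate, t0, t1, step=0.01):
    n = round((t1 - t0) / step)
    return sum(rate(t0 + (i + 0.5) * step) * step for i in range(n))
```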
1.1 if not predicting, observe the queue size for each region and calculate the timeliness of the last observed request for each service; if violated, trigger the GA. We can still apply Little's law to adjust the monitoring interval. Or monitoring could be triggered by each request\
\
(or use a simple weighted average of the global workload time intervals for each service)\
\
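A sketch of using Little's law to size the monitoring interval, as note 1.1 suggests; the `safety` fraction (check more often than the mean time a request spends in the region queue) is an assumed parameter:

```python
# Little's law: L = lambda * W, so the mean wait is W = L / lambda (sketch).
def mean_wait(arrival_rate, mean_queue_len):
    return mean_queue_len / arrival_rate

def monitoring_interval(arrival_rate, mean_queue_len, safety=0.5):
    # re-check the SLA within a fraction of the implied mean wait
    return safety * mean_wait(arrival_rate, mean_queue_len)
```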
\
2. the trigger can be based on a violation being found or the global monitoring time interval being reached, in a predictable manner even when the SLA parameters are actively changed.\
\
3. We define component level regions (for cpu/memory) and service level regions (CR and SCR for consistency), which are hierarchical. Each region may have a different type, such as CR or SCR. We specify which services should be consistent and translate this to CDS in SSOR; of course there would be many CDS for a service in an SCR. Each service region may have different components. We do not consider consistency between components, as a node cannot ensure what sequence it sees if only one component is deployed on it.\
\
3.1 each service would have a set of interfering services. If the adj value is not service level and not directly associated between services, then the set is likely to be the services within the same region. For something like consistency, then for each service the set of interfering services should be each consistency level between it and another service, as well as a chain: s1 -2- s2 -3- s3 means s1 has level 2 with s2 and s2 has level 3 with s3. Note that permutations should not be duplicated.\
\
4. categories of SLA parameters (for each fine grained service; these could cover any adaptation strategy for SLA enforcement, i.e. admission control, resource allocation and task scheduling)\
\
we identify non-functional interference, i.e. consistency, and functional interference, i.e. one service invoking another.\
\
we refer to consistency, cpu and memory as all being resources. We see adj and mon variables as all being QoS; the goal here is to model objective functions for controlling them (both resource quality and measurable quality)\
\
By interference region, we mean the group of QoS objectives from a number of services that interfere with each other. For certain RQ in an interference region (e.g. consistency) there needs to be a sequencer for consensus, since it is a global quality. As long as an RQ of a service is used by multiple PMs, it is global.\
\
we define a component which consists of fine-grained services; the component level could also have adj and mon QoS, each of which would need to be considered during MOP solving for any included service (note that component and service level qualities should not be redundant). If a component adj is global, then in consensus it would have different nodes and possibly a different interference region if not all the services of a component are in the same region.\
\
we use a hierarchical structure to describe services nested within each other; therefore the interference of an adj of a parent service would contain all sub-services, and their SLAs should be put into the same MOO decision process. If such a nested relation spans different components, then at least one service needs to be defined for each component (so that we know such a service exists) so we can find the real one that needs to be scaled up/down or in/out.\
\
adjustable (could be a vector) (each associated with a cost for each party (cost needs to be normalised by computing the % of compared pairs; of course weights need to be considered); the only values that are runtime adjustable, always controllable and cannot be violated, usually ensured by the provider; the cost does not have to be fixed; we should always try to satisfy demand with minimum resource) also associated with an interference boundary (services)\
\
only a global adj needs a region and a sequencer\
\
\
\
if the adj is a constant value for each service in the interference region, then when applied to equation 2) it can be reduced to one term rather than considering all services (i.e. the number of replicas at VM level)\
eventually we decided to use cpu/memory per component, thus RQ can be component level, such as cpu/memory; thus all included services would use the same RQ.\
\
SLA as well but usually can not be violated)\
order error level \
(order error should be as weak as possible since it may be charged for, but the consistency requirement is expressed as consistency on a mon variable) the measurement of order error is the max number of differences to the target service, while the measurement of consistency is the max number of differences for the same request on different replicas, as this is easier for humans to interpret.\
\
\
resource, no of VMs (cpu, memory etc), PMs (for cpu, if we cannot assign frequency we can assign it based on ECU, e.g. 2.8 GHz = 2.8/1.2 = 2.333 ECU, and change it in a granularity of 2.333)\
monetary price willing to pay\
no of invocation (this is actually controllable workload)\
\
monitorable (max capacity is a form of (at least one) adjustable or other monitorables (actually we can only consider the association with adj, as it covers the influence on other mon of the same service as well as others from services (either adj or mon), since mon can be translated into adj anyway); may be a probability; each with utility points per party, also associated with a responsible party when a violation is caused by a certain adjustable value (that is, the cost party on the adjustable value))\
service time\
response time\
accuracy\
throughput\
\
also there are measured metrics that are usually not in the SLA but the cloud should ensure:\
scalability\
elasticity\
\
also other factors not in the SLA and usually exogenous (these can be associated with a service or region as well; they can only be measured; we can use a simple AR model to predict them)\
workload\
time\
input size\
\
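A sketch of the simple AR prediction mentioned above for exogenous factors such as workload: an AR(1) model y(k) = a*y(k-1) + b fitted by least squares (assumes a non-constant series, otherwise the variance term is zero):

```python
# Fit AR(1) by ordinary least squares (sketch) and predict one step ahead.
def fit_ar1(series):
    xs, ys = series[:-1], series[1:]        # lag-1 pairs (y(k-1), y(k))
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return a, b

def predict_next(series):
    a, b = fit_ar1(series)
    return a * series[-1] + b
```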
\
Note that an interference region contains all services in each region of an adj; each region of an adj consists of any service that interferes with a service in that region. Therefore the interfering services of a service are not necessarily all services in that adj region.\
\
monitorables can capture relations amongst parameters; each adjustable may have a boundary region (e.g. consistency region, component region) that captures the relation of a service to other services; then, in the equation mapping monitorable to adjustable, such a boundary region may or may not be used.\
\
Note that both provider and consumer can identify which services should interfere for an adj value. The consumer should provide a weight for each service it has; for the provider, the weight for services coming from different consumers is defined by the total profit = TR - TC of that consumer. \
there are different outputs of MOO for each region of weight-linked services, regardless of whether they come from the same consumer. \cf2 (therefore the no of weight-linked services determines the number of MOOs to be solved. Only a global adj needs a sub-region for the sequencer to reach global agreement)\cf0 \
\
(it should be specified if it is region level, as it always associates with a region; it also needs to be specified if the adj value is a global value, i.e. consistency level; if not global then it can be adjusted in each PM. A non-global value can change on each node; changing a global one will affect all nodes, and if such an effect violates an SLA then the corresponding nodes need to be added into the MOO solving)\
\
after each PM determines its MOOs, for each global adj the final decision is carried out on the sequencer of the adj region. We can apply the GA again, but the changed value would be only that adj within the min-max range for each interfered service; any mon affected by such adj should be counted in as well. Selection should apply the same rule as on each PM (of course with one more dimension on the no of PMs).\
\
***************\
QoS and control primitives have 3 measurement functions:\
1) measurement of state, e.g. response time (direct), throughput = number of completed services / t (this is usually specified by the SLA) (this should be the fact, even for consistency. We could measure the max difference between a request on one replica and the others (e.g. logical clock 7 on R1 but 1 on another; such clocks should be reset once the max order error is reached), then use it to feed the model. Such results should be no more than the setup in SSOR). A small consistency value does not mean a better response time; it could be small simply because the workload is small. Such relations could be learned by feeding facts into the dynamic objective model.\
\
\
2) prediction of the expected state (before the adjustable element changes, that is, under the current adjustable value state), e.g. throughput = number of requests / sum of the response times of each request (sec)\
(for adjustables these would be the assigned ones) (note the objective function for adj: more than one adj may share the same obj function) (for control primitives, this is a cost function (per service))\
\
3) the bounds on SLA (could )\
\
***************\
1) is not used in calculation. In under-provision monitoring we only compare 1) and 3); 1) worse than 3) means a violation, so adjustment is needed. In over-provision monitoring we compare the adj of 1) and 2); if 1) < 2) then adjustment is needed. (maybe 1) can be > 2), but as long as there is no mon such that 1) < 3), no adjustment is needed) \
\
\
adj are over-provision primitives and mon are under-provision primitives. We can say\
trigger: \
OP = the utility of an RQ is lower than a % (on OP, the variable changed is only the target RQ; of course there could be more if there are multiple OPs in the same interference set)\
UP = an MQ is violated by a % (on UP, the variables changed are the RQs associated with the target MQ; of course there could be more if there are multiple UPs in the same interference set)\
\
IM may decide that, for a certain interval, a mon is related to no adj; this is possible, in which case we simply ignore that mon in the objective\
\
but the objectives of services related to the considered RQ should be included in the optimisation due to interference. (note that the number of VMs may not be considered in OP; in such a case a VM should be removed when every resource is below the default quota, and the same for increasing VMs) \
\
2) is used when solving the MOO; it can then determine up/down and adjust the monitorable variables via the adjustable variables depending on the equation. (how the mon changes when an adj changes can be found by a linear dynamic model, so there is no need to define e.g. y = 1/x) (the up and down directions cannot be swapped, i.e. if there is a need to reduce resource then resource should not be added, which reduces over-optimising an SLA (such as consistency))\
2) can perhaps be predefined or learned online using MIMO (ARMA/ARIMA)?\
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\pardirnatural\partightenfactor0
\cf2 if 2) is a region level metric, such as cpu or memory, then for each service it can be computed via 1) (mean % of cpu usage of the service); in fact the auto-created model in 2) needs measured values from 1); that is, a region level adjustable value can be transferred to service level. \cf0 \
***************\
\
There should be transfer functions between 1), 2) and 3), e.g. throughput of a node = SLA throughput / number of nodes; even availability: 1 - number of nodes * downtime % = overall availability. However, this may have the drawback that service instances created after a MOP may cause the constraints to change. We assume that any violation caused by such a change should be detected and thus resolved by the next MOP process. Budget is not a QoS but an SLA parameter; it needs to be distributed and considered separately.\
\
for adj we need to follow the granularity of the executor, e.g. only allow each VM/component such adj at the same granularity, e.g. cpu/memory. \cf2 for each adj there should also be a function to transfer the control variable to a form the executor accepts (i.e. for cpu we can directly assign a value to each service and sum them up as the total cpu of the VM; we expect the right proportion would be consumed)\cf0 \
\
even when using mutual information (the data should be based on the measured values from 1)) to learn interfering services, there is still a need to define the service boundary. If a control variable is component level then there should be interfering components (i.e. no of nodes)\
\
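A sketch of the mutual information computation for learning interference from measured, discretised metric series; series with MI near zero can be treated as non-interfering:

```python
# Empirical mutual information (bits) between two discretised metric series
# (sketch): high MI suggests the two qualities/services interfere.
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )
```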
covariation could be helpful to decide how an RQ affects an MQ (ascending or descending?)\
\
such a model can capture any SLO interference, since any mon is represented by adj; then if two mons interfere, all adj of mon1 would be contained in mon2 and vice versa, so it can actually be replaced as mon2 = a*mon1..., which shows how mon1 would influence mon2.\
\
\cf2 in order to do comparisons and calculations anyway, between the SLA value, the measured value via 1) and the adj value used in 2), we need transfer functions.\cf0 \
\
note that cpu usage can be measured with a unified metric such as ECU.\
\
when training the model we do not consider 0-valued mons, since these do not reflect the actual data relations.\
\
we normalise the data as value / max of the observed values. The proportion between data across intervals stays the same. This normalisation only creates a base value; therefore in MOP the new value may be greater and can still exceed 100% as long as the base value is the same. (in addition we only predict one interval ahead)\
\
5. (partial equilibrium should always focus on the product type) if it is possible to determine how much of an adjustable parameter is demanded, then we can choose the best-fit type/spot using partial equilibrium from supply-demand theory. This fits the case where spot instances are used, where the price changes based on supply and demand.\
\
Price determination can be based on different consumers (their services); in such a case no negotiation is needed, as the SLA would always be satisfied and reach equilibrium for each consumer. Price should also be based on each PM and the granularity of that adj (e.g. component for cpu, memory).\
\
\
budget could be based on each PM, component and consumer as well as adj\
\
can use partial equilibrium with each spot instance type as a product. It seems Amazon does not allow scale up - purchasing a type is charged in whole regardless of whether you fully use it. We can consider fine grained resource prices per cpu/memory. \cf2 (we can calculate the demand/supply functions dynamically - demand function Q = b - Pa, supply function Q = d + Pc (to get b, a, d and c we can use linear least squares (polynomial regression)); then MR = MC can be used, where MR = quantity times the demand function; therefore the demanded resource would definitely be produced, under an assumption of profit maximisation. Remember to consider the case where there is no equilibrium point within the quantity range; then usually select the max of demand. \cf0 \
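A sketch of the clearing point for the linear demand/supply functions above, Q = b - aP and Q = d + cP; the coefficients are assumed to come from the least squares fit described in the note:

```python
# Partial-equilibrium clearing price and quantity (sketch) for linear
# demand Q = b - a*P and supply Q = d + c*P: solve b - a*P = d + c*P.
def equilibrium(a, b, c, d):
    price = (b - d) / (a + c)
    quantity = b - a * price
    return price, quantity
```

For example, demand Q = 10 - P against supply Q = 2 + P clears at P = 4, Q = 6.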
\
\cf2 for the first interval, the demand/supply functions would be calculated from a set of (budget of each region adj type) / total quantity of that adj type in each interval. Then the subsequent intervals can use the calculated equilibrium point price and the total amount of adj value in each previous interval. \
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\pardirnatural\partightenfactor0
\cf0 \
\
When the demand function is known we calculate the price P = (PED/(1+PED))*MC, where PED = P/Q * dQ/dP. Then for each individual of the MOO we set the criterion: budget >= sum over each service of (the calculated equilibrium price based on the current amount of adj * the amount of adj), in order to verify its feasibility. \
we can calculate the demand function Q = b - Pa using training data of the final Q, P at the end of each MOO interval. \
\
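A sketch of the pricing formula above, P = (PED/(1+PED))*MC with PED = P/Q * dQ/dP, evaluated for an assumed linear demand curve Q = b - aP (all coefficients illustrative):

```python
# Price elasticity of demand at an operating point for linear demand
# Q = b - a*P, and the note's markup pricing rule (sketch).
def elasticity(a, b, price):
    quantity = b - a * price
    return (price / quantity) * (-a)        # PED = (P/Q) * dQ/dP, dQ/dP = -a

def optimal_price(ped, marginal_cost):
    return (ped / (1 + ped)) * marginal_cost
```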
one MOO can have more budget functions as well if it crosses different consumers.\
\
when applying demand/supply with MOO we may need finer grained prices (instead of per hour, we may need per second etc.)
\f1 \
\
\f0 5.1. \cf2 when under-supplied, calculating the ratio of capacity (cpu/memory, for the additional demand) can also be used to deduce the most suitable instance type to add (e.g. the most expensive that fits the ratio); when over-supplied, to deduce the most suitable type to release (e.g. the cheapest, requiring the smallest number of instances). (whether to scale up or scale out could potentially be determined as well.) When it can't add more on the VM, it seeks to add more PMs of a suitable type. (all these outputs should be involved in the solution of the MOO)\cf0 \
\
assign resources to services as instance types; although the resources may not be fully used, this is how the price is calculated. An instance type can be scaled up/down to another type\
\
6. with a 2nd-order ARMAX model and the least squares algorithm, we can dynamically model the relationship between monitorable values and adjustable values, giving mon(k) = a1*mon(k-1) + a2*mon(k-2) + adj(k) + adj(k-1) \cf2 + d(k-1) + d(k-2).\cf0 \
\
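A sketch of fitting such a 2nd-order ARMAX-style model, mon(k) = a1*mon(k-1) + a2*mon(k-2) + b1*adj(k) + b2*adj(k-1), by least squares via the normal equations (exogenous d terms omitted for brevity; the `b` coefficients on adj are made explicit here as an assumption):

```python
# Least-squares fit (sketch) of mon(k) = a1*mon(k-1) + a2*mon(k-2)
#                                      + b1*adj(k)   + b2*adj(k-1).
# Solves the normal equations (X^T X) theta = X^T y by Gaussian elimination.
def fit_armax(mon, adj):
    rows, ys = [], []
    for k in range(2, len(mon)):
        rows.append([mon[k-1], mon[k-2], adj[k], adj[k-1]])
        ys.append(mon[k])
    m = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(m)]
    for col in range(m):                        # forward elimination w/ pivoting
        piv = max(range(col, m), key=lambda r: abs(xtx[r][col]))
        xtx[col], xtx[piv] = xtx[piv], xtx[col]
        xty[col], xty[piv] = xty[piv], xty[col]
        for r in range(col + 1, m):
            f = xtx[r][col] / xtx[col][col]
            xtx[r] = [v - f * w for v, w in zip(xtx[r], xtx[col])]
            xty[r] -= f * xty[col]
    theta = [0.0] * m                           # back substitution
    for i in reversed(range(m)):
        theta[i] = (xty[i] - sum(xtx[i][j] * theta[j]
                                 for j in range(i + 1, m))) / xtx[i][i]
    return theta
```

On noise-free synthetic data the fit recovers the generating coefficients, which is the online-identification behaviour the note relies on.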
adj(k) = sigma An1*adjn1(k) (n = no of services for adj1) + sigma An2*adjn2(k) + ... + sigma Axc*adjxc(k) (x = no of services for adjc within the interference domain of the target service), where c = no of adjustable values associated with mon; Axc is a row vector, adjxc is a column vector\
(this can be either adj value or mon value)\
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\pardirnatural\partightenfactor0
\cf2 d(k) = sigma An1*dn1(k) (n = no of services within the region of the associated adjustable values) + sigma An2*dn2(k) + ... + sigma Axc*dxc(k) (x = no of services within the region of the associated adjustable values), where c = no of exogenous parameters such as workload; Axc is a row vector, dxc is a column vector\cf0 \
\
the least squares algorithm can solve this by setting the partial derivatives of S (the sum of squared errors) to 0, solving for each a1, a2, ... The number of equations needed = 2 + 2*sigma Ki (i = 1, 2, ..., c), where Ki = no of services in each interference domain of an adjustable value for a service and i indexes the different types of region for adjustable values, \cf2 + 2 * no of exogenous parameters * no of all services involved (removing the redundant ones).\cf0 This is also the minimum required number of measurements. (no duplicated monitoring)\
\
updating the model and applying the model can both be done at each interval k, but adj and mon need to be normalised (such as using the mean value or % of the total), \cf3 or we can update the model on each request, which may better capture the cases where violation and over-provision occur (the AR would still use measured values from the previous k-1 interval) and then only apply it in interval k; normalisation is still needed. (this is experiment driven) \cf0 Adj values also need to be normalised: when request time decreases, resource should increase, such that req = 1 / resource etc.\
\
for each interval k, if there is no request to a service it would still be used for calculation, but at time step k its coefficient would be 0.\
\
\
deciding whether to use ARMA/ARIMA/ARMAX is an experiment-driven task.\
\
upon an SLA violation, this can be seen as a trigger of an interval k change, such as a load spike or an up/down turning point in the workload graph.\
\
as with ANN, we can also use a stepwise approach to determine the order\
\
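A sketch of stepwise order selection by AIC, as the notes suggest for picking the ARMA order (and, analogously, the ANN size); the `fit_rss` callback, which fits a candidate order and returns its residual sum of squares, is an assumed interface:

```python
# Stepwise order selection (sketch): for least-squares fits,
# AIC = n*ln(RSS/n) + 2*p; keep the candidate order with the lowest AIC.
from math import log

def aic(rss, n, p):
    return n * log(rss / n) + 2 * p

def stepwise_order(fit_rss, n, max_order):
    best_p, best_aic = None, float("inf")
    for p in range(1, max_order + 1):
        score = aic(fit_rss(p), n, p)
        if score < best_aic:            # best-so-far model, as in the notes
            best_p, best_aic = p, score
    return best_p
```

The 2*p penalty stops the selection from always preferring the largest model: a tiny RSS improvement from order 3 to 4 is rejected.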
in symmetric uncertainty, for a new or removed control primitive, the intervals in which it is unavailable would be represented as the value 0. And gradually, it can be determined to be sensitive/insensitive\
\
7. to make the model even more generic and flexible, one does not need to know which adj/mon values can influence a mon value; this can be learned online via AIC/RS and the maximum likelihood method, as well as a stepwise algorithm. (of course the sampling time for finding the number of parameters would be longer than for finding the coefficients)\
\
\
8. the main difference between ANN and ARMA is that ANN captures nonlinear relations. However, since the model is updated in very short epochs, it is possible to capture locally linear relations. ANN needs to determine the no of nodes in the hidden/input layers and the no of hidden layers, like the order in ARMA. These can also be dynamically defined via AIC/RS. When applying ANN we do the same as for ARMA: relevant mon and adj values as input (including interfering services), one mon as output. \
\
When using BP, there is a need to compute the MSE of all samples up to interval k, in order to decide whether to update the weights in interval k, so that the current model always has the min MSE. Or we can run training until the error is below a certain value or the max number of iterations has been reached.\
\
if we apply AIC/RS and stepwise/incremental selection to ANN, we can assume 1-2 hidden layers (like 2nd order in ARMA), then only go through combinations of input and hidden neurons \
\
\'93It has been mathematically proved that the single-hidden \
layer feed-forward networks are universal approximators \
that can learn any continuous functions with arbitrary \
accuracy\'94\
\
\cf2 for training the ANN we assume the weights can be learned online; therefore, at each training interval we only need to determine the number of inputs, the number of neurons and the activation function (the number of layers is set to 1 or 2), such that its AIC/RS is no worse than the minimum in the history. Therefore, instead of finding the optimal model in one run, we are only interested in finding the best-ever model and eventually reaching the best model. (The learning rate would be fixed, as we can have infinite samples.) \
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\pardirnatural\partightenfactor0
\cf0 this selection is only for ANN; when the mutual information finds changes, the model is re-selected. Find the best RS for each model. (ARMA can use it to find the order.) (For efficiency, selecting the ANN can use a similar way to PROP, where the granularity change is selected on the fly.)\
\
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\pardirnatural\partightenfactor0
\cf2 An incremental model selection could be applied, i.e. (starting with the full input) choose the number of neurons of a hidden layer that beats the best historical AIC/RS (or just the best one found, if that target cannot be reached), then do the same for the next layer, then reduce the inputs one by one. Note that the RSS should be based on the new model on the given interval (with the data in the buffer window). Or we can consider all combinations via a backward approach.\cf0 \
\
\cf4 (instead of AIC, we may apply the least RSS, since we are only interested in accuracy and the speed of convergence; if only RS is applied, the ANN need only run 1 round to see the number of neurons); mutual information is a good way to predict the interference set and the related resource quality.\
\
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\pardirnatural\partightenfactor0
\cf0 to train the ANN we can use a step-wise approach: set up a worst-case RS and RMAPE and record any model that does better than that. Increase the number of neurons until the increment no longer provides a better model. Of course, to prevent training forever, a maximum training time is needed, such that the best-ever model is returned if that time is reached.\
\
The above is only for the same ANN structure; subsequent runs only need to train from the previously selected number of neurons and return as soon as they get an RS and RMAPE better than the worst one. If the structure changes, then the entire process should be repeated.\
\
with predefined accuracy
\f1 (the chance that an SLA violation occurs)
\f0 , we can even dynamically choose either ARMA or ANN.\
\
We could apply a buffer window, so we won't run out of memory storing historic data. This is applicable since old data cannot represent the current dynamics.\
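A minimal sketch of such a buffer window (class name assumed) using a bounded deque:

```python
from collections import deque

class SampleBuffer:
    # Fixed-size window of samples: old entries are evicted automatically,
    # so memory stays bounded and stale dynamics drop out of the model.
    def __init__(self, size):
        self.data = deque(maxlen=size)

    def add(self, sample):
        self.data.append(sample)

    def window(self):
        return list(self.data)
```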
\
all monitors of each adj should be kept even if they are not used, since the next AIC run may decide to use some of them, so that the previous data set can still be used. It may also be better to use normalised monitoring data, or to attempt to use numbers as small as possible\
\
If we apply ACO then upper/lower bounds are needed on all QoS, especially resource quality; therefore
\f2\fs26 heuristic factors can be found as the output of an RQ value / the sum of the outputs of all other RQ values (if smaller is better, this can be 1 - the above)\
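The heuristic-factor formula above could be sketched as (function name and signature are assumptions):

```python
def heuristic_factor(value, all_values, smaller_is_better=False):
    # One RQ output divided by the sum of the outputs of all the other
    # RQ values; inverted (1 - ratio) when a smaller value is better.
    others = sum(all_values) - value
    ratio = value / others if others else 1.0
    return 1.0 - ratio if smaller_is_better else ratio
```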
\
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\pardirnatural\partightenfactor0
\f0\fs24 \cf5 we can even have a sequencer/planner for each interference set; it is used to decide the global RQ involved in the MOP. We do not consider cross-PM service interference; therefore MI only searches locally (or only on the same VM). Only predefined cross-PM interfered services are accepted. The sequencer is used when the MOP should be solved globally (involving global RQ).\cf0 (Or, if possible, try using MI to iterate over every service in the cloud; this is carried out on each node and should of course exclude its own replicas. We need to ensure the training sample time is the same, thus only consider the leap time.) \
(one simplification is to only consider services on the same PM, as well as the service instances required by the target service)\
\
as opposed to a sequencer, we can possibly apply a decentralised manner: a node that is willing to optimise would automatically become the sequencer, and then all the objectives related to the global RP would be considered. However, the dynamically changing QoS models need to be passed through to the temporary sequencer. (To repeat, the sequencer is mainly for collecting QoS models from different nodes.) Of course, if the objective is already being optimised, any further trigger would be ignored.\
\
note that the primitives which interfere with a QoS could change; even CPU and memory could be removed if they are not significant\
\
note that for the abstract QoS model, the training data can come from a composite
\f1 (which could be sequential, parallel, loop, conditional, etc.)
\f0 or atomic service, as long as it gives a sufficiently sensitive set of control primitives (that is, all the significant inclusive services are identified by the middleware) and the training data are fed correctly.\
\
\
We can combine all the control primitives into an aggregate 'cost' as an objective function (that is, the sum of each control primitive's provision objective function) when solving the MOP, but the restriction on each resource would still need to be considered \
\
9. In MOO the selection rule could be: (non-dominated and crowding distance) / assigned weight (if any)\
weights include: 1) amongst services from different consumers, 2) services from the same consumer, 3) amongst different adj/mon values of the same service\
\
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\pardirnatural\partightenfactor0
\f3 \cf0 \
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\pardirnatural\partightenfactor0
\f0 \cf2 when determining a global value on the sequencer, it may not have weights for each node (we first ensure all SLAs are met); therefore we would need to decide based on the least sum of degradation, i.e. how much the mon values are worse than the local decision when applying a certain global adj value, then pick the smallest one (if we use weights); or we can simply use domination ranking. (Of course, any affected constraints, such as budget, need to be re-examined.)\
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\pardirnatural\partightenfactor0
\cf0 \
when reaching consensus on adj values, we can agree at different levels of region concurrently, i.e. the number of PMs and consistency \
\
Some adj values from different MOOs (or even the same MOO) may need to compete with each other (e.g. CPU/memory). When such competition occurs, we first satisfy the ones that bring more profit, then look to add more PMs, normally adding 1; the same applies when a PM's usage falls below a predefined value so that it can be removed. \cf2 in this case we know how much resource is needed, thus we add to the proper VM, otherwise from the cheapest one (this needs to be sent to the sequencer to determine).
\f1 \cf2 (no action if the node has already been decided to be added) (PM selection could be based on availability)\cf0 \
\
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\pardirnatural\partightenfactor0
\f3 \cf0 when adding a new PM we may or may not know how much resource is demanded; if unknown, we use the cheapest VM. The budget would be equally drawn from the current PMs that proposed adding the new one (or in ratio, if the demand is known), as during MOO this should be a constraint on each PM.\
\
at the end of each run of MOO we need to rebalance the remaining budget, based on the goal that every node should have an equal free budget. The overall budget would not exceed the consumer's anticipation.\
\
the SLA bound would be distributed as well if applicable, e.g. throughput\
\
such distribution can be done in a decentralised way after each adj value has been finally decided.
\f0 \
\
when we need to remove a PM, remove the one that uses the least resources, and equally distribute the budget\
\
in MOO, the objectives could be each SLO + minimising the cost; for each service we could sum up the cost/mon of all adj values so as to reduce the number of objectives. For consistency, only sum up the consistency that a target service needs to maintain when another service arrives first. Therefore, assuming each service has the same number of SLOs, the complexity of objectives is 2 * S \
\
or we can use at most adj * S + mon * S (a more detailed strategy) (maybe with objective selection?), that is, consider each mon and adj value for each service.
\f1 (some adj values, such as resources, would be component level, therefore one for each component)
\f0 \
\
We assume that even when the SLA of a mon value has been satisfied, the consumer still wishes to pay more in order to optimise that mon value, subject to budget\
\
mutual information (defining how much dependency) and covariation (deciding whether objectives conflict or are redundant) (the data should be based on measured values) can be used for objective reduction; it does not need to be coupled with the MOO optimiser.\
\
the objectives for each group could be QoS + the total cost of each instance; if an RP is shared, e.g. CPU, then its cost is proportioned equally to each instance.\
\
==================================================================\
\
a dynamic MOP can be seen as a static MOP at each point in time, and therefore can be solved using traditional approaches. GA may need to consider re-evolving, but ACO seems capable of handling changes of objective. Another factor is that the optimisation interval t needs to map to the time taken to search (although the trigger is defined by over/under utilisation, we can decide whether we need to start the metaheuristic)\
\
In ACO, even when optimising a single objective, each ant would select an RP for every objective, even one not sensitive to the current objective. Such an RP's heuristic factor could determine its preference. Each ant would eventually have a solution, which is an individual. Therefore at each round ACO would produce a pareto set, and the sets from all rounds are combined together to form a larger pareto set, which is then sorted via non-domination and crowding distance to obtain the pareto optimality.
\f1 (the best solution for each objective is selected based on
\f0 their value; only the best solution is allowed to update the pheromone factor
\f1 )
\f0 \
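The combine-then-sort step could be sketched as a plain non-dominated filter (minimisation assumed; this is a generic sketch, not the actual implementation):

```python
def dominates(a, b):
    # a dominates b when it is no worse in every objective
    # and strictly better in at least one (minimisation).
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(solutions):
    # Combine the sets from all ACO rounds and keep only
    # the non-dominated solutions.
    return [s for s in solutions
            if not any(dominates(o, s) for o in solutions if o != s)]
```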
\
(
\f4 pheromone factor is more like the preference for a certain objective, whereas heuristic factor is more like the preference for how it affects the global objectives.
\f0 )\
\
\
it is possible to integrate a dynamic economic equilibrium model with ACO: for a solution we use the decided RP price, and the decided price is based on the traded-off RP. Assume a distinct price for each service instance.\
\
when allocating RPs, it is possible to limit them to certain types, e.g. a VM has 128MB, a 1-core CPU, 100 concurrency, so that the MOP process faces fewer options. In the extreme case (arbitrary possibility) where such types do not exist, the search space would be every possible value (continuous and discrete)\
\
When selecting an RP, the other RPs use the current provision value. If an RP from an external service is uncontrollable, then simply use the provisioned value. For an EP, if it changes, it needs to be updated with the latest one (as with a QoS sensitivity model change, the existing pareto sets need to be re-evaluated; the heuristic factor also needs to be updated and, possibly, we could update the pheromone factor as well, by just iterating over all solutions again, from the 1st iteration to the last).\
\
maybe different optimisation groups could be seen as different colonies? ACO is more suitable than GA because it is better at handling dynamic/continuous optimisation, and potentially good at many objectives (as the ants work in parallel)\
\
in ACO all the solutions are recorded until the end and then sorted; this is due to the dynamic change of the objective functions, which means 'which is better' could keep changing.\
\
for the improvement and degradation of an objective, we use the ratio of the achieved value to the predicted value of that objective before optimisation (use the predicted value even if it has been violated).\
\
even though there is only one ant per objective, it needs to determine RPs not required by that objective, but during pheromone update it updates 0 for them.\
\
an ant randomly selects an objective, therefore each objective could have more than one solution per iteration. The number of ants should be > the number of objectives (or we must make sure each objective is optimised by at least one ant per iteration). All the sensitive RPs and objectives should be considered together to avoid lock-in on a locally optimal solution.\
\
our ACO only uses non-dominated sorting at the last stage, therefore modifying it does not affect the diversity of the search (so we can set the extreme value of crowding distance to 0 instead of infinity) (or use ranking dominance?)\
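A sketch of crowding distance with the boundary points set to 0 rather than infinity, as suggested above (function name and layout are assumptions):

```python
def crowding_distance(front):
    # Standard crowding distance over a front of objective vectors, except
    # that the extreme (boundary) points get 0 instead of infinity.
    n = len(front)
    dist = [0.0] * n
    if n < 3:
        return dist
    for m in range(len(front[0])):
        order = sorted(range(n), key=lambda i: front[i][m])
        span = front[order[-1]][m] - front[order[0]][m] or 1.0
        for j in range(1, n - 1):
            dist[order[j]] += (front[order[j + 1]][m]
                               - front[order[j - 1]][m]) / span
    return dist
```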
\
when constructing a solution, the sequence of RPs for an ant could be random. If, when selecting an RP, any constraint is violated, then this particular RP needs to be reselected, excluding the previous RP candidate that caused the violation; if all candidates cause violations, then go back to the previous RP. Since the sequence of selecting RPs could be random, if an ant suicides then the next ant may not be trapped in the same way. (Should we say we have two ants: the normal ant, and the guidance ant which assists the normal ant regarding which RP should be considered next?)\
\
for each objective (normal ant), the guidance ant could maintain another pheromone, which is updated as a fixed value * the number of successful ants that selected such an RP as the ith RP to search. (This probability is determined by the pheromone factor only, and its update could be a local update only.) This needs min/max pheromone values as well. Note that this guidance ant only affects the efficiency of producing a valid solution, not the quality of the solution. Thereby it uses a different pheromone structure from the normal ant.\
\
each ant would have a suicide time; if that time is reached, it stops regardless of whether the solution has been completed. If concurrent, then only the global pheromone update is needed.\
\
it may be possible to add a local pheromone update (for diversity): for all ants on the same objective, update a fixed value for the local pheromone of a candidate based on how many ants have selected that candidate. Under concurrency it may be of little use (see the various QoS-for-workflow papers). Note that the local and global updates are updating the same pheromone value.\
\
our ACO is a MAX-MIN ant system, thereby we need to define max/min pheromone values.\
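The MAX-MIN update rule could be sketched as follows (evaporation rate and bounds are assumed placeholders to be tuned per problem):

```python
def update_pheromone(tau, delta, rho=0.1, tau_min=0.01, tau_max=5.0):
    # MAX-MIN ant system: evaporate, deposit, then clamp the trail
    # into [tau_min, tau_max] so no candidate is ever ruled out.
    tau = (1.0 - rho) * tau + delta
    return min(tau_max, max(tau_min, tau))
```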
\
it may be worth comparing the produced solution with the existing one to make sure it is really better\
\
when a solution cannot be found (constraints always violated) after trying every candidate of an input: if it is the last input for that constraint, then go back to each input of that constraint and try every candidate; if it is not the last one, then let the guidance ant select another CP to search. If the violation persists even after trying every possible combination of inputs for a constraint, then start the normal ant again for a better sequence. Of course, once the time limit is reached, the ant should suicide.\
\
when the QoS model changes, ACO can re-evaluate; even when there are new/phased-out primitives, it simply goes through the selection process only for the new primitives.\
if it is a merge of two local optimisations, then the process only needs to validate the constraints and remove the solutions that violate them. This case, however, could affect the performance of ACO; we observed that new/phased-out primitives are unlikely to occur frequently\
\
\
The training data (demand) can be different from the control (provision), but they are assumed to be mappable to each other, e.g. security policy (provision) vs. the level of risk (demand), or the computational resource matched to different instance types.\
\
\
we could adopt two forms of ACO - one with weights and one weight-free. This only requires changing the heuristic factor formula and replacing crowding distance with a weighted-sum formula.\
\
to determine which objectives are conflicting during optimisation, not only the models of depends-on instances, but also those deployed on the same VM as the depends-on instances, should be considered. We should also consider those instances that use a global CP\
\
based on the current strategy, scale in/out is driven by scale up/down (e.g. not enough room on a PM, or no hardware CP is needed on a PM), but in future it is possible to design more
\f4 sophisticated\
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\pardirnatural\partightenfactor0
\f0 \cf0 techniques for scale in/out (this may need centralised control). This is sufficient because:\
\
1) VM allocation requires centralised control\
2) optimising scale up/down could help to reach globally minimised in/out as well (e.g. minimising the number of VMs)\
we consider VM allocation for the best global resource usage as a separate problem\
\
\
a cost like CPU is proportioned to each service-instance on a VM (they are all considered in the same objective group, for sure); thus, when we decide that a new replica is needed, the extra portion of the budget can be equally allocated to the new replica (as decided by ACO), and then the post-balancing of the available budget can be triggered. For the other CPs that are still sufficient, we can cut the default amount for the new replica.\
\
QoS sensitivity is identified via CP and EP, and it is selected from same VM and
\b required by
\b0 service-instances.\
\
Conflicting objectives are identified via CP, and they are selected from:\
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\pardirnatural\partightenfactor0
\f4 \cf0 1) those that directly or indirectly
\b interact with
\b0 the service-instances on the attached PM;\
2) those deployed on the attached PM, and those on the same PM as the interacting service-instances;\
3) those from other VMs that share the same CP with a service-instance on the attached PM (i.e. a global control such as a load-balancing policy for all instances of a service);\
4) we only consider nonpartitionable CPs amongst PMs under VM interference. \
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\pardirnatural\partightenfactor0
\f0 \cf0 \
\
using the demand rather than the provision value: for a CP provision decision X, we can tell that a lower X is more likely to cause under-provision, whereas a higher X is more likely to cause over-provision.\
\
\
why not MAPE?\
\pard\pardeftab720\partightenfactor0
\cf0 Calculating an aggregated MAPE is a common practice. A potential problem with this approach is that the lower-volume items (which will usually have higher MAPEs) can dominate the statistic. This is usually not desirable. One solution is to first segregate the items into different groups based upon volume (e.g., ABC categorization) and then calculate separate statistics for each grouping. Another approach is to establish a weight for each item\'92s MAPE that reflects the item\'92s relative importance to the organization\'97this is an excellent practice.\
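The weighted-MAPE practice described above can be sketched as a generic formula, with the weights standing in for each item's importance or volume (names are assumptions):

```python
def weighted_mape(actual, forecast, weights):
    # Each item's absolute percentage error is weighted by its importance,
    # so low-volume, high-MAPE items cannot dominate the aggregate.
    total = sum(w * abs(a - f) / abs(a)
                for a, f, w in zip(actual, forecast, weights))
    return total / sum(weights)
```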
\
\
Note that generally, for service chains in the cloud, we assume the SLA for the whole chain can be split across each of the inclusive services. However, if not, we can apply a certain function (with the QoS of each service as input) in the constraints to calculate the global expected QoS.\
\
\
excluding irrelevant primitives is already a good way to reduce the overfitting problem!\
\
we can move the optimisation to the global level in terms of PMs, considering VM interference. But whether certain services from different VMs should be optimised in one optimisation loop depends on whether they are truly correlated (per the QoS sensitivity model). Note that in terms of resource competition, we may not be able to follow FCFS (for different VMs in the same optimisation group) but instead follow the way that minimises the number of services that need to be replicated. We assume all CPs are charged the same price regardless of the type of PM/software stack; then, on the new VM, these services would continue to be modelled and optimised by our adaptive approach.\
\
for dynamic ACO, instead of having a solution archive, we can keep only a short memory of size K: at each iteration t, for each objective, the best K solutions try to replace the entries of that memory from t-1. The pheromone of the new entries is increased whereas that of the removed ones is decreased, by a constant value (e.g., (max-min)/K). This can increase diversity and is a randomised way to cope with dynamics. Of course, at the final stage only the best of each objective can be put into the pareto front.\
\
after the above actions, the memory from t-1 can be mutated to generate new solutions (memory based)\
\
for the objective finder, we can maintain an objective topology (based on whether objectives are sensitive to the same CP, directly or indirectly) in each local autoscaler. This regional concept not only helps to reduce centralisation in a distributed environment, but also helps such reduction on a particular PM (node). Each autoscaler is only interested in the services deployed on its corresponding VMs/PM.\
\
MAPE does not cater for modelling; models@runtime are not used for optimising; DDDAS caters for both. models@runtime may only consider requirements changes (which is part of our consideration), but not how the system behaves under fixed requirements.\
\
even if some software CPs are not charged, so we do not care about their over-provision (as they can still be used by other service-instances even when assigned to one instance, unlike hardware CPs; in case they are charged, think of broadband charging), we should still find the best combination of them and the other CPs, since a good software CP may require less hardware CP, which can then be used by other VMs.\
\
the frequency of model selection and training could differ from the frequency of measurement (interval): e.g., measure every interval but select and train every 10 intervals, feeding the data from all these 10 intervals for training and selecting.\
\
for CPU and memory we measure demand via the average, whereas we use the max for threads; this is because software CPs have a closer and more direct impact on the service than hardware CPs, so we measure their demand more restrictively. \
\
when modelling, we should let the system run for some time in order to get basic training data.\
\
our QoS model does not try to directly predict future QoS, but rather the correlation between QoS and primitives; then, given a set of primitives, that correlation can be used to predict the boundary value of QoS in the future and, most importantly, to compare and evaluate different decisions in adaptation.\
\
Traditional MAPE-K is fixed; even if it used reinforcement learning, the policy would still be fixed, and K usually just serves for information sharing. Maybe we can propose something like MAPE-A (adaptation), which is the same as DDDAS but could interest the self-adaptive SE community. This is indeed the same as MAPE-K in the sense that they share information, but the extra adaptation could mean that each component also learns by itself (e.g., A and P); maybe call it self-learning MAPE-K-Simulation? (Rami: could apply DDDAS within the global MAPE)\
\
A hybrid and adaptive multi-learners approach for online QoS modelling in the cloud\
\
use feature selection, not extraction, because 1) the data is labelled, so supervised learning is suitable; 2) filter-based selection produces negligible overhead.\
\
we can apply objective reduction with ACO in the autoscaling algorithm; the major point is that once an objective conflicts with one or more other objectives, it is considered as conflicting with all objectives. We can use the Pearson correlation coefficient as the metric (and its distance as well) to see whether the objectives are positively or negatively correlated; then we can:\
\
1) simply get rid of all positively correlated objectives.\
2) use the algorithm described in reduction-neighborhood.pdf to do the reduction.\
\
the distance measurement can use Spearman, which can work with nonlinear relations.\
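Both metrics can be sketched without any library (Spearman is Pearson on ranks, which is why it tolerates monotone nonlinear relations; tie handling is omitted in this sketch):

```python
def pearson(x, y):
    # Plain Pearson correlation: positively correlated objectives
    # move together and are candidates for removal in the reduction.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x) ** 0.5
    vy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (vx * vy)

def spearman(x, y):
    # Spearman = Pearson on ranks (no tie correction here).
    rank = lambda v: [sorted(v).index(e) for e in v]
    return pearson(rank(x), rank(y))
```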
\
the concept of direct and indirect primitives is different from reliable and unreliable; they are split based on whether resources are physically shared. Direct ones have a direct impact on QoS, whereas indirect ones impact QoS indirectly via the shared resources (soft- and hardware). The primitives in the indirect space may also be reliable, and those in the direct space may be unreliable as well; therefore not all the primitives in the indirect primitive space are of dubious use or misleading. The direct space is simply more correlated with the QoS than the indirect space on average.\
\
the weighted-sum approach is so-called scalarisation (which needs manually specified or updated weights, unrealistic in my case), whereas we focus on pareto-based. (Or we develop a hybrid one, which is pareto- and scalarisation-based?)\
\
reducing the objectives need not care about the different dimensionality amongst objectives, because as long as they are reduced, the reduced objective can be optimised even without modifying the CPs that relate only to it. In addition, the reduced objectives might need to be put in the constraints, in case some of them are violated.\
\
Wang, harmonic distance \
\
switching (adaptive) is better than an ensemble as it is less sensitive to dumbed-down learners\
\
with Spearman's, remove certain objectives whose correlation falls in -0.2 to 1; the level of removal increases from -0.2 to 1.\
\
A key difference between our work and Nelly's Bayesian-based DDN is that, in the cloud, we assume an arbitrary number of adaptation strategies and combinations of configurations; in addition, these configurations could be numeric, not only discrete. In other words, it is impossible for a developer to design a specific combination of configurations to form an adaptation strategy. However, their approach would require such action, and given a large number of adaptation strategies, the overhead of Bayesian-based DDN tends to be prohibitive for online adaptation. We use ML (e.g., ANN) to model the correlation, then apply it to a search-based optimisation for decision making, whereas they use the Bayesian-based DDN directly for decision making (linking QoS to a certain adaptation strategy).\
\
on reduction, QoS and cost objectives may be un-reducible; therefore we focus on different QoS objectives.\
\
CP bundles may help to reduce complexity; however, they could prevent the optimal CP combination\
\
when assigning hardware CP decisions: 1) assign based on priority, otherwise assign the one that can be satisfied by the remaining resources; 2) if the unsatisfied one incurs a predicted QoS/SLA violation, ignore it if it is below a threshold (or see whether the migration/replication incurs more cost than the penalty; do 3) before 2)); 3) in case such violation is non-ignorable, we select either replication or migration based on their cost via linear regression on historical data. \
\
if the objective change is
\i which
\i0 and
\i when
\i0 , then we have to restart the ACO (this does not occur at high frequency, thus it does not affect the adaptation process); if the change is only
\i how
\i0 , then we can resolve it via the dynamic variation. This is because the objective topology could change as well.\
\
a constraint change does not affect the optimisation, as it does not occur at high frequency and can be resolved by starting a new optimisation.\
\
reduction and model change need to be synchronised; the optimisation process is not completed until the action has been taken.\
\
as long as the search is completed, the action needs to be finished no matter whether the
\i which
\i0 and
\i when
\i0 change
\i .
\i0 \
\
a group of VMs (possible regioning) consists of many regions; each region has many objectives from those VMs.\
\
\
\pard\pardeftab720\partightenfactor0
\cf6 regarding whether to trigger optimisation: \
when a QoS violation (under-provision) is detected for at least t intervals to ensure stability (or using the average value over t intervals):\
1) if it is predicted that the likely QoS one step ahead is still ok, then do nothing;\
2) if the likely QoS one step ahead is not ok, then compare (violated QoS units relative to the QoS threshold * penalty per unit) against the penalty of ACO (cost per hardware CP unit, which can be estimated offline with regard to the number of objectives) to decide whether to trigger optimisation.\
\
when over-provision is detected for at least t intervals (or using the average value over t intervals):\
compare (idle CP relative to the utility threshold * penalty per idle CP) against the penalty of ACO (cost per hardware CP unit) to decide whether to trigger optimisation.\
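A minimal sketch of this trigger rule (every name and threshold here is an illustrative assumption, not the implemented logic):

```python
def should_trigger(violated_units, penalty_per_unit, aco_cost,
                   predicted_ok=False):
    # Skip if the one-step-ahead prediction says QoS recovers on its own;
    # otherwise optimise only when the expected violation penalty
    # outweighs the estimated cost of an ACO run.
    if predicted_ok:
        return False
    return violated_units * penalty_per_unit > aco_cost
```

The over-provision case is symmetric, with idle CP units and the idle-CP penalty in place of the violation terms.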
\
(consider using the risk of optimisation?)\
the risk could maybe be (1 - probability of success) * the penalty, where the probability can be updated over time.\cf0 \
\
\
\cf7 the above can also be achieved in a proactive way, in order to be used for big-job HPC applications: upon each job, predict the future QoS directly and proceed as before (there is no need to consider 1) of under-provision)\
\
the new setup should be used for at least time t; if a violation is detected within the t time frame of the last setup, then ignore it.\cf0 \
\
the modelling interval should not be too small; ideally it should be close to the worst case of response time. \
\
with regard to elasticity, our approach focuses on how to produce the QoS/cost-optimal demand (search based) for cloud-based applications/the cloud, whereas other work focuses on non-QoS/cost-optimal demand (rule, control), i.e. better ways to provision/de-provision control knobs (VM-PM mapping). 'Optimal demand' needs these terms defined; maybe use 'optimal demand' in the context of QoS/cost in the title.\
\
we should mention that maximising QoS and minimising cost is good for both cloud consumer and provider; the provider especially is more concerned about the utilisation of such cost.\
\
since we are working on a single-cloud scenario, we need to assume the total capacity of the cloud (e.g., the number of PMs) is always larger than the total affordable primitives of the consumers. In other words, when services/applications compete for resources, they would eventually be satisfied by scale up/out as long as they can afford the charges.\
\
wallace.cs.bham.ac.uk\
\
when cloud dynamics occur, a newly added primitive can use 0 to represent the missing info, whereas a removed primitive would be removed directly; this is because such primitives are usually indirect primitives, and the same info can easily be provided by another indirect primitive.\
\
\cf7 Enable Global QoS and Cost (benefits?) Optimized Elasticity in the Cloud using Computational Intelligence based Approach \
\
Enable Globally-optimal QoS and Cost (benefits?) for the Cloud Services using Computational Intelligence based Approach \cf0 \
\
Self-Aware and Self-Adaptive Autoscaling for Service-Based Applications in Cloud\
(using self-awareness and MAPE? might be add Self-Aware?) (thesis and journal 2)\
\
Self-adaptive online QoS Modeling for Elastic Autoscaling in Service-Oriented Cloud (journal 1)\
\
adaptation overhead from 2 - 8\
achieved score from 2 - 8\
\
a drawback of ensemble: Unfortunately, the ratio of examples to classes is small at the metalevel for any reasonable number of algorithms to choose from, and there are serious risks of overfitting due to underlying similarities among algorithms.\
\
mind the differences on static/dynamic and offline/online\
\
using the first node in the cloud as a reference, the interval start time of a new VM should be synced to the same physical clock: e.g., the first node starts at 0:15 and the interval is 5 mins; then if a new VM starts at 2:17 it needs to wait until 2:20 to start sampling. The same applies to a new primitive/service\
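The clock alignment above is just ceiling arithmetic; a sketch (times in minutes, names assumed):

```python
def next_aligned_start(boot_time, reference_start, interval):
    # First sampling instant for a new VM, aligned to the first node's
    # clock: e.g. reference 0:15, 5-min interval, VM up at 2:17 -> 2:20.
    elapsed = boot_time - reference_start
    periods = -(-elapsed // interval)          # ceiling division
    return reference_start + periods * interval
```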
\
\
\cf5 1) elastic strategy = a certain combination of CPs\
2) elastic rules = pairs of a condition and an action consisting of a certain combination of CPs\
\
Rule-based could be 2)\
Control theory could be 1) or 2) (this category usually explicitly models the control error; MPC is a special case which can be similar to search-based ones, however it mainly focuses on prediction rather than the search in optimisation)\
Search-based could be 1) as there is no clear mapping of condition to action\cf0 \
\
\
\
the sampling interval for the entire cloud should be the same, whereas the modelling interval could differ for each middleware instance.\
\
\
regarding whether CPU sharing is better than isolation: if the demand can be satisfied, then isolation is better; however, if it cannot, then sharing is better (a typical over-provisioning strategy). Thus the consequence of under/over-provisioning is more serious than interference.\
\
\
J. Rao's 3 works (the closest work on QoS modelling)\
1. has interference\
2. has interference and software CP\
3. has interference, software CP and dynamic selection of primitives (use Simplex Reduction)\
\
when provisioning hardware CP for VMs on a PM, follow:\
1. hardware CP before software CP.\
2. scale down before scale up. (particular for hardware)\
3. in case of scale up, smaller value before larger value. (particular for hardware)\
\
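the three ordering rules above can be sketched as a sort key; modelling a provisioning action as a (layer, delta) pair is an assumption purely for illustration:

```python
# A sketch of the provisioning order rules as a sort key. An action is
# modelled as (layer, delta): layer is "hardware"/"software" and delta is
# the signed change in the control primitive's value (both assumptions).
def provision_order_key(action):
    layer, delta = action
    return (
        0 if layer == "hardware" else 1,  # 1. hardware CP before software CP
        0 if delta < 0 else 1,            # 2. scale down before scale up
        delta if delta > 0 else 0,        # 3. smaller scale-up value first
    )

actions = [("software", 2), ("hardware", 4), ("hardware", -1), ("hardware", 2)]
ordered = sorted(actions, key=provision_order_key)
```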
it may be better to clarify that instead of focusing on adding/removing VMs (horizontal), we focus on fine-grained provisioning inside the VM (vertical). But they can be used in conjunction.\
\
a non-functional goal/softgoal is usually a range; a functional goal/hardgoal is usually a concrete target. \
\
we do not have to update the external QoS models (objectives from other PMs) at the QoS modeller; whenever there is a need to update regions due to model changes, we multicast, and all the associated nodes in the region do the updates. If there is a need to trigger the autoscaler, the actual external models can then be updated on the local PM where the optimisation is triggered (this can be done in the same process as the phase that avoids duplicate optimisation)\
\
We consider Task = Service, which is a Cloudlet in CloudSim.\
\
\cf7 Tradeoff in Autoscaling ===============================================\cf0 \
\
the simulation-based (search-based) approach could be better than the control-theoretic one, as it is expensive to try decisions on the real system.\
\
may use something similar to the C measure as a metric for comparing the ant colony against step-wise and rule-based approaches. \
\
Dealing with scale in/out (these could mitigate the instability caused by inaccuracy):\
\
1) when optimising, the upper bound of the selected control value for a VM should be the total capacity of the PM; in case the PM cannot satisfy the required value, we can then migrate or replicate. \
\
2) one possibility (from
\f1 preliminary experiments
\f0 ) is that we evenly assign the possible provision to VMs and then set up a threshold to trigger scale out, e.g., if the current max CPU is 100% (the max provision), then if it actually uses 90% of the max and the decided value is also above 90%, it is likely to be a scale out. This could also prevent scale out when we do not need to consider it.\
\
3) another way to change the max provision at runtime: we only scale out if 2) has been met and the max provision cannot be extended to (1+g)*the max, where g is a percentage; otherwise we update the max provision to (1+g)*the current max. This update occurs when the optimisation is done, in the executor.\
\
when we find that the actual and decided values are below 90% of the current max provision, we update the max provision to (1-g)*the current max.\
\
4) the min provision can also be updated, i.e., at each monitoring, update the vector's min value with the observed min (we need to have a min value), or the latest min value.\
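points 2)-3) can be sketched as one update routine; the 90% threshold, the growth factor g and the hard cap are assumptions taken from these notes, and the function name is illustrative:

```python
# A minimal sketch of the scale-out trigger and runtime max-provision
# update from 2)-3). THRESHOLD (90%) and g are assumptions from the notes;
# hard_cap stands for the PM's physical limit.
THRESHOLD = 0.9

def update_max_provision(actual, decided, max_provision, hard_cap, g=0.5):
    """Return (new_max_provision, scale_out_triggered) for one round."""
    high = THRESHOLD * max_provision
    if actual > high and decided > high:
        grown = (1 + g) * max_provision
        if grown > hard_cap:
            return max_provision, True   # cannot grow on this PM: scale out
        return grown, False              # grow the max provision (executor)
    if actual < high and decided < high:
        return (1 - g) * max_provision, False  # shrink when under-utilised
    return max_provision, False
```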
\
ACO may result in a solution that violates some objective; in that case, we simply do not use it. The threshold that triggers ACO could be different from the actual SLA used in ACO (i.e., the trigger one could be stricter)\
\
the CP in the cost model is regardless of QoS sensitivity because, although some CPs are not correlated with the QoS, they are still essential to support the service; e.g., the service may not be sensitive to memory at all at some point in time, but it may still be correlated with it in the future, so it is still charged. But this cannot go outside the provisioned VM.\
\
we can use distance to find the most balanced solution over the pareto-optimal set. If the ACO finally ends up with multiple solutions, then randomly select one, as they are all good solutions.\
\
1. search good tradeoff solution\
2. find the most well-balanced (best compromised) solution from the set in (1)\
\
triple measure for the best-balanced solution (i.e., compromise-dominance): <pareto-dominance (superiority), nash-dominance (fairness/stability), distance (similarity-to-ideal) = G-distance>; the priority decreases from left to right.\
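the triple measure can be sketched as a lexicographic comparison. Note this is only an illustration under assumptions: objectives are taken as minimised, and the nash-dominance step is simplified to counting per-objective wins, which is a rough stand-in for the fairness criterion rather than the actual definition:

```python
# A sketch of compromise-dominance as a lexicographic test:
# pareto-dominance first, then a per-objective win count standing in for
# nash-dominance (simplification), then distance to the ideal point.
import math

def pareto_dominates(a, b):
    """True if a is no worse in all objectives and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def compromise_better(a, b, ideal):
    if pareto_dominates(a, b):
        return True
    if pareto_dominates(b, a):
        return False
    wins_a = sum(x < y for x, y in zip(a, b))
    wins_b = sum(y < x for x, y in zip(a, b))
    if wins_a != wins_b:                  # fairness: more improved objectives
        return wins_a > wins_b
    return math.dist(a, ideal) < math.dist(b, ideal)  # similarity to ideal
```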
\
Adaptively Resolve Trade-off under Dynamic and Uncertainty for Autoscaling in the Cloud: an Ant Colony and Compromise-Dominance based Solution.\
\
1st uncertainty: searching for the trade-off solution\
2nd uncertainty: difficult to weight under interference. This even makes epsilon-dominance difficult.\
3rd: the problem is clearly NP-hard\
\
in Xen, the max mem and cpu are set to the maximum (except those that have been collected to Dom0) for each VM, as the actual threshold is controlled by our method.\
\
\cf5 Can link the work to the self-awareness handbook by presenting self-awareness as a solution to the autoscaling problem. Putting software and hardware CP in the same model could reduce maintenance difficulty and also allow searching for better combinations. This is the main benefit over approaches that consider them in different models and/or consider them separately (e.g., in a sequence, see TR-10-full-version)\
\
<<<<<<<<<<<<<<<<<< journal of QoS modelling approach\
\
\
"This is because the direct primitives are able to affect QoS by directly controlling the utilization in different aspects of information; whereas all the indirect ones can only do so via contention, thus they can only provide information about contention and this means that the information redundancy becomes a problem to accuracy."\
\
\
We can use the overall result for hybrid but not for adaptive because, in the former, we can compare with all other possible selection techniques and thus draw a sensible conclusion. For the latter, on the other hand, there are hundreds of alternatives, so the overall result of the 3 learning algorithms does not make any sense and we need adaptive methods. By proving the effectiveness of the adaptive method, we can imply the approach will be useful when other alternatives are involved. \
\
\pard\pardeftab720\partightenfactor0
\cf0 !!!!!!!!!!! Move the description of the two sub-spaces to the section on the hybrid multi-learners method rather than placing it with the system and QoS models (do the same for the description of SU)\
\
the aim of primitive selection is to select the primitives that maximise relevance with minimal redundancy. \
\
\pard\pardeftab720\partightenfactor0
\cf7 Use the relevance and redundancy trend to compare with the non-linearity of accuracy. The direct space shows that the
\i\b correlation of cumulative SU to model accuracy is nonlinear, and it exists for primitives that provide quite different aspects of the class (QoS).
\i0\b0 (but for those that produce the same kind of information, it is ok). This proves that the direct and indirect primitives provide quite different information. \cf0 \
\
journal 1 - investigate and analyse the possible relevant primitive space, motivating the study of SU (found that direct ones are more relevant and important than indirect ones) and the partition of the direct/indirect space.\
\
thesis - investigate the solution to the high dimensionality in the previous method, motivating the study of SU (found that direct ones are more relevant and important than indirect ones) and the partition of the direct/indirect space.\
\
single vs multi on primitive selection - direct has higher relevance and tends to cause less effect on accuracy, therefore direct tends to be more important. Like the nonlinearity within the direct space, the nonlinearity between the direct and indirect spaces implies that they provide different aspects of information; this also matches the fact that one directly controls different aspects while the other only captures interference/contention (direct is thus more important). Thus, to exploit the full benefits of direct, it is improper to use a single learner, since in such a case the
\i\b \cf7 correlation of cumulative SU to model accuracy is nonlinear, and it exists for primitives that provide quite different aspects of the class (QoS).
\i0\b0 Note that CPU and memory may contain information directly about the QoS and some information on service interference (hybrid information); however, they are still different from the others in the direct space and from those in the indirect space, as those provide either direct or contention information. \
\
\pard\pardeftab720\partightenfactor0
\i\b \cf7 \
\pard\pardeftab720\partightenfactor0
\i0\b0 \cf7 \
consider say 1. the direct 2. between direct and indirect, 3. the indirect\
\
can include the variation of hybrid multi-learners in the journal to show that even if we further incorporate wrong information, the model is still better than single learners; this shows how important it is to partition the space in order to get SU working properly. \cf0 \
\
bar chart: N-SMAPE, stability and complexity. Then the detailed table for each algorithm and QoS attribute.\
\
\
\
thinking:\
\
1. select relevant primitives, then select useful primitives\
2. observe that direct primitives have higher relevance than others for each feature dimension, thus partition into two sub-spaces for analysis. Both sub-spaces can contain irrelevant primitives, but these have been removed.\
3. a single SU value and pair-wise comparison are insufficient, as they cannot determine which set of relevant primitives produces better accuracy.\
4. a multivariable function copes with nonlinearity but is too expensive, thus use a linear and cumulative combination of SU values.\
5. this requires that the correlation of cumulative relevance and redundancy to model accuracy is linear, otherwise it can mislead the selection. \
6. the 4 observations, and mention the different/same aspects of information.\
7. given the linear and nonlinear aspects in the space, and that the nonlinear part mainly concerns primitives providing different aspects of information, a single learner cannot help as it cannot distinguish linear/nonlinear areas or primitives providing different aspects of information, and thus misleads the selection. Also, since the nonlinear factor cannot be efficiently resolved, we decide to at least avoid the misleading.\
8. partition the space into n+1 sub-spaces, where n are in the direct space and 1 is the indirect space (n = the number of primitives)\
9. given the nonlinear factor and different aspects of information in the direct space, we use mRMR for each sub-space based on the primitive; also, because there is only one primitive per sub-space, this is equal to eliminating irrelevant primitives and thus can be represented using mR, where the comparison between different sets of relevant primitives does not influence the result.\
10. in the indirect space, simply use mRMR; we have different variations. \
11. finally, what we do is partition the space into two sub-spaces, which is sufficient. \
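items 9-10 above can be sketched as a greedy mRMR selection with symmetric uncertainty (SU) as the relevance/redundancy measure. This is only a minimal illustration assuming discretised primitive/QoS observations; the names and toy data are hypothetical, not the paper's implementation:

```python
# A minimal sketch of greedy mRMR selection using symmetric uncertainty (SU)
# over discretised observations. All names here are illustrative.
from collections import Counter
from math import log2

def entropy(xs):
    n = len(xs)
    return -sum(c / n * log2(c / n) for c in Counter(xs).values())

def su(xs, ys):
    """Symmetric uncertainty: 2 * I(X;Y) / (H(X) + H(Y))."""
    hx, hy = entropy(xs), entropy(ys)
    if hx + hy == 0:
        return 0.0
    mi = hx + hy - entropy(list(zip(xs, ys)))  # I(X;Y) = H(X)+H(Y)-H(X,Y)
    return 2 * mi / (hx + hy)

def mrmr_select(features, target, k):
    """Greedily pick k features maximising relevance minus mean redundancy."""
    selected, remaining = [], list(features)
    while remaining and len(selected) < k:
        def score(f):
            rel = su(features[f], target)                 # relevance to QoS
            red = (sum(su(features[f], features[s]) for s in selected)
                   / len(selected)) if selected else 0.0  # mean redundancy
            return rel - red
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```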
\cf5 \
one benefit of the hybrid method is that we do not need to consider all interference, but only the significant ones; the same applies to direct primitives.\
\
can mention that we only consider one metric per primitive and QoS\
\
(do not consider cloud dynamics in the journal, but use the experimental comparison to show the importance of selecting primitives online and of considering interference information)\
\
<<<<<<<<<<<<<<<<<< journal of QoS modelling approach\
\
The autoscaling problem can be seen as a multi-objective combinatorial optimisation problem (discrete optimisation)\cf0 \
\
in fitness landscape approximation, people train the QoS model during evolutionary evaluation, but it aims for faster convergence and is not good for runtime management. \
\
\pard\pardeftab720\partightenfactor0
\cf8 Things to determine:\
\
1. keep the trail difference vs not keep the trail difference (not keep)\
2. use local best to update trail vs use global best to update trail (use local)\
3. the best alpha and beta values. (4-1)\
4. the best evaporation value. (0.1)\
5. keep local update or not keep (not keep)\cf0 \
\
" Since the errors are squared before they are averaged, the RMSE gives a relatively high weight to large errors. This means the RMSE is most useful when large errors are particularly undesirable.\
\
The MAE and the RMSE can be used together to diagnose the variation in the errors in a set of forecasts. The RMSE will always be larger or equal to the MAE; the greater difference between them, the greater the variance in the individual errors in the sample. If the RMSE=MAE, then all the errors are of the same magnitude"\
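the quoted RMSE/MAE relationship is easy to check numerically; this tiny sketch (names and data are illustrative) shows equality for equal-magnitude errors and RMSE inflation from one large error:

```python
# A numeric check of the quoted property: RMSE >= MAE, with equality when
# all errors share the same magnitude; one large error inflates RMSE.
import math

def mae(errors):
    return sum(abs(e) for e in errors) / len(errors)

def rmse(errors):
    return math.sqrt(sum(e * e for e in errors) / len(errors))

uniform = [2, -2, 2, -2]  # equal-magnitude errors: RMSE == MAE == 2
skewed = [0, 0, 0, 8]     # same MAE (2), but one large error: RMSE == 4
```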
\
"The inferences that can be made from a sample will be greater/more precise/more accurate if distributional assumptions which validate the mean can be made (ie. the distribution is symmetric and hence the mean is a valid measure of centre). However, these assumptions cannot always be made and the mean may give a misleading idea of the data. If the assumptions cannot be made, then the median is a better measure of centre that we know will be representative regardless of the distribution of the measurements."\
\
\
\pard\pardeftab720\partightenfactor0
\cf5 for journal content of QoS modelling, we can:\
\
1. Consider the new data as 350 intervals for the write-intensive workload (5:5) (the previous one is read-intensive (9:1))\
2. use normalised deviation, together with normalised SMAPE \
3. use the detailed accuracy as a table and the overall result as a bar chart; focus on the stability and overall result of the hybrid approach\
4. apply 1 to the adaptive multi-learner's evaluation as well, and if needed, use the new data to test the sensitivity of the result to the similarity of candidate learning algorithms. Also show that the adaptive multi-learner is overall better and the most stable.\
\
\pard\pardeftab720\partightenfactor0
\cf0 \
\
\
\
QoS(t) is a value at the t-th interval; the one used in optimisation should be the function. This is the same as representing the model by vectors, but I think this is clearer.\
\
in the adaptive multi-learner, we do not consider time series when finding the local error because usually the most recent sample has higher weight; in addition, adding too many dimensions makes it difficult to express the true distance.\
\
do not use nash in the search as it will decrease the diversity/variability/coverage. Do not use distance in the post trade-off because here the number of improved objectives is more important than their extent, and nash provides better fairness, especially when the number of objectives is large.\
\
although harmonic objectives can be optimised in parallel, trade-offs also hold in this case at some point after the parallel optimisation: one decision improves A more than B and another decision benefits B more than A.\
\
the budget of CP might be better to be consistent and fixed per-service per VM.\
\
caution about the use of "A of B for C" and "A for B of C": if the link between A and B is important, then use the former; if the link between B and C is important, use the latter.\
\
the term 'adapt' is quite general; it depends on the actual autoscaling actions, e.g., vertical or horizontal scaling. We are not interested in the choice between them, but rather we try to answer the questions: how much resource and what configurations are needed in order to optimise the different QoS objectives, reduce the cost and comply with the requirements?\
\
goals or requirements are thresholds that come from the users/stakeholders; an objective is about the progress of the system toward these requirements. In some cases, e.g., EPiCS, a goal can be represented by either requirement models or system models, while an objective can only be represented by system models.\
\
put the ageing in this PhD work as an assumption, e.g., CPUs get old and heat up when utilisation increases. Assume no service composition; assume no trade-off between v-scaling and h-scaling\
\
two ways to incorporate time in autoscaling: 1. use time in the modelling (or trend prediction), 2. use time in the decision making, usually based on the latest reward function. Both aim at making the optimisation persistent. \
\
RL (can be model-based or model-free) can learn a model while searching for good decisions, but using only those MDP models cannot explicitly reason about trade-offs and tends to have larger overhead than a dedicated QoS modelling approach (even for analytical ones). (or it cannot reach pareto trade-offs or well-compromised trade-off results)\
\
the trail update in MOACO is actually a diversity vs optimality trade-off; the 1/(1-q)*s is better for diversity as it is forcibly reset whenever a better global solution is found, so the search does not continuously explore similar paths and instead seeks alternatives (we found this is better for autoscaling), whereas the traditional one may be more for attraction. (better diversity may lead to emergently better compromised decisions, which is something very difficult (in P time) to achieve with a purely optimal search.) This is because of the nature of the problem, e.g., the trade-off surface is dynamic and uncertain.\
\
an autoscaling system/process should contain two physical parts: the managing autoscaling logic and the manageable services and VMs. It also has some logical aspects: QoS modelling, granularity of control and trade-off decision making.\
\
\pard\pardeftab720\partightenfactor0
\cf9 DEMAND\cf0 : demand should be the max resources that a cloud-based service/application consumes under given environmental conditions while not violating the SLA or budget. (people are selfish and want QoS as good as possible at as little cost as possible, so demand needs to consider both the QoS and cost/budget requirements, while the current definition often ignores budget.) This explores the full potential of cloud-based services/applications and elasticity. We mainly take the view of cloud consumers (needing support from the cloud providers). If profit/energy is considered, then it will need to be done from the view of cloud providers.\
\
QoS interference in decision making might not be an issue from the beginning. We aim to optimise a service's objectives up to the point where QoS interference becomes significant, and then mitigate it by making well-compromised trade-offs.\
\
We can actually migrate/replicate a service within an application, even for a database, by copying only the dependent files and DB tables.\
\
We need pareto before nash because the set of decisions can influence the results, so ensuring the superiority of the decisions beforehand can yield more preferable decisions under nash dominance.\
\
Some important research directions of self-adaptive software systems are when to act (adapt), what to act and how to act; my thesis mainly focuses on what to act.\
\
\cf10 DEMAND(1)\cf0 : the current definition of demand on a software system should be refined, because it places too much emphasis on cost, i.e., it puts a strong preference on cost and sees the SLA as an upper bound while it should be a baseline only. Secondly, it ignores the true demand of the application/service itself, which prevents unlocking its full potential (currently the demand is probably defined from an economic point of view). Finally, it relies highly and solely on the constraints (for the ones that are not in the objective function), which are assumed to be correctly given by the software architects. \
\
\cf10 DEMAND(2)\cf0 : It should be: the highest amount of resources/configuration that the system utilises naturally while both the SLA and budget are complied with. (this describes (i) the demand of the software service and (ii) the demand of the owner.) It can further be split into QoS-driven demand and cost-driven demand, if either the SLA or the budget needs to be compromised. \
\
OmniGraffle\
\
desired value= it is ok if we violate them. \
constraint= we cannot violate them.\
\
(mRMR says using MI with discrete data is better than continuous one)\
\
Scheduling could be similar to allocation in that both assign limited resources to some tasks, but differs in that the former is near-time while the latter can be for the future. Another similar concept is provisioning, which is about allocating resources for some tasks without a global view of the resource limitations (if we add on-demand, then provisioning = autoscaling)\
\
The only thing not changing is the change itself. My recent research is all about changes to do with software systems: changes at development time ("evolution"), changes at runtime ("adaptation"), changes among viewpoints ("meaningful"), changes to stakeholders ("requirements"), changes to the attack/defense ("security"), changes to the interests disclosure ("privacy"), and the bidirectional synchronisations of various kinds of changes ("invariant traceability").\
\
\pard\pardeftab720\partightenfactor0
\f2 \cf0 Samuel Kounev\
\
future:\
\
1. interference aware horizontal and vertical scaling trade-off\
2. interference aware energy saving and profit optimisation (with trade-offs, at this point, we only need to satisfy SLA rather than optimise QoS as much as possible)\
\
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\pardirnatural\partightenfactor0
\cf0 \
\
\
@ARTICLE\{Chen:2015:tse, \
author=\{Chen, Tao and Bahsoon, Rami\}, \
journal=\{IEEE Transactions on Software Engineering\}, \
title=\{Self-Adaptive and Online QoS Modeling for Cloud-Based Software Services\}, \
year=\{2016\}, \
note=\{doi:10.1109/TSE.2016.2608826\}\}\
\
\
@ARTICLE\{tsc-chen-2015, \
author=\{Chen, T. and Bahsoon, R.\}, \
journal=\{IEEE Transactions on Services Computing\}, \
title=\{Self-Adaptive Trade-off Decision Making for Autoscaling Cloud-Based Services\}, \
year=\{2015\}, \
keywords=\{Cloud computing;Decision making;Interference;Optimization;Quality of service;Throughput;QoS interference;Search-based optimization;cloud computing;multi-objective trade-offs\}, \
doi=\{10.1109/TSC.2015.2499770\}, \
note=\{doi:10.1109/TSC.2015.2499770\}\
\}\
==================================================================\
\pard\pardeftab720\partightenfactor0
\cf0 \
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\pardirnatural\partightenfactor0
\cf0 \
10. as the economic road map suggests, IaaS and PaaS are more likely to be monopolistic markets while SaaS would be monopolistic competition.\
\
price is different for each service instance of a consumer \
\
the reasons to set a budget on each node are: it allows local computation to avoid consensus among PMs of the same type, since solving it centrally is the same as solving it distributedly. But the demand function of each node can better capture the true demand. \
\
cost = price * (standby power + power per resource * the resources a consumer uses) (e.g. CPU frequency; the prices here are from the utility provider)\
\
(for how to find the power per resource, refer to 'energy-cost'; we can use an approximate value since this is out of scope)\
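the cost formula above, rendered directly as a toy function; all parameter names are assumptions for illustration:

```python
# A direct toy rendering of the note's cost model:
# cost = price * (standby power + power per resource * resource used).
def energy_cost(price, standby_power, power_per_resource, resource_used):
    """Energy cost charged by the utility provider for one consumer."""
    return price * (standby_power + power_per_resource * resource_used)
```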
\
unlike previous work, we assume that what the user is willing to pay depends on his/her maximum budget and the trade-off decision of QoS optimisation, which implies that users do not have to spend their entire possible budget.\
\
\
When changing the RP provision we could either stop-start the VM or do live migration on the same PM (if on-the-fly allocation is not allowed). Maybe we should add that by combinations of resources we imply vertical and horizontal scaling. (how much do we need to scale?)\
\
\
11. on architecture, we can use DDDAS, and for triggering the optimisation, we can be either reactive (based on measurement, usable for both under- and over-provisioning) or proactive (based on prediction), mainly for under-provisioning. \
\
Experiment driven:\
1) ARMA/ARIMA/ARMAX? (if workload is included)\
2) is there a need to apply AR to the workload? or simply base it on the last interval's workload? \
3) the way to forward/backward search on the ANN model; is there a need to try every combination?\
4) should we dynamically determine whether to use ANN or ARMA? or choose one only?\
5) should we apply time series to the ANN?\
\
for demonstrating the problem we need to \
1) show the benefit of fine-grained provisioning\
2) in terms of interference between services, we mean: 1, the original approach considering only one service is broken as it is interfered with; 2, changing one control of a service may interfere with another service.\
\
such interference could be conflict interference or concord interference\
\
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\pardirnatural\partightenfactor0
\cf2 we may use the term QoS conflict to describe all the interference, since response time and throughput both conflict with resource QoS, and this is more suitable for MOP. \cf0 \
we can use resource usage and consistency (positively correlated in terms of the performance objective and negatively correlated (conflicting) in terms of the consistency and resource usage objectives) as an example. We also need to show vertical and horizontal interference for both.
\f1 (in this case the use of MOP is because it supports conflict optimisation)
\f0 \
\
we can also use 2) to demonstrate the issue of trade-off\
\
conflict is usually between mon on the same or a different service and adj of a service; concord is usually between mon of a service and mon on the same or a different service\
\
\
for the accuracy experiment, we can change the control value and measure the actual mon and the output mon of the model. Or we can change only the workload etc., measure both mon and adj, then apply adj to the model and compare the actual mon with the output mon of the model; in such a case we can use 70% of the data to train and 30% to test. \
\
or, since we aim for an online dynamic model, we can simulate online through the entire workload distribution and see if the model performs well from less data to more data.\
we can use both of the above simulations for accuracy and adaptivity.\
\
\
\
for the MOO experiment, we first need to prove our approach can reduce SLA violations by comparing the different SLOs over the entire workload; scalability and elasticity can be evaluated here. Then we can apply general approaches, i.e. static ones, to compare
\f1 (those do not consider conflict, service interference and equilibrium)
\f0 SLOs (if optimal) and profit (if maximised). Examples of comparison can be seen in \
\
\
HPL-2008-123R1-mimo\
\
\
Assumptions:\
\
SLA model\
System model e.g. VM, PM\
price model\
\
needs to provide:\
\
transfer function\
\cf2 transfer function to executor\cf0 \
component/service (we use cpu/memory per component)\
cost function\
default min resource on a VM (used to determine whether to remove such a VM)\
whether an RQ is global\
\
\pard\pardeftab720\partightenfactor0
\cf0 \