forked from englianhu/binary.com-interview-question
-
Notifications
You must be signed in to change notification settings - Fork 0
/
binary-Q1TD.Rmd
2037 lines (1566 loc) · 88.9 KB
/
binary-Q1TD.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
---
title: "<img src='www/binary-logo-resize.jpg' width='240'>"
subtitle: "[binary.com](https://github.com/englianhu/binary.com-interview-question) Interview Question I - Tick-Data-HiLo For Daily Trading <span style='color:red'>(Blooper)</span>"
author: "[®γσ, Lian Hu](https://englianhu.github.io/) <img src='www/ENG.jpg' width='24'> <img src='www/RYO.jpg' width='24'>®"
date: "`r lubridate::today('Asia/Tokyo')`"
output:
html_document:
number_sections: yes
toc: yes
toc_depth: 4
toc_float:
collapsed: yes
smooth_scroll: yes
---
```{r setup, include=FALSE}
suppressPackageStartupMessages(library('BBmisc'))
pkgs <- c('knitr', 'kableExtra', 'tint', 'devtools', 'lubridate', 'plyr', 'pryr', 'stringr', 'magrittr', 'dplyr', 'tidyr', 'tidyverse', 'tidyquant', 'turner', 'readr', 'quantmod', 'htmltools', 'highcharter', 'googleVis', 'formattable', 'ggfortify', 'DT', 'forecast', 'PerformanceAnalytics', 'broom', 'microbenchmark', 'memoise', 'doParallel', 'Boruta', 'fBasics', 'fPortfolio', 'rugarch', 'parma', 'rmgarch')
suppressAll(lib(pkgs))
## Set option to below if you want to plot an independent webpage with graph
#'@ op <- options(gvis.plot.tag=NULL)
op <- options(gvis.plot.tag = 'chart')
options(gvis.plot.tag = 'chart', warn = -1, 'getSymbols.yahoo.warning' = FALSE)
#'@ options(rpubs.upload.method = 'internal')
rm(pkgs)
#'@ options(htmltools.dir.version = FALSE)
## filter tick data to get daily high-low price (with timeline).
source('function/filter_HL.R')
source('function/read_HL_tick_data.R')
source('function/forecastUSDJPYHL.R')
source('function/sim_predict.R')
source('function/sim_staking.R')
source('./function/plotChart2.R')#, local = TRUE)
```
# Introduction
From paper [Binary Q1](https://englianhu.github.io/2017/09/binary-forex-trading-Q1.html) or ([Alternate link](http://rpubs.com/englianhu/binary-forex-trading-Q1)) and also [Binary Q1 (Extention)](http://rpubs.com/englianhu/binary-Q1E) or ([Alternate link](http://rpubs.com/englianhu/316133)). Both papers test the accuracy of various statistical models and gain a high ROI per annum. However, as stated in the paper there has a concern which is don't know highest or lowest price came first, therefore the paper compared all possible outcomes. Finally the `Hi-Lo` and `Lo-Hi` models made highest returns. The research on this paper will be applicable to the real-life.
In order to test the timeline of daily highest and lowest price when I am writing [Real Time Trading System (Trial)](https://beta.rstudioconnect.com/content/3775/), here I created this file to read the tick-data-history to test the ROI (Return On Investment) per annum. Kindly refer to section [Reference] for further information.
# Data
## Read Data
I use more than 3 years data (from week 1 2015 until week 27 2018)^[You are feel feel to get the data via [FXCMTickData](https://github.com/fxcm/FXCMTickData)] for the question as experiment, 1st year data is burn-in data for statistical modelling and prediction purpose while following 2 years data for forecasting and staking. There have 52 trading weeks within a year.
There will be a certain spread charged by operators but the OHLC dataset does not provide the information for the course of the exchange rate. The tick-data history will be similar with rebirth model in soccer betting for normal FOREX trading market. The financial betting market similar with pre-match soccer betting.
```{r read-data1, echo=FALSE, eval=FALSE}
## ================== eval = FALSE =============================
## Do not execute...
##
## Remove all objects include hidden objects.
#'@ rm(list = ls(all.names = TRUE))
## get currency dataset online.
yr <- c(2015, 2016, 2017, 2018)
wk <- 1:53
dr <- 'data/USDJPY/'
## https://www.epochconverter.com/years
llply(yr, function(i) {
if(i == 2015|i == 2017) wk <- 1:53 else wk <- 1:52
llply(wk, function(j) {
lnk <- paste0(
'https://tickdata.fxcorporate.com/USDJPY/', i, '/', j, '.csv.gz')
if(!dir.exists(dr)) dir.create(dr)
if(!file.exists(paste0(dr, 'Y', i, 'W', j, '.csv.gz'))) {
download.file(lnk, destfile = paste0(
dr, 'Y', i, 'W', j, '.csv.gz'))
#cat(paste0(dr, 'Y', i, 'W', j, '.csv.gz downloaded!\n'))
}
})
})
## https://stackoverflow.com/questions/43642708/fread-with-gunzip-whats-the-more-memory-efficient-way/43643513
## https://stackoverflow.com/questions/37727865/how-can-i-use-fread-to-read-gz-files-in-r?noredirect=1&lq=1
#'@ data.table::fread("gunzip -c data/tickdata/Y2015W1.csv.gz")
## Due to the dataset size more than 1.5 mil rows, here I run and tidy the dataset instead of read the *.csv.gz or *.csv files.
llply(yr, function(i) {
if(i == 2015|i == 2017) wk <- 1:53 else wk <- 1:52
llply(wk, function(j) {
if(file.exists(paste0(dr, 'Y', i, 'W', j, '.csv.gz')) &
!file.exists(paste0(dr, 'Y', i, 'W', j, '.csv'))) {
R.utils::gunzip(paste0(dr, 'Y', i, 'W', j, '.csv.gz'),
remove = FALSE)
cat(paste0(dr, 'Y', i, 'W', j, '.csv.gz extracted!\n'))
}
})
})
## remove all files size less than 1MB
if(any(file.exists(paste0(dr, dir(dr, pattern = '.csv')[file.size(paste0(
dr, dir(dr, pattern = '.csv'))) <= 1000000])))) {
file.remove(paste0(dr, dir(dr, pattern = '.csv')[file.size(
paste0(dr, dir(dr, pattern = '.csv'))) <= 1000000]))
}
```
```{r read-data2, echo=FALSE, eval=FALSE}
## --------------------- Read Data -------------------------------
dr <- 'data/USDJPY/'
fls <- dir(dr, pattern = '.csv$')
#dfm <- read.csv(paste0(dr, fls), skipNul = TRUE)
# start <- seq(1, 186, 31)
# stop <- start - 1
# stop <- c(stop[-1], length(fls))
# paste0('fls = fls[', start, ':', stop, ']')
nm <- str_replace_all(fls, '.csv', '')
##
for(i in seq(length(fls))) {
#if(!file.exists(paste0(dr, nm[i], '.rds'))) {
assign(nm[i], read.csv(
paste0(dr, fls[i]), skipNul = TRUE) %>% tbl_df)
## save dataset.
eval(parse(text = paste0(
"saveRDS(", nm[i], ", '", dr, nm[i], ".rds')")))
eval(parse(text = paste0("rm(", nm[i], ")")))
cat(paste0(dr, nm[i], '.rds saved!\n'))
#}
}; rm(i, fls, nm)
## --------------------- Check Files -------------------------------
## check the number of *.rds files in directory.
drt <- dir(dr, pattern = '[^_HL].rds')
# start <- seq(1, 186, 31)
# stop <- start - 1
# stop <- c(stop[-1], length(drt))
# paste0('drt = drt[', start, ':', stop, ']')
##verify the number of files.
drt %>% str_split_fixed('Y|W|.rds', 4) %>% tbl_df %>%
select(V2, V3) %>% filter(V2 == 2015)
## If the downloaded files are display in Asia/Tokyo timezone, then
## we can change to default UTC timezone.
#'@ Y2015W2 %>% mutate(
#'@ DateTime = mdy_hms(DateTime, tz = 'Asia/Tokyo'),
#'@ DateTime = with_tz(DateTime, 'UTC'))
## --------------------- Filter Data -------------------------------
dr <- 'data/USDJPY/'
drt <- dir(dr, pattern = '[^_HL].rds')
nm <- str_replace_all(drt, '.rds', '')
# start <- seq(1, 186, 31)
# stop <- start - 1
# stop <- c(stop[-1], length(drt))
# data.frame(drt = paste0('drt = drt[', start, ':', stop, ']'),
# colm = ';',
# nm = paste0('nm = nm[', start, ':', stop, ']'))
for(i in seq(length(drt))) {
assign(nm[i], readRDS(paste0(dr, drt[i])))
## filter daily highest and lowest price.
#'@ assign(paste0(nm[i], '_HL'), nm[i] %>% mutate(Date = as.Date(mdy_hms(DateTime))) %>%
#'@ group_by(Date) %>%
#'@ filter(Bid == min(Bid)|Bid == max(Bid)|Ask == min(Ask)|Ask == max(Ask)) %>%
#'@ filter(!duplicated(Bid)|!duplicated(Ask)))
## Error : assign() cannot handle nm[i] %>% mutate(...) since 'nm[i]' is class character.
## convert timezone, will take time around 20 minutes for 1 million plus rows due to not vectorised handling.
#'@ eval(parse(text = paste0(
#'@ nm[i], "_HL <- ", nm[i], " %>% mutate(DateTime = mdy_hms(DateTime, tz = 'UTC')) %>% rowwise() %>% do(DateTime = with_tz(.$DateTime, tzone = 'GMT')) %>% mutate(Date = as.Date(DateTime))")))
## filter daily highest and lowest price.
eval(parse(text = paste0(
nm[i], "_HL <- ", nm[i], " %>% mutate(DateTime = with_tz(mdy_hms(DateTime), 'GMT'), Date = as.Date(DateTime)) %>% group_by(Date) %>% filter(Bid == min(Bid)|Bid == max(Bid)|Ask == min(Ask)|Ask == max(Ask))")))
eval(parse(text = paste0(nm[i], '_HL %<>% filter(!duplicated(Bid)|!duplicated(Ask))')))
## save dataset.
eval(parse(text = paste0(
"saveRDS(", nm[i], "_HL, '", dr, nm[i], "_HL.rds')")))
eval(parse(text = paste0("rm(", nm[i], ")")))
eval(parse(text = paste0("rm(", nm[i], "_HL)")))
cat(paste0(dr, nm[i], '_HL.rds saved!\n'))
}
#'@ test <- Y2018W9
#'@ test %<>% mutate(DateTime = mdy_hms(DateTime, tz = 'UTC')) %>% rowwise() %>%
#'@ do(DateTime = with_tz(.$DateTime, tzone = 'GMT')) %>%
#'@ mutate(Date = as.Date(DateTime))
## -----------------------------------------------------------------
## convert timezone, will take time around 20 minutes for 1 million plus rows due to not vectorised handling.
#'@ eval(parse(text = paste0(
#'@ nm[i], "_HL <- ", nm[i], " %>% mutate(DateTime = mdy_hms(DateTime, tz = 'UTC')) %>% rowwise() %>% do(DateTime = with_tz(.$DateTime, tzone = 'GMT')) %>% mutate(Date = as.Date(DateTime))")))
## check the number of *.rds files in directory.
dr <- 'data/USDJPY/'
drt <- dir(dr, pattern = '_HL.rds')
#'@ drt %>% str_split_fixed('Y|W|_HL.rds', 4) %>% tbl_df %>%
#'@ select(V2, V3) %>% filter(V2 == 2015)
nm <- str_replace_all(drt, '.rds', '')
## filter_HL() to get daily unique high low price.
#'@ source('function/filter_HL.R')
## simulate secondary filter daily high-low price.
for(i in seq(length(drt))) {
assign(nm[i], readRDS(paste0(dr, drt[i])))
assign(nm[i], eval(parse(text = paste0('filter_HL(', nm[i], ')'))))
## save dataset.
eval(parse(text = paste0(
"saveRDS(", nm[i], ", '", dr, nm[i], ".rds')")))
eval(parse(text = paste0("rm(", nm[i], ")")))
cat(paste0(dr, nm[i], '.rds saved!\n'))
}
```
I gathered the 3 datasets from below websites:
- **1st Dataset** - `quantmod::getSymbols(src = 'yahoo')`: which contains OHLCV data price (timezone in GMT). The place orders function required highest or lowest price come first, therefore I gathered the data via **FXCMTickData**.
- **2nd Dataset** - [FXCMTickData](https://github.com/fxcm/FXCMTickData) : which contain the timeline of highest and lowest price within a day (timezone in UTC). The daily dataset used for forecast daily price.
- **3rd Dataset** - `TFX::queryFX()`: which shows the current price. The place orders function required data history, therefore I gathered the data via **FXCMTickData**.
<span style='color:red'>There will probably occurs inconsistancy of data price among 3 datasets, however there will be cost few years time to gather all real time price via `TFX::queryFX()`. Otherwise all data gather via 1 channel (`queryFX()`) will be perfect. However, you can feel free to read futher in this paper where verified the consistancy of datasets.</span>
The will be another research project for **Real Time High Frequency Trading** where collect the real-time data and also high-frquency trading for tick-data. You are feel free to browse over [Real Time FXCM](https://github.com/scibrokes/real-time-fxcm).
Below is the dataset gather via `getSymbols(src = 'yahoo')`.
```{r data-summary1}
## read saved dataset.
mbase <- readRDS('./data/USDJPY/USDJPY.rds')
```
```{r data-summary2, echo=FALSE}
## size of dataset.
paste0('mbase : [', paste(dim(mbase), collapse = ' x '), ']')
```
```{r data-summary3, echo=FALSE, results='asis'}
summary(mbase) %>%
tidy %>%
select(-Var1) %>%
rename(Category = Var2) %>%
kable(caption = 'Table Summary') %>%
kable_styling(bootstrap_options = c('striped', 'hover', 'condensed', 'responsive')) %>%
scroll_box(width = '100%', height = '400px')
```
*Table 2.1.1 : 1st dataset - summary of daily price dataset.*
## Tidy Data
For 2nd dataset, here I gather the tick data via **FXCMTickData**, there are more than million rows dataset (million rows per file while there are 52 files over 52 weeks) while I tidy and filter only highest and lowest `bid/ask` price. From the table below we can know the timeline which is weather highest or lowest price came first.
```{r read-data3}
## read tick data.
HL_tick_data <- read_HL_tick_data()
dtID <- unique(HL_tick_data$Date)
## arrange the sequence of highest and lowest price.
HL_tick_data %<>% tbl_df %>% mutate(sq = rep(1:4, length(dtID)))
## Print dataset
HL_tick_data
```
*Table 2.2.1 : 2nd dataset - daily high-low price tick-data.*
For 3rd dataset, due to there is a real-time data, the **Real Time Trading System (Trial)** web application only stored the real-time `bid/ask` transaction price but not collect all tick-data in seconds. However, you are feel free to gather from [DataCollection](https://beta.rstudioconnect.com/content/3153/).
# Statistical Modelling
## Forecast Daily Hi-Lo Price
I tried to apply Lasso, Elastic Net and Ridge models to test the accuracy of prediction via [shinyApp](https://beta.rstudioconnect.com/content/2367/).
$$\begin{equation}
\sigma^2_{t} = \omega + \sum_{i=1}^{\rho}(\alpha_{i} + \gamma_{i} I_{t-i}) \varepsilon_{t-i}^{2} + \sum_{j=1}^{q}\beta_{j}\sigma^{2}_{t-j}\ \cdots\ Equation\ 3.1.1
\end{equation}$$
Here I directly apply *GJR-GARCH*^[Kindly refer to [GJR-GARCH 模型](https://vlab.stern.nyu.edu/zh/doc/3?topic=mdls) for more information] model due to I had compared few statistical models and got the best fitted model. Kindly refer to **Binary Q1** for the paper.
- Auto Arima models (Adjusted and use the optimal AR and MA parameters)
- Exponential Time Series (27 ETS models)
- Univariate Garch models (GARCH, T-GARCH, GJR-GARCH, eGARCH, etc altogather 12 models.)
- Exponential Weighted Moving Average
- <s>Monte Carlo Markov Chain</s>
- <s>Bayesian Time Series</s>
- <s>Midas</s>
![](www/ARCH-family.jpg)
*Source presentation 3.1.1 : [Univariate Garch 2012 (powerpoint)](https://github.com/englianhu/binary.com-interview-question/blob/master/reference/Univariate%20Garch%202012%20powerpoint.pdf)*
![](www/ARCH-family-formula.jpg)
Here I wrote another extention page for Q1 which is analyse the multiple currencies and also models from minutes to daily. You are feel free to browse over **Binary Q1 (Extention)**. The paper compare and get the optimal predictive model based on the various number of observations.
## GARCH Order
The `rugarch` package not only provides various GARCH models but also contains below 3 to suite our needs :
- variance.model
- mean.model
- distribution.model
Kindly refer to [rugarch包与R语言中的garch族模型](http://mob.dataguru.cn/mportal.php?mod=view&aid=794) for more information.
![](www/optimal-garchOrder-1.jpg)
*Source presentation 3.2.1 : Univariate Garch 2012 (powerpoint)*
![](www/optimal-garchOrder-2.jpg)
*Source presentation 3.2.2 : Univariate Garch 2012 (powerpoint)*
![](www/optimal-garchOrder-3.jpg)
*Source presentation 3.2.3 : Univariate Garch 2012 (powerpoint)*
Due to above presentation compares the `infocriteria` of the GARCH order, here I directly set `garchOrder(1,1)` in my model.
## ARMA Order
For ARMA order, kindly refer to **GARCH模型中的`ARMA(p,d,q)`参数最优化**^[You can read the paper by search inside section [Reference]] for more information. I used old function which is `calC()` but not `calc_fx()` in this paper.
```{r read-data4, echo=FALSE, eval=FALSE}
## get currency dataset online.
#'@ getFX('USD/JPY', from = '2014-01-01', to = '2017-01-20') #oanda only provides 180 days data. getSymbols()
## Get data.
#'@ USDJPY <- getSymbols('JPY=X', src = 'yahoo', from = '2015-01-04',
#'@ to = '2018-07-06', auto.assign = FALSE)
#'@ names(USDJPY) <- str_replace_all(names(USDJPY), 'JPY=X', 'USDJPY')
#'@ USDJPY %<>% na.omit
#'@ saveRDS(USDJPY, 'data/USDJPY/USDJPY.rds')
mbase <- readRDS('./data/USDJPY/USDJPY.rds')
if(!is.xts(mbase)) mbase <- xts(mbase[, -1], order.by = mbase$Date)
## dateID or timeID
timeID <- index(mbase)
timeID0 <- ymd('2016-01-04')
timeID <- timeID[timeID > dateID0] # index(mbase)[index(mbase) > ymd('2016-01-04')]
#length(timeID)
#[1] 651
#start <- seq(1, 652, 109)
#stop <- start - 1
#stop <- c(stop[-1], length(timeID))
#data.frame(timeID = paste0('timeID = timeID[', start, ':', stop, ']'))
#1 timeID = timeID[1:109] # index(mbase)[index(mbase) > ymd('2016-01-04')][1:109]
#2 timeID = timeID[110:218] # index(mbase)[index(mbase) > ymd('2016-01-04')][110:218]
#3 timeID = timeID[219:327] # index(mbase)[index(mbase) > ymd('2016-01-04')][219:327]
#4 timeID = timeID[328:436] # index(mbase)[index(mbase) > ymd('2016-01-04')][328:436]
#5 timeID = timeID[437:545] # index(mbase)[index(mbase) > ymd('2016-01-04')][437:545]
#6 timeID = timeID[546:651] # index(mbase)[index(mbase) > ymd('2016-01-04')][546:651]
## simulate financial betting.
pred.data <- sim_predict(
mbase, #timeID = index(mbase),
timeID = index(mbase)[index(mbase) > ymd('2016-01-04')][546:651],
timeID0 = ymd('2016-01-04'),
.preCat = 'Op', .preCat2 = 'Hi', .preCat3 = 'Lo', .preCat4 = 'Cl',
.save = TRUE, currency = 'JPY=X')
#index(mbase)[index(mbase) > ymd('2016-01-04')][268:269] #error
#Latest Date (GMT): 2017-01-11 done!
#Latest Date (GMT): 2017-01-11 saved!
#
#Error in solve.default(res$hessian * n.used, A) :
# Lapack routine dgesv: system is exactly singular: U[1,1] = 0
#> pred.data %>% dplyr::filter(LatestDate.GMT == '2017-01-12')
## A tibble: 1 x 10
# LatestDate.GMT Lst.Open Lst.Low Fct.High Lst.Close ForecastDate.GMT Fct.Open Lst.High Fct.Low
# <date> <dbl> <dbl> <dbl> <dbl> <fctr> <dbl> <dbl> <dbl>
#1 2017-01-12 NA 113.76 110.5123 NA T+1 NA 115.219 113.6786
# ... with 1 more variables: Fct.Close <dbl>
#> pred.data %>% dplyr::filter(LatestDate.GMT == '2017-01-13')
## A tibble: 1 x 10
# LatestDate.GMT Lst.Open Lst.Low Fct.High Lst.Close ForecastDate.GMT Fct.Open Lst.High Fct.Low
# <date> <dbl> <dbl> <dbl> <dbl> <fctr> <dbl> <dbl> <dbl>
#1 2017-01-13 114.696 114.284 110.5143 114.664 T+1 114.5294 115.388 114.3947
## ... with 1 more variables: Fct.Close <dbl>
#> pred.data %>% dplyr::filter(LatestDate.GMT == '2017-01-11')
## A tibble: 1 x 10
# LatestDate.GMT Lst.Open Lst.Low Fct.High Lst.Close ForecastDate.GMT Fct.Open Lst.High Fct.Low
# <date> <dbl> <dbl> <dbl> <dbl> <fctr> #<dbl> <dbl> <dbl>
#1 2017-01-11 115.875 115.681 110.5116 115.872 T+1 115.7538 116.835 115.7639
## ... with 1 more variables: Fct.Close <dbl>
rm(obs.data, USDJPY)
#> pred.data %>% dplyr::filter(LatestDate.GMT == '2017-01-12')
## A tibble: 1 x 10
# LatestDate.GMT Lst.Open Lst.High Lst.Low Lst.Close ForecastDate.GMT Fct.Open Fct.High Fct.Low
# <date> <dbl> <dbl> <dbl> <dbl> <fctr> <dbl> <dbl> <dbl>
#1 2017-01-12 115.055 115.219 113.76 NA T+1 115.1867 110.5123 113.6786
## ... with 1 more variables: Fct.Close <dbl>
```
Here I read my saved dataset where forecast 1 trading day advanced for daily Hi-Lo price. Kindly refer to **Real Time Trading System (Trial)** for more information.
```{r read-data5, echo=FALSE}
pred.data <- readRDS('data/USDJPY/pred.data.rds') %>% tbl_df
pred.data %>% data.table
```
*Table 3.3.1 : Forecast high-low daily price.*
```{r tidy-data1}
pred.data %>% dplyr::filter(LatestDate.GMT == '2017-01-12') %>%
kable(caption = 'Data Error') %>%
kable_styling(bootstrap_options = c('striped', 'hover', 'condensed', 'responsive')) %>%
scroll_box(width = '100%')
```
*Table 3.3.2 : Forecast daily price missing close price.*
<s>Due to the day 2017-01-12 unable forecast the closed price.^[`armaSearch()` error on method = 'ML'], here I omit the transaction on that day.</s>
Below I combine the daily price dataset with forecast `Hi-Lo` price.
```{r tidy-data2}
## filter data.
pred.data %<>% dplyr::filter(LatestDate.GMT != '2017-01-12')
## Copied dataset.
mbase <- data.frame(Date = index(mbase), mbase) %>% tbl_df
pred.data %<>% tbl_df
## Add `Date` column as forecasted Date.
pred.data$Date <- lead(pred.data$LatestDate.GMT)
pred.data$Date[length(pred.data$Date)] <- data.table::last(pred.data$LatestDate.GMT) + days(1)
pred.data %<>% select(Date, Fct.Open, Fct.High, Fct.Low, Fct.Close) %>% data.table
## Merge dataset.
pred <- merge(tbl_df(mbase), pred.data, by = 'Date') %>%
tbl_df %>% select(-USDJPY.Volume, -USDJPY.Adjusted)
rm(pred.data)
## Print dataset.
pred %>% data.table
```
*Table 3.3.3 : Tidy dataset for forecast high-low daily price.*
Below I test if the dataset scrapped from `quantmod::getSymbols(src = 'yahoo')` equal to **FXCMTickData**. Unfortunately the data gathered is not tally each other.
```{r tidy-data3}
mb.dateID <- unique(mbase$Date)
td.dateID <- unique(HL_tick_data$Date)
## Check the start and end date
data.frame(MB = range(mb.dateID),
TD = range(td.dateID)) %>%
kable(caption = 'Data Range') %>%
kable_styling(bootstrap_options = c('striped', 'hover', 'condensed', 'responsive'), full_width = FALSE, position = 'float_left') %>%
footnote(general = 'Date range of 1st and 2nd dataset.',
general_title = 'Table 3.3.4 : ', footnote_as_chunk = TRUE)
```
*Table 3.3.4* shows the date range for both **MB** (mbase dataset) and **TD** (tick-data).
<span style='color:red'>Below shows all dates NOT in each dataset. Unfortunately the inconsistancy of dataset gathered from `getSymbols(src = 'yahoo')` will caused whole predictive models bias, it will affect the staking amount and evetually effect the ROI. The dataset used to working fine few years ago. There will be another research which gather only dataset from operator to solve the issue.</span>
```{r tidy-data4}
## Check if dateID not in another dataset.
mb.dateID[!mb.dateID %in% td.dateID]
td.dateID[!td.dateID %in% mb.dateID]
```
```{r plot-data1A, echo = FALSE}
## Real price.
pl.RHigh <- pred %>%
select(Date, USDJPY.High)
pl.RHigh <- as.xts(pl.RHigh[-1], order.by = pl.RHigh$Date)
pl.RLow <- pred %>%
select(Date, USDJPY.Low)
pl.RLow <- as.xts(pl.RLow[-1], order.by = pl.RLow$Date)
## Predicted price.
pl.FHigh <- pred %>%
select(Date, Fct.High)
pl.FHigh <- as.xts(pl.FHigh[-1], order.by = pl.FHigh$Date)
pl.FLow <- pred %>%
select(Date, Fct.Low)
pl.FLow <- as.xts(pl.FLow[-1], order.by = pl.FLow$Date)
## Plot graph
hc <- highchart(type = 'stock') %>%
hc_title(text = 'USD/JPY Currency Exchange Rate') %>%
hc_subtitle(text = 'Comparison of Forecast and Real Price (Highest Price)') %>%
hc_add_series(pl.RHigh, id = 'pl.RHigh', color = 'blue') %>%
hc_add_series(pl.FHigh, id = 'pl.FHigh', color = 'red')
hc
```
*Graph 3.3.1A : <span style='color:red'>Forecast daily highest price</span> vs <span style='color:blue'>real daily highest price</span>.*
```{r plot-data1B, echo = FALSE, results = 'asis'}
## Plot graph
hc <- highchart(type = 'stock') %>%
hc_title(text = 'USD/JPY Currency Exchange Rate') %>%
hc_subtitle(text = 'Comparison of Forecast and Real Price (Lowest Price)') %>%
hc_add_series(pl.RLow, id = 'pl.RLow', color = 'blue') %>%
hc_add_series(pl.FLow, id = 'pl.FLow', color = 'red')
rm(pl.RHigh, pl.RLow, pl.FHigh, pl.FLow)
hc
```
*Graph 3.3.1B : <span style='color:red'>Forecast daily lowest price</span> vs <span style='color:blue'>real daily lowest price</span>.*
*Graph 3.3.1A* and *Graph 3.3.1B* above compare the real price and forecast price. Following section will be compare the MSE (Mean Squared Error).
## Mean Squared Error
```{r mse1}
## Mean Squared Error : Comparison of accuracy.
## https://cran.r-project.org/web/packages/kableExtra/vignettes/awesome_table_in_html.html
data.frame(
Category = c('High = ', 'Low = ', 'Close = '),
MSE = c(mean((pred$Fct.High - pred$USDJPY.High)^2),
mean((pred$Fct.Low - pred$USDJPY.Low)^2),
mean((pred$Fct.Close - pred$USDJPY.Close)^2))) %>%
kable(caption = 'Mean Squared Error') %>%
kable_styling(bootstrap_options = c('striped', 'hover', 'condensed', 'responsive'), full_width = FALSE, position = 'float_right') %>%
footnote(general = 'MSE of prediction.',
general_title = 'Table 3.4.1 : ', footnote_as_chunk = TRUE)
```
$$\begin{equation}
\frac{1}{n}\sum_{t=1}^{n}e_t^2\ \cdots\ Equation\ 3.4.1
\end{equation}$$
*Table 3.4.1* at the right-hand-side shows the accuracy of the predictive model. You can also refer to previous studies where compare the accuracy of predicted `Open`, `High`, `Low` and `Close` price. For accuracy comparison of statistical models, you might refer to **GARCH模型中的`ARMA(p,d,q)`参数最优化** or previous papers for further information.
## Univariate Preditive Error
**GARCH模型中的`ARMA(p,d,q)`参数最优化** compare the accuracy of gjrGARCH and Fi-gjrGARCH model while **Multivariate GARCH Models** states the statistical predictive error based on univariate and also use more sophistical methods for modelling. This paper I still keep using univariate gjrGARCH in mentioned paper which is not the most accurate.
```{r tidy-data5}
rx <- pred %>%
mutate(diff = Fct.High - Fct.Low) %>%
dplyr::filter(diff <= 0)
rx %>%
kable(caption = 'Preditive Error for Univariate gjrGARCH') %>%
kable_styling(bootstrap_options = c('striped', 'hover', 'condensed', 'responsive')) %>%
scroll_box(width = '100%', height = '400px')
```
*Table 3.5.1 `r paste0('[', paste(dim(rx), collapse = ' x '), ']')` : Preditive `HiLo` Error for Univariate gjrGARCH.*
```{r tidy-data6}
rx <- pred %>%
mutate(Range = ifelse(Fct.Close > Fct.High | Fct.Close < Fct.Low, 0, 1)) %>%
dplyr::filter(Range == 0)
rx %>%
kable(caption = 'Preditive Error for Univariate gjrGARCH') %>%
kable_styling(bootstrap_options = c('striped', 'hover', 'condensed', 'responsive')) %>%
scroll_box(width = '100%', height = '400px')
```
*Table 3.5.2 `r paste0('[', paste(dim(rx), collapse = ' x '), ']')` : Preditive `Cl` Error for Univariate gjrGARCH.*
*Table 3.5.1* states the forecast highest price lower than forecast lowest price where *Table 3.5.2* states forecast closing price not in the range of forecast highest and lowest price. [Betting Strategy] in next section will omit the error and only trade on the high winning opportunities.
# Betting Strategy
## Kelly Criterion
$$\begin{equation}
f = \frac{Edge}{Odds} = \frac{p^∗ x−(1−p^∗)}{x}
\ \cdots Equation\ 4.1.1
\end{equation}$$
The section 1 inside [Application of Kelly Criterion model in Sportsbook Investment](https://github.com/scibrokes/kelly-criterion) shows the ROI for betting on sportsbook, where section 2 will be started during spared time after FOREX market.
For the betting strategy, here I need to do correction from **Binary Q1** as stated in section [Introduction]. Below I tidy the dataset to match the `bid/ask` price for close a transaction.
As we know from previous paper, the closing price will be settled price (closed transaction) if the forecast price has no occured within a day (from 12:00:01 AM until next day 12:00:00 AM, due to system calculate the price spend a minute, therefore the opening price will not in count.). There will be 2 limit orders (but and sell) placed after 12:00:00AM. Once one among the order limit stand (opened transaction), another limit order will be automatically turned to be closed transaction request. Therefore there will only transaction within a trading day unless no any order limit placed stand.
```{r data-summary4, results = 'asis'}
## Here I combine real-time dataset with the daily dataset.
pred <- merge(HL_tick_data, pred, by = 'Date') %>%
tbl_df %>%
arrange(DateTime)
## Tidy dataset.
## bid price for sell, ask price for buy orders. bid price always lower than ask price due to banker need to earn vigorish/spread margins. We need to see bid price if we wanna sell and ask price when we wanna buy.
pred %<>% group_by(Date) %>%
dplyr::filter(Bid == max(Bid, na.rm = TRUE)|Ask == min(Ask, na.rm = TRUE)) %>%
dplyr::select(DateTime, Date, USDJPY.High, USDJPY.Low, USDJPY.Close, sq, Bid, Ask, Fct.High, Fct.Low, Fct.Close) %>%
rename(High = USDJPY.High, Low = USDJPY.Low, Close = USDJPY.Close) %>% tbl_df
pred %<>% mutate(
Fct.High = round(Fct.High, 3),
Fct.Low = round(Fct.Low, 3),
Fct.Close = round(Fct.Close, 3),
Sell = ifelse((Fct.High <= High &
Fct.High >= Low &
Fct.High > Fct.Low) & # Here I filter the
(Fct.Close > Fct.High | # both HiLo and Cl
Fct.Close < Fct.Low), 1, 0), # predictive error.
Buy = ifelse((Fct.Low >= Low &
Fct.Low <= High &
Fct.Low < Fct.High) & # Here I filter the
(Fct.Close > Fct.High | # both HiLo and Cl
Fct.Close < Fct.Low), 1, 0)) # predictive error.
## follow the seq to determine the buy or sell limit order stand within a day. Then the other side will be automatically switch to close transaction limit order but not placed another limit order.
pred <- ldply(split(pred, pred$Date), function(x) {
x %<>% mutate(Trans = ifelse(!is.na(Bid), 'sell', 'buy'),
Trans = ifelse(Sell == 1|Buy == 1, Trans, 0))
if(x[1,]$Trans == 'sell'|x[1,]$Trans == 'buy') #if open transaction.
x[nrow(x),]$Trans <- 'close' # then close transaction.
x }) %>% tbl_df
pred$.id <- NULL
pred %<>% mutate(Trans = factor(Trans))
```
Topic *2.1.4 Staking Model* in paper **Binary Q1** states that the financial market unable to know the payout rate in advanced, therefore use the forecasted pice based on statistical models. I use the amount calculated by Kelly Criterion as the stakes (possible loss or terms as stop-loss in financial market), variance of forecasted Hi/Lo and also Closed price. However it doesn't make sense in financial market.
> Hedge Fund Market Wizards has a good discussion on this in Ed Thorp's chapter.
>
- Kelly Criterion has highest long-term growth rate, but gives you higher drawdowns and risk of ruin.
- In gambling you know your theoretical odds, but in trading your win rate is only an estimate.
- If you estimate your win rate incorrectly, the profits you'd miss out on by underestimating are less than the losses you'd incur by being overconfident.
- If there's large uncertainty about your win rate (e.g. trend following systems) Kelly may be inappropriate.
- "Suppose you have 1MM and your max allowable drawdown is 200K - then from the Kelly perspective you don't have 1MM, you have 200K in capital."
- If you bet .5 K.C. you get .75 of the returns with .5 of the volatility - half Kelly is better psychologically.
*Source quote 4.1.1 : [Kelly Criterion in Forex primordia • Jul 22, 2017, 7:07 PM](https://amp.reddit.com/r/Forex/comments/6oxh5e/kelly_criterion_in_forex)*
> EXAMPLE
Let's look at a trading example.
>
Say, you have a EURUSD trading strategy that wins approximately 70% of the time. The StopLoss in your strategy is 40 pips and the TakeProfit is 20 pips (spread accounted for).
>
This means that your B and P parameters are as follows:
>
B = 20 pips / 40 pips = 0.5
P = 70% = 0.7
Let's input these values into Kelly’s formula and see what we get:
>
K = ( PxB – (1–P) ) / B
K = ( 0.7 x 0.5 – (1–0.7) ) / 0.5 = 0.1
>
This means that the optimal risk for this trading strategy that will maximize your profits in the long term is 10%.
>
If you want to be a bit more conservative, then go with the Half-Kelly of 5%.
>
Whatever you do, don’t invest more than 10% per trade – it’s pointless.
>
If you invest more than 20% then you will turn this great strategy into one that will ruin your account.
>
That's how you apply the Kelly Criterion in practice.
*Source quote 4.1.2 : [ForexBoat:Kelly Criterion](https://www.forexboat.com/kelly-criterion/)^[You may feel free to read the comment onto the article [Q&A on Kelly criterion, stop-loss, take-profit and also leverage ratio](https://www.forexboat.com/kelly-criterion/#comment-2053452762) as well.]*
## Staking Model
[Chapter 20 Against the Odds: The Mathematics of Gambling](http://srdas.github.io/MLBook/Gambling.html#odds)^[Publised book [**Data Science: Theories, Models, Algorithms, and Analytics** - *Sanjiv Ranjan Das (2017-03-24)*](http://srdas.github.io/MLBook/).] elaborates the odds price, edge, Kelly Criterion staking model, portfolio, Entropy and also day trading.
$$\begin{eqnarray}
g_t(f) &=& \frac{1}{t} \ln \left( \frac{B_t}{B_0} \right) \\
&=& \frac{1}{t} \ln \left( \prod_{i=1}^t [1+r +f(Z_t -r)\varepsilon] \right) \\
&=& \frac{1}{t} \sum_{i=1}^t \ln \left( [1+r +f(Z_t -r)\varepsilon] \right)
\end{eqnarray}
\ \cdots Equation\ 4.2.1 $$
By applying the law of central limit theorem for large data, here we get:
$$\begin{equation}
g(f) = \lim_{t \rightarrow \infty} g_t(f) = E[\ln(1+r + f (Z-r)\varepsilon)]
\ \cdots Equation\ 4.2.2
\end{equation}$$
and $\varepsilon_{i}$ is a weight function where applied statistical models to forecast the price. I used some models and eventually concludes the GJR-GARCH model generated highest ROI. The weight function doesn't same with different currencies since it using substraction among 2 forecast prices but not in ratio^[The pips difference for USD/JPY different with others. Thereore there will need to compare among currencies.].
$$\begin{equation}
\varepsilon = h(x)_{1} - v
\ \cdots Equation\ 4.2.3
\end{equation}$$
$v$ is a switch function to determine the settled price.
$$\begin{equation}
v
\left\{\begin{matrix}
h(x)_{2}
& if(Lo <= h(x)_{2} <= Hi)
& \\ Cl
& otherwise
\end{matrix}\right.
\ \cdots Equation\ 4.2.4
\end{equation}$$
where $Hi$, $Lo$ and $Cl$ are the daily highest, lowest and closed price.
**Binary Q1** applied a more sophisticated model as stated above to compares all possible outcomes of predicted price and ROI. Due to the financial betting only allows player place bets and awaiting for the settlement (unless placed another bet at other predicted price), there will be no any limit order and close transaction request (similar with FOREX trading market can place more than 1 limit order to lock the profit).
<span style='color:red'>The staking model in previous papers (includes **Binary Q1 (Extention)**) have only **MISTAKE** which is wrote for real FOREX trading market but not financial betting since I only think of spread betting but forgot the normal betting :</span>
- <span style='color:red'>Forecast highest price to sell and forecast lowest price to buy to maximise the profit : I wrote 2nd forecast price as settled price if it was between the daily Hi-Lo range (otherwise daily closed price will be settled price). There will be only applicable to FOREX trading market but NOT financial betting market.</span>
- <span style='color:red'>However it will be more easily since the settled price will be only daily closed price.</span>
The paper **Binary Q1** compares all outcome :
- `Hi-Cl` + `Lo-Cl` generate highest ROI in financial betting market.^[Here I will conducting another research for financial betting.]
- `Hi-Lo` or `Lo-Hi` will be best betting strategy for noarmal FOREX market.
```{r betting-strategy, echo = FALSE}
bs <- data.frame(.id = c('fundAutoArimaHICL', 'fundAutoArimaLOCL'),
StartDate = ymd('2015-01-02'),
LatestDate = ymd('2017-01-20'),
InitFund = 1000,
LatestFund = c(1401.694, 1499.818),
Profit = c(401.69378, 499.81773),
RR = formattable::percent(c(1.401694, 1.499818)))
sm <- data.frame(.id = 'Combine',
StartDate = ymd('2015-01-02'),
LatestDate = ymd('2017-01-20'),
bs %>% dplyr::select(InitFund, LatestFund, Profit) %>% colSums %>% t) %>%
mutate(RR = formattable::percent(LatestFund/InitFund))
rbind(bs, sm) %>%
kable(caption = 'Return on Investment') %>%
kable_styling(bootstrap_options = c('striped', 'hover', 'condensed', 'responsive')) %>%
scroll_box(width = '100%')
rm(bs, sm)
```
*Table 4.2.1 : betting strategy for financial betting.*
```
No .id StartDate LatestDate InitFund LatesFund Profit RR
07 fundAutoArimaHILO 2015-01-02 2017-01-20 1000 1637.251 637.25113 163.7251%
10 fundAutoArimaLOHI 2015-01-02 2017-01-20 1000 1716.985 716.98492 171.6985%
```
*Table 4.2.2 : betting strategy for normal FOREX market.*
Kindly refer to [2.1.5 Return of Investment in **binary Q1**](https://englianhu.github.io/2017/09/binary-forex-trading-Q1.html#return-of-investment) for full table. However, the paper does not filter the univariate preditive error.
## Optimal Edge
I don't pretend to know the optimal edge for staking. Here I need to compare above models with normal Kelly model.
<span style='color:red'>I used to use</span> <s>`Edge1 = ifelse(fB1 > 0, B1, ifelse(fS1 > 0, S1, 0))`</s> <span style='color:red'>to measure the edge for staking while it might be wrong due to I put the edge for selling as secondary edge for `Buy` as well. Here I use </span>`Edge1a = ifelse(fB1 > 0, B1, 0)`<span style='color:red'> and </span>`Edge1b = ifelse(fS1 > 0, S1, 0)`<span style='color:red'> to seperates the edge for `Buy` and `Sell`. It means that the buy action will be primary and sell action will be secondary where the edge for both `buy` and `sell` will stand. Therefore most of the observation will overcame probabilities 0.5.</span>
```{r staking1, echo = FALSE, eval = FALSE}
## =============== WRONG ==================
## http://srdas.github.io/MLBook/Gambling.html#simulation-of-the-betting-strategy
pred %>%
mutate(
Fct.High = round(Fct.High, 3),
Fct.Low = round(Fct.Low, 3),
Hi = ifelse(High >= Bid, 1, 0),
Lo = ifelse(Low >= Ask, 1, 0)) %>%
tbl_df %>%
select(-Date) %>%
mutate(Buy1 = ifelse(Fct.High >= High, 1, 0),
Buy2 = ifelse(Fct.High >= Bid, 1, 0),
Buy3 = ifelse(Fct.High >= Hi, 1, 0),
Sell1 = ifelse(Fct.Low <= Low, 1, 0),
Sell2 = ifelse(Fct.Low <= Ask, 1, 0),
Sell3 = ifelse(Fct.Low <= Lo, 1, 0)) %>%
select(-High, -Low, -Close)
## A tibble: 1,244 x 14
# DateTime sq Bid Ask Fct.High Fct.Low Hi Lo Buy1 Buy2 Buy3 Sell1 Sell2 Sell3
# <dttm> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 2016-01-05 03:06:08 1 120. NA 120. 119. 0 NA 1 1 1 0 NA NA
# 2 2016-01-05 12:22:55 4 NA 119. 120. 119. NA 1 1 NA NA 0 0 0
# 3 2016-01-06 00:07:16 2 119. NA 120. 119. 0 NA 1 1 1 0 NA NA
# 4 2016-01-06 11:38:17 3 NA 118. 120. 119. NA 1 1 NA NA 0 0 0
# 5 2016-01-07 01:16:01 2 119. NA 120. 119. 0 NA 1 1 1 0 NA NA
# 6 2016-01-07 09:24:27 4 NA 117. 120. 119. NA 1 1 NA NA 0 0 0
# 7 2016-01-08 13:30:07 2 119. NA 119. 119. 0 NA 1 1 1 0 NA NA
# 8 2016-01-08 21:57:33 3 NA 117. 119. 119. NA 1 1 NA NA 0 0 0
# 9 2016-01-11 01:34:30 2 NA 117. 120. 119. NA 1 1 NA NA 0 0 0
#10 2016-01-11 13:37:52 4 118. NA 120. 119. 0 NA 1 1 1 0 NA NA
## ... with 1,234 more rows
```
## Application of Kelly Criterion to Normal FOREX Market
Previous paper using the forecast HiLo price and forecast closing price as settlement, the stakes will be the edge for `pnorm(Hi, mean(Lo), sd(Lo))` and vice verse. It will be $\frac{\sigma_{Hi}}{\sigma_{Lo}}$ or $\frac{\sigma_{Lo}}{\sigma_{Hi}}$ but missing the difference of pips between buy/sell price and closed price.
In this paper will count the difference of pips and also leverage ratio. The risk management on leverage will be counted into the staking model.
The papers **The Kelly Criterion and the Stock Market** and **Beat the Market - A Scientific Stock Market System** states the application of Kelly criterion onto the stock market. Kindly refer to the paper in [Reference] for further knowledge.
## Application of Kelly Criterion to Financial Betting Market
Previous paper use Kelly model to placed a certain amount and awaiting for settlement. The forecast closed price will be the settled price if it was within the range of HiLo in the day. There is not workable due to traders not allowed to placed an close transaction limit order in financial betting market.
- [Financial Spread Betting : Binary](http://www.financial-spread-betting.com/Bet-markets.html)
- [What are Binary Options or Fixed Odds Bets? Binary.com](http://winonmarkets.net/wp/2014/03/20/what-are-binary-options-or-fixed-odds-bets-binary-com/)
Due to there has no dataset for financial betting, therefore I do not have the payout rate or odds price for <s>minutely, hourly and</s> daily betting. The sample dataset and research might refer to **Application of Kelly Criterion model in Sportsbook Investment** where collected AH (Asian-Handicap) and OU (Over-Under) odds price of couple of bookmakers and placed bets^[The odds price offered by operators do not same with the probabilities of the result, similar with the exchange rate offered by different operators will be difference as well.].
I tune a bit in this paper which is set the forecast closed price cannot be settled price. There will be another research which is collect odds price from operators to test the ROI.
# Return of Investment
## Normal FOREX Market
```{r ROI-1}
sim_staking(pred) %>%
dplyr::select(Date, Edge1a, Edge1b, Edge2a, Edge2b, Buy, Sell, Trans, BR, Profit, Bal) %>%
kable(caption = 'ROI for Normal FOREX Market') %>%
kable_styling(bootstrap_options = c('striped', 'hover', 'condensed', 'responsive')) %>%
scroll_box(width = '100%', height = '400px')
```
*Table 5.1.1 : ROI for normal FOREX market.*
```{r plot-data2a, echo = FALSE}
fx <- sim_staking(pred, pnorm_type = 'Bid-Lo',
bid_ask = 'b1-a1', financial_bet = FALSE,
Kelly = 'none') %>%
dplyr::select(Date, Bal)
names(fx)[2] <- 'None.Close'
plotChart2(fx, graph.title = 'Peformance of Investment Fund for FOREX Market', subtitle = 'Initial Fund size : ', initial = '$10,000')
```
*Graph 5.1.1A : ROI for normal FOREX market. (None)*
```{r plot-data2b, echo = FALSE}
fx1a <- sim_staking(pred, pnorm_type = 'Bid-Lo',
bid_ask = 'b1-a1', financial_bet = FALSE,
Kelly = 'mixed') %>%
dplyr::select(Date, Bal)
fx1b <- sim_staking(pred, pnorm_type = 'Bid-Lo',
bid_ask = 'b1-a2', financial_bet = FALSE,
Kelly = 'mixed') %>%
dplyr::select(Date, Bal)
fx1c <- sim_staking(pred, pnorm_type = 'Bid-Lo',
bid_ask = 'b2-a1', financial_bet = FALSE,
Kelly = 'mixed') %>%
dplyr::select(Date, Bal)
fx1d <- sim_staking(pred, pnorm_type = 'Bid-Lo',
bid_ask = 'b2-a2', financial_bet = FALSE,
Kelly = 'mixed') %>%
dplyr::select(Date, Bal)
fx2a <- sim_staking(pred, pnorm_type = 'Ask-Lo',
bid_ask = 'b1-a1', financial_bet = FALSE,
Kelly = 'mixed') %>%
dplyr::select(Date, Bal)
fx2b <- sim_staking(pred, pnorm_type = 'Ask-Lo',
bid_ask = 'b1-a2', financial_bet = FALSE,
Kelly = 'mixed') %>%
dplyr::select(Date, Bal)
fx2c <- sim_staking(pred, pnorm_type = 'Ask-Lo',
bid_ask = 'b2-a1', financial_bet = FALSE,
Kelly = 'mixed') %>%
dplyr::select(Date, Bal)
fx2d <- sim_staking(pred, pnorm_type = 'Ask-Lo',
bid_ask = 'b2-a2', financial_bet = FALSE,
Kelly = 'mixed') %>%
dplyr::select(Date, Bal)
## convert to xts format.
fx1a <- xts(fx1a[-1], order.by = fx1a$Date)
fx1b <- xts(fx1b[-1], order.by = fx1b$Date)
fx1c <- xts(fx1c[-1], order.by = fx1c$Date)
fx1d <- xts(fx1d[-1], order.by = fx1d$Date)
fx2a <- xts(fx2a[-1], order.by = fx2a$Date)
fx2b <- xts(fx2b[-1], order.by = fx2b$Date)
fx2c <- xts(fx2c[-1], order.by = fx2c$Date)
fx2d <- xts(fx2d[-1], order.by = fx2d$Date)
## combine dataset.
fxm <- do.call(cbind, list(fx1a, fx1b, fx1c, fx1d, fx2a, fx2b, fx2c, fx2d))
names(fxm) <- c('BL.B1A1.Close', 'BL.B1A2.Close', 'BL.B2A1.Close', 'BL.B2A2.Close',
'AL.B1A1.Close', 'AL.B1A2.Close', 'AL.B2A1.Close', 'AL.B2A2.Close')
rm(fx1a, fx1b, fx1c, fx1c, fx1d, fx1a, fx2b, fx2c, fx2c, fx2d)
plotChart2(fxm, graph.title = 'Peformance of Investment Fund for FOREX Market', subtitle = 'Initial Fund size : ', initial = '$10,000')
```
*Graph 5.1.1B : ROI for normal FOREX market. (Mixed Kelly model)*
```{r plot-data2c, echo = FALSE}
fx1a <- sim_staking(pred, pnorm_type = 'Bid-Lo',
bid_ask = 'b1-a1', financial_bet = FALSE,
Kelly = 'normal') %>%
dplyr::select(Date, Bal)
fx1b <- sim_staking(pred, pnorm_type = 'Bid-Lo',
bid_ask = 'b1-a2', financial_bet = FALSE,
Kelly = 'normal') %>%
dplyr::select(Date, Bal)
fx1c <- sim_staking(pred, pnorm_type = 'Bid-Lo',
bid_ask = 'b2-a1', financial_bet = FALSE,
Kelly = 'normal') %>%
dplyr::select(Date, Bal)
fx1d <- sim_staking(pred, pnorm_type = 'Bid-Lo',
bid_ask = 'b2-a2', financial_bet = FALSE,
Kelly = 'normal') %>%
dplyr::select(Date, Bal)
fx2a <- sim_staking(pred, pnorm_type = 'Ask-Lo',
bid_ask = 'b1-a1', financial_bet = FALSE,
Kelly = 'normal') %>%
dplyr::select(Date, Bal)
fx2b <- sim_staking(pred, pnorm_type = 'Ask-Lo',
bid_ask = 'b1-a2', financial_bet = FALSE,
Kelly = 'normal') %>%
dplyr::select(Date, Bal)
fx2c <- sim_staking(pred, pnorm_type = 'Ask-Lo',
bid_ask = 'b2-a1', financial_bet = FALSE,
Kelly = 'normal') %>%
dplyr::select(Date, Bal)
fx2d <- sim_staking(pred, pnorm_type = 'Ask-Lo',
bid_ask = 'b2-a2', financial_bet = FALSE,
Kelly = 'normal') %>%
dplyr::select(Date, Bal)
## convert to xts format.
fx1a <- xts(fx1a[-1], order.by = fx1a$Date)
fx1b <- xts(fx1b[-1], order.by = fx1b$Date)
fx1c <- xts(fx1c[-1], order.by = fx1c$Date)
fx1d <- xts(fx1d[-1], order.by = fx1d$Date)
fx2a <- xts(fx2a[-1], order.by = fx2a$Date)
fx2b <- xts(fx2b[-1], order.by = fx2b$Date)
fx2c <- xts(fx2c[-1], order.by = fx2c$Date)
fx2d <- xts(fx2d[-1], order.by = fx2d$Date)
## combine dataset.
fxm <- do.call(cbind, list(fx1a, fx1b, fx1c, fx1d, fx2a, fx2b, fx2c, fx2d))
names(fxm) <- c('BL.B1A1.Close', 'BL.B1A2.Close', 'BL.B2A1.Close', 'BL.B2A2.Close',
'AL.B1A1.Close', 'AL.B1A2.Close', 'AL.B2A1.Close', 'AL.B2A2.Close')
rm(fx1a, fx1b, fx1c, fx1c, fx1d, fx1a, fx2b, fx2c, fx2c, fx2d)
plotChart2(fxm, graph.title = 'Peformance of Investment Fund for FOREX Market', subtitle = 'Initial Fund size : ', initial = '$10,000')
```
*Graph 5.1.1C : ROI for normal FOREX market. (Normal Kelly model)*
```{r plot-data2d, echo = FALSE}
fx1a <- sim_staking(pred, pnorm_type = 'Bid-Lo',
bid_ask = 'b1-a1', financial_bet = FALSE,
Kelly = 'adjusted1') %>%
dplyr::select(Date, Bal)
fx1b <- sim_staking(pred, pnorm_type = 'Bid-Lo',
bid_ask = 'b1-a2', financial_bet = FALSE,
Kelly = 'adjusted1') %>%
dplyr::select(Date, Bal)
fx1c <- sim_staking(pred, pnorm_type = 'Bid-Lo',
bid_ask = 'b2-a1', financial_bet = FALSE,
Kelly = 'adjusted1') %>%
dplyr::select(Date, Bal)
fx1d <- sim_staking(pred, pnorm_type = 'Bid-Lo',
bid_ask = 'b2-a2', financial_bet = FALSE,
Kelly = 'adjusted1') %>%
dplyr::select(Date, Bal)