## Abstract

In this study, 59132 trading rules were backtested and analyzed to determine the statistical significance of their results. The tested rules varied in complexity. There were 56 independent tests, one for each symbol from the WIG20 and mWIG40 indices. Two methods were used for statistical testing: White’s Reality Check and Monte-Carlo Simulation. This work was inspired by a case study from the book “Evidence-based Technical Analysis” as well as the whitepaper “Re-Examining the Profitability of Technical Analysis with White's Reality Check and Hansen's SPA Test”.

## Introduction

Technical analysis (TA) tries to analyze price trends and chart patterns to find profitable trading opportunities. Modern TA strategies employ models and trading rules based on different raw and processed time series. The type of data used varies from pricing and volume data to more sophisticated sources like web-scraped sentiment data. One of the advantages of the TA approach is that every aspect of the trading process can be programmed unambiguously. Depending on the sophistication of the system, one can achieve various degrees of automation, ranging from semi-automated systems (where traders run their scripts to acquire trading signals and then trade manually) to fully autonomous processes.

The typical flow for developing a trading strategy is to come up with one or more trading rules and then backtest them. Backtesting is simply a simulation run on historical data to see how a rule would have performed if it had been used in the past. At a high level, if the rule has any predictive power and it performed well in the past, there is a chance that it will continue to do so in the near future.

This development process is full of traps and biases that make backtesting results over-optimistic. A non-exhaustive list of things that can go wrong:

- Not taking trading costs into account (frequent trading may cost a lot: there are fees and other costs such as the bid/ask spread)
- Look-ahead bias (using data points that would not have been available at that moment in a real-life scenario)
- Position bias (e.g. testing a trading rule that tends to take long positions on historical data with an uptrend)
- Over-fitting / data-snooping bias (testing tens of thousands of rules/parameters on a single time series, so that good results may be a matter of pure luck rather than the rule's predictive power)

Many studies deal with the above-mentioned issues. Some of the problems are easier to avoid (e.g. look-ahead bias or the inclusion of trading costs) while others are more challenging, and data-snooping bias is one of them. Most trading rules depend on many parameters. For example, consider a trading rule that uses two moving averages of a stock's close prices to produce a trading signal. Even such a simple scenario has at least 3 parameters: the moving average type (simple, exponential, weighted, etc.) and two lookback periods for the averages. To choose appropriate parameters, one may run multiple tests and pick the set that performs best. If one is to test two types of moving averages and six lookback periods, one will have to run 30 tests (assuming that one moving average is always shorter than the other). All of those 30 possibilities will be tested on the same historical time series. If one increases the parameter search space and the rule's complexity, the number of necessary tests increases exponentially. The more tests one runs, the higher the chance that the achieved performance is due to pure luck rather than the rule's predictive power.
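The count of 30 tests can be reproduced with a few lines of Python. This is only an illustrative sketch: the two MA types and the six lookback values below are assumed for the example and are not taken from the study.

```python
from itertools import combinations, product

ma_types = ["simple", "exponential"]       # two MA types (assumed for illustration)
lookbacks = [5, 10, 20, 50, 100, 200]      # six lookback periods (assumed values)

# One MA is always shorter than the other, so each rule uses an unordered pair.
pairs = list(combinations(lookbacks, 2))   # C(6, 2) = 15 (short, long) pairs
rules = list(product(ma_types, pairs))     # 2 types * 15 pairs

print(len(pairs), len(rules))  # 15 30
```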

In this article, data-mining methods are used to find a profitable TA trading strategy. As tens of thousands of rules are tested, it is necessary to control for data-snooping bias. Two statistical methods are used to test whether the best-performing rules have actual predictive power or whether their superior performance is due to luck. The work is inspired by the book “Evidence-based Technical Analysis” by David Aronson and the whitepaper “Re-Examining the Profitability of Technical Analysis with White's Reality Check and Hansen's SPA Test” by Po-Hsuan Hsu and Chung-Ming Kuan.

## Methodology

The primary focus of the study was to find profitable TA rules that exhibit predictive power. For that, data mining was used. It is a process in which the profitability of many rules is tested and compared so that one or many superior rules can be selected.

## Universe

The study tested 56 symbols that belonged (at the time of the study) to the WIG20 and mWIG40 indices of the Warsaw Stock Exchange. The assumption was that no TA rule works universally best across all underlying stock symbols. Symbols (or rather, the companies represented by a symbol's price movement) have various characteristics, and different TA rules may exploit their different traits and inefficiencies.

## Rules

For each symbol, 59132 rules were tested independently. Those rules can be classified into multiple categories. A description of the rules, their parameter values, and the number of combinations used for each class can be found in the appendix. At a high level, there are two types of rules: simple and complex. Simple rules follow a set of unchanged principles throughout the whole backtest. A good example of such a rule is the classic two moving averages (MA) strategy, where one buys if the shorter MA crosses the longer one from below and sells when it crosses it from above. Complex rules, on the other hand, consist of many simple rules. During the testing period, their performance is periodically reviewed, and the rule's behavior can change accordingly. As an example, one can think of a complex rule that always follows its best-performing constituent. Throughout the testing period, this complex rule may follow a different simple rule after each review, depending on the performance metric.
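As an illustration of a simple rule, the two-MA strategy can be sketched as below. This is not the study's framework code: the function names are hypothetical, simple moving averages are assumed, and "sell" is interpreted as switching to a short position.

```python
def sma(values, window):
    """Simple moving average; None until `window` observations are available."""
    return [
        None if i + 1 < window else sum(values[i + 1 - window:i + 1]) / window
        for i in range(len(values))
    ]

def ma_crossover_positions(closes, short_window, long_window):
    """Daily position: +1 (long), -1 (short), 0 (no position taken yet)."""
    short_ma = sma(closes, short_window)
    long_ma = sma(closes, long_window)
    positions, current = [], 0
    for s, l in zip(short_ma, long_ma):
        if s is not None and l is not None:
            if s > l:
                current = 1     # short MA above long MA: long
            elif s < l:
                current = -1    # short MA below long MA: short
        positions.append(current)
    return positions

closes = [10, 11, 12, 13, 12, 11, 10, 9, 10, 11, 12]
print(ma_crossover_positions(closes, 2, 4))
# [0, 0, 0, 1, 1, -1, -1, -1, -1, 1, 1]
```

To avoid look-ahead bias, a backtest would apply the position computed at the close of day t to the return of day t+1.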

## Statistical testing

As a backtest performance metric the average daily returns were used. For the statistical testing, two methods were used - White’s Reality Check and Monte-Carlo Simulation. Both of the procedures are described in the appendix. At a high level though, in both of them, one artificially creates a distribution of best average daily returns. Then, returns from the actual backtest are compared to that distribution to see what the probability is of achieving such a result. If the probability is very low one can assume statistical significance. It would mean that there is a high chance that a rule has predictive power and superior test performance is a result of that.
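The comparison against the sampling distribution boils down to an empirical p-value. The sketch below uses made-up numbers purely for illustration, and the helper name `p_value` is hypothetical.

```python
def p_value(observed_best, sampling_distribution):
    """Share of simulated best returns at least as large as the observed one."""
    exceed = sum(1 for x in sampling_distribution if x >= observed_best)
    return exceed / len(sampling_distribution)

# Hypothetical simulated best average daily returns (not results from the study).
simulated_best = [0.0004, 0.0007, 0.0009, 0.0011, 0.0012,
                  0.0013, 0.0015, 0.0016, 0.0018, 0.0021]
observed = 0.0020

p = p_value(observed, simulated_best)
print(p)  # 0.1
```

A small p-value means that a return as high as the observed one is rarely produced by the simulated distribution.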

Using such tests protects against data-snooping bias. As shown later in the results section, it is common for a rule to achieve a good result purely by chance. When that happens, any assumption about the rule’s future performance is just a gamble and cannot be relied on.

## Framework

All data processing and computation was done using Python and a custom-built framework. The framework had three main parts: signal generation, backtesting, and a results analysis module. This structure helped to reuse code and optimize computations, and it will serve as a foundation for further research. The code is available on the author's git page.

## Results

Table 1 summarizes the achieved results. For exact definitions of the rules see Appendix 1, where all parameters are listed. In Table 1, each rule is described by its class abbreviation followed by parameter names and values.

The two statistical tests can yield different results. To have a single metric, a rule was assigned the “Statistically Significant” label if the outcomes of both tests were statistically significant. If only one test was positive, the “Directionally Positive” label was assigned.
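The labeling logic can be expressed as a small function. This is a sketch only: the function name is hypothetical, and the 0.05 significance level is an assumption, since the threshold used in the study is not stated here. The third label, “No predictive power”, is the one used in Table 1.

```python
def significance_label(p_wrc, p_mc, alpha=0.05):
    """Combine the p-values of both tests into one label (alpha is an assumed threshold)."""
    passed = int(p_wrc < alpha) + int(p_mc < alpha)
    if passed == 2:
        return "Statistically significant"
    if passed == 1:
        return "Directionally positive"
    return "No predictive power"

print(significance_label(0.01, 0.02))  # Statistically significant
print(significance_label(0.01, 0.20))  # Directionally positive
print(significance_label(0.30, 0.20))  # No predictive power
```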

Among all tested signals, only 17 backtests resulted in a statistically significant outcome of some sort (10 “Statistically Significant” and 7 “Directionally Positive”). Of the symbols for which rules demonstrated superior, statistically significant performance, 6 belong to the WIG20 index and 11 to mWIG40.

Following are the classes of rules that achieved satisfying results for at least one symbol: Momentum Strategies in Volumes (1), Filter (4), Reversed Filter (1), On-balance Volume (3), Reversed Moving Averages (3), Moving Averages (2), Reversed Complex (learning) (2), Complex (learning) (1). Table 1 shows that even simple rules can achieve superior outcomes. Actually, only 3 out of 17 rules were complex. Another unexpected result is that 6 rules were of “reversed” type.

The presented results emphasize the importance of statistical testing to uncover data-snooping bias. Although all 56 top rules found with the data-mining procedure achieved superior positive returns, only 30% of them did so because the rule had actual predictive power. This number would be even smaller (18%) if one took a more conservative approach and counted as statistically significant only those rules for which both tests yielded statistical significance.

Symbol | Index | Best rule | Rule class | Statistical significance | Avg. daily returns | Days in position | Length of backtest (years) |
---|---|---|---|---|---|---|---|
11BIT | mWIG40 | MSV ROC m2 k1c50 | Momentum Strategies in Volumes | Directionally positive | 0.003476 | 2349 | 10 |
ALIOR | WIG20 | CPX LRN filter ma cb oba m120 r5 daily returns | Complex (learning) | No predictive power | 0.001507 | 1586 | 8 |
AMICA | mWIG40 | oba S n2 | On-balance Volume | Statistically significant | 0.002471 | 5678 | 23 |
AMREST | mWIG40 | filter 07 DL lb3 | Filter | Directionally positive | 0.001543 | 3709 | 15 |
ASSECOPOL | WIG20 | reversed CPX LRN ma oba msp m10 r5 daily returns | Reversed Complex (learning) | Statistically significant | 0.001376 | 5163 | 22 |
BENEFIT | mWIG40 | reversed CPX LRN ma cb msv m5 r5 avg log returns held only | Reversed Complex (learning) | No predictive power | 0.001598 | 1535 | 9 |
BOGDANKA | mWIG40 | CPX LRN ma oba msv m120 r5 daily returns | Complex (learning) | No predictive power | 0.001376 | 2320 | 11 |
BORYSZEW | mWIG40 | ma S n25m20 | Moving Averages | Directionally positive | 0.003664 | 5945 | 24 |
BUDIMEX | mWIG40 | reversed ma S n40m30 b015 | Reversed Moving Averages | No predictive power | 0.001231 | 6214 | 25 |
CCC | WIG20 | reversed MSV XAVGS m20n2 k05c10 | Reversed Momentum Strategies in Volumes | No predictive power | 0.00152 | 3548 | 16 |
CDPROJEKT | WIG20 | ma S n100m50 c25 | Moving Averages | Directionally positive | 0.001974 | 6306 | 26 |
CIECH | mWIG40 | oba S n2 | On-balance Volume | Statistically significant | 0.002505 | 3825 | 15 |
CIGAMES | mWIG40 | reversed CPX LRN filter support resistance ma cb msp msv m20 r20 daily returns | Reversed Complex (learning) | No predictive power | 0.002067 | 2135 | 13 |
COMARCH | mWIG40 | filter 025 EL20 lb28 | Filter | Statistically significant | 0.001556 | 5278 | 21 |
CYFRPLSAT | WIG20 | reversed ma S n10 | Reversed Moving Averages | Statistically significant | 0.002109 | 3009 | 12 |
DINOPL | mWIG40 | reversed ma S n5m2 b02 | Reversed Moving Averages | No predictive power | 0.00337 | 772 | 3 |
ECHO | mWIG40 | CPX LRN filter ma cb cdl m120 r5 daily returns | Complex (learning) | No predictive power | 0.001136 | 5237 | 24 |
ENEA | mWIG40 | reversed CPX LRN filter ma cb oba m20 r20 daily returns | Reversed Complex (learning) | No predictive power | 0.000978 | 2387 | 12 |
ENERGA | WIG20 | reversed CPX LRN filter ma cb oba msv m20 r5 daily returns | Reversed Complex (learning) | No predictive power | 0.001641 | 1184 | 7 |
FAMUR | mWIG40 | filter 07 EL15 lb28 | Filter | No predictive power | 0.001526 | 3355 | 14 |
FORTE | mWIG40 | CPX LRN filter support resistance ma cb oba cdl m120 r5 daily returns | Complex (learning) | No predictive power | 0.001394 | 4980 | 24 |
GETIN | mWIG40 | filter 015 EL10 lb14 | Filter | Statistically significant | 0.002339 | 4736 | 19 |
GPW | mWIG40 | reversed ma S n40m20 c25 | Reversed Moving Averages | No predictive power | 0.000999 | 2345 | 10 |
GRUPAAZOTY | mWIG40 | oba S n2 | On-balance Volume | No predictive power | 0.001806 | 2979 | 12 |
GTC | mWIG40 | MSV AVG m2 k05c50 | Momentum Strategies in Volumes | No predictive power | 0.001034 | 3960 | 16 |
HANDLOWY | mWIG40 | reversed ma S n10m2 c50 | Reversed Moving Averages | No predictive power | 0.000892 | 5722 | 23 |
INGBSK | mWIG40 | reversed CPX LRN cb oba msp m10 r10 avg log returns held only | Reversed Complex (learning) | No predictive power | 0.0010168 | 5865 | 26 |
INTERCARS | mWIG40 | filter 08 DL lb14 | Filter | No predictive power | 0.001395 | 3724 | 16 |
JSW | mWIG40 | oba S n2 | On-balance Volume | No predictive power | 0.002541 | 2219 | 9 |
KERNEL | mWIG40 | reversed oba S n50m20 d3 | Reversed On-balance Volume | No predictive power | 0.001409 | 3072 | 13 |
KETY | mWIG40 | reversed CPX LRN support resistance ma cb oba msv m120 r20 avg log returns | Reversed Complex (learning) | No predictive power | 0.000986 | 5781 | 24 |
KGHM | WIG20 | reversed CPX LRN oba msp msv m60 r20 avg log returns | Reversed Complex (learning) | No predictive power | 0.001199 | 5435 | 23 |
KRUK | mWIG40 | MSV AVG m5 k2c10 | Momentum Strategies in Volumes | No predictive power | 0.001534 | 1580 | 9 |
LIVECHAT | mWIG40 | reversed MSV XAVGS m10n2 k2c10 | Reversed Momentum Strategies in Volumes | No predictive power | 0.002025 | 1375 | 6 |
LOTOS | WIG20 | CPX LRN ma cb msp msv m120 r20 daily returns | Complex (learning) | Statistically significant | 0.001533 | 3289 | 15 |
LPP | WIG20 | reversed CPX LRN filter ma oba msp m10 r10 avg log returns | Reversed Complex (learning) | Directionally positive | 0.001573 | 4405 | 19 |
MABION | mWIG40 | reversed CPX LRN filter ma msp cdl m60 r10 daily returns | Reversed Complex (learning) | No predictive power | 0.002246 | 1844 | 10 |
MBANK | WIG20 | MSV AVG m5 k05c50 | Momentum Strategies in Volumes | No predictive power | 0.00126 | 6527 | 28 |
MILLENNIUM | mWIG40 | CPX LRN support resistance m20 r20 daily returns | Complex (learning) | No predictive power | 0.001411 | 2826 | 28 |
ORANGEPL | WIG20 | reversed CPX LRN ma cb msv cdl m5 r5 daily returns | Reversed Complex (learning) | No predictive power | 0.000967 | 4217 | 22 |
ORBIS | mWIG40 | oba S n25m10 c50 | On-balance Volume | No predictive power | 0.000907 | 5552 | 23 |
PEKAO | WIG20 | reversed CPX LRN oba msp cdl m5 r5 daily returns | Reversed Complex (learning) | No predictive power | 0.001088 | 4621 | 22 |
PGE | WIG20 | reversed CPX LRN filter cb oba m120 r20 daily returns | Reversed Complex (learning) | No predictive power | 0.001385 | 2171 | 11 |
PGNIG | WIG20 | reversed filter 005 EL2 lb3 | Filter | Statistically significant | 0.001403 | 3669 | 15 |
PKNORLEN | WIG20 | reversed ma S n5m2 d4 | Reversed Moving Averages | No predictive power | 0.001064 | 4708 | 21 |
PKOBP | WIG20 | reversed filter 005 EL3 lb7 | Reversed Filter | Statistically significant | 0.00141 | 3885 | 16 |
PKPCARGO | mWIG40 | reversed MSV XAVGS m5n2 k15c50 | Reversed Momentum Strategies in Volumes | No predictive power | 0.002165 | 1597 | 7 |
PLAY | WIG20 | reversed CPX LRN filter oba msv cdl m20 r5 avg log returns | Reversed Complex (learning) | No predictive power | 0.00247 | 447 | 3 |
PLAYWAY | mWIG40 | reversed ma S n40m30 b05 | Reversed Moving Averages | Directionally positive | 0.004792 | 861 | 4 |
PZU | WIG20 | reversed CPX LRN filter ma oba cdl m10 r5 daily returns | Reversed Complex (learning) | No predictive power | 0.001085 | 227 | 10 |
SANPL | WIG20 | MSV AVG m2 k1c50 | Momentum Strategies in Volumes | No predictive power | 0.001325 | 6581 | 27 |
STALPROD | mWIG40 | oba S n2 | On-balance Volume | Statistically significant | 0.002199 | 5604 | 23 |
TAURONPE | WIG20 | reversed oba S n5m2 b03 | Reversed On-balance Volume | No predictive power | 0.001148 | 2473 | 10 |
TRAKCJA | mWIG40 | CPX LRN filter ma cb msp cdl m120 r5 daily returns | Complex (learning) | No predictive power | 0.002029 | 2507 | 2 |
WAWEL | mWIG40 | reversed ma S n5 | Reversed Moving Averages | Directionally positive | 0.001221 | 5378 | 22 |
WIRTUALNA | mWIG40 | reversed ma S n5m2 b01 | Reversed Moving Averages | No predictive power | 0.001721 | 1262 | 5 |

## Summary

One should not take for granted that a rule’s impressive backtest returns will persist in the foreseeable future. Along with superior returns, the rule has to have predictive power. This is especially important if one is testing multiple rules on the same historical time series.

This study showed that data mining can be used to find rules that provide positive returns as well as predictive power. The importance of statistical testing was also shown: only 17 out of 56 backtests resulted in a statistically significant outcome. Surprisingly, most of the “winning” rules were fairly simple. The discovered rules could be used as a stand-alone strategy or be incorporated into one.

## Appendix 1 – Rules definitions

**Filter rules**

If the price closes at least x% above the last low within the lookback period, buy and hold until the price moves down at least x% from a subsequent high. At that point, simultaneously sell and go short. Moves of less than x% in either direction are ignored. A subsequent high/low can be defined as follows:

- A subsequent high is the highest closing price achieved while holding a particular long position. Likewise, a subsequent low is the lowest closing price achieved while holding a particular short position
- Alternatively, a low/high can be defined as the most recent closing price that is less/greater than the "e" previous closing prices

There are the following variations of this rule:

- Allowing a neutral position to be imposed, that is, closing a position without going long/short. This is accomplished by liquidating a long position when the price decreases y% from the previous high and covering a short position when the price increases y% from the previous low.
- Holding a position for a fixed number of days, c, and ignoring all other signals generated during the holding period.

Used parameters and their values:

- x = 0.005, 0.01, 0.015, 0.02, 0.025, 0.03, 0.035, 0.04, 0.045, 0.05, 0.06, 0.07, 0.08, 0.09, 0.1, 0.12, 0.14, 0.16, 0.18, 0.2, 0.25, 0.3, 0.4, 0.5 (24 values)
- y = 0.005, 0.01, 0.015, 0.02, 0.025, 0.03, 0.04, 0.05, 0.075, 0.1, 0.15, 0.2 (12 values)
- e = 1, 2, 3, 4, 5, 10, 15, 20 (8 values)
- c = 5, 10, 25, 50 (4 values)

Number of rules:

- Basic rule: 120 possibilities
- Basic with c: 360
- Alternate low/high: 480
- Alternate low/high and c: 1440
- Basic with neutral position (x,y): 925
- Total: 3325
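The basic filter rule (without the neutral position and without the fixed holding period) can be sketched as follows. This is an illustrative reading of the description above, not the study's implementation: the function name is hypothetical, x is expressed as a fraction, and the initial short entry is assumed to mirror the long entry.

```python
def filter_rule_positions(closes, x, lookback):
    """Basic filter rule: +1 long, -1 short, 0 before the first signal."""
    positions = []
    position = 0
    extreme = None  # subsequent high while long, subsequent low while short
    for t, price in enumerate(closes):
        if position == 0:
            window = closes[max(0, t - lookback):t]  # previous `lookback` closes
            if window:
                if price >= min(window) * (1 + x):
                    position, extreme = 1, price
                elif price <= max(window) * (1 - x):
                    position, extreme = -1, price
        elif position == 1:
            extreme = max(extreme, price)            # track the subsequent high
            if price <= extreme * (1 - x):
                position, extreme = -1, price        # sell and go short
        else:
            extreme = min(extreme, price)            # track the subsequent low
            if price >= extreme * (1 + x):
                position, extreme = 1, price         # cover and go long
        positions.append(position)
    return positions

closes = [100, 101, 106, 110, 104, 103, 109]
print(filter_rule_positions(closes, x=0.05, lookback=3))
# [0, 0, 1, 1, -1, -1, 1]
```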

**Support & Resistance rules**

A simple trading rule based on the notion of support and resistance (S&R). Go long when the closing price exceeds the maximum price over the previous n days (lookback period) and sell when the closing price is less than the minimum price over the previous n days. As an alternative one can define minimum/maximum to be the most recent closing price that is less/greater than the “e” previous closing prices. After entry, positions were held for a prespecified number of days, “c”.

Used parameters and their values:

- lookback = 3, 7, 14, 28, 56 (5 values)
- e = 2, 3, 5, 10, 15, 20 (6 values)
- c = 5, 10, 15, 25 (4 values)

Number of rules:

- Standard low/high price definition: 20 possibilities
- Alternate low/high definition: 80
- Total: 100

**Moving Averages rules**

In an uptrend, long positions are retained as long as the closing price remains above the moving average (MA). Thus, when the price goes below the MA, it is a sell signal. In a downtrend, short positions are held as long as the price remains below the MA. When the price breaks above the MA, it is regarded as a buy signal. There are the following variations of this rule:

- Two MAs are used, a short (fast) one and a long (slow) one. Buy and sell signals are generated when the fast MA crosses the slow MA.
- A fixed percentage band filter, which requires the buy or sell signal to exceed the moving average by a fixed multiplicative amount, b.
- A time delay filter, which requires the buy or sell signal to remain valid for a prespecified number of days, d, before action is taken.
- Holding a given long or short position for a prespecified number of days, c.

Used parameters and their values:

- n/m (MA lookback periods): 2, 5, 10, 15, 20, 25, 30, 40, 50, 75, 100, 125, 150, 200, 250 (15 values)
- b: 0.001, 0.005, 0.01, 0.015, 0.02, 0.03, 0.04, 0.05 (8 values)
- d: 2, 3, 4, 5 (4 values)
- c: 5, 10, 25, 50 (4 values)

Number of rules:

- Basic version: 15 possibilities
- 2 MA version: 105 possibilities
- 2 MA with “b”: 840 possibilities
- 2 MA with “d”: 420 possibilities
- 2 MA with “c”: 420 possibilities
- Total: 1809
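The itemized counts follow directly from the parameter grid, as the short check below shows (variable names are illustrative). Note that the five itemized variants add up to 1800; the stated total of 1809 includes 9 further cases that are not itemized here, which matches the "minus 9 extra cases" remark in the On-Balance Volume section.

```python
from math import comb

n_lookbacks, n_b, n_d, n_c = 15, 8, 4, 4   # sizes of the parameter lists above

basic = n_lookbacks                 # single MA: one rule per lookback
two_ma = comb(n_lookbacks, 2)       # unordered (short, long) pairs
with_band = two_ma * n_b            # 2 MA with band filter b
with_delay = two_ma * n_d           # 2 MA with time delay d
with_hold = two_ma * n_c            # 2 MA with holding period c

itemized = basic + two_ma + with_band + with_delay + with_hold
print(basic, two_ma, with_band, with_delay, with_hold, itemized)
# 15 105 840 420 420 1800
```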

**Channel Break-Outs rules**

A channel can be said to occur when the high over the previous n days is within x% of the low over the previous n days, not including the current price. Go long when the closing price exceeds the channel, and sell when the price moves below the channel. Long and short positions are held for a fixed number of days, c. Additionally, a fixed percentage band, b, can be applied to the channel as a filter.

Used parameters and their values:

- n: 5, 10, 15, 20, 25, 50, 100, 150, 200, 250 (10 values)
- x: 0.005, 0.01, 0.02, 0.03, 0.05, 0.075, 0.10, 0.15 (8 values)
- c: 2, 5, 10, 25, 50 (5 values)
- b: 0.001, 0.005, 0.01, 0.015, 0.02, 0.03, 0.04, 0.05 (8 values)

Number of rules:

- Standard (no “b”): 400
- With all parameters: 2150
- Total: 2550

**On-Balance Volume Averages rules**

These rules are based on transaction volume data. The on-balance volume (OBV) indicator is calculated by keeping a running total: each day the entire daily volume is added when the closing price increases and subtracted when the closing price decreases. An n-day moving average is then applied to the OBV indicator. The OBV trading rules employed are the same as the Moving Averages rules, except that the value of interest is the OBV rather than the price.

Used parameters and their values:

- Same as for MA.

Number of rules:

- 1800 (same as for MA minus 9 extra cases)
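The OBV running total described above can be sketched in a few lines. Starting the total at 0 and leaving it unchanged on days with an unchanged close are assumptions of this sketch; the function name is illustrative.

```python
def on_balance_volume(closes, volumes):
    """Running OBV total; starts at 0 (assumed) on the first day."""
    obv = [0]
    for i in range(1, len(closes)):
        if closes[i] > closes[i - 1]:
            obv.append(obv[-1] + volumes[i])   # close up: add the day's volume
        elif closes[i] < closes[i - 1]:
            obv.append(obv[-1] - volumes[i])   # close down: subtract it
        else:
            obv.append(obv[-1])                # unchanged close: keep the total
    return obv

closes = [10, 11, 10.5, 10.5, 12]
volumes = [100, 200, 150, 300, 250]
print(on_balance_volume(closes, volumes))  # [0, 200, 50, 50, 300]
```

The resulting series can then be fed into the same moving-average logic that is used for prices.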

**Momentum Strategies in Price/Volume**

These rules use a so-called “oscillator” constructed from a momentum measure. The momentum measure used in this study is the rate of change (ROC). Specifically, the m-day ROC at time t is $(q_t - q_{t-m}) / q_{t-m}$, where $q_t$ is the closing price (or volume) at time t.

Three different oscillators were used:

- Simple oscillator (it is just the m-day ROC)
- Moving average oscillator (w-day simple MA of m-day ROC with w ≤ m)
- Cross-over moving average oscillator (ratio of the w1-day moving average to the w2-day moving average, both based on m-day ROC, with w1 < w2)

Used parameters and their values:

- m (number of days for ROC calculations): 2, 5, 10, 20, 30, 40, 50, 60, 125, 250 (10 values)
- w (number of days for the moving averages): 2, 5, 10, 20, 30, 40, 50, 60, 125, 250 (10 values)
- k (oscillator threshold used as overbought/oversold level): 0.05, 0.10, 0.15, 0.2 (4 values)
- f (fixed holding days): 5, 10, 25, 50 (4 values)

Number of rules:

- Simple oscillator: 200
- MA ROC oscillator: 200
- Cross-over moving average oscillator: 900
- Total: 1300*2 = 2600 (as there are two sets of rules, one using price data and one using volume data)
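The ROC and the moving average oscillator built on it can be sketched as below. The function names are hypothetical, and how the threshold k turns the oscillator into long/short signals is deliberately left out, because the exact signal logic is not spelled out here.

```python
def rate_of_change(series, m):
    """m-day ROC: (q_t - q_{t-m}) / q_{t-m}; None for the first m observations."""
    return [
        None if t < m else (series[t] - series[t - m]) / series[t - m]
        for t in range(len(series))
    ]

def ma_oscillator(series, m, w):
    """w-day simple moving average of the m-day ROC (None until enough data)."""
    roc = rate_of_change(series, m)
    out = []
    for t in range(len(roc)):
        window = roc[max(0, t - w + 1):t + 1]
        if len(window) < w or any(v is None for v in window):
            out.append(None)
        else:
            out.append(sum(window) / w)
    return out

closes = [100, 110, 120, 132, 150]
roc = rate_of_change(closes, 2)    # approximately [None, None, 0.2, 0.2, 0.25]
osc = ma_oscillator(closes, 2, 2)  # approximately [None, None, None, 0.2, 0.225]
print(roc, osc)
```

The same functions work on a volume series, which gives the volume-based variant of the rules.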

**Rules based on candlestick patterns**

These rules are based on candlestick formations: morning/evening star, bearish/bullish engulfing and hammer/hanging man. If any of the formations (checked in the order mentioned) triggers a long/short signal, the rule enters a trade. The trade is held for c days. The lookback period is used within the formation’s calculations. For more details about the pattern implementation see the previous post. Used parameters and their values:

- n/m (MA lookback periods): 2, 5, 10, 15, 20, 25, 30, 40, 50, 75, 100, 125, 150, 200, 250 (15 values)
- lookback: 3, 7, 14, 21, 28, 56 (6 values)
- c: 1, 2, 5, 7, 14, 21, 28 (7 values)

- Total: 6*7 = 42

**Combined rules**

Combined rules work similarly to standard simple rules. Each combined rule reuses many simple rules and uses a voting strategy to choose a position. If the majority of its simple rules go long, the rule outputs a long position. In case of a tie, a neutral position is taken. Otherwise, the rule outputs a short position.

To generate those rules, simple rules were grouped into 8 different classes (filter rules, MA rules etc.). Then, simple rules (10 or 20) from each class were randomly chosen. This was done 10 times to obtain different sets of rules. A combined rule could consist of rules from one or many classes. All class combinations were used.

Used parameters and their values:

- Number of classes combinations: (255 values)
- Number of rules from class: 20, 40 (2 values)
- Number of different samples: 10 (10 values)

Number of rules:

- Total: 5100
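A minimal sketch of the voting step and of the rule count is shown below. The function name is hypothetical, and comparing the number of long votes with the number of short votes is one possible reading of "majority"; the study's framework may handle neutral votes differently.

```python
def majority_vote(positions):
    """Combine one day's simple-rule positions (-1, 0 or 1) into a single position."""
    longs = sum(1 for p in positions if p == 1)
    shorts = sum(1 for p in positions if p == -1)
    if longs > shorts:
        return 1
    if longs == shorts:
        return 0      # tie: neutral position
    return -1

print(majority_vote([1, 1, -1]))      # 1
print(majority_vote([1, -1]))         # 0
print(majority_vote([-1, -1, 1, 0]))  # -1

# Rule count: non-empty subsets of 8 classes * 2 sample sizes * 10 random samples.
class_combinations = 2 ** 8 - 1
total_combined = class_combinations * 2 * 10
print(class_combinations, total_combined)  # 255 5100
```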

**Complex (learning) rules:**

Similar to combined rules, complex ones consist of many simple rules from different classes. Unlike all the other rules, complex rules are reviewed, assessed, and changed periodically during the backtest. There are three parameters: review span (frequency of reviews), memory span (amount of past data taken into account while assessing performance), and performance metric (on which the decision is based). At each review, the best-performing simple rule is chosen and followed until the next review. To choose which simple rules should be included, the same procedure as for combined rules was used. The following metrics were used to assess performance:

- daily_returns – sum of the rule's daily returns
- avg_log_returns – average of log returns
- avg_log_returns_held_only – average of log returns, but only on days where a position was held
- voting – the position most frequently taken by all the simple rules

Used parameters and their values:

- Number of classes combinations: (255 values)
- Number of rules from class: 20, 40 (2 values)
- Number of different samples: 10 (10 values)
- Memory spans: 5, 10, 20, 60, 120 (5 values)
- Review spans: 5, 10, 20 (3 values)
- Performance metrics: (4 values)

Number of rules:

- Total: 12240
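The review mechanism can be sketched as follows, using the daily_returns metric (sum of daily returns over the memory span). This is an illustrative sketch under stated assumptions, not the framework's code: the function name is hypothetical, the first review is assumed to happen once a full memory span is available, and a rule's position on day t is assumed to earn the market return of day t.

```python
def learning_rule_positions(rule_positions, market_returns, memory_span, review_span):
    """Follow the simple rule with the best summed return over the last `memory_span` days.

    rule_positions: dict mapping rule name -> list of daily positions (-1, 0, 1).
    """
    followed = None
    out = []
    for t in range(len(market_returns)):
        # Review every `review_span` days, starting once enough history exists.
        if t >= memory_span and (t - memory_span) % review_span == 0:
            window = range(t - memory_span, t)   # past days only, no look-ahead
            scores = {
                name: sum(pos[i] * market_returns[i] for i in window)
                for name, pos in rule_positions.items()
            }
            followed = max(scores, key=scores.get)
        out.append(rule_positions[followed][t] if followed is not None else 0)
    return out

market_returns = [0.01, 0.02, -0.01, -0.02, 0.01, 0.01]
simple_rules = {"always_long": [1] * 6, "always_short": [-1] * 6}
print(learning_rule_positions(simple_rules, market_returns, memory_span=2, review_span=2))
# [0, 0, 1, 1, -1, -1]
```

In the example, the rule follows "always_long" after the first two up days and switches to "always_short" at the next review, after two down days.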

**Reversed rules:**

Additionally, from each generated rule, its “reversed” version was created. That is, if the original rule’s output was a long position, its reversed version would output a short position and vice versa. Number of rules:

- Total: 29566
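As a sanity check, the per-class totals listed in this appendix add up to the number of base rules, and doubling it for the reversed versions gives the 59132 rules tested per symbol:

```python
# Per-class totals as listed in Appendix 1.
class_totals = {
    "Filter": 3325,
    "Support & Resistance": 100,
    "Moving Averages": 1809,
    "Channel Break-Outs": 2550,
    "On-Balance Volume Averages": 1800,
    "Momentum Strategies in Price/Volume": 2600,
    "Candlestick patterns": 42,
    "Combined": 5100,
    "Complex (learning)": 12240,
}

base_rules = sum(class_totals.values())
all_rules = 2 * base_rules   # each rule also has a reversed version

print(base_rules, all_rules)  # 29566 59132
```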

## Appendix 2 – White’s Reality Check

At a high level, the idea of White’s Reality Check (WRC) is to create a sampling distribution of the best-performing rule's returns. Random sampling with replacement is used for that. Then, one checks the probability that the best rule chosen during data mining comes from the newly created distribution. If that probability is high, the rule has no predictive power. Otherwise, the rule has predictive power.

To create the WRC sampling distribution, the following procedure was used:

- Center the daily returns of each rule (subtract the rule's average daily return)
- Repeat k times, where k is the number of samples:
  - Randomly (with replacement) choose x days from the backtest. “x” should be equal to the total number of backtest days
  - For each rule that took part in data mining, calculate the average return achieved during the randomly selected days
  - Choose the maximum value among the average daily returns. This will be one data point of the distribution
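The procedure above can be sketched in plain Python as follows. This is a simplified illustration, not the study's implementation: the function name, the default number of samples, and the random input data are assumptions, and days are resampled independently, exactly as the steps describe.

```python
import random

def wrc_p_value(rule_returns, n_samples=1000, seed=0):
    """rule_returns: one list of daily returns per rule, all of equal length."""
    rng = random.Random(seed)
    n_days = len(rule_returns[0])
    means = [sum(r) / n_days for r in rule_returns]
    observed_best = max(means)
    # Step 1: center each rule's daily returns on zero.
    centered = [[x - m for x in r] for r, m in zip(rule_returns, means)]
    distribution = []
    for _ in range(n_samples):
        # Step 2: draw n_days days with replacement, shared by all rules.
        days = [rng.randrange(n_days) for _ in range(n_days)]
        # Steps 3-4: average return per rule on those days, keep the maximum.
        best = max(sum(r[d] for d in days) / n_days for r in centered)
        distribution.append(best)
    p = sum(1 for b in distribution if b >= observed_best) / n_samples
    return p, distribution

# Hypothetical input: 20 rules with purely random daily returns over 250 days.
data_rng = random.Random(1)
noise_rules = [[data_rng.gauss(0, 0.01) for _ in range(250)] for _ in range(20)]
p, dist = wrc_p_value(noise_rules, n_samples=200)
print(p)
```

Because all rules share the same resampled days, the dependence between rules is preserved in every bootstrap sample.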

## Appendix 3 – Monte Carlo (MC) Simulation

This method is also based on a sampling distribution. This sampling distribution represents the expected return of a useless (random) rule. At a high level, it is built by randomly assigning rule outputs to market returns. The MC null hypothesis is simply that all tested rules have output values that are only randomly correlated with future market behaviour.

Method:

- Obtain the daily rule output states (-1,1 or -1,0,1)
- Pair the rule outputs, without replacement, with market returns from random days. The pairing should be the same across all rules: if the output from day 1 is paired with the market return from day 15, i.e. (o1, m15), this pairing should be used for every rule
- Determine the mean rate of return for each rule (average daily returns)
- Select the highest mean return as an entry for the sampling distribution
- Repeat the pairing, averaging and selection steps to build up the sampling distribution
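The Monte Carlo steps can be sketched as below. Again, this is an illustration under assumptions rather than the study's code: the function name, the number of samples, and the toy input data are hypothetical.

```python
import random

def mc_p_value(rule_positions, market_returns, n_samples=1000, seed=0):
    """rule_positions: one list of daily output states (-1, 0, 1) per rule."""
    rng = random.Random(seed)
    n_days = len(market_returns)

    def mean_return(positions, returns):
        return sum(p * r for p, r in zip(positions, returns)) / n_days

    observed_best = max(mean_return(pos, market_returns) for pos in rule_positions)
    distribution = []
    for _ in range(n_samples):
        shuffled = list(market_returns)
        rng.shuffle(shuffled)   # one random pairing, shared by all rules
        distribution.append(max(mean_return(pos, shuffled) for pos in rule_positions))
    p = sum(1 for b in distribution if b >= observed_best) / n_samples
    return p, observed_best, distribution

# Toy data: alternating up/down days and a rule that always gets the sign right.
market_returns = [((-1) ** i) * (i + 1) / 1000 for i in range(30)]
perfect_rule = [1 if r > 0 else -1 for r in market_returns]
always_long = [1] * 30
p, observed, dist = mc_p_value([perfect_rule, always_long], market_returns, n_samples=200)
print(p, observed)
```

Shuffling the market returns once per sample and reusing that order for every rule keeps the pairing constant across rules, as required by the second step.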