Verifying usefulness of popular candlestick patterns


Candlestick chart patterns are a well-known technical analysis tool. They are supposed to give insights into the "emotions of traders" by visually representing the size of price moves in different colors. Those emotions are believed to heavily influence market prices. Hence, by investigating different regularly occurring candle patterns, one should be able to forecast the short-term direction of the price.

There are tons of books and blog posts about the usage of those patterns, so I will not go into extreme detail here. What I prefer to do instead is to experimentally test whether some of the most popular candlestick patterns have any predictive power at all.

This work presents how one can verify knowledge with data. Even if “knowledge” intuitively makes sense or has a great story (or experts) behind it, one should not trust it unless it has passed scrupulous testing. If the testing has not been performed by others, one should do it oneself.

What are candlestick patterns and what do they do in theory?

At a high level, in trading there exists a branch of analysis methodology called technical analysis. It aims to forecast the direction of prices through the study of past market data. In practice, instead of time-series analysis, the most popular technical analysis methods require subjective visual inspection of price charts. Analysts look at a chart for known shapes called “price formations”. Those formations are supposed to predict the next price movement and hence help to make an appropriate trading decision. Candlestick patterns are one of many such formations.

In theory, candlestick patterns should fill the gap between rational investing, based on market value analysis and other fundamentals, and actual price changes. The high-level reasoning behind this is as follows: market participants are only humans, who often act on hunches and impulses. It does not matter if the stock is over- or undervalued. In the short term, the price will move mainly due to traders' expectations, fears, greed, etc.

Putting aside all the problems related to that oversimplified premise (e.g. in a world where algorithmic trading is normal, not all the participants are humans), how are those market sentiments revealed by candlestick patterns? A single candle is built from four prices that occurred during a trading session: the opening, closing, maximum and minimum price. Different distances and arrangements between those prices should describe different market scenarios.
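To make the four-price structure concrete, here is a minimal Python sketch of a single candle. The class name `Candle` and the derived properties (`body`, `upper_shadow`, `lower_shadow`, `is_bullish`) are illustrative names chosen here, not something defined in this work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candle:
    """One trading session summarized by its four prices."""
    open: float
    high: float
    low: float
    close: float

    @property
    def body(self) -> float:
        # Size of the "real body": distance between open and close.
        return abs(self.close - self.open)

    @property
    def upper_shadow(self) -> float:
        # Distance from the top of the body to the session high.
        return self.high - max(self.open, self.close)

    @property
    def lower_shadow(self) -> float:
        # Distance from the bottom of the body to the session low.
        return min(self.open, self.close) - self.low

    @property
    def is_bullish(self) -> bool:
        # True when the session closed above its open.
        return self.close > self.open


# A session that opens near the high, sells off, then recovers.
candle = Candle(open=100.0, high=101.0, low=94.0, close=99.5)
print(candle.body, candle.upper_shadow, candle.lower_shadow, candle.is_bullish)
```

The "distances between prices" that the patterns rely on are exactly these derived quantities: the body and the two shadows.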

The story may go like this:
"In a scenario where the market is full of bullish energy, the "hanging-man" pattern can appear. On the hanging-man day, the market opens at or near the highs, then sharply sells off, and then rallies to close at or near the open. If the market opens lower the next day, those who bought on the open or close of the hanging-man day are now left "hanging" with a losing position. The fear of staying at losing position may push traders to rapidly close their positions. That may subsequently change the overall sentiment to rather bearish."

Reality or just a fancy story?

One can see that the description from the section above uses a lot of fuzzy terms, but it also tries to sound smart and uses some logical reasoning. Such a combination is usually at least a yellow flag. Let's break it down and think about it a little bit more.

"In a scenario where the market is full of bullish energy (...)"

That usually refers to a situation where a given stock price has been increasing for some time. It is not stated how long, how much or how steadily the price has to increase. We are starting with quite vague market conditions.

"On the hanging-man day, the market opens at or near the highs, then sharply sells off, and then rallies to close at or near the open. If the market opens lower the next day, those who bought on the open or close of the hanging-man day are now left "hanging" with a losing position"

This part explains how the candle is created. To be more precise, it describes the "hanging-man" formation (fun fact - price formations usually have super fancy names). The market open price should be close to the highest one, meaning that for most of the session the price was lower than at the beginning. After the session opens, the price starts to decrease rapidly and later slowly goes up again. At the end of the session, the price is somewhere close to the starting point. Figures 1 and 2 show what such a candle would look like, as well as a time series of the price level during the whole session.

Fig.1 - Hanging Man formation
Fig.2 - Possible price behaviour during the session

This is the part where "emotions" and "market participants' behavior" are present. According to theory, the described price movement emerges as a result of the interplay between buyers and sellers. Both groups have certain expectations of the price, which then influence their actions. One can think here about the sellers or arbitrageurs who believe that the uptrend will no longer persist. They start to close positions or go short at the beginning of the session. If there are enough of them, the price rapidly goes down. But then there is the other group, the buyers/bulls, who still believe in the trend. As they happily buy, the price moves up again and ends the session near the starting point. The next day's session starts, and it opens at a lower price. That means that sentiment has changed a little bit: more and more people believe that the uptrend is finishing. From now on, there is "fear of staying at losing position" which "pushes traders to rapidly close their positions". And voilà... the trend of the price changes.

It is a great story that is supposed to explain what happened and hence justify the predictive power of the candle formation. It is great because it makes you feel that there is a logical explanation. That A is followed by B, which obviously leads to C. It makes sense, doesn't it? Or... does it?

Let's separate facts from suppositions. The fact is that the price made the move described above. That is certain. But the underlying cause of the move is just a guess. For example, we have no evidence that more and more traders believed that the price trend was about to change. Yes, this may "match" what can be observed in the price, but it is not the only explanation. Actually, one can find multiple reasonable explanations of why the price went down and up. For instance, a couple of big players had to liquidate their positions. It is not that they believe the price will not go up anymore. Maybe they just see a great opportunity somewhere else and the current position is not good enough?

When is it good to believe in a story?

So here come the questions: if there are multiple explanations, then which one is correct? Is there any? Why does it even matter? And the answer is: it depends on one's perspective. If one tries to explain market dynamics and describe this complex system, the story and its validity matter a lot. But if the only thing one cares about is forecasting the future price level, the story is not that important anymore. It is its predictive power that matters the most. The story itself is just a justification. It makes one feel good about understanding what happened and why.

Putting it in different words - even the most reasonable and logical post-mortem explanation of the event is useless unless it allows predicting the future outcome in a similar situation. Or the other way around - if something "works well" (one can make valid predictions based on it), even the most ridiculous story behind it may be satisfactory.

The problem arises when someone believes in a story without a systematic assessment of its predictions. Unfortunately, this is often the case. A reasonable story backed up with cherry-picked examples may be so convincing that one starts to believe in it without a doubt. The situation is even worse when the story/theory is flexible enough to accommodate its future failures.

Let's get back to the "hanging-man" candle example. Imagine a scenario where one uses this candlestick pattern as a trade entry signal. One day the trader finds a "hanging-man" in the price time series, believes the trend will reverse and enters an appropriate position. Later on, the trend continues unchanged. The prediction was clearly wrong, but the flexible nature of candlestick patterns allows for a hack here. One can blame the analyst who identified the formation instead of the formation itself. As there are no strict rules defining the candles, one can look at the data post-mortem and say that, in fact, the formation was never there. Maybe "it closed too far away from the open", "the price drop was not big enough" or "the market did not have enough bullish energy". That would explain why the prediction was not wrong and justify further belief in the predictive power of price formations.

Rigorous testing

The above-mentioned "post-mortem" justification of a prediction's failure is only one of the many traps one can fall into. Human minds have a lot of biases that allow believing in great stories even when contradictory evidence exists. There are tons of great books about that. Fortunately, there are also tons of other (statistical) books which give us tools to rigorously test different stories (hypotheses). Those tests help us to reject theories that sound promising but, in reality, bring no predictive power.

In the remaining part of this article, I will use statistical tests to assess the predictive power of some of the most popular candlestick patterns. From now on, I will focus purely on whether the pattern does what it promises. I will not go into the reasoning of "why" or "how" it does it. My goal is to assess whether candlestick patterns could be used in any meaningful way in trading. For that, those patterns should work at least better than random data points in predicting the next direction or magnitude of the price move.

It is worth emphasizing that the tests refer to formations defined exactly as they are in this work. They do not test candlestick formations in general, as those are just concepts. To remove any subjectivity from a formation, there have to be clearly defined rules of what the formation looks like, so that the tests can be recreated. The rules should not be ambiguous. If the rules according to which a formation is defined are programmable, their objectivity is assured.

Hanging man and Hammer

Those two single-candle formations are pretty similar. The only difference between them is that one requires a preceding uptrend and the other a preceding downtrend. Below are examples of those two in real-world price time series.

Fig.3 - Hanging Man
Fig.4 - Hammer

Both formations predict that the previous direction of the price will change. If there was an uptrend, then a "Hanging Man" can appear. It means that the trend will reverse (it will either flatten to a roughly horizontal line or change to a downtrend). In the case of a downtrend, a "Hammer" can appear. It also suggests a change in the direction of the price (to horizontal or to an uptrend).
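As an illustration of what "programmable rules" for these two formations might look like, here is a hedged Python sketch. The exact rules used in this work are not spelled out in the text, so the function names and both thresholds (`shadow_to_body`, `max_upper_share`) are assumptions picked only to express "open and close near the high, long lower shadow".

```python
def has_hammer_shape(open_, high, low, close,
                     shadow_to_body=2.0, max_upper_share=0.1):
    """Shape shared by Hammer and Hanging Man (thresholds are assumptions)."""
    candle_range = high - low
    if candle_range <= 0:
        return False
    body = abs(close - open_)
    upper_shadow = high - max(open_, close)
    lower_shadow = min(open_, close) - low
    # Long lower shadow relative to the body, almost no upper shadow.
    return (lower_shadow >= shadow_to_body * body
            and upper_shadow <= max_upper_share * candle_range)


def classify(open_, high, low, close, preceding_trend):
    """Name the formation from the candle shape and the preceding trend."""
    if not has_hammer_shape(open_, high, low, close):
        return None
    if preceding_trend == "uptrend":
        return "hanging_man"
    if preceding_trend == "downtrend":
        return "hammer"
    return None


print(classify(100.0, 100.2, 95.0, 99.5, "uptrend"))
print(classify(100.0, 100.2, 95.0, 99.5, "downtrend"))
```

The same candle shape gets a different name depending only on the trend that precedes it, which is why an objective trend definition is needed as well.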

Two hypotheses will be examined:

  • A Hanging Man/Hammer formation has the same probability as a random candle in predicting the future move of the price.
  • The magnitude of the desired price move occurring after Hanging Man/Hammer formation is of similar order as the magnitude of the desired price move occurring after a random candle.

The first hypothesis answers the question of whether the studied formations predict price direction more often than any other candle. If one wants to use a formation as an entry signal for a trade, ideally that would be the case. If there is no statistical evidence to reject the first hypothesis, it means that Hanging Man/Hammer is as good at predicting the future price move as a random candle. In practice: why should one rely on a particular candle formation to know the future price direction if using a random data point is just as useful?

The second hypothesis looks at the magnitude of the price change. The reason for this is as follows: Let's say that the particular price formation is no better than a random candle in guessing the future price direction. OK... but, it may still be profitable to use a formation if its subsequent price change is statistically bigger than the price change after the random candle.

Imagine two identical decks of cards, A and B. Both decks contain one Joker. You can draw a card from only one of the decks. If you happen to draw a Joker, you win a prize. The Joker from deck A is worth $100 and the one from deck B is worth $1000. As both decks are the same and fairly shuffled, the probability of drawing a Joker is equal. From which deck would you draw a card? This imaginary example would be a good metaphor for reality if the price change after the candlestick formation is statistically bigger than the one following a random candle.
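The arithmetic behind the deck metaphor is just an expected value. A minimal sketch, assuming each deck has 53 cards (52 plus one Joker); the deck size is an assumption made here and does not change the conclusion.

```python
from fractions import Fraction


def expected_payoff(win_probability, prize):
    """Average payoff of a single draw."""
    return win_probability * prize


# Assumption: 52 regular cards plus one Joker in each deck.
p_joker = Fraction(1, 53)

payoff_a = expected_payoff(p_joker, 100)
payoff_b = expected_payoff(p_joker, 1000)
print(f"{float(payoff_a):.2f}")  # 1.89
print(f"{float(payoff_b):.2f}")  # 18.87
```

With the same winning probability, the expected payoff scales directly with the prize, which is why a larger move after a formation could matter even if the hit rate is no better than random.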

Implementation notes:
By the price move, I mean "trend". For example, after a Hanging Man one should observe a downtrend. Multiple time intervals will be tested: did the trend change 1, 2, 5, 7, 14, 28 days after the formation appeared?

There are multiple ways to define a price trend. In the classical "technical analysis" approach, one draws a line that touches subsequent peaks or valleys. I find this approach too subjective and noisy though. A single extreme price move can make the trend look opposite to what is intuitively observed. Also, as technical analysis is less technical than the name suggests, different traders would probably draw the line a little bit differently. They may start from a different point and put emphasis on different parts of the chart so that it matches their "expertise" (gut feeling?). Fig.5 shows different ways a trend line can be drawn using the "classical" approach.

Fig.5 - Multiple possible trend lines

To make the "trend" concept objective and testable, a different approach is used in this work. At every candle, a straight line is fitted (simple linear regression) to the closing prices of the previous 14 days. The slope of the line is then checked. Depending on the value of the slope, the trend is labeled as “uptrend”, “downtrend” or “horizontal”.
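A minimal sketch of such a slope-based trend label, in plain Python. The text does not give the slope cut-offs, so `flat_threshold` and the choice to divide the slope by the mean price (to make it comparable across stocks) are assumptions.

```python
def ols_slope(values):
    """Least-squares slope of values regressed on 0, 1, 2, ..."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    cov = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    var = sum((i - mean_x) ** 2 for i in range(n))
    return cov / var


def classify_trend(closes, flat_threshold=0.001):
    """Label a window of closing prices; the threshold is an assumption."""
    # Express the slope as a fraction of the average price per day.
    relative_slope = ols_slope(closes) / (sum(closes) / len(closes))
    if relative_slope > flat_threshold:
        return "uptrend"
    if relative_slope < -flat_threshold:
        return "downtrend"
    return "horizontal"


rising = [100 + 0.8 * day for day in range(14)]
print(classify_trend(rising))
print(classify_trend(rising[::-1]))
print(classify_trend([100.0] * 14))
```

Because the rule is fully mechanical, two people applying it to the same 14 closing prices always get the same label.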

When looking at a price change, I look at the percentage change. Absolute values are not comparable, as different stock prices are of different magnitudes. A price change from $2 to $4 means more than one from $102 to $104.
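The two examples from the paragraph above, worked out in a short Python sketch (the helper name `pct_change` is illustrative):

```python
def pct_change(start_price, end_price):
    """Relative price change as a fraction of the starting price."""
    return (end_price - start_price) / start_price


print(f"{pct_change(2, 4):.1%}")      # 100.0%
print(f"{pct_change(102, 104):.1%}")  # 2.0%
```

Both moves are $2 in absolute terms, but the first doubles the price while the second changes it by about 2%.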

Creating probability distributions
At a high level, to accept or reject a hypothesis one needs to know how something usually behaves. This is captured by the probability distribution of a random variable. When the probability distribution is available, one can compare the actual (observed) test results with the ones expected from the distribution. If the values obtained in the test have a very low probability of occurrence (according to the distribution), it is strong evidence to reject the hypothesis. In other words, if according to some assumptions it is very unlikely that something will happen, and then it surprisingly happens, there is a high chance that the assumptions were wrong.

Let's look at the first hypothesis:
"A Hanging Man/Hammer formation has the same probability as a random candle in predicting the future move of the price."

To start, we need to know the probability distribution of random candles' prediction accuracy. That is, if one chooses some random candles and checks how well they predict the future price move, what is the expected ratio of correct predictions?

Before creating the distribution from real data, let's make some educated guesses. At any given time there are only three possible outcomes. The trend of the price may stay the same, go horizontal or change to the opposite direction. We want to check how well a random candle can predict a change in the trend to either the horizontal or the opposite direction. This condition is satisfied by two out of the three possible outcomes. It means that, assuming an equal probability of each outcome, a random candle will "predict" the price movement correctly 2 out of 3 times (~67%). Now, that is only a guess. In reality, some of the assumptions made may not be true. For example, maybe when there is an uptrend, the probability of the price staying in the uptrend is relatively higher?

Several distributions had to be created to properly test all the hypotheses in this work. The reason is that different candles were randomly picked depending on the formation and the length of the prediction. For example, to create a distribution for the test of the Hanging Man formation's predictive power after 7 days, the following steps were executed:

  1. Randomly choose a stock symbol.
  2. For the chosen symbol, randomly pick a data point (candle).
  3. If the 14-day trend preceding this candle was an uptrend, continue the procedure. If there was a downtrend, abandon the rest of the procedure and start again from step 1.
  4. Check the trend of the price in the period between the candle and the next 7 sessions.
  5. Record if the later trend matches the prediction or not.
  6. Repeat steps 1-5 until the sample is big enough. Without going into details, the sample size should be the same as the number of detected Hanging Man formations. For example, if one found 100 Hanging Man formations in the tested price time series, this sample should also contain 100 candles.
  7. After the sample is big enough, calculate the ratio of correct predictions. For example, if 72 out of 100 candles had a correct prediction it will be 0.72. This number will serve as a data point for the probability distribution.
  8. Repeat previous steps many times so that probability distribution has enough data points. In general, the more the better. In this work 1k data points were used for each probability distribution.
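The steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the code used for the results in this work: the prices are synthetic random walks, the symbol names are made up, the `flat` slope threshold is an assumption, and candles preceded by a horizontal trend are rejected together with downtrends.

```python
import random

LOOKBACK = 14  # days used to establish the preceding trend (from the text)


def slope(values):
    """Least-squares slope of values regressed on 0, 1, 2, ..."""
    n = len(values)
    mean_x, mean_y = (n - 1) / 2, sum(values) / n
    cov = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    return cov / sum((i - mean_x) ** 2 for i in range(n))


def trend(values, flat=0.001):
    """Label a window; the `flat` threshold is an assumption."""
    s = slope(values) / (sum(values) / len(values))
    return "up" if s > flat else "down" if s < -flat else "flat"


def random_candle_hit(prices_by_symbol, horizon, rng):
    """Steps 1-5: True/False for a usable candle, None if it is rejected."""
    closes = prices_by_symbol[rng.choice(sorted(prices_by_symbol))]  # step 1
    i = rng.randrange(LOOKBACK, len(closes) - horizon)               # step 2
    if trend(closes[i - LOOKBACK:i]) != "up":                        # step 3
        return None
    later = trend(closes[i:i + horizon + 1])                         # step 4
    return later != "up"                                             # step 5


def accuracy_distribution(prices_by_symbol, horizon, sample_size,
                          n_points, seed=0):
    """Steps 6-8: one accuracy ratio per repetition."""
    rng = random.Random(seed)
    points = []
    for _ in range(n_points):                                        # step 8
        hits = []
        while len(hits) < sample_size:                               # step 6
            hit = random_candle_hit(prices_by_symbol, horizon, rng)
            if hit is not None:
                hits.append(hit)
        points.append(sum(hits) / sample_size)                       # step 7
    return points


def random_walk(n, rng, start=100.0, vol=0.01):
    """Synthetic closing prices, used only to make the sketch runnable."""
    prices = [start]
    for _ in range(n - 1):
        prices.append(prices[-1] * (1 + rng.gauss(0, vol)))
    return prices


data_rng = random.Random(42)
prices = {f"SYM{k}": random_walk(500, data_rng) for k in range(5)}
dist = accuracy_distribution(prices, horizon=7, sample_size=100, n_points=200)
print(f"mean accuracy of random candles: {sum(dist) / len(dist):.3f}")
```

The sketch uses 200 data points to keep the run short; the text uses 1k per distribution. Replacing the synthetic `prices` with real closing prices per symbol is the only change needed to apply it to actual data.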

A similar procedure will be executed to accept/reject the second hypothesis (the one about the magnitude of change). A couple of things would have to be changed though:

  • Calculate the average percentage of price change instead of prediction accuracy.
  • Use the price changes only from those cases where the prediction was aligned with the actual outcome.

Such a procedure gives insight into how well random candles predict the trend change. Now one can compare it with the results achieved by Hanging Man formations. Let's assume that the actual test results show that out of 100 Hanging Man formations, 76 correctly predicted a change in the trend. At first glance, it looks promising: the majority of formations were right. But is it really a good result? Let's say that, at the same time, the generated probability distribution shows that on average (from 1k trials) random candles had 73% of correct predictions, with a standard deviation of 5%.

In this context, 76% of correct predictions made by Hanging Man formations does not look impressive anymore. It is not unusual for random candles to do even better (e.g. 78%). In such a scenario, there would be no evidence to believe that the performance of Hanging Man formations differs from that of random candles. The hypothesis would not be rejected.

There is also a more formal way of assessing whether the observed result is significantly different from what the random-candle distribution would produce. To do this one can use the so-called p-value. It is the probability, assuming the hypothesis is true, of obtaining a result at least as extreme as the observed one. The smaller the number, the less likely it is that the observed value came from the random-candle distribution. In practice, a sufficiently small p-value allows rejecting the hypothesis.
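One common way to get such a p-value from a simulated distribution is to count the share of simulated points that are at least as extreme as the observed value. The sketch below assumes a one-sided test in the direction of the desired move; the function name and the toy numbers are illustrative and are not taken from the results in this work.

```python
def empirical_p_value(observed, null_points, higher_is_better=True):
    """Share of simulated points at least as extreme as the observed value."""
    if higher_is_better:
        extreme = sum(1 for p in null_points if p >= observed)
    else:
        extreme = sum(1 for p in null_points if p <= observed)
    return extreme / len(null_points)


# Toy accuracy distribution of random candles (10 points instead of 1k).
null_accuracy = [0.66, 0.69, 0.70, 0.72, 0.73, 0.73, 0.75, 0.77, 0.78, 0.80]
print(empirical_p_value(0.76, null_accuracy))  # 0.3

# For a bearish formation the desired move is negative, so lower is "better".
null_change = [-0.060, -0.052, -0.049, -0.047, -0.044]
print(empirical_p_value(-0.056, null_change, higher_is_better=False))  # 0.2
```

In the first toy case, 3 of the 10 random samples did at least as well as the observed 0.76, so the result would not be unusual enough to reject the hypothesis.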

Tables 1 and 2 show the results of testing the first hypothesis. Each row represents the results for a given time interval (i.e. 1, 2, 5, 7, 14, 28 days after a formation occurred). The "Trend results" column shows the ratio of correctly predicted trend changes for the formation. The next two columns show the mean ratio of correct predictions from the random-candle distribution and the standard deviation of that distribution. The last column is the p-value. The hypothesis is rejected if the p-value is smaller than 0.05.

Tab.1 Hammer results for the first hypothesis
Days  Trend results  Distribution mean  Std. Dev.  p-val
1     0.515          0.538              0.021      0.861
2     0.502          0.548              0.022      0.974
5     0.606          0.649              0.022      0.987
7     0.592          0.627              0.021      0.945
14    0.529          0.544              0.021      0.749
28    0.563          0.575              0.021      0.704

Tab.2 Hanging man results for the first hypothesis
Days  Trend results  Distribution mean  Std. Dev.  p-val
1     0.502          0.542              0.019      0.979
2     0.516          0.543              0.020      0.911
5     0.645          0.632              0.018      0.233
7     0.598          0.603              0.019      0.597
14    0.494          0.509              0.019      0.778
28    0.547          0.535              0.020      0.266

As can be seen, none of the time intervals has a p-value small enough, which means that the hypothesis cannot be rejected. There is no evidence that Hammer and Hanging Man formations are better at predicting a trend change than a random candle.

Tables 3 and 4 have a similar structure to the previous tables but contain the results of testing the second hypothesis. Surprisingly, one of the results has a p-value small enough to be considered statistically significant: the Hanging Man formation after 14 days has a p-value of 0.045. It means that the price decline 14 days after a Hanging Man formation tends to be bigger than after randomly chosen candles. This is something, but I would be very cautious about this result at this point. Even with statistical testing, if one tests multiple parameters (here, 6 different numbers of days after the formation appeared), there is a possibility of data snooping bias or overfitting. Nevertheless, this is the best result so far. None of the other p-values is small enough to be considered significant.
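To put a number on that caution, here is a small sketch of the standard multiple-testing arithmetic. It assumes the 6 tests are independent, which is a simplification here because the time windows overlap; the Bonferroni correction is a common textbook remedy and is not something applied in this work.

```python
def family_wise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the overall rate near alpha."""
    return alpha / n_tests


print(f"{family_wise_error_rate(0.05, 6):.1%}")  # 26.5%
print(f"{bonferroni_threshold(0.05, 6):.4f}")    # 0.0083
```

Under that assumption, running 6 tests at the 0.05 level gives roughly a one-in-four chance of at least one "significant" result by luck alone, and a p-value of 0.045 would not pass the corrected threshold.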

Tab.3 Hammer results for the second hypothesis
Days  Trend results  Distribution mean  Std. Dev.  p-val
1     0.015          0.017              0.001      0.954
2     0.024          0.025              0.002      0.635
5     0.029          0.030              0.003      0.606
7     0.035          0.037              0.003      0.675
14    0.061          0.058              0.004      0.241
28    0.080          0.076              0.006      0.270

Tab.4 Hanging man results for the second hypothesis
Days  Trend results  Distribution mean  Std. Dev.  p-val
1     -0.013         -0.014             0.001      0.974
2     -0.020         -0.020             0.001      0.673
5     -0.022         -0.024             0.002      0.713
7     -0.032         -0.029             0.002      0.113
14    -0.056         -0.049             0.004      0.045
28    -0.547         -0.535             0.020      0.266

To summarize, there is no evidence to reject the examined hypotheses, except for the 14-day Hanging Man. All the other tested cases have no predictive power or provide no additional value. If one sees any of those formations on a price chart followed by a big desired move, there is no need to get excited, as this is just a cherry-picked coincidence. On the other hand, in theory, the second hypothesis for the 14-day Hanging Man could be rejected. It means that its usage on charts could be justified. Personally, I would be careful though and conduct more work to check for overfitting.


Engulfing

Engulfing is another formation which is supposed to predict a change in the price trend. It is formed by two candles, where the latter “wraps” the former. The candles should have specific colors depending on the preceding trend. Below are examples of two Engulfing formations: one after an uptrend (“bearish”) and the other after a downtrend (“bullish”). Both examples are taken from real price time series.

Fig.6 - Bullish Engulfing
Fig.7 - Bearish Engulfing

The formation predicts that the price trend will change afterwards (either to horizontal or to the opposite direction).
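A hedged sketch of how the Engulfing rule could be written down. The text only says that the second candle "wraps" the first and that colors depend on the preceding trend; treating "wraps" as the second real body covering the first real body is the usual textbook reading and is an assumption here, as are the function and label names.

```python
def engulfing(prev, curr, preceding_trend):
    """prev and curr are (open, close) pairs of two consecutive candles."""
    prev_open, prev_close = prev
    curr_open, curr_close = curr
    prev_top, prev_bottom = max(prev), min(prev)
    curr_top, curr_bottom = max(curr), min(curr)

    # Assumption: the second real body must fully cover the first one
    # and be strictly larger.
    wraps = (curr_top >= prev_top and curr_bottom <= prev_bottom
             and curr_top - curr_bottom > prev_top - prev_bottom)
    if not wraps:
        return None

    prev_falling = prev_close < prev_open
    curr_rising = curr_close > curr_open
    if preceding_trend == "downtrend" and prev_falling and curr_rising:
        return "bullish"
    if (preceding_trend == "uptrend" and prev_close > prev_open
            and curr_close < curr_open):
        return "bearish"
    return None


print(engulfing((100.0, 98.0), (97.5, 101.0), "downtrend"))
print(engulfing((98.0, 100.0), (100.5, 97.0), "uptrend"))
```

Written this way, the rule leaves no room for arguing after the fact that a formation "was not really there".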

Two similar hypotheses to the ones from Hanging Man/Hammer will be tested:

  • An Engulfing formation has the same probability as a random candle in predicting the future move of the price.
  • The magnitude of the desired price move occurring after an Engulfing formation is of similar order as the magnitude of the desired price move occurring after a random candle.

The testing procedure is similar to the one in the Hanging Man/Hammer section. The “bearish” and “bullish” engulfing formations are tested separately.

Tables 5 and 6 show the results for the first hypothesis about engulfing formations. One can see that none of the p-values is even close to a value allowing rejection. Hence, there is no evidence that an engulfing formation is better than a random candle at predicting future price movement.

Tab.5 Bullish Engulfing results for the first hypothesis
Days  Trend results  Distribution mean  Std. Dev.  p-val
1     0.502          0.539              0.015      0.994
2     0.495          0.548              0.015      1.000
5     0.630          0.648              0.015      0.891
7     0.612          0.629              0.015      0.758
14    0.533          0.543              0.015      0.758
28    0.578          0.576              0.015      0.420

Tab.6 Bearish Engulfing results for the first hypothesis
Days  Trend results  Distribution mean  Std. Dev.  p-val
1     0.501          0.542              0.014      0.999
2     0.514          0.544              0.014      0.978
5     0.618          0.630              0.014      0.801
7     0.585          0.602              0.014      0.890
14    0.495          0.509              0.014      0.815
28    0.526          0.535              0.014      0.720

On the other hand, the results of the tests against the second hypothesis are more interesting. Those are presented in Tab.7 and Tab.8. One can see that both formations have some p-values small enough to reject the hypothesis. The bullish engulfing formation recorded sufficiently small p-values for the 1-day and 2-day periods. The bearish engulfing results are significant for 1, 2, 5 and 28 days (the value for 7 days was on the edge). The fact that significant results are achieved for multiple time windows makes those results more convincing.

Tab.7 Bullish Engulfing results for the second hypothesis
Days  Trend results  Distribution mean  Std. Dev.  p-val
1     0.019          0.017              0.001      0.043
2     0.028          0.025              0.001      0.043
5     0.029          0.030              0.002      0.082
7     0.036          0.037              0.002      0.619
14    0.055          0.058              0.003      0.855
28    0.072          0.078              0.004      0.906

Tab.8 Bearish Engulfing results for the second hypothesis
Days  Trend results  Distribution mean  Std. Dev.  p-val
1     -0.016         -0.014             0.001      0.004
2     -0.024         -0.020             0.001      0.001
5     -0.027         -0.024             0.001      0.020
7     -0.032         -0.030             0.002      0.055
14    -0.052         -0.049             0.003      0.125
28    -0.075         -0.067             0.004      0.019

To summarize – for the above-mentioned periods the second hypothesis can be rejected. It means that if the desired price movement indeed happened, the magnitude of the price move is significantly bigger after an engulfing formation than after a random candle. The usage of the engulfing formation may be justified in this context.


Morning Star and Evening Star

There are many types of star formations. The ones considered in this work are “Morning Star” (bullish) and “Evening Star” (bearish). Both formations consist of three candles and are formed as follows:

  • 1st candle should have a “big” real body (black or white)
  • 2nd candle should have a “small” body and its colour does not matter
  • 3rd candle should cover at least half of the first candle and it should have the opposite colour

As “big” and “small” are pretty vague terms, big candles were implemented as those with a real body bigger than the average real body in the given time series and lookback period. The opposite holds for the “small” candles. The figures below show real-world examples of both formations.

Fig.8 - Morning Star
Fig.9 - Evening Star

The formation predicts that the price trend will change afterwards (either to horizontal or to the opposite direction).
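The three-candle rules for star formations can be sketched in Python as follows. Two points are assumptions made for this illustration: "cover at least half of the first candle" is read as the third close reaching the midpoint of the first real body, and the lookback history passed to `average_body` is arbitrary toy data.

```python
def average_body(candles):
    """Mean real-body size over a lookback window of (open, close) pairs."""
    return sum(abs(close - open_) for open_, close in candles) / len(candles)


def star(candles, avg_body):
    """candles: three consecutive (open, close) pairs."""
    (o1, c1), (o2, c2), (o3, c3) = candles
    big_first = abs(c1 - o1) > avg_body
    small_second = abs(c2 - o2) < avg_body
    if not (big_first and small_second):
        return None

    # Assumption: "covers at least half" means the third candle closes
    # at or beyond the midpoint of the first real body.
    midpoint = (o1 + c1) / 2
    if c1 < o1 and c3 > o3 and c3 >= midpoint:
        return "morning_star"
    if c1 > o1 and c3 < o3 and c3 <= midpoint:
        return "evening_star"
    return None


history = [(100.0, 101.0), (101.0, 100.0), (100.0, 101.5), (101.5, 100.5)]
avg = average_body(history)
print(star([(105.0, 100.0), (99.5, 99.8), (100.0, 103.0)], avg))
print(star([(100.0, 105.0), (105.5, 105.2), (105.0, 102.0)], avg))
```

The length of the lookback window used for the average body is a free parameter, so any reported result is tied to the specific value chosen.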

Again, similar hypotheses to the previous ones will be tested:

  • A Star formation has the same probability as a random candle in predicting the future move of the price.
  • The magnitude of the desired price move occurring after a Star formation is of similar order as the magnitude of the desired price move occurring after a random candle.

The testing procedure is similar to the one in the Hanging Man/Hammer section.

Tab.9 and Tab.10 show the results of the first hypothesis tests. As in the previous cases, none of the candle formations is able to predict future price movement better than a random candle.

Tab.9 Morning Star results for the first hypothesis
Days  Trend results  Distribution mean  Std. Dev.  p-val
1     0.540          0.539              0.016      0.459
2     0.558          0.546              0.016      0.234
5     0.616          0.649              0.014      0.990
7     0.578          0.627              0.015      1.000
14    0.523          0.543              0.015      0.909
28    0.566          0.576              0.016      0.708

Tab.10 Evening Star results for the first hypothesis
Days  Trend results  Distribution mean  Std. Dev.  p-val
1     0.535          0.541              0.024      0.574
2     0.521          0.545              0.025      0.818
5     0.666          0.623              0.024      0.066
7     0.604          0.602              0.024      0.457
14    0.499          0.509              0.026      0.645
28    0.535          0.536              0.024      0.495

The results of the tests against the second hypothesis are presented in Tab.11 and Tab.12. Again, those are more interesting. Tests for both Star formations show significant p-values. For Morning Star there are significant results for 1 day and 14 days, while for Evening Star for 1 day only. Hypothesis no.2 can be rejected for those periods, which may make the usage of the formations justifiable.

Tab.11 Morning Star results for the second hypothesis
Days  Trend results  Distribution mean  Std. Dev.  p-val
1     0.019          0.017              0.001      0.029
2     0.025          0.025              0.001      0.591
5     0.030          0.030              0.002      0.497
7     0.038          0.037              0.002      0.380
14    0.063          0.058              0.003      0.031
28    0.080          0.077              0.004      0.269

Tab.12 Evening Star results for the second hypothesis
Days  Trend results  Distribution mean  Std. Dev.  p-val
1     -0.017         -0.014             0.001      0.020
2     -0.023         -0.020             0.002      0.092
5     -0.022         -0.024             0.002      0.845
7     -0.034         -0.029             0.003      0.062
14    -0.055         -0.049             0.004      0.128
28    -0.068         -0.067             0.007      0.405


Summary

To summarize the performed tests: none of the candlestick formations was able to predict the direction of future price movements better than a random candle. Hence, those patterns are useless if one hopes to get a reliable prediction of the future price direction from them. The real-world examples of candlestick patterns correctly predicting the future price trend are cherry-picked and are consistent with what a random process would generate.

On the other hand, when the price indeed moves in the desired direction, the moves following the candlestick formations tend to be of a larger magnitude than the moves following a random candle. This could justify the usage of candle formations, as even if the chances of winning are only as good as random, the reward is bigger. It should be noted here, though, that the magnitude of the move in the undesired direction has not been tested. It is possible that along with a bigger reward comes a bigger penalty. To test if that is the case, it would be interesting to look at the overall variance of the price changes following the candlestick formations as well as the random candles.

As a final word, candlestick patterns are well-known tools for trading. Those methods are fully accepted by many technical analysts’ communities. There are books written on this topic. Yet one cannot take this “knowledge” for granted. As shown in this work, even though those methods are popular, they are useless when it comes to predicting the direction of market price moves. We are fully equipped with statistical methods which allow us to verify “knowledge”. One should not trust methods that have not passed scrupulous testing, even if this “knowledge” intuitively makes sense or has a great story behind it. Also, one should be especially cautious if the presented knowledge sounds convincing but is supported only by anecdotal evidence or cherry-picked examples.