<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Michael&#39;s blog</title>
  
  <subtitle>Life-long Learning</subtitle>
  <link href="http://mikelhsia.github.io/atom.xml" rel="self"/>
  
  <link href="http://mikelhsia.github.io/"/>
  <updated>2025-01-19T12:32:21.124Z</updated>
  <id>http://mikelhsia.github.io/</id>
  
  <author>
    <name>Michael Hsia</name>
    
  </author>
  
  <generator uri="https://hexo.io/">Hexo</generator>
  
  <entry>
    <title>【The Wheel Strategy】 Turn Options Into Monthly Income</title>
    <link href="http://mikelhsia.github.io/2025/01/16/2025-01-18-wheel-trading-strategy/"/>
    <id>http://mikelhsia.github.io/2025/01/16/2025-01-18-wheel-trading-strategy/</id>
    <published>2025-01-16T03:05:54.000Z</published>
    <updated>2025-01-19T12:32:21.124Z</updated>
    
    <content type="html"><![CDATA[<img data-src="/2025/01/16/2025-01-18-wheel-trading-strategy/cover.png" class="" width="600"><p style="text-align:center; color: grey;">  <i>Extracted from Website: <a href='https://steadyoptions.com/articles/the-options-wheel-strategy-wheel-trade-explained-r632/'>SteadyOptions</a></i></p><p>Stock options have always been a popular way to trade due to their high leverage, which allows traders to control a large amount of assets with a small amount of capital. However, the complexity of stock options can be overwhelming for many traders. For those who are new to stock options, today we’re going to introduce a simple trading strategy that can help you get started with options trading without much effort.</p><a id="more"></a><hr><h3 id="Previous-readings"><a href="#Previous-readings" class="headerlink" title="Previous readings"></a>Previous readings</h3><ul><li><a href="https://mikelhsia.github.io/2024/12/16/2024-12-14-save-memory-by-learning-math/">【How 2】Save Your Valuable Memory and Time by Knowing These Math Formulas</a></li><li><a href="https://mikelhsia.github.io/2024/11/11/2024-11-11-new-type-of-grid-trading-system/">Beyond Traditional Grid Trading - Introducing A New Type of Grid Trading System</a></li><li><a href="https://mikelhsia.github.io/2024/06/28/2024-06-28-why-fit-and-transform/">【ML algo trading】One Pitfall You Definitely Need to Avoid in Feature Engineering</a></li></ul><hr><h1 id="Introduction-to-the-Wheel-Trading-Strategy"><a href="#Introduction-to-the-Wheel-Trading-Strategy" class="headerlink" title="Introduction to the Wheel Trading Strategy"></a>Introduction to the Wheel Trading Strategy</h1><img data-src="/2025/01/16/2025-01-18-wheel-trading-strategy/cover.png" class="" width="600"><p style="text-align:center; color: grey;">  <i>Extracted from Website: <a href='https://steadyoptions.com/articles/the-options-wheel-strategy-wheel-trade-explained-r632/'>SteadyOptions</a></i></p><p>The wheel 
trading strategy is essentially selling either put or call options to collect the premium upfront in order to create a stable income stream. As illustrated in the picture above, this trading strategy consists of the following steps:</p><ol><li>Sell a <code>Cash-Secured Put</code> to collect the premium.</li><li>If the stock price drops below the strike price, you’ll get assigned the underlying stock shares.</li><li>Sell a <code>Call Option</code> to collect the premium. Together with the stock shares you received in the previous step, you are actually selling a <code>Covered Call</code>.</li><li>If the stock price rises above the strike price, the stock shares you own will get called away.</li><li>Go back to step 1 and repeat.</li></ol><p>The steps above form a cycle that investors can repeat over and over again to create a stable income stream. The wheel trading strategy is also known as the “Triple Income Strategy”, where:</p><ol><li>The premium collected from selling the put option is the $1^{st}$ income source.</li><li>If the stock price drops and we are assigned the stock shares, we can sell a <code>Covered Call</code> to create the $2^{nd}$ income source by collecting premium.</li><li>The $3^{rd}$ income source comes from the potential dividends paid by the underlying stock if it is assigned to us.</li></ol><h1 id="The-Wheel-Trading-Strategy-in-Real-World"><a href="#The-Wheel-Trading-Strategy-in-Real-World" class="headerlink" title="The Wheel Trading Strategy in Real-World"></a>The Wheel Trading Strategy in Real-World</h1><p>This trading strategy seems way too good to be true, doesn’t it? How could any trading strategy generate a stable income stream without any risk? 
However, there are a few drawbacks that deter many investing organizations and funds from adopting this strategy:</p><ol><li><strong>Capital Intensive</strong>: In order to execute the first step and sell the <code>Cash-Secured Put</code>, you need to have enough capital to cover the cost when the stock shares are assigned to you.</li><li><strong>Limited Upside</strong>: The maximum gain at a specific point in time would be the premium collected from selling options. Neither a strong bull nor bear market would do you any good.</li><li><strong>Potential Assignment</strong>: If the stock price falls significantly, you will be assigned the stock shares (and if it rises significantly, your shares will be called away), which can lead to further loss or missed gains if the stock price continues its previous momentum.</li></ol><p>Therefore, for those who are seeking a stable income stream instead of speculating on the market, the wheel trading strategy would be a better choice for you to collect dimes and dollars from the floor.</p><h1 id="More-Details-to-Manipulate-the-Strategy"><a href="#More-Details-to-Manipulate-the-Strategy" class="headerlink" title="More Details to Manipulate the Strategy"></a>More Details to Manipulate the Strategy</h1><p>With the above being said, the wheel trading strategy still demands special attention to a lot of details. By manipulating the details and the timing, we can make the strategy more robust and less risky. I benefited a lot from reading <a href="https://www.reddit.com/r/options/comments/a36k4j/the_wheel_aka_triple_income_strategy_explained/">this Reddit post</a>, so I suggest you read through it as well to gain your own insights. Here are a few ideas for strategy variations that I’ve extracted from the post:</p><ul><li><p>Collect 50% of premium before it expires<br>Instead of waiting until the option expires to keep the full premium, you could lock in 50% of the premium early by closing the current short put position. 
The initial premium you receive less the cost to close the current position would give you the net credit you will earn. If the net credit reaches 50% of the original premium received, then why don’t we close it before reaching the expiration date and create a new position to avoid further fluctuation of the stock price?</p></li><li><p>Roll over when the put option is ATM<br>When the stock price approaches or reaches the strike price (At The Money or ATM), the likelihood of option exercise and share assignment increases significantly. Since our primary income source comes from collecting premiums through put and call option sales rather than stock price appreciation, rolling over the position to a new option with a higher strike price becomes a strategic choice. This approach helps minimize assignment risk while creating opportunities to collect additional premium income.</p></li><li><p>Choose the right strike price when selling call option<br>When selling puts, stock assignment becomes inevitable if the share price drops substantially. However, by selecting high-quality blue-chip stocks with strong business fundamentals, any price decline is likely to be temporary, with recovery expected within a reasonable timeframe. Rather than selling covered calls at arbitrary strike prices, a more strategic approach would be to set the strike price equal to our cost basis (the price at which we were assigned the shares). 
This method prevents our shares from being called away at unfavorable prices and helps avoid realizing losses during adverse market movements.<br>And now, let’s try to implement the above ideas and see whether they actually work in terms of performance.</p></li></ul><h1 id="Performance-Analysis-of-Each-Variation"><a href="#Performance-Analysis-of-Each-Variation" class="headerlink" title="Performance Analysis of Each Variation"></a>Performance Analysis of Each Variation</h1><h2 id="Platform"><a href="#Platform" class="headerlink" title="Platform"></a>Platform</h2><p><a href="https://www.quantconnect.com">QuantConnect</a></p><h2 id="Benchmark"><a href="#Benchmark" class="headerlink" title="Benchmark"></a>Benchmark</h2><p>SPDR S&amp;P 500 ETF Trust (SPY)</p><h2 id="Underlying-Stock-of-the-Put-and-Call-Option"><a href="#Underlying-Stock-of-the-Put-and-Call-Option" class="headerlink" title="Underlying Stock of the Put and Call Option"></a>Underlying Stock of the Put and Call Option</h2><p>SPDR S&amp;P 500 ETF Trust (SPY)</p><h2 id="Backtest-Period"><a href="#Backtest-Period" class="headerlink" title="Backtest Period"></a>Backtest Period</h2><p>2020-01-01 to 2025-01-06</p><h2 id="Performance-Analysis"><a href="#Performance-Analysis" class="headerlink" title="Performance Analysis"></a>Performance Analysis</h2><h3 id="Plain-Vanilla-Wheel-Trading-Strategy"><a href="#Plain-Vanilla-Wheel-Trading-Strategy" class="headerlink" title="Plain Vanilla Wheel Trading Strategy"></a>Plain Vanilla Wheel Trading Strategy</h3><div class="table-container"><table><thead><tr><th>Total Trades</th><th>Net Profit</th><th>Annualized Return</th><th>Max Drawdown</th></tr></thead><tbody><tr><td>79</td><td>45.53%</td><td>7.729%</td><td>28.00%</td></tr></tbody></table></div><img data-src="/2025/01/16/2025-01-18-wheel-trading-strategy/plain_vanilla_wheel_perf.png" class="" width="800"><p style="text-align:center; color: grey;">  <i>Backtest result of the plain vanilla wheel trading strategy</i></p><img 
data-src="/2025/01/16/2025-01-18-wheel-trading-strategy/plain_vanilla_wheel_dist.png" class="" width="400"><p style="text-align:center; color: grey;">  <i>Distribution of each income stream</i></p><p>In the first diagram, the red rectangles represent the periods when the stock shares were assigned to us. Except for these periods and March 2020, when COVID hit the market, the strategy has been consistently profitable, regardless of bear or bull market trends. In the second chart, a histogram, you’ll also see that the majority of the income is from the put option premiums, and the call option premium somewhat offsets the loss on the stock assignment. This strategy gave us a ~7.7% annual return, which is slightly higher than the US 10-year Bond Yield.</p><h3 id="Take-profit-when-reaching-50-of-the-initial-premium"><a href="#Take-profit-when-reaching-50-of-the-initial-premium" class="headerlink" title="Take profit when reaching 50% of the initial premium"></a>Take profit when reaching 50% of the initial premium</h3><p>This variation of the wheel trading strategy aims to collect 50% of the initial premium before the option expires.</p><div class="table-container"><table><thead><tr><th>Total Trades</th><th>Net Profit</th><th>Annualized Return</th><th>Max Drawdown</th></tr></thead><tbody><tr><td>175</td><td>43.84%</td><td>7.475%</td><td>29.20%</td></tr></tbody></table></div><img data-src="/2025/01/16/2025-01-18-wheel-trading-strategy/take_profit_wheel_perf.png" class="" width="800"><p style="text-align:center; color: grey;">  <i>Backtest result of the wheel trading strategy with a 50% take-profit rule</i></p><img data-src="/2025/01/16/2025-01-18-wheel-trading-strategy/take_profit_wheel_dist.png" class="" width="400"><p style="text-align:center; color: grey;">  <i>Distribution of each income stream</i></p><p>In comparison to the traditional wheel strategy, this variant demonstrates mixed results. 
While the implementation of take-profit rules successfully generated a 45% increase in put premium revenue, it simultaneously led to substantial losses on trades in the underlying stock, effectively neutralizing the premium gains.</p><p>The aggressive profit-taking rule triggered more than twice as many trades as the plain vanilla version (175 vs. 79), resulting in two key disadvantages: elevated transaction costs and an increased frequency of stock entries at unfavorable price points relative to market value.</p><h3 id="Roll-over-when-it’s-ATM-old-and-now"><a href="#Roll-over-when-it’s-ATM-old-and-now" class="headerlink" title="Roll over when it’s ATM old and now"></a>Roll over when it’s ATM old and now</h3><p>So, from the previous variant, we noticed that the take-profit rule is not a good idea because you could either lose more when the downward momentum is strong or enter a new position at an unfavorable price. But what if, instead of using the profit gained to time the exit from the current short put position, we used the spread between the strike price and the current stock price? 
This is the idea behind rolling over our short put option when it is <strong>At The Money (ATM)</strong>.</p><div class="table-container"><table><thead><tr><th>Total Trades</th><th>Net Profit</th><th>Annualized Return</th><th>Max Drawdown</th></tr></thead><tbody><tr><td>79</td><td>48.32%</td><td>8.134%</td><td>28.00%</td></tr></tbody></table></div><img data-src="/2025/01/16/2025-01-18-wheel-trading-strategy/roll_wheel_perf.png" class="" width="800"><p style="text-align:center; color: grey;">  <i>Backtest result of the wheel trading strategy that rolls over when it's ATM</i></p><img data-src="/2025/01/16/2025-01-18-wheel-trading-strategy/roll_wheel_dist.png" class="" width="400"><p style="text-align:center; color: grey;">  <i>Distribution of each income stream</i></p><p>Astonishingly enough, we greatly reduce the chances of stock assignment by rolling over the position when it’s ATM. The strategy has an 8% annual return, which is higher than the US 10-year Bond Yield. On top of that, we lost less on trading the underlying stock and reduced the variance of our strategy performance.</p><h3 id="Recover-to-where-it-drops"><a href="#Recover-to-where-it-drops" class="headerlink" title="Recover to where it drops"></a>Recover to where it drops</h3><p>To avoid locking in losses, we can set the covered call strike price equal to the cost basis of our assigned shares. This serves two purposes: it lowers the probability that the shares are called away while the price is still depressed, and it ensures that, if they are called away, we do not sell them below what we paid. 
Even though we may still get assigned shares on the put side, we still collect premium from selling covered call options, which beats simply holding the underlying stock.</p><div class="table-container"><table><thead><tr><th>Total Trades</th><th>Net Profit</th><th>Annualized Return</th><th>Max Drawdown</th></tr></thead><tbody><tr><td>79</td><td>64.362%</td><td>10.359%</td><td>28.00%</td></tr></tbody></table></div><img data-src="/2025/01/16/2025-01-18-wheel-trading-strategy/wheel_roll_recover_perf.png" class="" width="800"><p style="text-align:center; color: grey;">  <i>Backtest result of the wheel trading strategy that rolls over when it's ATM and sets the covered call strike at the cost basis</i></p><img data-src="/2025/01/16/2025-01-18-wheel-trading-strategy/wheel_roll_recover_dist.png" class="" width="400"><p style="text-align:center; color: grey;">  <i>Distribution of each income stream</i></p><p>Hey! We have a 10% annual return and the max drawdown is still only 28%! Also, looking at the profit distribution histogram, we didn’t lose any capital on trading the underlying stock at all.</p><h1 id="Take-away"><a href="#Take-away" class="headerlink" title="Take away"></a>Take away</h1><p>I’ve tried this strategy on AAPL, TSLA, and the QQQ ETF. All these underlying stocks perform similarly to the SPY ETF. These strategies share the same traits: they don’t have strong momentum in either bull or bear markets, but they create stable income throughout the backtest period. 
Therefore, if you want to find another trading strategy to compensate for the fluctuation of your main trading strategy, the wheel trading strategy could be a good choice.</p><hr><h1 id="References"><a href="#References" class="headerlink" title="References"></a>References</h1><ul><li><a href="https://www.quantconnect.com/research/17871/automating-the-wheel-strategy/p1">Automating the Wheel Strategy</a></li><li><a href="https://medium.datadriveninvestor.com/the-wheel-strategy-99e16b9540b2">The Wheel Options Strategy</a></li><li><a href="https://medium.com/mastering-options/how-to-use-the-options-wheel-strategy-5013c9938f4b">The Wheel Options Strategy for Begineers</a></li><li><a href="https://medium.com/mastering-options/is-the-wheel-strategy-profitable-1861cb52c3eb">Is the Wheel Strategy Profitable?</a></li><li><a href="https://ali-muhammadimran.medium.com/wheel-strategy-is-the-ultimate-trading-cheat-code-e21b369ab31e">Wheel Strategy is the Ultimate Trading Cheat Code</a></li><li><a href="https://learn.bybit.com/options/wheel-strategy/">Wheel Strategy: A Long-Term Strategy For Consistent Income</a></li><li><a href="https://www.reddit.com/r/options/comments/a36k4j/the_wheel_aka_triple_income_strategy_explained/">The Wheel (aka Triple Income) Strategy Explained</a> <strong><em>Recommended</em></strong></li></ul>]]></content>
    
    
    <summary type="html">&lt;img data-src=&quot;/2025/01/16/2025-01-18-wheel-trading-strategy/cover.png&quot; class=&quot;&quot; width=&quot;600&quot;&gt;
&lt;p style=&quot;text-align:center; color: grey;&quot;&gt;
  &lt;i&gt;Extracted from Website: &lt;a href=&#39;https://steadyoptions.com/articles/the-options-wheel-strategy-wheel-trade-explained-r632/&#39;&gt;SteadyOptions&lt;/a&gt;&lt;/i&gt;
&lt;/p&gt;

&lt;p&gt;Stock options have always been a popular way to trade due to their high leverage, which allows traders to control a large amount of assets with a small amount of capital. However, the complexity of stock options can be overwhelming for many traders. For those who are new to stock options, today we’re going to introduce a simple trading strategy that can help you get started with options trading without much effort.&lt;/p&gt;</summary>
    
    
    <category term="Trading Strategy" scheme="http://mikelhsia.github.io/categories/Trading-Strategy/"/>
    
    
    <category term="Option" scheme="http://mikelhsia.github.io/tags/Option/"/>
    
    <category term="Quantitative Trading" scheme="http://mikelhsia.github.io/tags/Quantitative-Trading/"/>
    
  </entry>
  
  <entry>
    <title>【How 2】Save Your Valuable Memory and Time by Knowing These Math Formulas</title>
    <link href="http://mikelhsia.github.io/2024/12/16/2024-12-14-save-memory-by-learning-math/"/>
    <id>http://mikelhsia.github.io/2024/12/16/2024-12-14-save-memory-by-learning-math/</id>
    <published>2024-12-16T15:37:29.000Z</published>
    <updated>2024-12-16T17:33:01.744Z</updated>
    
    <content type="html"><![CDATA[<img data-src="/2024/12/16/2024-12-14-save-memory-by-learning-math/cover.jpg" class="" width="600"><p style="text-align:center; color: grey;">  <i>Cover image created through <a href='https://copilot.microsoft.com/'>Copilot</a></i></p><p>Calculating the arithmetic mean and the variance (or the standard deviation) of a dataset is a fundamental task in statistics. These calculations provide valuable insights into the central tendency and distribution of the data in the linear space. However, it becomes a resource-intensive operation if you do these calculations repeatedly, especially when dealing with large datasets. To save you time and memory on your computer, we’re going to explore an incremental approach to calculating the arithmetic mean and the variance (or the standard deviation) of a dataset.</p><a id="more"></a><hr><h3 id="Previous-readings"><a href="#Previous-readings" class="headerlink" title="Previous readings"></a>Previous readings</h3><ul><li><a href="https://mikelhsia.github.io/2024/10/04/2024-10-07-explain-bayes-theorm-without-using-big-words/">【How 2】Explain Bayes’ Theorem Without Using Big Words</a></li><li><a href="https://mikelhsia.github.io/2024/04/23/2024-04-24-spin-up-docker-container-for-your-tws/">【How 2】Breaking Free! Use Docker to Create Hands-Off Interactive Broker TWS Managing Experience</a></li><li><a href="https://mikelhsia.github.io/2022/12/12/2022-12-10-IBKR-Broker/">【How 2】 Set Up Trading API Template In Python - Connecting My Trading Strategies To Interactive Brokers</a></li><li><a href="https://mikelhsia.github.io/2020/10/19/2020-10-19-get-all-tradable-tickers/">【How 2】 Vol. 1. 
How 2 get all tradable tickers in US markets</a></li></ul><hr><h1 id="Let’s-talk-about-the-problems"><a href="#Let’s-talk-about-the-problems" class="headerlink" title="Let’s talk about the problems"></a>Let’s talk about the problems</h1><p>In quantitative trading, we often need to calculate the mean and variance of a dataset. For example, we may want to calculate the mean and variance of the returns of a stock over a certain period of time. The formula to calculate the mean is:</p><script type="math/tex; mode=display">\text{arithmetic mean} = \frac{\sum\limits_{i=1}^{n} x_i}{n}</script><p>And to calculate the standard deviation, we use the following formulas:</p><script type="math/tex; mode=display">\begin{aligned}&\text{variance} = \frac{\sum\limits_{i=1}^{n} (x_i - \bar{x})^2}{n}\\&\text{standard deviation} = \sqrt{\text{variance}}\end{aligned}</script><p>It’s quite straightforward to calculate the mean and variance of a dataset. However, when the scenario gets more complicated, the calculation consumes far more resources than we might expect. Let’s say your trading strategy uses a 22-day Simple Moving Average of a stock’s close price to evaluate its trend on a minute-by-minute basis over a two-day period. 
Using a 22-day Simple Moving Average means you have to calculate the mean of 22 * 24 (hours) * 60 (minutes) = 31,680 data points, and do so 2 * 24 (hours) * 60 (minutes) = 2880 times.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> numpy <span class="keyword">as</span> np <span class="comment"># To generate random numbers</span></span><br><span class="line"><span class="keyword">import</span> pandas <span class="keyword">as</span> pd <span class="comment"># To plot the results</span></span><br><span class="line"><span class="keyword">from</span> collections <span class="keyword">import</span> deque <span class="comment"># To simulate the operation of a rolling window array</span></span><br><span class="line"><span class="keyword">from</span> statistics <span class="keyword">import</span> mean, stdev, sqrt <span class="comment"># To calculate the mean and the standard deviation</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># Generate 22 days of random minute-bar data</span></span><br><span class="line">array = np.random.randn(<span class="number">22</span>*<span class="number">24</span>*<span class="number">60</span>)</span><br><span class="line"><span class="comment"># This is the rolling window array with max len = 22 * 24 * 60</span></span><br><span class="line">queue = deque([], maxlen=<span 
class="number">22</span>*<span class="number">24</span>*<span class="number">60</span>)</span><br><span class="line"><span class="comment"># Generate 2 more days of minute-bar data to simulate newly arriving bars</span></span><br><span class="line">new_dataset = np.random.randn(<span class="number">2</span>*<span class="number">24</span>*<span class="number">60</span>)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Initialize the queue by injecting the first 22 days of data into the queue</span></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">init_queue</span>():</span></span><br><span class="line">    <span class="keyword">for</span> value <span class="keyword">in</span> array:</span><br><span class="line">        queue.append(value)</span><br><span class="line">    <span class="keyword">return</span> queue</span><br></pre></td></tr></table></figure><h1 id="Let’s-get-started-with-the-experiments"><a href="#Let’s-get-started-with-the-experiments" class="headerlink" title="Let’s get started with the experiments"></a>Let’s get started with the experiments</h1><h2 id="Arithmetic-mean"><a href="#Arithmetic-mean" class="headerlink" title="Arithmetic mean"></a>Arithmetic mean</h2><p>Let’s recap the formula to calculate the arithmetic mean:</p><script type="math/tex; mode=display">\text{arithmetic mean} = \frac{\sum\limits_{i=1}^{n} x_i}{n}</script><p>We use the following code to initialize the queue and the lists that are needed to start the experiments:<br><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">m_queue, m_queue_2 = [], []  <span class="comment"># Two separate lists, one per approach</span></span><br><span class="line">queue = init_queue()</span><br></pre></td></tr></table></figure></p><h3 id="The-traditional-Approach"><a href="#The-traditional-Approach" class="headerlink" title="The traditional Approach"></a>The 
traditional Approach</h3><p>This one is familiar to everyone. We take the following steps to simulate the scenario of calculating the mean of a dataset repeatedly:</p><ol><li>Iterate over the new dataset to get the new minute-bar data point</li><li>Append the new minute-bar data point to the queue</li><li>Calculate the mean by using the queue directly</li><li>Store the mean in a list for later comparison<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">for</span> _ <span class="keyword">in</span> new_dataset:</span><br><span class="line">    new_value = _</span><br><span class="line">    queue.append(new_value)</span><br><span class="line">    m_queue.append(mean(queue))</span><br><span class="line"></span><br><span class="line"><span class="comment"># =&gt; Last executed at 2024-12-16 12:12:20 in 54.55s</span></span><br></pre></td></tr></table></figure></li></ol><p>It took <strong>54.55s</strong> to finish all 2880 calculations. Let’s see what would happen if we apply the incremental approach to this scenario.</p><h3 id="The-Incremental-Approach"><a href="#The-Incremental-Approach" class="headerlink" title="The Incremental Approach"></a>The Incremental Approach</h3><p>Starting with the same step, we initialize the queue with the same values so that we can compare the results of the two approaches. 
In this approach, we need to keep track of the summation of the queue.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">queue = init_queue()</span><br><span class="line">summation = sum(queue)</span><br></pre></td></tr></table></figure><p>The idea of the incremental approach is relatively easy to understand. Let’s have a look at the illustration below:</p><img data-src="/2024/12/16/2024-12-14-save-memory-by-learning-math/rolling_window_illustration.png" class="" width="800"><p style="text-align:center; color: grey;">  <i>Illustration of appending a new value to a rolling window.</i></p><p>When it comes to calculating the mean, we sum up all the elements in the queue and divide the sum by the length of the queue. As you can see, once you add a new value to the 22-day SMA rolling window, the numbers in the queue mostly remain the same, except for the oldest value, which drops out, and the newest value, which comes in. 
Therefore, all we need to do is take these two values into account and update the summation accordingly instead of looping through all the numbers in the queue.</p><script type="math/tex; mode=display">\text{New mean} = \frac{\text{summation} + \text{new value} - \text{old value}}{\text{length of queue}}</script><p>Simply by applying this formula, we reduce the execution time down to <strong>13ms</strong>.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">for</span> _ <span class="keyword">in</span> new_dataset:</span><br><span class="line">    new_value = _</span><br><span class="line">    old_value = queue[<span class="number">0</span>]</span><br><span class="line">    queue.append(new_value)</span><br><span class="line">    summation = summation + new_value - old_value</span><br><span class="line">    m_queue_2.append(summation/len(queue))</span><br><span class="line"></span><br><span class="line"><span class="comment"># =&gt; Last executed at 2024-12-16 12:12:20 in 13ms</span></span><br></pre></td></tr></table></figure><p>Now let’s use the below code to compare the results from both approaches.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">pd.DataFrame([m_queue, m_queue_2]).T.rename(columns=&#123;<span class="number">0</span>:<span class="string">&#x27;traditional&#x27;</span>, <span class="number">1</span>:<span class="string">&#x27;incremental&#x27;</span>&#125;).plot()</span><br></pre></td></tr></table></figure><img data-src="/2024/12/16/2024-12-14-save-memory-by-learning-math/compare_mean.png" class="" 
width="600"><p style="text-align:center; color: grey;">  <i>The mean values calculated with both traditional and incremental approaches</i></p><p>See! The incremental approach gives the same results as the traditional approach, but takes far less time.</p><h2 id="Variance-amp-Standard-Deviation"><a href="#Variance-amp-Standard-Deviation" class="headerlink" title="Variance &amp; Standard Deviation"></a>Variance &amp; Standard Deviation</h2><p>Now we come to the variance and standard deviation. The formulas are as follows:</p><script type="math/tex; mode=display">\begin{aligned}\text{variance} &= \frac{\sum\limits_{i=1}^{n} (x_i - \bar{x})^2}{n}\\\text{standard deviation} &= \sqrt{\text{variance}}\end{aligned}</script><p>Again, let’s start with initializing the queue and the lists that are needed to start the experiments:<br><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">queue = init_queue()</span><br><span class="line">m_queue, m_queue_2 = [], []  <span class="comment"># Two separate lists, one per approach</span></span><br></pre></td></tr></table></figure></p><h3 id="The-traditional-Approach-1"><a href="#The-traditional-Approach-1" class="headerlink" title="The traditional Approach"></a>The traditional Approach</h3><p>Below is the approach most people would take to calculate the variance and standard deviation:<br><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">for</span> _ <span class="keyword">in</span> new_dataset:</span><br><span class="line">    new_value = _</span><br><span class="line">    queue.append(new_value)</span><br><span class="line">    
m_queue.append(stdev(queue))</span><br><span class="line"></span><br><span class="line"><span class="comment"># =&gt; Last executed at 2024-12-17 00:50:27 in 2m 26.64s</span></span><br></pre></td></tr></table></figure><br>This time, it cost <strong>2m 26.64s</strong> to finish calculating 2880 times. It is apparently way slower than the mean calculation. As we all know, to calculate the variance and the standard deviation, we need to calculate the mean and then calculate the summation of the square of the difference between each data point and the mean. So the time complexity of calculating the variance and standard deviation is $O(n^2)$, meaning costing more time than the mean calculation.</p><h3 id="The-Incremental-Approach-1"><a href="#The-Incremental-Approach-1" class="headerlink" title="The Incremental Approach"></a>The Incremental Approach</h3><p>The incremental approach is a little bit more complicated than the mean calculation. We need to keep track of the summation of the queue, the mean of the queue, and the variance of the queue. This method is called the <a href="https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance">Welford’s Weighted incremental algorithm</a>. You can also refer to <a href="https://changyaochen.github.io/welford/">this post</a> for how to derive the formula using the very basic algebra. According to the formula, we can easily reduce the complexity from $O(n^2)$ to $O(n)$.</p><p>Again, let’s start with the initialization of the queue and the lists that are needed to start the experiments. 
But this time, we will calculate the summation, mean, and variance of the queue to conduct the following calculation.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">queue = init_queue()</span><br><span class="line">summation = sum(queue)</span><br><span class="line">mean = summation / len(queue)</span><br><span class="line">variance = sum([(x-mean)**<span class="number">2</span> <span class="keyword">for</span> x <span class="keyword">in</span> queue]) / len(queue)</span><br></pre></td></tr></table></figure><p>Now, according to <a href="https://changyaochen.github.io/welford/">this post</a>, we will simplify the formula to</p><script type="math/tex; mode=display">\text{new variance} = \frac{\text{length of the queue} * \text{old variance} + (\text{new value}-\text{old value}) * (\text{new value} - \text{new mean} + \text{old value} - \text{old mean})}{\text{len of the queue}}</script><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">for</span> _ <span class="keyword">in</span> new_dataset:</span><br><span class="line">    
old_value = queue[<span class="number">0</span>]</span><br><span class="line">    new_value = _</span><br><span class="line"></span><br><span class="line">    old_sum = summation</span><br><span class="line">    new_sum = summation + new_value - old_value</span><br><span class="line"></span><br><span class="line">    old_mean = summation/len(queue)</span><br><span class="line">    new_mean = (summation + new_value - old_value)/len(queue)</span><br><span class="line"></span><br><span class="line">    old_variance = variance</span><br><span class="line">    new_variance = (len(queue)*old_variance + (new_value - old_value) * (new_value - new_mean + old_value - old_mean))/len(queue)</span><br><span class="line"></span><br><span class="line">    queue.append(new_value)</span><br><span class="line">    m_queue_2.append(sqrt(new_variance))</span><br><span class="line"></span><br><span class="line">    <span class="comment"># Update the summation,</span></span><br><span class="line">    summation = new_sum</span><br><span class="line">    variance = new_variance</span><br><span class="line"></span><br><span class="line"><span class="comment"># =&gt; Last executed at 2024-12-17 01:00:10 in 14ms</span></span><br></pre></td></tr></table></figure><p>Hey, we greatly reduce the execution time needed from 2m 26.64s to <strong>14ms</strong>. 
The last thing we need to do is examine the results from both approaches.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">pd.DataFrame([m_queue, m_queue_2]).T.rename(columns=&#123;<span class="number">0</span>:<span class="string">&#x27;traditional&#x27;</span>, <span class="number">1</span>:<span class="string">&#x27;incremental&#x27;</span>&#125;).plot()</span><br></pre></td></tr></table></figure><img data-src="/2024/12/16/2024-12-14-save-memory-by-learning-math/compare_std.png" class="" width="600"><p style="text-align:center; color: grey;">  <i>The standard deviation values calculated with both traditional and incremental approaches</i></p><p>Great! The incremental approach gives the same results as the traditional approach, but takes far less time. This is the power of the incremental approach!</p><h1 id="Take-away"><a href="#Take-away" class="headerlink" title="Take away"></a>Take away</h1><p>This is the power of math! If you follow the steps in <a href="https://changyaochen.github.io/welford/">the post</a> to derive the formula, you will find that the formula is quite simple. Yet simply applying these formulas saves you not just the time needed to calculate the variance and standard deviation, but also memory when you deploy the algorithm in a production environment. Hence, when you are working on a project, always try to think about the math behind the algorithm.</p>]]></content>
    
    
    <summary type="html">&lt;img data-src=&quot;/2024/12/16/2024-12-14-save-memory-by-learning-math/cover.jpg&quot; class=&quot;&quot; width=&quot;600&quot;&gt;
&lt;p style=&quot;text-align:center; color: grey;&quot;&gt;
  &lt;i&gt;Cover image created through &lt;a href=&#39;https://copilot.microsoft.com/&#39;&gt;Copilot&lt;/a&gt;&lt;/i&gt;
&lt;/p&gt;

&lt;p&gt;Calculating the arithmetic mean and the variance (or the standard deviation) of a dataset is a fundamental task in statistics. These calculations provide valuable insights into the central tendency and distribution of the data in the linear space. However, it’s going to be a resource-intensive operation if you do these calculations repeatedly, especially when dealing with large datasets. To save you time and the memory on your computer, we’re going to explore an incremental approach to calculate the arithmetic mean and the variance (or the standard deviation) of a dataset.&lt;/p&gt;</summary>
    
    
    <category term="How2" scheme="http://mikelhsia.github.io/categories/How2/"/>
    
    
    <category term="How2" scheme="http://mikelhsia.github.io/tags/How2/"/>
    
  </entry>
  
  <entry>
    <title>Beyond Traditional Grid Trading - Introducing A New Type of Grid Trading System</title>
    <link href="http://mikelhsia.github.io/2024/11/11/2024-11-11-new-type-of-grid-trading-system/"/>
    <id>http://mikelhsia.github.io/2024/11/11/2024-11-11-new-type-of-grid-trading-system/</id>
    <published>2024-11-11T05:45:20.000Z</published>
    <updated>2024-11-12T07:59:29.025Z</updated>
    
    <content type="html"><![CDATA[<img data-src="/2024/11/11/2024-11-11-new-type-of-grid-trading-system/cover.png" class="" width="600"><p style="text-align:center; color: grey;">  <i>Cover image created through <a href='https://copilot.microsoft.com/'>Copilot</a></i></p><p>Traditional grid trading systems, despite their popularity, face critical limitations: static price levels and poor performance in trending markets. Their vulnerability to sudden market gaps and their inability to adapt to changing market volatility are further obstacles. Given the rapid advances in both the hardware and software behind machine learning, <em>Francesco Rundo</em>, <em>Francesca Trenta</em>, <em>Agatino Luigi di Stallo</em>, and <em>Sebastiano Battiato</em>, in the article <a href="https://www.mdpi.com/2076-3417/9/9/1796">Grid Trading System Robot (GTSbot): A Novel Mathematical Algorithm for Trading FX Market</a>, proposed a new type of Grid Trading System that addresses the challenges of the traditional one. This system dynamically optimizes entry/exit points and adapts to market conditions in real time, maximizing profit potential while managing risks. Join me as I break down the complete process, from data preparation to model training, and reveal how you can implement this advanced trading strategy yourself.</p><a id="more"></a><hr><h3 id="Previous-readings"><a href="#Previous-readings" class="headerlink" title="Previous readings"></a>Previous readings</h3><ul><li><a href="https://mikelhsia.github.io/2024/10/04/2024-10-07-explain-bayes-theorm-without-using-big-words/">【How 2】Explain Bayes’ Theorem Without Using Big Words</a></li><li><a href="https://mikelhsia.github.io/2022/04/18/2022-04-20-forex-grid-trading-system/">Looking for no-loss trading strategy? 
Here’s the strategy that you should look at</a></li><li><a href="https://mikelhsia.github.io/2023/04/26/2023-05-01-pair-trading-cointegration-part2/">【Pair Trading】 Complete Guide to Backtest Cointegration Pair Trading Strategy</a></li></ul><hr><h1 id="Summary-of-the-GTSbot-paper"><a href="#Summary-of-the-GTSbot-paper" class="headerlink" title="Summary of the GTSbot paper"></a>Summary of the GTSbot paper</h1><p>We discussed the traditional grid trading system in my previous post <a href="https://mikelhsia.github.io/2022/04/18/2022-04-20-forex-grid-trading-system/">Looking for no-loss trading strategy?</a>. The traditional grid trading strategy looks attractive because it always tries to buy low and sell high. However, the strategy depends on the time series being <strong>stationary</strong>, and once that property no longer holds, it can wipe out the returns accumulated so far. This usually happens when the global market trading regime or a country’s central bank policy changes.</p><img data-src="/2024/11/11/2024-11-11-new-type-of-grid-trading-system/drastic_change.png" class="" width="1000"><p style="text-align:center; color: grey;">  <i>Drastic change of the market trend sabotages the accumulated strategy returns</i></p><p>In the paper <a href="https://www.mdpi.com/2076-3417/9/9/1796">Grid Trading System Robot (GTSbot): A Novel Mathematical Algorithm for Trading FX Market</a>, the authors proposed a new type of Grid Trading System that addresses the challenges of the traditional one. This system dynamically optimizes entry/exit points and adapts to market conditions in real time, maximizing profit potential while managing risks. Compared to the traditional grid trading system, the GTSbot no longer relies on static price levels as the baseline to define the grids. Instead, it employs a regression network to predict future price movement and builds several grids with different reference prices simultaneously. 
This system delivers performance similar to the <strong>Ichimoku</strong> FX trading strategy, with an ROI as high as 13.76% over the backtest period, while reducing the maximum drawdown (Max DD) to one fifth of the Max DD generated by the benchmark strategy.</p><h1 id="What-is-GTSbot"><a href="#What-is-GTSbot" class="headerlink" title="What is GTSbot"></a>What is GTSbot</h1><img data-src="/2024/11/11/2024-11-11-new-type-of-grid-trading-system/model_diagram.png" class="" width="600"><p style="text-align:center; color: grey;">  <i>The component models of the GTSbot</i></p><p>The diagram above briefly demonstrates the architecture of the GTSbot. The GTSbot is composed of the following four main components:</p><h3 id="1-Regression-Network"><a href="#1-Regression-Network" class="headerlink" title="1. Regression Network:"></a>1. <strong>Regression Network</strong>:</h3><p>This component is responsible for predicting future price movement using the Forex pricing data. However, predicting the exact price is not the goal of this model. The aim is to combine the predicted future price with the historical data to determine the future trend, so that we can decide whether to open long or short trades. The network proposed in the paper is trained with <strong>Scaled Conjugate Gradient (SCG)</strong>, an algorithm introduced in <a href="https://www.sciencedirect.com/science/article/abs/pii/S0893608005800565">A scaled conjugate gradient algorithm for fast supervised learning</a> by <em>Martin Fodslette Moller</em>, to shorten the time training needs to converge.</p><p>I spent some time finding the source code and made a few adjustments to get this model to work. It works perfectly and greatly reduces the time needed to train the model. However, I found that you won’t be able to load this model on the <a href="https://www.quantconnect.com/">QuantConnect</a> platform to run a further backtest and obtain a more accurate result. 
Therefore, I fell back to the <strong>Long Short-Term Memory (LSTM)</strong> model, which is one of the most popular models for time series forecasting. Below is the structure of the LSTM model that I use.</p><img data-src="/2024/11/11/2024-11-11-new-type-of-grid-trading-system/lstm_model_structure.png" class="" width="600"><p style="text-align:center; color: grey;">  <i>The structure of the LSTM model</i></p><p>The RMSE validation result is as below:<br><img data-src="/2024/11/11/2024-11-11-new-type-of-grid-trading-system/validation.png" class="" width="600"></p><p style="text-align:center; color: grey;">  <i>Best Validation RMSE: 0.0126%</i></p><h3 id="2-Trend-Classification-Block-TCB"><a href="#2-Trend-Classification-Block-TCB" class="headerlink" title="2. Trend Classification Block (TCB):"></a>2. <strong>Trend Classification Block (TCB)</strong>:</h3><p>This component classifies the current trend based on the predicted price movement. It uses the output of the regression network to determine whether the trend is bullish, bearish, or neutral. As we learned in physics class at school, there is a good chance that we’re going to see positive momentum if both the speed and the acceleration are positive, and vice versa. Here we use the first derivative of the price as the speed parameter and its second derivative as the acceleration parameter. 
According to the <a href="https://en.wikipedia.org/wiki/Finite_difference"><strong>Central Difference Quotient</strong> and <strong>Second Central Difference Approximation</strong> rules</a>, we can approximate these derivatives with the following formulas:</p><script type="math/tex; mode=display">\begin{aligned}&f^{\prime}(x) \approx \frac{f(x+h) - f(x-h)}{2h}\\&f^{\prime\prime}(x) \approx \frac{f(x+h) - 2f(x) + f(x-h)}{h^2}\\\text{where}&\text{:}\\&f(x+h) \text{ represents the predicted price}\\&f(x) \text{ and } f(x-h) \text{ represent the historical prices}\\&h \text{ is the time difference between the predicted price and the historical price}\end{aligned}</script><p>Using the above rules with a step of $h = 1$ sample, and a one-sided difference for the first derivative, let’s plug in the historical and predicted data to rewrite these equations:</p><script type="math/tex; mode=display">\begin{aligned}\frac{dc_{predict}^{Close}(k+1)}{dk}&=c_{predict}^{Close}(k+1)-c_{real}^{Close}(k)\\\frac{d^{2}c_{predict}^{Close}(k+1)}{dk^{2}}&=c_{predict}^{Close}(k+1)+c_{real}^{Close}(k-1)-2c_{real}^{Close}(k)\end{aligned}</script><script type="math/tex; mode=display">\begin{aligned}\text{where}&\text{:}\\&c_{predict}^{Close}(k+1) &\text{ is the predicted price}\\&c_{real}^{Close}(k) &\text{ is yesterday's price}\\&c_{real}^{Close}(k-1) &\text{ is the price two days ago}\end{aligned}</script><p>Once we confirm that both of these derivatives are positive, we can conclude that the price is going to increase and send the bullish signal to the next component to open a long trade. If both derivatives are negative, we send the bearish signal to the next component to open a short trade. If neither condition is met, we consider the market to be still volatile and don’t send any signal to the next component.</p><h3 id="3-Grid-System-Manager-Block-GSM"><a href="#3-Grid-System-Manager-Block-GSM" class="headerlink" title="3. Grid System Manager Block (GSM):"></a>3. 
<strong>Grid System Manager Block (GSM)</strong>:</h3><p>This component receives the bullish or bearish signal from the TCB and places long or short trades accordingly. It runs a few checks before placing any trade:</p><ol><li>We set the maximum number of open positions for this strategy to 15. The paper also mentions that this number should preferably be odd, so that the long and short positions can offset each other.</li><li>We define x_threshold to be 15 (samples), meaning that we won’t open a new trade until 15 minutes after the previously opened trade. This makes sure that we don’t overcommit our capital in one single upward or downward trend.</li><li>We set y_threshold to 2 (pips). We need to make sure that the new trade we open is at least 2 pips above all the previous long trades, or at least 2 pips below all the previous short trades. In essence, this is the grid size of this trading system.</li></ol><h3 id="4-Basket-Equity-System-Manager-BESM"><a href="#4-Basket-Equity-System-Manager-BESM" class="headerlink" title="4. Basket Equity System Manager (BESM):"></a>4. <strong>Basket Equity System Manager (BESM)</strong>:</h3><p>Lastly, the BESM component acts essentially as a risk management component. It monitors each open position and closes it once it hits its take-profit point. Interestingly, the paper mentions that the BESM component is specifically designed not to set any stop-loss points. 
The authors believe that this new type of grid system will open an opposite trade to compensate for the wrongly opened trade when the TCB component realizes the trend is turning around.</p><p>Let’s see how we can backtest this strategy to examine its performance against the real world.</p><h1 id="How-to-implement-this-GTSbot"><a href="#How-to-implement-this-GTSbot" class="headerlink" title="How to implement this GTSbot?"></a>How to implement this GTSbot?</h1><script src="https://gist.github.com/mikelhsia/7859bb21a1738a1744880a0d326bef16.js"></script><img data-src="/2024/11/11/2024-11-11-new-type-of-grid-trading-system/grid_trading_strategy_performance.png" class="" width="1000"><p style="text-align:center; color: grey;">  <i>Final backtest strategy performance</i></p><h1 id="Conclusion"><a href="#Conclusion" class="headerlink" title="Conclusion"></a>Conclusion</h1><p>The strategy seems to be very profitable in the first half year. However, because this strategy lacks a proper stop-loss point for each trade, at one point these trades get stuck in the market for a long time. This result leaves us a lot of room to further improve this strategy:</p><ol><li>Instead of simply adding a stop-loss point, we can adopt the <strong>Triple Barrier Method</strong> technique mentioned in <a href="https://mikelhsia.github.io/2022/10/21/2022-10-15-meta-label/">this post</a> to confine the risk of each trade.</li><li>You can’t hold both long and short orders on the <a href="https://www.quantconnect.com/">QuantConnect</a> system, as the forex trades will net themselves off. For this reason, we can’t use QuantConnect to further backtest this strategy.</li><li>You CAN place both long and short orders via FXCM. 
However, it doesn’t provide a simulation platform for backtesting against the price history.</li><li><a href="https://www.fxcm.com/markets/algorithmic-trading/api-trading/">FXCM</a> provides an API and a demo account, so you can build a simulated backtest through paper trading. Yet building this tool takes a lot of effort.</li></ol><p>I wouldn’t judge this strategy as either a good or a bad one. However, this paper does give us tons of ideas for rethinking what a grid trading system can and should do. I hope you enjoy reading this paper as much as I do.</p>]]></content>
    
    
    <summary type="html">&lt;img data-src=&quot;/2024/11/11/2024-11-11-new-type-of-grid-trading-system/cover.png&quot; class=&quot;&quot; width=&quot;600&quot;&gt;
&lt;p style=&quot;text-align:center; color: grey;&quot;&gt;
  &lt;i&gt;Cover image created through &lt;a href=&#39;https://copilot.microsoft.com/&#39;&gt;Copilot&lt;/a&gt;&lt;/i&gt;
&lt;/p&gt;


&lt;p&gt;Traditional grid trading systems, despite their popularity, face critical limitations: static price levels and poor performance in trending markets. Their vulnerability to sudden market gaps and their inability to adapt to changing market volatility are further obstacles. Given the rapid advances in both the hardware and software behind machine learning, &lt;em&gt;Francesco Rundo&lt;/em&gt;, &lt;em&gt;Francesca Trenta&lt;/em&gt;, &lt;em&gt;Agatino Luigi di Stallo&lt;/em&gt;, and &lt;em&gt;Sebastiano Battiato&lt;/em&gt;, in the article &lt;a href=&quot;https://www.mdpi.com/2076-3417/9/9/1796&quot;&gt;Grid Trading System Robot (GTSbot): A Novel Mathematical Algorithm for Trading FX Market&lt;/a&gt;, proposed a new type of Grid Trading System that addresses the challenges of the traditional one. This system dynamically optimizes entry/exit points and adapts to market conditions in real time, maximizing profit potential while managing risks. Join me as I break down the complete process, from data preparation to model training, and reveal how you can implement this advanced trading strategy yourself.&lt;/p&gt;</summary>
    
    
    <category term="Quantitative Trading" scheme="http://mikelhsia.github.io/categories/Quantitative-Trading/"/>
    
    
    <category term="Backtesting" scheme="http://mikelhsia.github.io/tags/Backtesting/"/>
    
    <category term="Grid trading" scheme="http://mikelhsia.github.io/tags/Grid-trading/"/>
    
    <category term="Forex" scheme="http://mikelhsia.github.io/tags/Forex/"/>
    
  </entry>
  
  <entry>
    <title>【How 2】Explain Bayes&#39; Theorem Without Using Big Words</title>
    <link href="http://mikelhsia.github.io/2024/10/04/2024-10-07-explain-bayes-theorm-without-using-big-words/"/>
    <id>http://mikelhsia.github.io/2024/10/04/2024-10-07-explain-bayes-theorm-without-using-big-words/</id>
    <published>2024-10-04T15:37:29.000Z</published>
    <updated>2024-10-06T03:49:23.091Z</updated>
    
    <content type="html"><![CDATA[<img data-src="/2024/10/04/2024-10-07-explain-bayes-theorm-without-using-big-words/cover.jpg" class="" width="800"><p style="text-align:center; color: grey;">  <i>Cover image created through <a href='https://copilot.microsoft.com/'>Copilot</a></i></p><p>Have you ever struggled to understand Bayes’ Theorem? You’re not alone. I used to find the formal definitions and explanations of Bayes’ Theorem on Wikipedia confusing and hard to grasp. But after revisiting the basics of Venn Diagrams, everything suddenly became clear! Let me break down the complex formula into easy-to-digest parts using color coding and visual aids with Venn diagrams. Let’s dive in!</p><a id="more"></a><hr><h3 id="Previous-readings"><a href="#Previous-readings" class="headerlink" title="Previous readings"></a>Previous readings</h3><ul><li><a href="https://mikelhsia.github.io/2024/04/23/2024-04-24-spin-up-docker-container-for-your-tws/">【How 2】Breaking Free! Use Docker to Create Hands-Off Interactive Broker TWS Managing Experience</a></li><li><a href="https://mikelhsia.github.io/2022/12/12/2022-12-10-IBKR-Broker/">【How 2】 Set Up Trading API Template In Python - Connecting My Trading Strategies To Interactive Brokers</a></li><li><a href="https://mikelhsia.github.io/2020/10/19/2020-10-19-get-all-tradable-tickers/">【How 2】 Vol. 1. How 2 get all tradable tickers in US markets</a></li></ul><hr><h1 id="What-is-Bayes’-Theorem"><a href="#What-is-Bayes’-Theorem" class="headerlink" title="What is Bayes’ Theorem?"></a>What is Bayes’ Theorem?</h1><p>We know that Bayes’ Theorem has many modern applications, such as in machine learning and artificial intelligence. 
Below is the definition extracted from Wikipedia:</p><blockquote><p> Bayesian inference, a particular approach to statistical inference, where it is used to invert the probability of observations given a model configuration (i.e., the likelihood function) to obtain the probability of the model configuration given the observations (i.e., the posterior probability)…</p></blockquote><p>Um…, this part of the definition has already killed half of my brain cells. As a non-math major, it took me a while to understand the vague concept of prior knowledge and posterior probabilities. Is there any other way to explain Bayes’ Theorem without using those big long words?</p><h1 id="Venn-Diagram-Visualization"><a href="#Venn-Diagram-Visualization" class="headerlink" title="Venn Diagram Visualization"></a>Venn Diagram Visualization</h1><p>Ok, maybe this is not a new method at all. However, I’ve found that using the Venn Diagram to explain Bayes’ Theorem is much easier to grasp. Let us start with the Bayes’ Formula used in the theorem:</p><script type="math/tex; mode=display">P(A|B) = \frac{P(B|A) * P(A)}{P(B)}</script><p>$P(A|B)$ is the probability of event A occurring given that event B has occurred. 
With reference to the Venn Diagram below, the explanation can be simplified to <strong><em>the probability of the intersection $P(A\cap B)$ divided by the probability of event B $P(B)$</em></strong>.</p><img data-src="/2024/10/04/2024-10-07-explain-bayes-theorm-without-using-big-words/P-A-B-min.png" class="" width="400"><p style="text-align:center; color: grey;">  <i>Venn Diagram of P(A|B)</i></p><p>The same method applies to $P(B|A)$: it is equivalent to the probability of the intersection $P(B\cap A) \text{ (which is equal to } P(A\cap B)\text{)}$ divided by the probability of event A ($P(A)$).</p><img data-src="/2024/10/04/2024-10-07-explain-bayes-theorm-without-using-big-words/P-B-A-min.png" class="" width="400"><p style="text-align:center; color: grey;">  <i>Venn Diagram of P(B|A)</i></p><p>By visualizing Bayes’ Formula with the Venn Diagram, we can easily grasp that $P(A\cap B)$ is actually the centerpiece of the entire Bayes’ Formula. What Bayes’ Formula is trying to figure out is the probability of a defined cause (<em>event A</em>) given the observed effect (<em>event B</em>). 
In other words, it builds the probability of the intersection $P(A\cap B)$ from $P(B|A)$ and the probability of event A $P(A)$, and then divides it by the probability of event B $P(B)$.</p><img data-src="/2024/10/04/2024-10-07-explain-bayes-theorm-without-using-big-words/P-AB-min.png" class="" width="400"><p style="text-align:center; color: grey;">  <i>The centerpiece of the probability of A intersects B</i></p><p>Let’s now derive Bayes’ Formula step by step to see the details behind the curtain.</p><script type="math/tex; mode=display">\textcolor{blue}{P(A|B)} = \frac{P(B|A) * P(A)}{P(B)}</script><script type="math/tex; mode=display">\frac{P(A\cap B)}{P(B)} = \frac{\textcolor{blue}{P(B|A)} * P(A)}{P(B)}</script><script type="math/tex; mode=display">\frac{P(A\cap B)}{P(B)} = \frac{\frac{P(A\cap B)}{P(A)} * P(A)}{P(B)}</script><img data-src="/2024/10/04/2024-10-07-explain-bayes-theorm-without-using-big-words/bayes_process.png" class="" width="800"><p style="text-align:center; color: grey;">  <i>Putting three diagrams in one chart</i></p><h1 id="Conclusion"><a href="#Conclusion" class="headerlink" title="Conclusion"></a>Conclusion</h1><p>Understanding Bayes’ Theorem doesn’t have to be a daunting task filled with complex jargon and intimidating formulas. By breaking it down with simple visual aids like Venn diagrams, we can see that it’s all about finding the probability of one event given the occurrence of another. The key takeaway is that Bayes’ Theorem helps us update our beliefs based on new information, making it a powerful tool in many fields, from medicine to machine learning.</p><p>Hope this helps.</p>]]></content>
    
    
    <summary type="html">&lt;img data-src=&quot;/2024/10/04/2024-10-07-explain-bayes-theorm-without-using-big-words/cover.jpg&quot; class=&quot;&quot; width=&quot;800&quot;&gt;
&lt;p style=&quot;text-align:center; color: grey;&quot;&gt;
  &lt;i&gt;Cover image created through &lt;a href=&#39;https://copilot.microsoft.com/&#39;&gt;Copilot&lt;/a&gt;&lt;/i&gt;
&lt;/p&gt;

&lt;p&gt;Have you ever struggled to understand Bayes’ Theorem? You’re not alone. I used to find the formal definitions and explanations of Bayes’ Theorem on Wikipedia confusing and hard to grasp. But after revisiting the basics of Venn Diagrams, everything suddenly became clear! Let me break down the complex formula into easy-to-digest parts using color coding and visual aids with Venn diagrams. Let’s dive in!&lt;/p&gt;</summary>
    
    
    <category term="How2" scheme="http://mikelhsia.github.io/categories/How2/"/>
    
    
    <category term="How2" scheme="http://mikelhsia.github.io/tags/How2/"/>
    
  </entry>
  
  <entry>
    <title>Supercharge Your Trading Strategy - Bull? Bear? Here’s How to Profit</title>
    <link href="http://mikelhsia.github.io/2024/09/23/2024-09-30-market-indicator/"/>
    <id>http://mikelhsia.github.io/2024/09/23/2024-09-30-market-indicator/</id>
    <published>2024-09-23T04:13:13.000Z</published>
    <updated>2024-10-01T05:10:17.645Z</updated>
    
    <content type="html"><![CDATA[<img data-src="/2024/09/23/2024-09-30-market-indicator/cover.jpeg" class="" width="600"><p style="text-align:center; color: grey;">  <i>Cover image created through <a href='https://copilot.microsoft.com/'>Copilot</a></i></p><p>Ever wondered how to tell if the market is roaring like a bull or growling like a bear? This article will dive deep into how these indicators perform in different market conditions, comparing manual identification with indicator-based methods. By adding this market indicator, you’ll see how momentum strategies shine in bull markets, while mean reversion and pair trading strategies come into play during bear markets. Plus, we’ll uncover some insights that you might never have considered while applying market indicators in your own algorithm trading script. Now, let’s dive in and uncover the secrets of the market together!</p><a id="more"></a><p><strong><em>Previous researches</em></strong></p><ul><li><a href="https://mikelhsia.github.io/2024/06/28/2024-06-28-why-fit-and-transform/">【ML algo trading】One Pitfall You Definitely Need to Avoid in Feature Engineering</a></li><li><a href="https://mikelhsia.github.io/2024/06/24/2024-06-24-test-oauth-via-postman/">【How 2】A Productive Way to Manage OAuth 2.0 Tokens</a></li><li><a href="https://mikelhsia.github.io/2024/04/23/2024-04-24-spin-up-docker-container-for-your-tws/">【How 2】Breaking Free! Use Docker to Create Hands-Off Interactive Broker TWS Managing Experience</a></li></ul><hr><h1 id="Context"><a href="#Context" class="headerlink" title="Context"></a>Context</h1><p>Various quantitative trading strategies are currently executed in the market, and each strategy has its own characteristics and advantages. 
For example, <strong>momentum trading strategies</strong> are often used to profit from the trend of a bull or bear market, while <strong>mean reversion strategies</strong> are often used to profit from the fluctuation of the market; <strong>machine learning algorithms</strong> can be used to discover hidden patterns in the market and to profit from them accordingly; <strong>pair trading strategies</strong> aim to exploit short-term turmoil in the market and to profit from the price adjustments among multiple correlated assets.</p><p>However, the market is not always in the same stable state, and various influences, macro or micro, can change the market conditions. Therefore, executing one single trading strategy throughout the entire market cycle is not always the best choice. Hence, finding a method to constantly detect and monitor market conditions has become an important topic in the quantitative trading field.</p><h1 id="What-is-the-Market-Indicator"><a href="#What-is-the-Market-Indicator" class="headerlink" title="What is the Market Indicator?"></a>What is the Market Indicator?</h1><p>Before we dive into the details of the market indicator, let’s first understand what the market looks like. Below is the SPY market return from <strong>2020-01-03</strong> to <strong>2024-09-16</strong>:</p><img data-src="/2024/09/23/2024-09-30-market-indicator/benchmark_return.png" class="" width="1000"><p style="text-align:center; color: grey;">  <i>The SPY market return from 2020-01-03 to 2024-09-16</i></p><p>As the first step in deciding which trading strategy to use, we need to visually identify the current market condition. 
I believe most investors, like me, would classify the market as below:</p><img data-src="/2024/09/23/2024-09-30-market-indicator/benchmark_return_manual.png" class="" width="1000"><p style="text-align:center; color: grey;">  <i>Identified bull/bear market conditions</i></p><p>What is the purpose of a market indicator? In essence, it is a quantitative, mathematical, and analytical tool that allows algorithms to assess current market conditions without the need for human intervention. By accurately identifying the existing market situation, investors can implement trading strategies that are best suited to the current climate, maximizing the performance of their strategies.</p><p>In this post, I’m going to introduce the following four basic market indicators that help investors identify the market condition and make their trading strategies more robust.</p><ul><li><p>Using SPY daily close price and its history data</p><ul><li>SPY is an ETF that tracks the S&amp;P 500 index, a market-capitalization-weighted index of 500 large-cap U.S. stocks. Because it reflects the broad market, we can use the 200-day Simple Moving Average of the SPY close price as a threshold. When the SPY daily close price is above the 200-day Simple Moving Average, the market is considered to be bullish; otherwise, it is considered to be bearish.</li></ul></li><li><p>The Average Directional Indicator (ADX)</p><ul><li>The ADX is a technical indicator that measures the strength of a trend. It is calculated by comparing the current high and low prices with the previous high and low prices. 
To see more details of ADX, you can refer to <a href="https://www.investopedia.com/terms/a/adx.asp">this</a> and <a href="https://www.investopedia.com/articles/trading/07/adx-trend-indicator.asp">this</a> article.</li><li>In the book <a href="https://www.amazon.com/Quantitative-Trading-Strategies-Harnessing-McGraw-Hill/dp/0071412395/">Quantitative Trading Strategy</a> (<em>by LARS KESTNER</em>), he mentioned that the 14-day ADX is a good indicator to identify the market condition, and recommends the following frameworks to identify the market condition:</li></ul></li></ul><div class="table-container"><table><thead><tr><th>Regime</th><th>Trend</th><th>Strategy</th></tr></thead><tbody><tr><td>ADX &lt; 15</td><td>Mean reverting prices</td><td>RSI oscillator to take counter-trend signals</td></tr><tr><td>15 &lt; ADX &lt; 25</td><td>Random walk, no trend, no mean reversion in prices</td><td>No trading</td></tr><tr><td>ADX &gt; 25</td><td>Trending prices</td><td>40-day/20-day channel breakout</td></tr></tbody></table></div><p style="text-align:center; color: grey;">  <i>Regime switching rule I</i></p><div class="table-container"><table><thead><tr><th>Regime</th><th>Trend</th><th>Strategy</th></tr></thead><tbody><tr><td>ADX &lt; 20</td><td>Trend to begin soon</td><td>40-day/20-day channel breakout</td></tr><tr><td>20 &lt; ADX &lt; 30</td><td>Mean reversion in prices</td><td>14 day RSI strategy</td></tr><tr><td>ADX &gt; 30</td><td>Trending prices</td><td>40-day/20-day channel breakout</td></tr></tbody></table></div><p style="text-align:center; color: grey;">  <i>Regime switching rule II</i></p><ul><li><p>CBOE VIX Index</p><ul><li>The VIX index is a measure of the market’s expectation of future volatility based on S&amp;P 500 index options. The VIX index goes high when the market is volatile and low when the market is stable. So we will use the 15-day VIX SMA plus 15% of it as a threshold. 
When the VIX index close price is above the threshold, the market is considered to be volatile/bearish; otherwise, it is considered to be stable/bullish.</li></ul></li><li><p>Yield Curve Inversion</p><ul><li>I assume most of you are already quite familiar with this indicator. A yield curve inversion is a situation where long-term interest rates are lower than short-term interest rates. This situation is considered to be a sign of an impending recession. Therefore, we will use the 10-year Treasury yield and the 2-year Treasury yield as indicators. When the short-term yield is above the long-term yield, we consider the yield curve to be inverted and the market to be bearish, and vice versa. <a href="https://www.investopedia.com/terms/i/invertedyieldcurve.asp">Here</a> is the definition of the yield curve inversion if you want to know more.</li></ul></li></ul><h1 id="Backtest-Results"><a href="#Backtest-Results" class="headerlink" title="Backtest Results"></a>Backtest Results</h1><h2 id="Platform"><a href="#Platform" class="headerlink" title="Platform"></a>Platform</h2><p><a href="https://quantconnect.com/">QuantConnect</a></p><h2 id="Backtest-Period"><a href="#Backtest-Period" class="headerlink" title="Backtest Period"></a>Backtest Period</h2><p><code>2020-01-03</code> to <code>2024-09-16</code></p><h2 id="Trading-Framework"><a href="#Trading-Framework" class="headerlink" title="Trading Framework"></a>Trading Framework</h2><h3 id="Universe"><a href="#Universe" class="headerlink" title="Universe"></a>Universe</h3><p>The constituents of the SPDR S&amp;P 500 ETF Trust (SPY)</p><h3 id="Strategy-Benchmark"><a href="#Strategy-Benchmark" class="headerlink" title="Strategy Benchmark"></a>Strategy Benchmark</h3><p>SPDR S&amp;P 500 ETF Trust (SPY)</p><h3 id="Trading-Rules"><a href="#Trading-Rules" class="headerlink" title="Trading Rules"></a>Trading Rules</h3><ul><li>First of all, we need to prepare two strategies: one is the momentum trading 
strategy that can capture the momentum when the market is bullish, and the other is the mean reversion trading strategy that can capture the momentum when the market is bearish. I’ve created a weekly rebalancing momentum strategy and a RSI reverse trading strategy to be executed when the market is bearish. Unfortunately, my bullish and bearish trading strategies are highly correlated, which makes it less likely to compensate when the momentum is diminished. Therefore, I’ve decided to use the cash strategy to hold cash only when the market is bearish. You can replace it with investing in other fixed-income securities.</li></ul><img data-src="/2024/09/23/2024-09-30-market-indicator/strategy_correlation.png" class="" width="600"><p style="text-align:center; color: grey;">  <i>Strategy performance correlation heatmap</i></p><ul><li>For each trading strategy, we design the capacity to hold ten stocks maximum.</li><li>All the positions are created or closed right after the market opens.</li><li>Strategy-specific trading rules<ul><li>Momentum trading strategy<ul><li>Calculate the momentum of each stock by averaging the one-month, three-month, six-month, and one-year returns evenly.</li><li>Rebalance the portfolio by holding the top 10 stocks that have the highest momentum scores.</li></ul></li><li>Cash holding strategy<ul><li>Liquidate all the positions and hold cash only.</li></ul></li></ul></li><li>According to the market condition detected by the market indicator, we will switch the trading strategy to momentum trading strategy when bullish, while switching to the cash holding strategy when bearish.</li></ul><p>One thing that I’ve noticed while conducting this backtest and would like to share with you is that there’s a situation that you need to be aware of. You might encounter the scene where you will need to switch to another trading strategy while having positions open in your portfolio. 
To further manage your assets, you could do the following to make sure the trading strategy executes the exact instructions you expected:</p><ol><li>Liquidate all the opening positions before you long the new assets from another trading strategy.</li><li>Hold the current positions and apply the exit rules of the new trading strategy.</li><li>Use the market indicator that is more robust to minimize the number of times you need to switch the trading strategy.</li></ol><p>Let me know if you have any other better ideas to handle this situation.</p><h2 id="Benchmark-performance-of-the-strategy-momentum-only"><a href="#Benchmark-performance-of-the-strategy-momentum-only" class="headerlink" title="Benchmark performance of the strategy (momentum only)"></a>Benchmark performance of the strategy (momentum only)</h2><p>Below is the backtest result of the momentum strategy performance. As you can see, the strategy performance dropped significantly during February 2020 and stayed relatively flat from November 2021 until 2023 September. 
We expect that incorporating the market indicator and switching to the cash holding strategy during these periods will reduce the loss and increase the overall return.</p><img data-src="/2024/09/23/2024-09-30-market-indicator/benchmark_strategy_return.png" class="" width="1000"><p style="text-align:center; color: grey;">  <i>Momentum strategy performance backtest results</i></p><h2 id="Results-attached-with-the-strategy-switching-plot"><a href="#Results-attached-with-the-strategy-switching-plot" class="headerlink" title="Results (attached with the strategy switching plot)"></a>Results (attached with the strategy switching plot)</h2><h3 id="Using-SPY-daily-close-price-and-its-history-data"><a href="#Using-SPY-daily-close-price-and-its-history-data" class="headerlink" title="Using SPY daily close price and its history data"></a>Using SPY daily close price and its history data</h3><img data-src="/2024/09/23/2024-09-30-market-indicator/spy_market_filter.png" class="" width="1000"><p style="text-align:center; color: grey;">  <i>Strategy performance when adopting SPY as a market indicator</i></p><p>The first plot above shows the performance of the strategy. The second plot indicates which strategy was used according to the SPY market indicator. The bottom plot shows the SPY daily close price and its 200-day moving average. As noted earlier, when the daily close price rises above the 200-day moving average, the momentum strategy is used. Otherwise, the cash holding strategy is used.</p><p>Now let’s go back to the first plot. The red line, which marks the clear bear market, conforms fairly well to the grey dotted line, which indicates the perceived bear market. However, once you look closely and compare it to the original benchmark strategy performance, you will find that this strategy missed the comeback of the market from the bottom. 
Even though the SPY market indicator did help us cap the loss, missing the comeback of the market greatly damaged the profitability of the strategy. If the SPY indicator were sensitive enough to detect the market recovery earlier, the strategy could be more profitable.</p><p>The SPY market indicator results in <strong>32 changes</strong> between trading strategies. This high frequency of changes leads to frequent trading, incurring significant transaction costs and very short holding periods. Both of these outcomes are usually not beneficial and could be toxic if you can’t manage them well in the trading strategy. By smoothing the market indicator and reducing the frequency of strategy adjustments, we can minimize friction. This approach will decrease the incidence of short-term, unnecessary trades, thereby lowering transaction costs and enhancing the overall performance of the strategy.</p><h3 id="Average-Directional-Indicator-ADX"><a href="#Average-Directional-Indicator-ADX" class="headerlink" title="Average Directional Indicator (ADX)"></a>Average Directional Indicator (ADX)</h3><p>Let’s recap the trading rules of the strategies using ADX as the market indicator.</p><h4 id="ADX-Trading-Rule-One"><a href="#ADX-Trading-Rule-One" class="headerlink" title="ADX Trading Rule One"></a>ADX Trading Rule One</h4><div class="table-container"><table><thead><tr><th>Regime</th><th>Trend</th><th>Strategy</th></tr></thead><tbody><tr><td>ADX &lt; 15</td><td>Mean reverting prices</td><td>RSI oscillator to take counter-trend signals</td></tr><tr><td>15 &lt; ADX &lt; 25</td><td>Random walk, no trend, no mean reversion in prices</td><td>No trading</td></tr><tr><td>ADX &gt; 25</td><td>Trending prices</td><td>40-day/20-day channel breakout</td></tr></tbody></table></div><p style="text-align:center; color: grey;">  <i>Regime switching rule I</i></p><p>The basic idea is to execute the momentum strategy when the ADX market indicator detects the bull market, execute 
the cash holding strategy when the fluctuating market is detected, and execute the mean reversion strategy to capture the reversion of the market during the bear market. At first hearing, the core method of the trading strategy sounds legit. However, you see the different strategies got switched as many as <strong>73 times</strong> from the second plot in the strategy backtest results below. The idea to introduce the market indicator is to identify not just the current market state but also the beginning of the future market trend. Given the volatile market state predicted, switching the trading strategy too frequently diminished the effectiveness of all trading strategies.</p><img data-src="/2024/09/23/2024-09-30-market-indicator/adx_market_filter_rule_1.png" class="" width="1000"><p style="text-align:center; color: grey;">  <i>Strategy performance when adopting ADX rule 1 as a market indicator</i></p><h4 id="ADX-Trading-Rule-Two"><a href="#ADX-Trading-Rule-Two" class="headerlink" title="ADX Trading Rule Two"></a>ADX Trading Rule Two</h4><div class="table-container"><table><thead><tr><th>Regime</th><th>Trend</th><th>Strategy</th></tr></thead><tbody><tr><td>ADX &lt; 20</td><td>Trend to begin soon</td><td>40-day/20-day channel breakout</td></tr><tr><td>20 &lt; ADX &lt; 30</td><td>Mean reversion in prices</td><td>14 day RSI strategy</td></tr><tr><td>ADX &gt; 30</td><td>Trending prices</td><td>40-day/20-day channel breakout</td></tr></tbody></table></div><p style="text-align:center; color: grey;">  <i>Regime switching rule II</i></p><p>To improve the previous trading strategy, <em>LARS KESTNER</em> in his book <a href="https://www.amazon.com/Quantitative-Trading-Strategies-Harnessing-McGraw-Hill/dp/0071412395/">Quantitative Trading Strategy</a> recommended a different regime switching rule. In this new rule, he proposed treating the bull and bear market in the same condition and leaving the fluctuating market as a separate condition. 
We execute a momentum trading strategy not only during bull but also bear market, while executing the mean reversion strategy during the fluctuating market.</p><img data-src="/2024/09/23/2024-09-30-market-indicator/adx_market_filter_rule_2.png" class="" width="1000"><p style="text-align:center; color: grey;">  <i>Strategy performance when adopting ADX rule 2 as a market indicator</i></p><p>Surprisingly, even though the backtest results were not as satisfactory, there are a few noteworthy points:</p><ul><li>We expected fewer strategy switches. However, the strategy still switched <strong>83 times</strong>, which is more than the previous rule.</li><li>Despite the increased frequency of strategy switches, the actual number of trades was significantly lower than with the previous rule.</li><li>Upon cross-checking the order history and the timing of strategy changes, I discovered that when the strategy switched from a bull to a bear market, the signals from both trading strategies pointed to the same stock symbols. This indicates that <strong><em>the stocks with the greatest momentum in a bull market tend to be the same stocks that experience the greatest dip when the market turns bearish</em></strong>. This could be an interesting finding to explore further.</li></ul><p>Anyway, in general, the ADX indicator seems too volatile to be used as a market indicator and needs more investigation.</p><h3 id="CBOE-VIX-Index"><a href="#CBOE-VIX-Index" class="headerlink" title="CBOE VIX Index"></a>CBOE VIX Index</h3><p>Compared to other market indicators, the CBOE VIX Index behaves somewhat differently. As a measure of the implied volatility of S&amp;P 500 index options, the VIX can spike dramatically in a short period. 
When the market and investors recognize heightened volatility, the VIX tends to decrease swiftly as the negative sentiment has been absorbed by the market and other financial institutions.</p><p>As illustrated in the chart below, the frequent spikes indicate that the VIX primarily measures market sentiment and tends to return to a stable state quickly.</p><img data-src="/2024/09/23/2024-09-30-market-indicator/vix_history_price.png" class="" width="600"><p style="text-align:center; color: grey;">  <i>VIX daily price captured from <a href='https://finance.yahoo.com/quote/%5EVIX/chart/'>Yahoo! Finance</a></i></p><p>The chart below shows that the strategy adopting VIX as market indicator has a very nice performance close to the benchmark trading strategy. The performance of the strategy has nearly no growth from February 2021 to January 2024, meaning this strategy took around three years to recover from the bottom. On the other hand, if you look at this from another perspective, adding the VIX market indicator to the trading strategy has helped reduce the volatility, and also the max drawdown, from 42% to 26%, giving a similar profitability, 20.193% vs 17.137% annually. 
Therefore, it’s quite clear that VIX does help detect the beginning of the bear market and further avoid the significant loss effectively.</p><img data-src="/2024/09/23/2024-09-30-market-indicator/vix_market_filter.png" class="" width="1000"><p style="text-align:center; color: grey;">  <i>Strategy performance while adopting VIX as a market indicator</i></p><div class="table-container"><table><thead><tr><th>Strategy</th><th>Sharpe Ratio</th><th>Total Return</th><th>Annual Return</th><th>Max Drawdown</th><th>Annual Variance</th></tr></thead><tbody><tr><td>Benchmark</td><td>20.012</td><td>137.07%</td><td>20.193%</td><td>41.9%</td><td>0.051</td></tr><tr><td>VIX</td><td><font color ='red'>16.462</font></td><td><font color ='red'>110.81%</font></td><td><font color ='red'>17.137%</font></td><td><font color ='green'>26.00%</font></td><td><font color ='green'>0.042</font></td></tr></tbody></table></div><p style="text-align:center; color: grey;">  <i>Comparison of the performance between VIX market indicator strategy and benchmark strategy</i></p><h3 id="Yield-Curve-Inversion"><a href="#Yield-Curve-Inversion" class="headerlink" title="Yield Curve Inversion"></a>Yield Curve Inversion</h3><p>As known, the inversion of the yield curve is widely regarded as a signal confirming a recession of the macroeconomics. The primary indicators used are usually focusing on the spread between 3-month and 10-year treasury yield. The backtest result is as below:</p><img data-src="/2024/09/23/2024-09-30-market-indicator/yield_curve_3m_10y.png" class="" width="1000"><p style="text-align:center; color: grey;">  <i>Strategy performance using 3-month and 10-year yield curve as indicators</i></p><p>Surprisingly, unlike previous market indicators, the yield curve inversion using the <strong>3-month</strong> and <strong>10-year</strong> treasury yields marks the market into two distinct states. 
The first half of the backtest period shows a bull market until early 2022, while the second half indicates a bear market. However, when comparing the strategy performance plots with the trading strategy used, the stock market trend doesn’t positively correlate with our bear market prediction. We stopped investing during the market turmoil around November 2022, but the market indicator didn’t turn bullish during the subsequent stock market rise. So, to increase the sensitivity of the yield curve market indicator, let’s try using <strong>3-month</strong> and <strong>2-year</strong> treasury yields to mark the market.</p><img data-src="/2024/09/23/2024-09-30-market-indicator/yield_curve_3m_2y.png" class="" width="1000"><p style="text-align:center; color: grey;">  <i>Strategy performance using 3-month and 2-year yield curve as indicators</i></p><p>The movement of both the 3-month and 2-year treasury yields is more pronounced and clearer than in our previous backtest. However, the new backtest doesn’t reveal any advantage of using the more sensitive market indicator. While the strategy’s performance does improve, there is no evidence of improvement in the max drawdown, annual variance, or standard deviation.</p><div class="table-container"><table><thead><tr><th>Strategy</th><th>Sharpe Ratio</th><th>Total Return</th><th>Annual Return</th><th>Max Drawdown</th><th>Annual Variance</th><th>Standard Deviation</th></tr></thead><tbody><tr><td>3-month vs. 10-year</td><td>0.424</td><td>77.30%</td><td>12.899%</td><td>28.40%</td><td>0.039</td><td>0.197</td></tr><tr><td>3-month vs. 
2-year</td><td><font color ='green'>0.473</font></td><td><font color ='green'>89.69%</font></td><td><font color ='green'>14.466%</font></td><td><font color ='red'>30.40%</font></td><td><font color ='red'>0.04</font></td><td><font color ='red'>0.201</font></td></tr></tbody></table></div><p style="text-align:center; color: grey;">  <i>Comparison of the performance between the 3m-10y and 3m-2y yield curve strategies</i></p><h1 id="Conclusion"><a href="#Conclusion" class="headerlink" title="Conclusion"></a>Conclusion</h1><p>With all the backtests and plots above, we can summarize the following conclusions about these four market indicators:</p><ol><li>The SPY market indicator does seem capable of marking bullish and bearish markets. Even though the predicted market state doesn’t overlap 100% with the state we marked visually, the SPY market indicator separates the market into clear bullish and bearish periods.</li><li>The ADX market indicator seems too volatile to distinguish the bullish market from the bearish market. It would need some smoothing of its fluctuations before being applied to any strategy.</li><li>The VIX market indicator doesn’t seem to belong in the same arena as the other market indicators. Given the short-term fluctuations of the CBOE VIX, it could be used as a stop gain/loss signal rather than a market indicator.</li><li>The yield curve inversion market indicator is the most stable market indicator for separating bullish from bearish markets. However, it is slow to reflect a recovery of the stock market.</li></ol><p>I guess finding the ultimate market indicator is something all individual and institutional investors are pursuing. It’s like a hidden holy grail that everyone craves. Understanding these four basic market indicators is just the start of the long journey to the holy grail.</p>]]></content>
    
    
    <summary type="html">&lt;img data-src=&quot;/2024/09/23/2024-09-30-market-indicator/cover.jpeg&quot; class=&quot;&quot; width=&quot;600&quot;&gt;
&lt;p style=&quot;text-align:center; color: grey;&quot;&gt;
  &lt;i&gt;Cover image created through &lt;a href=&#39;https://copilot.microsoft.com/&#39;&gt;Copilot&lt;/a&gt;&lt;/i&gt;
&lt;/p&gt;

&lt;p&gt;Ever wondered how to tell if the market is roaring like a bull or growling like a bear? This article will dive deep into how these indicators perform in different market conditions, comparing manual identification with indicator-based methods. By adding this market indicator, you’ll see how momentum strategies shine in bull markets, while mean reversion and pair trading strategies come into play during bear markets. Plus, we’ll uncover some insights that you might never have considered while applying market indicators in your own algorithm trading script. Now, let’s dive in and uncover the secrets of the market together!&lt;/p&gt;</summary>
    
    
    <category term="Quantitative Trading" scheme="http://mikelhsia.github.io/categories/Quantitative-Trading/"/>
    
    
    <category term="Technical Analysis" scheme="http://mikelhsia.github.io/tags/Technical-Analysis/"/>
    
    <category term="Backtesting" scheme="http://mikelhsia.github.io/tags/Backtesting/"/>
    
    <category term="QuantConnect" scheme="http://mikelhsia.github.io/tags/QuantConnect/"/>
    
  </entry>
  
  <entry>
    <title>【ML algo trading】One Pitfall You Definitely Need to Avoid in Feature Engineering</title>
    <link href="http://mikelhsia.github.io/2024/06/28/2024-06-28-why-fit-and-transform/"/>
    <id>http://mikelhsia.github.io/2024/06/28/2024-06-28-why-fit-and-transform/</id>
    <published>2024-06-28T06:44:36.000Z</published>
    <updated>2024-07-01T06:09:48.845Z</updated>
    
    <content type="html"><![CDATA[<img data-src="/2024/06/28/2024-06-28-why-fit-and-transform/cover.png" class="" width="600"><p style="text-align:center; color: grey;">  <i>Cover image created through <a href='https://copilot.microsoft.com/'>Copilot</a></i></p><p>Feature engineering is a make-or-break step in the machine learning pipeline, but even seasoned practitioners can fall victim to a subtle yet devastating mistake. While labeling or transforming data may seem like a straightforward task, a common pitfall lurks in the shadows, waiting to undermine your model’s performance in ways you might not expect. This often overlooked issue can lead to a phenomenon known as “look-ahead bias,” where your model inadvertently gains access to information it shouldn’t have during training. In this post, we’re going to talk about what this pitfall exactly is and how to address it.</p><a id="more"></a><p><strong><em>Previous researches</em></strong></p><ul><li><a href="https://mikelhsia.github.io/2022/10/21/2022-10-15-meta-label/">【Momentum Trading】Use machine learning to boost your day trading skill - meta-labeling</a></li><li><a href="https://mikelhsia.github.io/2022/03/18/2022-03-22-supertrend-indicator/">【Momentum Trading】Yes or No? Adopting the Supertrend indicator in your trading strategies?</a></li><li><a href="https://mikelhsia.github.io/2022/04/18/2022-04-20-forex-grid-trading-system/">Looking for no-loss trading strategy? Here’s the strategy that you should look at</a></li></ul><hr><h1 id="What-this-pitfall-is-about"><a href="#What-this-pitfall-is-about" class="headerlink" title="What this pitfall is about?"></a>What this pitfall is about?</h1><p>Winsorizing, min-max scaling, standardization scaling, outlier handling, and other transformations are all techniques of feature engineering that are used to improve the quality of the data and then further improve the performance of the machine-learning model. 
Modern coding tools and packages make these techniques easy to implement. However, there is one detail that is easily overlooked while doing the feature engineering above, and it creates the so-called <code>look-ahead bias</code>.</p><p>Let’s consider an example to better understand the concept of look-ahead bias. Suppose we want to create a binary target variable (Y) for our machine learning model based on the historical stock prices. We plan to label each data point as <code>True</code> if the corresponding daily price falls within the top 30% of the entire dataset, and <code>False</code> otherwise. This labeling process is a form of target transformation, which is a common step in supervised learning tasks.</p><img data-src="/2024/06/28/2024-06-28-why-fit-and-transform/labeling_w_all.png" class="" width="400"><p style="font-size: 0.8em; text-align:center; color: grey;">    <i>Labeling each data point according to its percentile</i></p><p>At first glance, this seems like a straightforward approach. However, there is a subtle but crucial detail that we need to be careful of. In the above approach, we are trying to find the data points whose price is in the top 30% of the <strong>entire</strong> dataset, meaning the data points in the testing dataset are also being considered in the labeling process. This could lead to:</p><ol><li>As you can see in the above table, there is only one data point marked as <code>True</code> because most of the data points in the top 30% are in the testing dataset. This would leave the training dataset too imbalanced for our machine-learning model to be properly trained.</li><li>The purpose of the testing dataset is to evaluate the performance of the model on unseen data. However, in this case, the testing dataset is also used to label the data points. 
This can lead to a situation where the model is being trained on data that it has already seen, which can result in overfitting and poor generalization.</li></ol><h1 id="How-do-we-address-it"><a href="#How-do-we-address-it" class="headerlink" title="How do we address it?"></a>How do we address it?</h1><p>To address this issue, we can modify the labeling process to only consider the data points within the training dataset. This would ensure that the testing dataset is not used to label the data points, resulting in a more fair and accurate evaluation of the model’s performance. Let’s have a look at the results below.</p><img data-src="/2024/06/28/2024-06-28-why-fit-and-transform/labeling_w_training.png" class="" width="500"><p style="font-size: 0.8em; text-align:center; color: grey;">    <i>Labeling each data point according to its percentile within the training data period</i></p><p>See? In our second approach, we only consider the data points within the training dataset, and the testing dataset is not used to label the data points. Then we used the pattern learned from the training dataset and labeled the data points in the testing dataset, in which we created a much more balanced training dataset. The results produced are drastically different when looking at the labels of the testing dataset.</p><h1 id="How-do-we-implement-it"><a href="#How-do-we-implement-it" class="headerlink" title="How do we implement it?"></a>How do we implement it?</h1><p>So how are we going to incorporate this into the feature engineering part of our machine learning pipeline? If you have experience with <code>Sklearn</code>, the answer is quite obvious and easy: we separate the entire labeling or transforming step into <code>fit</code> and <code>transform</code>. 
Pretty much every transformer in <code>Sklearn</code> has three methods: <code>fit</code>, <code>transform</code>, and <code>fit_transform</code>.</p><ul><li>The <code>fit</code> method learns the pattern you designed from the training dataset.</li><li>The <code>transform</code> method applies the learned pattern to transform/label the given dataset.</li><li>The <code>fit_transform</code> method learns the pattern from the given dataset and then transforms that same dataset using the just-learned pattern, so it should only be called on the training dataset.</li></ul><p>Let’s have a look at my implementation of a customized labeler class below. This labeler labels the data points in the top N% with your designated value and labels the rest with another value.</p><ol><li><p>First, you import the necessary libraries.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> pandas <span class="keyword">as</span> pd</span><br><span class="line"><span class="keyword">import</span> numpy <span class="keyword">as</span> np</span><br></pre></td></tr></table></figure></li><li><p>Define a <code>DataLabeler</code> class to contain the methods we mentioned above.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="class"><span class="keyword">class</span> <span class="title">DataLabeler</span>:</span></span><br><span class="line">  <span class="function"><span class="keyword">def</span> 
<span class="title">__init__</span>(<span class="params">self</span>):</span></span><br><span class="line">    <span class="keyword">pass</span></span><br><span class="line"></span><br><span class="line">  <span class="function"><span class="keyword">def</span> <span class="title">fit</span>(<span class="params">self</span>):</span></span><br><span class="line">    <span class="keyword">pass</span></span><br><span class="line"></span><br><span class="line">  <span class="function"><span class="keyword">def</span> <span class="title">transform</span>(<span class="params">self</span>):</span></span><br><span class="line">    <span class="keyword">pass</span></span><br><span class="line"></span><br><span class="line">  <span class="function"><span class="keyword">def</span> <span class="title">fit_transform</span>(<span class="params">self</span>):</span></span><br><span class="line">    <span class="keyword">pass</span></span><br></pre></td></tr></table></figure></li><li><p>In the <code>__init__(self)</code> method, we define the column names of both the independent variable and dependent variable. 
Also, we will need variables to memorize the pattern we learned from the given dataset.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="class"><span class="keyword">class</span> <span class="title">DataLabeler</span>:</span></span><br><span class="line">  <span class="function"><span class="keyword">def</span> <span class="title">__init__</span>(<span class="params"></span></span></span><br><span class="line"><span class="function"><span class="params">      self,</span></span></span><br><span class="line"><span class="function"><span class="params">      target_col:str=<span class="string">&#x27;rtn&#x27;</span>,</span></span></span><br><span class="line"><span class="function"><span class="params">      result_col:str=<span class="string">&#x27;rtn_bin&#x27;</span>,</span></span></span><br><span class="line"><span class="function"><span class="params">      conditions:list=[<span class="number">0.7</span>],</span></span></span><br><span class="line"><span class="function"><span class="params">      categories:list=[<span class="number">0</span>, <span class="number">1</span>]</span></span></span><br><span class="line"><span class="function"><span class="params">    </span>):</span></span><br><span class="line">    self.target_col = target_col</span><br><span class="line">    self.result_col = result_col</span><br><span class="line">    self.conditions 
= conditions</span><br><span class="line">    self.categories = categories</span><br><span class="line">    self.upper_bound = <span class="literal">None</span></span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> len(self.categories) != len(self.conditions) + <span class="number">1</span>:</span><br><span class="line">      <span class="comment"># Raise error if your conditions and categories are in the wrong shape</span></span><br><span class="line">      <span class="keyword">raise</span> ValueError(<span class="string">&#x27;The number of categories must be one more than the number of conditions&#x27;</span>)</span><br></pre></td></tr></table></figure></li><li><p>Implement the <code>fit</code> method, using the given data to calculate and memorize the boundary value.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="class"><span class="keyword">class</span> <span class="title">DataLabeler</span>:</span></span><br><span class="line">  <span class="function"><span class="keyword">def</span> <span class="title">fit</span>(<span class="params">self, data</span>):</span></span><br><span class="line">    series = data.loc[:, self.target_col]</span><br><span class="line">    self.upper_bound = series.quantile(self.conditions[<span class="number">0</span>])</span><br></pre></td></tr></table></figure></li><li><p>Implement the <code>transform</code> method, transforming the given data into the proper label based on the trained pattern. 
I won’t repeat this for the <code>fit_transform</code> method, as it is simply a combination of the <code>fit</code> and <code>transform</code> methods.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="class"><span class="keyword">class</span> <span class="title">DataLabeler</span>:</span></span><br><span class="line">  <span class="function"><span class="keyword">def</span> <span class="title">transform</span>(<span class="params">self, data, inplace=False</span>):</span></span><br><span class="line">    <span class="keyword">if</span> getattr(self, <span class="string">&#x27;upper_bound&#x27;</span>, <span class="literal">None</span>) <span class="keyword">is</span> <span class="literal">None</span>:</span><br><span class="line">      <span class="keyword">raise</span> ValueError(<span class="string">&#x27;The labeler has not been fitted yet&#x27;</span>)</span><br><span class="line"></span><br><span class="line">    <span class="comment"># The inplace flag decides whether to work on the original dataframe or on a copy of it</span></span><br><span class="line">    tmp = data.copy() <span class="keyword">if</span> <span class="keyword">not</span> inplace <span class="keyword">else</span> data</span><br><span class="line">    series = tmp.loc[:, self.target_col]</span><br><span class="line"></span><br><span class="line">    cond = [</span><br><span class="line">        (series &lt; self.upper_bound),</span><br><span class="line">        (series &gt;= 
self.upper_bound)</span><br><span class="line">    ]</span><br><span class="line"></span><br><span class="line">    tmp.loc[:, self.result_col] = np.select(cond, self.categories)</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> tmp</span><br></pre></td></tr></table></figure></li></ol><p>Finally, let’s have a look at how we instantiate the labeler class and use it to label the data.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Initialize the parameter needed</span></span><br><span class="line">label_params = &#123;</span><br><span class="line">  <span class="string">&#x27;target_col&#x27;</span>: <span class="string">&#x27;price&#x27;</span>,  <span class="comment"># We evaluate the price column</span></span><br><span class="line">  <span class="string">&#x27;result_col&#x27;</span>: <span class="string">&#x27;label_y&#x27;</span>,  <span class="comment"># We output the results to the new column label_y</span></span><br><span class="line">  <span class="string">&#x27;conditions&#x27;</span>: [<span class="number">0.7</span>],  <span class="comment"># We want to label the top 30% 
percentile</span></span><br><span class="line">  <span class="string">&#x27;categories&#x27;</span>: [<span class="number">0</span>, <span class="number">1</span>]  <span class="comment"># Mark the bottom 70% as 0, and the rest as 1</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment"># Create the simulated data</span></span><br><span class="line">df = pd.DataFrame(np.random.randn(<span class="number">10</span>,<span class="number">1</span>), columns=[<span class="string">&#x27;price&#x27;</span>])</span><br><span class="line"></span><br><span class="line"><span class="comment"># Instantiate the labeler class with init parameters</span></span><br><span class="line">labeler = DataLabeler(**label_params)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Fit the labeler</span></span><br><span class="line">labeler.fit(df)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Check whether it has been fitted</span></span><br><span class="line">print(labeler.upper_bound)</span><br><span class="line"><span class="comment"># &gt;&gt; -0.7071067811865476 (example output; the value changes on every run because the data is random)</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># You can now transform the data</span></span><br><span class="line">new_df = labeler.transform(df)</span><br></pre></td></tr></table></figure><h1 id="Conclusion"><a href="#Conclusion" class="headerlink" title="Conclusion"></a>Conclusion</h1><p>This concept applies not just to labeling but also to other transformations you may want to customize, such as IQR winsorizing and standardization scaling. I will leave the implementation of the <code>fit_transform</code> method as an exercise for you to try. Remember, there is no single right answer or process in the realm of machine learning. 
You need to figure out the best way to solve the problem you are facing instead of using the one-size-fits-all approach.</p>]]></content>
    
    
    <summary type="html">&lt;img data-src=&quot;/2024/06/28/2024-06-28-why-fit-and-transform/cover.png&quot; class=&quot;&quot; width=&quot;600&quot;&gt;
&lt;p style=&quot;text-align:center; color: grey;&quot;&gt;
  &lt;i&gt;Cover image created through &lt;a href=&#39;https://copilot.microsoft.com/&#39;&gt;Copilot&lt;/a&gt;&lt;/i&gt;
&lt;/p&gt;

&lt;p&gt;Feature engineering is a make-or-break step in the machine learning pipeline, but even seasoned practitioners can fall victim to a subtle yet devastating mistake. While labeling or transforming data may seem like a straightforward task, a common pitfall lurks in the shadows, waiting to undermine your model’s performance in ways you might not expect. This often overlooked issue can lead to a phenomenon known as “look-ahead bias,” where your model inadvertently gains access to information it shouldn’t have during training. In this post, we’re going to talk about what this pitfall exactly is and how to address it.&lt;/p&gt;</summary>
    
    
    <category term="Machine Learning" scheme="http://mikelhsia.github.io/categories/Machine-Learning/"/>
    
    
    <category term="Technical Analysis" scheme="http://mikelhsia.github.io/tags/Technical-Analysis/"/>
    
    <category term="Backtesting" scheme="http://mikelhsia.github.io/tags/Backtesting/"/>
    
  </entry>
  
  <entry>
    <title>【How 2】A Productive Way to Manage OAuth 2.0 Tokens</title>
    <link href="http://mikelhsia.github.io/2024/06/24/2024-06-24-test-oauth-via-postman/"/>
    <id>http://mikelhsia.github.io/2024/06/24/2024-06-24-test-oauth-via-postman/</id>
    <published>2024-06-24T06:35:11.000Z</published>
    <updated>2024-06-28T06:25:26.344Z</updated>
    
    <content type="html"><![CDATA[<img data-src="/2024/06/24/2024-06-24-test-oauth-via-postman/cover.png" class="" width="400"><p style="text-align:center; color: grey;">  <i>Cover image created through <a href='https://copilot.microsoft.com/'>Copilot</a></i></p><p>Testing APIs with OAuth 2.0 authentication can be a complex task, but Postman simplifies the process by providing built-in support for various OAuth 2.0 flows. Postman seamlessly complements OAuth 2.0 authentication, allowing developers to easily configure settings, obtain access tokens, and manage token lifecycle. In this post, we will quickly go through this process by utilizing the Postman software.</p><a id="more"></a><hr><h3 id="Previous-readings"><a href="#Previous-readings" class="headerlink" title="Previous readings"></a>Previous readings</h3><ul><li><a href="https://mikelhsia.github.io/2024/04/23/2024-04-24-spin-up-docker-container-for-your-tws/">【How 2】Breaking Free! Use Docker to Create Hands-Off Interactive Broker TWS Managing Experience</a></li><li><a href="https://mikelhsia.github.io/2022/12/12/2022-12-16-IBKR-broker-2/">【How 2】 Set Up Trading API Template In Python - Placing orders with Interactive Brokers</a></li><li><a href="https://mikelhsia.github.io/2022/12/12/2022-12-10-IBKR-Broker/">【How 2】 Set Up Trading API Template In Python - Connecting My Trading Strategies To Interactive Brokers</a></li><li><a href="https://mikelhsia.github.io/2020/10/19/2020-10-19-get-all-tradable-tickers/">【How 2】 Vol. 1. How 2 get all tradable tickers in US markets</a></li><li><a href="https://mikelhsia.github.io/2020/11/10/2020-11-08-macd-strategy-implementation/">【How 2】 Vol. 2. 
How to build an automated stock trading script</a></li></ul><hr><h1 id="Context"><a href="#Context" class="headerlink" title="Context"></a>Context</h1><p>As most self-employed quant traders know, <a href="https://www.investors.com/news/charles-schwab-boosts-online-brokerage-with-td-ameritrade-deal/#:~:text=In%20November%202019%2C%20Charles%20Schwab,deal%20closed%20a%20year%20later.&amp;text=So%2C%20for%20the%20past%20three,well%2Dregarded%20Thinkorswim%20trading%20platform.">Charles Schwab announced its acquisition of TD Ameritrade in November 2019 and closed the deal a year later</a>. Charles Schwab, a major player among brokerage platforms, has recently released a trading API that uses <a href="https://auth0.com/intro-to-iam/what-is-oauth-2">OAuth 2.0</a> to authenticate users. As a quant trader/developer, I found it quite troublesome to probe an API behind OAuth 2.0 authentication because you constantly need to manage the refresh and access tokens. Therefore, I will quickly go through this process using the OAuth 2.0 authorization code grant flow in this post.</p><h1 id="What-is-OAuth"><a href="#What-is-OAuth" class="headerlink" title="What is OAuth"></a>What is OAuth</h1><p>OAuth 2.0 is a popular authorization protocol that allows an end user to grant a third-party application access to their data on a web service without sharing their password. The service provider issues a client ID and a client secret to the third-party application when it is registered. The application uses the client ID to send the end user to the service provider’s login page, and once the user approves the request, the application receives an authorization code. The application then exchanges the authorization code, together with its client ID and client secret, for an access token from the service provider. Finally, the application uses the access token to access the protected resources on the user’s behalf. There are many articles about OAuth 2.0, so I will not go into detail here. 
See the diagram below for a graphical reference, or read <a href="https://auth0.com/intro-to-iam/what-is-oauth-2">this introduction</a> for more detail.</p><img data-src="/2024/06/24/2024-06-24-test-oauth-via-postman/oauth_2_0.png" class="" width="600"><p style="text-align:center; color: grey;">  <i>OAuth 2.0 authentication process</i></p><p>To access the server resources, you first need to make sure your access token has not expired. If it has, you must use your refresh token to request a new access token before you can request the resource you need. The refresh token typically expires after around 7 to 30 days, depending on the OAuth 2.0 service provider, while the access token is much shorter-lived, often around 5 minutes (300 seconds). Keeping track of both tokens by hand is therefore very inconvenient, as you constantly need to validate their state. An automation tool that streamlines this process makes testing the interaction between the API and the client far more efficient.</p><h1 id="What-is-Postman"><a href="#What-is-Postman" class="headerlink" title="What is Postman"></a>What is Postman</h1><img data-src="/2024/06/24/2024-06-24-test-oauth-via-postman/postman_logo.png" class="" width="100"><img data-src="/2024/06/24/2024-06-24-test-oauth-via-postman/postman_interface.png" class="" width="600"><p style="text-align:center; color: grey;">  <i>Postman software user interface</i></p><p>Postman is a productivity tool that helps developers build, test, and develop APIs. To that end, it has many authentication methods built in, and OAuth 2.0 is naturally one of them. 
In this post, we will go through the steps of enabling OAuth 2.0 capability in Postman.</p><h1 id="Enabling-OAuth-2-0-capability-in-Postman"><a href="#Enabling-OAuth-2-0-capability-in-Postman" class="headerlink" title="Enabling OAuth 2.0 capability in Postman"></a>Enabling OAuth 2.0 capability in Postman</h1><h2 id="1-Build-an-API-collection"><a href="#1-Build-an-API-collection" class="headerlink" title="1. Build an API collection"></a>1. Build an API collection</h2><p>Our ultimate goal is to create a single setting that requests and refreshes the token and then applies it to all APIs, instead of managing the token state for every single API. Therefore, we need to create a collection that contains all the APIs we need to test, so that we can apply the same OAuth 2.0 authentication setting to all of them.</p><img data-src="/2024/06/24/2024-06-24-test-oauth-via-postman/postman_collection.png" class="" width="600"><p style="text-align:center; color: grey;">  <i>Postman software user interface - API collection</i></p><h2 id="2-Set-up-Authorization-for-OAuth-2-0-for-every-API-in-the-collection"><a href="#2-Set-up-Authorization-for-OAuth-2-0-for-every-API-in-the-collection" class="headerlink" title="2. Set up Authorization for OAuth 2.0 for every API in the collection"></a>2. 
Set up Authorization for OAuth 2.0 for every API in the collection</h2><p>Next, we need to configure the parameters for our OAuth 2.0 authentication, which will later be applied to all the underlying APIs.</p><img data-src="/2024/06/24/2024-06-24-test-oauth-via-postman/postman_authentication.png" class="" width="600"><p style="text-align:center; color: grey;">  <i>Postman software user interface - Authentication</i></p><p>See below for all the parameters that we need to set up:</p><ul><li><strong>Auth</strong><ul><li>Auth Type =&gt; OAuth 2.0</li><li>Add auth data to =&gt; Request Headers</li></ul></li><li><strong>Configure New Token</strong><ul><li>Token Name =&gt; [Name of the token as you prefer]</li><li>Grant type =&gt; Authorization Code</li><li>Callback URL =&gt; <a href="https://127.0.0.1">https://127.0.0.1</a></li><li>Auth URL =&gt; [Look it up from your service provider]</li><li>Access Token URL =&gt; [Look it up from your service provider]</li><li>Client ID =&gt; [Your App Key]</li><li>Client Secret =&gt; [Your App Secret]</li></ul></li></ul><p>After you have completed the above steps, you should be able to successfully request a new access token. You can find the <code>Get New Access Token</code> button at the bottom of the <code>Authorization</code> tab.</p><img data-src="/2024/06/24/2024-06-24-test-oauth-via-postman/postman_access_token.png" class="" width="600"><p style="text-align:center; color: grey;">  <i>Postman software user interface - Get new access token</i></p><p>Once you click that button, you’ll be redirected to the OAuth login page of your service provider. After you successfully complete the provider’s login process, the refresh token and the access token should be saved in Postman.</p><h2 id="3-Apply-the-access-token-for-all-APIs"><a href="#3-Apply-the-access-token-for-all-APIs" class="headerlink" title="3. Apply the access token for all APIs"></a>3. 
Apply the access token for all APIs</h2><p>Now let’s apply the requested access token to all the APIs under the collection we created. Create a new API under that collection, or pick one that you have already created there. In the configuration window of this API, choose <code>OAuth 2.0</code> as the Auth Type under the <code>Authorization</code> tab and select the token that was just created as the current token.</p><img data-src="/2024/06/24/2024-06-24-test-oauth-via-postman/postman_apply_token.png" class="" width="600"><p style="text-align:center; color: grey;">  <i>Postman software user interface - Apply access token to every API</i></p><p>If every step above is done correctly, you should be able to successfully request the API you have created.</p><h2 id="4-Refresh-access-token-for-all-APIs"><a href="#4-Refresh-access-token-for-all-APIs" class="headerlink" title="4. Refresh access token for all APIs"></a>4. Refresh access token for all APIs</h2><p>As mentioned above, the access token expires quickly, about every 5 minutes in this example. Once it has expired, you’ll see a message displayed below the token string. Postman lets you refresh the access token easily from here.</p><img data-src="/2024/06/24/2024-06-24-test-oauth-via-postman/postman_refresh.png" class="" width="600"><p style="text-align:center; color: grey;">  <i>Postman software user interface - Refresh access token</i></p><h1 id="Conclusion"><a href="#Conclusion" class="headerlink" title="Conclusion"></a>Conclusion</h1><p>In this post, we have gone through the steps of enabling OAuth 2.0 capability in Postman. We have also demonstrated how to apply the access token to all APIs in the collection and how to refresh the access token. Hope this post helps.</p><p>Cheers</p>]]></content>
    
    
    <summary type="html">&lt;img data-src=&quot;/2024/06/24/2024-06-24-test-oauth-via-postman/cover.png&quot; class=&quot;&quot; width=&quot;400&quot;&gt;
&lt;p style=&quot;text-align:center; color: grey;&quot;&gt;
  &lt;i&gt;Cover image created through &lt;a href=&#39;https://copilot.microsoft.com/&#39;&gt;Copilot&lt;/a&gt;&lt;/i&gt;
&lt;/p&gt;

&lt;p&gt;Testing APIs with OAuth 2.0 authentication can be a complex task, but Postman simplifies the process by providing built-in support for various OAuth 2.0 flows. Postman seamlessly complements OAuth 2.0 authentication, allowing developers to easily configure settings, obtain access tokens, and manage token lifecycle. In this post, we will quickly go through this process by utilizing the Postman software.&lt;/p&gt;</summary>
    
    
    <category term="How2" scheme="http://mikelhsia.github.io/categories/How2/"/>
    
    
    <category term="How2" scheme="http://mikelhsia.github.io/tags/How2/"/>
    
  </entry>
  
  <entry>
    <title>【How 2】Breaking Free! Use Docker to Create Hands-Off Interactive Broker TWS Managing Experience</title>
    <link href="http://mikelhsia.github.io/2024/04/23/2024-04-24-spin-up-docker-container-for-your-tws/"/>
    <id>http://mikelhsia.github.io/2024/04/23/2024-04-24-spin-up-docker-container-for-your-tws/</id>
    <published>2024-04-23T04:37:41.000Z</published>
    <updated>2024-04-24T09:52:06.150Z</updated>
    
    <content type="html"><![CDATA[<img data-src="/2024/04/23/2024-04-24-spin-up-docker-container-for-your-tws/cover.png" class="" width="600"><p style="text-align:center; color: grey;">  <i>Cover image created through <a href='https://copilot.microsoft.com/'>Copilot</a></i></p><p>In my previous <strong>How2</strong> column <a href="https://mikelhsia.github.io/2022/12/07/2022-12-10-IBKR-Broker/">Connecting My Trading Strategies To Interactive Brokers</a>, I shared how to set up the Interactive Brokers API connection through the Trader Workstation (TWS) on the local machine. However, the enforced rules like daily auto restart and weekly log-out can be a hassle if you are away from your local machine for many days. In this article, we’ll leverage the power of <a href="https://www.docker.com/">Docker</a> to free you from having to babysit your locally run TWS.</p><a id="more"></a><hr><p><strong><em>Previous readings</em></strong></p><ul><li><a href="https://mikelhsia.github.io/2022/12/12/2022-12-16-IBKR-broker-2/">【How 2】 Set Up Trading API Template In Python - Placing orders with Interactive Brokers</a></li><li><a href="https://mikelhsia.github.io/2022/12/12/2022-12-10-IBKR-Broker/">【How 2】 Set Up Trading API Template In Python - Connecting My Trading Strategies To Interactive Brokers</a></li><li><a href="https://mikelhsia.github.io/2020/10/19/2020-10-19-get-all-tradable-tickers/">【How 2】 Vol. 1. How 2 get all tradable tickers in US markets</a></li><li><a href="https://mikelhsia.github.io/2020/11/10/2020-11-08-macd-strategy-implementation/">【How 2】 Vol. 2. 
How to build an automated stock trading script</a></li></ul><hr><h1 id="Recap-how-to-connect-to-Interactive-Brokers-API-service"><a href="#Recap-how-to-connect-to-Interactive-Brokers-API-service" class="headerlink" title="Recap how to connect to Interactive Brokers API service"></a>Recap how to connect to Interactive Brokers API service</h1><p>Your API calls reach the IBKR API service through Interactive Brokers’ proprietary software, Trader Workstation (TWS). The TWS software runs on the local machine and provides a secure connection to the IBKR API.</p><img data-src="/2024/04/23/2024-04-24-spin-up-docker-container-for-your-tws/tws_connection.png" class="" width="800"><p style="text-align:center; color: grey;">  <i>How your API call reaches the IBKR API service</i></p><h1 id="The-challenge-faced"><a href="#The-challenge-faced" class="headerlink" title="The challenge faced"></a>The challenge faced</h1><p>Even though we have set up the TWS connection on the local machine, there are enforced rules that prevent us from leaving the TWS running for an extended period of time. The TWS software requires a daily auto-restart and a weekly log-out. Plus, logging back into the TWS or IB Gateway requires two-factor authentication through your mobile device, which can be a hassle if you are away from the local machine for multiple days. If you are traveling or on vacation, you may not want to carry your laptop just to keep the TWS connection alive. Also, suppose the TWS application is down when you’re away. 
In that case, you won’t be able to monitor your trading strategies and adjust them whenever there’s an issue with your connection between your trading script and your TWS application running locally.</p><h1 id="Potential-solution"><a href="#Potential-solution" class="headerlink" title="Potential solution"></a>Potential solution</h1><p>Therefore, we need a tool that can monitor the status of the application and automatically restart it if it goes down. This will free us from the burden of manually managing the TWS connection and allow us to focus on our trading strategies. In this context, Docker indeed seems to be the most appropriate technology to help us achieve our goals. <a href="https://hub.docker.com/">Docker Hub</a>, an open platform to store and share your Docker images, provides a wide range of pre-built Docker images that can be used to run the TWS software in a containerized environment. Here we’re going to explore a few potential options and compare the pros and cons of these pre-built Docker images.</p><h1 id="Choices-of-Interactive-Brokers-Docker-Images"><a href="#Choices-of-Interactive-Brokers-Docker-Images" class="headerlink" title="Choices of Interactive Brokers Docker Images"></a>Choices of Interactive Brokers Docker Images</h1><h2 id="1-IBEAM"><a href="#1-IBEAM" class="headerlink" title="1. IBEAM"></a>1. <a href="https://github.com/Voyz/ibeam?tab=readme-ov-file">IBEAM</a></h2><img data-src="/2024/04/23/2024-04-24-spin-up-docker-container-for-your-tws/ibeam_logo.png" class="" width="400"><p style="text-align:center; color: grey;">  <i><a href='https://github.com/Voyz/ibeam'>IBEAM</a></i></p><p>IBEAM is a Docker image that provides a containerized version of the Trader Workstation (TWS) software, exposing the <a href="https://interactivebrokers.github.io/cpwebapi/endpoints">TWS Web API</a> to call remotely. 
It has the following advantages compared to running TWS on your local machine:</p><ol><li>The Gateway runs headless.</li><li>No physical display is required.</li><li>No user interaction, such as login or 2FA, is required.</li></ol><h3 id="How-to-spin-up-an-IBEAM-container"><a href="#How-to-spin-up-an-IBEAM-container" class="headerlink" title="How to spin up an IBEAM container"></a>How to spin up an IBEAM container</h3><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">docker run --env IBEAM_ACCOUNT&#x3D;[tws_account] --env IBEAM_PASSWORD&#x3D;[tws_password] --env IBEAM_PAGE_LOAD_TIMEOUT&#x3D;120 -p 5000:5000 voyz&#x2F;ibeam</span><br></pre></td></tr></table></figure><p>It’s just that simple.</p><blockquote><p>PS: If you’re on a Mac, where port 5000 may already be taken, map the container to port 5001 on your local machine instead (for example, <code>-p 5001:5000</code>).</p></blockquote><h3 id="How-to-test-the-IBEAM-API"><a href="#How-to-test-the-IBEAM-API" class="headerlink" title="How to test the IBEAM API"></a>How to test the IBEAM API</h3><p>The available APIs are documented in the <a href="https://interactivebrokers.github.io/cpwebapi/endpoints">TWS Web API Doc</a>. 
For example:</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">curl -X GET &quot;https:&#x2F;&#x2F;localhost:5001&#x2F;v1&#x2F;api&#x2F;one&#x2F;user&quot; -k</span><br><span class="line"></span><br><span class="line">curl -X GET &quot;https:&#x2F;&#x2F;localhost:5001&#x2F;v1&#x2F;api&#x2F;portfolio&#x2F;accounts&quot; -k</span><br><span class="line"></span><br><span class="line">curl -X GET &quot;https:&#x2F;&#x2F;localhost:5001&#x2F;v1&#x2F;api&#x2F;trsrv&#x2F;stocks?symbols&#x3D;AAPL,WDC&quot; -k</span><br></pre></td></tr></table></figure><p style="text-align:center; color: grey;">  <i>API call examples</i></p><h3 id="Notes"><a href="#Notes" class="headerlink" title="Notes"></a>Notes</h3><p>One thing that needs extra attention is that, since you’re using the TWS Web API services, the APIs are asynchronous. This means that you’ll need to handle the asynchronous responses properly in your own trading scripts instead of relying on an existing library like <code>ib_insync</code>. Also, you’ll need to deal with SSL certificate verification when making API calls. For example, when testing API calls, you’ll need to disable SSL certificate verification (the <code>-k</code> flag in the <code>curl</code> calls above does this), as shown in the following screenshot.</p><img data-src="/2024/04/23/2024-04-24-spin-up-docker-container-for-your-tws/postman_test.png" class="" width="800"><p style="text-align:center; color: grey;">  <i>PS: If you want to test it through Postman, you need to switch off SSL certificate verification</i></p><h2 id="2-ib-gateway-docker"><a href="#2-ib-gateway-docker" class="headerlink" title="2. ib-gateway-docker"></a>2. 
<a href="https://github.com/UnusualAlpha/ib-gateway-docker">ib-gateway-docker</a></h2><img data-src="/2024/04/23/2024-04-24-spin-up-docker-container-for-your-tws/ib_gateway_docker_logo.png" class="" width="400"><p style="text-align:center; color: grey;">  <i><a href='https://github.com/UnusualAlpha/ib-gateway-docker'>ib-gateway-docker</a></i></p><p>The <a href="https://github.com/UnusualAlpha/ib-gateway-docker">ib-gateway-docker</a> image works by using a GUI framework called <a href="https://www.youtube.com/watch?v=zjaqmfP28QE">X11</a> to simulate a virtual desktop environment and run the IB Gateway software inside a docker container. In this docker image, the <a href="https://github.com/IbcAlpha/IBC">IBC application</a> is pre-installed to handle the pre-defined interactions between remote clients and the IB Gateway running inside the container. This docker image is capable of:</p><ul><li>Logging in automatically with the provided username and password</li><li>Keeping the application alive</li><li>Supporting 2FA login</li><li>Restarting automatically each day during the week, without the user having to re-authenticate</li><li>Allowing remote access to the IB Gateway inside the docker container</li></ul><p>The benefit of using <a href="https://github.com/UnusualAlpha/ib-gateway-docker">ib-gateway-docker</a> compared to using <a href="https://github.com/Voyz/ibeam?tab=readme-ov-file">IBEAM</a> is that you can still use an existing async library like <code>ib_insync</code> in your trading script to communicate with IB Gateway. This saves the effort of reinventing the wheel.</p><h2 id="3-ibkr-docker"><a href="#3-ibkr-docker" class="headerlink" title="3. ibkr-docker"></a>3. <a href="https://github.com/extrange/ibkr-docker?tab=readme-ov-file">ibkr-docker</a></h2><p>I consider <a href="https://github.com/extrange/ibkr-docker?tab=readme-ov-file">ibkr-docker</a> to be the advanced version of <a href="https://github.com/UnusualAlpha/ib-gateway-docker">ib-gateway-docker</a>. 
Both docker images use a similar approach to containerize the <a href="https://github.com/IbcAlpha/IBC">IBC application</a>. In addition, <a href="https://github.com/extrange/ibkr-docker?tab=readme-ov-file">ibkr-docker</a> offers many more features and more room to extend and explore. See the table below for the major differences in features:</p><div class="table-container"><table><thead><tr><th>-</th><th>ib-gateway-docker</th><th>ibkr-docker</th></tr></thead><tbody><tr><td>GitHub Maintainer</td><td>UnusualAlpha</td><td>extrange</td></tr><tr><td>Based on</td><td>Alpine Linux</td><td>Alpine Linux</td></tr><tr><td>Purpose</td><td>Focused on running the IB Gateway Application</td><td>Designed for running both TWS and IB Gateway, with additional features like noVNC access</td></tr><tr><td>GUI</td><td>Minimal GUI (primarily for API access)</td><td>Fully featured TWS platform accessible via noVNC</td></tr><tr><td>Components</td><td>Includes IB Gateway, IBC, Xvfb, and optional X11vnc</td><td>Provides a fully containerized TWS/IB Gateway setup</td></tr><tr><td>Use Case</td><td>Ideal for automated trading and API access</td><td>Suitable for both manual trading (via TWS) and automated tasks (via IB Gateway)</td></tr></tbody></table></div><h2 id="4-Others"><a href="#4-Others" class="headerlink" title="4. Others"></a>4. Others</h2><ul><li><a href="https://github.com/ryankennedyio/ib-docker?tab=readme-ov-file">ib-docker</a></li><li><a href="https://github.com/robolyst/ibportal">ibportal</a></li><li>…</li></ul><h1 id="How-to-address-the-challenges-we-faced"><a href="#How-to-address-the-challenges-we-faced" class="headerlink" title="How to address the challenges we faced"></a>How to address the challenges we faced</h1><p>To resolve the challenges we’re facing right now, I chose <a href="https://github.com/extrange/ibkr-docker?tab=readme-ov-file">ibkr-docker</a> over the other options because: 1. 
It provides a fully containerized TWS/IB Gateway setup with more features, 2. The maintainer still actively updates this GitHub repo, and 3. In the workflow that I’m adopting (see <a href="https://mikelhsia.github.io/2022/12/07/2022-12-10-IBKR-Broker/">Connecting My Trading Strategies To Interactive Brokers</a>), I barely need to change anything in my trading script; I only need to spin up a docker container. So, let’s get to it.</p><h2 id="1-Spin-up-a-default-docker-container"><a href="#1-Spin-up-a-default-docker-container" class="headerlink" title="1. Spin up a default docker container"></a>1. Spin up a default docker container</h2><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Download the ibkr-docker image with the proper tag/version</span></span><br><span class="line">docker pull ghcr.io/extrange/ibkr:10.19.2h</span><br><span class="line"></span><br><span class="line"><span class="comment"># Spin up the container</span></span><br><span class="line">docker run -d -p <span class="string">&quot;127.0.0.1:6080:6080&quot;</span> -p <span class="string">&quot;127.0.0.1:8888:8888&quot;</span> \</span><br><span class="line">-e USERNAME=&#123;username&#125; -e PASSWORD=&#123;password&#125; \</span><br><span class="line">ghcr.io/extrange/ibkr:10.19.2h</span><br></pre></td></tr></table></figure><p>This command will spin up a default docker container and expose two external ports: <code>6080</code> is used for accessing the GUI via noVNC, and <code>8888</code> is used for the TWS/IB Gateway API calls.</p><h2 id="2-Ensure-the-software-restarts-properly"><a href="#2-Ensure-the-software-restarts-properly" class="headerlink" title="2. 
Ensure the software restarts properly"></a>2. Ensure the software restarts properly</h2><p>To ensure the TWS/IB Gateway software restarts properly, you need to set <code>AutoRestartTime</code>. It is an <a href="https://github.com/IbcAlpha/IBC">IBC</a> setting that instructs IBC to handle the restart of the TWS/IB Gateway Application. I recommend checking out the <a href="https://github.com/IbcAlpha/IBC/blob/master/resources/config.ini"><code>config.ini</code> file</a>. As stated in the <a href="https://github.com/extrange/ibkr-docker?tab=readme-ov-file#environment-variables">Environment Variables</a> section, the settings inside <code>config.ini</code> can be set by prefixing their names with <code>IBC_</code>. Therefore, the command to spin up the container would be:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">docker run -d -p <span class="string">&quot;127.0.0.1:6080:6080&quot;</span> -p <span class="string">&quot;127.0.0.1:8888:8888&quot;</span> \</span><br><span class="line">-e USERNAME=&#123;username&#125; -e PASSWORD=&#123;password&#125; \</span><br><span class="line">-e IBC_AutoRestartTime=<span class="string">&quot;03:00 AM&quot;</span> \</span><br><span class="line">ghcr.io/extrange/ibkr:10.19.2h</span><br></pre></td></tr></table></figure><h3 id="2-1-Wait…-It-didn’t-restart-as-the-time-I-instructed-to"><a href="#2-1-Wait…-It-didn’t-restart-as-the-time-I-instructed-to" class="headerlink" title="2.1. Wait… It didn’t restart as the time I instructed to."></a>2.1. Wait… It didn’t restart as the time I instructed to.</h3><p>I asked myself why while I was waiting for the application to restart. I was very confused. 
I double-checked the setting in TWS through noVNC, and it was correctly set to the restart time that I wanted. Then something struck me: the time in the top right corner of TWS was still in UTC, not the local timezone that I expected. Therefore, <a href="https://stackoverflow.com/questions/57607381/how-do-i-change-timezone-in-a-docker-container">one more environment variable</a> needs to be added.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">docker run -d -p <span class="string">&quot;127.0.0.1:6080:6080&quot;</span> -p <span class="string">&quot;127.0.0.1:8888:8888&quot;</span> \</span><br><span class="line">-e USERNAME=&#123;username&#125; -e PASSWORD=&#123;password&#125; \</span><br><span class="line">-e IBC_AutoRestartTime=<span class="string">&quot;03:00 AM&quot;</span> \</span><br><span class="line">-e TZ=<span class="string">&quot;US/Eastern&quot;</span> \</span><br><span class="line">ghcr.io/extrange/ibkr:10.19.2h</span><br></pre></td></tr></table></figure><h2 id="3-Tackle-the-2FA-failure"><a href="#3-Tackle-the-2FA-failure" class="headerlink" title="3. Tackle the 2FA failure"></a>3. Tackle the 2FA failure</h2><p>Two more environment variables need to be added to tell the application how to handle a 2FA login failure.</p><p>In some circumstances, even though you acknowledge the alert, login doesn’t complete successfully. 
IBC can deal with this situation automatically by shutting down and restarting if you set <code>TWOFA_TIMEOUT_ACTION=&quot;restart&quot;</code>.</p><p>If you use the IBKR Mobile app for two-factor authentication, and you fail to complete the process before the time limit imposed by IBKR, <code>ReloginAfterSecondFactorAuthenticationTimeout</code> tells IBC whether to automatically restart the login sequence, giving you another opportunity to complete two-factor authentication.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">docker run -d -p <span class="string">&quot;127.0.0.1:6080:6080&quot;</span> -p <span class="string">&quot;127.0.0.1:8888:8888&quot;</span> \</span><br><span class="line">-e USERNAME=&#123;username&#125; -e PASSWORD=&#123;password&#125; \</span><br><span class="line">-e IBC_AutoRestartTime=<span class="string">&quot;03:00 AM&quot;</span> -e IBC_TWOFA_TIMEOUT_ACTION=<span class="string">&#x27;restart&#x27;</span> \</span><br><span class="line">-e IBC_ReloginAfterSecondFactorAuthenticationTimeout=<span class="string">&#x27;yes&#x27;</span> \</span><br><span class="line">-e TZ=<span class="string">&quot;US/Eastern&quot;</span> \</span><br><span class="line">ghcr.io/extrange/ibkr:10.19.2h</span><br></pre></td></tr></table></figure><h2 id="4-Handle-the-weekly-restart"><a href="#4-Handle-the-weekly-restart" class="headerlink" title="4. Handle the weekly restart"></a>4. Handle the weekly restart</h2><p>This may be the most crucial step to free us from the predicament of periodically checking our desktop/laptop to ensure the TWS/IB Gateway software is running. 
There are two ways of doing this.</p><h3 id="4-1-Using-the-existing-docker-command-to-restart-the-container-every-week"><a href="#4-1-Using-the-existing-docker-command-to-restart-the-container-every-week" class="headerlink" title="4.1. Using the existing docker command to restart the container every week"></a>4.1. Using the existing docker command to restart the container every week</h3><p>We can use the <code>docker restart</code> command to restart the container. To do this, we need to:</p><ol><li>Name the docker container using the <code>--name</code> parameter when running the initial <code>docker run</code> command.</li><li>Use the <code>docker restart [docker_name]</code> command to restart the container and complete the subsequent 2FA in time.</li><li>Set up a cron job on your laptop to run the second step every week at a specific time.</li></ol><p>The above solution seems plausible, but it still requires an additional tool to schedule the restart task. To make this process fully automated, I prefer to use the tools that we’re already using. Also, I noticed a few things while experimenting with various parameters of <a href="https://github.com/extrange/ibkr-docker?tab=readme-ov-file">ibkr-docker</a>. Assigning the environment variable <code>ClosedownAt</code> not only shuts down the TWS/IB Gateway Application, but also terminates the docker container. Meanwhile, docker is able to restart a container automatically when it shuts down unexpectedly. 
With these two observations, I’ve formulated the following approach:</p><ol><li>Use the <code>ClosedownAt</code> parameter to force the TWS application and the docker container to shut down.</li><li>Use the docker parameter <code>--restart unless-stopped</code> to instruct docker to restart the container once it is no longer up.</li></ol><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">docker run -d -p <span class="string">&quot;127.0.0.1:6080:6080&quot;</span> -p <span class="string">&quot;127.0.0.1:8888:8888&quot;</span> \</span><br><span class="line">-e USERNAME=&#123;username&#125; -e PASSWORD=&#123;password&#125; \</span><br><span class="line">-e IBC_AutoRestartTime=<span class="string">&quot;03:17 AM&quot;</span> -e IBC_TWOFA_TIMEOUT_ACTION=<span class="string">&#x27;restart&#x27;</span> \</span><br><span class="line">-e IBC_ReloginAfterSecondFactorAuthenticationTimeout=<span class="string">&#x27;yes&#x27;</span> -e IBC_ClosedownAt=<span class="string">&#x27;Monday 03:00&#x27;</span> \</span><br><span class="line">-e TZ=<span class="string">&quot;US/Eastern&quot;</span> \</span><br><span class="line">--restart unless-stopped \</span><br><span class="line">ghcr.io/extrange/ibkr:10.19.2h</span><br></pre></td></tr></table></figure><p>With the command above, <code>ClosedownAt</code> will shut down the container. Then the <code>--restart</code> policy will notice that the container has shut down and start it again.</p><p>Voilà! We tackled all the challenges listed! 
Now we can go on vacation without worrying about the TWS application stopping or the login session expiring.</p><h1 id="Conclusion"><a href="#Conclusion" class="headerlink" title="Conclusion"></a>Conclusion</h1><p>In this article, we talked about the different challenges encountered when running IBKR TWS/IB Gateway in a local environment. Then we introduced a few popular docker images available in the Docker hub and explained their pros and cons. Finally, we showed that with the proper configuration and automation tools, the container will handle restarting itself and re-establishing the login session automatically every week.</p><p>See you next time.</p><blockquote><p>If you enjoy reading this and my other articles, come check out my <a href='https://medium.com/@mikelhsia'>Medium page</a> to read more about Quantitative Trading Strategy.</p></blockquote><h1 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h1><ul><li><a href="https://github.com/Voyz/ibeam?tab=readme-ov-file">IBEAM github</a></li><li><a href="https://github.com/ryankennedyio/ib-docker?tab=readme-ov-file">ib-docker github</a></li><li><a href="https://github.com/extrange/ibkr-docker?tab=readme-ov-file">ibkr-docker github</a></li><li><a href="https://github.com/UnusualAlpha/ib-gateway-docker">ib-gateway-docker github</a></li><li><a href="https://github.com/robolyst/ibportal">ibportal github</a></li><li><a href="https://github.com/IbcAlpha/IBC">IbcAlpha/IBC github</a></li><li><a href="https://github.com/IbcAlpha/IBC/blob/master/resources/config.ini">IbcAlpha/IBC config.ini file</a></li></ul>]]></content>
    
    
    <summary type="html">&lt;img data-src=&quot;/2024/04/23/2024-04-24-spin-up-docker-container-for-your-tws/cover.png&quot; class=&quot;&quot; width=&quot;600&quot;&gt;
&lt;p style=&quot;text-align:center; color: grey;&quot;&gt;
  &lt;i&gt;Cover image created through &lt;a href=&#39;https://copilot.microsoft.com/&#39;&gt;Copilot&lt;/a&gt;&lt;/i&gt;
&lt;/p&gt;

&lt;p&gt;In my previous &lt;strong&gt;How2&lt;/strong&gt; column &lt;a href=&quot;https://mikelhsia.github.io/2022/12/07/2022-12-10-IBKR-Broker/&quot;&gt;Connecting My Trading Strategies To Interactive Brokers&lt;/a&gt;, I shared how to set up the Interactive Brokers API connection through the Trader Workstation (TWS) on the local machine. However, the enforced rules like daily auto restart and weekly log-out can be a hassle if you are away from your local machine for many days. In this article, we’ll leverage the power of &lt;a href=&quot;https://www.docker.com/&quot;&gt;docker&lt;/a&gt; to free you from managing your locally-run TWS attentively.&lt;/p&gt;</summary>
    
    
    <category term="Quantitative Trading" scheme="http://mikelhsia.github.io/categories/Quantitative-Trading/"/>
    
    <category term="How2" scheme="http://mikelhsia.github.io/categories/Quantitative-Trading/How2/"/>
    
    
    <category term="Docker" scheme="http://mikelhsia.github.io/tags/Docker/"/>
    
    <category term="How2" scheme="http://mikelhsia.github.io/tags/How2/"/>
    
    <category term="Python3" scheme="http://mikelhsia.github.io/tags/Python3/"/>
    
    <category term="Interactive Broker" scheme="http://mikelhsia.github.io/tags/Interactive-Broker/"/>
    
  </entry>
  
  <entry>
    <title>Testing the Waters - Backtesting HAA and Its Variations Towards Success</title>
    <link href="http://mikelhsia.github.io/2024/04/09/2024-03-25-hybrid-asset-allocation/"/>
    <id>http://mikelhsia.github.io/2024/04/09/2024-03-25-hybrid-asset-allocation/</id>
    <published>2024-04-09T06:01:23.000Z</published>
    <updated>2024-04-12T04:08:29.933Z</updated>
    
    <content type="html"><![CDATA[<img data-src="/2024/04/09/2024-03-25-hybrid-asset-allocation/cover.png" class="" width="800"><p style="text-align:center; color: grey;">  <i>Cover image created through <a href='https://copilot.microsoft.com/'>Copilot</a></i></p><p>As everyone knows, diversification is the key to managing the risk of your investment. However, it takes a lot of effort and time to hand-pick quality assets that have little correlation with each other. In the <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4346906">Hybrid Asset Allocation trading strategy</a>, <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4346906">Wouter J. Keller and Jan Willem Keuning</a> proposed an easier way to introduce diversification into the trading strategy. In this article, we’re going to first talk about the HAA strategy and see how it boosts the diversification of our portfolio. Then, we’re also going to backtest the HAA (Hybrid Asset Allocation) trading strategy and several of its variations against the benchmark.</p><a id="more"></a><hr><p>Become a <a href="https://medium.com/@mikelhsia/membership">Medium member</a> to support me in writing more interesting articles. 
I’ll receive a small portion of your membership fee if you use the following link, at no extra cost to you.</p><blockquote><p>If you enjoy reading this and my other articles, feel free to <a href='https://medium.com/@mikelhsia/membership'>join the Medium membership program</a> to read more about Quantitative Trading Strategy.</p></blockquote><hr><p><strong><em>Previous research</em></strong></p><ul><li><a href="https://mikelhsia.github.io/2024/01/18/2024-01-18-Revisit-Buy-On-Gao-Strategy/">From Theory to Profits - Elevating the Buy-on-Gap Strategy with Advanced Techniques</a></li><li><a href="https://mikelhsia.github.io/2021/07/19/2021-07-20-advanced-macd-strategy/">Optimize your MACD strategies with advanced indicators</a></li><li><a href="https://mikelhsia.github.io/2023/04/26/2023-05-01-pair-trading-cointegration-part2/">【Pair Trading】 Complete Guide to Backtest Cointegration Pair Trading Strategy</a></li><li><a href="https://mikelhsia.github.io/2021/05/10/2021-05-14-machine-learning-prototype/">【ML algo trading】 II - How to build a machine learning boilerplate?</a></li></ul><hr><h1 id="What-is-Hybrid-Asset-Allocation-HAA"><a href="#What-is-Hybrid-Asset-Allocation-HAA" class="headerlink" title="What is Hybrid Asset Allocation (HAA)"></a>What is Hybrid Asset Allocation (HAA)</h1><p>Hybrid Asset Allocation (HAA) was first introduced in <u><a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4346906">Dual and Canary Momentum with Rising Yields/Inflation: Hybrid Asset Allocation (HAA)</a></u>, published on <a href="https://www.ssrn.com/index.cfm/en/">SSRN</a> in Feb 2023. The intention was to develop a much simpler strategy for retail investors. This strategy is balanced and rather aggressive compared to the authors’ previous trading strategies. It uses <code>US Treasury Inflation-Protected Securities (TIPS, ETF: TIP)</code> as the <strong>canary asset</strong> to determine whether the market possesses positive momentum. 
Once the canary asset shows positive momentum, we invest in the <strong>offensive assets</strong> to profit. Otherwise, we invest in the <strong>defensive asset</strong> to protect against further loss.</p><p>This strategy not only uses a single <strong>canary asset</strong> as the market momentum indicator but also adopts the traditional <strong>dual momentum</strong> method to decide which of the offensive assets we should invest in. The simplicity of combining these two methods greatly reduces the possibility of unintentional overfitting, leaving much room for the intelligent retail investor to compose and customize an asset universe that could be more aggressive or more risk-averse.</p><h1 id="How-HAA-adds-diversification-to-our-portfolio"><a href="#How-HAA-adds-diversification-to-our-portfolio" class="headerlink" title="How HAA adds diversification to our portfolio"></a>How HAA adds diversification to our portfolio</h1><p>Even though the canary asset is the crucial part of the strategy that helps identify the current momentum trend of the market, the actual power of risk-aversion resides inside the assets picked: the ETFs in the offensive assets. These eight offensive assets, <code>US large caps (represented by SPY)</code>, <code>US small caps (IWM)</code>, <code>developed international stocks (EFA)</code>, <code>emerging market stocks (EEM)</code>, <code>US real estate (VNQ)</code>, <code>commodities (PDBC)</code>, <code>intermediate-term US Treasuries (IEF)</code> and <code>long-term US Treasuries (TLT)</code>, span stocks, real estate, commodities, and bonds across the US, developed, and emerging markets. 
This diverse universe is what adds diversification to our portfolio.</p><h1 id="Backtesting-HAA-and-its-variations"><a href="#Backtesting-HAA-and-its-variations" class="headerlink" title="Backtesting HAA and its variations"></a>Backtesting HAA and its variations</h1><p>Now let us have a look at our backtesting setup.</p><h2 id="Platform"><a href="#Platform" class="headerlink" title="Platform"></a>Platform</h2><p>I’m using <a href="https://quantconnect.com/">QuantConnect</a> to backtest this strategy.</p><h2 id="Backtest-period"><a href="#Backtest-period" class="headerlink" title="Backtest period"></a>Backtest period</h2><p>I’m conducting two series of backtests. One starts from 2016-01-03 to observe the strategy’s performance over as long a period as possible. The other starts from 2020-01-03 to inspect recent behavior and compare it with the first series.</p><h2 id="Trading-Framework"><a href="#Trading-Framework" class="headerlink" title="Trading Framework"></a>Trading Framework</h2><h3 id="Universe"><a href="#Universe" class="headerlink" title="Universe"></a>Universe</h3><ol><li><strong>Canary asset</strong>: US Treasury Inflation-Protected Securities (TIPS, ETF: TIP)</li><li><strong>Offensive assets</strong>: US large caps (represented by SPY), US small caps (IWM), developed international stocks (EFA), emerging market stocks (EEM), US real estate (VNQ), commodities (PDBC), intermediate-term US Treasuries (IEF) and long-term US Treasuries (TLT)</li><li><strong>Defensive asset</strong>: intermediate-term US Treasuries (IEF)</li></ol><h3 id="Strategy-Benchmark"><a href="#Strategy-Benchmark" class="headerlink" title="Strategy Benchmark"></a>Strategy Benchmark</h3><p>In the article <a href="https://allocatesmartly.com/hybrid-asset-allocation/">Hybrid Asset Allocation</a>, the 60/40 benchmark consists of 60% SPY and 40% IEF. 
To look at the strategy from a more aggressive angle, I simply use the SPY buy-and-hold strategy as the benchmark, which also helps identify the stock market situation.</p><h3 id="Trading-Rules"><a href="#Trading-Rules" class="headerlink" title="Trading Rules"></a>Trading Rules</h3><ol><li>Momentum is defined as the unweighted average of the one-month, three-month, six-month, and twelve-month returns (in %).</li><li>All the positions are created and closed right after the market opens.</li><li>If the canary asset shows positive momentum, we pick the four ETFs with the highest momentum from the offensive assets and invest in them equally.</li><li>Otherwise, we allocate our capital to the defensive asset.</li><li>We rebalance our portfolio every month.</li></ol><script type="math/tex; mode=display">\text{Momentum} = \frac{\text{1-month return} + \text{3-month return} + \text{6-month return} + \text{12-month return}}{4}</script><p style="text-align:center; color: grey;">  <i>Formula to calculate the momentum</i></p><h3 id="Variation-Strategies"><a href="#Variation-Strategies" class="headerlink" title="Variation Strategies"></a>Variation Strategies</h3><p>There are two switches that I added to this HAA strategy. The first is whether to disable the canary asset indicator that we use to detect the market momentum. Once we switch this off, we simply purchase the four ETFs that have the highest momentum from the offensive assets regardless of the market momentum, turning this strategy into a pure ETF momentum strategy. Second, I added a filter to rule out those ETFs whose momentum ranks in the top four but is actually negative. 
Below are the backtest results.</p><h2 id="Results"><a href="#Results" class="headerlink" title="Results"></a>Results</h2><div class="table-container"><table><thead><tr><th>Year start</th><th>2016~</th><th>2020~</th></tr></thead><tbody><tr><td>HAA basic</td><td><img data-src="/2024/04/09/2024-03-25-hybrid-asset-allocation/haa_basic_16.png" class="" width="800"></td><td><img data-src="/2024/04/09/2024-03-25-hybrid-asset-allocation/haa_basic_20.png" class="" width="800"></td></tr><tr><td>HAA ignore canary</td><td><img data-src="/2024/04/09/2024-03-25-hybrid-asset-allocation/haa_ignore_tip_16.png" class="" width="800"></td><td><img data-src="/2024/04/09/2024-03-25-hybrid-asset-allocation/haa_ignore_tip_20.png" class="" width="800"></td></tr><tr><td>HAA ignore negative</td><td><img data-src="/2024/04/09/2024-03-25-hybrid-asset-allocation/haa_ignore_negative_16.png" class="" width="800"></td><td><img data-src="/2024/04/09/2024-03-25-hybrid-asset-allocation/haa_ignore_negative_20.png" class="" width="800"></td></tr><tr><td>HAA ignore both</td><td><img data-src="/2024/04/09/2024-03-25-hybrid-asset-allocation/haa_ignore_both_16.png" class="" width="800"></td><td><img data-src="/2024/04/09/2024-03-25-hybrid-asset-allocation/haa_ignore_both_20.png" class="" width="800"></td></tr></tbody></table></div><p style="text-align:center; color: grey;">  <i>Backtesting HAA and its variations against the SPY buy-and-hold strategy</i></p><p>If you look at the backtest results carefully, you’ll notice that neither the HAA strategy nor any of its variations exceeds the performance of the benchmark SPY buy-and-hold strategy. However, the HAA strategy helps us avoid losses during market setbacks and grow the investment steadily. Take the basic HAA strategy backtest results since 2020 for example. 
Compared to the HAA ignore canary scenario, our canary asset seems to be able to identify market turmoil, so that our investment shifts to the defensive asset to avoid further loss.</p><img data-src="/2024/04/09/2024-03-25-hybrid-asset-allocation/haa_basic_20_compare.png" class="" width="800"><p style="text-align:center; color: grey;">  <i>Basic HAA vs. HAA ignore canary asset since 2020</i></p><p>Besides, if you compare the basic HAA strategy with the HAA ignore negative strategy, you will see that the HAA ignore negative strategy indeed slightly outperforms the HAA basic strategy for a certain period.</p><img data-src="/2024/04/09/2024-03-25-hybrid-asset-allocation/haa_negative_20_compare.png" class="" width="800"><p style="text-align:center; color: grey;">  <i>Basic HAA vs. HAA ignore negative strategy since 2020</i></p><div class="table-container"><table><thead><tr><th>2020</th><th>Benchmark (SPY buy-and-hold)</th><th>HAA basic</th><th>HAA ignore canary</th><th>HAA ignore negative</th><th>HAA ignore both</th></tr></thead><tbody><tr><td>Annualized Return</td><td><strong>13.553%</strong></td><td>9.892%</td><td>8.889%</td><td>11.554%</td><td>10.120%</td></tr><tr><td>Annualized Volatility</td><td><strong>0.138</strong></td><td>0.109</td><td>0.121</td><td><font color="green">0.108</font></td><td>0.133</td></tr><tr><td>Sharpe Ratio</td><td><strong>0.48</strong></td><td>0.473</td><td>0.382</td><td><font color="green">0.578</font></td><td>0.422</td></tr><tr><td>Sortino Ratio</td><td><strong>0.491</strong></td><td>0.523</td><td>0.423</td><td><font color="green">0.699</font></td><td>0.451</td></tr><tr><td>Max DD</td><td><strong>33.600%</strong></td><td>18.900%</td><td>19.400%</td><td><font color="green">14.000%</font></td><td>26.200%</td></tr><tr><td>Number of orders</td><td><strong>1</strong></td><td>143</td><td>210</td><td>131</td><td>163</td></tr></tbody></table></div><p style="text-align:center; color: grey;">  <i>Statistics for HAA and its variations against 
SPY buy-and-hold strategy (since 2020)</i></p><div class="table-container"><table><thead><tr><th>2016</th><th>Benchmark (SPY buy-and-hold)</th><th>HAA basic</th><th>HAA ignore canary</th><th>HAA ignore negative</th><th>HAA ignore both</th></tr></thead><tbody><tr><td>Annualized Return</td><td><strong>14.128%</strong></td><td>10.014%</td><td>8.615%</td><td>10.184%</td><td>8.693%</td></tr><tr><td>Annualized Volatility</td><td><strong>0.151</strong></td><td>0.091</td><td>0.103</td><td>0.091</td><td>0.111</td></tr><tr><td>Sharpe Ratio</td><td><strong>0.579</strong></td><td>0.568</td><td>0.423</td><td><font color="green">0.581</font></td><td>0.406</td></tr><tr><td>Sortino Ratio</td><td><strong>0.573</strong></td><td>0.597</td><td>0.447</td><td><font color="green">0.653</font></td><td>0.423</td></tr><tr><td>Max DD</td><td><strong>33.7%</strong></td><td>18.900%</td><td>19.400%</td><td>14.000%</td><td>26.200%</td></tr><tr><td>Number of orders</td><td><strong>1</strong></td><td>310</td><td>374</td><td>295</td><td>324</td></tr></tbody></table></div><p style="text-align:center; color: grey;">  <i>Statistics for HAA and its variations against the SPY buy-and-hold strategy (since 2016)</i></p><p>Based on the stats above, the HAA strategy and its variations did reduce the annualized volatility and drastically decrease the maximum drawdown. Yet, the annualized return also shrank along with the volatility. Combining the charts and tables above, the Sharpe ratio and Sortino ratio both improved after we adopted the HAA ignore negative scenario, which removes the ETFs that have negative momentum from our target ETFs. This would be an interesting case to explore further.</p><h1 id="Conclusion"><a href="#Conclusion" class="headerlink" title="Conclusion"></a>Conclusion</h1><p>HAA seems to be a strategy that tries to exploit the advantage of diversification. 
In the meantime, adding the ETF element to the universe would further elevate the quality level of the universe as the stocks constituted by the ETF are deliberately chosen. As the next step, maybe we can try to find a better diversified ETF to represent the other industries and other aspects of the entire market, improving further the portfolio performance in the future.</p><hr><h1 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h1><ul><li><a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4346906">Dual and Canary Momentum with Rising Yields/Inflation: Hybrid Asset Allocation (HAA)</a></li><li><a href="https://allocatesmartly.com/hybrid-asset-allocation/">Hybrid Asset Allocation</a></li><li><a href="https://allocatesmartly.com/dr-keller-keunings-simple-variation-of-hybrid-asset-allocation/">Dr. Keller &amp; Keuning’s Simple Variation of “Hybrid Asset Allocation”</a></li><li><a href="https://nlxfinance.wordpress.com/2024/03/07/the-haa-strategy-revisited/">The HAA strategy revisited</a></li></ul>]]></content>
    
    
    <summary type="html">&lt;img data-src=&quot;/2024/04/09/2024-03-25-hybrid-asset-allocation/cover.png&quot; class=&quot;&quot; width=&quot;800&quot;&gt;
&lt;p style=&quot;text-align:center; color: grey;&quot;&gt;
  &lt;i&gt;Cover image created through &lt;a href=&#39;https://copilot.microsoft.com/&#39;&gt;Copilot&lt;/a&gt;&lt;/i&gt;
&lt;/p&gt;

&lt;p&gt;As is known to all, diversification is the key to managing the risk of your investment. However, it’ll take a lot of effort and time to hand-pick quality assets that have little correlation with each other. In the &lt;a href=&quot;https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4346906&quot;&gt;Hybrid Asset Allocation trading strategy&lt;/a&gt;, &lt;a href=&quot;https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4346906&quot;&gt;Wouter J. Keller and Jan Willem Keuning&lt;/a&gt; actually brought up an easier way to introduce diversification into the trading strategy. In this article, we’re going to first talk about HAA strategy and see how it boosts the diversification of our portfolio. Then, we’re also going to backtest the HAA (Hybrid Asset Allocation) trading strategy and several of its variations against the benchmark.&lt;/p&gt;</summary>
    
    
    <category term="Quantitative Trading" scheme="http://mikelhsia.github.io/categories/Quantitative-Trading/"/>
    
    
    <category term="Research" scheme="http://mikelhsia.github.io/tags/Research/"/>
    
    <category term="Backtesting" scheme="http://mikelhsia.github.io/tags/Backtesting/"/>
    
    <category term="Python3" scheme="http://mikelhsia.github.io/tags/Python3/"/>
    
  </entry>
  
  <entry>
    <title>From Theory to Profits - Elevating the Buy-on-Gap Strategy with Advanced Techniques</title>
    <link href="http://mikelhsia.github.io/2024/01/18/2024-01-18-Revisit-Buy-On-Gao-Strategy/"/>
    <id>http://mikelhsia.github.io/2024/01/18/2024-01-18-Revisit-Buy-On-Gao-Strategy/</id>
    <published>2024-01-18T04:05:56.000Z</published>
    <updated>2024-01-31T17:05:47.881Z</updated>
    
    <content type="html"><![CDATA[<img data-src="/2024/01/18/2024-01-18-Revisit-Buy-On-Gao-Strategy/cover.jpeg" class="" width="800"><p style="text-align:center; color: grey;">  <i>Cover image created by <a href='https://copilot.microsoft.com/'>Copilot</a></i></p><p>The best ideas are often inspired by or adapted from the work of others. Likewise, profitable quantitative trading strategies are not necessarily original, but they can be generated by adding personal insights regarding the market or strategy itself. In this post, I’m going to walk through the process that I usually follow when exploring a promising trading strategy.</p><a id="more"></a><hr><p>Become a <a href="https://medium.com/@mikelhsia/membership">Medium member</a> to support me in writing more interesting articles. I’ll receive a small portion of your membership fee if you use the following link, at no extra cost to you.</p><blockquote><p>If you enjoy reading this and my other articles, feel free to <a href='https://medium.com/@mikelhsia/membership'>join the Medium membership program</a> to read more about Quantitative Trading Strategy.</p></blockquote><hr><p><strong><em>Previous research</em></strong></p><ul><li><a href="https://mikelhsia.github.io/2021/07/19/2021-07-20-advanced-macd-strategy/">Optimize your MACD strategies with advanced indicators</a></li><li><a href="https://mikelhsia.github.io/2023/04/26/2023-05-01-pair-trading-cointegration-part2/">【Pair Trading】 Complete Guide to Backtest Cointegration Pair Trading Strategy</a></li><li><a href="https://mikelhsia.github.io/2021/05/10/2021-05-14-machine-learning-prototype/">【ML algo trading】 II - How to build a machine learning boilerplate?</a></li></ul><hr><h1 id="Buy-on-Gap-trading-strategy"><a href="#Buy-on-Gap-trading-strategy" class="headerlink" title="Buy-on-Gap trading strategy"></a>Buy-on-Gap trading strategy</h1><h2 id="Strategy-origin"><a href="#Strategy-origin" class="headerlink" title="Strategy origin"></a>Strategy 
origin</h2><p>First of all, we need to find a promising trading strategy, whether from word of mouth, forum posts, or books. In this post, I’m going to use the <code>buy-on-gap</code> strategy that I found in <em>Ernest P. Chan</em>’s book <strong><em>“Algorithmic Trading - Winning Strategies and Their Rationale”</em></strong>.</p><h2 id="Introduction-of-the-benchmark-trading-strategy"><a href="#Introduction-of-the-benchmark-trading-strategy" class="headerlink" title="Introduction of the benchmark trading strategy"></a>Introduction of the benchmark trading strategy</h2><p>The <strong>buy-on-gap trading strategy</strong> is a popular trading technique that involves identifying price gaps and making trades based on the anticipated price action following the gap. Price gaps typically occur at the market open or after unexpected fundamental and technical events. These events usually trigger panic selling and cause a disproportionate drop in price. Since price gaps tend to get filled after the panic selling is over, professional traders use these blank areas to find trading opportunities.</p><img data-src="/2024/01/18/2024-01-18-Revisit-Buy-On-Gao-Strategy/panic_selling.jpeg" class="" width="600"><p style="text-align:center; color: grey;">  <i>Panic selling created by <a href='https://copilot.microsoft.com/'>Copilot</a></i></p><h2 id="Platform"><a href="#Platform" class="headerlink" title="Platform"></a>Platform</h2><p>I’m using <a href="https://quantconnect.com/">QuantConnect</a> to backtest this strategy.</p><h2 id="Backtest-period"><a href="#Backtest-period" class="headerlink" title="Backtest period"></a>Backtest period</h2><p>From 2020-12-27 to the time of writing.</p><h2 id="Trading-Frameworks-and-Trading-Rules"><a href="#Trading-Frameworks-and-Trading-Rules" class="headerlink" title="Trading Frameworks and Trading Rules"></a>Trading Frameworks and Trading Rules</h2><p><a 
href="https://www.quantconnect.com/docs/v2/writing-algorithms/algorithm-framework/overview">QuantConnect</a> separates the quantitative trading script into five different parts. The <code>Universe model</code> selects the equities, bonds, futures, options, or ETFs that you would like to trade. The <code>Alpha model</code> monitors the status of each asset and emits trading signals accordingly. The <code>Portfolio Construction model</code> specifies the weight of each selected asset in your portfolio. The <code>Risk Management model</code> monitors the custom risk level of each asset and helps dispose of a holding when a certain risk management threshold has been hit. The fifth part, the <code>Execution model</code>, places the orders needed to reach the portfolio targets. Because the parts are independent, you can develop your own model and mix and match it with other models with minimum work.</p><p>Now, let’s have a look at the models we need to build for this <code>buy-on-gap</code> trading strategy:</p><h3 id="Universe-model"><a href="#Universe-model" class="headerlink" title="Universe model"></a>Universe model</h3><p>In the book <strong><em>“Algorithmic Trading - Winning Strategies and Their Rationale”</em></strong>, Ernest P. Chan didn’t specify any preference regarding the universe selection. Therefore, I simply pick the constituent stocks of the ETF <code>SPY</code> for the benchmark scenario. The <code>SPY (SPDR S&amp;P 500 ETF Trust)</code> is an exchange-traded fund that tracks the performance of the S&amp;P 500 index, which is a basket of the largest publicly traded companies in the United States. 
This saves us the effort of tracking and examining the fundamentals of each company, and gives us a universe of about 500 stocks.</p><script src="https://gist.github.com/mikelhsia/687e60c1b74b054784b25cccf99d7f73.js"></script><h3 id="Alpha-model"><a href="#Alpha-model" class="headerlink" title="Alpha model"></a>Alpha model</h3><p>First of all, we need to set up our trading rules for the Alpha Model to monitor the market data and generate trading signals. A trading signal essentially indicates that an asset we monitor in the <code>Universe Model</code> has passed the predefined trading rules. Therefore, we can consider these assets as the constituents of our portfolio. Here are the rules for generating trading signals:</p><ol><li>Select all stocks near the market open whose returns from their previous day’s lows to today’s opens are lower than one standard deviation. The standard deviation is computed using the daily close-to-close returns of the last 90 days. These are the stocks that ‘gapped down’.</li><li>Narrow down this list of stocks by requiring their open prices to be higher than the 20-day moving average of the closing prices.</li><li>Buy the 10 stocks within this list that have the lowest returns from their previous day’s lows. If the list has fewer than 10 stocks, buy the entire list.</li><li>Liquidate all positions at the market close.</li></ol><p>For the benchmark scenario, I intentionally skip rule No. 2 to get a full picture of this buy-on-gap trading strategy. 
Therefore, we can build our Alpha Model based on the above trading rules:</p><script src="https://gist.github.com/mikelhsia/a6604b2c6816ed7e7f53fbb762a98bc1.js"></script><h3 id="Portfolio-Construction-model"><a href="#Portfolio-Construction-model" class="headerlink" title="Portfolio Construction model"></a>Portfolio Construction model</h3><p>After the trading signals are generated, we must determine the percentage of these assets to be included in our portfolio. According to trading rule No. 3 above, we will allocate the capital evenly to each asset candidate. If we have fewer than 10 signals generated in one day, then we still allocate all the capital to these assets evenly.</p><script src="https://gist.github.com/mikelhsia/1e4d2481b392d670a7609b676c28dc47.js"></script><h3 id="Risk-Management-Model"><a href="#Risk-Management-Model" class="headerlink" title="Risk Management Model"></a>Risk Management Model</h3><p>The Risk Management Model, as the name suggests, manages the risk level of each individual asset. This is also where to implement the stop loss and stop gain mechanism. Even though this mechanism was not mentioned anywhere in the trading rules from the book, I believe it can be one of the variations of our trading scenarios. In this simple setup, we define the stop gain rate to be 2% and the stop loss rate to be 1%, meaning we will sell a holding once its daily return rises above 2% or falls below -1%.</p><script src="https://gist.github.com/mikelhsia/0524fab92d6e643e0d790b85261f4e0a.js"></script><h3 id="Main-Script"><a href="#Main-Script" class="headerlink" title="Main Script"></a>Main Script</h3><p>Lastly, we’re going to use <code>main.py</code> to include all the sub-models that we created so far. 
Note that we don’t yet include the Risk Model in the backtest scenario of our trading algorithm.</p><script src="https://gist.github.com/mikelhsia/1610c573218291c31afbf0924fc82e7f.js"></script><h3 id="Plain-Vanilla-Scenario-Result"><a href="#Plain-Vanilla-Scenario-Result" class="headerlink" title="Plain Vanilla Scenario Result"></a>Plain Vanilla Scenario Result</h3><div class="table-container"><table><thead><tr><th>Strategy</th><th>Total Return</th><th>Annualized Return</th><th>Total Trades</th><th>Win Rate</th><th>Sharpe Ratio</th><th>Drawdown</th><th>Annual Variance</th><th>Expectancy</th></tr></thead><tbody><tr><td>Plain Vanilla</td><td>-83.389%</td><td>-57.726%</td><td>3366</td><td>47%</td><td>-0.826</td><td>85.300%</td><td>0.254</td><td>-0.128</td></tr></tbody></table></div><img data-src="/2024/01/18/2024-01-18-Revisit-Buy-On-Gao-Strategy/buy_on_gap_plain_vanilla.png" class="" width="800"><p style="text-align:center; color: grey;">  <i>Backtest results of plain vanilla buy-on-gap strategy</i></p><p>Sadly, the course of true love never did run smooth. The backtest result of our plain vanilla <code>buy-on-gap</code> trading strategy is not as good as suggested in the book, and it even cost us most of our initial capital. The author of this <a href="https://medium.com/@financialnoob/buy-on-gap-strategy-and-its-performance-over-time-2a474a25cf2e">article</a> offers an explanation of the trend: this strategy seems to have been ineffective since 2008 and has become worse since 2018. That also gives us the incentive to turn this around.</p><hr><h1 id="Now-the-Show-Begins"><a href="#Now-the-Show-Begins" class="headerlink" title="Now the Show Begins"></a>Now the Show Begins</h1><p>As said, we’re not going to stop right here. 
Now let’s try to find out some other hidden patterns from the different aspects of fundamental methodologies.</p><h2 id="Enhance-the-strength-of-the-signal-Add-sma-20-to-confirm-the-trend"><a href="#Enhance-the-strength-of-the-signal-Add-sma-20-to-confirm-the-trend" class="headerlink" title="Enhance the strength of the signal (Add sma-20 to confirm the trend)"></a>Enhance the strength of the signal (Add sma-20 to confirm the trend)</h2><p>As the first variation, let’s add back the trading rule No.2 <code>Narrow down this list of stocks by requiring their open prices to be higher than the 20-day simple moving average of the closing prices</code> to see how it impacts the buy-on-gap trading strategy.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="string">&quot;&quot;&quot;In CustomDiversifiedAlphaModel.py&quot;&quot;&quot;</span></span><br><span class="line"><span class="keyword">if</span> gap_rtn &lt; (mean - <span class="number">1</span> * std) <span class="keyword">and</span> td_open &gt; sma_20:</span><br><span class="line"><span class="comment"># if gap_rtn &lt; (mean - 1 * std):</span></span><br><span class="line">  insights_dict[key] = gap_rtn</span><br></pre></td></tr></table></figure><div class="table-container"><table><thead><tr><th>Strategy</th><th>Total Return</th><th>Annualized Return</th><th>Total Trades</th><th>Win Rate</th><th>Sharpe Ratio</th><th>Drawdown</th><th>Annual Variance</th><th>Expectancy</th></tr></thead><tbody><tr><td>Plain Vanilla with sma20</td><td>-5.35%</td><td>-2.604%</td><td>604</td><td>46%</td><td>-0.249</td><td>21.4%</td><td>0.027</td><td>-0.007</td></tr></tbody></table></div><img data-src="/2024/01/18/2024-01-18-Revisit-Buy-On-Gao-Strategy/buy_on_gap_sma20.png" class="" width="800"><p style="text-align:center; color: 
grey;">  <i>Backtest results of plain vanilla buy-on-gap strategy plus sma20 of the closing price</i></p><p>Even though we have not yet turned this trading strategy into a profitable one, the loss of our portfolio has been significantly reduced: the total return improves from -83.389% to -5.35%. However, the first thing you’re going to notice in the report is the number of total trades, which has dropped from 3,366 to 604 within our trading period. Trading rule No. 2 therefore served the purpose of filtering out poisonous trades, but it failed to increase the win rate (46% versus 47%).</p><p>Beyond the insights above, keep in mind that adding more and more rules to further confirm the trend risks overfitting your strategy to past market data. Therefore, I would not recommend applying too many additional rules to your trading strategy.</p><h2 id="Weighting-method-All-in-or-equal-weighted"><a href="#Weighting-method-All-in-or-equal-weighted" class="headerlink" title="Weighting method: All-in or equal-weighted"></a>Weighting method: All-in or equal-weighted</h2><p>As Ernest P. Chan said in the book, this trading strategy has a relatively small capacity compared to others, and only a couple of signals or no signals are generated each day. This also means that all of our capital is likely to be allocated to one individual stock when only one signal is generated on a particular day. If that turns out to be a bad trade, we’ll suffer a devastating loss. 
To reduce the risk and the potential damage caused by this worst-case scenario, we can apply the equal-weighted method to diversify this risk.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="string">&quot;&quot;&quot;In CustomPortfolioConstructionModel.py&quot;&quot;&quot;</span></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">DetermineTargetPercent</span>(<span class="params">self, activeInsights</span>):</span></span><br><span class="line">  <span class="string">&quot;&quot;&quot;...&quot;&quot;&quot;</span></span><br><span class="line"></span><br><span class="line">  <span class="comment"># Update portfolio state</span></span><br><span class="line">  <span class="keyword">for</span> _, insight <span class="keyword">in</span> enumerate(activeInsights):</span><br><span class="line">    <span class="comment"># targets[insight] = insight.Direction * (1 / len(activeInsights))</span></span><br><span class="line"></span><br><span class="line">    <span class="comment"># Where we define the self.capacity as 10 in the def __init__(self) function.</span></span><br><span class="line">    targets[insight] = insight.Direction * (<span class="number">1</span> / self.capacity)</span><br><span class="line"></span><br><span class="line">  <span class="string">&quot;&quot;&quot;...&quot;&quot;&quot;</span></span><br></pre></td></tr></table></figure><div class="table-container"><table><thead><tr><th>Strategy</th><th>Total Return</th><th>Annualized Return</th><th>Total 
Trades</th><th>Win Rate</th><th>Sharpe Ratio</th><th>Drawdown</th><th>Annual Variance</th><th>Expectancy</th></tr></thead><tbody><tr><td>Equal-weighted</td><td>-38.91%</td><td>-21.048%</td><td>3364</td><td>47%</td><td>-1.189</td><td>45.50%</td><td>0.022</td><td>-0.101</td></tr><tr><td>Equal-weighted with sma20</td><td>0.535%</td><td>0.256%</td><td>604</td><td>46%</td><td>-0.967</td><td>5.80%</td><td>0.001</td><td>0.012</td></tr></tbody></table></div><img data-src="/2024/01/18/2024-01-18-Revisit-Buy-On-Gao-Strategy/buy_on_gap_equalweight.png" class="" width="800"><p style="text-align:center; color: grey;">  <i>Backtest results of buy-on-gap strategy applying the equal-weighted method</i></p><img data-src="/2024/01/18/2024-01-18-Revisit-Buy-On-Gao-Strategy/buy_on_gap_equalweight_sma20.png" class="" width="800"><p style="text-align:center; color: grey;">  <i>Plus additional 20-day Simple Moving Average</i></p><p>The results of applying the equal-weighted method are phenomenal! The total return improves from -83.389% to -38.91%, and it even turns slightly positive (0.535%) once the sma20 rule is added. This tells us that with the right tools, we can still turn a dreadful situation around. Of course, given the overall performance (the Sharpe ratio is still negative) and a win rate still below 50%, this variation can’t yet qualify as a good quantitative trading strategy.</p><h2 id="Mean-Reversion-V-S-Momentum"><a href="#Mean-Reversion-V-S-Momentum" class="headerlink" title="Mean-Reversion V.S. Momentum"></a>Mean-Reversion V.S. Momentum</h2><p>Mean-reversion and momentum strategies are essentially two sides of the same coin. When the price of a specific stock reaches a certain point (either upward or downward), believers in the mean-reversion theory expect the price to return to equilibrium. Followers of market momentum, on the other hand, are convinced that the price will ride the trend and continue to either rise or fall once that point is passed. 
Since the buy-on-gap trading strategy is a strategy that expects the stock price to revert to its normal level after the previous overnight trading anomaly, we can also hold the contrary point of view saying that the trend will continue after the anomaly.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="string">&quot;&quot;&quot;In CustomDiversifiedAlphaModel.py&quot;&quot;&quot;</span></span><br><span class="line"><span class="keyword">if</span> gap_rtn &lt; (mean + <span class="number">1</span> * std):</span><br><span class="line"><span class="comment"># if gap_rtn &lt; (mean - 1 * std):</span></span><br><span class="line">  insights_dict[key] = gap_rtn</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> len(insights_dict) &gt; <span class="number">0</span>:</span><br><span class="line">  insights_symbols = [k <span class="keyword">for</span> k, v <span class="keyword">in</span> sorted(</span><br><span class="line">    insights_dict.items(),</span><br><span class="line">    key=<span class="keyword">lambda</span> x: x[<span class="number">1</span>],</span><br><span class="line">    reverse=<span class="literal">True</span> <span class="comment"># Most positive return got picked</span></span><br><span class="line">    <span class="comment"># reverse=False # Most negative return got picked</span></span><br><span class="line">  )][:self.capacity]</span><br></pre></td></tr></table></figure><div 
class="table-container"><table><thead><tr><th>Strategy</th><th>Total Return</th><th>Annualized Return</th><th>Total Trades</th><th>Win Rate</th><th>Sharpe Ratio</th><th>Drawdown</th><th>Annual Variance</th><th>Expectancy</th></tr></thead><tbody><tr><td>Momentum</td><td>-87.263%</td><td>-63.248%</td><td>10001</td><td>43%</td><td>-2.26</td><td>87.3%</td><td>0.053</td><td>-0.189</td></tr><tr><td>Momentum with sma20</td><td>-78.422%</td><td>-52.704%</td><td>9747</td><td>42%</td><td>-2.192</td><td>78.40%</td><td>0.037</td><td>-0.164</td></tr></tbody></table></div><img data-src="/2024/01/18/2024-01-18-Revisit-Buy-On-Gao-Strategy/buy_on_gap_momentum.png" class="" width="800"><p style="text-align:center; color: grey;">  <i>Backtest results of buy-on-gap strategy as momentum trading strategy</i></p><img data-src="/2024/01/18/2024-01-18-Revisit-Buy-On-Gao-Strategy/buy_on_gap_momentum_sma20.png" class="" width="800"><p style="text-align:center; color: grey;">  <i>Backtest results of momentum buy-on-gap strategy plus sma20 of closing price</i></p><p>Ok, our point of view does not hold. It turns out that there is neither mean-reversion nor momentum DNA left in the buy-on-gap trading strategy. However, let us take a step back and look at this chart from a different angle. Since this downward performance seems to be an inevitable outcome, why don’t we change our trading direction to turn this situation “upside-down”?</p><h2 id="Long-v-s-Short"><a href="#Long-v-s-Short" class="headerlink" title="Long v.s. Short"></a>Long v.s. Short</h2><p>The idea is relatively simple. Originally, we went long a stock at \$10.00 and expected it to rise after the anomaly of the previous night. Unfortunately, things went south and we had to sell it at \$9.50 before the market closed. Accumulating these constantly losing trades cost us the capital of our portfolio. 
On the contrary, if we short a stock at \$10.00 and close it at \$9.5, wouldn’t the same scenario help us accumulate our fortune?</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="string">&quot;&quot;&quot;In CustomDiversifiedAlphaModel.py&quot;&quot;&quot;</span></span><br><span class="line"><span class="keyword">for</span> symbol <span class="keyword">in</span> insights_symbols:</span><br><span class="line">    <span class="comment"># direction = InsightDirection.Up</span></span><br><span class="line">    direction = InsightDirection.Down</span><br><span class="line">    insight = Insight.Price(</span><br><span class="line">      symbol,</span><br><span class="line">      datetime.timedelta(days=<span class="number">1</span>),</span><br><span class="line">      direction</span><br><span class="line">    )</span><br><span class="line">    insights.append(insight)</span><br><span class="line">    self.insightCollection.Add(insight)</span><br></pre></td></tr></table></figure><div class="table-container"><table><thead><tr><th>Strategy</th><th>Total Return</th><th>Annualized Return</th><th>Total Trades</th><th>Win Rate</th><th>Sharpe Ratio</th><th>Drawdown</th><th>Annual Variance</th><th>Expectancy</th></tr></thead><tbody><tr><td>Short trades</td><td>-77.69%</td><td>-51.181%</td><td>3369</td><td>44%</td><td>-0.59</td><td>83.10%</td><td>0.262</td><td>-0.110</td></tr></tbody></table></div><img data-src="/2024/01/18/2024-01-18-Revisit-Buy-On-Gao-Strategy/buy_on_gap_short.png" class="" width="800"><p style="text-align:center; color: grey;">  
<i>Backtest results when turning all the long trades into short trades</i></p><p>The first half of the backtest was quite inspiring, as it seemed to prove my point, … until the long red bar popped up in my face… I browsed through my transaction history and found that the spoilsport was <a href="https://www.cnbc.com/2023/05/01/first-republic-bank-failure.html">FRC (First Republic Bank)</a>. If we were able to use sentiment analysis as suggested in <a href="http://mikelhsia.github.io/2023/11/02/2023-11-03-sentiment-analysis/">Sentiment Analysis lesson 101 and hands-on practice session</a>, we might be able to filter out this poisonous stock and avoid a significant loss in our portfolio.</p><h2 id="Stop-loss-and-stop-gain"><a href="#Stop-loss-and-stop-gain" class="headerlink" title="Stop loss and stop gain"></a>Stop loss and stop gain</h2><p>The rationale for adopting stop loss and stop gain in the buy-on-gap trading strategy is that, even if the price eventually reverts to its level before the anomaly as we believe, the momentum of that reversion will dwindle at some point, and the price may start to fall again before the market close. 
Therefore, adding the stop gain and stop loss will help us harvest the crop before the upward momentum is depleted.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="string">&quot;&quot;&quot;In main.py&quot;&quot;&quot;</span></span><br><span class="line"><span class="comment"># Here we add a risk management model</span></span><br><span class="line">self.AddRiskManagement(</span><br><span class="line">  StopLossNStopGain(</span><br><span class="line">    stop_gain=<span class="number">0.02</span>,</span><br><span class="line">    stop_loss=<span class="number">0.04</span></span><br><span class="line">  )</span><br><span class="line">)</span><br></pre></td></tr></table></figure><div class="table-container"><table><thead><tr><th>Strategy</th><th>Total Return</th><th>Annualized Return</th><th>Total Trades</th><th>Win Rate</th><th>Sharpe Ratio</th><th>Drawdown</th><th>Annual Variance</th><th>Expectancy</th></tr></thead><tbody><tr><td>Short trades</td><td>-85.27%</td><td>-59.970%</td><td>3367</td><td>29%</td><td>-3.695</td><td>85.40%</td><td>0.018</td><td>-0.328</td></tr></tbody></table></div><img data-src="/2024/01/18/2024-01-18-Revisit-Buy-On-Gao-Strategy/buy_on_gap_stop.png" class="" width="800"><p style="text-align:center; color: grey;">  <i>Backtest results when applying stop gain and stop loss</i></p><p>There are a few things that I’ve found after applying the stop gain and stop loss:</p><ul><li>After adding the stop loss and stop gain, a lot of stocks were sold right after our long positions were opened.</li><li>The win rate dropped considerably from ~46% to 29%.<br>The above two clues give us a 
preliminary idea that the upward price momentum arrives some time after the market opens rather than right at the open. That suggests another idea, studying the pattern of the stock price movement after the overnight anomaly, which I’m not going to develop further here.</li></ul><h2 id="Others-variations"><a href="#Others-variations" class="headerlink" title="Other variations"></a>Other variations</h2><p>There are some more variations suggested in the book <em>Algorithmic Trading - Winning Strategies and Their Rationale</em> by <a href="https://epchan.blogspot.com/">E.P. Chan</a>, plus some ideas we discovered while conducting the backtests above:</p><ul><li>Turning it into a market-neutral long-short strategy</li><li>Turning the strategy into a sector-neutral trading strategy so that it is less concentrated in the same industry</li><li>Adopting machine learning techniques to better identify the trading signal (Reference: <a href="https://mikelhsia.github.io/2022/10/21/2022-10-15-meta-label/">【Momentum Trading】Use machine learning to boost your day trading skill - meta-labeling</a>)</li><li>Adopting alternative data to develop brand-new insights (Reference: <a href="https://mikelhsia.github.io/2023/11/02/2023-11-03-sentiment-analysis/">Sentiment analysis lesson 101 and hands-on practice session</a>)</li></ul><h1 id="Conclusion"><a href="#Conclusion" class="headerlink" title="Conclusion"></a>Conclusion</h1><p>In this post, we have explored different aspects and angles to approach this classic trading strategy, trying to turn it back into a profitable one. Even though the backtests we conducted didn’t present a positive enough Sharpe Ratio to put this trading strategy forward, we did develop a few interesting insights that are worth looking into further. In the end, let’s keep in mind that there are some limitations mentioned in E.P. 
Chan’s book:</p><ul><li>Drops caused by negative news are less likely to revert.</li><li>This strategy can succeed in a news-heavy environment where traditional intraday stock-pair trading will likely fail.</li><li>This strategy might not have a large capacity</li><li>A pitfall of it is using consolidated prices v.s. using primary exchange prices.</li></ul><p>Feel free to let me know how you like this subject. Cheers,</p><h1 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h1><ul><li><em>Algorithmic Trading - Winning Strategies and Their Rationale</em> by <a href="https://epchan.blogspot.com/">E.P. Chan</a></li><li><a href="https://www.quantconnect.com/docs/v2/writing-algorithms">QuantConnect Official API Document</a></li><li><a href="https://medium.com/@financialnoob/buy-on-gap-strategy-and-its-performance-over-time-2a474a25cf2e">Buy on gap strategy and its performance over time</a></li></ul>]]></content>
    
    
    <summary type="html">&lt;img data-src=&quot;/2024/01/18/2024-01-18-Revisit-Buy-On-Gao-Strategy/cover.jpeg&quot; class=&quot;&quot; width=&quot;800&quot;&gt;
&lt;p style=&quot;text-align:center; color: grey;&quot;&gt;
  &lt;i&gt;Cover image created by &lt;a href=&#39;https://copilot.microsoft.com/&#39;&gt;Copilot&lt;/a&gt;&lt;/i&gt;
&lt;/p&gt;

&lt;p&gt;The best ideas are often inspired by or adapted from the work of others. Likewise, profitable quantitative trading strategies are not necessarily original, but they can be generated by adding personal insights regarding the market or the strategy itself. In this post, I’m going to introduce the process I usually follow when looking for a promising trading strategy.&lt;/p&gt;</summary>
    
    
    <category term="Quantitative Trading" scheme="http://mikelhsia.github.io/categories/Quantitative-Trading/"/>
    
    <category term="How2" scheme="http://mikelhsia.github.io/categories/Quantitative-Trading/How2/"/>
    
    
    <category term="Research" scheme="http://mikelhsia.github.io/tags/Research/"/>
    
    <category term="Backtesting" scheme="http://mikelhsia.github.io/tags/Backtesting/"/>
    
    <category term="Python3" scheme="http://mikelhsia.github.io/tags/Python3/"/>
    
  </entry>
  
  <entry>
    <title>Sentiment analysis lesson 101 and hands-on practice session</title>
    <link href="http://mikelhsia.github.io/2023/11/02/2023-11-03-sentiment-analysis/"/>
    <id>http://mikelhsia.github.io/2023/11/02/2023-11-03-sentiment-analysis/</id>
    <published>2023-11-02T03:24:37.000Z</published>
    <updated>2023-11-10T03:52:50.138Z</updated>
    
    <content type="html"><![CDATA[<img data-src="/2023/11/02/2023-11-03-sentiment-analysis/cover.png" class="" width="800"><p style="text-align:center; color: grey;">  <i>Photo by <a href='https://www.shutterstock.com/'>Shutterstock</a></i></p><p>In this article, we will discuss how sentiment analysis impacts the financial market, cover the basics of NLP (<em>Natural Language Processing</em>), and showcase how to process financial headlines in batches to generate an indicator of market sentiment.</p><a id="more"></a><hr><p>Become a <a href="https://medium.com/@mikelhsia/membership">Medium member</a> to continue learning without limits. I’ll receive a small portion of your membership fee if you use the following link, at no extra cost to you.</p><blockquote><p>If you enjoy reading this and my other articles, feel free to <a href='https://medium.com/@mikelhsia/membership'>join the Medium membership program</a> to read more about quantitative trading strategies.</p></blockquote><hr><h1 id="What-is-Sentiment-Analysis"><a href="#What-is-Sentiment-Analysis" class="headerlink" title="What is Sentiment Analysis"></a>What is Sentiment Analysis</h1><p>Imagine this,</p><blockquote><p>You are a top-notch trader on Wall Street. One morning, you were reading the newspaper while sipping a cup of Americano from your favorite mug. You were enjoying the beautiful sunlight falling on you. Suddenly, one piece of news grabbed your attention. The news seemed to be talking about the newly released product and financial forecast of a company. After reading the whole piece, the pessimistic tone throughout the article started worrying you. 
You stroked your chin and started contemplating, “Maybe I should dump the shares that I purchased yesterday”…</p></blockquote><img data-src="/2023/11/02/2023-11-03-sentiment-analysis/contemplating.png" class="" width="600"><p style="text-align:center; color: grey;">  <i>Contemplation photo by <a href='https://medium.com/r/?url=https%3A%2F%2Funsplash.com%2F%40dariusbashar%3Futm_source%3Dmedium%26utm_medium%3Dreferral'>Darius Bashar</a> on <a href='https://medium.com/r/?url=https%3A%2F%2Funsplash.com%3Futm_source%3Dmedium%26utm_medium%3Dreferral'>Unsplash</a></i></p><p>This is a perfect example of sentiment analysis. When you receive a piece of information, you analyze it based not only on the facts it contains, but also on the sentiment you get from the words and punctuation in each sentence. Sentiment analysis is essentially the process of analyzing digital text to determine whether the emotional implication of the message is positive, negative, or neutral. The sentiment you extract from the text can help you further improve the accuracy of your decision-making process.</p><h1 id="What-is-the-application-of-Sentiment-Analysis-in-the-financial-market"><a href="#What-is-the-application-of-Sentiment-Analysis-in-the-financial-market" class="headerlink" title="What is the application of Sentiment Analysis in the financial market"></a>What is the application of Sentiment Analysis in the financial market</h1><p>Investor emotions largely drive the financial market, and they are usually influenced by the news released by companies or reporters. As technology has evolved, we have entered an era of information explosion in which text-based information needs to be processed by machines rather than by people. 
Therefore, many companies and organizations already use machines to process company press releases, annual financial reports, or even forum comments to build up a clear idea of where public opinion is heading. To enable machines to do that, a lot of linguistic techniques need to be applied. Thankfully, we already have plenty of mature technologies and theories to choose from. All these tools, techniques, and theories now fall under the umbrella of <strong>“NLP” (Natural Language Processing)</strong>.</p><h1 id="NLP-Introduction"><a href="#NLP-Introduction" class="headerlink" title="NLP Introduction"></a>NLP Introduction</h1><p>NLP is an interdisciplinary field of computer science and linguistics, and the scholars in this field are dedicated to summarizing the languages we use into linguistic rules and then teaching computers to understand and even speak those languages. Currently, there are already AI products built to conduct conversations with humans, such as ChatGPT from OpenAI, Bard from Google, and Claude from Anthropic. These are all state-of-the-art AI products that users can apply to their daily lives. However, we won’t be touching any of these in this article. Instead, we’re going back to the basics, using <code>NLTK (Natural Language Toolkit)</code> to showcase how we can transform a sentence into a number-based sentiment score to help us be better informed than other retail investors.</p><p>As said, the goal is to convert our language into numbers that computers can understand. This is the so-called <strong>vectorization of the given text</strong>. Once the text has been vectorized into a series of numbers, those numbers can be treated as features and fed to a machine-learning model. What follows are the steps we are used to, such as feature engineering, model training, and prediction. 
Before vectorizing the text, there are several steps to go through, as the image below demonstrates:<br><img data-src="/2023/11/02/2023-11-03-sentiment-analysis/NLP_process.png" class="" width="400"></p><p style="text-align:center; color: grey;">  <i>NLP processes to vectorize text</i></p><h2 id="Tokenization"><a href="#Tokenization" class="headerlink" title="Tokenization"></a>Tokenization</h2><img data-src="/2023/11/02/2023-11-03-sentiment-analysis/tokenization.png" class="" width="800"><p style="text-align:center; color: grey;">  <i>NLP processes: Tokenization</i></p><p>Tokenization, as the name suggests, breaks the sentence into words and standardizes these words into tokens that can be treated uniformly, with the following steps:</p><p><strong>Split the document/sentence word by word</strong></p><p>This is the very first step in processing text-based document input.<br><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> nltk</span><br><span class="line"></span><br><span class="line"><span class="comment"># Download the Punkt tokenizer models that sent_tokenize and word_tokenize rely on</span></span><br><span class="line">nltk.download(<span class="string">&#x27;punkt&#x27;</span>)</span><br><span class="line"></span><br><span class="line">corporas = <span class="string">&quot;AMD’s Q3 earnings report exceeded Wall Street&#x27;s expectations. 
\</span></span><br><span class="line"><span class="string">Its growth indicates the PC market has finally bottomed out. ......&quot;</span></span><br><span class="line"></span><br><span class="line">print(nltk.sent_tokenize(corporas))</span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span>[<span class="string">&quot;AMD’s Q3 earnings report exceeded Wall Street&#x27;s expectations.&quot;</span>,</span><br><span class="line"> <span class="string">&#x27;Its growth indicates the PC market has finally bottomed out.&#x27;</span>,</span><br><span class="line"> <span class="string">&#x27;......&#x27;</span>]</span><br><span class="line"></span><br><span class="line">print(nltk.word_tokenize(corporas))</span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span>[<span class="string">&#x27;AMD&#x27;</span>, <span class="string">&#x27;’&#x27;</span>, <span class="string">&#x27;s&#x27;</span>, <span class="string">&#x27;Q3&#x27;</span>, <span class="string">&#x27;earnings&#x27;</span>, <span class="string">&#x27;report&#x27;</span>, <span class="string">&#x27;exceeded&#x27;</span>, <span class="string">&#x27;Wall&#x27;</span>, <span class="string">&#x27;Street&#x27;</span>, <span class="string">&quot;&#x27;s&quot;</span>, <span class="string">&#x27;expectations&#x27;</span>, <span class="string">&#x27;.&#x27;</span>, <span class="string">&#x27;Its&#x27;</span>, <span class="string">&#x27;growth&#x27;</span>, <span class="string">&#x27;indicates&#x27;</span>, <span class="string">&#x27;the&#x27;</span>, <span class="string">&#x27;PC&#x27;</span>, <span class="string">&#x27;market&#x27;</span>, <span class="string">&#x27;has&#x27;</span>, <span class="string">&#x27;finally&#x27;</span>, <span class="string">&#x27;bottomed&#x27;</span>, <span class="string">&#x27;out&#x27;</span>, <span class="string">&#x27;.&#x27;</span>, <span class="string">&#x27;......&#x27;</span>]</span><br></pre></td></tr></table></figure></p><p>Now you can see that all the words and 
punctuation marks are split into individual tokens. However, these tokens are not yet ready, as there are irregular symbols or characters in the list that carry no meaning on their own. Therefore, we need to remove them from our token list.</p><p><strong>Remove symbols and punctuation</strong></p><p>In the token list above, we see a lot of punctuation marks such as <code>&#39;</code>, <code>.</code>, or <code>...</code> scattered throughout the list. Even though they do mean something when they are combined into a sentence, removing them won’t prevent us or the machine from understanding the general structure of the sentence. Note that the <code>isalpha()</code> filter used here also drops tokens that contain digits, such as <code>Q3</code>.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">tokens = [x <span class="keyword">for</span> x <span class="keyword">in</span> nltk.word_tokenize(corporas) <span class="keyword">if</span> x.isalpha()]</span><br><span class="line">print(tokens)</span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span>[<span class="string">&#x27;AMD&#x27;</span>, <span class="string">&#x27;s&#x27;</span>, <span class="string">&#x27;earnings&#x27;</span>, <span class="string">&#x27;report&#x27;</span>, <span class="string">&#x27;exceeded&#x27;</span>, <span class="string">&#x27;Wall&#x27;</span>, <span class="string">&#x27;Street&#x27;</span>, <span class="string">&#x27;expectations&#x27;</span>, <span class="string">&#x27;Its&#x27;</span>, <span class="string">&#x27;growth&#x27;</span>, <span class="string">&#x27;indicates&#x27;</span>, <span class="string">&#x27;the&#x27;</span>, <span class="string">&#x27;PC&#x27;</span>, <span class="string">&#x27;market&#x27;</span>, <span class="string">&#x27;has&#x27;</span>, <span class="string">&#x27;finally&#x27;</span>, <span class="string">&#x27;bottomed&#x27;</span>, <span 
class="string">&#x27;out&#x27;</span>]</span><br></pre></td></tr></table></figure><p><strong>Remove stop words</strong></p><p>Stop words are a set of common words that add little meaning to a sentence. For example, if you want to know <em>“how to cook a piece of steak with an oven”</em>, you would probably google the keywords <code>cook</code>, <code>steak</code>, and <code>oven</code>. <code>How</code>, <code>to</code>, <code>a</code>, <code>of</code>, and <code>with</code> would be considered stop words, as they carry less information than the rest of the words. Stop words exist in every language (<em>but maybe not in programming languages lol</em>).</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> nltk.corpus <span class="keyword">import</span> stopwords</span><br><span class="line"></span><br><span class="line"><span class="comment"># Download the corpus that contains the stop-word lists</span></span><br><span class="line">nltk.download(<span class="string">&#x27;stopwords&#x27;</span>)</span><br><span class="line"></span><br><span class="line">stop_words = set(stopwords.words(<span class="string">&#x27;english&#x27;</span>))</span><br><span class="line">tokens_wo_stop_words = [x <span class="keyword">for</span> x <span class="keyword">in</span> tokens <span class="keyword">if</span> x <span class="keyword">not</span> <span class="keyword">in</span> stop_words]</span><br><span class="line">print(tokens_wo_stop_words)</span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span>[<span class="string">&#x27;AMD&#x27;</span>, <span 
class="string">&#x27;earnings&#x27;</span>, <span class="string">&#x27;report&#x27;</span>, <span class="string">&#x27;exceeded&#x27;</span>, <span class="string">&#x27;Wall&#x27;</span>, <span class="string">&#x27;Street&#x27;</span>, <span class="string">&#x27;expectations&#x27;</span>, <span class="string">&#x27;Its&#x27;</span>, <span class="string">&#x27;growth&#x27;</span>, <span class="string">&#x27;indicates&#x27;</span>, <span class="string">&#x27;PC&#x27;</span>, <span class="string">&#x27;market&#x27;</span>, <span class="string">&#x27;finally&#x27;</span>, <span class="string">&#x27;bottomed&#x27;</span>]</span><br></pre></td></tr></table></figure><p>See! The tokens now look cleaner, and removing the stop words does not prevent us from understanding the meaning of the sentence. Note that <code>Its</code> survives because the NLTK stop-word list is lowercase and the comparison is case-sensitive; lowercase the tokens first if you want it removed. Here we finish the first step of the processing.</p><h2 id="Stemming-amp-Lemmatization"><a href="#Stemming-amp-Lemmatization" class="headerlink" title="Stemming &amp; Lemmatization"></a>Stemming &amp; Lemmatization</h2><img data-src="/2023/11/02/2023-11-03-sentiment-analysis/stem_n_lemmi.png" class="" width="800"><p style="text-align:center; color: grey;">  <i>NLP processes: Stemming and Lemmatization</i></p><p>In English, a single root form often has many variations. For example, the word <code>love</code> has the forms <em>loves (verb), loved (verb), loving (adj.), and loves (noun)</em>. These variations help human beings comprehend the context of the speaker’s intentions, but they inevitably make it harder for a machine-learning model to grasp the key point of the document. Therefore, it’s crucial to further process these variations and convert them to an identical form that won’t confuse the machine-learning model. 
<code>Stemming</code> and <code>lemmatization</code> are two techniques that find the common root form of word variations in different ways, but both ultimately aim at the same goal.</p><p><strong>Lexicons</strong><br>First of all, let’s talk about lexicons. Lexicons are the foundation of the stemming and lemmatization techniques. A lexicon is like a dictionary to look up when finding the root form of a word variation. Therefore, choosing the right lexicon is crucial for processing the words in the given document. <strong>LIWC</strong>, <strong>Harvard’s General Inquirer</strong>, <strong>SenticNet</strong>, and <strong>SentiWordNet</strong> are among the most famous lexicons. <strong>Loughran-McDonald Master Dictionary</strong> is one of the most popular finance lexicons. <strong>SentiBigNomics</strong> is a detailed financial dictionary specialized in sentiment analysis. There are around 7300 terms and root forms documented in this lexicon. Also, if you’re looking to conduct sentiment analysis on biomedical papers, <strong>WordNet for Medical Events (WME)</strong> could be a better choice.</p><p><strong>Stemming</strong><br>Stemming is a process that removes the morphological affixes from word variations. The grammatical role, tense, and derivational morphology are stripped away, leaving only the stem of the word, which is the common root. For example, both <code>loves</code> and <code>loving</code> will be stemmed back to the root form <code>love</code>. However, stemming has a downside that sometimes backfires. The words <code>universal</code>, <code>university</code>, and <code>universe</code> have different meanings, but share the same root form <code>univers</code> if you adopt the stemming method. 
This is the price you have to pay because stemming offers a faster and easier way to extract text features.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> nltk.stem <span class="keyword">import</span> PorterStemmer</span><br><span class="line">ps = PorterStemmer()</span><br><span class="line"></span><br><span class="line"><span class="keyword">for</span> w <span class="keyword">in</span> tokens_wo_stop_words:</span><br><span class="line">    print(<span class="string">f&#x27;<span class="subst">&#123;w&#125;</span>: <span class="subst">&#123;ps.stem(w)&#125;</span>&#x27;</span>)</span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span>AMD: amd</span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span>earnings: earn</span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span>report: report</span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span>exceeded: exceed</span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span>Wall: wall</span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span>Street: street</span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span>expectations: expect</span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span>Its: it</span><br><span class="line"><span 
class="meta">&gt;&gt;&gt; </span>growth: growth</span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span>indicates: indic</span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span>PC: pc</span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span>market: market</span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span><span class="keyword">finally</span>: final</span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span>bottomed: bottom</span><br></pre></td></tr></table></figure><p><strong>Lemmatization</strong><br>In contrast, lemmatization is better at finding the root form of word variations, at the cost of speed. It looks each word up in a larger lexicon to find its root form, so it returns a more accurate word than stemming does. Lemmatization also takes the part of speech into consideration. For example, lemmatizing <code>saw</code> gives you <code>see</code> if you treat it as a verb and <code>saw</code> if you treat it as a noun.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span 
class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> nltk.stem <span class="keyword">import</span> WordNetLemmatizer</span><br><span class="line"></span><br><span class="line">nltk.download(<span class="string">&#x27;wordnet&#x27;</span>)</span><br><span class="line">nltk.download(<span class="string">&#x27;omw-1.4&#x27;</span>)</span><br><span class="line"></span><br><span class="line">lemmatizer = WordNetLemmatizer()</span><br><span class="line"></span><br><span class="line">print(<span class="string">f&#x27;AMD: <span class="subst">&#123;lemmatizer.lemmatize(<span class="string">&quot;AMD&quot;</span>, pos=<span class="string">&quot;n&quot;</span>)&#125;</span>&#x27;</span>)</span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span>AMD: AMD</span><br><span class="line"></span><br><span class="line">print(<span class="string">f&#x27;earnings: <span class="subst">&#123;lemmatizer.lemmatize(<span class="string">&quot;earnings&quot;</span>, pos=<span class="string">&quot;n&quot;</span>)&#125;</span>&#x27;</span>)</span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span>earnings: earnings</span><br><span class="line"></span><br><span class="line">print(<span class="string">f&#x27;report: <span 
class="subst">&#123;lemmatizer.lemmatize(<span class="string">&quot;report&quot;</span>, pos=<span class="string">&quot;n&quot;</span>)&#125;</span>&#x27;</span>)</span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span>report: report</span><br><span class="line"></span><br><span class="line">print(<span class="string">f&#x27;exceeded: <span class="subst">&#123;lemmatizer.lemmatize(<span class="string">&quot;exceeded&quot;</span>, pos=<span class="string">&quot;v&quot;</span>)&#125;</span>&#x27;</span>)</span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span>exceeded: exceed</span><br><span class="line"></span><br><span class="line">print(<span class="string">f&#x27;Wall: <span class="subst">&#123;lemmatizer.lemmatize(<span class="string">&quot;Wall&quot;</span>, pos=<span class="string">&quot;n&quot;</span>)&#125;</span>&#x27;</span>)</span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span>Wall: Wall</span><br><span class="line"></span><br><span class="line">print(<span class="string">f&#x27;Street: <span class="subst">&#123;lemmatizer.lemmatize(<span class="string">&quot;Street&quot;</span>, pos=<span class="string">&quot;n&quot;</span>)&#125;</span>&#x27;</span>)</span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span>Street: Street</span><br><span class="line"></span><br><span class="line">print(<span class="string">f&#x27;expectations: <span class="subst">&#123;lemmatizer.lemmatize(<span class="string">&quot;expectations&quot;</span>, pos=<span class="string">&quot;n&quot;</span>)&#125;</span>&#x27;</span>)</span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span>expectations: expectation</span><br><span class="line"></span><br><span class="line">print(<span class="string">f&#x27;Its: <span class="subst">&#123;lemmatizer.lemmatize(<span class="string">&quot;Its&quot;</span>, pos=<span class="string">&quot;n&quot;</span>)&#125;</span>&#x27;</span>)</span><br><span class="line"><span 
class="meta">&gt;&gt;&gt; </span>Its: Its</span><br><span class="line"></span><br><span class="line">print(<span class="string">f&#x27;growth: <span class="subst">&#123;lemmatizer.lemmatize(<span class="string">&quot;growth&quot;</span>, pos=<span class="string">&quot;n&quot;</span>)&#125;</span>&#x27;</span>)</span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span>growth: growth</span><br><span class="line"></span><br><span class="line">print(<span class="string">f&#x27;indicates: <span class="subst">&#123;lemmatizer.lemmatize(<span class="string">&quot;indicates&quot;</span>, pos=<span class="string">&quot;v&quot;</span>)&#125;</span>&#x27;</span>)</span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span>indicates: indicate</span><br><span class="line"></span><br><span class="line">print(<span class="string">f&#x27;PC: <span class="subst">&#123;lemmatizer.lemmatize(<span class="string">&quot;PC&quot;</span>, pos=<span class="string">&quot;n&quot;</span>)&#125;</span>&#x27;</span>)</span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span>PC: PC</span><br><span class="line"></span><br><span class="line">print(<span class="string">f&#x27;market: <span class="subst">&#123;lemmatizer.lemmatize(<span class="string">&quot;market&quot;</span>, pos=<span class="string">&quot;n&quot;</span>)&#125;</span>&#x27;</span>)</span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span>market: market</span><br><span class="line"></span><br><span class="line">print(<span class="string">f&#x27;finally: <span class="subst">&#123;lemmatizer.lemmatize(<span class="string">&quot;finally&quot;</span>, pos=<span class="string">&quot;r&quot;</span>)&#125;</span>&#x27;</span>)</span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span><span class="keyword">finally</span>: <span class="keyword">finally</span></span><br><span class="line"></span><br><span class="line">print(<span class="string">f&#x27;bottomed: <span 
class="subst">&#123;lemmatizer.lemmatize(<span class="string">&quot;bottomed&quot;</span>, pos=<span class="string">&quot;v&quot;</span>)&#125;</span>&#x27;</span>)</span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span>bottomed: bottom</span><br></pre></td></tr></table></figure><p>One thing worth mentioning: unless you are confident that your model needs both techniques, you probably don’t want to apply them at the same time. For example, the stemming method will strip the word <code>saws</code> down to <code>saw</code>, which makes sense because <code>saws</code> is the plural form of the noun <code>saw</code>. If you then apply lemmatization to the word <code>saw</code>, you might get <code>see</code> if it is tagged as a verb rather than a noun. So be aware.</p><img data-src="/2023/11/02/2023-11-03-sentiment-analysis/stem_n_lemmi_2.png" class="" width="600"><p style="text-align:center; color: grey;">  <i>Differences between stemming and lemmatization</i></p><h2 id="Part-of-speech-tagging"><a href="#Part-of-speech-tagging" class="headerlink" title="Part-of-speech tagging"></a>Part-of-speech tagging</h2><p>After learning the power of lemmatization, you probably want to ask, <em>“Hey! If I have to specify the part of speech of every single word, that is no longer efficient at all”</em>. Worry not. <code>NLTK</code> is well thought out and ships part-of-speech tagging as one of its sub-packages. You simply pass your tokens to the <code>nltk.pos_tag()</code> function, and the pre-defined <a href="https://www.ibm.com/docs/en/wca/3.5.0?topic=analytics-part-speech-tag-sets">part-of-speech tags</a> will be returned together with the tokens as tuples. 
You can then define a function that maps the returned POS tags to a simpler set such as <code>[n, v, adj, adv, conj, ...]</code>, making lemmatization much easier.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">nltk.download(<span class="string">&#x27;averaged_perceptron_tagger&#x27;</span>)</span><br><span class="line"></span><br><span class="line">nltk.pos_tag(tokens_wo_stop_words)</span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span>[(<span class="string">&#x27;AMD&#x27;</span>, <span class="string">&#x27;NNP&#x27;</span>), (<span class="string">&#x27;earnings&#x27;</span>, <span class="string">&#x27;NNS&#x27;</span>), (<span class="string">&#x27;report&#x27;</span>, <span class="string">&#x27;NN&#x27;</span>), (<span class="string">&#x27;exceeded&#x27;</span>, <span class="string">&#x27;VBD&#x27;</span>), (<span class="string">&#x27;Wall&#x27;</span>, <span class="string">&#x27;NNP&#x27;</span>), (<span class="string">&#x27;Street&#x27;</span>, <span class="string">&#x27;NNP&#x27;</span>), (<span class="string">&#x27;expectations&#x27;</span>, <span class="string">&#x27;NNS&#x27;</span>), (<span class="string">&#x27;Its&#x27;</span>, <span class="string">&#x27;PRP$&#x27;</span>), (<span class="string">&#x27;growth&#x27;</span>, <span class="string">&#x27;NN&#x27;</span>), (<span class="string">&#x27;indicates&#x27;</span>, <span class="string">&#x27;VBZ&#x27;</span>), (<span class="string">&#x27;PC&#x27;</span>, <span class="string">&#x27;NN&#x27;</span>), (<span class="string">&#x27;market&#x27;</span>, <span class="string">&#x27;NN&#x27;</span>), (<span class="string">&#x27;finally&#x27;</span>, <span class="string">&#x27;RB&#x27;</span>), (<span class="string">&#x27;bottomed&#x27;</span>, <span 
class="string">&#x27;VBD&#x27;</span>)]</span><br></pre></td></tr></table></figure><h2 id="NER-Named-Entity-Recognition-and-chunking"><a href="#NER-Named-Entity-Recognition-and-chunking" class="headerlink" title="NER (Named Entity Recognition) and chunking"></a>NER (Named Entity Recognition) and chunking</h2><p>What is NER (Named Entity Recognition)? Easy. Take the <code>New York Statue of Liberty</code> for example. Should we tokenize this into <code>New</code>, <code>York</code>, <code>Statue</code>, <code>of</code>, and <code>Liberty</code>, or should it be <code>New York</code> and <code>Statue of Liberty</code> instead? A named entity is the proper name of a person, place, organization, or thing. Such a combination of words shouldn’t be treated as multiple tokens; it should be treated as one token. That’s why we need to regroup the words and identify the named entities, reducing the chance of confusing the steps that follow.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line">nltk.download(<span class="string">&#x27;maxent_ne_chunker&#x27;</span>)</span><br><span class="line">nltk.download(<span class="string">&#x27;words&#x27;</span>)</span><br><span class="line"></span><br><span class="line">tagged_token = nltk.pos_tag(tokens_wo_stop_words)</span><br><span class="line">nltk.chunk.ne_chunk(tagged_token)</span><br><span class="line"></span><br><span class="line"><span class="keyword">for</span> chunk <span class="keyword">in</span> 
nltk.chunk.ne_chunk(tagged_token):</span><br><span class="line">  <span class="keyword">if</span> hasattr(chunk, <span class="string">&#x27;label&#x27;</span>):</span><br><span class="line">    print(chunk.label(), <span class="string">&#x27; &#x27;</span>.join(c[<span class="number">0</span>] <span class="keyword">for</span> c <span class="keyword">in</span> chunk))</span><br><span class="line"></span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span>FACILITY Wall Street</span><br><span class="line"></span><br><span class="line">processed_token = [(<span class="string">&#x27; &#x27;</span>.join(c[<span class="number">0</span>] <span class="keyword">for</span> c <span class="keyword">in</span> chunk), chunk.label()) <span class="keyword">if</span> hasattr(chunk, <span class="string">&#x27;label&#x27;</span>) <span class="keyword">else</span> chunk <span class="keyword">for</span> chunk <span class="keyword">in</span> nltk.chunk.ne_chunk(tagged_token)]</span><br><span class="line"><span class="meta">&gt;&gt;&gt; </span>[(<span class="string">&#x27;AMD&#x27;</span>, <span class="string">&#x27;NNP&#x27;</span>), (<span class="string">&#x27;earnings&#x27;</span>, <span class="string">&#x27;NNS&#x27;</span>), (<span class="string">&#x27;report&#x27;</span>, <span class="string">&#x27;NN&#x27;</span>), (<span class="string">&#x27;exceeded&#x27;</span>, <span class="string">&#x27;VBD&#x27;</span>), (<span class="string">&#x27;Wall Street&#x27;</span>, <span class="string">&#x27;FACILITY&#x27;</span>), (<span class="string">&#x27;expectations&#x27;</span>, <span class="string">&#x27;NNS&#x27;</span>), (<span class="string">&#x27;Its&#x27;</span>, <span class="string">&#x27;PRP$&#x27;</span>), (<span class="string">&#x27;growth&#x27;</span>, <span class="string">&#x27;NN&#x27;</span>), (<span class="string">&#x27;indicates&#x27;</span>, <span class="string">&#x27;VBZ&#x27;</span>), (<span class="string">&#x27;PC&#x27;</span>, <span 
class="string">&#x27;NN&#x27;</span>), (<span class="string">&#x27;market&#x27;</span>, <span class="string">&#x27;NN&#x27;</span>), (<span class="string">&#x27;finally&#x27;</span>, <span class="string">&#x27;RB&#x27;</span>), (<span class="string">&#x27;bottomed&#x27;</span>, <span class="string">&#x27;VBD&#x27;</span>)]</span><br></pre></td></tr></table></figure><p>See! <code>Wall Street</code> has been grouped into a single token as a named entity.</p><hr><p><strong><em>OK!</em></strong></p><p>I’m going to stop right here. After all, we don’t need all the steps in place to conduct a simple sentiment analysis. We’ll now jump right into a simple sentiment analysis tool to evaluate the emotional implication of the news headlines. However, if you want to know more about the rest of these steps and how to apply them to the stock market, feel free to leave me a message.</p><h1 id="VADER-Valence-Aware-Dictionary-and-sEntiment-Reasoner"><a href="#VADER-Valence-Aware-Dictionary-and-sEntiment-Reasoner" class="headerlink" title="VADER (Valence Aware Dictionary and sEntiment Reasoner)"></a>VADER (Valence Aware Dictionary and sEntiment Reasoner)</h1><p><code>VADER</code> is a model built into the <code>NLTK</code> package that aims to evaluate the emotional intensity of a sentence. <code>VADER</code> not only determines whether a sentence is positive or negative, but also evaluates its intensity, judging how positive or negative the sentence is. 
Here are a few more things about <code>VADER</code>:</p><ul><li><code>VADER</code> returns four values for each sentence evaluation: positive level, negative level, neutral level, and compound score.</li><li>It takes into account the emotional impact of special punctuation like <code>!!!</code> and <code>!?</code>, as well as emoticons such as <code>:)</code> and <code>;(</code>.</li><li>It also factors in all-capitalized words, which enhance or dampen the emotional implication of a sentence.</li><li>It’s fast, as it doesn’t need to train any model before use.</li><li>It’s best suited for the language used in social media because it handles emoticons and unconventional punctuation well.</li></ul><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">%-)    -1.5    1.43178    [-2, 0, -2, -2, -1, 2, -2, -3, -2, -3]</span><br><span class="line">&amp;-:    -0.4    1.42829    [-3, -1, 0, 0, -1, -1, -1, 2, -1, 2]</span><br><span class="line">...</span><br><span class="line">advantaged    1.4    0.91652    [1, 0, 3, 0, 1, 1, 2, 2, 2, 2]</span><br><span class="line">advantageous    1.5    0.67082    [2, 0, 2, 2, 2, 1, 1, 1, 2, 2]</span><br><span class="line">...</span><br></pre></td></tr></table></figure><p style="text-align:center; color: grey;">  <i><b><a href='https://github.com/cjhutto/vaderSentiment/blob/master/vaderSentiment/vader_lexicon.txt'>vader_lexicon.txt</a></b> is used to look up the score of a word or an emoticon</i></p><p>The scoring method that <code>VADER</code> uses and its source code are relatively straightforward and easy to understand. 
I encourage you to spend half an hour getting to know how <code>VADER</code> computes the sentiment score. <em>(Check out the <a href="https://www.nltk.org/_modules/nltk/sentiment/vader.html">VADER source code</a>)</em>.</p><img data-src="/2023/11/02/2023-11-03-sentiment-analysis/vader_scores.png" class="" width="600"><p style="text-align:center; color: grey;">  <i>A couple of examples of VADER <i>polarity_scores()</i> </i></p><h1 id="Get-started-with-the-stock-sentiment-analysis"><a href="#Get-started-with-the-stock-sentiment-analysis" class="headerlink" title="Get started with the stock sentiment analysis"></a>Get started with the stock sentiment analysis</h1><p>Let’s get down to business! I’m going to demonstrate how to conduct sentiment analysis with <code>VADER</code> on four stocks: <code>NVDA</code>, <code>AVGO</code>, <code>AMD</code>, and <code>BABA</code>. As for the data source, I will scrape the news headlines from <a href="https://finviz.com/quote.ashx?t=amd&amp;p=d">https://finviz.com/</a>, as suggested by the author of <a href="https://medium.datadriveninvestor.com/sentiment-analysis-of-stocks-from-financial-news-using-python-82ebdcefb638">this article</a>.</p><h2 id="Step-1-Global-variables"><a href="#Step-1-Global-variables" class="headerlink" title="Step 1. Global variables"></a>Step 1. 
Global variables</h2><p>First, let’s import the libraries we need, and define the tickers that we’re going to look into.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> pandas <span class="keyword">as</span> pd</span><br><span class="line"><span class="keyword">from</span> datetime <span class="keyword">import</span> datetime</span><br><span class="line"></span><br><span class="line"><span class="keyword">from</span> bs4 <span class="keyword">import</span> BeautifulSoup</span><br><span class="line"><span class="keyword">import</span> nltk</span><br><span class="line"><span class="keyword">from</span> nltk.sentiment.vader <span class="keyword">import</span> SentimentIntensityAnalyzer</span><br><span class="line"><span class="keyword">import</span> requests</span><br><span class="line"></span><br><span class="line"><span class="comment"># Define the ticker list</span></span><br><span class="line">tickers_list = [<span class="string">&#x27;NVDA&#x27;</span>, <span class="string">&#x27;AVGO&#x27;</span>, <span class="string">&#x27;AMD&#x27;</span>, <span class="string">&#x27;BABA&#x27;</span>]</span><br></pre></td></tr></table></figure><h2 id="Step-2-Fetch-the-headlines-of-the-tickers"><a href="#Step-2-Fetch-the-headlines-of-the-tickers" class="headerlink" title="Step 2. Fetch the headlines of the tickers"></a>Step 2. Fetch the headlines of the tickers</h2><p>In this step, we use <code>BeautifulSoup</code> and <code>requests</code> to scrape the news headline from <a href="https://finviz.com/">https://finviz.com/</a>. 
After you scrape the headlines and load them into a <code>pd.DataFrame</code>, you will notice that most cells in the <code>Date</code> column are actually empty. This is caused by the date format that <a href="https://finviz.com/">https://finviz.com/</a> uses. Hence, we need to further process the <code>Date</code> column, forward-filling the missing dates and extracting the time into a <code>Time</code> column. Once that is done, we can concatenate all the scraped headlines to produce a complete headline table.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br></pre></td><td class="code"><pre><span class="line">news = pd.DataFrame()</span><br><span class="line"></span><br><span class="line"><span class="keyword">for</span> ticker <span class="keyword">in</span> tickers_list:</span><br><span class="line">    url = <span class="string">f&#x27;https://finviz.com/quote.ashx?t=<span class="subst">&#123;ticker&#125;</span>&amp;p=d&#x27;</span></span><br><span class="line">    ret = 
requests.get(</span><br><span class="line">        url,</span><br><span class="line">        headers=&#123;<span class="string">&#x27;User-Agent&#x27;</span>: <span class="string">&#x27;Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36&#x27;</span>&#125;,</span><br><span class="line">    )</span><br><span class="line">    html = BeautifulSoup(ret.content, <span class="string">&quot;html.parser&quot;</span>)</span><br><span class="line">    <span class="keyword">try</span>:</span><br><span class="line">        df = pd.read_html(</span><br><span class="line">            str(html),</span><br><span class="line">            attrs=&#123;<span class="string">&#x27;class&#x27;</span>: <span class="string">&#x27;fullview-news-outer&#x27;</span>&#125;</span><br><span class="line">        )[<span class="number">0</span>]</span><br><span class="line">        <span class="comment"># print(f&quot;&#123;ticker&#125; Done&quot;)</span></span><br><span class="line">    <span class="keyword">except</span> ValueError:  <span class="comment"># pd.read_html raises ValueError when no matching table is found</span></span><br><span class="line">        print(<span class="string">f&quot;<span class="subst">&#123;ticker&#125;</span> No news found&quot;</span>)</span><br><span class="line">        <span class="keyword">continue</span></span><br><span class="line">    df.columns = [<span class="string">&#x27;Date&#x27;</span>, <span class="string">&#x27;Headline&#x27;</span>]</span><br><span class="line"></span><br><span class="line">    <span class="comment"># Process date and time columns to make sure this is filled in every headline each row</span></span><br><span class="line">    dateNTime = df.Date.apply(<span class="keyword">lambda</span> x: <span class="string">&#x27;,&#x27;</span>+x <span class="keyword">if</span> len(x)&lt;<span class="number">8</span> <span class="keyword">else</span> x).str.split(<span class="string">r&#x27; |,&#x27;</span>, expand = <span class="literal">True</span>).replace(<span 
class="string">&quot;&quot;</span>, <span class="literal">None</span>).ffill()</span><br><span class="line">    df = pd.merge(df, dateNTime, right_index=<span class="literal">True</span>, left_index=<span class="literal">True</span>).drop(<span class="string">&#x27;Date&#x27;</span>, axis=<span class="number">1</span>).rename(columns=&#123;<span class="number">0</span>:<span class="string">&#x27;Date&#x27;</span>, <span class="number">1</span>:<span class="string">&#x27;Time&#x27;</span>&#125;)</span><br><span class="line">    df.loc[df[<span class="string">&#x27;Date&#x27;</span>] == <span class="string">&#x27;Today&#x27;</span>, <span class="string">&#x27;Date&#x27;</span>] = str(datetime.now().date())</span><br><span class="line">    df.Date = pd.to_datetime(df.Date)</span><br><span class="line">    df.Time = pd.to_datetime(df.Time).dt.time</span><br><span class="line">    df = df[df[<span class="string">&quot;Headline&quot;</span>].str.contains(<span class="string">&quot;Loading.&quot;</span>) == <span class="literal">False</span>].loc[:, [<span class="string">&#x27;Date&#x27;</span>, <span class="string">&#x27;Time&#x27;</span>, <span class="string">&#x27;Headline&#x27;</span>]]</span><br><span class="line">    df[<span class="string">&quot;Date&quot;</span>] = df[<span class="string">&quot;Date&quot;</span>].dt.date</span><br><span class="line"></span><br><span class="line">    df[<span class="string">&quot;Ticker&quot;</span>] = ticker</span><br><span class="line">    news = pd.concat([news, df], ignore_index = <span class="literal">True</span>)</span><br></pre></td></tr></table></figure><img data-src="/2023/11/02/2023-11-03-sentiment-analysis/step_1_for_sentiment_analysis.png" class="" width="600"><p style="text-align:center; color: grey;">  <i>DataFrame of the scraped headlines</i></p><h2 id="Step-3-Generate-the-news-sentiment-score"><a href="#Step-3-Generate-the-news-sentiment-score" class="headerlink" title="Step 3. 
Generate the news sentiment score"></a>Step 3. Generate the news sentiment score</h2><p>This step is fairly simple. We apply the <code>polarity_scores()</code> function to all the headlines. Once we get the negative, neutral, positive, and compound scores, we join them back to the original news DataFrame. Note that we need to download the <code>vader_lexicon</code> first so that the <code>polarity_scores()</code> function can work properly. The way the VADER package calculates the score is quite interesting and not difficult to understand. If you are interested in how the scores are calculated, read the <a href="https://www.nltk.org/_modules/nltk/sentiment/vader.html">VADER source code</a>. It will probably take you half an hour, but it will definitely pay off.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">nltk.download(<span class="string">&#x27;vader_lexicon&#x27;</span>)</span><br><span class="line">vader = SentimentIntensityAnalyzer()</span><br><span class="line"></span><br><span class="line">scored_news = news.join(pd.DataFrame(news[<span class="string">&#x27;Headline&#x27;</span>].apply(vader.polarity_scores).tolist()))</span><br></pre></td></tr></table></figure><img data-src="/2023/11/02/2023-11-03-sentiment-analysis/step_2_for_sentiment_analysis.png" class="" width="600"><p style="text-align:center; color: grey;">  <i>Attach the score back to the original DataFrame</i></p><h2 id="Step-4-To-further-add-a-flavor-to-the-sentiment-score"><a href="#Step-4-To-further-add-a-flavor-to-the-sentiment-score" class="headerlink" title="Step 4. To further add a flavor to the sentiment score"></a>Step 4. 
To further add a flavor to the sentiment score</h2><p>It is well known that the impact of any piece of news wanes as time passes. I use the EMA (Exponential Moving Average) to factor this decay into our sentiment score model. Here I adopt a 5-day EMA of the sentiment score.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">news_score = scored_news.loc[:, [<span class="string">&#x27;Ticker&#x27;</span>, <span class="string">&#x27;Date&#x27;</span>, <span class="string">&#x27;compound&#x27;</span>]].pivot_table(values=<span class="string">&#x27;compound&#x27;</span>, index=<span class="string">&#x27;Date&#x27;</span>, columns=<span class="string">&#x27;Ticker&#x27;</span>, aggfunc=<span class="string">&#x27;mean&#x27;</span>).ewm(span=<span class="number">5</span>).mean()</span><br><span class="line">news_score.dropna().plot()</span><br></pre></td></tr></table></figure><img data-src="/2023/11/02/2023-11-03-sentiment-analysis/step_3_for_sentiment_analysis.png" class="" width="600"><p style="text-align:center; color: grey;">  <i>5-day EMA of the sentiment scores</i></p><p>Looking at the diagram above, it is easy to see that the sentiment scores of these four tickers followed different paths. However, stock prices are driven not by the absolute score but by the relative changes in the score. 
Therefore, let’s take one more step to find out the changes in the emotional implications of these headlines.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">news_score.pct_change().dropna().plot()</span><br></pre></td></tr></table></figure><img data-src="/2023/11/02/2023-11-03-sentiment-analysis/step_4_for_sentiment_analysis.png" class="" width="600"><p style="text-align:center; color: grey;">  <i>Percentage change of the daily sentiment score of each ticker</i></p><p>After all these steps, the outcome is finally much clearer. Both <code>BABA</code> and <code>NVDA</code> show positive changes in their sentiment scores. This might indicate that the news will have a positive influence on these two stocks: demand for them could rise relative to supply, pushing their prices up.</p><h1 id="Conclusion-and-other-thoughts"><a href="#Conclusion-and-other-thoughts" class="headerlink" title="Conclusion and other thoughts"></a>Conclusion and other thoughts</h1><p>This is the end of my sentiment analysis, but it shouldn’t be yours. There are more interesting ideas you can start building on top of this sentiment framework, such as:</p><ul><li>Find a suitable lexicon for processing your tokens and evaluating your scores.</li><li>Scrape not just the headline of the news but also its content to run a much more detailed sentiment analysis.</li><li>Feed the news_score data into an LSTM model instead of simply using the Exponential Moving Average.</li><li>…</li></ul><p>Feel free to leave me a message telling me whether you like this article or not. 
Or, maybe just tell me what can be added to the analysis here.<br>Cheers.</p><h1 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h1><ul><li><a href="https://www.youtube.com/playlist?list=PLeo1K3hjS3uuvuAXhYjV2lMEShq2UYSwX">NLP Tutorial Python Youtube channel</a></li><li><a href="https://medium.datadriveninvestor.com/sentiment-analysis-of-stocks-from-financial-news-using-python-82ebdcefb638">Sentiment Analysis of Stocks from Financial News using Python</a></li><li><a href="https://medium.com/@mystery0116/nlp-how-does-nltk-vader-calculate-sentiment-6c32d0f5046b">NLP: How does NLTK.Vader Calculate Sentiment?</a></li><li><a href="https://www.datacamp.com/tutorial/stemming-lemmatization-python">Stemming and Lemmatization in Python</a></li></ul>]]></content>
    
    
    <summary type="html">&lt;img data-src=&quot;/2023/11/02/2023-11-03-sentiment-analysis/cover.png&quot; class=&quot;&quot; width=&quot;800&quot;&gt;
&lt;p style=&quot;text-align:center; color: grey;&quot;&gt;
  &lt;i&gt;Photo by &lt;a href=&#39;https://www.shutterstock.com/&#39;&gt;Shutterstock&lt;/a&gt;&lt;/i&gt;
&lt;/p&gt;

&lt;p&gt;In this article, we will discuss how sentiment analysis impacts the financial market, the basics of NLP(&lt;em&gt;Natural Language Processing&lt;/em&gt;), and showcase how to process the financial headlines by batch to generate an indicator of the market sentiment.&lt;/p&gt;</summary>
    
    
    <category term="Factor Analysis" scheme="http://mikelhsia.github.io/categories/Factor-Analysis/"/>
    
    <category term="How2" scheme="http://mikelhsia.github.io/categories/Factor-Analysis/How2/"/>
    
    
    <category term="Scrapy" scheme="http://mikelhsia.github.io/tags/Scrapy/"/>
    
    <category term="Research" scheme="http://mikelhsia.github.io/tags/Research/"/>
    
    <category term="Technical Analysis" scheme="http://mikelhsia.github.io/tags/Technical-Analysis/"/>
    
    <category term="Python3" scheme="http://mikelhsia.github.io/tags/Python3/"/>
    
  </entry>
  
  <entry>
    <title>【How 2】 Upgrade your backtesting arsenal - trading multiple stocks with &quot;backtrader&quot;</title>
    <link href="http://mikelhsia.github.io/2023/08/31/2023-08-31-backtrader-multistocks-backtesting/"/>
    <id>http://mikelhsia.github.io/2023/08/31/2023-08-31-backtrader-multistocks-backtesting/</id>
    <published>2023-08-31T06:29:36.000Z</published>
    <updated>2024-11-12T07:31:58.431Z</updated>
    
    <content type="html"><![CDATA[<img data-src="/2023/08/31/2023-08-31-backtrader-multistocks-backtesting/cover.jpeg" class="" width="1000"><p style="text-align:center; color: grey;">  <i>Photo by <a href='https://unsplash.com/@priscilladupreez?utm_source=medium&utm_medium=referral'>Priscilla Du Preez</a> on <a href='https://unsplash.com/?utm_source=medium&utm_medium=referral'>Unsplash</a></i></p><p>Backtrader is a well-known open-source Python library for backtesting your quantitative trading strategies. Most of its components support trading a single target out of the box. To step up the game and trade multiple stocks, a few things need to be fine-tuned to make sure the strategy trades as you expect. In this post, I’m going to share my experience and some crucial tips as a starting point for building your own.</p><a id="more"></a><hr><p>Become a <a href="https://medium.com/@mikelhsia/membership">Medium member</a> to continue learning without limits. I’ll receive a small portion of your membership fee if you use the following link, at no extra cost to you.</p><blockquote><p>If you enjoy reading this and my other articles, feel free to <a href='https://medium.com/@mikelhsia/membership'>join the Medium membership program</a> to read more about Quantitative Trading Strategy.</p></blockquote><hr><p><strong><em>Previous research</em></strong></p><ul><li><a href="https://mikelhsia.github.io/tags/Interactive-Broker/">【How 2】Set Up Trading API Template In Python</a></li><li><a href="https://mikelhsia.github.io/2021/02/15/2021-02-15-how2-snp500-historic-composition/">【How 2】 Vol. 4. How to produce the S&amp;P 500 Historical Components &amp; Changes</a></li><li><a href="https://mikelhsia.github.io/2020/11/26/2020-11-28-how-to-produce-a-quality-tradable-stock-set-for-backtesting/">【How 2】 Vol. 3. 
How to produce quality tradable securities for backtesting</a></li></ul><h1 id="Introduction"><a href="#Introduction" class="headerlink" title="Introduction"></a>Introduction</h1><p>Backtrader is a well-known open-source Python library for backtesting, strategy visualization, and live trading. Unfortunately, active development stopped around 2018, with only a few bug fixes merged here and there since, but it is still considered one of the important backtesting tools for beginners to get familiar with the framework and crucial components of quantitative trading.</p><p>As mentioned, a few things need to be taken care of when moving from trading a single target, such as a stock, ETF, or bond, to trading multiple stocks. I’m going to use one of my trading strategies, which trades stocks that have high IC/IR ratios, to demonstrate the differences before and after the changes. Let me know if you’re interested in this trading strategy; I’ll draft another post if it would help.</p><h2 id="1-How-to-add-data-from-multiple-stocks"><a href="#1-How-to-add-data-from-multiple-stocks" class="headerlink" title="1. How to add data from multiple stocks"></a>1. How to add data from multiple stocks</h2><p>The first step in trading multiple stocks is, of course, to add the data of the different symbols to your backtrader <code>cerebro</code>. 
Below is the method the official guide suggests for adding data to your script:<br><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> datetime</span><br><span class="line"><span class="keyword">import</span> backtrader <span class="keyword">as</span> bt</span><br><span class="line"><span class="keyword">import</span> pandas <span class="keyword">as</span> pd</span><br><span class="line"></span><br><span class="line">start_date = datetime.datetime(<span class="number">2022</span>, <span class="number">11</span>, <span class="number">28</span>)</span><br><span class="line">end_date = datetime.datetime(<span class="number">2023</span>, <span class="number">6</span>, <span class="number">12</span>)</span><br><span class="line">cerebro = bt.Cerebro()</span><br><span class="line">df = pd.read_csv(csv_file_path)</span><br><span class="line">data = bt.feeds.PandasData(</span><br><span class="line">    dataname=df,</span><br><span class="line">    fromdate=start_date,</span><br><span class="line">    todate=end_date,</span><br><span class="line">    name=ticker,</span><br><span class="line">    datetime=<span class="number">0</span>,</span><br><span class="line">    close=<span class="number">6</span>,</span><br><span class="line">    high=<span 
class="number">7</span>,</span><br><span class="line">    low=<span class="number">8</span>,</span><br><span class="line">    open=<span class="number">9</span>,</span><br><span class="line">    volume=<span class="number">10</span>,</span><br><span class="line">    openinterest=<span class="number">-1</span>,</span><br><span class="line">)</span><br><span class="line"></span><br><span class="line">cerebro.adddata(data)</span><br></pre></td></tr></table></figure></p><p style="text-align:center; color: grey;">  <i>Method to add data suggested by the official guide</i></p><p>The way I added the data of multiple symbols to <code>cerebro</code> is:</p><ol><li>Stitched all the data into one big <code>csv</code> sheet.</li><li>Added <code>date</code> and <code>ticker</code> columns to mark the date and the symbol of each row.</li><li>Looped through each ticker and sorted by <code>date</code> to add the data needed.</li><li>Passed the <code>name</code> parameter to the loader function to label each data feed.</li></ol><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line">cerebro = bt.Cerebro()</span><br><span class="line"></span><br><span class="line">results_df = pd.read_csv(csv_file_path)</span><br><span class="line">universe = 
[]  <span class="comment"># add symbols to this list</span></span><br><span class="line"><span class="keyword">for</span> ticker <span class="keyword">in</span> results_df[<span class="string">&#x27;ticker&#x27;</span>].unique():</span><br><span class="line">    tmp = results_df[(results_df[<span class="string">&#x27;ticker&#x27;</span>] == ticker) &amp; (results_df[<span class="string">&#x27;date&#x27;</span>] &lt;= end_date) &amp; (results_df[<span class="string">&#x27;date&#x27;</span>] &gt;= start_date)].sort_values(<span class="string">&#x27;date&#x27;</span>)</span><br><span class="line">    df1 = pd.merge(benchmark_framework[<span class="string">&#x27;date&#x27;</span>], tmp, left_on=<span class="string">&#x27;date&#x27;</span>, right_on=<span class="string">&#x27;date&#x27;</span>, how=<span class="string">&#x27;outer&#x27;</span>).fillna(<span class="number">0</span>)</span><br><span class="line">    data = bt.feeds.PandasData(</span><br><span class="line">        dataname=df1,</span><br><span class="line">        fromdate=start_date,</span><br><span class="line">        todate=end_date,</span><br><span class="line">        name=ticker,</span><br><span class="line">        datetime=<span class="number">0</span>,</span><br><span class="line">        close=<span class="number">6</span>,</span><br><span class="line">        high=<span class="number">7</span>,</span><br><span class="line">        low=<span class="number">8</span>,</span><br><span class="line">        open=<span class="number">9</span>,</span><br><span class="line">        volume=<span class="number">10</span>,</span><br><span class="line">        openinterest=<span class="number">-1</span>,</span><br><span class="line">    )</span><br><span class="line"></span><br><span class="line">    cerebro.adddata(data)</span><br></pre></td></tr></table></figure><p style="text-align:center; color: grey;">  <i>Method to add data of multiple symbols</i></p><p>There you go!</p><h2 
id="2-Wait-What-happens-to-my-plotting-visualization"><a href="#2-Wait-What-happens-to-my-plotting-visualization" class="headerlink" title="2. Wait! What happens to my plotting visualization?"></a>2. Wait! What happens to my plotting visualization?</h2><p>One of the features that backtrader is best known for is its built-in plotting, which relies on the <code>matplotlib</code> library. This is how you enable it in your script:<br><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># After you run cerebro.run()</span></span><br><span class="line">cerebro.plot()</span><br></pre></td></tr></table></figure></p><p style="text-align:center; color: grey;">  <i>How to enable the plotting feature</i></p><p>See! It’s that easy! But once you add this line of code to our multi-stock trading script, the result looks like this:</p><img data-src="/2023/08/31/2023-08-31-backtrader-multistocks-backtesting/plot_true.png" class="" width="600"><p style="text-align:center; color: grey;">  <i>The default diagrams when enabling the plotting feature</i></p><p>Obviously, too many price lines are plotted in one single diagram. To make the diagram more readable, a few tweaks are needed:</p><h3 id="2-a-Disable-the-default-data-plotting"><a href="#2-a-Disable-the-default-data-plotting" class="headerlink" title="2.a. Disable the default data plotting"></a>2.a. 
Disable the default data plotting</h3><p>To do this, you have to disable the default plotting features and build your own plots.</p><p>First, you set the <code>plot</code> parameter in the data loader to <em>False</em>.<br><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line">data = bt.feeds.PandasDatar(</span><br><span class="line">        dataname=df1,</span><br><span class="line">        fromdate=start_date,</span><br><span class="line">        todate=end_date,</span><br><span class="line">        name=ticker,</span><br><span class="line">        datetime=<span class="number">0</span>,</span><br><span class="line">        close=<span class="number">6</span>,</span><br><span class="line">        high=<span class="number">7</span>,</span><br><span class="line">        low=<span class="number">8</span>,</span><br><span class="line">        open=<span class="number">9</span>,</span><br><span class="line">        volume=<span class="number">10</span>,</span><br><span class="line">        openinterest=<span class="number">-1</span>,</span><br><span class="line">        plot=<span class="literal">False</span></span><br><span class="line">    )</span><br></pre></td></tr></table></figure></p><h3 id="2-b-Disable-the-default-backtesting-plotting"><a href="#2-b-Disable-the-default-backtesting-plotting" class="headerlink" title="2.b. Disable the default backtesting plotting"></a>2.b. 
Disable the default backtesting plotting</h3><p>Backtrader also embeds a few default plots when you enable the plotting feature. Let’s disable them as well for now.</p><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">cerebro &#x3D; bt.Cerebro(</span><br><span class="line">    stdstats&#x3D;False,</span><br><span class="line">)</span><br></pre></td></tr></table></figure><h3 id="2-c-Add-your-customized-plot"><a href="#2-c-Add-your-customized-plot" class="headerlink" title="2.c. Add your customized plot"></a>2.c. Add your customized plot</h3><p>Lastly, the default plot is designed to show the price/value movement of a single stock. We need to reshape the diagram into the way we want it to be. Let’s add a customized observer to replace the original plot. In this plot, two lines are going to be plotted: the first is our benchmark data, which is set to <code>SPY</code>. The second line is the portfolio value. 
Here’s how we do this:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br></pre></td><td class="code"><pre><span class="line"><span class="class"><span class="keyword">class</span> <span class="title">BenchmarkBroker</span>(<span class="params">bt.Observer</span>):</span></span><br><span class="line">    _stclock = <span class="literal">True</span></span><br><span class="line"></span><br><span class="line">    alias = (<span class="string">&#x27;Value&#x27;</span>,)</span><br><span class="line">    lines = (<span class="string">&#x27;value&#x27;</span>, <span class="string">&#x27;benchmark&#x27;</span>)</span><br><span class="line"></span><br><span class="line">    plotinfo = dict(plot=<span class="literal">True</span>, subplot=<span class="literal">True</span>)</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">def</span> <span class="title">__init__</span>(<span class="params">self, *args, **kwargs</span>):</span></span><br><span class="line">        self.benchmarkFactor = <span class="literal">None</span></span><br><span class="line">        self.i 
= <span class="literal">None</span></span><br><span class="line">        <span class="keyword">if</span> <span class="string">&#x27;benchmark_symbol&#x27;</span> <span class="keyword">in</span> kwargs.keys():</span><br><span class="line">            self.benchmark_symbol = kwargs[<span class="string">&#x27;benchmark_symbol&#x27;</span>].lower()</span><br><span class="line">        <span class="keyword">else</span>:</span><br><span class="line">            self.benchmark_symbol = <span class="string">&#x27;spy&#x27;</span></span><br><span class="line"></span><br><span class="line">        <span class="keyword">for</span> i, data <span class="keyword">in</span> enumerate(self.datas):</span><br><span class="line">            <span class="keyword">if</span> data._name == self.benchmark_symbol:</span><br><span class="line">                self.i = i</span><br><span class="line">                <span class="keyword">break</span></span><br><span class="line">        super(bt.Observer, self).__init__()</span><br><span class="line"></span><br><span class="line">cerebro.addobserver(BenchmarkBroker, benchmark_symbol=<span class="string">&#x27;spy&#x27;</span>)</span><br><span class="line">cerebro.addobserver(</span><br><span class="line">    bt.observers.Benchmark,</span><br><span class="line">    data=benchmark_data,</span><br><span class="line">)</span><br></pre></td></tr></table></figure><p style="text-align:center; color: grey;">  <i>Customized observer class to show both portfolio value and benchmark value</i></p><p>In the end, you’ll get the following plot after running your backtest:</p><img data-src="/2023/08/31/2023-08-31-backtrader-multistocks-backtesting/plot_false.png" class="" width="600"><p style="text-align:center; color: grey;">  <i>The final plotting we just built to track the movement of our portfolio value against the benchmark value</i></p><h2 id="3-Hey-The-start-date-and-the-end-date-won’t-match-my-customized-plot"><a 
href="#3-Hey-The-start-date-and-the-end-date-won’t-match-my-customized-plot" class="headerlink" title="3. Hey! The start date and the end date won’t match my customized plot!"></a>3. Hey! The start date and the end date won’t match my customized plot!</h2><p>I’ve been running backtests over and over again, and this issue bothered me for a long time. Initially, I suspected that the cause was missing price data, as some of the symbols had either been delisted or acquired by other corporations. So I tried not to import them into <code>cerebro</code> during the data importing stage. In the end, I found out that doing so introduced <a href="https://unacademy.com/content/jee/study-material/physics/what-is-forward-bias/"><strong><em>Forward bias</em></strong></a> into the algorithm. To eliminate this bias, I dug deeper and found this implicit rule hidden in the <code>backtrader</code> library:</p><ul><li>Scenario 1<ul><li>Data0 starts on 2018-01-02 and ends on 2018-01-30</li><li>Data1 starts on 2018-01-02 and ends on 2018-02-07</li><li>The missing stock prices in Data0 from 2018-01-31 to 2018-02-07 will be filled with the price of 2018-01-30</li></ul></li></ul><img data-src="/2023/08/31/2023-08-31-backtrader-multistocks-backtesting/missing_data_scenario_1.png" class="" width="600"><p style="text-align:center; color: grey;">  <i>Missing data scenario 1</i></p><ul><li>Scenario 2<ul><li>Data0 starts on 2018-01-02 and ends on 2018-02-07</li><li>Data1 starts on 2018-01-09 and ends on 2018-02-07</li><li>The data from 2018-01-02 to 2018-01-08 in <strong>Data0</strong> will be discarded because <strong>Data1</strong> has no data in this period. 
Therefore, the backtest won’t be performed during this particular period.</li></ul></li></ul><img data-src="/2023/08/31/2023-08-31-backtrader-multistocks-backtesting/missing_data_scenario_2.png" class="" width="600"><p style="text-align:center; color: grey;">  <i>Missing data scenario 2</i></p><p>Apparently, the backtesting period can be limited by the symbol that has the fewest data rows. However, we need to make sure that the stocks staying in our trading universe can still be backtested alongside the stocks that were delisted early or added later. I came up with a trick to pull this off:</p><h3 id="3-a-Use-price-data-of-SPY-to-produce-the-trading-calendar"><a href="#3-a-Use-price-data-of-SPY-to-produce-the-trading-calendar" class="headerlink" title="3.a. Use price data of SPY to produce the trading calendar"></a>3.a. Use price data of SPY to produce the trading calendar</h3><p>The first step is to use SPY to retrieve a series of trading dates. Then we use this trading calendar as the index for the data of every symbol, so that every symbol shares the same index.</p><h3 id="3-b-Fill-NA"><a href="#3-b-Fill-NA" class="headerlink" title="3.b. Fill NA"></a>3.b. Fill NA</h3><p>After resetting the index for every symbol, a lot of empty cells are generated in your pandas dataframe, and we need to fill them with some value. Which value should we use? In <code>backtrader</code>, using <code>None</code> or <code>NA</code> would cause a lot of problems. Therefore, I chose <code>0</code>, which is easy to identify and process.</p><h3 id="3-c-Add-0-handling-logic-into-your-trading-script"><a href="#3-c-Add-0-handling-logic-into-your-trading-script" class="headerlink" title="3.c. Add 0 handling logic into your trading script"></a>3.c. 
Add 0 handling logic into your trading script</h3><p>The last step is to adjust the logic that processes your <code>buy</code>, <code>sell</code>, and <code>close</code> actions. Make sure you don’t place any orders when the price equals <code>0</code>. It’s that easy to resolve this issue.</p><h2 id="4-What-I-don’t-have-enough-money-again-No-way"><a href="#4-What-I-don’t-have-enough-money-again-No-way" class="headerlink" title="4. What? I don’t have enough money again? No way!"></a>4. What? I don’t have enough money again? No way!</h2><p>Once you have all the processes above set, you can start conducting your backtest and filling in your trading strategy. In my case, after I started running my backtest, one message showed up multiple times and bothered me the most: <code>&quot;[ticker] Order rejected due to Margin order status&quot;</code>. I checked here and there to see if there was anything wrong with my trading logic. I made sure that I placed the sell orders first to release cash, and only then placed the buy orders. I also made sure the size of each order wouldn’t exceed my available cash. How in the world did I still get this message all the time?</p><p>I’ve found two main reasons that cause this issue:</p><h3 id="4-a-The-time-to-execute-the-order"><a href="#4-a-The-time-to-execute-the-order" class="headerlink" title="4.a. The time to execute the order"></a>4.a. The time to execute the order</h3><p>To clearly understand the order execution logic of <code>backtrader</code>, let’s refer to the <a href="https://www.backtrader.com/docu/order/">backtrader’s documentation page</a>:</p><blockquote><p>Order.Market: A market order will be executed with the next available price. In backtesting it will be the opening price of the next bar.</p></blockquote><p>So the problem becomes clear. Yesterday’s close price can have a gap against today’s open price. 
Because of this rule, the actual cost of an order can end up higher than the cost estimated from the previous close. In reality, quantitative traders who use daily pricing data to pick the stocks to buy on the next day would place their orders the next morning, so the estimated cost wouldn’t deviate much from the actual cost. So how do we produce an effect that is similar to what happens in real life?</p><p><strong><em>Cheat On Open</em></strong> is mentioned in the <a href="https://www.backtrader.com/docu/cerebro/cheat-on-open/cheat-on-open/">backtrader’s documentation page</a>.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">cerebro = bt.Cerebro(</span><br><span class="line">    stdstats=<span class="literal">False</span>,</span><br><span class="line">    <span class="comment"># Add the following parameter to enable &quot;cheat_on_open&quot; feature</span></span><br><span class="line">    cheat_on_open=<span class="literal">True</span>,</span><br><span class="line">)</span><br></pre></td></tr></table></figure><p>Enabling the <code>cheat_on_open</code> feature with the parameter above gives you access to an extra method named <code>next_open</code>, whose timing is similar to entering the market using only the open price to place your orders.</p><img data-src="/2023/08/31/2023-08-31-backtrader-multistocks-backtesting/backtrader_process.png" class="" width="1000"><p style="text-align:center; color: grey;">  <i>The process after enabling <b>"cheat_on_open"</b> feature</i></p><p>By doing this, you’ll be able to use the open price of the next day to place the buy/sell order, which greatly reduces the impact of the overnight price gap. 
This also helps lower the possibility of receiving the  <code>&quot;[ticker] Order rejected due to Margin order status&quot;</code> prompt message.</p><h2 id="4-b-Available-cash-insufficient"><a href="#4-b-Available-cash-insufficient" class="headerlink" title="4.b. Available cash insufficient"></a>4.b. Available cash insufficient</h2><p>There is another scenario that triggers the Margin order state. As shown in the <code>backtrader</code> process diagram above, your orders are usually placed inside a <strong>for loop</strong> that goes through each symbol in the <code>next</code> or <code>next_open</code> stage. Yet the orders you place only get executed when you reach the <code>notify_order</code> and <code>notify_store</code> stage. That is to say, your available cash won’t update while you place orders in your <strong>for loop</strong>. If your trading strategy requires you to optimize capital utilization by holding minimum cash, your available cash is very likely to be insufficient to place any buy order, even if you try to release cash by selling the stocks you hold.</p><p>To make sure your strategy won’t have any glitches while placing buy and sell orders on the same day, you need to <strong>keep track of your available cash at all times</strong>.</p><h2 id="5-Use-hot-data-over-cold-data"><a href="#5-Use-hot-data-over-cold-data" class="headerlink" title="5. Use hot data over cold data"></a>5. Use hot data over cold data</h2><p>In your trading script, you might need to load a huge pandas dataframe with a lot of columns and factors to support your algorithm. Loading this much data into your <code>backtrader</code> script takes a lot of time. From my point of view, I would rather complete many backtests as fast as possible. Therefore, I would keep a copy of my data and load it into the trading script as the hot data. 
In that case, you can save time while loading data into <code>backtrader</code> to be processed a second time.</p><hr><p>These are the things that I found when transforming <code>backtrader</code> into a multiple-stock trading framework. Hope this helps you to build your backtesting tool. Cheers.</p><h1 id="Misc-Source-code"><a href="#Misc-Source-code" class="headerlink" title="Misc - Source code"></a>Misc - Source code</h1><script src="https://gist.github.com/mikelhsia/c92ba87c16200b1b98ac285b2d76f3a2.js"></script>]]></content>
    
    
    <summary type="html">&lt;img data-src=&quot;/2023/08/31/2023-08-31-backtrader-multistocks-backtesting/cover.jpeg&quot; class=&quot;&quot; width=&quot;1000&quot;&gt;
&lt;p style=&quot;text-align:center; color: grey;&quot;&gt;
  &lt;i&gt;Photo by &lt;a href=&#39;https://unsplash.com/@priscilladupreez?utm_source=medium&amp;utm_medium=referral&#39;&gt;Priscilla Du Preez&lt;/a&gt; on &lt;a href=&#39;https://unsplash.com/?utm_source=medium&amp;utm_medium=referral&#39;&gt;Unsplash&lt;/a&gt;&lt;/i&gt;
&lt;/p&gt;

&lt;p&gt;Backtrader is a well-known Python open-source library to backtest your quantitative trading strategy. Most of its components can support trading against one single trading target. To step up the game to trade against multiple stocks, there are a few things that need to be fine-tuned to make sure the trading strategy would trade as you expected. In this post, I’m going to share my experience and crucial tips with you as a starting point to build your own.&lt;/p&gt;</summary>
    
    
    <category term="Factor Analysis" scheme="http://mikelhsia.github.io/categories/Factor-Analysis/"/>
    
    <category term="Quantitative Trading" scheme="http://mikelhsia.github.io/categories/Factor-Analysis/Quantitative-Trading/"/>
    
    <category term="How2" scheme="http://mikelhsia.github.io/categories/Factor-Analysis/Quantitative-Trading/How2/"/>
    
    
    <category term="Research" scheme="http://mikelhsia.github.io/tags/Research/"/>
    
    <category term="Backtesting" scheme="http://mikelhsia.github.io/tags/Backtesting/"/>
    
    <category term="Python3" scheme="http://mikelhsia.github.io/tags/Python3/"/>
    
  </entry>
  
  <entry>
    <title>【Pair Trading】 Complete Guide to Backtest Cointegration Pair Trading Strategy</title>
    <link href="http://mikelhsia.github.io/2023/04/26/2023-05-01-pair-trading-cointegration-part2/"/>
    <id>http://mikelhsia.github.io/2023/04/26/2023-05-01-pair-trading-cointegration-part2/</id>
    <published>2023-04-26T04:48:46.000Z</published>
    <updated>2023-05-08T13:36:22.987Z</updated>
    
    <content type="html"><![CDATA[<img data-src="/2023/04/26/2023-05-01-pair-trading-cointegration-part2/cover.jpeg" class="" width="1000"><p style="text-align:center; color: grey;">  <i>Photo by <a href='https://medium.com/r/?url=https%3A%2F%2Funsplash.com%2F%40aaronburden%3Futm_source%3Dmedium%26utm_medium%3Dreferral'>Aaron Burden</a> on <a href='https://medium.com/r/?url=https%3A%2F%2Funsplash.com%3Futm_source%3Dmedium%26utm_medium%3Dreferral'>Unsplash</a></i></p><p>In the <a href="https://mikelhsia.github.io/2023/02/25/2023-02-25-pair-trading-cointegration-part1/">last post</a>, we learned the basics of the pair trading strategy and how to use cointegration to identify potentially tradable stock pairs. The theories and the math formulas seem promising and convincing enough for us to believe it’s a profitable and stable trading strategy. But is it? In order to check the profitability and effectiveness of this strategy, we need to backtest it to simulate real-world scenarios.</p><a id="more"></a><hr><p>Become a <a href="https://medium.com/@mikelhsia/membership">Medium member</a> to continue learning without limits. 
I’ll receive a small portion of your membership fee if you use the following link, at no extra cost to you.</p><blockquote><p>If you enjoy reading this and my other articles, feel free to <a href='https://medium.com/@mikelhsia/membership'> join Medium membership program</a> to read more about Quantitative Trading Strategy.</p></blockquote><hr><p><strong><em>Previous research</em></strong></p><ul><li><a href="https://mikelhsia.github.io/2023/02/25/2023-02-25-pair-trading-cointegration-part1/">【Pair Trading】 Cointegration Test - A Key to Find High Probability Trading Strategy</a></li></ul><h1 id="Recap"><a href="#Recap" class="headerlink" title="Recap"></a>Recap</h1><p>Let’s pick up where we left off.</p><p>In the <a href="https://mikelhsia.github.io/2023/02/25/2023-02-25-pair-trading-cointegration-part1/">last post</a>, we spent time explaining the basic concepts of the cointegration pair trading strategy, such as what cointegration means, what stationarity means, and how we profit from these concepts. We chose the <em><a href="https://mpra.ub.uni-muenchen.de/75967/1/MPRA_paper_75967.pdf">Engle-Granger 2-step approach</a></em> as the method to inspect the level of cointegration between two time series. Once we find the stock pairs that have a higher probability of staying cointegrated, we can start monitoring the co-movement of their prices and make trades when the pairs temporarily deviate from each other.</p><p>We’ve also revealed the preliminary trading rules of the cointegration pair trading strategy. 
We</p><ul><li>Open a long position if the current spread is smaller than the mean of the spread $\mu - threshold * \sigma$</li><li>Close a long position if the current spread is bigger than the mean of the spread $\mu$</li><li>Open a short position if the current spread is bigger than the mean of the spread $\mu + threshold * \sigma$</li><li>Close a short position if the current spread is smaller than the mean of the spread $\mu$</li></ul><p><em>Where</em></p><ul><li>The <strong>mean of the residuals</strong> ($\mu$) as the benchmark line in our residual observation</li><li>The <strong>standard deviation of the residuals</strong> ($\sigma$) to calculate the trigger line in our residual observation</li><li>The <strong>threshold</strong> would be 2.32, indicating a 99% of the confidence level</li></ul><p>Even if we’ve done a lot of research to learn as much as we can about the pair trading method, there are some factors we can’t avoid in real-life settings. As a result, this is where backtesting comes into play. Conducting a successful backtest would mean a lot to simulate what is going to happen if you throw your trading strategy into the wild and unpredictable stock market.</p><h1 id="Build-our-trading-rules"><a href="#Build-our-trading-rules" class="headerlink" title="Build our trading rules"></a>Build our trading rules</h1><p>To conduct a backtest, we first need to set up the ground for this pair trading strategy. 
Here are a few trading rules that I have put together.</p><p><a id='pair_formation'></a></p><h2 id="Trading-pair-formation"><a href="#Trading-pair-formation" class="headerlink" title="Trading pair formation"></a>Trading pair formation</h2><ol><li>Selecting around 500 stocks based on their financial fundamental data to find companies that are stable and relatively financially healthy.</li><li>Obtaining the <strong>daily close price from the past two years</strong> for every stock that we picked.</li><li>Using <code>scipy.stats.pearsonr(Series_A, Series_B)</code> to calculate the <em>Pearson correlation</em>, and choosing the stock pairs whose correlation value is bigger than <code>0.9</code> and whose p-value is smaller than <code>0.05</code>.</li><li>Using the <strong>Engle-Granger 2-step approach</strong> to examine every remaining pair<ol><li>Using <code>sm.OLS(Series_B as y, Series_A as x).fit()</code> to get the beta, intercept, and residual of each pair</li><li>Using <code>statsmodels.tsa.stattools.adfuller(model.resid, autolag = &#39;BIC&#39;)</code> to evaluate how stationary the residual of this pair is</li><li>Eliminating the stock pairs whose cointegration p-value is bigger than <code>0.05</code>, which indicates that the stationarity of the residual is not statistically significant.</li></ol></li><li>Lastly, we sort these pairs by their cointegration p-value, <strong>the smaller the better</strong>.</li><li>We repeat this process <strong><em>every month</em></strong> in order to closely follow the cointegration status of the potentially tradable stock pairs.</li></ol><h2 id="Monitoring-and-trading"><a href="#Monitoring-and-trading" class="headerlink" title="Monitoring and trading"></a>Monitoring and trading</h2><p>Since the pairs have been filtered and sorted by their level of correlation and cointegration, we consider these stock pairs to be our <strong>stock pair universe</strong>. 
Following the methodology described in the <a href="https://mikelhsia.github.io/2023/02/25/2023-02-25-pair-trading-cointegration-part1/">previous post</a>, we are already using the data of the past two years to build the upper and lower bands. A few more details need to be defined in order to complete our trading strategy.</p><img data-src="/2023/04/26/2023-05-01-pair-trading-cointegration-part2/epsilon_channel.png" class="" width="600"><p style="text-align:center; color: grey;">  <i>The upper and lower bands built from the residual of the OLS results</i></p><h3 id="Observing-frequency"><a href="#Observing-frequency" class="headerlink" title="Observing frequency"></a>Observing frequency</h3><p>We examine the status of the pairs in our observing universe and the pairs that we have already traded every 15 minutes. Since we’re using the daily close price to form the trading pairs and to trade accordingly, I presume that 15 minutes is a reasonable interval to inspect the status of the pairs.</p><h3 id="Enter-and-exit-signals"><a href="#Enter-and-exit-signals" class="headerlink" title="Enter and exit signals"></a>Enter and exit signals</h3><p>The basic idea of signal generation has been stated in the <strong>Recap</strong> section above, so I won’t waste your time by rewriting it here.</p><h3 id="Stop-loss-and-stop-gain"><a href="#Stop-loss-and-stop-gain" class="headerlink" title="Stop loss and stop gain"></a>Stop loss and stop gain</h3><p>In the post <a href="https://mikelhsia.github.io/2022/10/21/2022-10-15-meta-label/">【Momentum Trading】Use machine learning to boost your day trading skill - meta-labeling</a>, we learned the idea of using the <strong>Triple Barrier Method (TBM)</strong> to control our gain/loss ratio. In this pair trading strategy, I’m using <strong>2:1</strong> as our gain/loss ratio. This <strong>2:1</strong> ratio essentially means that we tolerate a maximum loss per trade of 50% of our expected gain. 
For example, under our trading rules we close a pair when its residual returns to the mean ($\mu$), which sits at the level of 0. Since we enter the trade at 2.32, the stop gain is 2.32. Then we define our stop loss to be at the level of <code>2.32 + 2 * (2.32 - 0) / 2 = 4.64</code> and exit our trade there in order to prevent a greater loss when the two stock prices deviate further.</p><p>Now that the ground rules for our trading strategy have been set, I’m going to run the first round of backtest to see how it performs and to check the profitability of this strategy.<br><img data-src="/2023/04/26/2023-05-01-pair-trading-cointegration-part2/trial.png" class="" width="800"></p><p style="text-align:center; color: grey;">  <i>First-round backtest of the pair trading strategy</i></p><p><strong>Wow!</strong> At first glance, the performance is quite impressive and satisfying. However, if we take a closer look, some logical mistakes hide behind this backtest result. In the bottom chart <code>HeldPositions</code>, the number of long positions and the number of short positions should be equal at all times, as each pair we trade includes one long stock and one short stock. Yet in the red circles, you can tell that the number on one side decreased while the other side didn’t. 
This would leave our portfolio exposed to risk as our positions were not hedged properly, increasing the odds of losing money on such an anomaly in our portfolio.</p><img data-src="/2023/04/26/2023-05-01-pair-trading-cointegration-part2/trial_diagnostic.png" class="" width="800"><p style="text-align:center; color: grey;">  <i>Red circles indicate weird things happened in our trading activities</i></p><h2 id="Several-scenarios-we-need-to-consider-and-address"><a href="#Several-scenarios-we-need-to-consider-and-address" class="headerlink" title="Several scenarios we need to consider and address"></a>Several scenarios we need to consider and address</h2><p>After looking into the log messages of the backtest, I found several loopholes in my trading strategy:<br><img data-src="/2023/04/26/2023-05-01-pair-trading-cointegration-part2/trial_diagnostic_w_log.png" class="" width="800"></p><p style="text-align:center; color: grey;">  <i>Incorrect trading activities that need to be taken care of properly</i></p><h3 id="Hazard-1-Margin-call"><a href="#Hazard-1-Margin-call" class="headerlink" title="Hazard 1: Margin call"></a>Hazard 1: Margin call</h3><p>According to the trading log message, the first red circle was due to a <strong>margin call</strong> being executed to restore the remaining margin in the margin account. In order to trade stock options or to short-sell assets, brokers require you to keep your equity above a certain percentage so that you demonstrate your capability to cover the potential loss of your current investment. 
Once such a loss occurs and your equity falls under this percentage, you will receive a <strong>margin call</strong> notice from the broker that requires you to sell a part of your investment and turn it into cash to raise the percentage of equity.</p><p>In our backtest, the stock price of <code>TSLA</code> declined drastically, which brought the percentage of equity below the required level. That’s why we were forced to sell investments and turn them into cash. The point is that we don’t want to sell just <strong>ANY</strong> asset in our portfolio. Instead, we need to make sure we sell <strong>a pair of assets</strong> to maintain our market neutrality. In the meantime, we also need to decide which pair to liquidate so that it has the least impact on our trading strategy. In the paper <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1330689">An Anatomy of Pairs Trading: The Role of Idiosyncratic News, Common Information and Liquidity</a>, the experiments demonstrate that the probability and profitability of a pair increase day by day after the deviation has been detected, and then start to decline after reaching peak performance on day six. Therefore, I made the assumption that the longer a pair has been held, the lower the probability for this pair to be profitable. 
So when a margin call occurs, we need to sell the pair that we have held the longest.</p><img data-src="/2023/04/26/2023-05-01-pair-trading-cointegration-part2/time_impact_profitability.png" class="" width="800"><p style="text-align:center; color: grey;">  <i>The profitability dwindles as the days pass</i></p><h3 id="Hazard-2-Fails-to-place-two-orders-one-long-and-one-short-simultaneously"><a href="#Hazard-2-Fails-to-place-two-orders-one-long-and-one-short-simultaneously" class="headerlink" title="Hazard 2: Fails to place two orders (one long and one short) simultaneously"></a>Hazard 2: Fails to place two orders (one long and one short) simultaneously</h3><p>The other two incidents in our <code>HeldPositions</code> plot were caused by <strong>Insufficient Buying Power</strong>. <strong>Buying power</strong> is a concept that is quite easy to understand but relatively complex to calculate. According to <a href="https://www.interactivebrokers.co.uk/en/?f=%2Fen%2Fgeneral%2Feducation%2Fpdfnotes%2FWN-UnderstandingIBMargin.php#:~:text=Buying%20Power%20%E2%80%93%20value%20of%20securities,of%20held%20stock%20as%20collateral">Understanding IB Margin Webinar Notes</a>, buying power is defined as follows:</p><blockquote><p><strong>Buying Power</strong></p><p>Is the value of securities you can purchase without depositing additional funds. In cash accounts this is the settled cash. In a margin account, buying power is increased through the use of leverage using cash and the value of held stock as collateral. The amount of leverage depends upon whether you have a Reg. T Margin or Portfolio Margin account. Active traders can take advantage of reduced intraday margin for securities – generally 25% of the long stock value. 
But keep in mind this requirement reverts to the Reg T 50% of stock value to hold overnight.<br>$\text{Cash Account Buying Power} = \min(\text{Equity with Loan Value}, \text{Previous Day Equity with Loan Value}) - \text{Initial Margin}$<br>$\text{Margin Account Buying Power} = \text{Cash Account Buying Power} * \text{Leverage Ratio}$</p><p style="text-align:center; color: grey;">  <i>Formulas to calculate the Buying Power</i></p></blockquote><p>Since buying power limits the orders we can place with a margin account, we will run into the scenario where one order of the pair is executed successfully while the other fails because the day’s buying power is insufficient. Unfortunately, <a href="https://www.quantconnect.com">QuantConnect</a> doesn’t provide a function to check the current buying power in real-time. Therefore, we also need to come up with a solution to remedy this problem.</p><img data-src="/2023/04/26/2023-05-01-pair-trading-cointegration-part2/fixing_the_issue_of_the_backtest.png" class="" width="600"><p style="text-align:center; color: grey;">  <i>Backtest result after fixing the loopholes found</i></p><p>We have spotted two different scenarios that might endanger the profitability of our trading strategy, and we have addressed them with theoretically grounded methods. Looking back at the plot <code>HeldPositions</code>, the numbers of long positions and short positions are now the same at all times. Now we can move on to the last part of the backtest.</p><h1 id="Backtest"><a href="#Backtest" class="headerlink" title="Backtest"></a>Backtest</h1><p>Even if we have successfully replicated a pair trading technique that has been profitable for the past two years, a single backtest cannot guarantee that the strategy will perform similarly in the real world. However, we can use the backtest as a research tool to determine which parameters could potentially improve the win rate and Sharpe Ratio. 
This is the so-called <strong>Hyperparameter Optimization</strong>. In this section, I’m going to run backtests against several scenarios to answer the following questions:</p><ol><li>Should we use the close price or log(close price) when calculating the cointegration parameters and epsilon?</li><li>Is the profitability impacted by the holding period, as stated in <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1330689">An Anatomy of Pairs Trading: The Role of Idiosyncratic News, Common Information and Liquidity</a>?<br>Therefore, I’m going to create <strong>eight</strong> scenarios using <code>price/log(price)</code> and closing the pair trade after <code>10/22/132/264 days</code>, meaning we close the trade after 10 days, one month, six months, or one year.</li></ol><p>Furthermore, aside from the standard KPIs such as Total Return, Sharpe Ratio, and MaxDrawDown, other KPIs such as Win Rate will be incorrect because they are computed per single stock and always come out at around 50%. 
Hence I wrote a script to process the order records into a pair-wise order record and visualize the pair-wise performance.</p><h2 id="Platform"><a href="#Platform" class="headerlink" title="Platform"></a>Platform</h2><p><a href="https://www.quantconnect.com">QuantConnect</a></p><h2 id="Backtest-Periods"><a href="#Backtest-Periods" class="headerlink" title="Backtest Periods"></a>Backtest Periods</h2><p>2020/12/27 - 2023/03/03</p><h2 id="Backtest-Universe"><a href="#Backtest-Universe" class="headerlink" title="Backtest Universe"></a>Backtest Universe</h2><p>As stated in the section <a href='#pair_formation'>Pair formation</a></p><h2 id="Backtest-benchmark"><a href="#Backtest-benchmark" class="headerlink" title="Backtest benchmark"></a>Backtest benchmark</h2><p>SPDR S&amp;P 500 ETF Trust (SPY)</p><h2 id="Backtest-Results"><a href="#Backtest-Results" class="headerlink" title="Backtest Results"></a>Backtest Results</h2><h3 id="Strategy-wise-performance"><a href="#Strategy-wise-performance" class="headerlink" title="Strategy-wise performance"></a>Strategy-wise performance</h3><div class="table-container"><table><thead><tr><th></th><th>Using close price</th><th>Using log(close price)</th></tr></thead><tbody><tr><td>Close after 10 days</td><td><img data-src="/2023/04/26/2023-05-01-pair-trading-cointegration-part2/backtest_price_10.png" class="" width="600"></td><td><img data-src="/2023/04/26/2023-05-01-pair-trading-cointegration-part2/backtest_log_price_10.png" class="" width="600"></td></tr><tr><td>Close after one month</td><td><img data-src="/2023/04/26/2023-05-01-pair-trading-cointegration-part2/backtest_price_22.png" class="" width="600"></td><td><img data-src="/2023/04/26/2023-05-01-pair-trading-cointegration-part2/backtest_log_price_22.png" class="" width="600"></td></tr><tr><td>Close after six months</td><td><img data-src="/2023/04/26/2023-05-01-pair-trading-cointegration-part2/backtest_price_132.png" class="" width="600"></td><td><img 
data-src="/2023/04/26/2023-05-01-pair-trading-cointegration-part2/backtest_log_price_132.png" class="" width="600"></td></tr><tr><td>Close after one year</td><td><img data-src="/2023/04/26/2023-05-01-pair-trading-cointegration-part2/backtest_price_264.png" class="" width="600"></td><td><img data-src="/2023/04/26/2023-05-01-pair-trading-cointegration-part2/backtest_log_price_264.png" class="" width="600"></td></tr></tbody></table></div><p style="text-align:center; color: grey;">  <i>Strategy-wise performance of every scenario</i></p><p>Once I put the backtest results together, it’s quite easy to notice that the performance of the pair trading strategy does conform to the curve we saw above, though not exactly on the days shown in the plot. The portfolio returns of the scenarios that close after six months and one year are comparatively lower than those of the scenarios that close after 10 days and one month. The scenarios with the highest portfolio returns all close the trade after <strong>one month</strong>, whether they use close price or log(close price).</p><p>On the other hand, the scenarios using close price don’t seem to have an obvious edge over the scenarios using log(close price). I guess we need more parameters and more backtests in order to find out whether a difference exists.</p><h3 id="Pair-wise-performance"><a href="#Pair-wise-performance" class="headerlink" title="Pair-wise performance"></a>Pair-wise performance</h3><p>Before we shift our focus to this part, there are a few types of labels that I need to explain beforehand:</p><ol><li><strong>Normal Close</strong>: These are the orders that received the sell signal before hitting the holding period limit, which plays the same role as the vertical time bar of the <strong>Triple Barrier Method (TBM)</strong> (see <a href="https://mikelhsia.github.io/2022/10/21/2022-10-15-meta-label/">here</a> for more details). 
In the best case, the more <strong>Normal Close</strong> orders we see the better, as they are the orders that follow the central idea of pair trading: producing a profit by buying when two stock prices deviate and selling when the two stock prices converge again.</li><li><strong>Early Close</strong>: This category represents the orders that have not yet received a sell signal but have hit the vertical time bar. We consider the tendency of <strong>Early Close</strong> orders to converge to be weaker than it used to be, so we close them before they converge in exchange for other pairs that have a higher chance of converging. These orders can be at a profit or at a loss when we close them, and they carry higher uncertainty than <strong>Normal Close</strong> orders. We use <code>+</code> to label a pair trade that generated a positive profit and <code>-</code> to label a trade whose profit is negative.</li><li><strong>Stop Loss Close</strong>: These orders are closed because their <em>epsilons</em> are getting too big or too small, indicating that the prices of the stocks in the pair are starting to diverge further. To avoid a huge loss, we close this type of order before we hit any other bands. 
We use <code>+</code> to label a pair trade that generated a positive profit and <code>-</code> to label a trade whose profit is negative.</li></ol><div class="table-container"><table><thead><tr><th></th><th>Using close price</th><th>Using log(close price)</th></tr></thead><tbody><tr><td>Close after 10 days</td><td><img data-src="/2023/04/26/2023-05-01-pair-trading-cointegration-part2/price_10_dist.png" class="" width="600"></td><td><img data-src="/2023/04/26/2023-05-01-pair-trading-cointegration-part2/log_price_10_dist.png" class="" width="600"></td></tr><tr><td>Close after one month</td><td><img data-src="/2023/04/26/2023-05-01-pair-trading-cointegration-part2/price_22_dist.png" class="" width="600"></td><td><img data-src="/2023/04/26/2023-05-01-pair-trading-cointegration-part2/log_price_22_dist.png" class="" width="600"></td></tr><tr><td>Close after six months</td><td><img data-src="/2023/04/26/2023-05-01-pair-trading-cointegration-part2/price_132_dist.png" class="" width="600"></td><td><img data-src="/2023/04/26/2023-05-01-pair-trading-cointegration-part2/log_price_132_dist.png" class="" width="600"></td></tr><tr><td>Close after one year</td><td><img data-src="/2023/04/26/2023-05-01-pair-trading-cointegration-part2/price_264_dist.png" class="" width="600"></td><td><img data-src="/2023/04/26/2023-05-01-pair-trading-cointegration-part2/log_price_264_dist.png" class="" width="600"></td></tr></tbody></table></div><p style="text-align:center; color: grey;">  <i>Pair-wise performance of every scenario</i></p><p>From the charts above, you can tell that the volume of the <code>Stop Loss orders</code> increases as the holding period increases. That somewhat corroborates the inference stated in the paper <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1330689">An Anatomy of Pairs Trading: The Role of Idiosyncratic News, Common Information and Liquidity</a> that the profits of pair trading decrease with longer holding periods. 
In the meantime, the volume of the <code>Normal Close</code> and <code>Early Close (+)</code> orders remains fairly stable. The full distribution charts above show that a cointegrated pair has a lower probability of converging once a specific amount of time has passed. The win rate table below further backs up this conclusion.</p><div class="table-container"><table><thead><tr><th></th><th>Early Close Win Rate</th><th>Total Win Rate</th></tr></thead><tbody><tr><td>price_10_dist</td><td>60.83%</td><td>50.43%</td></tr><tr><td>price_22_dist</td><td>66.67%</td><td>44.13%</td></tr><tr><td>price_132_dist</td><td>50.00%</td><td>38.31%</td></tr><tr><td>price_264_dist</td><td>N/A</td><td>38.02%</td></tr><tr><td>log_price_10_dist</td><td>61.31%</td><td>51.22%</td></tr><tr><td>log_price_22_dist</td><td>73.63%</td><td>54.95%</td></tr><tr><td>log_price_132_dist</td><td>75.00%</td><td>40.56%</td></tr><tr><td>log_price_264_dist</td><td>100.00%</td><td>35.33%</td></tr></tbody></table></div><p style="text-align:center; color: grey;">  <i>Pair-wise performance of every scenario II</i></p><h1 id="Take-away"><a href="#Take-away" class="headerlink" title="Take away"></a>Take away</h1><p>In this post, I have shown you step by step how to implement cointegration pair trading in detail. We detected two potential defects that would affect our trading strategy once the script is released to the live environment, and applied theory-based, trustworthy solutions to mitigate their consequences. Lastly, we used the complete backtest to confirm that the profitability of each trading pair does vary with the holding period, so choosing it carefully helps safeguard your strategy from being sabotaged by the passage of time. 
I hope you enjoyed reading this post. Let me know if there is any parameter you would like me to explore that might impact the performance of the cointegration pair trading strategy.</p><h1 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h1><ul><li><a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1330689">An Anatomy of Pairs Trading: The Role of Idiosyncratic News, Common Information and Liquidity</a></li><li><a href="http://epchan.blogspot.com/2013/11/cointegration-trading-with-log-prices.html">Cointegration Trading with Log Prices vs. Prices</a></li><li><a href="https://ibkr.info/node/2085/">Determining Buying Power</a></li></ul>]]></content>
    
    
    <summary type="html">&lt;img data-src=&quot;/2023/04/26/2023-05-01-pair-trading-cointegration-part2/cover.jpeg&quot; class=&quot;&quot; width=&quot;1000&quot;&gt;
&lt;p style=&quot;text-align:center; color: grey;&quot;&gt;
  &lt;i&gt;Photo by &lt;a href=&#39;https://medium.com/r/?url=https%3A%2F%2Funsplash.com%2F%40aaronburden%3Futm_source%3Dmedium%26utm_medium%3Dreferral&#39;&gt;Aaron Burden&lt;/a&gt; on &lt;a href=&#39;https://medium.com/r/?url=https%3A%2F%2Funsplash.com%3Futm_source%3Dmedium%26utm_medium%3Dreferral&#39;&gt;Unsplash&lt;/a&gt;&lt;/i&gt;
&lt;/p&gt;

&lt;p&gt;In the &lt;a href=&quot;https://mikelhsia.github.io/2023/02/25/2023-02-25-pair-trading-cointegration-part1/&quot;&gt;last post&lt;/a&gt;, we learned the basics of performing the pair trading strategy and using cointegration as a method to identify the potential tradable stocks pair. All the theories and the math formulas are so seemingly promising and convincing enough for us to believe it’s a profitable and stable trading strategy. But is it? In order to test and check the profitability and effectiveness of this strategy, we need to backtest this trading strategy to simulate real-world scenarios.&lt;/p&gt;</summary>
    
    
    <category term="Quantitative Trading" scheme="http://mikelhsia.github.io/categories/Quantitative-Trading/"/>
    
    
    <category term="Strategy" scheme="http://mikelhsia.github.io/tags/Strategy/"/>
    
    <category term="Backtesting" scheme="http://mikelhsia.github.io/tags/Backtesting/"/>
    
    <category term="Pair Trading" scheme="http://mikelhsia.github.io/tags/Pair-Trading/"/>
    
  </entry>
  
  <entry>
    <title>【Pair Trading】 Cointegration Test - A Key to Find High Probability Trading Strategy</title>
    <link href="http://mikelhsia.github.io/2023/02/25/2023-02-25-pair-trading-cointegration-part1/"/>
    <id>http://mikelhsia.github.io/2023/02/25/2023-02-25-pair-trading-cointegration-part1/</id>
    <published>2023-02-25T05:40:39.000Z</published>
    <updated>2023-03-03T16:51:59.682Z</updated>
    
    <content type="html"><![CDATA[<img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/cover.png" class="" width="600"><p>Cointegration is a statistical technique to find out whether a time series closely follows the movement of the other time series. Therefore, it becomes an important technique in the pair trading strategy for us to determine the right stock pair to trade with. In this post, we’re going to see why traders prefer using the cointegration test over the correlation test in pair trading, and whether the cointegration test results can boost our trading performance.</p><a id="more"></a><hr><p>Become a <a href="https://medium.com/@mikelhsia/membership">Medium member</a> to continue learning without limits. I’ll receive a small portion of your membership fee if you use the following link, at no extra cost to you.</p><blockquote><p>If you enjoy reading this and my other articles, feel free to <a href='https://medium.com/@mikelhsia/membership'>join the Medium membership program</a> to read more about Quantitative Trading Strategy.</p></blockquote><hr><p><strong><em>Previous research</em></strong></p><ul><li><a href="https://mikelhsia.github.io/2021/08/02/2021-08-12-pair-trading/">【Pair Trading】Part 1. Introduction to pair trading strategy</a></li><li><a href="https://mikelhsia.github.io/2021/08/30/2021-08-30-pair-trading-distance-approach/">【Pair Trading】Part 2. 5 in-depth analysis of distance approach in pair trading</a></li><li><a href="https://mikelhsia.github.io/2021/09/30/2021-10-05-pair-trading-market-neutral/">【Pair Trading】Part 3. The strategy that helps minimize your portfolio risk</a></li></ul><p>From the results of the previous research posts, I’ve found that the pair trading strategies using the distance approach and the Pearson correlation approach are not as satisfying as I expected. 
Even though we’re able to achieve the goal of making our strategy market neutral and reducing the max drawdown drastically, the Sharpe Ratio of each strategy variation is also reduced to a relatively low level compared to the benchmark buy-and-hold strategy.</p><p>Since the first method among these five pair trading strategies fell short, let’s put our effort into the second method and see whether the cointegration test can generate more insights to help us evaluate and determine whether this is a profitable pair trading strategy.</p><img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/methods.png" class="" width="600"><p style="text-align:center; color: grey;">  <i>Extracted from <a href='https://www.youtube.com/watch?v=gd009r7QUuM&list=PLfv9eTYgatm3oz8uq8G17-50ed_s-n5ds&index=2&t=238s'>Pairs Trading: The Distance Approach</a> by <a href='https://www.youtube.com/channel/UC8hI87gt0dmTAIEupEcsckA'>Hudson & Thames</a></i></p><h1 id="1-Lesson-101-of-cointegration-pair-trading"><a href="#1-Lesson-101-of-cointegration-pair-trading" class="headerlink" title="1. Lesson 101 of cointegration pair trading"></a>1. Lesson 101 of cointegration pair trading</h1><h2 id="1-1-What-is-Cointegration"><a href="#1-1-What-is-Cointegration" class="headerlink" title="1.1. What is Cointegration"></a>1.1. What is Cointegration</h2><p>Cointegration describes the relationship between time series in the long run. It is a milestone in the long history of studying multi-asset trading strategies. It first appeared in Granger’s seminal paper “<a href="https://www.sciencedirect.com/science/article/abs/pii/0304407681900798">Some properties of time series data and their use in econometric model specification</a>” <em>(Granger, 1981)</em>. 
In the language of quantitative trading, cointegration helps us find out whether the spread of two stock prices (usually the difference of price or the difference of log(price)) is stationary, meaning the mean and the variance of the spread stay the same over the observation period. This statistical feature meets the criteria of a mean-reversion strategy that involves two different stocks.</p><img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/step1.png" class="" width="500"><p style="text-align:center; color: grey;">  <i>The price plot of KEY and RF</i></p><img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/step2.png" class="" width="500"><p style="text-align:center; color: grey;">  <i>The price spread between KEY and RF will eventually go back to its mean value</i></p><p>But how do we examine whether the spread of two stock prices is stationary? Statistically speaking, a value in a time series can be represented with the following equation:</p><script type="math/tex; mode=display">Y_t = \alpha Y_{t-1} + \beta X_t + constant + \epsilon</script><script type="math/tex; mode=display">\text{or}</script><script type="math/tex; mode=display">Y_t = \alpha_1 Y_{t-1} + \alpha_2 Y_{t-2} + \dots + \alpha_p Y_{t-p} + constant + \epsilon</script><script type="math/tex; mode=display">\begin{align}&\text{where:}\\&Y_t \text{ represents the Y value at time t}\\&\alpha \text{ represents the unit root of this equation}\\&\beta X_t \text{ represents the impact from the previous Y value from t-1 to t-p}\\&\epsilon \text{ represents the residual, which is supposed to be a randomly distributed value}\end{align}</script><p>By looking at the equation, we can tell that if the unit root $\alpha$ is equal to or greater than 1, $Y_t$ is driven by the previous values $Y_{t-1}, Y_{t-2}, …$ in this time series and is no longer a randomly distributed time series. Therefore, our goal is to check whether the unit root $\alpha$ exists; the smaller $\alpha$ is, the better. 
If $\alpha$ doesn’t exist in this equation, then we can say that this time series is stationary, as $Y_t$ is simply the sum of $constant$ and a randomly distributed $\epsilon$. This is where the <strong>Augmented Dickey-Fuller test (ADF test)</strong> comes into play. We use the ADF test to examine whether the unit root exists or not.</p><p>Here are some materials if you would like to know more about cointegration:</p><ul><li><a href="https://analyzingalpha.com/stationarity">What is stationary?</a></li><li><a href="https://analyzingalpha.com/check-time-series-stationarity-python">How to check time series stationarity in Python</a></li><li><a href="https://www.youtube.com/watch?v=vvTKjm94Ars">Cointegration - an introduction</a></li></ul><h2 id="1-2-Misconception-about-the-relationship-between-correlation-and-cointegration"><a href="#1-2-Misconception-about-the-relationship-between-correlation-and-cointegration" class="headerlink" title="1.2. Misconception about the relationship between correlation and cointegration"></a>1.2. Misconception about the relationship between correlation and cointegration</h2><p>One might ask: doesn’t the <strong>correlation test</strong> describe the same statistical feature as the <strong><em>cointegration test</em></strong>, since both methods try to see whether two time series move in the same direction over the same observation period?</p><p>Correlation is meant to examine and measure the linear relationship between two time series. A positive correlation (correlation &gt; 0) means the two variables move in the same direction (up or down) over time, whereas a negative correlation (correlation &lt; 0) means they move in opposite directions. On the other hand, the cointegration test doesn’t care whether the two variables move together. Instead, it measures whether the difference between the two variables remains stable over time. 
Therefore, high cointegration doesn’t necessarily exist if two time series are highly correlated.</p><img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/hi_correlation_hi_cointegration.png" class="" width="500"><p style="text-align:center; color: grey;">  <i>Time series that illustrates perfect correlation and cointegration - <a href='https://www.r-bloggers.com/2017/11/cointegration-correlation-and-log-returns/'>Rbloggers</a> by <a href='https://www.r-bloggers.com/author/cfsmith/'>cfsmith</a></i></p><img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/low_correlation_hi_cointegration.png" class="" width="500"><p style="text-align:center; color: grey;">  <i>Time series that has perfect cointegration, but zero correlation - <a href='https://www.r-bloggers.com/2017/11/cointegration-correlation-and-log-returns/'>Rbloggers</a> by <a href='https://www.r-bloggers.com/author/cfsmith/'>cfsmith</a></i></p><img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/low_correlation_hi_cointegration2.png" class="" width="500"><p style="text-align:center; color: grey;">  <i>Time series has the same perfect cointegration, but has a relatively low correlation - <a href='https://www.r-bloggers.com/2017/11/cointegration-correlation-and-log-returns/'>Rbloggers</a> by <a href='https://www.r-bloggers.com/author/cfsmith/'>cfsmith</a></i></p><p>Reference:</p><ul><li><a href="https://www.r-bloggers.com/2017/11/cointegration-correlation-and-log-returns/">Cointegration, correlation, and log return</a></li></ul><h2 id="1-3-The-methodology"><a href="#1-3-The-methodology" class="headerlink" title="1.3. The methodology"></a>1.3. The methodology</h2><p>In this post, I choose to use <strong>Engle-Granger 2-step approach</strong> as it is the most commonly seen cointegration test process for pair trading. 
As the name suggests, there are two steps to go through in order to find out whether the pair of stocks is suitable for this strategy:</p><h3 id="First-step"><a href="#First-step" class="headerlink" title="First step"></a>First step</h3><p>First of all, we use OLS as the regression method to get the residuals of the equation. Given that both $x$ and $y$ are time series that we have, the regression formula looks like this:</p><script type="math/tex; mode=display">y = \beta * x + constant</script><p>By doing this, we can get the parameters $\beta$ and $constant$. Then we calculate the residuals by using the following equation:</p><script type="math/tex; mode=display">\epsilon = y - \beta * x - constant</script><p>Now we save the residuals as the input of the second step.</p><h3 id="Second-step"><a href="#Second-step" class="headerlink" title="Second step"></a>Second step</h3><p>The second step is much more straightforward. We use the Augmented Dickey-Fuller test to see whether a unit root exists in the residuals. If the hypothesis of having a unit root can be rejected by the Augmented Dickey-Fuller test, then we can say that the residuals are stationary and the time series $x$ and $y$ are cointegrated. Therefore, we can say this pair $\text{x-y}$ would be our trading target.</p><p>In Python, it’s going to be as easy as:<br><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> statsmodels.tsa.stattools <span class="keyword">import</span> adfuller</span><br><span class="line"></span><br><span class="line">adf_value, p_value, *_ = adfuller(TIME_SERIES_X, autolag=<span class="string">&#x27;BIC&#x27;</span>)</span><br></pre></td></tr></table></figure></p><h2 id="1-4-Trading-rules"><a href="#1-4-Trading-rules" class="headerlink" title="1.4. Trading rules"></a>1.4. 
Trading rules</h2><p>Theoretically speaking, the OLS-generated residuals should be randomly distributed around their mean. That is to say, the cointegration pair trading strategy is essentially a mean-reverting, market-neutral, long-short strategy like the other pair trading strategies. The only difference is the indicator we monitor and observe. In this case, we use the residual $\epsilon$ to generate the trading signals to either enter or exit a trade. Below are the most common trading rules used in most of the research papers:</p><ul><li>Variables required<ul><li><strong>Residuals ($\epsilon$)</strong> generated from the OLS regression: $\epsilon = y - \beta * x - constant$</li><li><strong>Mean of the residuals ($\mu$)</strong> as the benchmark line in our residual observation</li><li><strong>Standard deviation of the residuals ($\sigma$)</strong> to calculate the trigger line in our residual observation</li><li><strong>Threshold</strong> is a fixed value that is used together with the standard deviation of the residuals to calculate the trigger line. In this research, we set it to 2.32 for calculating the upper bound and -2.32 for calculating the lower bound. 
(2.32 is commonly used because roughly 99% of a normal distribution lies below $\mu + 2.32 * \sigma$)</li></ul></li><li>Trading rules<ul><li>Generating trading signals<ul><li>Open a long position if the current spread falls below the lower bound $\mu - threshold * \sigma$</li><li>Close a long position if the current spread rises above the mean of the spread $\mu$</li><li>Open a short position if the current spread rises above the upper bound $\mu + threshold * \sigma$</li><li>Close a short position if the current spread falls below the mean of the spread $\mu$</li></ul></li><li>Exit trading signals<ul><li>The residual crosses the mean of the residuals</li></ul></li><li>Repeated trading signals<ul><li>Only process the first signal if there are two consecutive enter/exit signals</li></ul></li></ul></li></ul><img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/trading_rules.png" class="" width="800"><p style="text-align:center; color: grey;">    <i>Pair trading rules flow chart</i></p><p>To make our trading rules more intuitive, let’s have a look at the chart below:</p><img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/intuitive_visualization.png" class="" width="800"><p style="text-align:center; color: grey;">    <i>Top: Pair pricing movements; Middle: Residual movement; Bottom: Accumulative return (%)</i></p><h1 id="2-Research-plan"><a href="#2-Research-plan" class="headerlink" title="2. Research plan"></a>2. Research plan</h1><h2 id="2-1-Goal-of-this-research"><a href="#2-1-Goal-of-this-research" class="headerlink" title="2.1. Goal of this research"></a>2.1. Goal of this research</h2><p>Before starting to backtest the strategy performance, there are a few things that I would like to understand beforehand. 
Therefore, I set up three sets of scenarios to answer the questions below:</p><ol><li>When doing the regression, does the stock price or log(price) give us an edge?</li><li>Do we need to filter out the pairs whose correlation is low before the cointegration test?</li><li>Do pairs from the same industry perform very differently from pairs from different industries?</li></ol><p>I believe having a clear view of the above questions will help us conduct the backtest at a later stage. So let’s get started!</p><h2 id="2-2-Platform"><a href="#2-2-Platform" class="headerlink" title="2.2. Platform"></a>2.2. Platform</h2><p><a href="https://www.quantconnect.com">QuantConnect</a></p><h2 id="2-3-Fetching-data-needed"><a href="#2-3-Fetching-data-needed" class="headerlink" title="2.3. Fetching data needed"></a>2.3. Fetching data needed</h2><p>In this research, we use 24 months of data as training data and feed them into the OLS regression and the ADF test to form the pairs we need for the following steps. 
Once the pairs have formed, we’re going to use another 12 months of data as testing data to see whether the pairs with high cointegration intention would have the character of mean-reversion.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">formation_period = <span class="number">22</span> * <span class="number">24</span></span><br><span class="line">trading_period = <span class="number">22</span> * <span class="number">12</span></span><br><span class="line">data_length = formation_period + trading_period</span><br><span class="line"></span><br><span class="line">history_price = qb.History(universe, data_length+<span class="number">1</span>, Resolution.Daily)</span><br><span class="line">history_price = history_price.reset_index().pivot(index=<span class="string">&#x27;time&#x27;</span>, columns=<span class="string">&#x27;symbol&#x27;</span>, values=<span class="string">&#x27;close&#x27;</span>)</span><br><span class="line"></span><br><span class="line">print(<span class="string">f&#x27;There are <span class="subst">&#123;history_price.shape[<span class="number">1</span>]&#125;</span> in the symbols&#x27;</span>)</span><br><span class="line"></span><br><span class="line">training_data = history_price.iloc[:formation_period, :]</span><br><span class="line">trading_data = history_price.iloc[formation_period:, :]</span><br></pre></td></tr></table></figure><h2 id="2-4-Universe-and-implementation"><a href="#2-4-Universe-and-implementation" class="headerlink" title="2.4. Universe and implementation"></a>2.4. 
Universe and implementation</h2><p>I’m using the component stocks from S&amp;P500 at one point as the base universe to start with. After downloading all the historic pricing data, I fed the necessary data into the following class to build the screening criteria:<br><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br></pre></td><td class="code"><pre><span class="line"><span class="class"><span class="keyword">class</span> <span class="title">Pair</span>:</span></span><br><span class="line">    <span class="function"><span class="keyword">def</span> <span class="title">__init__</span>(<span class="params">self, symbol_a:str, symbol_b:str, rtn_a, rtn_b</span>):</span></span><br><span class="line">        self.symbol_a = symbol_a</span><br><span class="line">        self.symbol_b = symbol_b</span><br><span class="line">        self.rtn_a = np.array(rtn_a)</span><br><span class="line">        self.rtn_b = np.array(rtn_b)</span><br><span class="line">        self.corr, self.corr_p = self.correlation()</span><br><span class="line">        
self.ols_hedge_ratio, self.ols_intercept, self.coint_value, self.coint_stationary_p, self.ols_res = self.cointegration_test()</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">def</span> <span class="title">correlation</span>(<span class="params">self</span>):</span></span><br><span class="line">        <span class="comment"># Pearson correlation coefficient between the two series, with its p-value</span></span><br><span class="line">        corr, p = pearsonr(self.rtn_a, self.rtn_b)</span><br><span class="line">        <span class="keyword">return</span> corr, p</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">def</span> <span class="title">cointegration_test</span>(<span class="params">self</span>):</span></span><br><span class="line">        x = self.rtn_a</span><br><span class="line">        y = self.rtn_b</span><br><span class="line"></span><br><span class="line">        x = sm.add_constant(x)</span><br><span class="line">        model = sm.OLS(y, x).fit()</span><br><span class="line"></span><br><span class="line">        intercept = model.params[<span class="number">0</span>]</span><br><span class="line">        beta = model.params[<span class="number">1</span>]</span><br><span class="line"></span><br><span class="line">        adf_result = adfuller(model.resid, autolag = <span class="string">&#x27;BIC&#x27;</span>)</span><br><span class="line">        adf_value = adf_result[<span class="number">0</span>]</span><br><span class="line">        stationary_p_value = adf_result[<span class="number">1</span>]</span><br><span class="line"></span><br><span class="line">        <span class="keyword">return</span> beta, intercept, adf_value, stationary_p_value, model.resid</span><br></pre></td></tr></table></figure></p><p>Here’s how I feed the data into this class:<br><figure class="highlight python"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># This is for storing the final results</span></span><br><span class="line">pair_corrs = &#123;&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">for</span> stock_pair <span class="keyword">in</span> tqdm.tqdm(symbol_pairs):</span><br><span class="line">  <span class="keyword">if</span> str(stock_pair[<span class="number">0</span>]) <span class="keyword">not</span> <span class="keyword">in</span> history_price.columns:</span><br><span class="line">    <span class="comment"># print(f&#x27;&#123;str(stock_pair[0])&#125; not in the history_price table&#x27;)</span></span><br><span class="line">    <span class="keyword">continue</span></span><br><span class="line">  <span class="keyword">if</span> str(stock_pair[<span class="number">1</span>]) <span class="keyword">not</span> <span class="keyword">in</span> history_price.columns:</span><br><span class="line">    <span class="comment"># print(f&#x27;&#123;str(stock_pair[1])&#125; not in the history_price table&#x27;)</span></span><br><span class="line">    <span class="keyword">continue</span></span><br><span class="line"></span><br><span class="line">  <span class="keyword">if</span> SPREAD_MODE == LOG_PRICE_MODE:</span><br><span 
class="line">    tmp = np.log(training_data.loc[:, [str(stock_pair[<span class="number">0</span>]), str(stock_pair[<span class="number">1</span>])]].dropna())</span><br><span class="line">  <span class="keyword">elif</span> SPREAD_MODE == PRICE_MODE:</span><br><span class="line">    tmp = training_data.loc[:, [str(stock_pair[<span class="number">0</span>]), str(stock_pair[<span class="number">1</span>])]].dropna()</span><br><span class="line"></span><br><span class="line">  pair_corrs[(str(stock_pair[<span class="number">0</span>]), str(stock_pair[<span class="number">1</span>]))] = Pair(</span><br><span class="line">    str(stock_pair[<span class="number">0</span>]),</span><br><span class="line">    str(stock_pair[<span class="number">1</span>]),</span><br><span class="line">    tmp.loc[:, str(stock_pair[<span class="number">0</span>])],</span><br><span class="line">    tmp.loc[:, str(stock_pair[<span class="number">1</span>])]</span><br><span class="line">  )</span><br></pre></td></tr></table></figure><br>Once the above actions have been accomplished, I’m choosing only the pairs whose ADF test p-value is smaller than 0.05 to be our candidates for pair trading:<br><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">final_pairs = &#123;key:value <span class="keyword">for</span> key, value <span class="keyword">in</span> pair_corrs.items() <span class="keyword">if</span> value.coint_stationary_p &lt;= <span class="number">0.05</span>&#125;</span><br></pre></td></tr></table></figure></p><p>Lastly, let’s sort the pairs first by their correlation value and then by their cointegration p-value. 
By doing this, it’ll be easier for us to conduct our stratified analysis based on their level of cointegration.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">final_pairs = &#123;k:v <span class="keyword">for</span> k,v <span class="keyword">in</span> sorted(</span><br><span class="line">  final_pairs.items(),</span><br><span class="line">  key = <span class="keyword">lambda</span> x: x[<span class="number">1</span>].corr</span><br><span class="line">)&#125;</span><br><span class="line">final_pairs = &#123;k:v <span class="keyword">for</span> k,v <span class="keyword">in</span> sorted(</span><br><span class="line">  final_pairs.items(),</span><br><span class="line">  key = <span class="keyword">lambda</span> x: x[<span class="number">1</span>].coint_stationary_p</span><br><span class="line">)&#125;</span><br></pre></td></tr></table></figure><h2 id="2-5-Results"><a href="#2-5-Results" class="headerlink" title="2.5. Results"></a>2.5. Results</h2><p>To better visualize our results, I’m going to <strong>compare different scenarios simply based on the visualized diagram using stratified analysis and accumulative return diagram from the top 20 stock pairs that have the lowest ADF test p-value</strong>. In the stratified analysis, I expect to see if the accumulative returns of each group are parted from each other and are ranked from group 1 (the lowest cointegration p-value) to group 8 (the highest cointegration p-value). 
As for the accumulative return diagram of the <strong>top 20</strong> stock pairs, the ideal result is a steadily rising return without a large maximum drawdown.</p><h3 id="2-5-1-Using-simply-stock-price-v-s-log-stock-price"><a href="#2-5-1-Using-simply-stock-price-v-s-log-stock-price" class="headerlink" title="2.5.1. Using simply stock price v.s. log(stock price)"></a>2.5.1. Using simply stock price v.s. log(stock price)</h3><p>In the blog post <strong><em><a href="http://epchan.blogspot.com/2013/11/cointegration-trading-with-log-prices.html">Cointegration Trading with Log Prices vs. Prices</a></em></strong> by Dr. Ernest P. Chan, the difference between using price and using log price is stated clearly:</p><blockquote><p>For most cointegrating pairs that I have studied, both the price spreads and the log price spreads are stationary, so it doesn’t matter which one we use for our trading strategy. However, for an unusual pair where its log price spread cointegrates but price spread does not (Hat tip: Adam G. for drawing my attention to one such example), the implication is quite significant.</p><div style="text-align: right"> <i>- Ernest P. 
Chan</i></div></blockquote><p>Therefore, it would be interesting to see how this choice impacts the overall strategy return.</p><div class="table-container"><table><thead><tr><th></th><th>Price</th><th>log(Price)</th></tr></thead><tbody><tr><td>Industry pairs without correlation filter</td><td><img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/price_w_ind_wo_corr_1.png" class="" width="600"></td><td><img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/logprice_w_ind_wo_corr_1.png" class="" width="600"></td></tr><tr><td>Non-industry pairs without correlation filter</td><td><img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/price_wo_ind_wo_corr_1.png" class="" width="600"></td><td><img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/logprice_wo_ind_wo_corr_1.png" class="" width="600"></td></tr><tr><td>Industry pairs with correlation filter</td><td><img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/price_w_ind_w_corr_1.png" class="" width="600"></td><td><img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/logprice_w_ind_w_corr_1.png" class="" width="600"></td></tr><tr><td>Non-industry pairs with correlation filter</td><td><img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/price_wo_ind_w_corr_1.png" class="" width="600"></td><td><img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/logprice_wo_ind_w_corr_1.png" class="" width="600"></td></tr></tbody></table></div><p>The scenarios using <strong>log(price)</strong> don’t show a distinct difference from the scenarios using <strong>price</strong>. So we can’t say for sure whether using <strong>price</strong> or <strong>log(price)</strong> is superior.</p><h3 id="2-5-2-Filter-by-correlation-cointegration-v-s-filter-by-cointegration"><a href="#2-5-2-Filter-by-correlation-cointegration-v-s-filter-by-cointegration" class="headerlink" title="2.5.2. 
Filter by correlation + cointegration v.s. filter by cointegration"></a>2.5.2. Filter by correlation + cointegration v.s. filter by cointegration</h3><p>In this second experiment, I would like to know whether a high correlation has any positive impact on this trading strategy. In addition to the existing cointegration p-value filter, I add another filter to eliminate the pairs where the correlation value is under 0.9 and the p-value is greater than 0.05. Then, we do everything else the same as in the previous experiment.</p><div class="table-container"><table><thead><tr><th></th><th>With correlation filter</th><th>Without correlation filter</th></tr></thead><tbody><tr><td>Industry pairs with price</td><td><img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/price_w_ind_w_corr_1.png" class="" width="600"></td><td><img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/price_w_ind_wo_corr_1.png" class="" width="600"></td></tr><tr><td>Non-industry pairs with log(price)</td><td><img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/logprice_wo_ind_w_corr_1.png" class="" width="600"></td><td><img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/logprice_wo_ind_wo_corr_1.png" class="" width="600"></td></tr><tr><td>Industry pairs with log(price)</td><td><img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/logprice_w_ind_w_corr_1.png" class="" width="600"></td><td><img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/logprice_w_ind_wo_corr_1.png" class="" width="600"></td></tr><tr><td>Non-industry pairs with price</td><td><img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/price_wo_ind_w_corr_1.png" class="" width="600"></td><td><img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/price_wo_ind_wo_corr_1.png" class="" width="600"></td></tr></tbody></table></div><p>Somehow it seems that the scenarios without 
the correlation filter always perform better than the corresponding scenarios with the correlation filter. This hints that the correlation filter may not be needed.</p><h3 id="2-5-3-Construct-pairs-within-the-same-industries-or-across-different-industries"><a href="#2-5-3-Construct-pairs-within-the-same-industries-or-across-different-industries" class="headerlink" title="2.5.3. Construct pairs within the same industries or across different industries"></a>2.5.3. Construct pairs within the same industries or across different industries</h3><p>As we all know, the stock prices of companies in the same industry tend to be affected simultaneously by the same economic or industry events, from which we can deduce that companies in the same industry may be more strongly cointegrated than companies in different industries. Is this true? And how will it impact our pair trading strategy? I construct the trading pairs in two different ways: 1. we make all possible combinations using the Python standard library function <code>itertools.combinations()</code>. 2. 
we only make possible combinations when two companies are in the same industry.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line">pairs = []</span><br><span class="line"></span><br><span class="line"><span class="comment"># Create pairs only when two companies are in the same industry</span></span><br><span class="line">INDUSTRY_PAIR_FLAG = <span class="literal">True</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> INDUSTRY_PAIR_FLAG:</span><br><span class="line">    time = datetime.now()</span><br><span class="line">    sector = &#123;x: y.iloc[<span class="number">-1</span>][<span class="number">-1</span>] <span class="keyword">for</span> x <span class="keyword">in</span> universe <span class="keyword">if</span> <span class="keyword">not</span> (y:=qb.GetFundamental(x, <span class="string">&quot;AssetClassification.MorningstarSectorCode&quot;</span>, time - timedelta(days=<span class="number">3</span>), time)).empty&#125;</span><br><span class="line">    sectors_table = pd.DataFrame.from_dict(sector, orient=<span class="string">&#x27;index&#x27;</span>)</span><br><span class="line">    sectors_set = set(sectors_table.squeeze().values.tolist())</span><br><span class="line">    <span class="keyword">for</span> s <span class="keyword">in</span> sectors_set:</span><br><span class="line">        sector_list = 
sectors_table[sectors_table.squeeze() == s].index.tolist()</span><br><span class="line">        pairs.extend(list(it.combinations(sector_list, <span class="number">2</span>)))</span><br><span class="line"><span class="keyword">else</span>:</span><br><span class="line">    pairs = list(it.combinations(universe, <span class="number">2</span>))</span><br><span class="line"></span><br><span class="line">print(<span class="string">f&#x27;The <span class="subst">&#123;len(pairs)=&#125;</span> in the symbols_pairs&#x27;</span>)</span><br></pre></td></tr></table></figure><div class="table-container"><table><thead><tr><th></th><th>Pairing within the same industry</th><th>Pairing across different industry</th></tr></thead><tbody><tr><td>Price with correlation filter</td><td><img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/price_w_ind_w_corr_1.png" class="" width="600"></td><td><img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/price_wo_ind_w_corr_1.png" class="" width="600"></td></tr><tr><td>Price without correlation filter</td><td><img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/price_w_ind_wo_corr_1.png" class="" width="600"></td><td><img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/price_wo_ind_wo_corr_1.png" class="" width="600"></td></tr><tr><td>log(price) with correlation filter</td><td><img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/logprice_w_ind_w_corr_1.png" class="" width="600"></td><td><img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/logprice_wo_ind_w_corr_1.png" class="" width="600"></td></tr><tr><td>log(price) without correlation filter</td><td><img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/logprice_w_ind_wo_corr_1.png" class="" width="600"></td><td><img data-src="/2023/02/25/2023-02-25-pair-trading-cointegration-part1/logprice_wo_ind_wo_corr_1.png" class="" width="600"></td></tr></tbody></table></div><p>Same 
as in the previous results, the first-group returns in the scenarios that pair across different industries always seem better than those in the scenarios that pair within the same industry. This suggests that cointegration relationships also exist across industries.</p><h1 id="3-Conclusion"><a href="#3-Conclusion" class="headerlink" title="3. Conclusion"></a>3. Conclusion</h1><p>From the research above, we have gained some insights into how each factor impacts the performance of the pair trading strategy. But are we able to answer the three questions above with 100% confidence? No. There are more details that we need to take into account when conducting the backtest, such as:</p><ul><li>Update the universe periodically by recalculating the cointegration p-value of all the pairs.</li><li>Use a smaller threshold to generate trading signals, since tighter entry and exit points lead to shorter holding periods, more round-trip trades, and generally higher profits.</li><li>Use the <a href="https://www.investopedia.com/terms/z/zscore.asp#:~:text=Investopedia%20%2F%20Tara%20Anand-,What%20Is%20Z%2DScore%3F,identical%20to%20the%20mean%20score.">z-score</a> method to standardize the $\epsilon$ that we’re tracking.</li><li>Close a trade early if it has been open for too long.</li><li>Add a stop-loss threshold to limit further losses if the $\epsilon$ goes far beyond the entry threshold.</li><li>…</li></ul><p>A lot of techniques can be experimented with and tested during backtesting. In the next episode, I’m going to work on the backtest and see whether we can find a profitable cointegration trading strategy.</p><p>Cheers!</p>]]></content>
    
    
    <summary type="html">&lt;img data-src=&quot;/2023/02/25/2023-02-25-pair-trading-cointegration-part1/cover.png&quot; class=&quot;&quot; width=&quot;600&quot;&gt;
&lt;p&gt;Cointegration is a statistical technique to find out whether a time series closely follows the movement of the other time series. Therefore, it becomes an important technique in the pair trading strategy for us to determine the right stock pair to trade with. In this post, we’re going to see why traders prefer using the cointegration test over the correlation test in pair trading, and whether the cointegration test results can boost our trading performance.&lt;/p&gt;</summary>
    
    
    <category term="Quantitative Trading" scheme="http://mikelhsia.github.io/categories/Quantitative-Trading/"/>
    
    
    <category term="Strategy" scheme="http://mikelhsia.github.io/tags/Strategy/"/>
    
    <category term="Research" scheme="http://mikelhsia.github.io/tags/Research/"/>
    
    <category term="Pair Trading" scheme="http://mikelhsia.github.io/tags/Pair-Trading/"/>
    
  </entry>
  
  <entry>
    <title>【How 2】 Set Up Trading API Template In Python - Q&amp;A</title>
    <link href="http://mikelhsia.github.io/2022/12/15/2022-12-17-IBKR-broker-4/"/>
    <id>http://mikelhsia.github.io/2022/12/15/2022-12-17-IBKR-broker-4/</id>
    <published>2022-12-15T07:30:52.000Z</published>
    <updated>2024-06-28T06:04:41.512Z</updated>
    
    <content type="html"><![CDATA[<img data-src="/2022/12/15/2022-12-17-IBKR-broker-4/cover.png" class="" width="800"><p>In the last post of this series, we’re going to look at some questions I ran into while connecting to the Interactive Brokers API. Some of them come from obscure configuration options that are hard to locate, and some need an extra tool to resolve. I’ve collected all of them in one post to share with you.</p><a id="more"></a><hr><p>Become a <a href="https://medium.com/@mikelhsia/membership">Medium member</a> to continue learning without limits. I’ll receive a small portion of your membership fee if you use the following link, at no extra cost to you.</p><blockquote><p>If you enjoy reading this and my other articles, feel free to <a href='https://medium.com/@mikelhsia/membership'>join the Medium membership program</a> to read more about Quantitative Trading Strategy.</p></blockquote><hr><p><strong><em>Previous research</em></strong></p><ul><li><a href="https://mikelhsia.github.io/2022/11/04/2022-11-10-advanced-cppi-strategy/">【Momentum Trading】A Defense Trading Strategy That Works - CPPI (Constant Proportion Portfolio Insurance)</a></li><li><a href="https://mikelhsia.github.io/2022/12/12/2022-12-10-IBKR-Broker/">【How 2】 Set Up Trading API Template In Python - Connecting My Trading Strategies To Interactive Brokers</a></li><li><a href="https://mikelhsia.github.io/2022/12/12/2022-12-16-IBKR-broker-2/">【How 2】 Set Up Trading API Template In Python - Placing Orders with Interactive Broker</a></li><li><a href="https://mikelhsia.github.io/2022/12/14/2022-12-17-IBKR-broker-3/">【How 2】 Set Up Trading API Template In Python - Build Local Storage For Storing Trades</a></li></ul><h1 id="Q-amp-A"><a href="#Q-amp-A" class="headerlink" title="Q&amp;A"></a>Q&amp;A</h1><h2 id="1-When-I’m-using-apscheduler-and-ib-insync-at-the-same-time-there-are-errors-and-I-can’t-get-my-trading-script-to-work"><a 
href="#1-When-I’m-using-apscheduler-and-ib-insync-at-the-same-time-there-are-errors-and-I-can’t-get-my-trading-script-to-work" class="headerlink" title="1. When I’m using apscheduler and ib_insync at the same time, there are errors and I can’t get my trading script to work"></a>1. When I’m using apscheduler and ib_insync at the same time, there are errors and I can’t get my trading script to work</h2><p><a href="https://apscheduler.readthedocs.io/en/3.x/"><code>Apscheduler</code></a> is a standard package in my quantitative trading setup. It’s a Python library that schedules your Python functions to run at a specific date and time or at regular intervals, with time zone support. I gotta recommend this library to traders/developers who have similar requirements in their trading scripts.</p><p>However, <code>apscheduler</code> runs its jobs in worker threads by default, while <code>ib_insync</code> relies on an asyncio event loop. If you simply use both of them together, a scheduled job ends up in a thread that has no event loop, and you’ll get a <code>RuntimeError</code> when you run your script. 
Fortunately, the <code>ib_insync</code> package ships a helper that patches asyncio to allow nested event loops.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">RuntimeError: There <span class="keyword">is</span> no current event loop <span class="keyword">in</span> thread <span class="string">&#x27;ThreadPoolExecutor-0_0&#x27;</span>.</span><br></pre></td></tr></table></figure><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>RuntimeError saying there is no current event loop</i></p><p>All you have to do is:</p><ol><li>Call <code>ib_insync.util.patchAsyncio()</code> right after you import the <code>ib_insync</code> library.</li><li>Use <code>AsyncIOScheduler</code> to create your scheduler.</li><li>Add <code>async</code> before the scheduled function definition.<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># ibkr_api.py</span></span><br><span class="line"><span class="keyword">import</span> ib_insync</span><br><span class="line">ib_insync.util.patchAsyncio()</span><br></pre></td></tr></table></figure></li></ol><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span 
class="line"><span class="comment"># main.py</span></span><br><span class="line"><span class="keyword">from</span> apscheduler.schedulers.asyncio <span class="keyword">import</span> AsyncIOScheduler</span><br><span class="line"><span class="keyword">import</span> asyncio</span><br><span class="line"><span class="keyword">import</span> datetime</span><br><span class="line"><span class="keyword">from</span> zoneinfo <span class="keyword">import</span> ZoneInfo</span><br><span class="line"></span><br><span class="line"><span class="keyword">async</span> <span class="function"><span class="keyword">def</span> <span class="title">test</span>():</span></span><br><span class="line">  print(datetime.datetime.now().strftime(<span class="string">&#x27;%Y-%m-%d-%H_%M_%S&#x27;</span>))</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> __name__ == <span class="string">&#x27;__main__&#x27;</span>:</span><br><span class="line">  scheduler = AsyncIOScheduler()</span><br><span class="line"></span><br><span class="line">  scheduler.add_job(</span><br><span class="line">    test,</span><br><span class="line">    <span class="string">&#x27;cron&#x27;</span>,</span><br><span class="line">    day_of_week=<span class="string">&#x27;mon-fri&#x27;</span>,</span><br><span class="line">    hour=<span class="number">9</span>,</span><br><span class="line">    minute=<span class="number">10</span>,</span><br><span class="line">    timezone=ZoneInfo(<span class="string">&#x27;US/Eastern&#x27;</span>),</span><br><span class="line">  )</span><br></pre></td></tr></table></figure><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Apscheduler + ib_insync code snippet</i></p><p>By making the above changes, <code>apscheduler</code> and <code>ib_insync</code> can coexist in the same script.</p><blockquote><p><em>Note: Things can still go wrong, as working with nested event loops is relatively complicated.</em><br>Reference: <a href="https://groups.io/g/insync/topic/using_ib_insync_with/7866651">Using ib-insync with APScheduler</a></p></blockquote><h2 id="2-Why-I-can’t-get-a-valid-stock-price-back-by-using-reqMktData"><a 
href="#2-Why-I-can’t-get-a-valid-stock-price-back-by-using-reqMktData" class="headerlink" title="2. Why can’t I get a valid stock price back by using reqMktData?"></a>2. Why can’t I get a valid stock price back by using reqMktData?</h2><p>There are several steps and restrictions for requesting stock prices using the <code>reqMktData()</code> function:</p><ol><li>Before requesting a market quote, you need to subscribe to the market data on the IBKR platform. You can find the management page in the TWS or IB gateway tab <em>“Account” -&gt; “Manage Account” -&gt; “Subscribe Market Data/Research”</em>.</li><li>The <code>ib.reqMarketDataType(N)</code> call specifies which type of market data you would like to request (1 = live, 2 = frozen, 3 = delayed, 4 = delayed frozen). For example, if you request market data type 1 (live market data) outside trading hours, you won’t receive any valid pricing data from the server. Therefore, choose the market data type carefully, and test each type to learn its limitations.</li><li>As said in the previous post, querying pricing data is an asynchronous process, meaning you could end up reading the pricing data while your script is still fetching it from the server. Therefore, remember to use <code>ib.sleep()</code> wisely to ensure you only access the pricing data after it has been returned.</li></ol><blockquote><p>Reference: <a href="https://mikelhsia.github.io/2022/12/12/2022-12-16-IBKR-broker-2/">【How 2】 Set Up Trading API Template In Python - Placing orders with Interactive Brokers</a></p></blockquote><h2 id="3-There-are-popup-windows-that-prevent-me-from-placing-orders-using-API-What-happened"><a href="#3-There-are-popup-windows-that-prevent-me-from-placing-orders-using-API-What-happened" class="headerlink" title="3. There are popup windows that prevent me from placing orders using API. What happened?"></a>3. There are popup windows that prevent me from placing orders using API. 
What happened?</h2><p>Inside the TWS and the IB gateway, there are pre-configured conditions that prevent API consumers from placing unintended orders. If you accidentally place an order that falls outside of the size or value range, or if the current market data is lagged and hasn’t been updated for a long time, then the TWS/IB gateway will pop up a warning window to tell you that placing such an order is potentially hazardous.</p><p>To keep these popups from stalling your trading script, you can check the box in <em>API -&gt; Precautions -&gt; “Bypass Order Precaution for API Order”</em> so that the warning dialog boxes no longer appear when you place orders through the API. Yet, you have to bear the risk of unexpected loss when your script goes wrong.</p><img data-src="/2022/12/15/2022-12-17-IBKR-broker-4/order_precaution.png" class="" width="600"><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Check this checkbox to prevent the warning from appearing when placing orders via API</i></p><blockquote><p>Reference: <a href="https://guides.interactivebrokers.com/tws/usersguidebook/configuretws/apiprecautions.htm">API Precautions</a></p></blockquote><h2 id="4-How-could-I-reset-my-paper-account"><a href="#4-How-could-I-reset-my-paper-account" class="headerlink" title="4. How could I reset my paper account?"></a>4. How could I reset my paper account?</h2><p>Whenever you feel like starting a new test from a clean slate, you can always reset your paper account. However, the setting is quite hard to find. 
You have to log in to your paper account on the IBKR website, and then follow the steps below to reset your paper account.</p><img data-src="/2022/12/15/2022-12-17-IBKR-broker-4/step1.png" class="" width="600"><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>How to reset your paper account 1</i></p><img data-src="/2022/12/15/2022-12-17-IBKR-broker-4/step2.png" class="" width="600"><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>How to reset your paper account 2</i></p><img data-src="/2022/12/15/2022-12-17-IBKR-broker-4/step3.png" class="" width="600"><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>How to reset your paper account 3</i></p><h2 id="5-There-are-times-that-my-TWS-or-IB-gateway-won’t-successfully-auto-reconnect-What-should-I-do"><a href="#5-There-are-times-that-my-TWS-or-IB-gateway-won’t-successfully-auto-reconnect-What-should-I-do" class="headerlink" title="5. There are times that my TWS or IB gateway won’t successfully auto-reconnect. What should I do?"></a>5. There are times that my TWS or IB gateway won’t successfully auto-reconnect. What should I do?</h2><p>TWS and IB gateway are two very important intermediaries that relay the API instructions from your local trading script to the remote IBKR API server. However, there is a hidden mechanism inside the TWS and IB gateway applications. These two applications need to restart every day, after which they reconnect automatically, and they require human intervention to log in again every seven days. Therefore, there are two scenarios that we need to address:</p><ol><li><strong>Q:</strong> How do we keep the connection with IBKR after the software applications have auto-reconnected?<ul><li><strong>A:</strong> Avoid long-lived connections as much as possible. 
Disconnect your <code>IB()</code> instance as soon as the required actions are done, and reconnect to the server when new actions are needed.</li></ul></li><li><strong>Q:</strong> What if an error occurs while the software applications are rebooting?<ul><li><strong>A:</strong> The software applications run locally on your desktop or laptop, so this type of software crash is not monitored by any script or process. One possible solution is to wrap the headless software application inside Docker. You can download the Docker image of “ib-gateway-docker” from <a href="https://github.com/UnusualAlpha/ib-gateway-docker">here</a> and run this Docker container on your local machine so that the process can be protected and monitored by the daemon inside the Docker container.</li></ul></li></ol><blockquote><p>Reference: <a href="https://github.com/UnusualAlpha/ib-gateway-docker">IB gateway docker</a></p></blockquote><h2 id="6-What-to-do-with-2-factor-authentication-when-trading-using-a-real-account"><a href="#6-What-to-do-with-2-factor-authentication-when-trading-using-a-real-account" class="headerlink" title="6. What to do with 2-factor authentication when trading using a real account?"></a>6. What to do with 2-factor authentication when trading using a real account?</h2><p>Since Interactive Brokers uses two-factor authentication for logging in and buying/selling stocks, working with the Interactive Brokers API essentially can’t be fully automated (see <a href="https://ibkr.info/article/2260">here</a> and <a href="https://ibkr.info/article/2879">here</a>). Every time you place an order or log in to your TWS/IB gateway application, you will receive a message on your smartphone asking you to confirm the corresponding action once more “<strong>manually</strong>”. 
Here are two posts that could give you a rough idea of how to work with this two-factor authentication system:</p><ul><li><a href="https://groups.io/g/insync/topic/81744821#6060">live trading and two-factor authentication</a></li><li><a href="https://groups.io/g/insync/topic/95475509#8603">Watchdog with 2fa</a></li></ul><p>A workaround for this system may become available in the future. For now, keeping your smartphone with you during trading hours seems to be the most practical approach.</p><hr><p>This is the last post in this <strong><em>Set-Up Trading API Template In Python</em></strong> series. I hope you enjoyed reading these posts, and let me know if there is any other topic you would like to read about.</p>]]></content>
    
    
    <summary type="html">&lt;img data-src=&quot;/2022/12/15/2022-12-17-IBKR-broker-4/cover.png&quot; class=&quot;&quot; width=&quot;800&quot;&gt;
&lt;p&gt;In the last post in this series, we’re going to look at some questions that I ran into while connecting to the Interactive Brokers API. Some of them come from obscure configuration options that are hard to locate, and others need an extra tool to resolve. I have collected all of them in this post to share with you.&lt;/p&gt;</summary>
    
    
    <category term="How2" scheme="http://mikelhsia.github.io/categories/How2/"/>
    
    <category term="Quantitative Trading" scheme="http://mikelhsia.github.io/categories/How2/Quantitative-Trading/"/>
    
    
    <category term="How2" scheme="http://mikelhsia.github.io/tags/How2/"/>
    
    <category term="Interactive Broker" scheme="http://mikelhsia.github.io/tags/Interactive-Broker/"/>
    
  </entry>
  
  <entry>
    <title>【How 2】 Set Up Trading API Template In Python - Build Local Storage For Storing Trades</title>
    <link href="http://mikelhsia.github.io/2022/12/14/2022-12-17-IBKR-broker-3/"/>
    <id>http://mikelhsia.github.io/2022/12/14/2022-12-17-IBKR-broker-3/</id>
    <published>2022-12-13T17:42:02.000Z</published>
    <updated>2022-12-15T07:12:01.091Z</updated>
    
    <content type="html"><![CDATA[<img data-src="/2022/12/14/2022-12-17-IBKR-broker-3/cover.png" class="" width="800"><p>Now we come to the third part of this series. In this post, I’m going to show you how I designed and built my local database to store IBKR trades and the other information needed to generate meaningful indicators for reviewing our strategy performance.</p><a id="more"></a><hr><blockquote><p>If you enjoy reading this and my other articles, feel free to <a src='https://medium.com/@mikelhsia/membership'> join the Medium membership program</a> to read more about Quantitative Trading Strategy.</p></blockquote><hr><p><strong><em>Previous research</em></strong></p><ul><li><a href="https://mikelhsia.github.io/2022/11/04/2022-11-10-advanced-cppi-strategy/">【Momentum Trading】A Defense Trading Strategy That Works - CPPI (Constant Proportion Portfolio Insurance)</a></li><li><a href="https://mikelhsia.github.io/2022/12/12/2022-12-10-IBKR-Broker/">【How 2】 Set Up Trading API Template In Python - Connecting My Trading Strategies To Interactive Brokers</a></li><li><a href="https://mikelhsia.github.io/2022/12/12/2022-12-16-IBKR-broker-2/">【How 2】 Set Up Trading API Template In Python - Placing Orders with Interactive Broker</a></li></ul><h1 id="Why-do-we-need-to-build-this-capability-ourselves"><a href="#Why-do-we-need-to-build-this-capability-ourselves" class="headerlink" title="Why do we need to build this capability ourselves?"></a>Why do we need to build this capability ourselves?</h1><p>We have most of our functions ready from the previous two posts, except for the <code>def get_transaction()</code> function. Most brokers provide a function to retrieve historical transactions for at least 60 days. However, Interactive Brokers doesn’t support retrieving historical trades and portfolio performance. 
The reasons I want this function supported are:</p><ol><li>I want to compare the historical portfolio performance against the benchmark KPIs to see whether my trading strategy is successful or not.</li><li>In the <a href="https://mikelhsia.github.io/2022/11/04/2022-11-10-advanced-cppi-strategy/">CPPI strategy</a> we talked about before, the B and E ratio calculation depends on the previous day’s maximum portfolio value. Therefore, we need to persist it so that we won’t lose it every time we restart our trading script.</li><li>I would like to take the impact of the commission into account. Since I can’t retrieve older trading records from Interactive Brokers, I need to save those trading records in my local DB so that I can keep track of the commission spent on this strategy.</li></ol><p>To address the requirements listed above, building a database on the local machine is imperative. Below, I’m going to break my solution down into two sections:</p><ul><li>Design DB schema</li><li>Implement DB-related capabilities</li></ul><p>As for which DB to use here, a SQL server such as MySQL or a NoSQL store like MongoDB would be too complicated and far more powerful than needed. Therefore, I simply pick <code>sqlite3</code> to create easy-to-use local storage.</p><h1 id="Design-DB-schema"><a href="#Design-DB-schema" class="headerlink" title="Design DB schema"></a>Design DB schema</h1><p>We are going to create three tables, each addressing one of the requirements raised above.</p><ul><li>IB_SQLITE_CPPI_TBL_NAME<ul><li>The only critical variable here is the <code>MAX_ASSET</code>. This value keeps track of the maximum portfolio value and is used to calculate the CPPI E_ratio and B_ratio. 
If you want to know why this variable needs to be tracked in the data table, you can check out <a href="https://mikelhsia.github.io/2022/11/04/2022-11-10-advanced-cppi-strategy/">this post</a>.</li></ul></li><li>IB_SQLITE_TRANSACTION_TBL_NAME<ul><li>This table records the daily performance of our portfolio and the market benchmark. It holds <code>PORTFOLIO_CLOSE_VALUE</code>, <code>SPY_CLOSE_PRICE</code>, and <code>COMMISSION</code>, where the commission is summed from the rows in <strong>IB_SQLITE_ORDER_TBL_NAME</strong>.</li></ul></li><li>IB_SQLITE_ORDER_TBL_NAME<ul><li>This table records all the orders placed. I extract the following information from the <code>ib.trades()</code> response and store it in the table: <code>symbol</code>, <code>order_id</code>, <code>action</code> (buy or sell), <code>quantity</code>, <code>order status</code>, <code>commission cost</code>, and the <code>account number</code>.</li></ul></li></ul><img data-src="/2022/12/14/2022-12-17-IBKR-broker-3/db_schema.png" class="" width="600"><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>DB schema of three tables</i></p><h1 id="Implement-DB-related-capabilities"><a href="#Implement-DB-related-capabilities" class="headerlink" title="Implement DB-related capabilities"></a>Implement DB-related capabilities</h1><img data-src="/2022/12/14/2022-12-17-IBKR-broker-3/db_helper_functions.png" class="" width="600"><p><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Private and public functions for managing our DB</i></p><br>Here I separated the functions into two groups. The first group consists of private functions that perform database operations such as connecting to the database, creating a table, checking whether a table exists, and so on. These provide the minimum capability for managing the database. 
The second group consists of public functions that use the private functions to interact with a specific data table in the database.</p><p>Below are the private sqlite3 DB functions:<br><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span 
class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">########################################################################</span></span><br><span class="line"><span class="comment"># Sqlite3 private functions</span></span><br><span class="line"><span class="comment">########################################################################</span></span><br><span class="line"><span class="meta">@contextmanager</span></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">sqlite_connect</span>(<span class="params">self</span>):</span></span><br><span class="line">    dirs = os.path.dirname(os.path.abspath(__file__))</span><br><span class="line">    <span class="keyword">try</span>:</span><br><span class="line">        db_path = os.path.join(dirs, 
IB_SQLITE_DB_NAME)</span><br><span class="line">        conn = sqlite3.connect(db_path)</span><br><span class="line">        print(<span class="string">f&#x27;Sqlite connection established&#x27;</span>)</span><br><span class="line">        <span class="keyword">yield</span> conn</span><br><span class="line">        conn.close()</span><br><span class="line">        print(<span class="string">f&#x27;Sqlite connection closed&#x27;</span>)</span><br><span class="line">    <span class="keyword">except</span> OSError <span class="keyword">as</span> e:</span><br><span class="line">        print(<span class="string">f&#x27;We are having an OS error&#x27;</span>)</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">__sqlite_create_table</span>(<span class="params">self, conn=None, tbl_name=None</span>):</span></span><br><span class="line">    <span class="keyword">if</span> <span class="keyword">not</span> tbl_name <span class="keyword">or</span> <span class="keyword">not</span> conn:</span><br><span class="line">        <span class="keyword">return</span> <span class="literal">False</span></span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> tbl_name == IB_SQLITE_TRANSACTION_TBL_NAME:</span><br><span class="line">        conn.execute(<span class="string">f&#x27;&#x27;&#x27;CREATE TABLE <span class="subst">&#123;IB_SQLITE_TRANSACTION_TBL_NAME&#125;</span></span></span><br><span class="line"><span class="string">            (ID INTEGER PRIMARY KEY AUTOINCREMENT,</span></span><br><span class="line"><span class="string">            CREATE_TIME DATETIME NOT NULL,</span></span><br><span class="line"><span class="string">            PORTFOLIO_CLOSE_VALUE FLOAT NOT NULL,</span></span><br><span class="line"><span class="string">            SPY_CLOSE_PRICE FLOAT NOT NULL,</span></span><br><span class="line"><span class="string">            COMMISSION FLOAT NOT 
NULL);</span></span><br><span class="line"><span class="string">        &#x27;&#x27;&#x27;</span>)</span><br><span class="line">    <span class="keyword">elif</span> tbl_name == IB_SQLITE_CPPI_TBL_NAME:</span><br><span class="line">        conn.execute(<span class="string">f&#x27;&#x27;&#x27;CREATE TABLE <span class="subst">&#123;IB_SQLITE_CPPI_TBL_NAME&#125;</span></span></span><br><span class="line"><span class="string">            (ID INTEGER PRIMARY KEY AUTOINCREMENT,</span></span><br><span class="line"><span class="string">            CREATE_TIME DATETIME NOT NULL,</span></span><br><span class="line"><span class="string">            MAX_ASSET FLOAT NOT NULL,</span></span><br><span class="line"><span class="string">            E_RATIO FLOAT NOT NULL,</span></span><br><span class="line"><span class="string">            B_RATIO FLOAT NOT NULL);</span></span><br><span class="line"><span class="string">        &#x27;&#x27;&#x27;</span>)</span><br><span class="line">    <span class="keyword">elif</span> tbl_name == IB_SQLITE_ORDER_TBL_NAME:</span><br><span class="line">        conn.execute(<span class="string">f&#x27;&#x27;&#x27;CREATE TABLE <span class="subst">&#123;IB_SQLITE_ORDER_TBL_NAME&#125;</span></span></span><br><span class="line"><span class="string">            (ID INTEGER PRIMARY KEY AUTOINCREMENT,</span></span><br><span class="line"><span class="string">            CREATE_TIME DATETIME NOT NULL,</span></span><br><span class="line"><span class="string">            SYMBOL TEXT NOT NULL,</span></span><br><span class="line"><span class="string">            ORDER_ID TEXT NOT NULL UNIQUE,</span></span><br><span class="line"><span class="string">            ACTION TEXT NOT NULL,</span></span><br><span class="line"><span class="string">            QUANTITY INT NOT NULL,</span></span><br><span class="line"><span class="string">            ORDER_STATUS TEXT NOT NULL,</span></span><br><span class="line"><span class="string">            COMMISSION FLOAT NOT 
NULL,</span></span><br><span class="line"><span class="string">            ACCOUNT TEXT NOT NULL);</span></span><br><span class="line"><span class="string">        &#x27;&#x27;&#x27;</span>)</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> <span class="literal">True</span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">__sqlite_is_table_exist</span>(<span class="params">self, conn=None, tbl_name=None</span>):</span></span><br><span class="line">    <span class="keyword">if</span> <span class="keyword">not</span> tbl_name <span class="keyword">or</span> <span class="keyword">not</span> conn:</span><br><span class="line">        <span class="keyword">return</span> <span class="literal">False</span></span><br><span class="line"></span><br><span class="line">    c = conn.cursor()</span><br><span class="line"></span><br><span class="line">    c.execute(<span class="string">f&#x27;&#x27;&#x27;SELECT count(name)</span></span><br><span class="line"><span class="string">        FROM sqlite_master</span></span><br><span class="line"><span class="string">        WHERE type=&quot;table&quot; AND name=&quot;<span class="subst">&#123;tbl_name&#125;</span>&quot;;</span></span><br><span class="line"><span class="string">    &#x27;&#x27;&#x27;</span>)</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> c.fetchone()[<span class="number">0</span>]==<span class="number">1</span> :</span><br><span class="line">        <span class="comment"># Table exists</span></span><br><span class="line">        <span class="keyword">return</span> <span class="literal">True</span></span><br><span class="line">    <span class="keyword">else</span> :</span><br><span class="line">        <span class="comment"># Table does not exist</span></span><br><span class="line">        <span class="keyword">return</span> <span 
class="literal">False</span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">__sqlite_query_data</span>(<span class="params">self, conn=None, tbl_name=None</span>):</span></span><br><span class="line">    <span class="keyword">if</span> <span class="keyword">not</span> conn <span class="keyword">or</span> <span class="keyword">not</span> tbl_name:</span><br><span class="line">        <span class="keyword">return</span> <span class="literal">None</span></span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> <span class="keyword">not</span> self.__sqlite_is_table_exist(conn, tbl_name):</span><br><span class="line">        self.__sqlite_create_table(conn, tbl_name)</span><br><span class="line"></span><br><span class="line">    df = pd.read_sql_query(<span class="string">f&#x27;SELECT * from <span class="subst">&#123;tbl_name&#125;</span>;&#x27;</span>, conn)</span><br><span class="line">    <span class="keyword">return</span> df</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">__sqlite_insert_record</span>(<span class="params">self, conn=None, sql=None, value_tuple: tuple=None, tbl_name=None</span>):</span></span><br><span class="line">    <span class="keyword">if</span> <span class="keyword">not</span> self.__sqlite_is_table_exist(conn, tbl_name):</span><br><span class="line">        self.__sqlite_create_table(conn, tbl_name)</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> <span class="keyword">not</span> sql:</span><br><span class="line">        <span class="keyword">raise</span> RuntimeError(<span class="string">f&#x27;SQL string is empty&#x27;</span>)</span><br><span class="line"></span><br><span class="line">    conn.execute(sql, value_tuple)</span><br><span class="line"></span><br><span class="line">    
conn.commit()</span><br><span class="line">    <span class="keyword">return</span> <span class="literal">True</span></span><br></pre></td></tr></table></figure></p><p>As for the public functions in our script, they support the trading script so that it can do what we need it to do.</p><p>First of all, these two functions retrieve data from the corresponding data table and return it in <code>pd.DataFrame()</code> format.<br><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">get_transactions</span>(<span class="params">self</span>):</span></span><br><span class="line">    <span class="keyword">with</span> self.sqlite_connect() <span class="keyword">as</span> conn:</span><br><span class="line">        df = self.__sqlite_query_data(conn, IB_SQLITE_TRANSACTION_TBL_NAME)</span><br><span class="line">    <span class="keyword">return</span> df</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">get_cppi_variables</span>(<span class="params">self</span>):</span></span><br><span class="line">    <span class="keyword">with</span> self.sqlite_connect() <span class="keyword">as</span> conn:</span><br><span class="line">        df = self.__sqlite_query_data(conn, IB_SQLITE_CPPI_TBL_NAME)</span><br><span class="line">    <span class="keyword">return</span> df</span><br></pre></td></tr></table></figure></p><p>Secondly, we created three functions that parse the corresponding API responses into the data format we need. 
Therefore, this part of the functions involves interacting with the Interactive Brokers API, fetching data from sqlite3 local database, and processing the data accordingly.<br><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span 
class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">update_orders_in_db</span>(<span class="params">self</span>):</span></span><br><span class="line">    sql = <span class="string">f&#x27;&#x27;&#x27;INSERT OR IGNORE INTO <span class="subst">&#123;IB_SQLITE_ORDER_TBL_NAME&#125;</span> (CREATE_TIME, SYMBOL, ORDER_ID, ACTION, QUANTITY, ORDER_STATUS, COMMISSION, ACCOUNT) VALUES (?, ?, ?, ?, ?, ?, ?, ?);&#x27;&#x27;&#x27;</span></span><br><span class="line"></span><br><span class="line">    <span class="keyword">with</span> self.sqlite_connect() <span class="keyword">as</span> conn:</span><br><span class="line">        trades = self.client.trades()</span><br><span class="line">        <span class="keyword">for</span> trade <span class="keyword">in</span> trades:</span><br><span class="line">            perm_id = trade.order.permId</span><br><span class="line">            qty = trade.order.filledQuantity</span><br><span class="line">            symbol = trade.contract.symbol</span><br><span class="line">            action = trade.order.action</span><br><span class="line">            commission = sum([fill.commissionReport.commission <span class="keyword">for</span> fill <span class="keyword">in</span> trade.fills])</span><br><span class="line">            status = trade.orderStatus.status</span><br><span class="line">            exec_time = trade.log[<span class="number">0</span>].time</span><br><span class="line">            account = trade.order.account</span><br><span class="line">            self.__sqlite_insert_record(</span><br><span class="line">                conn,</span><br><span class="line">                sql,</span><br><span class="line">                (exec_time, symbol, perm_id, action, qty, status, commission, account),</span><br><span class="line">          
      IB_SQLITE_ORDER_TBL_NAME</span><br><span class="line">            )</span><br><span class="line">    logger.logger.debug(<span class="string">f&#x27;Database <span class="subst">&#123;IB_SQLITE_ORDER_TBL_NAME&#125;</span> updated&#x27;</span>)</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">update_transactions_in_db</span>(<span class="params">self</span>):</span></span><br><span class="line">    sql = <span class="string">f&#x27;&#x27;&#x27;INSERT OR IGNORE INTO <span class="subst">&#123;IB_SQLITE_TRANSACTION_TBL_NAME&#125;</span> (CREATE_TIME, PORTFOLIO_CLOSE_VALUE, SPY_CLOSE_PRICE, COMMISSION) VALUES (?,?,?,?);&#x27;&#x27;&#x27;</span></span><br><span class="line"></span><br><span class="line">    <span class="comment"># Portfolio value</span></span><br><span class="line">    portfolio_value = <span class="number">0</span></span><br><span class="line">    <span class="keyword">for</span> account <span class="keyword">in</span> self.accounts:</span><br><span class="line">        data = self.client.accountValues(account)</span><br><span class="line">        <span class="keyword">for</span> row <span class="keyword">in</span> data:</span><br><span class="line">            <span class="keyword">if</span> row.tag <span class="keyword">in</span> [<span class="string">&#x27;TotalCashBalance&#x27;</span>, <span class="string">&#x27;StockMarketValue&#x27;</span>] <span class="keyword">and</span> row.currency == self.currency:</span><br><span class="line">                portfolio_value += float(row.value)</span><br><span class="line"></span><br><span class="line">    <span class="comment"># SPY close value</span></span><br><span class="line">    benchmark_value = self.get_last_price_from_quote(<span class="string">&#x27;SPY&#x27;</span>)</span><br><span class="line"></span><br><span class="line">    <span class="comment"># Update the latest commission</span></span><br><span 
class="line">    commission = self.get_commission_from_db(<span class="number">1</span>)</span><br><span class="line"></span><br><span class="line">    <span class="keyword">with</span> self.sqlite_connect() <span class="keyword">as</span> conn:</span><br><span class="line">        self.__sqlite_insert_record(</span><br><span class="line">            conn,</span><br><span class="line">            sql,</span><br><span class="line">            (datetime.now(), portfolio_value, benchmark_value, commission),</span><br><span class="line">            IB_SQLITE_TRANSACTION_TBL_NAME</span><br><span class="line">        )</span><br><span class="line">    logger.logger.debug(<span class="string">f&#x27;Database <span class="subst">&#123;IB_SQLITE_TRANSACTION_TBL_NAME&#125;</span> updated&#x27;</span>)</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">update_cppi_variables_in_db</span>(<span class="params">self, max_asset, E, B</span>):</span></span><br><span class="line">    sql = <span class="string">f&#x27;&#x27;&#x27;INSERT OR IGNORE INTO <span class="subst">&#123;IB_SQLITE_CPPI_TBL_NAME&#125;</span> (CREATE_TIME, MAX_ASSET, E_RATIO, B_RATIO) VALUES (?,?,?,?);&#x27;&#x27;&#x27;</span></span><br><span class="line"></span><br><span class="line">    <span class="keyword">with</span> self.sqlite_connect() <span class="keyword">as</span> conn:</span><br><span class="line">        self.__sqlite_insert_record(</span><br><span class="line">            conn,</span><br><span class="line">            sql,</span><br><span class="line">            (datetime.now(), max_asset, E, B),</span><br><span class="line">            IB_SQLITE_CPPI_TBL_NAME</span><br><span class="line">        )</span><br></pre></td></tr></table></figure></p><p>Lastly, this function calculates the total commission for the current day (or for multiple days).</p><figure class="highlight 
python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">get_commission_from_db</span>(<span class="params">self, time_delta:int=<span class="number">0</span></span>) -&gt; float:</span></span><br><span class="line">    <span class="keyword">with</span> self.sqlite_connect() <span class="keyword">as</span> conn:</span><br><span class="line">        df = self.__sqlite_query_data(conn, IB_SQLITE_ORDER_TBL_NAME)</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> df.empty:</span><br><span class="line">        <span class="keyword">return</span> <span class="number">0</span></span><br><span class="line">    <span class="keyword">else</span>:</span><br><span class="line">        <span class="keyword">return</span> df[(datetime.now(self.timezone) - pd.to_datetime(df[<span class="string">&#x27;CREATE_TIME&#x27;</span>], utc=<span class="literal">False</span>)) &lt; timedelta(days=time_delta)][<span class="string">&#x27;COMMISSION&#x27;</span>].sum()</span><br></pre></td></tr></table></figure><h1 id="My-strategy-report-card"><a href="#My-strategy-report-card" class="headerlink" title="My strategy report card"></a>My strategy report card</h1><p>In the last part of this post, I’ll show you the portfolio performance metrics that I plan to use to evaluate the trading strategy with the data stored in our local database. 
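</p><p>Note that the report function further down calls a <code>__calculate_mdd</code> helper that is not listed in this post. The snippet below is only a minimal sketch of how such a maximum drawdown calculation could look, not the exact helper from the template: the name <code>max_drawdown</code> and the sample values are assumptions made for illustration, and it takes a plain list of closing values instead of a pandas Series.</p>
<pre>
```python
def max_drawdown(values):
    """Return the worst peak-to-trough decline as a negative fraction.

    Hypothetical helper for illustration; 0.0 means the series never
    fell below an earlier peak.
    """
    peak = None
    worst = 0.0
    for value in values:
        # Keep track of the highest value seen so far.
        peak = value if peak is None else max(peak, value)
        if peak:  # skip a zero peak to avoid dividing by zero
            # Relative distance from the running peak (0 or negative).
            worst = min(worst, value / peak - 1.0)
    return worst


# Made-up closing values: the peak is 120 and the following trough is 90.
closes = [100.0, 120.0, 90.0, 110.0, 130.0]
print(max_drawdown(closes))  # -0.25
```
</pre>
<p>Passing the <code>PORTFOLIO_CLOSE_VALUE</code> column as a list would give the portfolio drawdown in the same form, for example -0.25 for a 25% drop from the peak.</p><p>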
You can also modify the DB schema, record the information you need, and come up with other metrics that are important and helpful for evaluating the effectiveness of your trading script.</p><ol><li>Sharpe Ratio (SR)</li><li>Total return</li><li>Annualized return</li><li>Variance</li><li>Max Drawdown (MDD)</li><li>Trading fee spent</li></ol><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">get_strategy_report</span>(<span class="params">self, config=None, verbose=False</span>):</span></span><br><span class="line">    c = config  <span class="comment"># assumed to provide &#x27;init_cash&#x27; and &#x27;start_date&#x27;</span></span><br><span class="line">    ret_data = &#123;&#125;</span><br><span class="line">    tmp = self.get_transactions()</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> verbose <span class="keyword">is</span> <span 
class="literal">True</span>:</span><br><span class="line">        print(tmp)</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> tmp.empty:</span><br><span class="line">        ret_data[<span class="string">&#x27;Version&#x27;</span>] = <span class="string">&#x27;1.0&#x27;</span></span><br><span class="line">        ret_data[<span class="string">&#x27;SR/Portfolio&#x27;</span>] = <span class="number">0</span></span><br><span class="line">        ret_data[<span class="string">&#x27;SR/Benchmark&#x27;</span>] = <span class="number">0</span></span><br><span class="line">        ret_data[<span class="string">&#x27;Total Return/Portfolio&#x27;</span>] = <span class="number">0</span></span><br><span class="line">        ret_data[<span class="string">&#x27;Total Return/Benchmark&#x27;</span>] = <span class="number">0</span></span><br><span class="line">        ret_data[<span class="string">&#x27;Annualized Return/Portfolio&#x27;</span>] = <span class="number">0</span></span><br><span class="line">        ret_data[<span class="string">&#x27;Annualized Return/Benchmark&#x27;</span>] = <span class="number">0</span></span><br><span class="line">        ret_data[<span class="string">&#x27;Variance/Portfolio&#x27;</span>] = <span class="number">0</span></span><br><span class="line">        ret_data[<span class="string">&#x27;Variance/Benchmark&#x27;</span>] = <span class="number">0</span></span><br><span class="line">        ret_data[<span class="string">&#x27;MDD/Portfolio&#x27;</span>] = <span class="number">0</span></span><br><span class="line">        ret_data[<span class="string">&#x27;MDD/Benchmark&#x27;</span>] = <span class="number">0</span></span><br><span class="line">        ret_data[<span class="string">&#x27;Trading fee&#x27;</span>] = <span class="number">0</span></span><br><span class="line">        ret_data[<span class="string">&#x27;Trading fee ratio&#x27;</span>] = <span class="number">0</span></span><br><span 
class="line">    <span class="keyword">else</span>:</span><br><span class="line">        ret_data[<span class="string">&#x27;Version&#x27;</span>] = <span class="string">&#x27;1.0&#x27;</span></span><br><span class="line">        ret_data[<span class="string">&#x27;SR/Portfolio&#x27;</span>] = tmp.loc[:, <span class="string">&#x27;PORTFOLIO_CLOSE_VALUE&#x27;</span>].pct_change().mean() / tmp.loc[:, <span class="string">&#x27;PORTFOLIO_CLOSE_VALUE&#x27;</span>].pct_change().std()</span><br><span class="line">        ret_data[<span class="string">&#x27;SR/Benchmark&#x27;</span>] = tmp.loc[:, <span class="string">&#x27;SPY_CLOSE_PRICE&#x27;</span>].pct_change().mean() / tmp.loc[:, <span class="string">&#x27;SPY_CLOSE_PRICE&#x27;</span>].pct_change().std()</span><br><span class="line">        ret_data[<span class="string">&#x27;Total Return/Portfolio&#x27;</span>] = (tmp.loc[:, <span class="string">&#x27;PORTFOLIO_CLOSE_VALUE&#x27;</span>].iloc[<span class="number">-1</span>] / c[<span class="string">&#x27;init_cash&#x27;</span>]) - <span class="number">1</span></span><br><span class="line">        ret_data[<span class="string">&#x27;Total Return/Benchmark&#x27;</span>] = (tmp.loc[:, <span class="string">&#x27;SPY_CLOSE_PRICE&#x27;</span>].iloc[<span class="number">-1</span>] / tmp.loc[:, <span class="string">&#x27;SPY_CLOSE_PRICE&#x27;</span>].iloc[<span class="number">0</span>]) - <span class="number">1</span></span><br><span class="line">        ret_data[<span class="string">&#x27;Annualized Return/Portfolio&#x27;</span>] = (<span class="number">1</span> + ret_data[<span class="string">&#x27;Total Return/Portfolio&#x27;</span>])**(<span class="number">365</span>/(datetime.today() - pd.to_datetime(c[<span class="string">&#x27;start_date&#x27;</span>])).days) - <span class="number">1</span></span><br><span class="line">        ret_data[<span class="string">&#x27;Annualized Return/Benchmark&#x27;</span>] = (<span class="number">1</span> + ret_data[<span class="string">&#x27;Total Return/Benchmark&#x27;</span>])**(<span 
class="number">365</span>/(datetime.today() - pd.to_datetime(c[<span class="string">&#x27;start_date&#x27;</span>])).days) - <span class="number">1</span></span><br><span class="line">        ret_data[<span class="string">&#x27;Variance/Portfolio&#x27;</span>] = tmp.loc[:, <span class="string">&#x27;PORTFOLIO_CLOSE_VALUE&#x27;</span>].var()</span><br><span class="line">        ret_data[<span class="string">&#x27;Variance/Benchmark&#x27;</span>] = tmp.loc[:, <span class="string">&#x27;SPY_CLOSE_PRICE&#x27;</span>].var()</span><br><span class="line">        ret_data[<span class="string">&#x27;MDD/Portfolio&#x27;</span>] = self.__calculate_mdd(tmp.loc[:, <span class="string">&#x27;PORTFOLIO_CLOSE_VALUE&#x27;</span>])</span><br><span class="line">        ret_data[<span class="string">&#x27;MDD/Benchmark&#x27;</span>] = self.__calculate_mdd(tmp.loc[:, <span class="string">&#x27;SPY_CLOSE_PRICE&#x27;</span>])</span><br><span class="line">        ret_data[<span class="string">&#x27;Trading fee&#x27;</span>] = tmp.loc[:, <span class="string">&#x27;COMMISSION&#x27;</span>].sum()</span><br><span class="line">        ret_data[<span class="string">&#x27;Trading fee ratio&#x27;</span>] = ret_data[<span class="string">&#x27;Trading fee&#x27;</span>] / tmp.loc[:, <span class="string">&#x27;PORTFOLIO_CLOSE_VALUE&#x27;</span>].iloc[<span class="number">-1</span>]</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> ret_data</span><br></pre></td></tr></table></figure><hr><p>That’s it! I know this post has a lot of code and not much talk, but a good trading strategy should always include a performance evaluation so that you know whether the strategy is still effective. This is the last piece of my API template, and I hope it helps anyone who wants to build their own.</p>]]></content>
    
    
    <summary type="html">&lt;img data-src=&quot;/2022/12/14/2022-12-17-IBKR-broker-3/cover.png&quot; class=&quot;&quot; width=&quot;800&quot;&gt;
&lt;p&gt;Now we come to the third part of this series. In this post, I’m going to show you how I design and build my local database to store IBKR trades and other necessary information for generating meaningful indicators to review our strategy performance.&lt;/p&gt;</summary>
    
    
    <category term="How2" scheme="http://mikelhsia.github.io/categories/How2/"/>
    
    <category term="Quantitative Trading" scheme="http://mikelhsia.github.io/categories/How2/Quantitative-Trading/"/>
    
    
    <category term="How2" scheme="http://mikelhsia.github.io/tags/How2/"/>
    
    <category term="Python3" scheme="http://mikelhsia.github.io/tags/Python3/"/>
    
    <category term="Interactive Broker" scheme="http://mikelhsia.github.io/tags/Interactive-Broker/"/>
    
    <category term="Sqlite3" scheme="http://mikelhsia.github.io/tags/Sqlite3/"/>
    
  </entry>
  
  <entry>
    <title>【How 2】 Set Up Trading API Template In Python - Placing orders with Interactive Brokers</title>
    <link href="http://mikelhsia.github.io/2022/12/12/2022-12-16-IBKR-broker-2/"/>
    <id>http://mikelhsia.github.io/2022/12/12/2022-12-16-IBKR-broker-2/</id>
    <published>2022-12-12T03:11:24.000Z</published>
    <updated>2022-12-15T07:05:45.858Z</updated>
    
    <content type="html"><![CDATA[<img data-src="/2022/12/12/2022-12-16-IBKR-broker-2/cover.png" class="" width="800"><p>This is the second part of the <a href="https://mikelhsia.github.io/2022/12/12/2022-12-10-IBKR-Broker/"><strong>Set Up Trading API Template In Python</strong></a> series. We’re going to focus on implementing the rest of the functions in our Interactive Brokers class.</p><a id="more"></a><hr><blockquote><p>If you enjoy reading this and my other articles, feel free to <a href='https://medium.com/@mikelhsia/membership'>join the Medium membership program</a> to read more about quantitative trading strategies.</p></blockquote><hr><p><strong><em>Previous research</em></strong></p><ul><li><a href="https://mikelhsia.github.io/2022/11/04/2022-11-10-advanced-cppi-strategy/">【Momentum Trading】A Defense Trading Strategy That Works - CPPI (Constant Proportion Portfolio Insurance)</a></li><li><a href="https://mikelhsia.github.io/2022/12/12/2022-12-10-IBKR-Broker/">【How 2】Set Up Trading API Template In Python - Connecting My Trading Strategies To Interactive Brokers</a></li></ul><h1 id="Recap-and-what’s-the-next"><a href="#Recap-and-what’s-the-next" class="headerlink" title="Recap and what’s the next"></a>Recap and what’s next</h1><p>In the previous post, we saw how our trading script sends API calls to the IBKR API service via the IB Gateway, and we learned how to configure the IB Gateway application. We also walked through the code snippet that fetches the available cash balance and the total market value of the investments in your account from the server. 
Now, we’ll look at the rest of the functions in our <code>InteractiveBrokerTradeAPI</code> class.</p><ul><li>Get a much more detailed status report with get_account_detail</li><li>Fetch the market calendar</li><li>Create a valid order for Interactive Brokers</li></ul><h1 id="API-document-reference"><a href="#API-document-reference" class="headerlink" title="API document reference"></a>API document reference</h1><p><a href="https://ib-insync.readthedocs.io/api.html"><code>ib_insync</code></a></p><h1 id="Get-a-deep-dive-status-report-with-get-account-detail"><a href="#Get-a-deep-dive-status-report-with-get-account-detail" class="headerlink" title="Get a deep dive status report with get_account_detail"></a>Get a deep dive status report with get_account_detail</h1><p>The <code>get_account_detail()</code> function in our earlier example already extracts the <code>TotalCashBalance</code> and <code>StockMarketValue</code> from the <code>ib.accountValues()</code> response. If we would like to know more about the state of our portfolio, we can add two more calls: 1. the positions in our portfolio, 2. the orders that we placed in the past 24 hours.</p><p>For 1., we use <code>ib.portfolio()</code> to get information on the stocks we hold. 
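</p><p>As a rough illustration of that step, the sketch below summarizes a list of holdings by symbol. It does not call the broker: <code>Holding</code> is a hypothetical stand-in for the few <code>PortfolioItem</code> fields we care about, <code>summarize_positions</code> is a made-up helper name, and the sample numbers are copied from the sample response shown in this post.</p>
<pre>
```python
from dataclasses import dataclass


@dataclass
class Holding:
    """Hypothetical stand-in for the PortfolioItem fields used here."""
    symbol: str
    position: float
    market_value: float


def summarize_positions(holdings):
    """Map each symbol to its size, market value and portfolio weight."""
    total = sum(h.market_value for h in holdings)
    summary = {}
    for h in holdings:
        # Weight of this holding in the total stock market value.
        weight = h.market_value / total if total else 0.0
        summary[h.symbol] = {
            'qty': h.position,
            'market_val': h.market_value,
            'weight': weight,
        }
    return summary


# Sample figures taken from the response shown in this post.
holdings = [
    Holding('SHV', 427.0, 46957.23),
    Holding('SSO', 1060.0, 50386.04),
]
for symbol, row in summarize_positions(holdings).items():
    print(symbol, row['qty'], row['market_val'], round(row['weight'], 4))
```
</pre>
<p>With a live connection, the same loop could read <code>item.contract.symbol</code>, <code>item.position</code> and <code>item.marketValue</code> from each entry returned by <code>ib.portfolio()</code>.</p><p>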
We extract the position size and the market value of each symbol, which gives us a clearer picture of what our portfolio looks like.<br><figure class="highlight javascript"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">[PortfolioItem(contract=Stock(conId=<span class="number">42454579</span>, symbol=<span class="string">&#x27;SHV&#x27;</span>, right=<span class="string">&#x27;0&#x27;</span>, primaryExchange=<span class="string">&#x27;NASDAQ&#x27;</span>, currency=<span class="string">&#x27;USD&#x27;</span>, localSymbol=<span class="string">&#x27;SHV&#x27;</span>, tradingClass=<span class="string">&#x27;NMS&#x27;</span>), position=<span class="number">427.0</span>, marketPrice=<span class="number">109.9701004</span>, marketValue=<span class="number">46957.23</span>, averageCost=<span class="number">109.98463535</span>, unrealizedPNL=<span class="number">-6.21</span>, realizedPNL=<span class="number">0.0</span>, account=<span class="string">&#x27;DU4399668&#x27;</span>),</span><br><span class="line"> PortfolioItem(contract=Stock(conId=<span class="number">39622943</span>, symbol=<span class="string">&#x27;SSO&#x27;</span>, right=<span class="string">&#x27;0&#x27;</span>, primaryExchange=<span class="string">&#x27;ARCA&#x27;</span>, currency=<span class="string">&#x27;USD&#x27;</span>, localSymbol=<span class="string">&#x27;SSO&#x27;</span>, tradingClass=<span class="string">&#x27;SSO&#x27;</span>), position=<span class="number">1060.0</span>, marketPrice=<span class="number">47.5340004</span>, marketValue=<span class="number">50386.04</span>, averageCost=<span class="number">48.7388397</span>, unrealizedPNL=<span class="number">-1277.13</span>, realizedPNL=<span class="number">-62.91</span>, account=<span class="string">&#x27;DU4399668&#x27;</span>)]</span><br></pre></td></tr></table></figure></p><p style="font-size: 0.8em; text-align:center; color: grey;">  
<i>Response from the <b>ib.portfolio()</b> call</i></p><p>As for 2., we use <code>ib.trades()</code> to obtain the trades we made in the past 24 hours. Keep in mind that Interactive Brokers keeps this information for only about 24 hours, and you can no longer retrieve it once the server drops it. We will address this in another post by persisting the order-related information ourselves. From the Trade objects below, we extract the fields we need for each symbol, such as the order ID, average price, order status, and commission cost.<br><figure class="highlight javascript"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">[Trade(contract=Stock(conId=<span class="number">42454579</span>, symbol=<span class="string">&#x27;SHV&#x27;</span>, right=<span class="string">&#x27;?&#x27;</span>, exchange=<span class="string">&#x27;SMART&#x27;</span>, currency=<span class="string">&#x27;USD&#x27;</span>, localSymbol=<span class="string">&#x27;SHV&#x27;</span>, tradingClass=<span class="string">&#x27;NMS&#x27;</span>), order=Order(permId=<span class="number">423185966</span>, action=<span class="string">&#x27;SELL&#x27;</span>, orderType=<span class="string">&#x27;MKT&#x27;</span>, lmtPrice=<span class="number">0.0</span>, auxPrice=<span class="number">0.0</span>, tif=<span class="string">&#x27;DAY&#x27;</span>, ocaType=<span class="number">3</span>, displaySize=<span class="number">2147483647</span>, rule80A=<span class="string">&#x27;0&#x27;</span>, openClose=<span class="string">&#x27;&#x27;</span>, volatilityType=<span class="number">0</span>, deltaNeutralOrderType=<span class="string">&#x27;None&#x27;</span>, referencePriceType=<span class="number">0</span>, account=<span class="string">&#x27;DU4399668&#x27;</span>, clearingIntent=<span class="string">&#x27;IB&#x27;</span>, cashQty=<span class="number">0.0</span>, dontUseAutoPriceForHedge=True, filledQuantity=<span 
class="number">1.0</span>, refFuturesConId=<span class="number">2147483647</span>, shareholder=<span class="string">&#x27;Not an insider or substantial shareholder&#x27;</span>), orderStatus=OrderStatus(orderId=<span class="number">0</span>, status=<span class="string">&#x27;Filled&#x27;</span>, filled=<span class="number">0.0</span>, remaining=<span class="number">0.0</span>, avgFillPrice=<span class="number">0.0</span>, permId=<span class="number">0</span>, parentId=<span class="number">0</span>, lastFillPrice=<span class="number">0.0</span>, clientId=<span class="number">0</span>, whyHeld=<span class="string">&#x27;&#x27;</span>, mktCapPrice=<span class="number">0.0</span>), fills=[Fill(contract=Stock(conId=<span class="number">42454579</span>, symbol=<span class="string">&#x27;SHV&#x27;</span>, right=<span class="string">&#x27;?&#x27;</span>, exchange=<span class="string">&#x27;SMART&#x27;</span>, currency=<span class="string">&#x27;USD&#x27;</span>, localSymbol=<span class="string">&#x27;SHV&#x27;</span>, tradingClass=<span class="string">&#x27;NMS&#x27;</span>), execution=Execution(execId=<span class="string">&#x27;00025b49.63971bea.01.01&#x27;</span>, time=datetime.datetime(<span class="number">2022</span>, <span class="number">12</span>, <span class="number">12</span>, <span class="number">17</span>, <span class="number">2</span>, <span class="number">38</span>, tzinfo=datetime.timezone.utc), acctNumber=<span class="string">&#x27;DU4399668&#x27;</span>, exchange=<span class="string">&#x27;EDGEA&#x27;</span>, side=<span class="string">&#x27;SLD&#x27;</span>, shares=<span class="number">1.0</span>, price=<span class="number">109.97</span>, permId=<span class="number">423185966</span>, clientId=<span class="number">0</span>, orderId=<span class="number">0</span>, liquidation=<span class="number">0</span>, cumQty=<span class="number">1.0</span>, avgPrice=<span class="number">109.97</span>, orderRef=<span class="string">&#x27;&#x27;</span>, evRule=<span 
class="string">&#x27;&#x27;</span>, evMultiplier=<span class="number">0.0</span>, modelCode=<span class="string">&#x27;&#x27;</span>, lastLiquidity=<span class="number">2</span>), commissionReport=CommissionReport(execId=<span class="string">&#x27;00025b49.63971bea.01.01&#x27;</span>, commission=<span class="number">1.002648</span>, currency=<span class="string">&#x27;USD&#x27;</span>, realizedPNL=<span class="number">-1.017284</span>, yield_=<span class="number">0.0</span>, yieldRedemptionDate=<span class="number">0</span>), time=datetime.datetime(<span class="number">2022</span>, <span class="number">12</span>, <span class="number">12</span>, <span class="number">17</span>, <span class="number">2</span>, <span class="number">38</span>, tzinfo=datetime.timezone.utc))], log=[TradeLogEntry(time=datetime.datetime(<span class="number">2022</span>, <span class="number">12</span>, <span class="number">12</span>, <span class="number">17</span>, <span class="number">2</span>, <span class="number">38</span>, tzinfo=datetime.timezone.utc), status=<span class="string">&#x27;Filled&#x27;</span>, message=<span class="string">&#x27;Fill 1.0@109.97&#x27;</span>, errorCode=<span class="number">0</span>)], advancedError=<span class="string">&#x27;&#x27;</span>)]</span><br></pre></td></tr></table></figure></p><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Response from <b>ib.trades()</b> call</i></p><p>Combining everything we talked about above, we can construct our <code>get_account_detail()</code> function as below:<br><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span 
class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">get_account_detail</span>(<span class="params">self</span>):</span></span><br><span class="line">    self.accounts = self.client.managedAccounts()</span><br><span class="line"></span><br><span class="line">    acc_data = []</span><br><span class="line">    <span class="keyword">for</span> account <span class="keyword">in</span> self.accounts:</span><br><span class="line">        acc = &#123;&#125;</span><br><span class="line">        acc[<span class="string">&#x27;account&#x27;</span>] = account</span><br><span class="line">        data = self.client.accountValues(account)</span><br><span class="line">        acc[<span class="string">&#x27;cash&#x27;</span>] = <span class="number">0</span></span><br><span class="line">        acc[<span 
class="string">&#x27;total_assets&#x27;</span>] = <span class="number">0</span></span><br><span class="line">        <span class="keyword">for</span> row <span class="keyword">in</span> data:</span><br><span class="line">            <span class="keyword">if</span> row.tag <span class="keyword">in</span> [<span class="string">&#x27;TotalCashBalance&#x27;</span>] <span class="keyword">and</span> row.currency == self.currency:</span><br><span class="line">                acc[<span class="string">&#x27;cash&#x27;</span>] = row.value</span><br><span class="line">                acc[<span class="string">&#x27;total_assets&#x27;</span>] += float(row.value)</span><br><span class="line">            <span class="keyword">elif</span> row.tag <span class="keyword">in</span> [<span class="string">&#x27;StockMarketValue&#x27;</span>] <span class="keyword">and</span> row.currency == self.currency:</span><br><span class="line">                acc[<span class="string">&#x27;total_assets&#x27;</span>] += float(row.value)</span><br><span class="line">        acc_data.append(acc)</span><br><span class="line"></span><br><span class="line">    pos_data = []</span><br><span class="line">    data = self.client.portfolio()</span><br><span class="line">    <span class="keyword">for</span> position <span class="keyword">in</span> data:</span><br><span class="line">        pos = &#123;&#125;</span><br><span class="line"></span><br><span class="line">        pos[<span class="string">&#x27;code&#x27;</span>] = position.contract.symbol</span><br><span class="line">        pos[<span class="string">&#x27;qty&#x27;</span>] = position.position</span><br><span class="line">        pos[<span class="string">&#x27;cost_price&#x27;</span>] = position.averageCost</span><br><span class="line">        pos[<span class="string">&#x27;market_val&#x27;</span>] = position.marketValue</span><br><span class="line">        pos[<span class="string">&#x27;pl_val&#x27;</span>] = position.unrealizedPNL</span><br><span 
class="line">        <span class="keyword">if</span> pos[<span class="string">&#x27;cost_price&#x27;</span>] * pos[<span class="string">&#x27;qty&#x27;</span>] == <span class="number">0</span>:</span><br><span class="line">            pos[<span class="string">&#x27;pl_ratio&#x27;</span>] = <span class="number">0</span></span><br><span class="line">        <span class="keyword">else</span>:</span><br><span class="line">            pos[<span class="string">&#x27;pl_ratio&#x27;</span>] = pos[<span class="string">&#x27;pl_val&#x27;</span>] / (pos[<span class="string">&#x27;cost_price&#x27;</span>] * pos[<span class="string">&#x27;qty&#x27;</span>])</span><br><span class="line">        pos_data.append(pos)</span><br><span class="line"></span><br><span class="line">    orders_data = []</span><br><span class="line">    data = self.client.trades()</span><br><span class="line">    <span class="keyword">for</span> order <span class="keyword">in</span> data:</span><br><span class="line">        o = &#123;&#125;</span><br><span class="line">        o[<span class="string">&#x27;order_id&#x27;</span>] = order.order.orderId</span><br><span class="line">        o[<span class="string">&#x27;order_status&#x27;</span>] = order.orderStatus.status</span><br><span class="line">        o[<span class="string">&#x27;create_time&#x27;</span>] = order.log[<span class="number">-1</span>].time</span><br><span class="line">        o[<span class="string">&#x27;trd_side&#x27;</span>] = order.order.action</span><br><span class="line">        o[<span class="string">&#x27;order_type&#x27;</span>] = order.order.orderType</span><br><span class="line">        o[<span class="string">&#x27;code&#x27;</span>] = order.contract.symbol</span><br><span class="line">        orders_data.append(o)</span><br><span class="line">    <span class="keyword">return</span> acc_data, pos_data, orders_data</span><br></pre></td></tr></table></figure></p><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Full 
code of the <b>get_account_detail()</b> function</i></p><h1 id="Fetch-the-market-calendar"><a href="#Fetch-the-market-calendar" class="headerlink" title="Fetch the market calendar"></a>Fetch the market calendar</h1><p>The trading hours information in the <code>ib_insync</code> package is well hidden. After reading the API documentation carefully, I finally found it in the response of the <code>ib.reqContractDetails()</code> call, which looks like this:<br><figure class="highlight javascript"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line">ContractDetails(contract=Contract(secType=<span class="string">&#x27;STK&#x27;</span>, conId=<span class="number">756733</span>, symbol=<span class="string">&#x27;SPY&#x27;</span>, exchange=<span class="string">&#x27;SMART&#x27;</span>, primaryExchange=<span class="string">&#x27;ARCA&#x27;</span>, currency=<span class="string">&#x27;USD&#x27;</span>, localSymbol=<span class="string">&#x27;SPY&#x27;</span>,</span><br><span class="line">tradingClass=<span class="string">&#x27;SPY&#x27;</span>), marketName=<span class="string">&#x27;SPY&#x27;</span>, minTick=<span class="number">0.01</span>, orderTypes=<span class="string">&#x27;ACTIVETIM,AD,ADJUST,ALERT,ALGO,ALLOC,AON,AVGCOST,BASKET,BENCHPX,CASHQTY,COND,CONDORDER,DARKONLY,</span></span><br><span class="line"><span class="string">DARKPOLL,DAY,DEACT,DEACTDIS,DEACTEOD,DIS,DUR,GAT,GTC,GTD,GTT,HID,IBKRATS,ICE,IOC,LIT,LMT,LOC,MIDPX,MIT,MKT,MOC,MTL,NGCOMB,NODARK,NONALGO,OCA,OPG,OPGREROUT,PEGBENCH,</span></span><br><span class="line"><span 
class="string">PEGMID,POSTATS,POSTONLY,PREOPGRTH,PRICECHK,REL,REL2MID,RELPCTOFS,RTH,SCALE,SCALEODD,SCALERST,SIZECHK,SMARTSTG,SNAPMID,SNAPMKT,SNAPREL,STP,STPLMT,SWEEP,TRAIL,TRAILLIT,</span></span><br><span class="line"><span class="string">TRAILLMT,TRAILMIT,WHATIF&#x27;</span>, validExchanges=<span class="string">&#x27;SMART,AMEX,NYSE,CBOE,PHLX,ISE,CHX,ARCA,ISLAND,DRCTEDGE,BEX,BATS,EDGEA,CSFBALGO,JEFFALGO,BYX,IEX,EDGX,FOXRIVER,PEARL,NYSENAT,</span></span><br><span class="line"><span class="string">LTSE,MEMX,IBEOS,PSX&#x27;</span>, priceMagnifier=<span class="number">1</span>, underConId=<span class="number">0</span>, longName=<span class="string">&#x27;SPDR S&amp;P 500 ETF TRUST&#x27;</span>, contractMonth=<span class="string">&#x27;&#x27;</span>, industry=<span class="string">&#x27;&#x27;</span>, category=<span class="string">&#x27;&#x27;</span>, subcategory=<span class="string">&#x27;&#x27;</span>, timeZoneId=<span class="string">&#x27;US/Eastern&#x27;</span>,</span><br><span class="line"> tradingHours=<span class="string">&#x27;20221212:0400-20221212:2000;20221213:0400-20221213:2000;20221214:0400-20221214:2000;20221215:0400-20221215:2000;20221216:0400-20221216:2000&#x27;</span>,</span><br><span class="line"> liquidHours=<span class="string">&#x27;20221212:0930-20221212:1600;20221213:0930-20221213:1600;20221214:0930-20221214:1600;20221215:0930-20221215:1600;20221216:0930-20221216:1600&#x27;</span>, evRule=<span class="string">&#x27;&#x27;</span>,</span><br><span class="line"> evMultiplier=<span class="number">0</span>, mdSizeMultiplier=<span class="number">1</span>, aggGroup=<span class="number">1</span>, underSymbol=<span class="string">&#x27;&#x27;</span>, underSecType=<span class="string">&#x27;&#x27;</span>, marketRuleIds=<span class="string">&#x27;26,26,26,26,26,26,26,26,26,26,26,26,26,26,26,26,26,26,26,26,26,26,26,26,26&#x27;</span>,</span><br><span class="line"> secIdList=[TagValue(tag=<span class="string">&#x27;ISIN&#x27;</span>, value=<span 
class="string">&#x27;US78462F1030&#x27;</span>)], realExpirationDate=<span class="string">&#x27;&#x27;</span>, lastTradeTime=<span class="string">&#x27;&#x27;</span>, stockType=<span class="string">&#x27;ETF&#x27;</span>, minSize=<span class="number">0.0001</span>, sizeIncrement=<span class="number">0.0001</span>,</span><br><span class="line"> suggestedSizeIncrement=<span class="number">100.0</span>, cusip=<span class="string">&#x27;&#x27;</span>, ratings=<span class="string">&#x27;&#x27;</span>, descAppend=<span class="string">&#x27;&#x27;</span>, bondType=<span class="string">&#x27;&#x27;</span>, couponType=<span class="string">&#x27;&#x27;</span>, callable=False, putable=False, coupon=<span class="number">0</span>, convertible=False,</span><br><span class="line"> maturity=<span class="string">&#x27;&#x27;</span>, issueDate=<span class="string">&#x27;&#x27;</span>, nextOptionDate=<span class="string">&#x27;&#x27;</span>, nextOptionType=<span class="string">&#x27;&#x27;</span>, nextOptionPartial=False, notes=<span class="string">&#x27;&#x27;</span>)</span><br></pre></td></tr></table></figure></p><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Response of the <b>ib.reqContractDetails()</b> call</i></p><p>The trading calendar resides in this response, and we can extract it by parsing the <code>liquidHours</code> field like this:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span 
class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">is_market_open</span>(<span class="params">self, offset_days=<span class="number">0</span></span>):</span></span><br><span class="line">    spy_contract = ib_insync.Stock(<span class="string">&#x27;SPY&#x27;</span>, <span class="string">&#x27;SMART&#x27;</span>, self.currency)</span><br><span class="line">    self.client.qualifyContracts(spy_contract)</span><br><span class="line">    trading_days = self.client.reqContractDetails(spy_contract)[<span class="number">0</span>].liquidHours</span><br><span class="line">    trading_days_dict = &#123;d.split(<span class="string">&#x27;:&#x27;</span>)[<span class="number">0</span>]:d.split(<span class="string">&#x27;:&#x27;</span>)[<span class="number">1</span>] <span class="keyword">for</span> d <span class="keyword">in</span> trading_days.split(<span class="string">&#x27;;&#x27;</span>)&#125;</span><br><span class="line">    today_str = (datetime.now().astimezone(self.timezone) + timedelta(days=offset_days)).strftime(<span class="string">&#x27;%Y%m%d&#x27;</span>)</span><br><span class="line"></span><br><span class="line">    <span class="keyword">for</span> k, v <span class="keyword">in</span> trading_days_dict.items():</span><br><span class="line">        <span class="keyword">if</span> (today_str <span class="keyword">in</span> k) <span class="keyword">and</span> (v == <span class="string">&#x27;CLOSED&#x27;</span>):</span><br><span class="line">            <span class="keyword">return</span> <span class="literal">False</span></span><br><span class="line"></span><br><span class="line"> 
   <span class="keyword">return</span> <span class="literal">True</span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">is_market_open_now</span>(<span class="params">self</span>):</span></span><br><span class="line">    spy_contract = ib_insync.Stock(<span class="string">&#x27;SPY&#x27;</span>, <span class="string">&#x27;SMART&#x27;</span>, self.currency)</span><br><span class="line">    self.client.qualifyContracts(spy_contract)</span><br><span class="line">    trading_days = self.client.reqContractDetails(spy_contract)[<span class="number">0</span>].liquidHours</span><br><span class="line">    trading_days_list = [d.split(<span class="string">&#x27;-&#x27;</span>) <span class="keyword">for</span> d <span class="keyword">in</span> trading_days.split(<span class="string">&#x27;;&#x27;</span>)]</span><br><span class="line"></span><br><span class="line">    day_str = datetime.now().astimezone(self.timezone).strftime(<span class="string">&#x27;%Y%m%d&#x27;</span>)</span><br><span class="line">    time_str = datetime.now().astimezone(self.timezone).strftime(<span class="string">&#x27;%H%M&#x27;</span>)</span><br><span class="line"></span><br><span class="line">    <span class="keyword">for</span> d <span class="keyword">in</span> trading_days_list:</span><br><span class="line">        <span class="keyword">if</span> len(d) &gt; <span class="number">1</span> <span class="keyword">and</span> day_str <span class="keyword">in</span> d[<span class="number">0</span>].split()[<span class="number">0</span>]:</span><br><span class="line">            <span class="keyword">if</span> time_str &gt; d[<span class="number">0</span>].split(<span class="string">&#x27;:&#x27;</span>)[<span class="number">1</span>] <span class="keyword">and</span> time_str &lt; d[<span class="number">1</span>].split(<span class="string">&#x27;:&#x27;</span>)[<span class="number">1</span>]:</span><br><span 
class="line">                <span class="keyword">return</span> <span class="literal">True</span></span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> <span class="literal">False</span></span><br></pre></td></tr></table></figure><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Full code of the <b>is_market_open()</b> and <b>is_market_open_now()</b> functions</i></p><h1 id="How-to-create-a-valid-order-for-Interactive-Broker"><a href="#How-to-create-a-valid-order-for-Interactive-Broker" class="headerlink" title="How to create a valid order for Interactive Brokers"></a>How to create a valid order for Interactive Brokers</h1><p>To create a valid order that Interactive Brokers will accept, follow these steps:</p><ol><li>Specify the symbol and the currency. Use <code>contract = ib_insync.Stock(symbol, &#39;SMART&#39;, self.currency)</code> to create a <code>contract</code> object for later use. The first parameter is the stock symbol, the second is the exchange (here <code>SMART</code>, IBKR’s smart order routing), and the third is the currency (here we use <code>USD</code>).</li><li>Query the broker for the matching stock information. <code>ib.qualifyContracts(contract)</code> looks the contract up on the server and fills in the missing fields of the <code>contract</code> object, such as its <code>conId</code>.<img data-src="/2022/12/12/2022-12-16-IBKR-broker-2/contract.png" class="" width="800"><p style="font-size: 0.8em; text-align:center; color: grey;"><i>Contract object before and after using <b>ib.qualifyContracts()</b> to fill in the correct data</i></p></li><li>We need the latest quote price to calculate how many shares to purchase. First, call <code>reqMarketDataType</code> to tell the server which type of data you’re requesting. 
There are four market data types:<ol><li>1 - Live market data (top of the book)</li><li>2 - Frozen data (at the close)</li><li>3 - Delayed data (can be used if there are no live subscriptions)</li><li>4 - Frozen Delayed data (outside of regular trading hours)<br>Once the market data type is set, we can request the quote price from the server by passing the <code>contract</code> as the first parameter and setting <code>snapshot</code> to <em>True</em>.<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">ib.reqMarketDataType(<span class="number">3</span>)</span><br><span class="line">quote = ib.reqMktData(</span><br><span class="line">    contract,</span><br><span class="line">    genericTickList=<span class="string">&quot;&quot;</span>,</span><br><span class="line">    snapshot=<span class="literal">True</span>,</span><br><span class="line">    regulatorySnapshot=<span class="literal">False</span>,</span><br><span class="line">    mktDataOptions=<span class="literal">None</span></span><br><span class="line">)</span><br></pre></td></tr></table></figure>One thing worth mentioning: do you remember why I’m using the <code>ib_insync</code> package instead of the native IBKR API? Rather than fetching a quote price from the API server, it is more accurate to say that we subscribe to price updates. For example, we subscribe to a price bar to get 5-minute, 10-minute, or one-day price data, and the data is then streamed to us in the background. 
Therefore, before extracting the quote price from the returned object, you must first make sure the price has actually been received.<blockquote><p><em>Note: Before requesting a market quote, you need to subscribe to the market data on the IBKR platform. You can find the management page in TWS or IB Gateway under “Account” -&gt; “Manage Account” -&gt; “Subscribe Market Data/Research”.</em></p></blockquote></li></ol></li><li>Lastly, besides calling <code>sleep()</code> to wait until the price data has arrived from the server, we can also register a callback function that is invoked whenever the status of an order changes. The callback is registered globally on the <code>ib</code> instance with <code>ib.orderStatusEvent += [callback function]</code>.</li></ol><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span 
class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">@contextmanager</span></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">connect</span>(<span class="params">self</span>):</span></span><br><span class="line">    self.client = ib_insync.IB()</span><br><span class="line">    <span class="comment"># Newly added</span></span><br><span class="line">    self.client.orderStatusEvent += self.__order_status</span><br><span class="line">    self.client.connect(</span><br><span class="line">        IB_TWS_URI,</span><br><span class="line">        <span class="comment"># IB_GATEWAY_PAPER_PORT,</span></span><br><span class="line">        
IB_TWS_PAPER_PORT,</span><br><span class="line">        IB_TWS_CLIENT_ID</span><br><span class="line">    )</span><br><span class="line"></span><br><span class="line">    <span class="keyword">yield</span> self</span><br><span class="line"></span><br><span class="line">    self.client.disconnect()</span><br><span class="line">    self.client.sleep(<span class="number">2</span>)</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">place_order</span>(<span class="params">self, symbol: str, quantity: int, price: float=<span class="number">0</span></span>):</span></span><br><span class="line">    contract = ib_insync.Stock(symbol.upper(), <span class="string">&#x27;SMART&#x27;</span>, self.currency)</span><br><span class="line">    self.client.qualifyContracts(contract)</span><br><span class="line">    <span class="keyword">if</span> quantity &gt;= <span class="number">0.0</span>:</span><br><span class="line">        order = ib_insync.MarketOrder(<span class="string">&#x27;BUY&#x27;</span>, quantity)</span><br><span class="line">    <span class="keyword">else</span>:</span><br><span class="line">        order = ib_insync.MarketOrder(<span class="string">&#x27;SELL&#x27;</span>, -quantity)</span><br><span class="line">    trade = self.client.placeOrder(contract, order)</span><br><span class="line">    self.client.sleep(<span class="number">5</span>)</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> <span class="literal">True</span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">get_last_price_from_quote</span>(<span class="params">self, symbol: str</span>):</span></span><br><span class="line">    contract = ib_insync.Stock(symbol.upper(), <span class="string">&#x27;SMART&#x27;</span>, self.currency)</span><br><span class="line">    
self.client.qualifyContracts(contract)</span><br><span class="line">    <span class="comment"># ib.reqMarketDataType(1)   # Live market data: (top of the book)</span></span><br><span class="line">    <span class="comment"># ib.reqMarketDataType(2)   # Frozen data (at the close)</span></span><br><span class="line">    <span class="comment"># ib.reqMarketDataType(3)   # Delayed data (can be used if there is no live subscriptions)</span></span><br><span class="line">    <span class="comment"># ib.reqMarketDataType(4)   # Frozen Delayed data (outside of regular trading hours)</span></span><br><span class="line">    self.client.reqMarketDataType(<span class="number">3</span>)</span><br><span class="line">    quote = self.client.reqMktData(</span><br><span class="line">        contract,</span><br><span class="line">        genericTickList=<span class="string">&quot;&quot;</span>,</span><br><span class="line">        snapshot=<span class="literal">True</span>,</span><br><span class="line">        regulatorySnapshot=<span class="literal">False</span>,</span><br><span class="line">        mktDataOptions=<span class="literal">None</span></span><br><span class="line">    )</span><br><span class="line"></span><br><span class="line">    <span class="keyword">for</span> _ <span class="keyword">in</span> range(<span class="number">10</span>):</span><br><span class="line">        <span class="keyword">if</span> math.isnan(quote.last):</span><br><span class="line">            self.client.sleep(<span class="number">1</span>)</span><br><span class="line">        <span class="keyword">else</span>:</span><br><span class="line">            <span class="keyword">return</span> quote.last</span><br><span class="line">    logger.logger.error(<span class="string">&#x27;&#123;&#125;[&#123;&#125;]: &#123;&#125;&#x27;</span>.format(sys._getframe().f_code.co_name, symbol, <span class="string">&#x27;No last price in quote&#x27;</span>))</span><br><span class="line">    
self.notifier.send_msg(<span class="string">&#x27;&#123;&#125;[&#123;&#125;]&#x27;</span>.format(sys._getframe().f_code.co_name, symbol), <span class="string">&#x27;No last price in quote&#x27;</span>)</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span></span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">__order_status</span>(<span class="params">self, trade</span>):</span></span><br><span class="line">    <span class="string">&#x27;&#x27;&#x27;</span></span><br><span class="line"><span class="string">    Call back function for checking order status</span></span><br><span class="line"><span class="string">    &#x27;&#x27;&#x27;</span></span><br><span class="line">    print(<span class="string">f&#x27;Order [<span class="subst">&#123;trade.contract.symbol&#125;</span>] status updated: <span class="subst">&#123;trade.orderStatus.status&#125;</span>&#x27;</span>)</span><br><span class="line">    match trade.orderStatus.status:</span><br><span class="line">        case <span class="string">&#x27;Filled&#x27;</span>:</span><br><span class="line">            print(<span class="string">f&#x27;<span class="subst">&#123;trade=&#125;</span>&#x27;</span>)</span><br><span class="line">            self.update_order_from_filledEvent_in_db(trade)</span><br><span class="line">        case <span class="string">&#x27;PendingSubmit&#x27;</span>:</span><br><span class="line">            <span class="comment"># print(f&#x27;Pending submit: &#123;trade&#125;&#x27;)</span></span><br><span class="line">            <span class="keyword">pass</span></span><br><span class="line">        case <span class="string">&#x27;Submitted&#x27;</span>:</span><br><span class="line">            <span class="comment"># print(f&#x27;Submitted: &#123;trade&#125;&#x27;)</span></span><br><span class="line">            <span 
class="keyword">pass</span></span><br><span class="line">        case _:</span><br><span class="line">            <span class="comment"># print(f&#x27;Others: &#123;trade.orderStatus.status&#125;&#x27;)</span></span><br><span class="line">            <span class="keyword">pass</span></span><br></pre></td></tr></table></figure><h1 id="Let’s-wrap-it-up"><a href="#Let’s-wrap-it-up" class="headerlink" title="Let’s wrap it up"></a>Let’s wrap it up</h1><p>We now have all the pieces ready except <code>get_transaction()</code>, which we will cover in another post because it involves database management. Let’s now add some test code so that you can place an order yourself.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span 
class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span 
class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br><span class="line">107</span><br><span class="line">108</span><br><span class="line">109</span><br><span class="line">110</span><br><span class="line">111</span><br><span class="line">112</span><br><span class="line">113</span><br><span class="line">114</span><br><span class="line">115</span><br><span class="line">116</span><br><span class="line">117</span><br><span class="line">118</span><br><span class="line">119</span><br><span class="line">120</span><br><span class="line">121</span><br><span class="line">122</span><br><span class="line">123</span><br><span class="line">124</span><br><span class="line">125</span><br><span class="line">126</span><br><span class="line">127</span><br><span class="line">128</span><br><span class="line">129</span><br><span class="line">130</span><br><span class="line">131</span><br><span class="line">132</span><br><span class="line">133</span><br><span class="line">134</span><br><span class="line">135</span><br><span class="line">136</span><br><span class="line">137</span><br><span class="line">138</span><br><span class="line">139</span><br><span class="line">140</span><br><span class="line">141</span><br><span class="line">142</span><br><span class="line">143</span><br><span class="line">144</span><br><span class="line">145</span><br><span class="line">146</span><br><span class="line">147</span><br><span class="line">148</span><br><span class="line">149</span><br><span class="line">150</span><br><span class="line">151</span><br><span class="line">152</span><br><span class="line">153</span><br><span class="line">154</span><br><span class="line">155</span><br><span class="line">156</span><br><span class="line">157</span><br><span class="line">158</span><br><span class="line">159</span><br><span class="line">160</span><br><span 
class="line">161</span><br><span class="line">162</span><br><span class="line">163</span><br><span class="line">164</span><br><span class="line">165</span><br><span class="line">166</span><br><span class="line">167</span><br><span class="line">168</span><br><span class="line">169</span><br><span class="line">170</span><br><span class="line">171</span><br><span class="line">172</span><br><span class="line">173</span><br><span class="line">174</span><br><span class="line">175</span><br><span class="line">176</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># test.py</span></span><br><span class="line"><span class="keyword">from</span> contextlib <span class="keyword">import</span> contextmanager</span><br><span class="line"><span class="keyword">import</span> ib_insync</span><br><span class="line"><span class="keyword">from</span> modules.broker.TradeAPI <span class="keyword">import</span> AbstractTradeInterface</span><br><span class="line"><span class="keyword">from</span> datetime <span class="keyword">import</span> datetime, timedelta</span><br><span class="line"><span class="keyword">import</span> math</span><br><span class="line"><span class="keyword">from</span> zoneinfo <span class="keyword">import</span> ZoneInfo</span><br><span class="line"></span><br><span class="line"><span class="class"><span class="keyword">class</span> <span class="title">InteractiveBrokerTradeAPI</span>(<span class="params">AbstractTradeInterface</span>):</span></span><br><span class="line">    <span class="function"><span class="keyword">def</span> <span class="title">__init__</span>(<span class="params">self,currency=<span class="string">&#x27;USD&#x27;</span></span>):</span></span><br><span class="line">        self.client = <span class="literal">None</span></span><br><span class="line">        self.accounts = []</span><br><span class="line">        self.currency = currency</span><br><span class="line">        self.timezone = ZoneInfo(<span 
class="string">&#x27;US/Eastern&#x27;</span>)</span><br><span class="line"></span><br><span class="line"><span class="meta">    @contextmanager</span></span><br><span class="line">    <span class="function"><span class="keyword">def</span> <span class="title">connect</span>(<span class="params">self</span>):</span></span><br><span class="line">        self.client = ib_insync.IB()</span><br><span class="line">        <span class="comment"># Newly added</span></span><br><span class="line">        self.client.orderStatusEvent += self.__order_status</span><br><span class="line">        self.client.connect(<span class="string">&#x27;127.0.0.1&#x27;</span>, <span class="number">7497</span>, <span class="number">101</span>)</span><br><span class="line">        print(<span class="string">&quot;=&quot;</span>*<span class="number">30</span>)</span><br><span class="line">        print(<span class="string">&quot;Connection established&quot;</span>)</span><br><span class="line">        print(<span class="string">&quot;=&quot;</span>*<span class="number">30</span>)</span><br><span class="line"></span><br><span class="line">        <span class="keyword">yield</span> self</span><br><span class="line"></span><br><span class="line">        self.client.disconnect()</span><br><span class="line">        self.client.sleep(<span class="number">2</span>)</span><br><span class="line">        print(<span class="string">&quot;=&quot;</span>*<span class="number">30</span>)</span><br><span class="line">        print(<span class="string">&quot;Connection closed&quot;</span>)</span><br><span class="line">        print(<span class="string">&quot;=&quot;</span>*<span class="number">30</span>)</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">def</span> <span class="title">get_account_detail</span>(<span class="params">self</span>):</span></span><br><span class="line">        self.accounts = self.client.managedAccounts()</span><br><span 
class="line"></span><br><span class="line">        acc_data = []</span><br><span class="line">        <span class="keyword">for</span> account <span class="keyword">in</span> self.accounts:</span><br><span class="line">            acc = &#123;&#125;</span><br><span class="line">            acc[<span class="string">&#x27;account&#x27;</span>] = account</span><br><span class="line">            data = self.client.accountValues(account)</span><br><span class="line">            acc[<span class="string">&#x27;cash&#x27;</span>] = <span class="number">0</span></span><br><span class="line">            acc[<span class="string">&#x27;total_assets&#x27;</span>] = <span class="number">0</span></span><br><span class="line">            <span class="keyword">for</span> row <span class="keyword">in</span> data:</span><br><span class="line">                <span class="keyword">if</span> row.tag <span class="keyword">in</span> [<span class="string">&#x27;TotalCashBalance&#x27;</span>] <span class="keyword">and</span> row.currency == self.currency:</span><br><span class="line">                    acc[<span class="string">&#x27;cash&#x27;</span>] = row.value</span><br><span class="line">                    acc[<span class="string">&#x27;total_assets&#x27;</span>] += float(row.value)</span><br><span class="line">                <span class="keyword">elif</span> row.tag <span class="keyword">in</span> [<span class="string">&#x27;StockMarketValue&#x27;</span>] <span class="keyword">and</span> row.currency == self.currency:</span><br><span class="line">                    acc[<span class="string">&#x27;total_assets&#x27;</span>] += float(row.value)</span><br><span class="line">            acc_data.append(acc)</span><br><span class="line"></span><br><span class="line">        pos_data = []</span><br><span class="line">        data = self.client.portfolio()</span><br><span class="line">        <span class="keyword">for</span> position <span class="keyword">in</span> data:</span><br><span 
class="line">            pos = &#123;&#125;</span><br><span class="line"></span><br><span class="line">            pos[<span class="string">&#x27;code&#x27;</span>] = position.contract.symbol</span><br><span class="line">            pos[<span class="string">&#x27;qty&#x27;</span>] = position.position</span><br><span class="line">            pos[<span class="string">&#x27;cost_price&#x27;</span>] = position.averageCost</span><br><span class="line">            pos[<span class="string">&#x27;market_val&#x27;</span>] = position.marketValue</span><br><span class="line">            pos[<span class="string">&#x27;pl_val&#x27;</span>] = position.unrealizedPNL</span><br><span class="line">            <span class="keyword">if</span> pos[<span class="string">&#x27;cost_price&#x27;</span>] * pos[<span class="string">&#x27;qty&#x27;</span>] == <span class="number">0</span>:</span><br><span class="line">                pos[<span class="string">&#x27;pl_ratio&#x27;</span>] = <span class="number">0</span></span><br><span class="line">            <span class="keyword">else</span>:</span><br><span class="line">                pos[<span class="string">&#x27;pl_ratio&#x27;</span>] = pos[<span class="string">&#x27;pl_val&#x27;</span>] / (pos[<span class="string">&#x27;cost_price&#x27;</span>] * pos[<span class="string">&#x27;qty&#x27;</span>])</span><br><span class="line">            pos_data.append(pos)</span><br><span class="line"></span><br><span class="line">        orders_data = []</span><br><span class="line">        data = self.client.trades()</span><br><span class="line">        <span class="keyword">for</span> order <span class="keyword">in</span> data:</span><br><span class="line">            o = &#123;&#125;</span><br><span class="line">            o[<span class="string">&#x27;order_id&#x27;</span>] = order.order.permId</span><br><span class="line">            o[<span class="string">&#x27;order_status&#x27;</span>] = order.orderStatus.status</span><br><span class="line">     
       o[<span class="string">&#x27;create_time&#x27;</span>] = order.log[<span class="number">-1</span>].time</span><br><span class="line">            o[<span class="string">&#x27;trd_side&#x27;</span>] = order.order.action</span><br><span class="line">            o[<span class="string">&#x27;order_type&#x27;</span>] = order.order.orderType</span><br><span class="line">            o[<span class="string">&#x27;code&#x27;</span>] = order.contract.symbol</span><br><span class="line">            orders_data.append(o)</span><br><span class="line">        <span class="keyword">return</span> acc_data, pos_data, orders_data</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">def</span> <span class="title">place_order</span>(<span class="params">self, symbol: str, quantity: int, price: float=<span class="number">0</span></span>):</span></span><br><span class="line">        contract = ib_insync.Stock(symbol.upper(), <span class="string">&#x27;SMART&#x27;</span>, self.currency)</span><br><span class="line">        self.client.qualifyContracts(contract)</span><br><span class="line">        <span class="keyword">if</span> quantity &gt;= <span class="number">0.0</span>:</span><br><span class="line">            order = ib_insync.MarketOrder(<span class="string">&#x27;BUY&#x27;</span>, quantity)</span><br><span class="line">        <span class="keyword">else</span>:</span><br><span class="line">            order = ib_insync.MarketOrder(<span class="string">&#x27;SELL&#x27;</span>, -quantity)</span><br><span class="line">        trade = self.client.placeOrder(contract, order)</span><br><span class="line">        self.client.sleep(<span class="number">5</span>)</span><br><span class="line"></span><br><span class="line">        <span class="keyword">return</span> <span class="literal">True</span></span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">def</span> <span 
class="title">get_last_price_from_quote</span>(<span class="params">self, symbol:str</span>):</span></span><br><span class="line">        contract = ib_insync.Stock(symbol.upper(), <span class="string">&#x27;SMART&#x27;</span>, self.currency)</span><br><span class="line">        <span class="comment"># Market data type 3 requests delayed data, for accounts without a live data subscription</span></span><br><span class="line">        self.client.reqMarketDataType(<span class="number">3</span>)</span><br><span class="line">        self.client.qualifyContracts(contract)</span><br><span class="line">        quote = self.client.reqMktData(</span><br><span class="line">            contract,</span><br><span class="line">            genericTickList=<span class="string">&quot;&quot;</span>,</span><br><span class="line">            snapshot=<span class="literal">True</span>,</span><br><span class="line">            regulatorySnapshot=<span class="literal">False</span>,</span><br><span class="line">            mktDataOptions=<span class="literal">None</span></span><br><span class="line">        )</span><br><span class="line"></span><br><span class="line">        <span class="keyword">for</span> _ <span class="keyword">in</span> range(<span class="number">10</span>):</span><br><span class="line">            <span class="keyword">if</span> math.isnan(quote.last):</span><br><span class="line">                self.client.sleep(<span class="number">1</span>)</span><br><span class="line">            <span class="keyword">else</span>:</span><br><span class="line">                <span class="keyword">return</span> quote.last</span><br><span class="line">        print(<span class="string">f&#x27;No last price in quote for <span class="subst">&#123;symbol&#125;</span>&#x27;</span>)</span><br><span class="line">        <span class="keyword">return</span> <span class="number">0</span></span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">def</span> <span 
class="title">__order_status</span>(<span class="params">self, trade</span>):</span></span><br><span class="line">        <span class="string">&#x27;&#x27;&#x27;</span></span><br><span class="line"><span class="string">        Call back function for checking order status</span></span><br><span class="line"><span class="string">        &#x27;&#x27;&#x27;</span></span><br><span class="line">        print(<span class="string">f&#x27;Order [<span class="subst">&#123;trade.contract.symbol&#125;</span>] status updated: <span class="subst">&#123;trade.orderStatus.status&#125;</span>&#x27;</span>)</span><br><span class="line">        match trade.orderStatus.status:</span><br><span class="line">            case <span class="string">&#x27;Filled&#x27;</span>:</span><br><span class="line">                print(<span class="string">f&#x27;Order <span class="subst">&#123;trade.contract.symbol&#125;</span>, filled.&#x27;</span>)</span><br><span class="line">            case _:</span><br><span class="line">                print(<span class="string">f&#x27;Others order status: <span class="subst">&#123;trade.orderStatus.status&#125;</span>&#x27;</span>)</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">def</span> <span class="title">is_market_open</span>(<span class="params">self, offset_days=<span class="number">0</span></span>):</span></span><br><span class="line">        spy_contract = ib_insync.Stock(<span class="string">&#x27;SPY&#x27;</span>, <span class="string">&#x27;SMART&#x27;</span>, self.currency)</span><br><span class="line">        self.client.qualifyContracts(spy_contract)</span><br><span class="line">        trading_days = self.client.reqContractDetails(spy_contract)[<span class="number">0</span>].liquidHours</span><br><span class="line">        trading_days_dict = &#123;d.split(<span class="string">&#x27;:&#x27;</span>)[<span class="number">0</span>]:d.split(<span class="string">&#x27;:&#x27;</span>)[<span 
class="number">1</span>] <span class="keyword">for</span> d <span class="keyword">in</span> trading_days.split(<span class="string">&#x27;;&#x27;</span>)&#125;</span><br><span class="line">        today_str = (datetime.now().astimezone(self.timezone) + timedelta(days=offset_days)).strftime(<span class="string">&#x27;%Y%m%d&#x27;</span>)</span><br><span class="line"></span><br><span class="line">        <span class="keyword">for</span> k, v <span class="keyword">in</span> trading_days_dict.items():</span><br><span class="line">            <span class="keyword">if</span> (today_str <span class="keyword">in</span> k) <span class="keyword">and</span> (v == <span class="string">&#x27;CLOSED&#x27;</span>):</span><br><span class="line">                <span class="keyword">return</span> <span class="literal">False</span></span><br><span class="line"></span><br><span class="line">        <span class="keyword">return</span> <span class="literal">True</span></span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">def</span> <span class="title">is_market_open_now</span>(<span class="params">self</span>):</span></span><br><span class="line">        spy_contract = ib_insync.Stock(<span class="string">&#x27;SPY&#x27;</span>, <span class="string">&#x27;SMART&#x27;</span>, self.currency)</span><br><span class="line">        self.client.qualifyContracts(spy_contract)</span><br><span class="line">        trading_days = self.client.reqContractDetails(spy_contract)[<span class="number">0</span>].liquidHours</span><br><span class="line">        trading_days_list = [d.split(<span class="string">&#x27;-&#x27;</span>) <span class="keyword">for</span> d <span class="keyword">in</span> trading_days.split(<span class="string">&#x27;;&#x27;</span>)]</span><br><span class="line"></span><br><span class="line">        day_str = datetime.now().astimezone(self.timezone).strftime(<span class="string">&#x27;%Y%m%d&#x27;</span>)</span><br><span 
class="line">        time_str = datetime.now().astimezone(self.timezone).strftime(<span class="string">&#x27;%H%M&#x27;</span>)</span><br><span class="line"></span><br><span class="line">        <span class="keyword">for</span> d <span class="keyword">in</span> trading_days_list:</span><br><span class="line">            <span class="keyword">if</span> len(d) &gt; <span class="number">1</span> <span class="keyword">and</span> day_str <span class="keyword">in</span> d[<span class="number">0</span>].split()[<span class="number">0</span>]:</span><br><span class="line">                <span class="keyword">if</span> time_str &gt; d[<span class="number">0</span>].split(<span class="string">&#x27;:&#x27;</span>)[<span class="number">1</span>] <span class="keyword">and</span> time_str &lt; d[<span class="number">1</span>].split(<span class="string">&#x27;:&#x27;</span>)[<span class="number">1</span>]:</span><br><span class="line">                    <span class="keyword">return</span> <span class="literal">True</span></span><br><span class="line"></span><br><span class="line">        <span class="keyword">return</span> <span class="literal">False</span></span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">def</span> <span class="title">get_transactions</span>(<span class="params">self</span>):</span></span><br><span class="line">        <span class="keyword">pass</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># Main function</span></span><br><span class="line"><span class="keyword">if</span> __name__ == <span class="string">&#x27;__main__&#x27;</span>:</span><br><span class="line">    broker = InteractiveBrokerTradeAPI()</span><br><span class="line">    print(datetime.now().strftime(<span class="string">&#x27;Now is %Y-%m-%d&#x27;</span>))</span><br><span class="line">    <span class="keyword">with</span> broker.connect() <span class="keyword">as</span> c:</span><br><span 
class="line">        accounts, positions, orders = c.get_account_detail()</span><br><span class="line">        print(ib_insync.util.df(accounts))</span><br><span class="line">        print(ib_insync.util.df(positions))</span><br><span class="line">        print(ib_insync.util.df(orders))</span><br><span class="line">        print(<span class="string">&quot;=&quot;</span>*<span class="number">30</span>)</span><br><span class="line">        market_open = c.is_market_open()</span><br><span class="line">        market_open_now = c.is_market_open_now()</span><br><span class="line">        print(<span class="string">f&#x27;<span class="subst">&#123;market_open=&#125;</span>&#x27;</span>)</span><br><span class="line">        print(<span class="string">f&#x27;<span class="subst">&#123;market_open_now=&#125;</span>&#x27;</span>)</span><br><span class="line">        print(<span class="string">&quot;=&quot;</span>*<span class="number">30</span>)</span><br><span class="line">        print(c.get_last_price_from_quote(<span class="string">&#x27;SSO&#x27;</span>))</span><br><span class="line">        <span class="keyword">if</span> market_open <span class="keyword">and</span> market_open_now:</span><br><span class="line">            last = c.get_last_price_from_quote(<span class="string">&#x27;AAPL&#x27;</span>)</span><br><span class="line">            print(<span class="string">f&#x27;<span class="subst">&#123;last=&#125;</span>&#x27;</span>)</span><br><span class="line">            c.place_order(<span class="string">&#x27;AAPL&#x27;</span>, <span class="number">1</span>)</span><br></pre></td></tr></table></figure><p>And here’s the output.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span 
class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Output</span></span><br><span class="line">Now <span class="keyword">is</span> <span class="number">2022</span><span class="number">-12</span><span class="number">-14</span></span><br><span class="line">==============================</span><br><span class="line">Connection established</span><br><span class="line">==============================</span><br><span class="line">     account      cash  total_assets</span><br><span class="line"><span class="number">0</span>  DU4399668  <span class="number">2118.598</span>     <span class="number">99946.918</span></span><br><span class="line">  code     qty  cost_price  market_val   pl_val  pl_ratio</span><br><span class="line"><span class="number">0</span>  SHV   <span class="number">427.0</span>  <span class="number">109.987048</span>    <span class="number">46970.03</span>     <span class="number">5.56</span>  <span class="number">0.000118</span></span><br><span class="line"><span class="number">1</span>  SSO  <span class="number">1019.0</span>   <span class="number">48.717576</span>    <span class="number">50858.29</span>  <span class="number">1215.08</span>  <span class="number">0.024476</span></span><br><span class="line"><span class="literal">None</span></span><br><span class="line">==============================</span><br><span class="line">market_open=<span class="literal">True</span></span><br><span class="line">market_open_now=<span 
class="literal">True</span></span><br><span class="line">==============================</span><br><span class="line"><span class="number">49.94</span></span><br><span class="line">last=<span class="number">147.86</span></span><br><span class="line">Order [AAPL] status updated: PreSubmitted</span><br><span class="line">Others order status: PreSubmitted</span><br><span class="line">Order [AAPL] status updated: Filled</span><br><span class="line">Order AAPL, filled.</span><br><span class="line">==============================</span><br><span class="line">Connection closed</span><br><span class="line">==============================</span><br></pre></td></tr></table></figure><p>Voila! Now as long as we schedule the time for each function to run, we will have our automated trading script ready to run! It’s time for you to put on your creative hat and start improvising, adding your own magic to your trading script. See you next time.</p>]]></content>
    
    
    <summary type="html">&lt;img data-src=&quot;/2022/12/12/2022-12-16-IBKR-broker-2/cover.png&quot; class=&quot;&quot; width=&quot;800&quot;&gt;
&lt;p&gt;This is the second part of the &lt;a href=&quot;https://mikelhsia.github.io/2022/12/12/2022-12-10-IBKR-Broker/&quot;&gt;&lt;strong&gt;Set Up Trading API Template In Python&lt;/strong&gt;&lt;/a&gt;. We’re going to focus on implementing the rest of the functions in our Interactive Broker class.&lt;/p&gt;</summary>
    
    
    <category term="How2" scheme="http://mikelhsia.github.io/categories/How2/"/>
    
    <category term="Quantitative Trading" scheme="http://mikelhsia.github.io/categories/How2/Quantitative-Trading/"/>
    
    
    <category term="How2" scheme="http://mikelhsia.github.io/tags/How2/"/>
    
    <category term="Python3" scheme="http://mikelhsia.github.io/tags/Python3/"/>
    
    <category term="Interactive Broker" scheme="http://mikelhsia.github.io/tags/Interactive-Broker/"/>
    
  </entry>
  
  <entry>
    <title>【How 2】Set Up Trading API Template In Python - Connecting My Trading Strategies To Interactive Brokers</title>
    <link href="http://mikelhsia.github.io/2022/12/07/2022-12-10-IBKR-Broker/"/>
    <id>http://mikelhsia.github.io/2022/12/07/2022-12-10-IBKR-Broker/</id>
    <published>2022-12-07T06:58:04.000Z</published>
    <updated>2022-12-15T07:05:45.857Z</updated>
    
    <content type="html"><![CDATA[<img data-src="/2022/12/07/2022-12-10-IBKR-Broker/cover.png" class="" width="800"><p>Connecting a trading strategy to a broker through the broker’s proprietary API is always dreadful. There are tons of API documentation to read, tons of trial-and-error tests to conduct, and tons of unexplained errors and bugs that break your API tests. In this post, I’m going to demonstrate the MVP API template that gets my trading strategies working, so that you can build your own in a similar way.</p><a id="more"></a><hr><blockquote><p>If you enjoy reading this and my other articles, feel free to <a href='https://medium.com/@mikelhsia/membership'> join the Medium membership program</a> to read more about Quantitative Trading Strategy.</p></blockquote><hr><p><strong><em>Previous research</em></strong></p><ul><li><a href="https://mikelhsia.github.io/2022/11/04/2022-11-10-advanced-cppi-strategy/">【Momentum Trading】A Defense Trading Strategy That Works - CPPI (Constant Proportion Portfolio Insurance)</a></li></ul><p>After completing my research on the CPPI investment strategy, I needed to find a new broker and open a new account to start trading. <a href="https://www.interactivebrokers.ca/en/trading/lp-why-ibkr.php?wid=978897570"><strong>Interactive Brokers (IBKR)</strong></a> was the first broker I became familiar with as a student, so I kinda have this obsession with making my trades through it. Years later, I finally have the luxury to sit down and spend time reading their API documentation. I found that the <a href="https://www.interactivebrokers.com/en/trading/api-guides.php">IBKR API</a> doesn’t support retrieving historical transaction data. For example, the orders you placed three days ago and the commissions you paid for each trade are not available from the IBKR server through the API. However, I need this information to build my performance evaluation report. 
Therefore, I decided to put it on hold until a good enough solution came along. Now that my knowledge has grown, I figure it’s about time to tackle this task.</p><p><a href="https://www.interactivebrokers.ca/en/trading/lp-why-ibkr.php?wid=978897570"><strong>Interactive Brokers (IBKR)</strong></a> is a renowned investment broker that operates successfully across the world. It is also famous for its low trading fees in both the equity and the derivative markets. On the other hand, its proprietary API is notorious for being complicated to work with. I’m going to give you my two cents here, and hopefully it’ll help people who are considering Interactive Brokers as their broker. Here are the things I’m going to talk about:</p><ul><li>IBKR TWS and IB gateway</li><li>IB gateway configurations</li><li>My MVP broker API template</li><li>Introduce ib_insync package and start connecting</li><li>Implement our first IBKR call</li></ul><h1 id="IBKR-TWS-and-IB-gateway"><a href="#IBKR-TWS-and-IB-gateway" class="headerlink" title="IBKR TWS and IB gateway"></a>IBKR TWS and IB gateway</h1><p>Each broker provides a different way to connect to its API service. <a href="https://developer.tdameritrade.com/apis">TD Ameritrade API</a> allows you to use their API service remotely with an API token they provide. <a href="https://openapi.futunn.com/futu-api-doc/en/">Futu OpenAPI</a> requires you to download extra software on your local PC/laptop, so that your API calls reach their API service through this middleware. 
As for the IBKR API, it is similar to Futu OpenAPI in that all API calls go through its proprietary software to reach the API service.</p><img data-src="/2022/12/07/2022-12-10-IBKR-Broker/cover.png" class="" width="600"><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>How does your API call reach the IBKR API service</i></p><p>Interactive Brokers provides two software applications to help you connect to their API service:</p><p><strong>TWS</strong><br>TWS stands for Trader Workstation. TWS is designed for traders who would like to conduct research and trade equities and derivatives across many markets in one unified platform. Users can read the latest news, study company fundamentals and annual reports, research stock trends or patterns, and even place orders with it. It can also act as the intermediary between your desktop/laptop and the IBKR API service.<br><img data-src="/2022/12/07/2022-12-10-IBKR-Broker/tws.png" class="" width="600"></p><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>IBKR TWS Trading Station</i></p><p><strong>IB gateway</strong><br>Compared to TWS, IB gateway is simply an API gateway without all the user interface that you see in TWS. You can’t buy or sell or do anything else directly in the IB gateway window. It allows you to connect to the IBKR API service and nothing more. In general, it is a super lightweight TWS that consumes much less of your desktop/laptop memory and resources.<br><img data-src="/2022/12/07/2022-12-10-IBKR-Broker/ib_gateway.png" class="" width="600"></p><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>IBKR IB Gateway</i></p><p>These two software applications have similar configurations but keep separate settings that are not shared between them. 
Now let’s take a look at the configurations that concern us before we start programming our API connection to IBKR.</p><h1 id="IB-gateway-configurations"><a href="#IB-gateway-configurations" class="headerlink" title="IB gateway configurations"></a>IB gateway configurations</h1><p>Let’s use IB gateway as an example so that we don’t get distracted by the various features that TWS offers. First of all, we need to log in to the IB gateway software application. Since I don’t want to mess around with real money while testing my trading script, I use a paper account instead. <em>(Tip: You can reset your paper account on your account management page every day.)</em></p><img data-src="/2022/12/07/2022-12-10-IBKR-Broker/login.png" class="" width="600"><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Log in to your paper account</i></p><p>Here are a few configurations that you need to pay attention to before starting to code:</p><ol><li>In API -&gt; Settings, uncheck <em>“Read-Only API”</em>.</li><li>In API -&gt; Settings, note or configure the <em>“Socket port”</em> because you will need it when connecting to this software.</li><li>In API -&gt; Precautions, check the box <em>“Bypass Order Precautions for API Orders”</em> to prevent additional error or warning dialog boxes from popping up when you place orders through the API.</li><li>If you’re using TWS as your middleman service, you need to check one more box, <em>“Enable ActiveX and Socket Clients”</em>.</li></ol><img data-src="/2022/12/07/2022-12-10-IBKR-Broker/api_configuration.png" class="" width="600"><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>API configuration</i></p><img data-src="/2022/12/07/2022-12-10-IBKR-Broker/api_configuration2.png" class="" width="600"><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>API configuration 2</i></p><p>Now we’re all set. 
Let’s get down to business.</p><h1 id="My-MVP-broker-API-template"><a href="#My-MVP-broker-API-template" class="headerlink" title="My MVP broker API template"></a>My MVP broker API template</h1><p>A trading strategy needs a few actions in order to run basic buy/sell operations properly:</p><ol><li>Connect to and disconnect from the dedicated broker API service.</li><li>Get your basic account info in order to know the account status such as <code>Total asset value</code>, <code>Remaining cash balance</code>, <code>Purchasing power</code>, … and so on.</li><li>Check the trading calendar and trading hours to see whether the market is open for trading or if it’s a holiday today.</li><li>Check the current quote price of a specific symbol in order to know how many shares we would purchase.</li><li>Place orders through the broker API.</li><li>Get the transaction history for performance evaluation later on.</li></ol><p>I use the <a href="https://docs.python.org/3/library/abc.html">Python module <code>abc</code></a> (short for <strong>Abstract Base Classes</strong>) to build my base API template. The most obvious advantage of an abstract base class is that every class derived from it must implement the same methods, so you can easily add another broker class. For example, you have a class called <strong>InteractiveBrokerClass</strong> to make trades with Interactive Brokers, and another class called <strong>TDAmeritradeBrokerClass</strong> to make trades with <a href="https://www.tdameritrade.com/">TD Ameritrade</a>. Both broker classes do similar things and require similar functions. Implementing them on top of an abstract base class makes your life easier in terms of managing the actions in both derived classes. 
Here’s my broker base class:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># TradeAPI.py</span></span><br><span class="line"><span class="keyword">import</span> abc</span><br><span class="line"></span><br><span class="line"><span class="class"><span class="keyword">class</span> <span class="title">AbstractTradeInterface</span>(<span class="params">metaclass=abc.ABCMeta</span>):</span></span><br><span class="line"><span class="meta">  @abc.abstractmethod</span></span><br><span class="line">  <span class="function"><span class="keyword">def</span> <span class="title">connect</span>(<span class="params">self</span>):</span></span><br><span class="line">    <span class="keyword">pass</span></span><br><span class="line"></span><br><span class="line"><span class="meta">  @abc.abstractmethod</span></span><br><span class="line">  <span class="function"><span class="keyword">def</span> <span 
class="title">get_account_detail</span>(<span class="params">self</span>):</span></span><br><span class="line">    <span class="keyword">pass</span></span><br><span class="line"></span><br><span class="line"><span class="meta">  @abc.abstractmethod</span></span><br><span class="line">  <span class="function"><span class="keyword">def</span> <span class="title">get_last_price_from_quote</span>(<span class="params">self</span>):</span></span><br><span class="line">    <span class="keyword">pass</span></span><br><span class="line"></span><br><span class="line"><span class="meta">  @abc.abstractmethod</span></span><br><span class="line">  <span class="function"><span class="keyword">def</span> <span class="title">place_order</span>(<span class="params">self</span>):</span></span><br><span class="line">    <span class="keyword">pass</span></span><br><span class="line"></span><br><span class="line"><span class="meta">  @abc.abstractmethod</span></span><br><span class="line">  <span class="function"><span class="keyword">def</span> <span class="title">is_market_open</span>(<span class="params">self</span>):</span></span><br><span class="line">    <span class="keyword">pass</span></span><br><span class="line"></span><br><span class="line"><span class="meta">  @abc.abstractmethod</span></span><br><span class="line">  <span class="function"><span class="keyword">def</span> <span class="title">is_market_open_now</span>(<span class="params">self</span>):</span></span><br><span class="line">    <span class="keyword">pass</span></span><br><span class="line"></span><br><span class="line"><span class="meta">  @abc.abstractmethod</span></span><br><span class="line">  <span class="function"><span class="keyword">def</span> <span class="title">get_transactions</span>(<span class="params">self</span>):</span></span><br><span class="line">    <span class="keyword">pass</span></span><br></pre></td></tr></table></figure><p style="font-size: 0.8em; text-align:center; color: grey;">  
<i>Abstract Trading API Template</i></p><p>Implementing any of these methods can be straightforward or extremely complex, depending on how the proprietary broker API is designed. As I said earlier, the IBKR API doesn’t support querying historical transactions/trades, so we need to build alternative functions on the side to support <code>def get_transactions(self)</code> in our base template. I’m not going to dive into how to build them now; we will come back to this in another post. I will start by looking at how to connect to the IB gateway by implementing the <code>def connect()</code> function, and then we can check the quote price and place orders with API calls whenever we want to.</p><h1 id="Introduce-ib-insync-package-and-start-connecting"><a href="#Introduce-ib-insync-package-and-start-connecting" class="headerlink" title="Introduce ib_insync package and start connecting"></a>Introduce ib_insync package and start connecting</h1><p>Instead of using the native <code>ibapi</code> package to connect to the IBKR API service, I chose the <a href="https://pypi.org/project/ib-insync/"><code>ib_insync</code></a> package developed by <strong><em>Ewald R. de Wit</em></strong>. <code>ib_insync</code> not only simplifies connecting to and communicating with the IBKR API service, but also adds asynchronous capability so that less CPU time is wasted while requesting data from the server. Here is an introductory post to get you familiar with the functions provided in <code>ib_insync</code>: <a href="https://algotrading101.com/learn/ib_insync-interactive-brokers-api-guide/">ib_insync: Interactive Broker API guide</a>. 
From it, we learn that we can connect to the IBKR API service with the following code:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> ib_insync <span class="keyword">import</span> IB</span><br><span class="line"></span><br><span class="line">ib = IB()</span><br><span class="line">ib.connect(</span><br><span class="line">  host=<span class="string">&#x27;127.0.0.1&#x27;</span>, <span class="comment"># local host IP</span></span><br><span class="line">  port=<span class="number">4002</span>, <span class="comment"># The port that we configured in the IB gateway</span></span><br><span class="line">  clientId=<span class="number">1</span> <span class="comment"># The non-duplicated client ID for each connection</span></span><br><span class="line">)</span><br><span class="line">ib.disconnect() <span class="comment"># To disconnect from the server</span></span><br></pre></td></tr></table></figure><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>You need to connect to the server before making any request</i></p><p>The established connection <code>ib</code> is owned and maintained by whoever initiated it. To make the connection easier to handle and to close it reliably once we finish using it, I suggest using <a href="https://docs.python.org/3/library/contextlib.html"><code>contextmanager</code></a>, so that the context manager closes the connection for us. We don’t have to explicitly disconnect from the API service. 
Instead, the context manager handles it whenever the <code>with</code> block finishes.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># test.py</span></span><br><span class="line"><span class="meta">@contextmanager</span></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">connect</span>(<span class="params">self</span>):</span></span><br><span class="line">  self.client = ib_insync.IB()</span><br><span class="line">  self.client.connect(<span class="string">&#x27;127.0.0.1&#x27;</span>, <span class="number">4002</span>, <span class="number">300</span>)</span><br><span class="line">  <span class="keyword">try</span>:</span><br><span class="line">    <span class="keyword">yield</span> self <span class="comment"># Return the self instance</span></span><br><span class="line">  <span class="keyword">finally</span>:  <span class="comment"># runs even if the with block raises</span></span><br><span class="line">    self.client.disconnect()</span><br><span class="line">    self.client.sleep(<span class="number">2</span>)  <span class="comment"># make sure the connection is closed before next time you connect to IBKR API service</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> __name__ == <span class="string">&#x27;__main__&#x27;</span>:</span><br><span class="line">  <span class="keyword">with</span> broker.connect() <span class="keyword">as</span> c:</span><br><span class="line">    <span class="comment"># Make requests to the 
server</span></span><br><span class="line">  <span class="comment"># The connection is closed here, because we disconnect from the server after `yield self`</span></span><br></pre></td></tr></table></figure><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Connect function implemented with context manager</i></p><h1 id="Implement-our-first-IBKR-call"><a href="#Implement-our-first-IBKR-call" class="headerlink" title="Implement our first IBKR call"></a>Implement our first IBKR call</h1><p>In the last part of this post, let’s implement a <code>get_account_detail()</code> call in our broker class so that we can check the account status.</p><p>In the <a href="https://ib-insync.readthedocs.io/api.html">ib_insync documentation</a>, I found that <code>ib.managedAccounts()</code> retrieves a list of account names, and <code>ib.accountValues(account:str)</code> retrieves all the values for the given account. Hence, I’m going to:</p><ol><li>First, use <code>ib.managedAccounts()</code> to retrieve all the accounts created.</li><li>Use <code>ib.accountValues()</code> to get all the values related to each account.</li><li>Extract the <code>TotalCashBalance</code> and <code>StockMarketValue</code> rows denominated in USD, so that I can tell how much money I hold in cash as well as the total value of my account.</li></ol><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span 
class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># test.py</span></span><br><span class="line"><span class="keyword">from</span> contextlib <span class="keyword">import</span> contextmanager</span><br><span class="line"><span class="keyword">import</span> ib_insync</span><br><span class="line"><span class="keyword">from</span> zoneinfo <span class="keyword">import</span> ZoneInfo</span><br><span class="line"><span class="keyword">from</span> modules.broker.TradeAPI <span class="keyword">import</span> AbstractTradeInterface</span><br><span 
class="line"></span><br><span class="line"><span class="class"><span class="keyword">class</span> <span class="title">InteractiveBrokerTradeAPI</span>(<span class="params">AbstractTradeInterface</span>):</span></span><br><span class="line">    <span class="function"><span class="keyword">def</span> <span class="title">__init__</span>(<span class="params">self, currency=<span class="string">&#x27;USD&#x27;</span></span>):</span></span><br><span class="line">        self.client = <span class="literal">None</span></span><br><span class="line">        self.accounts = []</span><br><span class="line">        self.currency = currency</span><br><span class="line"></span><br><span class="line"><span class="meta">    @contextmanager</span></span><br><span class="line">    <span class="function"><span class="keyword">def</span> <span class="title">connect</span>(<span class="params">self</span>):</span></span><br><span class="line">        self.client = ib_insync.IB()</span><br><span class="line">        self.client.connect(<span class="string">&#x27;127.0.0.1&#x27;</span>, <span class="number">4002</span>, <span class="number">300</span>)</span><br><span class="line">        <span class="keyword">try</span>:</span><br><span class="line">            <span class="keyword">yield</span> self</span><br><span class="line">        <span class="keyword">finally</span>:  <span class="comment"># Runs even if the with block raises</span></span><br><span class="line">            self.client.disconnect()</span><br><span class="line">            self.client.sleep(<span class="number">2</span>)  <span class="comment"># make sure the connection is closed before the next time you connect to the IBKR API service</span></span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">def</span> <span class="title">get_account_detail</span>(<span class="params">self</span>):</span></span><br><span class="line">        self.accounts = self.client.managedAccounts()</span><br><span class="line"></span><br><span class="line">        acc_data = []</span><br><span class="line">        <span class="keyword">for</span> account 
<span class="keyword">in</span> self.accounts:</span><br><span class="line">            acc = &#123;&#125;</span><br><span class="line">            acc[<span class="string">&#x27;account&#x27;</span>] = account</span><br><span class="line">            data = self.client.accountValues(account)</span><br><span class="line">            acc[<span class="string">&#x27;cash&#x27;</span>] = <span class="number">0</span></span><br><span class="line">            acc[<span class="string">&#x27;total_assets&#x27;</span>] = <span class="number">0</span></span><br><span class="line">            <span class="keyword">for</span> row <span class="keyword">in</span> data:</span><br><span class="line">                <span class="keyword">if</span> row.tag <span class="keyword">in</span> [<span class="string">&#x27;TotalCashBalance&#x27;</span>] <span class="keyword">and</span> row.currency == self.currency:</span><br><span class="line">                    acc[<span class="string">&#x27;cash&#x27;</span>] = row.value</span><br><span class="line">                    acc[<span class="string">&#x27;total_assets&#x27;</span>] += float(row.value)</span><br><span class="line">                <span class="keyword">elif</span> row.tag <span class="keyword">in</span> [<span class="string">&#x27;StockMarketValue&#x27;</span>] <span class="keyword">and</span> row.currency == self.currency:</span><br><span class="line">                    acc[<span class="string">&#x27;total_assets&#x27;</span>] += float(row.value)</span><br><span class="line">            acc_data.append(acc)</span><br><span class="line">        <span class="keyword">return</span> acc_data</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">def</span> <span class="title">get_last_price_from_quote</span>(<span class="params">self</span>):</span></span><br><span class="line">        <span class="keyword">pass</span></span><br><span class="line"></span><br><span class="line"> 
   <span class="function"><span class="keyword">def</span> <span class="title">place_order</span>(<span class="params">self</span>):</span></span><br><span class="line">        <span class="keyword">pass</span></span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">def</span> <span class="title">is_market_open</span>(<span class="params">self</span>):</span></span><br><span class="line">        <span class="keyword">pass</span></span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">def</span> <span class="title">is_market_open_now</span>(<span class="params">self</span>):</span></span><br><span class="line">        <span class="keyword">pass</span></span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">def</span> <span class="title">get_transactions</span>(<span class="params">self</span>):</span></span><br><span class="line">        <span class="keyword">pass</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># Main function</span></span><br><span class="line"><span class="keyword">if</span> __name__ == <span class="string">&#x27;__main__&#x27;</span>:</span><br><span class="line">    broker = InteractiveBrokerTradeAPI()</span><br><span class="line">    <span class="keyword">with</span> broker.connect() <span class="keyword">as</span> c:</span><br><span class="line">        accounts = c.get_account_detail()</span><br><span class="line">        print(accounts)</span><br></pre></td></tr></table></figure><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Full code</i></p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Output</span></span><br><span class="line">&gt;&gt; [&#123;<span 
class="string">&#x27;account&#x27;</span>: <span class="string">&#x27;DU4399668&#x27;</span>, <span class="string">&#x27;cash&#x27;</span>: <span class="string">&#x27;77.44&#x27;</span>, <span class="string">&#x27;total_assets&#x27;</span>: <span class="number">96737.42</span>&#125;]</span><br></pre></td></tr></table></figure><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Account status output</i></p><p><em>Note: make sure you stub out the remaining unimplemented methods as I did, because the abstract base class requires every abstract method to be overridden. Otherwise, you will see the following error message when you run the test code.</em><br><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">Traceback (most recent call last):</span><br><span class="line">  File <span class="string">&quot;/Users/michael/quantitative-strategy/app/trading/test.py&quot;</span>, line <span class="number">86</span>, <span class="keyword">in</span> &lt;module&gt;</span><br><span class="line">    broker = InteractiveBrokerTradeAPI(version_param=SCRIPT_VERSION)</span><br><span class="line">TypeError: Can<span class="string">&#x27;t instantiate abstract class InteractiveBrokerTradeAPI with abstract methods get_last_price_from_quote, get_transactions, is_market_open, is_market_open_now, place_order</span></span><br></pre></td></tr></table></figure></p><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>You need to implement all abstract methods defined in the base class</i></p><h1 id="Conclusion"><a href="#Conclusion" class="headerlink" title="Conclusion"></a>Conclusion</h1><p>Done! We’re now officially connecting our trading script to the Interactive Brokers API service. 
To make sure this broker class can fully support the functionalities of your trading program, there are still more functions to be implemented. Don’t worry, I’ll see you next time.</p>]]></content>
    
    
    <summary type="html">&lt;img data-src=&quot;/2022/12/07/2022-12-10-IBKR-Broker/cover.png&quot; class=&quot;&quot; width=&quot;800&quot;&gt;
&lt;p&gt;Connecting your trading strategy to a broker through the broker’s proprietary API is always dreadful. There is tons of API documentation to read, tons of trial-and-error tests to conduct, and tons of unknown causes and bugs that fail your API tests. In this post, I’m going to demonstrate the MVP API template that got my trading strategies to work, so that you can build your own and get your strategies working as well.&lt;/p&gt;</summary>
    
    
    <category term="How2" scheme="http://mikelhsia.github.io/categories/How2/"/>
    
    <category term="Quantitative Trading" scheme="http://mikelhsia.github.io/categories/How2/Quantitative-Trading/"/>
    
    
    <category term="How2" scheme="http://mikelhsia.github.io/tags/How2/"/>
    
    <category term="Python3" scheme="http://mikelhsia.github.io/tags/Python3/"/>
    
    <category term="Interactive Broker" scheme="http://mikelhsia.github.io/tags/Interactive-Broker/"/>
    
  </entry>
  
  <entry>
    <title>【Momentum Trading】A Defense Trading Strategy That Works - CPPI (Constant Proportion Portfolio Insurance)</title>
    <link href="http://mikelhsia.github.io/2022/11/04/2022-11-10-advanced-cppi-strategy/"/>
    <id>http://mikelhsia.github.io/2022/11/04/2022-11-10-advanced-cppi-strategy/</id>
    <published>2022-11-03T17:04:25.000Z</published>
    <updated>2024-04-23T07:53:51.144Z</updated>
    
    <content type="html"><![CDATA[<img data-src="/2022/11/04/2022-11-10-advanced-cppi-strategy/cover.jpg" class="" width="600"><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Photo by <a href='https://medium.com/r?url=https%3A%2F%2Funsplash.com%2F%40sasun1990%3Futm_source%3Dmedium%26utm_medium%3Dreferral'>Sasun Bughdaryan</a> on <a href='https://unsplash.com/?utm_source=medium&utm_medium=referral'>Unsplash</a></i></p><p>We’ve talked a lot about the offensive side of quantitative trading, such as momentum, mean reversion, and ML. These strategies aim to outperform the benchmark/index by adding your personal points of view to the trading strategies. Beating the benchmark becomes the only goal when playing offense. What about defense? After reading <a href="https://quantpedia.com/introduction-to-cppi-constant-proportion-portfolio-insurance/">Introduction to CPPI – Constant Proportion Portfolio Insurance</a>, I started to feel that I can’t agree more with the idea that “The best defense is a good offense”, a saying often attributed to Sun Tzu, the Chinese general, strategist, and philosopher. What does defense mean in the field of quantitative trading? Does defense mean we strive only to avoid losing money, with nothing else worth doing? Maybe talking about the CPPI strategy will give us a better picture of what defense actually means to traders. 
Let’s now have a look at how to approach the other side of trading.</p><a id="more"></a><hr><blockquote><p>If you enjoy reading this and my other articles, feel free to <a href='https://medium.com/@mikelhsia/membership'> join the Medium membership program</a> to read more about Quantitative Trading Strategy.</p></blockquote><hr><h1 id="What-does-it-mean-by-playing-defense-in-trading"><a href="#What-does-it-mean-by-playing-defense-in-trading" class="headerlink" title="What does it mean by playing defense in trading?"></a>What does it mean by playing defense in trading?</h1><p>Quantitative strategies all come down to this extended CAPM formula, which we can use to categorize different trading strategies. $\beta$ in this formula stands for the sensitivity of your portfolio to the movement of the market return, and $\alpha$ stands for the excess return that the $\beta$ term can’t capture. For example, if the current risk-free rate is 1% and the market return is 4% with an $\alpha$ of 0, then your expected return would be $2*(4\%-1\%) + 1\% = 7\%$ if $\beta$ equals 2, or 10% if $\beta$ equals 3.</p><script type="math/tex; mode=display">Return_{Expected} = Return_{riskfree} + \beta \times (Return_{market} - Return_{riskfree}) + \alpha</script><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Advanced CAPM model by <a href='https://www.investopedia.com/terms/j/jensensmeasure.asp'>Jensen</a></i></p><p>The $\beta$ here is essentially the <strong>exposure</strong> of our portfolio to market fluctuation. The idea of playing defense in trading is to reduce that exposure so that you lose less money than other people in a bear market. There are several ways to reduce the market exposure, and hence the $\beta$, of your portfolio, such as market-neutral strategies, portfolio diversification, and hedging. 
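</p><p>To see how $\beta$ scales the outcome in both directions, here is a minimal Python sketch that plugs numbers into the formula above. The up-market case reuses the worked example (1% risk-free rate, 4% market return, zero $\alpha$); the -10% market return is a made-up bear-market figure, and the function name <code>expected_return</code> is only for illustration.</p>

```python
def expected_return(risk_free, market, beta, alpha=0.0):
    """Expected return from the CAPM-with-alpha formula shown above."""
    return risk_free + beta * (market - risk_free) + alpha


# Up market: the worked example from the text (1% risk-free, 4% market return).
# Down market: a hypothetical year in which the market returns -10%.
for beta in (0.5, 1.0, 2.0, 3.0):
    up = expected_return(0.01, 0.04, beta)
    down = expected_return(0.01, -0.10, beta)
    print(f"beta={beta}: up market {up:.2%}, down market {down:.2%}")
```

<p>With these assumed inputs, a larger $\beta$ amplifies the gain in the up market and the loss in the down market.</p><p>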
However, a smaller $\beta$ also reduces your profit when the overall market goes up. This is the trade-off of playing defense in trading.</p><h1 id="What-is-CPPI-Constant-Proportion-Portfolio-Insurance-strategy"><a href="#What-is-CPPI-Constant-Proportion-Portfolio-Insurance-strategy" class="headerlink" title="What is CPPI (Constant Proportion Portfolio Insurance) strategy?"></a>What is CPPI (Constant Proportion Portfolio Insurance) strategy?</h1><p>The CPPI (Constant Proportion Portfolio Insurance) strategy reduces risk exposure by adding a risk-free asset, such as a 3-month Treasury bill, to your portfolio, which can be considered a type of <em>portfolio diversification</em>. Short-term Treasuries usually have very little exposure to the market. Below you can see the close price movement of SHV, SHY, TLT, and IEF compared to that of SPY.</p><img data-src="/2022/11/04/2022-11-10-advanced-cppi-strategy/riskfree.png" class="" width="600"><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Which asset is worth picking as the risk-free asset? SHY and SHV are short-term Treasury ETFs, while TLT and IEF are longer-term Treasury ETFs</i></p><p>On top of adding a risk-free asset to the portfolio, the CPPI strategy introduces two concepts named <strong><em>Floor</em></strong> and <strong><em>Cushion</em></strong>. The <strong><em>Floor</em></strong> is the minimum asset value that you want to protect from loss, and the <strong><em>Cushion</em></strong> is the asset value that you are willing to invest in riskier assets to gain additional return. The CPPI strategy lets investors keep the upside potential while limiting the downside risk by dynamically scaling the ratio between the floor and the cushion. I’m listing the related formulas below for reference. 
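</p><p>As a rough illustration of how the floor and cushion drive the allocation, here is a minimal Python sketch of the textbook CPPI rule, where the cushion is the portfolio value above the floor and the risky allocation is a multiplier times that cushion. The function name, the no-leverage cap, and the sample numbers (a \$100,000 portfolio, an \$80,000 floor, a multiplier of 3) are assumptions for illustration, not the backtest code used in this post.</p>

```python
def cppi_allocation(portfolio_value, floor, multiplier=3.0):
    """Split a portfolio into (risky, risk_free) amounts using the textbook CPPI rule."""
    cushion = max(portfolio_value - floor, 0.0)         # value above the protected floor
    risky = min(multiplier * cushion, portfolio_value)  # assumption: no leverage allowed
    return risky, portfolio_value - risky


# Portfolio above the floor: part of it goes into the risky asset.
print(cppi_allocation(100_000, 80_000))
# Portfolio below the floor: the cushion is zero, so everything sits in the risk-free asset.
print(cppi_allocation(70_000, 80_000))
```

<p>The second call shows the edge case that matters later in this post: once the portfolio value falls below the floor, this rule leaves nothing in the risky asset.</p><p>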
For more details, take a look at the <a href="https://quantpedia.com/introduction-to-cppi-constant-proportion-portfolio-insurance/">article</a> on <a href="https://quantpedia.com">QuantPedia</a>.</p><img data-src="/2022/11/04/2022-11-10-advanced-cppi-strategy/intro-to-CPPI.png" class="" width="600"><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Extracted from <a href='https://quantpedia.com/introduction-to-cppi-constant-proportion-portfolio-insurance/'>Introduction to CPPI – Constant Proportion Portfolio Insurance</a></i></p><h1 id="Disadvantages-when-applying-CPPI-strategies"><a href="#Disadvantages-when-applying-CPPI-strategies" class="headerlink" title="Disadvantages when applying CPPI strategies"></a>Disadvantages when applying CPPI strategies</h1><p>After running a series of backtests on the strategies proposed in the <a href="https://quantpedia.com/introduction-to-cppi-constant-proportion-portfolio-insurance/">article</a>, I found them less promising and less satisfying than I originally expected. Here are a few things I discovered:</p><h2 id="Floor-level-is-fixed-even-if-the-total-asset-value-go-rocket-high"><a href="#Floor-level-is-fixed-even-if-the-total-asset-value-go-rocket-high" class="headerlink" title="Floor level is fixed even if the total asset value go rocket high"></a>Floor level is fixed even if the total asset value goes sky-high</h2><ul><li>In the <strong>Basic CPPI</strong> strategy: I set a fixed percentage (80%) of the initial asset value as the amount to protect from the beginning. For example, if I start with \$100,000 and want to protect 80% of the fund, my floor level is \$80,000.</li><li>Even though the minimum asset value is protected, the floor value stays the same throughout the entire backtesting period. 
This also means that a big chunk of the assets is unprotected once the portfolio has grown substantially. That’s why we suffer a huge drop at the beginning of 2022.</li></ul><img data-src="/2022/11/04/2022-11-10-advanced-cppi-strategy/basic_cppi.png" class="" width="600"><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Basic CPPI: Max Drawdown: 59.3%, Annual Variance: 0.069</i></p><h2 id="The-value-floor-trench"><a href="#The-value-floor-trench" class="headerlink" title="The value floor trench"></a>The value floor trench</h2><ul><li>In the <strong>New High CPPI</strong> strategy: the <em>New High CPPI strategy</em> raises the floor whenever the total portfolio value reaches a new high. Therefore, the growing floor helps us protect more value as the value of our portfolio grows.</li><li>There’s one scenario that could make this strategy less effective. When your portfolio value grows substantially, your floor value rises along the way and can reach a very high number. If the market crashes one day and the portfolio value drops well below the floor value, you’ll need to invest 100% of your assets in bonds/treasuries by the definition of the New High CPPI strategy. It would then take months or even years for the portfolio to recover to above the floor level. 
As the backtesting results below show, we miss the fantastic opportunity to grow our portfolio back to where it was and <em>fall into a floor trench</em>.</li></ul><img data-src="/2022/11/04/2022-11-10-advanced-cppi-strategy/newhigh_cppi.png" class="" width="600"><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>New High CPPI ex1: Invest nearly 100% in bonds/treasuries after 2020</i></p><img data-src="/2022/11/04/2022-11-10-advanced-cppi-strategy/newhigh_cppi_2.png" class="" width="600"><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>New High CPPI ex2: Investment loss will take a long time to recover back to floor level</i></p><h1 id="Let’s-add-some-spices-to-make-the-strategy-promising"><a href="#Let’s-add-some-spices-to-make-the-strategy-promising" class="headerlink" title="Let’s add some spices to make the strategy promising"></a>Let’s add some spices to make the strategy promising</h1><p>Now that we know the basics and the pros and cons of the CPPI strategy, let’s add some spice to make it more appealing, so that traders like you will be more willing to invest.</p><h2 id="Change-the-meat"><a href="#Change-the-meat" class="headerlink" title="Change the meat"></a>Change the meat</h2><p>Since the CPPI strategy can preserve the upside potential and limit the downside loss to a certain degree, why don’t we replace the SPY ETF with a 2x SPY ETF to gain more upside? 
Therefore,</p><ul><li>Risky asset<ul><li>We use <code>SSO</code>, which is a <code>2x SPY</code> ETF, so that we can exploit its higher growth potential.</li></ul></li><li>Risk-free asset<ul><li>We use <code>SHV</code> as it seems to be the most stable among the four risk-free assets we mentioned earlier in this post.</li></ul></li><li>Benchmark<ul><li>We still use <code>SPY</code> as our benchmark.</li></ul></li></ul><h2 id="Add-seasoning"><a href="#Add-seasoning" class="headerlink" title="Add seasoning"></a>Add seasoning</h2><p>As proposed in the article <a href="https://quantpedia.com/introduction-to-cppi-constant-proportion-portfolio-insurance/">Introduction to CPPI – Constant Proportion Portfolio Insurance</a>, we can apply the dynamic multiplier method, which uses different multipliers depending on the market volatility instead of one fixed number throughout the backtesting period.</p><ul><li>Fixed multiplier<ul><li>We make the multiplier equal to 3 at all times.</li></ul></li><li>Dynamic multiplier<ul><li>We set a few indicators to categorize the regimes of current market volatility<ul><li>$EMA_{21d}$: Exponential moving average of the close price over the past 21 days</li><li>$SMA_{63d}$: Simple moving average of the close price over the past 63 days</li><li>$\overline{EMA_{21d}}$: 126-day average of $EMA_{21d}$</li><li>$\overline{SMA_{63d}}$: 126-day average of $SMA_{63d}$<script type="math/tex; mode=display">\text{Dynamic multiplier} = \begin{cases} 4 & \text{if } EMA_{21d} > \overline{EMA_{21d}} \text{ and } SMA_{63d} > \overline{SMA_{63d}} \\ 2 & \text{if } EMA_{21d} < \overline{EMA_{21d}} \text{ and } SMA_{63d} < \overline{SMA_{63d}} \\ 3 & \text{otherwise} \end{cases}</script></li></ul></li></ul></li></ul><h2 id="Additional-flavor-Smart-Floor"><a href="#Additional-flavor-Smart-Floor" class="headerlink" title="Additional flavor: Smart Floor"></a>Additional flavor: Smart Floor</h2><p>An idea struck me when I was working on 
backtesting the scenarios from the article: what if we could build a somewhat flexible/intelligent mechanism that adjusts the floor level based on historic values? The whole process of calculating the floor value reminded me of using <strong>Gradient Descent</strong> to approach a local optimum. If you are not familiar with the idea of <strong>Gradient Descent</strong>, check out this video; it will give you a general idea of what it is and what it is for.</p><p><iframe src="//player.bilibili.com/player.html?aid=678849381&bvid=BV1Mm4y1Z7C4&cid=501387405&page=1" scrolling="no" border="0" frameborder="no" framespacing="0" allowfullscreen="true"> </iframe></p><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Gradient Descent, Step-by-Step by <a href='https://www.youtube.com/c/joshstarmer'>StatQuest</a></i></p><p>Borrowing the main idea from Gradient Descent, we can update our floor value based on the difference between the current total asset value and the total asset value from the previous period. We multiply this difference by the learning rate $\alpha$ and add it to the previous total asset value. The base value used for calculating the floor level is then one step closer to the current total asset value. 
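</p><p>In code, the update described above might look like the following minimal Python sketch. The function name <code>smart_floor</code>, the learning rate of 0.5, and the sample asset values are hypothetical; the 80% floor percentage is the one used for the basic floor earlier in this post.</p>

```python
def smart_floor(previous_av, current_av, learning_rate=0.5, floor_pct=0.8):
    """Return (updated_base_value, smart_floor) for one update step."""
    diff = current_av - previous_av                   # change since the previous period
    base_value = previous_av + diff * learning_rate   # move part of the way toward the current AV
    return base_value, base_value * floor_pct


# Hypothetical numbers: the portfolio grew from 100,000 to 110,000 over one period.
base, floor = smart_floor(100_000, 110_000)
print(f"base value: {base:,.0f}, smart floor: {floor:,.0f}")
```

<p>A learning rate of 1 would make the base value jump straight to the current asset value, while a learning rate of 0 would leave it unchanged, so the rate controls how quickly the floor follows the portfolio.</p><p>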
Here’s what the formula looks like:</p><script type="math/tex; mode=display">\begin{align}\text{diff} &= \text{Current AV} - \text{Previous AV}\\\text{Updated Base Value} &= \text{Previous AV} + \text{diff} \times \alpha\\\text{Smart Floor} &= \text{Updated Base Value} \times \text{Floor Percentage}\end{align}</script><script type="math/tex; mode=display">\text{where}</script><script type="math/tex; mode=display">\text{AV} = \text{Asset Value}</script><script type="math/tex; mode=display">\alpha = \text{Learning Rate}</script><p>To mitigate the impact of trading fees, you can also add a buffer when updating the <em>Base Value</em> so that it isn’t updated too frequently, which would incur unnecessary trading fees.</p><h1 id="Backtesting-and-result"><a href="#Backtesting-and-result" class="headerlink" title="Backtesting and result"></a>Backtesting and result</h1><p>Now we have three ways to update our floor value: <code>basic floor</code>, <code>new high floor</code>, and <code>smart floor</code>, and two ways to decide our multiplier: <code>fixed multiplier</code> and <code>dynamic multiplier</code>. 
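</p><p>Before running anything, here is a small Python sketch of the dynamic multiplier rule from the “Add seasoning” section, together with the grid of scenarios these choices produce. The function and variable names are made up for illustration, the indicator values are assumed to be computed elsewhere, and the scenario labels follow the list used later in this post.</p>

```python
from itertools import product


def dynamic_multiplier(ema_21d, ema_21d_avg, sma_63d, sma_63d_avg):
    """Pick the CPPI multiplier from the two indicators and their 126-day averages."""
    if ema_21d > ema_21d_avg and sma_63d > sma_63d_avg:
        return 4  # both indicators above their averages
    if ema_21d_avg > ema_21d and sma_63d_avg > sma_63d:
        return 2  # both indicators below their averages
    return 3      # mixed signals or ties: same value as the fixed multiplier


floors = ["Basic floor", "Newhigh floor", "Learning floor"]
multipliers = ["Fixed Multiplier", "Dynamic Multiplier"]

# Two buy-and-hold baselines plus every floor/multiplier combination on SSO.
scenarios = ["SPY, Buy-and-hold", "SSO, Buy-and-hold"]
scenarios += [f"SSO, {floor}, {mult}" for floor, mult in product(floors, multipliers)]

for name in scenarios:
    print(name)
```

<p>Three floor rules times two multiplier rules give six CPPI scenarios, and the two buy-and-hold baselines bring the total to eight.</p><p>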
Including the <code>buy and hold</code> strategy as the benchmark, let’s mash up these conditions and backtest each of the scenarios.</p><h2 id="Platform"><a href="#Platform" class="headerlink" title="Platform"></a>Platform</h2><p><a href="https://www.quantconnect.com/">QuantConnect</a></p><h2 id="Universe"><a href="#Universe" class="headerlink" title="Universe"></a>Universe</h2><ul><li><code>SPY</code> as the portfolio benchmark</li><li><code>SHV</code> as the risk-free asset</li><li><code>SSO</code> as the risky asset</li></ul><h2 id="Rebalancing-strategy"><a href="#Rebalancing-strategy" class="headerlink" title="Rebalancing strategy"></a>Rebalancing strategy</h2><p>We update the CPPI and floor value every week, and then we adjust the proportion of risky and risk-free assets accordingly.</p><h2 id="Backtest-time-frame"><a href="#Backtest-time-frame" class="headerlink" title="Backtest time frame"></a>Backtest time frame</h2><p>We’re going to test three time periods:</p><ol><li>Full period: <em>2010/01/01 - 2022/11/01</em><ul><li>To backtest the complete period and evaluate the overall performance</li></ul></li><li>Bear market period: <em>2007/01/01 - 2012/01/01</em><ul><li>To backtest the bear market scenario and evaluate accordingly</li></ul></li><li>Bull market period: <em>2015/01/01 - 2019/01/01</em><ul><li>To backtest the slow bull market and evaluate its performance</li></ul></li></ol><img data-src="/2022/11/04/2022-11-10-advanced-cppi-strategy/backtest_period.png" class="" width="600"><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Benchmark backtest full period: 2010/01/01 - 2022/11/01</i></p><h2 id="Execution-and-backtest"><a href="#Execution-and-backtest" class="headerlink" title="Execution and backtest"></a>Execution and backtest</h2><p>Here we go!</p><h3 id="Scenarios"><a href="#Scenarios" class="headerlink" title="Scenarios"></a>Scenarios</h3><p>We’re going to run the following backtest 
scenarios against three time periods:</p><ul><li>SPY, Buy-and-hold</li><li>SSO, Buy-and-hold</li><li>SSO, Basic floor, Fixed Multiplier</li><li>SSO, Basic floor, Dynamic Multiplier</li><li>SSO, Newhigh floor, Fixed Multiplier</li><li>SSO, Newhigh floor, Dynamic Multiplier</li><li>SSO, Learning floor, Fixed Multiplier</li><li>SSO, Learning floor, Dynamic Multiplier</li></ul><h3 id="Backtesting-results"><a href="#Backtesting-results" class="headerlink" title="Backtesting results"></a>Backtesting results</h3><h4 id="Full-period"><a href="#Full-period" class="headerlink" title="Full period"></a>Full period</h4><img data-src="/2022/11/04/2022-11-10-advanced-cppi-strategy/fulltime_backtest.png" class="" width="600"><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Backtest full period: 2010/01/01 - 2022/11/01</i></p><p>The scenarios on the right-hand side have the highest max drawdown among all scenarios, at 59.3%. We can see that the scenarios <code>SSO, Learning floor, Fixed Multiplier</code> and <code>SSO, Learning floor, Dynamic Multiplier</code> both have annual returns similar to the <code>SPY benchmark</code> scenario, but with lower (better) MDD and variance. At first glance, the <code>Smart (learning) floor</code> seems well able to keep the upward potential and limit the downward risks. 
Now let’s look at the other two periods.</p><img data-src="/2022/11/04/2022-11-10-advanced-cppi-strategy/fulltime_chart.png" class="" width="600"><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Scatter chart of full period: 2010/01/01 - 2022/11/01</i></p><h4 id="Bear-market-period"><a href="#Bear-market-period" class="headerlink" title="Bear market period"></a>Bear market period</h4><img data-src="/2022/11/04/2022-11-10-advanced-cppi-strategy/bear_backtest.png" class="" width="600"><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Backtest bear period: 2007/01/01 - 2012/01/01</i></p><p>The scenarios <code>SSO, Learning floor, Fixed Multiplier</code> and <code>SSO, Learning floor, Dynamic Multiplier</code> clearly have the highest returns among all the backtests, including the benchmark buy-and-hold backtest. The Sharpe ratio, MDD, and annual variance of these two scenarios are also better than those of the benchmark scenario. It seems that <code>CPPI with learning floor</code> does have the capability to limit the downward loss.</p><img data-src="/2022/11/04/2022-11-10-advanced-cppi-strategy/bear_chart.png" class="" width="600"><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Scatter chart of bear period: 2007/01/01 - 2012/01/01</i></p><h4 id="Bull-market-period"><a href="#Bull-market-period" class="headerlink" title="Bull market period"></a>Bull market period</h4><img data-src="/2022/11/04/2022-11-10-advanced-cppi-strategy/bull_backtest.png" class="" width="600"><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Backtest bull period: 2015/01/01 - 2019/01/01</i></p><p>The distribution of the backtests looks similar to the full-period chart. The three right-most scenarios <code>SSO-buy-and-hold</code>, <code>SSO-basic-fixed</code>, and <code>SSO-basic-dynamic</code> have the highest MDD, which is the outcome that we’ve been trying to avoid from the beginning. 
Meanwhile, the scenarios <code>SSO, Learning floor, Fixed Multiplier</code> and <code>SSO, Learning floor, Dynamic Multiplier</code> still outperform the other scenarios in terms of overall returns and MDD.</p><img data-src="/2022/11/04/2022-11-10-advanced-cppi-strategy/bull_chart.png" class="" width="600"><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Scatter chart of bull period: 2015/01/01 - 2019/01/01</i></p><h1 id="Conclusion"><a href="#Conclusion" class="headerlink" title="Conclusion"></a>Conclusion</h1><p>Here’s the report card of my winning candidate strategy, <code>SSO, learning floor, fixed multiplier</code>:<br><img data-src="/2022/11/04/2022-11-10-advanced-cppi-strategy/my_winning_candidate.png" class="" width="600"></p><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>SSO; Smart Floor; Fixed multiplier</i></p><p>It seems that all the scenarios are able to limit their downside risk because we added risk-free assets to our portfolio, achieving the goal of diversifying the risk. Even though the returns are diluted, some of the scenarios still show promising outcomes by keeping a proportional share of the profit on the books (especially the scenarios using <code>Smart Floor</code>). 
As for the choice between a dynamic multiplier and a fixed multiplier, the impact is not yet significant; more research is needed to decide statistically which method works better.</p><hr><h1 id="Misc"><a href="#Misc" class="headerlink" title="Misc"></a>Misc</h1><h2 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h2><ul><li><a href="https://quantpedia.com/introduction-to-cppi-constant-proportion-portfolio-insurance/">Introduction to CPPI – Constant Proportion Portfolio Insurance</a></li><li><a href="https://wiki.mbalib.com/wiki/%E5%9B%BA%E5%AE%9A%E6%AF%94%E4%BE%8B%E6%8A%95%E8%B5%84%E7%BB%84%E5%90%88%E4%BF%9D%E9%99%A9%E7%AD%96%E7%95%A5">MBA Lib: Constant Proportion Portfolio Insurance Strategy</a></li><li><a href="http://www.scienpress.com/Upload/JFIA/Vol%207_3_2.pdf">Portfolio insurance strategies in a low interest rate environment: A simulation based study</a></li><li><a href="https://medium.com/swlh/protect-your-portfolio-using-cppi-strategy-in-python-c3184c2b6767">If you can’t beat the market at least you can protect from it using Python</a></li></ul><h2 id="Backtest-code"><a href="#Backtest-code" class="headerlink" title="Backtest code"></a>Backtest code</h2><script src='https://www.quantconnect.com/terminal/backtest.js?sid=b67dea595ad0a269ac788f35379b8971'></script>]]></content>
    
    
    <summary type="html">&lt;img data-src=&quot;/2022/11/04/2022-11-10-advanced-cppi-strategy/cover.jpg&quot; class=&quot;&quot; width=&quot;600&quot;&gt;
&lt;p style=&quot;font-size: 0.8em; text-align:center; color: grey;&quot;&gt;
  &lt;i&gt;Photo by &lt;a href=&#39;https://medium.com/r?url=https%3A%2F%2Funsplash.com%2F%40sasun1990%3Futm_source%3Dmedium%26utm_medium%3Dreferral&#39;&gt;Sasun Bughdaryan&lt;/a&gt; on &lt;a href=&#39;https://unsplash.com/?utm_source=medium&amp;utm_medium=referral&#39;&gt;Unsplash&lt;/a&gt;&lt;/i&gt;
&lt;/p&gt;

&lt;p&gt;We’ve been talking a lot about the offensive side of quantitative trading, such as momentum, mean reversion, and ML. These strategies aim to outperform the benchmark/index by adding your personal points of view to the trading strategies. Beating the benchmark becomes the only goal when playing offense. What about defense? After reading &lt;a href=&quot;https://quantpedia.com/introduction-to-cppi-constant-proportion-portfolio-insurance/&quot;&gt;Introduction to CPPI – Constant Proportion Portfolio Insurance&lt;/a&gt;, I started to feel that I couldn’t agree more with the idea that “The best defense is a good offense”, a saying attributed to Sun Tzu, a Chinese military general, strategist, and philosopher. What does defense mean in the field of quantitative trading? Does defense mean that we strive not to lose money and nothing else? Maybe talking about the CPPI strategy will give us a better picture of what defense actually means to traders. Let’s now have a look at how to approach the other side of trading.&lt;/p&gt;</summary>
    
    
    <category term="Quantitative Trading" scheme="http://mikelhsia.github.io/categories/Quantitative-Trading/"/>
    
    
    <category term="Strategy" scheme="http://mikelhsia.github.io/tags/Strategy/"/>
    
    <category term="Momentum" scheme="http://mikelhsia.github.io/tags/Momentum/"/>
    
    <category term="Backtesting" scheme="http://mikelhsia.github.io/tags/Backtesting/"/>
    
    <category term="Fundamental Analysis" scheme="http://mikelhsia.github.io/tags/Fundamental-Analysis/"/>
    
  </entry>
  
  <entry>
    <title>【Momentum Trading】Use machine learning to boost your day trading skill - meta-labeling</title>
    <link href="http://mikelhsia.github.io/2022/10/21/2022-10-15-meta-label/"/>
    <id>http://mikelhsia.github.io/2022/10/21/2022-10-15-meta-label/</id>
    <published>2022-10-21T06:59:59.000Z</published>
    <updated>2022-10-22T17:20:35.140Z</updated>
    
    <content type="html"><![CDATA[<img data-src="/2022/10/21/2022-10-15-meta-label/cover.jpeg" class="" width="600"><p>The Triple barrier method and the meta-labeling technique were introduced together in the book <strong><em>Advances in Financial Machine Learning</em></strong> by <em>Marcos Lopez De Prado</em>. It seems that these two tools make a great pair to either stabilize or further increase your portfolio growth. In this post, I’m going to reuse my earlier research result (<a href="https://mikelhsia.github.io/2021/11/03/2021-11-06-rsi-indicator/">here</a>) as the baseline strategy, and apply these two techniques to see what benefit they could bring to it.</p><a id="more"></a><p><strong><em>Previous research</em></strong></p><ul><li><a href="https://mikelhsia.github.io/2022/03/18/2022-03-22-supertrend-indicator/">【Momentum Trading】Yes or No? Adopting the Supertrend indicator in your trading strategies?</a></li><li><a href="https://mikelhsia.github.io/2021/11/03/2021-11-06-rsi-indicator/">【Momentum Trading】Four strategies of using RSI indicator to better time your market entry</a></li><li><a href="https://mikelhsia.github.io/2021/07/19/2021-07-20-advanced-macd-strategy/">【Momentum Trading】Optimize your MACD strategies with advanced indicators</a></li></ul><h1 id="Motivation"><a href="#Motivation" class="headerlink" title="Motivation"></a>Motivation</h1><p>After researching the combination of several technical indicators as buy-in signals, I feel the research framework is missing a robust method to mitigate the subjective impact of the technical indicators. Theoretically speaking, stock prices and other statistics only reflect the current state of the market, so using this information to predict future stock price movements would be irrational. 
However, from the behavioral finance point of view, historical stock prices and statistics can be used to summarize how investors typically act when certain critical points are reached. That’s where the momentum trading strategy begins to thrive. Traders/Investors combine several effective indicators and define a fixed or dynamic threshold to find the group of stocks that possess momentum (uptrend or downtrend). The problem then comes back to how we define the threshold in a more objective way.</p><p>In the book <strong><em>Advances in Financial Machine Learning</em></strong> by <em>Marcos Lopez De Prado</em>, the Triple barrier method <em>(Chapter 3.4)</em> and the meta-labeling technique <em>(Chapter 3.6)</em> were introduced with his quote <em>“In that case, meta-labeling will help us figure out when we should pursue or dismiss a discretionary PM’s call” (Page 54).</em> These two tools can be adopted to leverage the power of machine learning and mitigate the subjectivity in a technical-indicator-oriented momentum strategy.</p><p>As usual, I’m not going to introduce these two ideas from ground zero. The article <a href="http://www.sefidian.com/2021/06/26/labeling-financial-data-for-machine-learning/"><em>What is Triple Barrier Method(TBM) and Meta-labeling</em></a> breaks down the definitions of these two terms and attaches code snippets for easier comprehension. A brief summary of both follows.</p><h2 id="The-definition-of-the-Triple-Barrier-Method-TBM"><a href="#The-definition-of-the-Triple-Barrier-Method-TBM" class="headerlink" title="The definition of the Triple Barrier Method (TBM)"></a>The definition of the Triple Barrier Method (TBM)</h2><p>TBM adopts two horizontal lines and one vertical line to form a box, and the label is decided by which side of the box the stock price touches first. 
Three potential scenarios can be produced:</p><ul><li>If the upper barrier (profit-take) is hit first, label “buy” or “1”.</li><li>If the lower barrier (stop-loss) is hit first, label “sell” or “-1”.</li><li>If the vertical barrier (expiration) is hit first, label “return in this period” or “0”.</li></ul><img data-src="/2022/10/21/2022-10-15-meta-label/triple_barrier.png" class="" width="600"><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Triple Barrier Method Scenario 3 - hitting the vertical barrier</i></p><img data-src="/2022/10/21/2022-10-15-meta-label/triple_barrier2.png" class="" width="600"><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Triple Barrier Method Scenario 1 - hitting the horizontal barrier</i></p><h2 id="The-definition-of-the-meta-Labeling"><a href="#The-definition-of-the-meta-Labeling" class="headerlink" title="The definition of the meta-Labeling"></a>The definition of the meta-Labeling</h2><p>Meta-labeling sounds like simply an extra label, but the term actually refers to a series of steps that produce the final prediction. 
<a href="https://www.youtube.com/watch?v=ZCFmZFBtqsQ">This YouTube video</a> by <a href="https://hudsonthames.org/">Hudson &amp; Thames</a> neatly summarizes the core idea of meta-labeling:</p><blockquote><p><em>Meta-labeling is a machine learning (ML) layer that sits on top of any base primary strategy to help size positions, filter out false-positive signals, and improve metrics such as the Sharpe ratio and maximum drawdown.</em></p></blockquote><p>The steps to implement meta-labeling can be summarized as follows:</p><ol><li>Build the primary fundamental model and get the fundamental prediction.</li><li>Use a fixed value to filter the prediction.</li><li>Combine the prediction into your <code>x_train</code> as your new training data.</li><li>Combine the prediction into your <code>y_train</code> to form the new <code>y_train</code> data.</li><li>Construct the secondary model, and use your new <code>x_train</code> and <code>y_train</code> to train your secondary model.</li><li>Feed your <code>test_data</code> into both your primary and secondary model, and produce the predictions respectively.</li><li>Combine the predictions of both the primary and secondary models in order to acquire your final prediction.</li></ol><img data-src="/2022/10/21/2022-10-15-meta-label/meta_label_process.png" class="" width="800"><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Meta-Labeling process</i></p><p>If you have just read my <a href="https://mikelhsia.github.io/2022/08/20/2022-08-20-votingclassifier/">previous post</a>, you might ask <em>“aren’t meta-labeling and ensemble learning the same thing?”</em> The fundamental difference between the two is that ensemble learning (especially the stacking method) only adds its prediction to the original training set as a new feature. 
Meta-labeling, on the other hand, not only adds its own prediction to the training set, but also modifies the <code>y_label</code> in the training set in accordance with the signal generation logic of the primary model. With a general idea of what these two are, we can start strategizing how to achieve our goal from zero to one.</p><h1 id="Train-of-thought-from-zero-to-one"><a href="#Train-of-thought-from-zero-to-one" class="headerlink" title="Train of thought - from zero to one"></a>Train of thought - from zero to one</h1><p>To accomplish this backtest, I’ve summarized five steps below to give you a big picture of what we’re going to do:</p><ol><li>Construct our primary model and generate the meta-label using our primary model</li><li>Use our <strong>modified</strong> Triple-Barrier Method to generate training data for training our secondary model</li><li>Construct the secondary machine learning model</li><li>Train the secondary model</li><li>Execute and place orders with the combined signals.</li></ol><h2 id="1-Construct-our-primary-model"><a href="#1-Construct-our-primary-model" class="headerlink" title="1. Construct our primary model"></a>1. Construct our primary model</h2><p>First of all, we reuse the strategy from <a href="https://mikelhsia.github.io/2021/11/03/2021-11-06-rsi-indicator/">here</a> and take the buy/sell signals it generates as the meta-label. It uses the MACD, Awesome Oscillator, and RSI indicators to generate the trading (buy/sell) signals. 
Other than this, we also prepare the following factors for later use:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">FEATURES = [</span><br><span class="line">  <span class="string">&#x27;_1d_rtn&#x27;</span>, <span class="string">&#x27;_3d_rtn&#x27;</span>, <span class="string">&#x27;_5d_rtn&#x27;</span>, <span class="string">&#x27;_10d_rtn&#x27;</span>, <span class="string">&#x27;_20d_rtn&#x27;</span>, <span class="string">&#x27;_60d_rtn&#x27;</span>,</span><br><span class="line">  <span class="string">&#x27;_1d_volume_change&#x27;</span>, <span class="string">&#x27;_3d_volume_change&#x27;</span>, <span class="string">&#x27;_5d_volume_change&#x27;</span>, <span class="string">&#x27;_10d_volume_change&#x27;</span>, <span class="string">&#x27;_20d_volume_change&#x27;</span>, <span class="string">&#x27;_60d_volume_change&#x27;</span>,</span><br><span class="line">  <span class="string">&#x27;_macd&#x27;</span>, <span class="string">&#x27;_macd_histo&#x27;</span>, <span class="string">&#x27;_macd_change_3&#x27;</span>, <span class="string">&#x27;_macd_change_5&#x27;</span>, <span class="string">&#x27;_macd_change_10&#x27;</span>, <span class="string">&#x27;_macd_change_15&#x27;</span>,</span><br><span class="line">  <span class="string">&#x27;_macd_signal_change_3&#x27;</span>, <span class="string">&#x27;_macd_signal_change_5&#x27;</span>, <span class="string">&#x27;_macd_signal_change_10&#x27;</span>, <span class="string">&#x27;_macd_signal_change_15&#x27;</span>,</span><br><span class="line">  <span class="string">&#x27;_macd_histo_change_3&#x27;</span>, <span class="string">&#x27;_macd_histo_change_5&#x27;</span>, <span 
class="string">&#x27;_macd_histo_change_10&#x27;</span>, <span class="string">&#x27;_macd_histo_change_15&#x27;</span>,</span><br><span class="line">  <span class="string">&#x27;_rsi&#x27;</span>, <span class="string">&#x27;_rsi_change_3&#x27;</span>, <span class="string">&#x27;_rsi_change_5&#x27;</span>, <span class="string">&#x27;_rsi_change_10&#x27;</span>, <span class="string">&#x27;_rsi_change_15&#x27;</span>, <span class="string">&#x27;_awesome_oscillator&#x27;</span>,</span><br><span class="line">]</span><br></pre></td></tr></table></figure><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Factors to be used in the second machine learning model</i></p><h2 id="2-Modified-Triple-Barrier-Method"><a href="#2-Modified-Triple-Barrier-Method" class="headerlink" title="2. Modified Triple-Barrier Method"></a>2. Modified Triple-Barrier Method</h2><p>In the original Triple-Barrier Method, the difference between the first and second vertical barriers indicates the expiration time that is a fixed number. However, the sell signal generated from our primary model could happen before reaching the expiration time. Secondly, we’re not able to predict the exact time between the buy signal and sell signal as each stock would have its own cycle and price movement velocity. Therefore, we need to slightly twist the definition of <strong>expiration time</strong> in Triple-Barrier Method by defining different expiration times for each stock base on the time between the time of generating the buy signal and the time of generating the sell signal.</p><img data-src="/2022/10/21/2022-10-15-meta-label/modified_triple_barrier.png" class="" width="600"><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Modified Triple-Barrier Method using buy/sell signals to form a close period</i></p><p>Since we have defined the expiration time, then we need to resample the original training data into a usable and meaningful format. 
For example, suppose we have time series data that includes ‘price’, ‘1d_rtn’, ‘3d_rtn’, ‘1d_vol’, and ‘3d_vol’ as follows:</p><img data-src="/2022/10/21/2022-10-15-meta-label/data_process_1.png" class="" width="500"><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Example data - 1</i></p><p>Once we have the original data, let’s generate the buy and sell signals from it and attach each signal to its own row. You can then easily mark the rows that carry a buy signal and those that carry a sell signal.<br><img data-src="/2022/10/21/2022-10-15-meta-label/data_process_2.png" class="" width="500"></p><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Example data - 2</i></p><p>Lastly, just as we would when resampling data, we keep all the factors in each row whose buy signal equals <code>True</code>. We calculate the return (gain or loss) between the buy and sell signals. Then we remove the columns ‘price’, ‘buy signal’, and ‘sell signal’. In the end, we have the training data used for training our secondary model.</p><img data-src="/2022/10/21/2022-10-15-meta-label/data_process_3.png" class="" width="500"><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Example data - 3</i></p><h2 id="3-Construct-our-secondary-machine-learning-model"><a href="#3-Construct-our-secondary-machine-learning-model" class="headerlink" title="3. Construct our secondary machine learning model"></a>3. Construct our secondary machine learning model</h2><p>Here I use a basic neural network to predict the winning stocks. The model has two hidden layers, and we use <code>Leaky ReLU</code> as their activation function because I want negative values to still update our model weights instead of doing nothing. 
I found <a href="https://mlfromscratch.com/activation-functions-explained/#/">this post</a> very useful for understanding the differences among activation functions such as GELU, SELU, ELU, ReLU, and Leaky ReLU. Also, since this is a binary classification that predicts whether the trades we made are profitable or not, we’re using <code>binary_crossentropy</code> as our loss function.</p><p>See below for the summary of my neural network setup:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line">self.model = tf.keras.Sequential()</span><br><span class="line">self.model.add(tf.keras.layers.Dense(len(FEATURES), activation=tf.keras.layers.LeakyReLU(alpha=<span class="number">0.01</span>), input_shape=(len(FEATURES),), name=<span class="string">&quot;dense_1&quot;</span>))</span><br><span class="line">self.model.add(tf.keras.layers.Dropout(<span class="number">0.1</span>))</span><br><span class="line">self.model.add(tf.keras.layers.Dense(len(FEATURES), activation=tf.keras.layers.LeakyReLU(alpha=<span class="number">0.01</span>), name=<span class="string">&quot;dense_2&quot;</span>))</span><br><span class="line">self.model.add(tf.keras.layers.Dropout(<span class="number">0.1</span>))</span><br><span class="line">self.model.add(tf.keras.layers.Dense(<span class="number">1</span>, activation=<span class="string">&quot;sigmoid&quot;</span>, name=<span class="string">&quot;predictions&quot;</span>))</span><br><span class="line"></span><br><span class="line"><span class="comment"># Compile model</span></span><br><span 
class="line">self.model.compile(</span><br><span class="line">    optimizer=<span class="string">&#x27;Adam&#x27;</span>,</span><br><span class="line">    loss=<span class="string">&#x27;binary_crossentropy&#x27;</span>,</span><br><span class="line">)</span><br></pre></td></tr></table></figure><h2 id="4-Training-and-predicting"><a href="#4-Training-and-predicting" class="headerlink" title="4. Training and predicting"></a>4. Training and predicting</h2><p>With our training data and our secondary model ready from steps 2 and 3, we can now feed the data into our machine-learning model and start training. Before that, do remember that our data is raw and could contain many outliers and missing values that could contaminate the predictions. Feature engineering is a must. Here are a few things I did before throwing data into the black box:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">PrepareData</span>(<span class="params">self, data</span>):</span></span><br><span class="line">  <span class="string">&#x27;&#x27;&#x27;Prepares the data for a format friendly for 
our model&#x27;&#x27;&#x27;</span></span><br><span class="line">  target_column = <span class="string">&#x27;rtn_bin&#x27;</span></span><br><span class="line">  data = self.__LabelYData(data, <span class="string">&#x27;_y_trade_rtn&#x27;</span>, target_column)</span><br><span class="line"></span><br><span class="line">  data_tmp = data.dropna()</span><br><span class="line"></span><br><span class="line">  X_train = data_tmp.loc[:, FEATURES]</span><br><span class="line">  y_train = (data_tmp.loc[:, target_column]).astype(int)</span><br><span class="line"></span><br><span class="line">  X_train = self.__WinsorizeCustom(X_train, FEATURES)</span><br><span class="line">  X_train = self.__LogCustom(X_train, LOGNORMAL_FEATURE)</span><br><span class="line"></span><br><span class="line">  <span class="keyword">return</span> X_train, y_train</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">__LabelYData</span>(<span class="params">self, df, source=<span class="string">&#x27;_y_trade_rtn&#x27;</span>, rtn_bin=<span class="string">&#x27;rtn_bin&#x27;</span></span>):</span></span><br><span class="line">  <span class="keyword">pass</span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">__WinsorizeCustom</span>(<span class="params">self, df, cols: list</span>):</span></span><br><span class="line">  <span class="keyword">pass</span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">__LogCustom</span>(<span class="params">self, df, cols: list</span>):</span></span><br><span class="line">  <span class="keyword">pass</span></span><br></pre></td></tr></table></figure><h3 id="4-1-Create-y-label-to-train-our-model"><a href="#4-1-Create-y-label-to-train-our-model" class="headerlink" title="4.1. 
Create y_label to train our model"></a>4.1. Create y_label to train our model</h3><p>We need our dependent variable, the so-called <code>y label</code>, to train our secondary machine learning model. There are various ways to achieve this. I pick the easiest method and create the <code>y label</code> by assigning <code>True</code> to every trade whose return is greater than <strong>3.0%</strong> after we sell it. You can pick other methods and see which better suits your scenarios and models.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">__LabelYData</span>(<span class="params">self, df, source=<span class="string">&#x27;_y_trade_rtn&#x27;</span>, rtn_bin=<span class="string">&#x27;rtn_bin&#x27;</span></span>):</span></span><br><span class="line">  df[rtn_bin] = np.nan</span><br><span class="line"></span><br><span class="line">  df[rtn_bin] = df[source] &gt; <span class="number">0.03</span></span><br><span class="line"></span><br><span class="line">  <span class="keyword">return</span> df</span><br></pre></td></tr></table></figure><h3 id="4-2-Winsorize-the-outliers"><a href="#4-2-Winsorize-the-outliers" class="headerlink" title="4.2. Winsorize the outliers"></a>4.2. Winsorize the outliers</h3><p>Winsorization is the process of replacing the values of outliers with less extreme values. 
Here, to reduce the impact of extreme values, we replace values below the 5th percentile with the 5th-percentile value, and values above the 95th percentile with the 95th-percentile value.</p><img data-src="/2022/10/21/2022-10-15-meta-label/winsorize.png" class="" width="600"><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Winsorization</i></p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">__WinsorizeCustom</span>(<span class="params">self, df, cols: list</span>):</span></span><br><span class="line">  <span class="keyword">for</span> col <span class="keyword">in</span> cols:</span><br><span class="line">    quantiles = df.loc[:, col].quantile([<span class="number">0.05</span>, <span class="number">0.95</span>])</span><br><span class="line">    q_05 = quantiles.loc[<span class="number">0.05</span>]</span><br><span class="line">    q_95 = quantiles.loc[<span class="number">0.95</span>]</span><br><span class="line"></span><br><span class="line">    df.loc[:, col] = np.where(</span><br><span class="line">      df.loc[:, col].values &lt;= q_05,</span><br><span class="line">      q_05,</span><br><span class="line">      np.where(</span><br><span class="line">        df.loc[:, col].values &gt;= q_95,</span><br><span class="line">        q_95,</span><br><span class="line">        df.loc[:, 
col].values</span><br><span class="line">      )</span><br><span class="line">    )</span><br><span class="line">  <span class="keyword">return</span> df</span><br></pre></td></tr></table></figure><h3 id="4-3-Transform-our-data-to-the-log-normal-distribution"><a href="#4-3-Transform-our-data-to-the-log-normal-distribution" class="headerlink" title="4.3. Transform our data to the log-normal distribution"></a>4.3. Transform our data to the log-normal distribution</h3><p>A common rule of thumb in financial machine learning is that features closer to a normal distribution help a model train effectively. After plotting each of the factors as a histogram, you can easily tell which features are skewed, and then apply a log transform to make them less skewed. One thing worth bringing up is that some features could contain a <code>0</code> value. Since log 0 is undefined (NumPy’s <code>np.log(0)</code> evaluates to <code>-inf</code>), which would cause model training to fail, make sure you add <code>1</code> before you log transform the feature values.</p><img data-src="/2022/10/21/2022-10-15-meta-label/log_transform.png" class="" width="800"><p style="font-size: 0.8em; text-align:center; color: grey;">  <i>Transform the right-skewed distribution to normal distribution</i></p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">__LogCustom</span>(<span class="params">self, df, cols: list</span>):</span></span><br><span class="line">  <span class="keyword">for</span> col <span class="keyword">in</span> cols:</span><br><span class="line">    df.loc[:, col] = np.log(df.loc[:, col] + <span class="number">1</span>)</span><br><span class="line">  <span class="keyword">return</span> 
df</span><br></pre></td></tr></table></figure><h3 id="4-4-Drop-the-null-data"><a href="#4-4-Drop-the-null-data" class="headerlink" title="4.4. Drop the null data"></a>4.4. Drop the null data</h3><p>Don’t forget to drop the rows with null data. Otherwise, your model is going to fail while training and predicting.</p><h2 id="5-Execute-and-place-orders-with-the-combined-signal"><a href="#5-Execute-and-place-orders-with-the-combined-signal" class="headerlink" title="5. Execute and place orders with the combined signal"></a>5. Execute and place orders with the combined signal</h2><p>Lastly, to summarize our trading strategy: when we receive a buy signal from the primary model, we send that day’s data to the secondary model as the input for prediction. Once the secondary model confirms the signal with a predicted probability above 0.5 of being a winning trade, we place the buy order. As for selling the stock, we don’t need confirmation from the secondary model. As soon as the primary model generates the sell signal, we sell the related holding.</p><h1 id="Backtesting-and-result"><a href="#Backtesting-and-result" class="headerlink" title="Backtesting and result"></a>Backtesting and result</h1><p>Ok. Let’s see what the backtesting results look like following the strategy described above. 
I will start by describing the backtesting scenarios that we’re going to run, and then present the results.</p><h2 id="Platform"><a href="#Platform" class="headerlink" title="Platform"></a>Platform</h2><p><a href="https://www.quantconnect.com/">QuantConnect</a></p><h2 id="Universe"><a href="#Universe" class="headerlink" title="Universe"></a>Universe</h2><ul><li>Sort stocks by <code>PERatio</code>, <code>EPS</code>, <code>ROE</code>, and <code>NetIncome</code>, and take the top 60%</li><li>Sort stocks by <code>PBRatio</code>, from high to low</li><li>Focus on the <code>technology</code> industry</li><li>Use QQQ as the portfolio benchmark</li></ul><h2 id="Rebalancing-Strategy"><a href="#Rebalancing-Strategy" class="headerlink" title="Rebalancing Strategy"></a>Rebalancing Strategy</h2><ul><li>We recalculate our universe and indicators every day to search for buy and sell signals.</li><li>We keep the 10 stocks that have buy signals and the highest PBRatio.</li><li>We weight each position equally.</li><li>We don’t adjust the weight of each stock until we close these positions.</li><li>The secondary model is re-trained monthly, weekly, or daily, depending on the scenario.</li></ul><h2 id="Backtest-time-frame"><a href="#Backtest-time-frame" class="headerlink" title="Backtest time frame"></a>Backtest time frame</h2><p>Backtest Date: <code>2018, 12, 29</code> ~ <code>2022, 09, 24</code></p><h2 id="Execution-and-backtest"><a href="#Execution-and-backtest" class="headerlink" title="Execution and backtest"></a>Execution and backtest</h2><h3 id="Scenarios"><a href="#Scenarios" class="headerlink" title="Scenarios"></a>Scenarios</h3><p>There are two strategies in our arsenal, and I’m going to try both. I’m also going to take the model update frequency into account by retraining the model on up-to-date data every month, every week, or every day. 
Therefore, our backtests cover six retraining scenarios plus two baseline scenarios that serve as benchmarks.</p><p><strong><em>1. MACD strategy benchmark</em></strong><br><strong><em>2. MACD strategy + Update monthly</em></strong><br><strong><em>3. MACD strategy + Update weekly</em></strong><br><strong><em>4. MACD strategy + Update daily</em></strong><br><strong><em>5. MACD+RSI strategy benchmark</em></strong><br><strong><em>6. MACD+RSI strategy + Update monthly</em></strong><br><strong><em>7. MACD+RSI strategy + Update weekly</em></strong><br><strong><em>8. MACD+RSI strategy + Update daily</em></strong></p><h3 id="Backtesting-results"><a href="#Backtesting-results" class="headerlink" title="Backtesting results"></a>Backtesting results</h3><div class="table-container"><table><thead><tr><th>Strategy</th><th>Total Trades</th><th>PSR</th><th>Unrealized</th><th>Fee</th><th>Return</th><th>Sharpe</th><th>MDD</th><th>Win rate</th><th>Alpha</th><th>Beta</th><th>Annual variance</th></tr></thead><tbody><tr><td>MACD Benchmark</td><td>838</td><td>23.299%</td><td>-$29,616.40</td><td>-$4,091.44</td><td><strong>146.89%</strong></td><td>0.787</td><td>42.300%</td><td><strong>67%</strong></td><td>0.119</td><td>1.053</td><td>0.087</td></tr><tr><td>MACD Monthly</td><td>928</td><td>14.622%</td><td>-$30,950.28</td><td>-$4,372.03</td><td><font color="red">79.55%</font></td><td>0.587</td><td>44.700%</td><td><font color="red">66%</font></td><td>0.038</td><td>1.067</td><td>0.068</td></tr><tr><td>MACD Weekly</td><td>834</td><td>21.410%</td><td>-$25,909.75</td><td>-$3,761.04</td><td><font color="green">150.54%</font></td><td>0.773</td><td>41.200%</td><td><font color="green">68%</font></td><td>0.125</td><td>1.074</td><td>0.096</td></tr><tr><td>MACD Daily</td><td>832</td><td>28.575%</td><td>-$36,755.89</td><td>-$4,177.46</td><td><font color="green">182.07%</font></td><td>0.88</td><td>42.500%</td><td><font 
color="green">68%</font></td><td>0.154</td><td>1.02</td><td>0.09</td></tr><tr><td>MACD+RSI Benchmark</td><td>1270</td><td>18.732%</td><td>-$14,361.51</td><td>-$6,592.32</td><td><strong>96.85%</strong></td><td>0.665</td><td>43.200%</td><td><strong>58%</strong></td><td>0.063</td><td>1.014</td><td>0.067</td></tr><tr><td>MACD+RSI Monthly</td><td>941</td><td>14.348%</td><td>-$7,308.52</td><td>-$4,382.58</td><td><font color="red">72.05%</font></td><td>0.575</td><td>46.900%</td><td><font color="red">56%</font></td><td>0.036</td><td>0.946</td><td>0.057</td></tr><tr><td>MACD+RSI Weekly</td><td>829</td><td>1.204%</td><td>-$436.44</td><td>-$3,535.95</td><td><font color="red">-7.91%</font></td><td>0.03</td><td>56.000%</td><td><font color="red">54%</font></td><td>-0.077</td><td>0.771</td><td>0.042</td></tr><tr><td>MACD+RSI Daily</td><td>898</td><td>19.534%</td><td>-$7,414.81</td><td>-$4,494.80</td><td><font color="green">92.44%</font></td><td>0.679</td><td>43.000%</td><td><font color="red">56%</font></td><td>0.065</td><td>0.89</td><td>0.056</td></tr></tbody></table></div><h1 id="Beyond-and-next"><a href="#Beyond-and-next" class="headerlink" title="Beyond and next"></a>Beyond and next</h1><p>The purpose of meta-labeling is not just to filter out false-positive predictions, but also to raise the F1 score of the model. By adding a machine learning layer on top of the primary non-machine-learning model, meta-labeling makes it possible to process quantitative fundamental data, technical indicators, and even arbitrary data in a more systematic way. 
This combines human intuition and experience with the power of machines, enhancing the interpretability and robustness of the model.</p><p>Even though the backtesting results did not show a decisive advantage for meta-labeling, there are still a few ideas for extending our backtests and further optimizing our meta-labeling trading algorithm:</p><ul><li>Add company-level fundamental data to our training data so that the secondary machine learning model knows more about each company’s condition and can make better decisions.</li><li>In our backtest scenarios, we round the prediction to either True or False. This means the threshold is 0.5: a prediction below 0.5 is treated as a likely losing trade, and a prediction above 0.5 as a likely winning trade. One thing we can do is raise the threshold above 0.5 so that we only take trades with an even higher predicted chance of winning. Keep in mind that this is a trade-off: a higher threshold means fewer trades, and some winning opportunities may slip through your fingers.</li><li>Find a better way to define the y-label so that it distinguishes the stocks that are going to soar from those that are going to decline. 
You could either raise the original <strong>3.0%</strong> to a larger number or label only the top 20% of winning stocks, so that our y-label will not be restricted to only the winning stocks during the bear market.</li></ul><h1 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h1><ul><li>The book <strong><em>Advances in Financial Machine Learning</em></strong> by <em>Marcos Lopez de Prado</em></li><li><a href="http://www.sefidian.com/2021/06/26/labeling-financial-data-for-machine-learning/">Labeling financial data for Machine Learning</a> by <em>Amir Masoud Sefidian</em></li><li><a href="https://www.youtube.com/watch?v=ZCFmZFBtqsQ">Meta-Labeling: Theory and Framework - YouTube video</a> by <a href="https://hudsonthames.org/"><em>Hudson &amp; Thames</em></a></li></ul><hr><blockquote><p>If you enjoy reading this, feel free to <a href='https://medium.com/@mikelhsia/membership'>join the Medium membership program</a> to read more about Quantitative Trading Strategy.</p></blockquote>]]></content>
    
    
    <summary type="html">&lt;img data-src=&quot;/2022/10/21/2022-10-15-meta-label/cover.jpeg&quot; class=&quot;&quot; width=&quot;600&quot;&gt;
&lt;p&gt;The triple-barrier method and the meta-labeling technique were both introduced in the book &lt;strong&gt;&lt;em&gt;Advances in Financial Machine Learning&lt;/em&gt;&lt;/strong&gt; by &lt;em&gt;Marcos Lopez de Prado&lt;/em&gt;. Combining these two tools appears to be a good way to either stabilize or further increase your portfolio growth. In this post, I’m going to reuse my earlier research result (&lt;a href=&quot;https://mikelhsia.github.io/2021/11/03/2021-11-06-rsi-indicator/&quot;&gt;here&lt;/a&gt;) as the baseline strategy benchmark, and apply these two techniques to see what benefit they bring to this strategy.&lt;/p&gt;</summary>
    
    
    <category term="Quantitative Trading" scheme="http://mikelhsia.github.io/categories/Quantitative-Trading/"/>
    
    <category term="Machine Learning" scheme="http://mikelhsia.github.io/categories/Quantitative-Trading/Machine-Learning/"/>
    
    
    <category term="Technical Analysis" scheme="http://mikelhsia.github.io/tags/Technical-Analysis/"/>
    
    <category term="Backtesting" scheme="http://mikelhsia.github.io/tags/Backtesting/"/>
    
  </entry>
  
</feed>