Monday, November 30, 2015

Trade Idea: Positioning For ECB

Expectations for the next ECB meeting are running high. Many at the ECB, including Mr. Draghi, have already talked up further measures. There is already talk of telescopic monetary policy taxation in the air, and plenty of speculation to follow.

The ECB announced QE at the end of January this year. The market had priced in QE aggressively then as well, and yet the ECB over-delivered. The continuation of the move past the announcement, and well past the start date, is proof of that. This time too, the market has aggressively priced in the possibility of further measures. The question is: will they deliver this time, or over-deliver?

The recent moves in the market in many ways resemble the period before the last QE announcement, while differing in some crucial details. The major similarities: 1) a sell-off in the euro, 2) a rally in the front end. The major differences: 1) the remarkable steadiness of the long end and 2) a steepening of the curve. The figure below shows the excess moves in rates and slopes (relative to the US). The blue columns are the moves ahead of the last QE, the red ones the current moves, and the green columns show what the current moves should be if we adjust for the move in the euro (that is, we assume the euro move correctly prices expectations and compare rates moves on that basis). As we can see, the move in the front end is much stronger than before, and the reverse holds for the long end. In fact, corrected for the euro, the move in 10y is about fair, while the 8th euro futures contract rallied the most and 30y moved much less than expected.

The bone of contention here is, of course, what exactly the ECB will do. As is clear from the picture above, the market is fully, or to a large extent, pricing in action at the front end, that is, a significant depo cut. And with all the stories of -20/-50 tiers, or -35 flat, or any other possible combination, it is hard to say what happens if the ECB does deliver a significant deposit cut. There is no reason to believe they cannot exceed expectations. So perhaps the short-end move is justified.

But here is the key. Whatever the depo cut turns out to be, it is not important in itself. It is plausible that the point of the cut is simply to make the QE program more tenable. With 15% of euro-area govies trading below the current depo rate, the ECB has a strong incentive. The question is: if they do deliver, what does that mean for the long end? It does not mean we have increased supply, nor does it mean the depo cut is reflationary. All it does is save the QE program by making more bonds eligible. I have not checked for the euro area as a whole, but based on the German distribution of yields and amounts outstanding, a move from -20bps to -35bps makes roughly 84b more available. With a capital key of 18%, that is ball-park 470b more paper for the ECB to buy, approximately 8 months' worth of QE. This is significant. But we also have to factor in the feedback response, as the market may push the curve further down and thus neutralize part of the impact the ECB hoped to create. So if the depo cut comes with any significant expansion of QE in terms of time or size, the long end should be biased for a rally. And if the depo cut does not match market expectations, the short end will sell off back to previous levels.
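The back-of-the-envelope arithmetic above can be sketched as follows; the inputs (84b of newly eligible German paper, an 18% capital key, and a 60b/month purchase pace) are the rough estimates from the text, not official figures:

```python
# Back-of-the-envelope: extra QE capacity freed up by a depo cut.
# All figures are rough estimates from the text, not official numbers.
def extra_qe_capacity(freed_up_national=84e9, capital_key=0.18, monthly_qe=60e9):
    """Scale newly eligible national paper up by the capital key and
    express the euro-area equivalent in months of QE purchases."""
    total_eligible = freed_up_national / capital_key   # euro-area equivalent
    months = total_eligible / monthly_qe               # months of QE at current pace
    return total_eligible, months

total, months = extra_qe_capacity()
print(round(total / 1e9), round(months, 1))  # ball-park 467b, close to 8 months
```

This reproduces the "roughly 470b, approx. 8 months" numbers in the text.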

And while we have all this, another point to note is the level of vols. The implieds are way too high compared to delivered. But if we adjust for the Fed hike expectation (by computing implied/realized premiums in EUR over USD), the front end is still cheap compared to the long end on a realized basis, with 5y around fair.

I believe whatever the ECB does will hardly be a lasting change. Europe needs fiscal stimulus now. Monetary policy is just a tool to avoid falling behind; it can hardly give a large push ahead. Whatever the move, follow the momentum, and then position for a fall back. The ECB claims QE has "clearly" worked, but real rates in the euro area were back at 2014 levels at the end of August, before the new QE expectations kicked in.

The trade here: a convex flattening position in 5s30s or 5s10s. If not through spread options, then given the vol richness and the underlying directionality, buying the belly payer vs. the long end looks better than the alternative on a risk-reward basis. Otherwise, nothing: wait until the announcement; there is no point trying to fade the market from here.

EDIT (3-Dec-2015 08:55 UTC): The hidden risk to this view is the ECB doing away with the yield-floor limit for QE eligibility. That could lead to a large upward correction in long-term yields and a significant steepening.

Sunday, November 29, 2015

Time Series Momentum Strategies | The Spirits Within - Part V

This is part of a series on time series momentum. The previous posts in this series are:

1. Part-I: Time Series vs Cross-sectional momentum
2. Part-II: Nature of linear time series momentum filters
3. Part-III: Types of sizing function
4. Part-IV: Strategy characteristics for random walk with a trend

In the last post we analyzed the characteristics of a generic momentum strategy when the underlying follows a random walk with a known trend. In this post we look into the return characteristics when the underlying is an auto-regressive process, specifically AR(1).

Once again, we assume the sizing function $\Psi$ is linear. The asset return $r_t$ is given by $r_t = \phi r_{t-1}+\sigma\epsilon_t$, where $\epsilon_t\sim N(0,1)$. Remembering that for a linear sizing function the expected return of the strategy is the expected value of the signal times the return, we get
$$E(R_t)=E(S_tr_t)=E\left(\sum_{s=t-k}^{t-1}(w_sr_s)r_t\right)=\sum_{s=t-k}^{t-1} w_s E(r_sr_t)=\sum_{s=t-k}^{t-1} w_s \gamma_{t-s}=\Sigma^2\sum_{s=t-k}^{t-1}w_s\,\phi^{t-s}$$
Here $\Sigma^2= \frac{\sigma^2}{1-\phi^2}$ is the unconditional variance of the process. This makes the expected return sensitive to the weighting scheme, i.e. the signal, and exponentially sensitive to the auto-correlation coefficient $\phi$. Notice how this differs from the previous case. The higher-order moments like variance and skew also vary exponentially with $\phi$, as in the figure below (left-hand chart).

Everything (expected return, strategy vol, skew) increases with $\phi$; however, the return increases faster than the vol, so the Sharpe ratio improves as $\phi$ increases. Comparing across different types of positioning function (right-hand chart above), the change in the Sharpe ratio is not very significant at reasonable values of $\phi$. The major difference comes in the higher orders, i.e. skew and excess kurtosis. Again, we see that double-step and sigmoid present a competing choice between Sharpe ratio and positive strategy skew, perhaps with a bias to double-step in this particular case.
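The expected-return expression above is easy to verify by simulation. A minimal sketch with illustrative parameters (equal weights over a short lookback, linear sizing), not the exact setup behind the figure:

```python
import numpy as np

# Monte Carlo check of E(R_t) = Sigma^2 * sum_s w_s * phi^s for a linear-sizing
# TSMOM strategy on an AR(1) process (illustrative parameters, equal weights).
rng = np.random.default_rng(0)
phi, sigma, k, T = 0.2, 1.0, 5, 1_000_000

r = np.zeros(T)
eps = rng.standard_normal(T)
for t in range(1, T):                        # r_t = phi * r_{t-1} + sigma * eps_t
    r[t] = phi * r[t - 1] + sigma * eps[t]

w = np.full(k, 1.0 / k)                      # equal weights over a k-period lookback
S = np.convolve(r, w, mode="valid")[:-1]     # S_t = weighted sum of the last k returns
R = S * r[k:]                                # strategy return Psi(S_t) * r_t, Psi linear

Sigma2 = sigma**2 / (1 - phi**2)             # unconditional variance of the process
theo = Sigma2 * np.sum(w * phi ** np.arange(1, k + 1))
print(R.mean(), theo)                        # the sample mean should be close to theo
```

With $\phi=0.2$ the theoretical value is about 0.052, and the sample mean lands within Monte Carlo noise of it.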

The influence of the weighting scheme on strategy performance will of course depend on the underlying process. Here, for the AR(1) process, the shorter the MA lookbacks, the better the performance, to the extent that for very long MAs (like 50/200) the strategy skew even turns negative (not shown here). Similar to the results above for varying $\phi$, the expected return and the strategy skew are more sensitive to the weighting function than the strategy volatility is.

The optimization objective here would again be to estimate the process parameters, but perhaps hoping for higher accuracy than in the case of a random walk with known drift. This is not only because the performance sensitivity is an order higher than before, but also because we need to optimize the weighting function (i.e. the signal), which depends on the underlying process parameters. In the previous case, we would be happy to have confidence in the sign of the drift term, ignoring the accurate estimation of its value. But in this case, for a given risk/reward budget, we need much more accuracy in the estimated value of $\phi$. Nevertheless, as far as positioning is concerned, we again see that sigmoid and double-step are good competing alternatives, with a preference for sigmoid in an implementation with linear instruments and double-step for non-linear instruments.

Monday, November 16, 2015

Five things I do not believe in...

But have no evidence to the contrary. Yet.

  1. That dealers are running zero corporate bond inventories
  2. That China shorts are going to make money for investors (UPDATE: at least not macro shorts; possibly selective equity shorts. There seems to be a fissure between the old and the new economy in China)
  3. That the next crisis (whenever it happens) will mean a dollar rally (against the euro) (UPDATE: see this, although I think it misses the point. It is about the currencies in which global assets and liabilities are funded)
  4. That the migration crisis is just another one for Europe
  5. That we have reached peak geopolitical crisis (think about the power balance in a post-oil-scarcity world)

Wednesday, November 11, 2015

Trade Idea | Long Euro anyone?

Forget the Fed, forget parity. The euro is more yen than anything else. If on Friday we get a ratings downgrade on Portugal, then the ECB suddenly has almost EUR 200b less room in the current QE. That is NOT good for expectations of an expansion. The PSI 20 has already corrected approximately 16% from its recent peak, approximately 6% in the latest bout.

We have a few Fed speakers scheduled this week, but the next ECB meeting is sufficiently far away to be wary of anyone from there talking down the euro anytime soon. There is minimal downside on a tactical long euro trade given the current level, and given the market is now perhaps fully pricing a December Fed hike.

The aggressive trade here is long euro. A more balanced one is a convex long euro, with cheapening achieved via long Portuguese equities.

Monday, November 9, 2015

Time Series Momentum Strategies | The Spirits Within - Part IV

This is part of a series on time series momentum. The previous posts in this series are:

1. Part-I: Time Series vs Cross-sectional momentum
2. Part-II: Nature of linear time series momentum filters
3. Part-III: Types of sizing function

In this post we look into the return characteristics of a generic time series momentum (TSMOM) strategy. We have the expressions for the returns and moments from the previous post. We consider two cases for the behavior of the underlying asset: one where the asset behaves like a Gaussian random walk, and a second where the asset returns are autoregressive (of order 1).

Gaussian Random Walk: Let's assume our sizing function $\Psi$ is linear and the underlying asset is a random walk. That is, the asset return $r_t$ is given by $r_t=\mu + \sigma\epsilon_t$, where $\epsilon_t\sim N(0,1)$ is Gaussian noise. In this case we can find the expected return of a TSMOM strategy as below.
$$E(R_t)=E(S_tr_t)=E\left(\sum_{s=t-k}^{t-1}(w_sr_s)r_t\right)=\sum_{s=t-k}^{t-1} w_s E(r_sr_t)=\sum_{s=t-k}^{t-1} w_s (\gamma_{t-s}+\mu^2)=\mu^2\sum_{s=t-k}^{t-1}w_s=\mu^2$$
Here $\gamma$ denotes the autocovariance of the underlying returns. We obtain the result using the facts that the autocovariance vanishes at all non-zero lags in this particular case, and that $\sum_{s=t-k}^{t-1}w_s=1$ by design. This is a strict TSMOM strategy in the sense that all the $w$ are positive. The result is intuitive: the position size is proportional to the expected return $\mu$, and so is the return on that position, hence the $\mu$-squared term. Note this result does not depend on the exact type of signal, as long as the weights are positive and add up to one. Similarly, we can show the volatility (square root of variance) of the strategy is proportional to $\mu\sigma$. The figure below shows simulated results for different parameters.

As we can observe, the expected return and strategy volatility are as discussed above. The skew of the strategy is positive and increases with decreasing $\mu$ (up to a certain threshold) and with increasing $\sigma$. Excess kurtosis increases with decreasing $\mu$. The signal function $S$ here is a 10 vs 50 period simple moving average crossover. Since in this special case of a random walk all the individual terms under the summation evaluate to the same expression (this is true for all moments), the underlying signal function parameters (i.e. simple vs. exponential, or 5/10 periods vs 50/250 periods) do not influence the performance.
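The $\mu^2$ result is easy to confirm by simulation. A minimal sketch with illustrative parameters and equal-weight signals; note that, as claimed, the lookback length does not matter:

```python
import numpy as np

# Simulation check: for a random walk with drift, E(R_t) = mu^2 under linear
# sizing, for any positive weights summing to one (illustrative parameters).
rng = np.random.default_rng(1)
mu, sigma, T = 0.05, 1.0, 1_000_000
r = mu + sigma * rng.standard_normal(T)      # i.i.d. returns with drift mu

means = {}
for k in (10, 50):                           # two very different lookbacks
    S = np.convolve(r, np.ones(k) / k, mode="valid")[:-1]  # mean of last k returns
    means[k] = (S * r[k:]).mean()            # sample mean of S_t * r_t
print(means)                                 # both values close to mu**2 = 0.0025
```

Both lookbacks converge to the same expected return, within Monte Carlo noise.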

However, the positioning function $\Psi$ will influence the performance. The above results are for a linear function $\Psi=S$. Below is the performance comparison for different types of $\Psi$.

As we can see, there are variations in statistical characteristics across the different choices of $\Psi$. The sigmoid function behaves similarly to the linear function we have already seen. This is expected; for example, a sigmoid can be made to resemble a linear function (with a position cut-off) by an appropriate choice of parameters. In general, for the random walk case, the binary function will show expected returns and variance similar to the underlying itself, with little skew or excess kurtosis. Compared to both, linear will have higher skew due to the higher potential position at the extremes. The sigmoid will usually show a reduced expected return (while more or less maintaining the Sharpe ratio).

The double-step, however, shows a markedly lower Sharpe and higher skew (in spite of the position limit). It has a lower vol, but an even lower expected return makes the Sharpe lower overall (compared to the benchmark linear case). The higher skew comes from the sharp increase in position at a relatively low threshold of the signal (compared, again, to a linear function). Also, the higher the threshold $\epsilon$, the higher the skew.

So in the case of a random walk with deterministic drift, the optimization problem is rather trivial. The underlying signal function does not affect strategy performance much; that includes both the type of signal and the parameter space of the signal function. The choice then reduces to finding an appropriate positioning function $\Psi$. The linear function is usually NOT preferred because of the potentially very large exposure. The sigmoid is a good choice for position limiting with a higher Sharpe. On the other hand, the double-step is a good choice for a high-skew strategy. Depending on the trading style (confidence in the underlying process estimates, along with risk management), the instruments (linear or convex) and the trading horizon (we will come back to trading horizon later in detail), a combination of sigmoid and double-step can deliver the desired mix of Sharpe and positive skew.

Friday, November 6, 2015

Time Series Momentum Strategies | The Spirits Within - Part III

This is part of a series on time series momentum. The previous posts in this series are:

1. Part-I: Time Series vs Cross-sectional momentum
2. Part-II: Nature of linear time series momentum filters

The second phase of designing a momentum strategy is designing the positioning function $\Psi$. This is the function that converts the signal into a position. The common choices are:

1. Sign/binary function (i.e. maximum long position if the signal is positive, maximum short otherwise): the simplest of the lot. The sharp change in positioning near the 0 level of the signal (the ambiguous zone) can lead to increased turnover and related costs. $\Psi = Sign(S)$
2. Linear (including constant): simple, but no limit on the maximum position. $\Psi =c S$, where $c$ is a constant scaling factor.
3. Step function (double-step): a sudden change in direction, although no longer around the ambiguous zone. $\Psi=+1$ for $S>\epsilon$, $-1$ for $S<-\epsilon$. Here $\epsilon$ is the threshold.
4. Sigmoid (error function) or hyperbolic tangent filtering: a smooth combination of linear and binary, moving gradually from one to the other depending on parameters. $\Psi=erf(S)$, or $\Psi=\frac{e^S - e^{-S}}{e^{-S} + e^S}$
5. Reverse sigmoid: a sigmoid with peak sizing inside the long or short zone. $\Psi=e^{1/2}\,S\,e^{-S^2/2}$
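The five choices above can be written down directly. A sketch, where I assume the double-step is flat (zero position) inside the threshold band, and $c$ and $\epsilon$ are free parameters:

```python
import numpy as np

def psi_binary(S):
    return np.sign(S)                            # max long/short either side of 0

def psi_linear(S, c=1.0):
    return c * S                                 # unbounded position

def psi_double_step(S, eps=0.5):
    # +1 above the threshold, -1 below; assumed flat (zero) in between
    return np.where(S > eps, 1.0, np.where(S < -eps, -1.0, 0.0))

def psi_sigmoid(S):
    return np.tanh(S)                            # smooth blend of linear and binary

def psi_reverse_sigmoid(S):
    return np.e**0.5 * S * np.exp(-S**2 / 2)     # peak sizing of 1 at |S| = 1
```

The $e^{1/2}$ prefactor normalizes the reverse sigmoid so its peak position is exactly 1, reached at $|S|=1$.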

Given this setup, we are now ready to look into the performance of such a strategy. By definition, the one-period return of the strategy is $R_t=\Psi(S_t)r_{t}$. The expression for the one-period mean is as below.
$$\mu = E\left(\sum{\Psi(S_t).r_t}\right)$$
The k-th moment is given by 
$$M(k) = E\left(\sum{(\Psi(S_t).r_t)^k}\right)$$
As we have already seen, $S$ can (in the case of linear filters) be expressed as $S_t=\sum(w_sr_s)$. Also, for a linear $\Psi$ we can take the coefficient outside the summation notation.

Wednesday, November 4, 2015

Time Series Momentum Strategies | The Spirits Within - Part II

This is part of a series on time series momentum. Look here for the previous post on this.

Here we focus on time series momentum strategies in a single underlying (as opposed to diversified momentum trading). 

The typical time series momentum trading strategy has two distinct design phases. The first is generating a trading signal based on some logic applied to the underlying price levels or price returns. This can therefore be thought of as a function $S$ converting the underlying prices or returns into a trading signal. The second phase is designing an appropriate positioning function $\Psi$, which converts the output of $S$ into a size or position. In many cases there is a third phase consisting of risk management. This phase includes different types of risk management logic, like stop losses or take profits; we can even club volatility filters under this category. For the sake of practicality and simplicity, we keep risk management out of scope and concentrate on a strategy involving the two basic steps above: designing $S$ and $\Psi$. The schematic below shows how the prices or returns (first term) flow through these filters to generate a profit or loss number (last term).
$$logP_t \Rightarrow S(logP_t) \Rightarrow \Psi(S(logP_t)) \Rightarrow r_{t+1}\Psi$$
$$r_t \Rightarrow S(r_t) \Rightarrow \Psi(S(r_t)) \Rightarrow r_{t+1}\Psi$$
The first line refers to price-based signals, and the second to return-based signals. Here $r_t=\log P_t - \log P_{t-1}$ is the one-period return. Note we are using logarithms of the prices for filtering, while in most cases (like moving averages) simple prices are used. This is for convenience, so that we can write percentage returns as differences of logarithms of prices. The design objective is to choose $S$ and $\Psi$ to optimize performance.

The two most common types of signal design are either a momentum signal on the returns (RMOM) or a moving average crossover signal (XMOV). An RMOM signal computes a weighted average of recent returns and goes long if it is positive. A simple strategy based on such a signal is to buy an asset if, say, the recent monthly return has been positive. A simple RMOM signal will be as below.
$$S_t^{RMOM}=\sum_{s=1}^nw_sr_s=\sum_{s=1}^nw_s(logP_{t-s+1} - logP_{t-s})$$
Here the $w$ are the weights and $n$ is the window of the applied filter. A buy signal is generated for $S_t^{RMOM} \ge 0$. Similarly, a moving average crossover signal tracks two moving averages and signals a buy when the fast one crosses the slow one from below. The moving average signal is as shown below.
$$S_t^{XMOV}=MA_t^{fast} - MA_t^{slow}=\sum_{s=1}^nc_s^{fast}(logP_{t-s+1}) - \sum_{s=1}^nc_s^{slow}(logP_{t-s+1}) = \sum_{s=1}^n(c_s^{fast} - c_s^{slow})logP_{t-s+1}$$
Here $c$ are the weights and $n$ is the window of the applied filters. A buy signal is generated for $S_t^{XMOV} \ge 0$.
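Both signals are straightforward to compute from log prices. A sketch using equal weights for RMOM and simple moving averages for XMOV; the window lengths are illustrative:

```python
import numpy as np

def rmom_signal(log_p, n=20):
    """Equal-weighted average of the last n one-period log returns."""
    r = np.diff(log_p)
    return r[-n:].mean()

def xmov_signal(log_p, n_fast=50, n_slow=250):
    """Fast minus slow simple moving average of log prices."""
    return log_p[-n_fast:].mean() - log_p[-n_slow:].mean()

log_p = np.linspace(0.0, 1.0, 300)          # a steadily rising log-price path
print(rmom_signal(log_p) >= 0, xmov_signal(log_p) >= 0)   # True True: buy on both
```

For a rising price path, both rules generate a buy, as expected.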

Pedersen and Levine (of AQR Capital) have shown that these two are in fact equivalent ways of expressing the same filtering. They have even shown that, in general, all linear filters are equivalent. For example, to express an XMOV signal as an equivalent RMOM expression, compute the equivalent weights as
$$w_s=\sum_{j=1}^s(c_j^{fast} - c_j^{slow})$$
Here are some examples of price-level filters mapped back to return space by applying these results.

Here the simple MA crossover is based on 50 period and 250 period (fast and slow, respectively) moving average filters. The corresponding EWMA is designed to have similar filtering (in the sense that the net signal has a similar center of mass). Also note that the net signal weights are both positive and negative in price space, but strictly positive in return space in these two cases.
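A numerical check of this mapping for the simple 50/250 crossover (my own sketch of the Pedersen-Levine result, not their code): the price-space signal and its return-space counterpart with $w_s=\sum_{j\le s}(c_j^{fast}-c_j^{slow})$ agree exactly, and the return-space weights are non-negative.

```python
import numpy as np

n_fast, n_slow = 50, 250
c_fast = np.zeros(n_slow)
c_fast[:n_fast] = 1.0 / n_fast                   # fast SMA coefficients on log prices
c_slow = np.full(n_slow, 1.0 / n_slow)           # slow SMA coefficients
w = np.cumsum(c_fast - c_slow)                   # equivalent return-space weights

rng = np.random.default_rng(3)
p = np.cumsum(0.01 * rng.standard_normal(n_slow + 1))[::-1]  # p[0] = latest log price

xmov = (c_fast - c_slow) @ p[:n_slow]            # XMOV computed in price space
rmom = w @ (p[:n_slow] - p[1:])                  # the same signal as weighted returns
print(np.isclose(xmov, rmom))                    # True: the two forms coincide
```

The agreement follows from summation by parts: the telescoping returns reassemble the prices, and the coefficient differences sum to zero.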

So we see that, in general, we can represent the signal function $S$ as weighted past returns, in the form $\sum{w_sr_s}$, at least for linear filtering and scaling. Next we look at the positioning function $\Psi$.

This covers a significant number of technical indicators (like ROC, MACD, or even normalized momentum filters like CCI, assuming a known and constant volatility, i.e. constant scaling). However, it excludes non-linear indicators like the Aroon oscillator.

Tuesday, November 3, 2015

Time Series Momentum Strategies | The Spirits Within - Part I

This is the first part of a small series of observations on general time series momentum strategies. In it I lay threadbare generic time series momentum strategies, with the objective of establishing a general theoretical underpinning for strategy performance and the optimization approach.

The class of "time series momentum" strategies is very common and popular among investors. At its simplest, it means buy high and sell low. Or more precisely, buy high (an asset that has recently appreciated) to sell at an even higher price, and sell low (an asset that has recently sold off) to buy back at an even lower price. From this point of view, it is fundamentally different from the value investing paradigm. In other words, any returns generated by this class of strategies should be independent, and hence should represent an independent risk factor to investors.

Most of the theoretical interest in momentum investing started with the classic paper by Jegadeesh and Titman in 1993, where they found strong evidence of momentum profits. Further research has followed these interesting observations since then. However, what is described as momentum in that case is usually termed cross-sectional momentum (XSMOM). This is fundamentally different from what is usually understood as time series momentum (TSMOM).

An XSMOM strategy looks at the relative performance of a basket of assets, and invests in the winners and shorts the losers. The buy/sell signal is generated from the relative performance of different assets (the cross-section) within a given time interval. On the other hand, a TSMOM strategy looks at the past performance of a single asset, and buys if it has been a winner, or sells otherwise. A TSMOM strategy may or may not involve a basket of assets; it does not crucially depend on a basket for its implementation (unlike XSMOM, which is meaningless without the context of a basket). When a basket is used for TSMOM, it is for diversification and risk management.

This fundamental difference in construction shows how the performance of the two strategies can be similar or different. For the sake of comparison, let's assume in both cases we have a basket of two assets. Evidently, if we indeed have strongly correlated changes in asset prices (past performance predicts future performance), then both strategies will perform, as we will be buying and selling the right assets by construction. However, even with a change in momentum, if there is an increase in the dispersion of asset performance (e.g. winners become losers, but losers become even more so - not a complete reversal, where winners become losers and losers become winners), then XSMOM will still perform. Similarly, even without strong auto-correlation, persistent trends will make TSMOM perform better.

It seems to me the first major theoretical insight into TSMOM strategies was presented by Moskowitz, Ooi and Pedersen in 2012. The paper also captures the essence of the above paragraph by breaking down the sources of profit in XSMOM and TSMOM strategies as follows (in terms of one-period expected returns):

$$E[r_{t,t+1}^{XSMOM}] = \frac{N-1}{N^2}tr(\Omega) - \frac{1}{N^2}[l^T\Omega l - tr(\Omega)] + 12\sigma_{\mu}^{2}$$
$$E[r_{t,t+1}^{TSMOM}] = \frac{tr(\Omega)}{N} + 12\frac{\mu^T\mu}{N}$$

Here $\Omega$ is the covariance matrix of returns, $\mu$ is the vector of mean returns, $\sigma_{\mu}^{2}$ is the cross-sectional variance of the means, $N$ is the number of assets in the basket and $tr()$ is the trace (sum of the diagonal). The above expressions clearly illustrate the points made in the previous paragraphs. The TSMOM return is driven by the auto-covariance and the strength of the drift terms, whereas XSMOM, in addition to auto-covariance, also depends on cross-covariance (dispersion) and the cross-sectional variance of the mean returns (dispersion, again), but not particularly on the strength of the mean returns. These results are valid for a linearly weighted basket (linear in returns), but in general they give good guidance.
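As a toy numerical illustration (my own example, simply evaluating the two expressions as written for a hypothetical two-asset basket):

```python
import numpy as np

# Toy two-asset basket: covariance of one-period returns and mean-return vector.
# Numbers are purely illustrative.
Omega = np.array([[0.04, 0.01],
                  [0.01, 0.09]])
mu = np.array([0.02, -0.01])
N = len(mu)
l = np.ones(N)

# TSMOM: trace (auto-covariance proxy) plus strength of the drifts
tsmom = np.trace(Omega) / N + 12 * (mu @ mu) / N

# XSMOM: trace term, minus cross-covariance term, plus cross-sectional
# variance of the means (np.var uses the population variance, as needed here)
xsmom = ((N - 1) / N**2) * np.trace(Omega) \
        - (1 / N**2) * (l @ Omega @ l - np.trace(Omega)) \
        + 12 * np.var(mu)

print(round(tsmom, 4), round(xsmom, 4))
```

With these inputs the TSMOM expression is larger, driven by the trace term; the XSMOM expression is dragged down by the positive cross-covariance between the two assets.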