Recovering Lost Information from Exponential Moving Averages on HFT Market Data

In algorithmic trading, traders and quantitative researchers develop models whose signals aim to predict the best times to open and close trades, with the goal of maximizing returns by the end of the trading period. Because raw price changes are noisy and therefore hard to interpret, algorithmic traders typically build their signals on a smoothed version of the price series, obtained by applying a moving average. A moving average smooths price data by grouping noisy observations into windows of size W and mapping every data point to the average of its window. This approximates the noiseless trend of the price data by averaging values over W, thereby removing noise from the data. One specific type of moving average, the Exponential Moving Average (EMA), is heavily used for unevenly spaced time series data. EMA formulas admit each data point into the curve at a ratio that depends on the time elapsed since the previous data point entered the stream. Subsequent iterations shrink the data point's contribution by a user-specified decay factor until it becomes negligible. A major disadvantage of this approach arises in high-frequency trading, where the time differences between data points are fractions of a second and are at times exactly zero. Such data points enter the moving average stream at a ratio of 0.0%, so significant information is lost in the process. In this paper, we explore this phenomenon and propose an approach to recover the lost information. Our solution introduces an additional hyperparameter r, representing the desired ratio at which every data point is to enter the smoothing curve, without compromising the logic of the EMA process or allowing noisy data to contribute to the curve.
Our proposed approach significantly improves on the currently used approach on quantitative metrics. It also shows the importance of generating real-time trends for traders: a reinforcement learning agent trained on each signal separately generates more profit on unseen data when using our proposed method than when using the current approach or the noisy market value data as a signal.
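
The zero-weight effect described in the abstract can be illustrated with a minimal Python sketch. It assumes one common form of the time-weighted EMA, in which a data point arriving dt after the previous one enters with weight 1 - exp(-dt / tau); the abstract does not state the exact formula used in the thesis, so this form, the function name `ema_irregular`, the parameter `tau`, and the sample data are all assumptions. The optional `min_weight` floor is only a hypothetical stand-in for the hyperparameter r and is not the thesis's method.

```python
import math


def ema_irregular(times, values, tau, min_weight=0.0):
    """Time-weighted EMA over unevenly spaced observations.

    Assumed weighting (not taken from the thesis): w = 1 - exp(-dt / tau).
    min_weight is a hypothetical floor on w, used here only to illustrate
    the idea of a guaranteed entry ratio.
    Returns the smoothed series and the weight applied at each step.
    """
    if len(times) != len(values) or not values:
        raise ValueError("times and values must be non-empty and equal in length")
    ema = float(values[0])
    smoothed = [ema]
    weights = [1.0]  # the first point seeds the curve
    for i in range(1, len(values)):
        dt = times[i] - times[i - 1]
        w = 1.0 - math.exp(-dt / tau)  # equals 0.0 when dt == 0
        w = max(w, min_weight)         # hypothetical floor; no effect at 0.0
        ema = ema + w * (values[i] - ema)
        smoothed.append(ema)
        weights.append(w)
    return smoothed, weights


# Made-up ticks: the third and fourth share a timestamp with the second.
times = [0.0, 1.0, 1.0, 1.0, 2.0]
prices = [100.0, 100.0, 110.0, 120.0, 100.0]

plain, plain_w = ema_irregular(times, prices, tau=1.0)
floored, floored_w = ema_irregular(times, prices, tau=1.0, min_weight=0.1)

print(plain_w)   # weights of the same-timestamp ticks are zero
print(plain)     # the 110 and 120 ticks leave no trace in the curve
print(floored)   # with the floor, those ticks move the curve slightly
```

In the unfloored run, the two ticks with identical timestamps receive zero weight and the curve never registers the jump to 110 and 120, which is the information loss the abstract refers to.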

Thesis submitted to the Deanship of Graduate and Postdoctoral Studies

In partial fulfillment of the requirements for the M.Sc. degree in Machine Learning

Advisors: Dr. Le Song, Dr. Bin Gu
