Understanding Financial Market: Structures and Signals

Date of Award


Document Type


Degree Name

Master of Science in Natural Language Processing


Natural Language Processing

First Advisor

Prof. Preslav Nakov

Second Advisor

Prof. Kun Zhang


People from different domains, experiences, cultures, and nationalities view the stock market from many different perspectives. From time to time, theories emerge that claim to explain certain market phenomena. These include the Efficient Market Hypothesis, Dow Theory, and factor models. However, many such theories hold only for a limited period or under specific market conditions. Further, identifying and validating the active market participants at any given time, including their trading styles, is very difficult. Trading strategies that work in one setting may not work in another instrument, market, exchange, or country. Stock exchanges introduce new trading symbols, instrument types, expiry days, and initial public offerings to engage participants and thereby achieve sufficient liquidity. All of these components present traders and non-traders with both opportunities and confusion. While some participants navigate all this easily, others tend to make it complex. How some individuals or funds stay net positive in the long run remains a black box. In fact, hedge fund management companies, institutional trading funds, mutual funds, and banks are all expected to have a dynamic alpha. Hundreds or thousands of proprietary strategies sustain them over the long run. Yet none of them reveals the technical details, which are highly confidential, to the public or even to their own employees. On the other hand, some traders and non-traders believe that stock markets are random processes and, ultimately, a casino for gambling. A considerable number of books, podcasts, and media outlets also publish content daily debating both sides. Computers have already brought considerable advances to the stock trading business as a whole. The main advantage of using computers to execute trading strategies is that it eliminates the subjective processes involved in decision-making.
A subjective decision-making process passes through emotions such as fear and greed, along with luck, in varying degrees before arriving at a final decision, whereas computers handle the same decisions rigorously. Still, given the current intelligence level of computer software, most active trading systems today are only semi-automatic. Recent progress in data-driven artificial intelligence is expected to complement efforts to decode the markets in an explainable way. The main contribution of this thesis is an attempt to understand financial market structures and signals across different data and time periods. Natural Language Processing (NLP) methods address the impact of news, which is more critical than any other NLP data source. Both the sentiment and the impact level of news play a vital role. Causal methods help to answer why certain market phenomena occur. This understanding could radically change the development of trading strategies. The representation and machine learning methods used aim to improve the returns of a stock trading system.


Thesis submitted to the Deanship of Graduate and Postdoctoral Studies

In partial fulfilment of the requirements for the M.Sc. degree in Natural Language Processing

Advisors: Preslav Nakov, Kun Zhang

Online access available for MBZUAI patrons