Complex Event Processing for Financial Applications
Event processing has helped companies identify and react to situations quickly and effectively, and many solutions monitor events happening both within an enterprise and outside it. However, some situations still require manual effort and intelligence to identify and address.
Such situations could include:
One basic attribute common to all such situations is effective data management. With the advent of electronic communication, the amount of data shared in the financial world has grown exponentially, and this data needs to be analysed in order to leverage the information hidden within it.
Existing methodologies follow a conventional approach – storing all information in a data warehouse and then mining it – but the nature of these problems is such that the earlier they are identified and solved, the lower the risk involved and the greater the profit.
An event represents a business change and therefore does not exist without the business. Every event needs to be interpreted so that subsequent business related actions can be triggered. This interpretation and triggering of subsequent actions is known as simple event processing (SEP).
There can be scenarios where information is hidden across multiple events. Such information is known as business intelligence (BI), and identifying this intelligence and initiating subsequent actions is known as complex event processing (CEP).
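The distinction can be illustrated with a small sketch (the event shapes and thresholds below are invented for illustration): SEP interprets a single event and triggers an action, while CEP derives intelligence from a pattern spread across several events.

```python
# Illustrative only: SEP reacts to one event, CEP to a pattern across events.

def simple_event_processing(event):
    """SEP: interpret a single event and trigger a business action."""
    if event["type"] == "trade" and event["qty"] > 10_000:
        return "alert: large trade"
    return None

def complex_event_processing(events, window=3):
    """CEP: intelligence hidden across multiple events. Here, three
    consecutive price rises in the window suggest an upward trend."""
    prices = [e["price"] for e in events[-window:]]
    if len(prices) == window and all(a < b for a, b in zip(prices, prices[1:])):
        return "alert: upward trend detected"
    return None
```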
Typically, current solutions store data in a database management system (DBMS) and then fire queries across this data. CEP inverts this common design pattern: it first stores and indexes the queries/rules in an efficient structure, and then streams the data through those structures.
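The inverted pattern can be sketched as follows. The registration API and query shapes below are assumptions for illustration: queries are registered up front, and each incoming record is pushed through all of them.

```python
# Minimal sketch of the inverted CEP design: queries are registered first,
# then data records are streamed through the stored queries.

class CepEngine:
    def __init__(self):
        self.queries = []                    # the stored/indexed query set

    def register(self, name, predicate):
        """Store a named query before any data arrives."""
        self.queries.append((name, predicate))

    def stream(self, record):
        """Push one record through every registered query; return matches."""
        return [name for name, pred in self.queries if pred(record)]

engine = CepEngine()
engine.register("big-order", lambda r: r["qty"] >= 1_000)
engine.register("ibm-trade", lambda r: r["symbol"] == "IBM")
```

A real engine would index the predicates so that each record is tested only against the relevant ones, rather than scanning the full list.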
This approach has the following advantages:
There are multiple methodologies, known as CEP agents, on which CEP solutions are based. The following diagram illustrates three of them.
Figure 2: CEP Agents
Source: Polaris
Stream query engines are SQL-based solutions that have queries, data and results, but the way in which these interact with each other is different.
These agents work in a three-step process:
Streams are analogous to tables in a relational database management system (RDBMS). Every record in a stream is similar to a record in a database table, the difference being that stream records are ordered, i.e. they carry a time stamp, whereas records in an RDBMS are not. Information received from external sources is converted into streams and then fed to the query engine.
Stream queries are continuous in nature and are generally configured with time as a dimension, i.e. a query runs repeatedly, each time only over those stream records received after the last run.
The result is the output received from firing queries on the stream of data. It is transformed into events and passed on to the next system in line.
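The three steps above can be sketched as a continuous query over a time-stamped stream. The record shape and the averaging query are assumptions for illustration; each run only sees records received since the previous run.

```python
from collections import deque

class StreamQuery:
    """Continuous query over an ordered stream: each run considers only
    records received since the last run."""
    def __init__(self, query):
        self.query = query
        self.buffer = deque()          # records since the last run

    def feed(self, record):
        """Step 1: external information converted into a stream."""
        self.buffer.append(record)

    def run(self):
        """Step 2: fire the continuous query on the new records.
        Step 3: the result would be transformed into an event downstream."""
        result = self.query(list(self.buffer))
        self.buffer.clear()            # the next run sees only newer records
        return result

# Example query: average price of the records received since the last run.
avg_price = StreamQuery(
    lambda recs: sum(r["price"] for r in recs) / len(recs) if recs else None
)
```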
The stream query approach offers the following advantages:
CEP solutions based on an inference rule engine run predefined rules (representing business functionality) on incoming events (known as facts). Most rule engines on the market today are based on the Rete algorithm – an efficient pattern-matching algorithm designed by Dr Charles Forgy in 1979.
In the Rete algorithm, rules are a ‘collection of pattern matching constructs that are kept as nodes of directed acyclic graphs’ (refer to Figure 3). Nodes can be shared across rules, provided they do not introduce any cycles. When a fact arrives, it passes through the nodes in the graph for a given rule; if it reaches the leaf node for that rule, it passes the rule. Once a rule is passed, further action can be taken – an event sent to an external system, or another set of rules applied to the same fact.
Generally each node has an associated working memory, which helps in problems where time is a dimension (see Figure 4).
For example, a pattern may need to be matched against a fact continuously for 10 minutes; only if it is found consistently over that period will execution proceed to the subsequent patterns in the rule.
In such a scenario, nodes store their intermediate results in working memory. Working memory is generally kept online, bringing a near-real-time quality to pattern-matching solutions.
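A node with working memory for the 10-minute example can be sketched as below. This is a much-simplified single node, not a Rete network; timestamps are plain seconds, and the pattern, record shape and retention rule are assumptions.

```python
class WindowedNode:
    """Rule node with working memory: the pattern must hold for every fact
    seen across the whole window before the rule proceeds. The 600-second
    default mirrors the 10-minute example in the text."""
    def __init__(self, pattern, window_seconds=600):
        self.pattern = pattern
        self.window = window_seconds
        self.memory = []               # intermediate results: (ts, matched)

    def observe(self, ts, fact):
        self.memory.append((ts, self.pattern(fact)))
        # Keep only intermediate results inside the window.
        self.memory = [(t, m) for t, m in self.memory if ts - t <= self.window]
        # Proceed only once the window is covered and the pattern held
        # consistently throughout it.
        covered = ts - self.memory[0][0] >= self.window
        return covered and all(m for _, m in self.memory)
```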
In most inference rule engines used in CEP, matching constructs are if-then-else blocks, which are defined declaratively and loaded into the engine dynamically. Some engines even provide rule maintenance systems with interactive interfaces, through which business users can configure and maintain rules. These features allow an inference rule engine to change dynamically in response to changing business needs, and in turn provide flexibility to the business.
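Declaratively defined if-then rules can be sketched as simple (condition, action) pairs fed to an engine at runtime. The rule names and fact fields below are invented; a production engine would compile such rules into a shared Rete network rather than scanning them one by one.

```python
# Sketch: declarative if-then rules, loadable at runtime (e.g. from a rule
# maintenance system). Each rule is a condition plus a resulting action.

RULES = [
    {"if": lambda f: f["order_qty"] > 5_000, "then": "route-to-dark-pool"},
    {"if": lambda f: f["price_move"] > 0.05, "then": "raise-volatility-alert"},
]

def apply_rules(fact, rules=RULES):
    """Fire the action of every rule whose condition the fact passes."""
    return [r["then"] for r in rules if r["if"](fact)]
```

Because the rule list is plain data, it can be replaced while the engine is running – which is what gives the business its flexibility.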
CEP has found many applications in the financial world. The following are a few examples:
In the stock market, traders generally follow market data and trade (buy/sell) based on their position against it. High-frequency trading aims to capture just a fraction of a penny per share or currency unit on every trade; traders move in and out of short-term positions several times each day.
With CEP, programmes take over the job of analysing market data. They not only analyse the data but also take trading decisions and exploit trading opportunities.
The following statistics from the NY Times reveal the great impact the application of CEP has had on stock trading in the US:
Figure 5: HFT Trends in NYSE
Source: Polaris
In order to reduce market impact and to get the best prices, trade orders can be split or aggregated and placed at different market centres.
CEP solutions known as smart order routers (SOR) help traders achieve the above-mentioned goals. An SOR receives and caches information such as market liquidity from various venues, takes orders from multiple sources (OMS, trading systems, ECNs, etc.), splits or aggregates them, and places orders across venues based on the cached information and configured rules.
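A toy version of the routing step can be sketched as below. The venue names and the greedy allocation rule are assumptions; real SORs weigh price, fees and latency as well as cached liquidity.

```python
# Hypothetical smart-order-router sketch: split a parent order across venues,
# filling the most liquid (cached) venues first.

def route_order(qty, venue_liquidity):
    """Allocate `qty` greedily across venues by cached available liquidity.
    Returns a mapping of venue -> child order size."""
    child_orders = {}
    for venue, available in sorted(venue_liquidity.items(),
                                   key=lambda kv: kv[1], reverse=True):
        if qty == 0:
            break
        take = min(qty, available)       # never exceed a venue's liquidity
        child_orders[venue] = take
        qty -= take
    return child_orders
```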
Most current risk management applications provide post-trade risk analysis, but with CEP traders can perform pre-trade analysis as well.
CEP risk management solutions can sit on top of existing systems and augment their capability by virtually running a trade against historical data (covering various market conditions over a period of time), providing statistics even before the trade is captured by the trading system. This allows traders to predict the impact of a trade on their portfolios under various market conditions before entering into it. Currently risk management is the second most common type of CEP implementation, behind trading.
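The "virtual run against historical data" idea can be sketched as follows. The scenario format (a symbol-to-price-move mapping per historical observation) and the loss-limit rule are assumptions for illustration.

```python
# Sketch of pre-trade analysis: apply a candidate trade to historical market
# scenarios and report the worst-case P&L before the trade is captured.

def pre_trade_check(trade, historical_scenarios, loss_limit):
    """Return (worst_pnl, approved) for a candidate trade.
    Each scenario maps symbol -> historical price move."""
    pnls = [trade["qty"] * scenario.get(trade["symbol"], 0.0)
            for scenario in historical_scenarios]
    worst = min(pnls)
    return worst, worst >= -loss_limit   # approve if worst loss is tolerable

# Illustrative historical price moves for one symbol.
scenarios = [{"IBM": -2.5}, {"IBM": 1.0}, {"IBM": -0.5}]
```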
A CEP solution sits alongside an enterprise's other solutions and listens to the information the enterprise receives from external systems (exchanges, brokers, news agencies, price sources) or that is generated by systems within the enterprise. It consumes and processes this information and publishes the results as events back to the event bus. These resultant events can be picked up by other systems – trading platforms, order management systems (OMS), risk management solutions – which initiate further actions.
The following diagram depicts how a CEP solution can fit into an existing infrastructure.
Currently almost every software vendor has a CEP solution. The following is a list of some of these vendors. Each is based on different agents and hence needs to be evaluated against specific requirements.
The following is a synopsis of the history and future of CEP as seen by David Luckham.
The first stage has been termed the ‘early struggle for market traction’, where, like any other initiative, CEP had to struggle to create a place for itself on the IT horizon – be it the dot-com implosion of 2001 or the struggle to create awareness among potential users such as stock broking and other related financial players. During this stage, all CEP initiatives were either university research and development (R&D) projects or small startups founded by people who had identified and understood the strength of event processing. Most developers at this stage came from a database background; hence most early CEP solutions were stream-query based.
The second stage has been termed ‘creeping CEP’, where people realise the potential of CEP and start incorporating it into their existing solutions. Another noticeable trend is the entrance of big vendors, who either buy small vendors or create their own solutions and use them as add-ons to their existing service-oriented architecture (SOA) based offerings. Though business activity monitoring (BAM) was introduced in the first stage, it started gaining prominence as a CEP solution around 2005.
We are currently in the third stage, during which CEP will become a key part of information technology and will help solve complex use cases across various fields such as airlines, traffic control and data security. It will help us process, analyse and relate information so that we can identify situations as they happen and react to them.
In the last stage, CEP will become holistic event processing, covering all areas wherever information exchange or processing is involved.