Background on SGD algorithms, Point Processes and Cox proportional hazards model

Get Complete Project Material File(s) Now! »

Complexity of the algorithm

Compared with existing state-of-the-art methods to estimate the kernel functions, e.g., the ordinary di!erential equations-based (ODE) algorithm in [ZZS13], the Granger Causalitybased algorithm in [XFZ16], the ADM4 algorithm in [ZZS13], and the Wiener-Hopf-based algorithm in [BM16], our method has a very competitive complexity. This can be understood by the fact that those methods estimate the kernel functions, while in NPHC we only estimate their integrals. The ODE-based algorithm is an EM algorithm that parametrizes the kernel function with M basis functions, each being discretized to L points. The basis functions are updated after solving M Euler-Lagrange equations. If n denotes the maximum number of events per component (i.e. n = max1ΣiΣd |Zi |) then the complexity of one iteration of the algorithm is O(Mn3d2 +ML(nd +n2)). The Granger Causality-based algorithm is similar to the previous one, without the update of the basis functions, that are Gaussian kernels. The complexity per iteration is O(Mn3d2). The algorithm ADM4 is similar to the two algorithms above, as EM algorithm as well, with only one exponential kernel as basis function. The complexity per iteration is then O(n3d2). The Wiener-Hopf-based algorithm is not iterative, on the contrary to the previous ones. It Ærst computes the empirical conditional laws on many points, and then invert the Wiener-Hopf system, leading to a O(nd2L +d4L3) computation. Similarly, our method Ærst computes the integrated cumulants, then minimize the objective function with Niter iterations, and invert the resulting matrix bR to obtain bG . In the end, the complexity of the NPHC method is O(nd2+Niterd3). According to this analysis, summarized in Table III.1 below, one can see that in the regime n¿d, the NPHC method outperforms all the other ones.

Revising the 8-dimensional mono-asset model of [BJM16] : A sanity check

In [BM14a, BM16], the authors outlined a method for non-parametric estimation of the Hawkes kernel functions based the inÆnitesimal covariance density and the numerical solution of a Wiener-Hopf system of integral equations that links the covariance matrix and the kernel matrix. Their method has been applied to high-frequency Ænancial data in [BM14a, BJM16], and [RBL17]. The aim of this section is to compare the newly proposed NPHC methodology with the Wiener-Hopf method mentioned above in order to assess the reliability of the new NPHC method. To this end, we reproduce the results obtained in [BJM16]. As it was done there, we consider the DAX and Bund futures data4 and for each asset we separate Level-I order book events into 8 categories as deÆned above: P+, P°, T a, T b, La, Lb, and Ca, Cb. Note that here a price move can be of any type. We then consider the timestamp associated with all events as a realization of a 8-dimensional Hawkes process and we use both the NPHC method outlined in Section 3 and the Wiener-Hopf method of [BM16] to estimate the integrated kernel interaction matrix G from the data. For the Wiener-Hopf method, we follow the same procedure as [BJM16] and in particular we estimate the covariance density up to a maximum lag of º 1000s using a log-linear spaced grid5, while for the NPHC method we follow the steps outlined in Section 3 and we Æx H = 500s so to be on a comparable scale with the Wiener-Hopf method. Let us note that this scale is several orders of magnitude larger than the typical inter-event time. Indeed, on the assets considered median inter-event times are of the order of 300μs (the mean being º 50ms), with minimum time distances in the tens of microseconds.
In Figure V.3, we compare the kernel integral matrices G obtained with the NPHC method (left) with those obtained with the Wiener-Hopf approach (right) on the DAX future. Although the precise values of the matrix entries di!er somewhat, as it is di$cult to tune the estimation parameters of the two methods as to produce the exact same numerical results, we note that the two methods produce very consistent results. Indeed, they recover the same interaction structure and thus lead to the same interpretation of the underlying system dynamics. In our view, this represents a good sanity check for the proposed NPHC methodology. Analogous results are obtained for the Bund future. Let us also point out that the small asymmetries between symmetric interactions (such as e.g. T +!T ° and T °!T +) can be used get a rough measure of the estimation error. In the case presented here, the average absolute di!erence between symmetric interactions kernels is 0.03, which means relative error of a few percent on the most relevant interactions.

A 12-dimensional mono-asset model

By estimating directly the norm of the kernels and not the whole kernel function, the NPHC method can be used to investigate systems of greater dimension. In this section we extend the model of Section 4.2 to 12 dimensions by separating the type of events that lead to a price move. The 12 even types we consider are thus T + (T °), L+ (L°), C+ (C°), T a (T b), La (Lb), Ca (Cb). We then apply the NPHC algorithm to estimate the branching ratio matrix. When not otherwise speciÆed, we set H = 500s. To further assess the validity of our methodology and the impact of time-of-day e!ects, we Ærst estimate the model using di!erent time slots within the trading day. In Section 4.3.2 we also check the robustness of our results as respect to the choice of the parameter H.

Table of contents :

Contents ix
Introduction
Motivations
Outline
1 Part I: Large-scale Cox model
1.1 Background on SGD algorithms, Point Processes and Cox proportional hazards model
1.2 SVRG beyond Empirical Risk Minimization
2 Part II: Uncover Hawkes causality without parametrization
2.1 Hawkes processes
2.2 Generalized Method of Moments approach
2.3 Constrained optimization approach
3 Part III: Capture order book dynamics with Hawkes processes
3.1 A single asset 12-dimensional Hawkes order book model
3.2 A multi-asset 16-dimensional Hawkes order book model
Part I Large-scale Cox model
I Background on SGD algorithms, Point Processes and Cox proportional hazards model
1 SGD algorithms
1.1 DeÆnitions
1.2 SGD algorithms from a general distribution
1.3 SGD algorithms from a uniform distribution
1.4 SGD with Variance Reduction
2 Point Processes
2.1 DeÆnitions
2.2 Temporal Point Processes
3 Cox proportional hazards model
3.1 Survival analysis
3.2 Existing methods
II Large-scale Cox model
1 Introduction
2 Comparison with previous work
3 A doubly stochastic proximal gradient descent algorithm
3.1 2SVRG: a meta-algorithm
3.2 Choice of ApproxMCMC
4 Theoretical guarantees
5 Numerical experiments
6 Conclusion
7 Proofs
7.1 Proof of Proposition 1
7.2 Preliminaries to the proofs of Theorems 1 and 2
7.3 Proof of Theorem 1
7.4 Proof of Theorem 2
8 Supplementary experiments
9 Simulation of data
10 Mini-batch sizing
Part II Uncover Hawkes causality without parametrization
III Generalized Method of Moments approach
1 Introduction
2 NPHC: The Non Parametric Hawkes Cumulant method
2.1 Branching structure and Granger causality
2.2 Integrated cumulants of the Hawkes process
2.3 Estimation of the integrated cumulants
2.4 The NPHC algorithm
2.5 Complexity of the algorithm
2.6 Theoretical guarantee: consistency
3 Numerical Experiments
4 Technical details
4.1 Proof of Equation (8)
4.2 Proof of Equation (9)
4.3 Integrated cumulants estimators
4.4 Choice of the scaling coe$cient Σ
4.5 Proof of the Theorem
5 Conclusion
IV Constrained optimization approach
1 Introduction
2 Problem setting
3 ADMM
3.1 The ADMM algorithm
3.2 Convergence results
3.3 Examples
4 Numerical results
4.1 Simulated data
4.2 Order book data
5 Conclusion
6 Technical details
6.1 Convex hull of the orthogonal group
6.2 Updates of ADMM steps
Part III Capture order book dynamics with Hawkes processes
V Order book dynamics
1 Introduction
2 Hawkes processes: deÆnitions and properties
2.1 Multivariate Hawkes processes and the branching ratio matrix G
2.2 Integrated Cumulants of Hawkes Process
3 The NPHC method
3.1 Estimation of the integrated cumulants
3.2 The NPHC algorithm
3.3 Numerical experiments
4 Single-asset model
4.1 Data
4.2 Revising the 8-dimensional mono-asset model of [BJM16] : A sanity check
4.3 A 12-dimensional mono-asset model
5 Multi-asset model
5.1 The DAX – EURO STOXX model
5.2 Bobl – Bund
6 Conclusion and prospects
1 Origin of the scaling coe$cient Σ
Bibliography