Chapter 4 Parameter estimation for rotating stellar core collapse
Due to the analytical intractability and complex multi-dimensional nature of rotating stellar core collapse events, a signi cant amount of computational time must go into numerically simulating the gravitational waveforms. This involves simulating all of the physics relating to stellar core collapse, which can be considered a melange of general relativity, particle physics, and nuclear physics. Unlike binary inspiral events, one cannot simply use template-based search methods for supernova burst events as it is computationally impossible to cover the entire signal parameter space. It is therefore important to derive alternative parameter estimation techniques.
The expected rate of core collapse supernovae in the Milky Way is around three per century (Adams et al., 2013). It is therefore important to ensure that appropriate data analysis routines are in place so as not to miss an opportunity to detect and make timely statements about these rare events. Some of the following approaches have aimed to estimate key parameters of stellar core collapse events using their gravitational wave signals.
Summerscales et al. (2008) utilised the maximum entropy framework to deconvolve noisy data from multiple (coherent) detectors, with the goal of extracting a core collapse supernova gravitational wave signal. Inference on amplitude and phase parameters was conducted using cross correlation between the recovered waveform and the set of competing waveforms from the Ott et al. (2004) catalogue. A match was de ned as the model with the maximum cross correlation to the recovered waveform.
Heng (2009) rst proposed a PCA approach to simplify the problem by reducing a given supernova waveform catalogue space down to a small number of basis vectors. Rover et al. (2009) extended this approach and created a novel Metropolis-within-Gibbs sampler to reconstruct test signals from the Dimmelmeier et al. (2008) catalogue in noisy data using a PCR model with random e ects and unknown signal arrival time. The authors then attempted to exploit the structure of the posterior PC coe cients with a simple 2 measure of distance to determine which catalogue waveform best matched the injected test signal. Although the Bayesian reconstruction method showed much promise, extraction of the underlying physical parameters had limited success.
Logue et al. (2012) used the nested sampling algorithm of Skilling (2006) to compute Bayesian evidence for PCR models under three competing supernova mechanisms | neutrino, magnetoro-tational, and acoustic mechanisms. Each supernova mechanism has a noticeably distinct gravita-tional waveform morphology, and the method was successful at correctly inferring a vast majority of injected signals. The authors found that for signals embedded in simulated Advanced LIGO noise, the magnetorotational mechanism could be distinguished to a distance of up to 10 kpc, and the neutrino and acoustic mechanisms up to 2 kpc. This work was later extended by Powell et al. (2016), using real noise from a network of detectors and Bayes factors to select an appropriate number of PCs.
Abdikamalov et al. (2014) generated a new rotating core collapse waveform catalogue and applied matched ltering to infer total angular momentum to within 20% for rapidly rotating cores. For slowly rotating cores, this changed to 35%. Along with matched ltering, they employed the Bayesian model selection method presented by Logue et al. (2012) to illustrate that under certain assumptions of the rotation law, the second generation of gravitational wave detectors (Advanced LIGO, Advanced Virgo, and KAGRA), could also extract information about the degree of precollapse di erential rotation. The two methods worked particularly well for rapidly rotating cores.
In contrast to the Bayesian approach, Engels et al. (2014) used multivariate regression and classi-cal hypothesis testing to analyse important astrophysical parameters from rotating core collapse signals. Rather than reconstructing the waveforms using a linear combination of PCs, the authors used least squares to nd an encoded relationship between the PC basis functions and the as-trophysical parameters. They could identify the most important astrophysical parameters in the presence of simulated detector noise.
The majority of this chapter describes the research presented by Edwards et al. (2014). The aim of this research was to show that it is possible to extract astrophysically meaningful information about a rotating progenitor undergoing core collapse using its statistically reconstructed gravitational wave signal. Bayesian PCR models were used to reconstruct a gravitational wave signal embedded in simulated Advanced LIGO noise. Known astrophysical parameters were then regressed on the posterior means of the PC coe cients using a Bayesian linear regression model. The ratio of rotational kinetic energy to gravitational potential energy of the inner core at bounce was estimated by sampling from the posterior predictive distribution, and precollapse di erential rotation was classi ed using two supervised machine learning algorithms.
In addition to the results of Edwards et al. (2014), some new (unpublished) results are also presented. An alternative signal reconstruction method based on birth-death reversible jump MCMC is implemented to remove the arbitrariness of choosing an appropriate number of principal components. A Bayesian ordinal probit model is used to classify precollapse di erential rotation. Further parameter estimation results based on model comparison and cross-validation are also presented, with the aim of inferring the nuclear equation of state (EOS).
Core collapse waveform catalogue
A description of the rotating core collapse and bounce gravitational wave simulations by Abdika-malov et al. (2014) is given in this section. This data catalogue is used to show in the sections to follow that it is possible to extract astrophysically meaningful information about rotating stellar core collapse using Bayesian methods and supervised machine learning algorithms.
The waveforms come from two-dimensional numerical axisymmetric general-relativistic hydrody-namic rotating core collapse and bounce supernova simulations. Based on ndings that gravita-tional wave signals are essentially independent of the progenitor zero age main sequence (ZAMS) mass by Ott et al. (2012), a single presupernova progenitor model (the 12 solar mass at ZAMS solar-metallicity progenitor model from Woosley and Heger (2007)) was adopted. The cylindrical rotation law from Ott et al. (2004) was also assumed.
The gravitational wave catalogue is partitioned into a training set and a test set. The training set contains l = 92 signals with ve levels of precollapse di erential rotation A (where higher values of A represent weaker di erential rotation), a grid of values for initial central angular velocity c, as well as a grid of values for the ratio of rotational kinetic energy to gravitational energy of the inner core at bounce ic;b (since ic;b is a function of c for a xed progenitor structure). Each signal in the training set was generated using the microphysical nite-temperature Lattimer-Swesty (LS) EOS (Lattimer and Swesty, 1991), parametrised deleptonisation scheme from Dimmelmeier et al. (2008), and neutrino leakage scheme from Ott et al. (2012). As well as varying A, c, and ic;b, the test set contains m = 47 signals with di ering EOS and deleptonisation parametrisations Ye( ). Speci cally, some test signals were generated using the Shen EOS (Shen et al., 1998), or an increase/decrease in Ye( ) parametrisation by 5%. The values of c and ic;b in the test set are in the same parameter space as those in the training set, but with an alternative grid. The object of this analysis is to infer the physical parameters ( ic;b, A, and EOS) of the signals in the test set using information gleaned about signals in the training set.
The signals were initially sampled at 100 kHz and subsequently downsampled by a rational factor to 16384 Hz (the sampling rate of the Advanced LIGO detectors). As mentioned in Section 3.4.6, downsampling by a rational factor essentially involved two steps: upsampling by an integer factor via interpolation and then applying a low-pass lter to eliminate the high frequency components (this step is necessary to avoid aliasing at lower sampling rates); and downsampling by an integer factor to achieve the desired sampling rate. The resampled data was zero-bu erred to ensure each signal was the same length, n = 16384, which corresponded to 1 s of data at the Advanced LIGO sampling rate. Each signal was then aligned so that the rst negative peak (not necessarily the global minimum), corresponding to the time of core bounce, occurred halfway through the time series.
In this analysis, the source of a gravitational wave emission is assumed to be optimally ori-ented (perpendicular) to a single interferometer. Each signal is linearly polarised with zero cross-polarisation.
A general waveform morphology is illustrated in Figure 4.1. During core collapse, there is a slow increase in gravitational wave strain until the rst local maximum is reached (before 0.5 s). This is followed by core bounce, where the strain rapidly decreases towards a local minimum (at 0.5 s). This corresponds to the time when the inner core expands at bounce. After this, there is a period of ring-down oscillations. For slowly rotating progenitors (as seen in the top panel of Figure 4.1), the gravitational wave strain is essentially the same during collapse and bounce and only di ers during the stochastic ring-down. For the rapidly rotating progenitors (presented in the bottom panel of Figure 4.1), larger precollapse di erential rotation results in: a smaller local maximum during core collapse; a more negative local minimum during core bounce; and a larger rst ring-down peak. Because of these patterns, Abdikamalov et al. (2014) concluded that inferences about precollapse di erential rotation could in principle be made for rapidly rotating cores.
The data analysed are rotating core collapse gravitational wave signals injected in additive Gaus-sian noise, coloured by the Advanced LIGO design sensitivity spectral density, f(:). According to LALAdvLIGOPsd (Sathyaprakash, 2007), the Advanced LIGO noise PSD is where is the frequency in Hertz and 0 = 215 Hz (not to be confused with the Nyquist frequency of the same symbol). The returned value is scaled up by f0 = 1049. Noise is generated in the frequency-domain by sampling Gaussian random variables with zero mean and variances given in 4.3. These complex-valued random variables are inverse Fourier transformed and then the real parts are taken to be the time-domain Gaussian random variables.
The data are then Tukey windowed to mitigate spectral leakage. Rather than xing source distance to 10 kpc (as done by Abdikamalov et al. (2014)), this analysis assumes a xed SNR of % = 20. As done by Rover et al. (2009), f(:) is estimated a priori by averaging 1000 empirical periodograms from identically simulated Advanced LIGO noise using Bartlett’s method. This corresponds to a realistic scenario where the noise spectrum must be estimated as well. Although supernovae from the Milky Way will not produce SNRs as small as % = 20, this value was chosen to illustrate that the methods are robust at lower SNRs.
PCA and PCR approaches to signal reconstruction and parameter estimation of supernova grav-itational wave signals have become popular over recent years (Abdikamalov et al., 2014, Heng, 2009, Logue et al., 2012, Powell et al., 2016, Rover et al., 2009). Signal reconstruction models 1 and 2 in the following subsections are based on the work of Rover et al. (2009) and used in Edwards et al. (2014) to infer important astrophysical parameters. Model 3 is a novel alternative to the problem, based on trans-dimensional reversible jump MCMC and model averaging.
Signal reconstruction model 1: Metropolis-within-Gibbs PCR with known signal arrival time
A PCA is applied to the training set to reduce dimensionality. Each training waveform is repre-sented as a linear combination of orthonormal basis vectors, where the projection of the data onto the rst basis vector has maximum variance, the projection onto the second basis vector has second highest variance, and so on. By considering only projections on the rst d < l basis vectors, the so-called d PCs, a parsimonious representation of the catalogue signals in d dimensions is achieved that preserves as much of the information of the original training set as possible. The rst four PCs from the Abdikamalov et al. (2014) catalogue are illustrated in 4.2. Notice particularly how the rst PC contains the obvious collapse ( rst positive peak) and bounce (large negative peak) morphology, and that the higher PCs (i.e., the ones with the lower eigenvalues) show more of the post-bounce ringdown morphology, and are on a smaller scale.
Once PCA is conducted, the rst d PCs are treated as the explanatory variables of a linear model. The data analysed are a time series vector y of length n and decomposes into additive signal and n d design matrix, whose columns are the Fourier transformed mean-centred PC vectors from the base catalogue. The frequency domain linear model is where is the vector of PCR coe cients and ~ is the Fourier transformed coloured zero-mean Gaussian noise vector whose variance terms are Here, j refers to (nonredundant positive) frequency in Hertz, t is the sampling interval, and f(:) is the a priori known (i.e., pre-estimated) noise spectral density. Due to Hermitian symmetry, the frequency domain data vector y~ contains only the non-redundant real and imaginary components and is therefore the same length as the time domain vector y. Conversion between time and frequency domains is conducted using a FFT.
The likelihood for the Bayesian PCR model with known signal arrival time is D = diag( 2 ) is the diagonal covariance matrix of individual variances for the noise component. This multivariate normal distribution can be sampled directly with no MCMC required.
Noninformative priors were chosen for this model as there is very little information about the constraints of the PCs. Most information learnt in the algorithm therefore comes from the likeli-hood rather than the prior. It was important to keep the data and prior knowledge separate and distinct, and to avoid using information from the waveform catalogue for both purposes. That is, an empirical Bayes approach was not deemed appropriate for the given scenario. As the only data available for analysis were the generated gravitational waves, complete prior ignorance on all reconstruction model parameters was assumed.
For comparability with the signal reconstruction model in the next subsection, 100,000 values are directly sampled from the posterior, 10,000 are removed as \burn-in », and the remaining samples are thinned by a factor of 5.
Signal reconstruction model 2: Metropolis-within-Gibbs PCR with unknown signal arrival time
The Bayesian PCR model presented in the previous section assumed a known signal arrival time. The precise arrival time of a gravitational wave signal to an interferometer will generally not be known in practice, and must therefore be included as an additional unknown parameter in the statistical model.
Let be a cyclical time shift representing the unknown signal arrival time, and let be the T XT
Fourier transformed design matrix ~ shifted by lag , such that the Fourier transformed PCs are X T aligned with the Fourier transformed data vector y~. This transformation can be done directly in the frequency domain as a phase shift by multiplying the columns of ~ by exp( 2 i ), where is frequency in Hertz (not angular frequency).
This signal reconstruction model is based on the model presented by Rover et al. (2009), although the primary goal in this analysis is inferring the physical parameters of a supernova progenitor, and not signal reconstruction.
Using the same reasoning described in the previous section, assume at priors on and T . The likelihood for the Bayesian PCR model with unknown signal arrival time is For a given time shift T , the conditional posterior distribution for the PC coe cients jT is T , a Markov chain is constructed, whose stationary distribution is the posterior distribution of interest using Metropolis-within-Gibbs sampler. This is essentially a Gibbs sampler that alternates between the full set of conditional posterior distributions p( jT; y~) and p(T j ; y~). The former can be sampled directly using equation (4.9), and the latter requires a random walk Metropolis step.
After initialisation, step i + 1 in the Metropolis-within-Gibbs algorithm is:
Directly sample the conditional posterior of (i+1)jT (i) using equation (4.9);
Propose T ( ) from t df (T (i); 2) and accept T (i+1) = T ( ) with the Metropolis acceptance
Otherwise reject and set T (i+1) = T (i).
A Student-t distribution was chosen as the proposal distribution for the algorithm. It has a similar (symmetrical) shape to the normal distribution but has heavier tails and an additional degrees-of-freedom parameter, df . The heavier tails of the Student-t distribution results in bolder proposals than the normal distribution. This mitigates getting stuck in local modes and aims to improve mixing. The degrees-of-freedom parameter was set to df = 3, which is the smallest integer that yields a distribution with nite variance. Note that smaller df implies heavier tails in the Student-t distribution, which gives higher probability density towards the tails. This is what allows for bolder proposals, which could improve mixing. The proposal for T (i+1) is centred on T (i), and has scale parameter 2 that is initially and arbitrarily set to 0.05, and subsequently automatically tuned during the algorithm to ensure good mixing and acceptance rates using an approach that aims to achieve an acceptance probability of 0:44.
Simulations using this model run for 100,000 iterations, with a burn-in period of 10,000 and a thinning factor of 5.
Declaration of Authorship
2 Gravitational wave astronomy
2.1 Gravitational waves
2.2 Gravitational wave sources
2.3 Gravitational wave detectors
2.4 Stellar core collapse
2.5 Gravitational wave data analysis
3 Statistical methods
3.1 The Bayesian paradigm
3.2 Markov chain Monte Carlo
3.3 Bayesian nonparametrics
3.4 Fourier analysis of time series
3.5 Spectral density estimation .
3.6 Principal component analysis
4 Parameter estimation for rotating stellar core collapse
4.2 Core collapse waveform catalogue
4.3 Signal reconstruction
4.4 Ratio of rotational kinetic energy to gravitational potential energy of the inner core at bounce
4.5 Precollapse dierential rotation
4.6 Nuclear equation of state .
5 Spectral density estimation for Advanced LIGO noise
5.2 Advanced LIGO spectral density estimation using the Bernstein polynomial prior
5.3 Advanced LIGO spectral density estimation using the B-spline prior
5.4 Advanced LIGO spectral density estimation using a nonparametric correction to a parametric likelihood
6 Conclusions and outlook
GET THE COMPLETE PROJECT
Bayesian modelling of stellar core collapse gravitational wave signals and detector noise