Journal Article

A sparse Gaussian process framework for photometric redshift estimation

Ibrahim A. Almosallam, Sam N. Lindsay, Matt J. Jarvis and Stephen J. Roberts

in Monthly Notices of the Royal Astronomical Society

Volume 455, issue 3, pages 2387-2401
ISSN: 0035-8711
Published online November 2015 | e-ISSN: 1365-2966 | DOI:


Accurate photometric redshifts are a lynchpin for many future experiments to pin down the cosmological model and for studies of galaxy evolution. In this study, a novel sparse regression framework for photometric redshift estimation is presented. A synthetic data set simulating the Euclid survey and real data from SDSS DR12 are used to train and test the proposed models. We show that approaches that include careful data preparation and model design offer a significant improvement in comparison with several competing machine learning algorithms. Standard implementations of most regression algorithms use the minimization of the sum of squared errors as the objective function. For redshift inference, this induces a bias in the posterior mean of the output distribution, which can be problematic. In this paper, we directly minimize the target metric Δz = (z_s − z_p)/(1 + z_s), where z_s is the spectroscopic redshift and z_p the photometric estimate, and address the bias problem via a distribution-based weighting scheme, incorporated as part of the optimization objective. The results are compared with other machine learning algorithms in the field such as artificial neural networks (ANN), Gaussian processes (GPs) and sparse GPs. The proposed framework reaches a mean absolute Δz = 0.0026(1 + z_s) over the redshift range 0 ≤ z_s ≤ 2 on the simulated data, and Δz = 0.0178(1 + z_s) over the entire redshift range on the SDSS DR12 survey, outperforming the standard ANNz used in the literature. We also investigate how the relative size of the training sample affects the photometric redshift accuracy. We find that increasing the training sample beyond 30 per cent of the total sample size provides little additional constraint on the photometric redshifts, and note that our GP formalism strongly outperforms ANNz in the sparse data regime for the simulated data set.
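
As a rough illustration of the target metric described in the abstract, Δz = (z_s − z_p)/(1 + z_s), the sketch below computes the normalized residuals, their mean absolute value, and a weighted squared-error objective. The function names and the inverse-frequency binning used for the weights are assumptions made for this example only; the abstract states that a distribution-based weighting scheme is used but does not specify its form, so this should not be read as the authors' implementation.

```python
from collections import Counter


def normalized_residuals(z_spec, z_phot):
    """Return delta_z = (z_s - z_p) / (1 + z_s) for each object."""
    return [(zs - zp) / (1.0 + zs) for zs, zp in zip(z_spec, z_phot)]


def mean_absolute_delta_z(z_spec, z_phot):
    """Mean of |delta_z| over the sample."""
    residuals = normalized_residuals(z_spec, z_phot)
    return sum(abs(r) for r in residuals) / len(residuals)


def inverse_frequency_weights(z_spec, bin_width=0.5):
    """Hypothetical weighting (an assumption, not taken from the paper):
    each object is weighted by 1 / (number of objects in its redshift bin),
    so sparsely populated redshift ranges are not dominated by dense ones."""
    bins = [int(zs // bin_width) for zs in z_spec]
    counts = Counter(bins)
    return [1.0 / counts[b] for b in bins]


def weighted_squared_delta_z(z_spec, z_phot, weights):
    """Weighted mean of delta_z squared, usable as a training objective."""
    residuals = normalized_residuals(z_spec, z_phot)
    total = sum(w * r * r for w, r in zip(weights, residuals))
    return total / sum(weights)


if __name__ == "__main__":
    # Toy values chosen for illustration; not data from the paper.
    z_spec = [0.0, 1.0, 1.0]
    z_phot = [0.1, 0.8, 1.2]
    weights = inverse_frequency_weights(z_spec)
    print(mean_absolute_delta_z(z_spec, z_phot))
    print(weighted_squared_delta_z(z_spec, z_phot, weights))
```

Dividing by (1 + z_s) means that a fixed absolute error counts for less at high redshift, which is why minimizing a plain sum of squared errors on z does not directly optimize this metric.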

Keywords: methods: data analysis; galaxies: distances and redshifts

Journal Article.  10032 words.  Illustrated.

Subjects: Astronomy and Astrophysics
