We propose a simple, statistically principled, and theoretically justified
method to improve supervised learning when the training set is not
representative of the target population, a situation known as covariate
shift. We build upon a
well-established methodology in causal inference, and show that the effects of
covariate shift can be reduced or eliminated by conditioning on propensity
scores. In practice, this is achieved by fitting learners within strata
constructed by partitioning the data based on the estimated propensity scores,
leading to approximately balanced covariates and much-improved target
prediction. We demonstrate the effectiveness of our general-purpose method on
two contemporary research questions in cosmology, outperforming
state-of-the-art importance weighting methods. We obtain the best reported AUC
(0.958) on the updated "Supernovae photometric classification challenge", and
we improve upon existing conditional density estimation of galaxy redshift from
Sloan Digital Sky Survey (SDSS) data.
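The propensity-score stratification described above can be sketched as follows. This is a minimal illustration on synthetic data, not the paper's implementation: the specific models (logistic regression for the propensity score, a linear regressor within each stratum), the use of five quantile-based strata, and all variable names are assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression, LinearRegression

rng = np.random.default_rng(0)

# Synthetic covariate shift: source (labeled) and target (unlabeled) sets
# draw the covariate from different distributions.
x_src = rng.normal(0.0, 1.0, size=(500, 1))
x_tgt = rng.normal(1.0, 1.0, size=(500, 1))
y_src = 2.0 * x_src[:, 0] + rng.normal(0.0, 0.1, size=500)

# 1. Estimate propensity scores: the probability that a point belongs to
#    the target set, given its covariates.
X = np.vstack([x_src, x_tgt])
d = np.concatenate([np.zeros(len(x_src)), np.ones(len(x_tgt))])
prop = LogisticRegression().fit(X, d)
e_src = prop.predict_proba(x_src)[:, 1]
e_tgt = prop.predict_proba(x_tgt)[:, 1]

# 2. Partition the pooled data into strata by propensity-score quantiles
#    (five strata is a common choice in the causal-inference literature).
edges = np.quantile(np.concatenate([e_src, e_tgt]), [0.2, 0.4, 0.6, 0.8])
s_src = np.digitize(e_src, edges)
s_tgt = np.digitize(e_tgt, edges)

# 3. Fit a separate learner within each stratum; inside a stratum the
#    covariates are approximately balanced between source and target.
preds = np.full(len(x_tgt), np.nan)
for k in range(5):
    in_src, in_tgt = s_src == k, s_tgt == k
    if in_src.sum() == 0 or in_tgt.sum() == 0:
        continue  # no source data to learn from in this stratum
    model = LinearRegression().fit(x_src[in_src], y_src[in_src])
    preds[in_tgt] = model.predict(x_tgt[in_tgt])
```

Any supervised learner can replace the within-stratum regressor; the stratification itself is agnostic to the choice of base model.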