International Journal for Uncertainty Quantification
IF: 0.967 5-Year IF: 1.301

ISSN Print: 2152-5080
ISSN Online: 2152-5099

Open Access

DOI: 10.1615/Int.J.UncertaintyQuantification.2015013799
pages 469-489


Luis G. Crespo
NASA Langley Research Center, MS 308, Hampton, Virginia 23681, USA
Sean P. Kenny
NASA Langley Research Center, MS 308, Hampton, Virginia 23681, USA
Daniel P. Giesy
NASA Langley Research Center, MS 308, Hampton, Virginia 23681, USA


This paper proposes techniques for constructing linear parametric models that describe key features of the distribution of an output variable given input-output data. In contrast to standard models, which yield a single output value at each value of the input, random predictor models (RPMs) yield a random variable. The proposed strategies yield models in which the mean, variance, and range of the model's parameters, and thus of the random process describing the output, are rigorously prescribed. As such, these strategies encompass all RPMs that conform to the prescription of these metrics (e.g., random variables and probability boxes describing the model's parameters, and random processes describing the output). Strategies for calculating optimal RPMs by solving a sequence of optimization programs are developed. The RPMs are optimal in the sense that they yield the tightest output ranges containing all (or, depending on the formulation, most) of the observations. Extensions that eliminate the effects of outliers in the data set are also developed. When the data-generating mechanism is stationary, the data are independent, and the optimization program(s) used to calculate the RPM is convex (or its solution coincides with the solution to an auxiliary convex program), the reliability of the prediction, which is the probability that a future observation falls within the predicted output range, is bounded rigorously using Scenario Optimization Theory. This framework requires no assumptions about the underlying structure of the data-generating mechanism.
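To make the idea of a "tightest output range containing all observations" concrete, the following is a minimal sketch and not the authors' formulation. It assumes a simplified interval model with an affine basis [1, x], parameters confined to a box (center plus nonnegative half-widths), and a mean-width objective solved as a linear program with SciPy. The function names `fit_interval_model`, `predict_range`, and `scenario_confidence_bound` are hypothetical. The last function evaluates the classical scenario-optimization bound for convex programs with a unique solution, which may differ from the exact bound used in the paper.

```python
from math import comb

import numpy as np
from scipy.optimize import linprog


def fit_interval_model(x, y):
    """Fit the tightest (mean-width) interval c'phi(x) +/- w'|phi(x)| covering all data.

    Assumption: affine basis phi(x) = [1, x]; this is an illustrative simplification.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    phi = np.column_stack([np.ones_like(x), x])
    abs_phi = np.abs(phi)
    d = phi.shape[1]
    # Decision vector z = [center (d entries), half_width (d entries)].
    # The objective is the mean interval half-width over the observed inputs.
    cost = np.concatenate([np.zeros(d), abs_phi.mean(axis=0)])
    # y_i <= c'phi_i + w'|phi_i|  and  c'phi_i - w'|phi_i| <= y_i
    a_ub = np.vstack([np.hstack([-phi, -abs_phi]), np.hstack([phi, -abs_phi])])
    b_ub = np.concatenate([-y, y])
    bounds = [(None, None)] * d + [(0, None)] * d  # center free, half-widths >= 0
    result = linprog(cost, A_ub=a_ub, b_ub=b_ub, bounds=bounds)
    if not result.success:
        raise RuntimeError(result.message)
    return result.x[:d], result.x[d:]


def predict_range(center, half_width, x):
    """Return the lower and upper limits of the predicted output range at x."""
    x = np.asarray(x, dtype=float)
    phi = np.column_stack([np.ones_like(x), x])
    mid = phi @ center
    spread = np.abs(phi) @ half_width
    return mid - spread, mid + spread


def scenario_confidence_bound(n_samples, n_decision_vars, epsilon):
    """Upper bound on the probability that the violation probability exceeds epsilon.

    Classical bound for convex scenario programs:
    sum_{i=0}^{d-1} C(N, i) * eps^i * (1 - eps)^(N - i).
    """
    return sum(
        comb(n_samples, i) * epsilon**i * (1.0 - epsilon) ** (n_samples - i)
        for i in range(n_decision_vars)
    )


# Synthetic data, used only for illustration.
rng = np.random.default_rng(0)
x_data = rng.uniform(-1.0, 1.0, size=200)
y_data = 1.0 + 2.0 * x_data + 0.3 * rng.standard_normal(200)

center, half_width = fit_interval_model(x_data, y_data)
lower, upper = predict_range(center, half_width, x_data)
# Four decision variables: two centers and two half-widths.
beta = scenario_confidence_bound(len(x_data), 4, epsilon=0.05)
```

In this sketch, `beta` bounds the probability (over data sets) that a future observation falls outside the predicted range with probability greater than 5%. The bound depends only on the number of observations and the number of decision variables, not on the distribution of the data, which is consistent with the distribution-free claim in the abstract.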