Numerical predictions with continuous real numbers. For all continuous predictions, minima and maxima are defined as lower and upper (defaults: lower = -Inf, upper = Inf).
A numeric point prediction.
CSV column name: point
Validity:
- Numeric
- Not NA
- lower <= point <= upper
Numeric samples for continuous outcomes between (and including) lower and upper.
CSV column name: sample
Validity:
- Numeric
- No NAs
- lower <= sample <= upper
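The point and sample rules above reduce to the same bounded-numeric check. A minimal sketch in base R, assuming a hypothetical helper named check_bounded (it is not a predx function):

```r
# Hypothetical helper (not part of predx): TRUE when x is numeric,
# contains no NAs, and every value lies in [lower, upper].
check_bounded <- function(x, lower = -Inf, upper = Inf) {
  is.numeric(x) && !anyNA(x) && all(x >= lower & x <= upper)
}

check_bounded(3.2)                                    # a point, default bounds
check_bounded(c(0.1, 0.5, 1), lower = 0, upper = 1)   # samples, bounds inclusive
check_bounded(c(0.1, NA), lower = 0, upper = 1)       # fails: contains NA
```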
Predictions specified as a set of probabilities corresponding to a discrete set of bins across a range of possible numeric outcomes defined by lower and upper. The bins may be specified either by a bin interval (which generates equally sized bins) or by a vector of the lower bounds of each bin. In both cases the lower bound of each bin is inclusive and the upper bound is exclusive, except for the final bin, which ends at upper and includes it. That is, for an observable value x, each bin holds the probability that x is greater than or equal to the bin-specific lower bound and less than the bin-specific upper bound (bin_lwr <= x < bin_upr), except at the upper bound, where bin_lwr <= x <= upper.
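The inclusive-lower, exclusive-upper convention with a closed final bin can be reproduced with base R's findInterval. This is only an illustration of the convention, with made-up bounds, not the package's internal code:

```r
# Bin lower bounds plus the overall upper bound form the break points.
lwr    <- c(0, 1, 2, 3)
upper  <- 4
breaks <- c(lwr, upper)

# Intervals are closed on the left by default; rightmost.closed = TRUE
# makes the final bin [3, 4] include the upper bound itself.
x <- c(0, 0.99, 1, 3.5, 4)
bin_index <- findInterval(x, breaks, rightmost.closed = TRUE)
bin_index
```

Here x = 1 falls in the second bin (its lower bound is inclusive), while x = 4 stays in the last bin instead of spilling past it.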
Interval-defined binned predictions
Bins are defined by bounds and intervals. For example, Continous(prob = probs, type='Bin', lower = 0, upper = 100, interval = 1) requires 100 probabilities (probs), one per bin: probs[1] covers 0 <= x < 1, probs[2] covers 1 <= x < 2, ..., probs[99] covers 98 <= x < 99, and probs[100] covers 99 <= x <= 100.
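The number of bins follows directly from the bounds and the interval. A small base R sketch for the range 0 to 100 with interval 1; the variable names bin_lwr and bin_upr are illustrative only:

```r
lower    <- 0
upper    <- 100
interval <- 1

# One bin starts at each multiple of the interval below upper.
bin_lwr <- seq(lower, upper - interval, by = interval)
bin_upr <- bin_lwr + interval

length(bin_lwr)        # 100 bins for this range
range(bin_lwr)         # first bin starts at 0, last bin starts at 99

# A uniform forecast then needs one probability per bin.
probs <- rep(1 / length(bin_lwr), length(bin_lwr))
```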
Interval-defined binned predictions are represented internally as a list of:
- lower: the lower bound of the range of possible predictions
- upper: the upper bound of the range of possible predictions
- interval: the span of each bin
- prob: the ordered probabilities assigned to each bin
Validity:
- All inputs are numeric
- No NAs
- lower != -Inf and upper != Inf
- A probability is specified for each bin defined by lower, upper, and interval
- The sum of prob is 1.0
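These rules can be sketched as a single check over the list representation. The function name check_interval_bins and the numeric tolerance are assumptions made for illustration; they are not taken from the package:

```r
# Hypothetical validator (not part of predx) for a list with elements
# lower, upper, interval, and prob.
check_interval_bins <- function(pred, tol = 1e-8) {
  vals <- unlist(pred[c("lower", "upper", "interval", "prob")])
  if (!is.numeric(vals) || anyNA(vals)) return(FALSE)
  # Both bounds must be finite, and the interval must be positive.
  if (!is.finite(pred$lower) || !is.finite(pred$upper)) return(FALSE)
  if (pred$interval <= 0) return(FALSE)
  # The range must split into a whole number of bins, one prob per bin.
  n_bins <- (pred$upper - pred$lower) / pred$interval
  if (abs(n_bins - round(n_bins)) > tol) return(FALSE)
  if (length(pred$prob) != round(n_bins)) return(FALSE)
  # Probabilities must sum to 1 (within the assumed tolerance).
  abs(sum(pred$prob) - 1) < tol
}

pred <- list(lower = 0, upper = 10, interval = 1, prob = rep(0.1, 10))
check_interval_bins(pred)
```

Comparing the sum with a tolerance rather than with == avoids rejecting valid forecasts because of floating-point rounding.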
Bin-defined binned predictions
Bins are defined explicitly by their lower bounds. For example, Continous(prob = probs, type='Bin', lwr = lwr_bounds) defines the bins by their lower bounds (lwr) and accepts an equal number of probabilities (probs), which are associated in order with those bins.
Bin-defined binned predictions are represented internally as a data.frame with two columns:
- lwr: inclusive numeric lower bounds for sequential bins (equal intervals)
- prob: probabilities assigned to each bin
Validity:
- All inputs are numeric
- No NAs
- lower != -Inf
- lwr[1] == lower
- max(lwr) < upper
- A probability is specified for each bin (length(prob) == length(lwr))
- The sum of prob is 1.0
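A sketch of the same rules for the data.frame representation, again with a hypothetical function name (check_lwr_bins) and an assumed tolerance for the sum:

```r
# Hypothetical validator (not part of predx) for a data.frame with
# columns lwr and prob, checked against the overall lower/upper bounds.
check_lwr_bins <- function(bins, lower, upper, tol = 1e-8) {
  is.numeric(bins$lwr) && is.numeric(bins$prob) &&
    !anyNA(bins$lwr) && !anyNA(bins$prob) &&
    is.finite(lower) &&
    bins$lwr[1] == lower &&                    # first bin starts at lower
    max(bins$lwr) < upper &&                   # last bin is not empty
    length(bins$prob) == length(bins$lwr) &&   # always true for a data.frame
    abs(sum(bins$prob) - 1) < tol
}

bins <- data.frame(lwr = c(0, 5, 10, 50), prob = c(0.4, 0.3, 0.2, 0.1))
check_lwr_bins(bins, lower = 0, upper = 100)
```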
Predictions characterized by parametric distributions defined according to base R. Distribution truncation has not been configured, so upper and lower should not be specified and default to those for the respective distribution.
Parametric predictions are represented internally as a data.frame with 2 columns:
- parameter_name: the parameter name (from the set described below)
- parameter_value: the corresponding numeric parameter
The following distributions and parameters are currently supported:
- Normal: mean, sd (Support: lower = -Inf, upper = Inf)
- Log-normal: meanlog, sdlog (Support: lower = 0, upper = Inf)
- Gamma: shape, rate (or shape, scale) (Support: lower = 0, upper = Inf)
- Beta: shape1, shape2 (Support: lower = 0, upper = 1)
Validity:
- The supplied parameter names (parameter_name) must exactly match those of the specified parametric distribution
- The parameter values (parameter_value) must be numeric and not include NA
- The parameter values (parameter_value) must be appropriate for the specified parametric distribution (e.g. shape > 0)
- lower and upper must be equivalent to those of the specified parametric distribution
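The name-matching and numeric rules can be sketched as follows. The lookup table param_sets, its distribution labels, and the function check_params are assumptions for illustration; the sketch does not check distribution-specific value constraints such as shape > 0:

```r
# Accepted parameter sets per distribution, as listed above
# (the labels "normal", "lognormal", etc. are hypothetical keys).
param_sets <- list(
  normal    = list(c("mean", "sd")),
  lognormal = list(c("meanlog", "sdlog")),
  gamma     = list(c("shape", "rate"), c("shape", "scale")),
  beta      = list(c("shape1", "shape2"))
)

# Hypothetical validator (not part of predx).
check_params <- function(dist, pred) {
  sets <- param_sets[[dist]]
  if (is.null(sets)) return(FALSE)           # unsupported distribution
  nms <- as.character(pred$parameter_name)
  names_ok <- any(vapply(
    sets,
    function(s) length(nms) == length(s) && setequal(s, nms),
    logical(1)
  ))
  names_ok && is.numeric(pred$parameter_value) && !anyNA(pred$parameter_value)
}

pred <- data.frame(parameter_name = c("shape", "rate"),
                   parameter_value = c(2, 0.5))
check_params("gamma", pred)
```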
Quantitative discrete numeric forecasts.
A numeric point prediction.
CSV column name: point
Validity:
- Numeric
- Not NA
A prediction of the most likely categorical outcome.
CSV column name: point
Validity:
- String
- Not NA
A prediction of the most likely categorical outcome.
CSV column name: point
Validity:
- Numeric
- Not NA
- 0 <= value <= 1
All dates are formatted in ISO standard format: YYYY-MM-DD. Forecasts may be specific to a time period, such as a week, month, or year. Those should be consistently defined in the context of the forecast as they are not defined explicitly in the predx object.
Times are formatted in ISO standard 24 hour format: YYYY-MM-DDTHH:MM:SS+HH:MM, where the final HH:MM is the adjustment for the time zone compared to Coordinated Universal Time (UTC). If the final :MM in the time zone is :00, it may be dropped.
Examples:
- 2020-12-18T13:20:37+00:00 is 13:20:37 (1:20 PM with 37 seconds) on 18 December 2020 in UTC (Greenwich Mean Time)
- 2025-01-03T02:00:00-05:00 or 2025-01-03T02:00:00-05 is 2:00 (2:00 AM) on 3 January 2025 in UTC-05 (Eastern Standard Time).
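Strings in this format can be read in base R once the offset is normalized to the +HHMM form that the %z conversion accepts. A minimal sketch, assuming a hypothetical helper named parse_iso_time (not a predx function):

```r
# Hypothetical helper (not part of predx): parse an ISO time with a
# "+HH:MM" or "+HH" offset into a POSIXct value expressed in UTC.
parse_iso_time <- function(x) {
  # Expand an hours-only offset ("-05") to "-0500".
  short <- grepl("T[0-9:]{8}[+-][0-9]{2}$", x)
  x[short] <- paste0(x[short], "00")
  # Drop the colon in a full offset ("+00:00" -> "+0000").
  x <- sub("([+-][0-9]{2}):([0-9]{2})$", "\\1\\2", x)
  as.POSIXct(x, format = "%Y-%m-%dT%H:%M:%S%z", tz = "UTC")
}

t_full  <- parse_iso_time("2025-01-03T02:00:00-05:00")
t_short <- parse_iso_time("2025-01-03T02:00:00-05")
t_utc   <- parse_iso_time("2020-12-18T13:20:37+00:00")

t_full == t_short   # both spellings denote the same instant
```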
A prediction of the most likely date.
CSV column name: point
Validity:
- ISO date or time (described above)
- Not NA
A prediction of the most likely categorical outcome.
CSV column name: point
Validity:
- Numeric
- Not NA
- 0 <= value <= 1
A character string point prediction, e.g. associated with SampleCat or BinCat.
CSV column name: point
Validity:
- Not NA
A numeric probability.
CSV column name: prob
Validity:
- Not NA
- 0 <= prob <= 1
Binned distribution defined by inclusive lower bounds for each bin.
A data.frame object with two columns:
- lwr: inclusive numeric lower bounds for sequential bins (equal intervals)
- prob: probabilities assigned to each bin
CSV column names: lwr, prob
Validity:
- No NAs in lwr or prob
- Probabilities are positive
- Probabilities sum to 1.0
- Bins are in ascending order
- Bin sizes are uniform
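The ordering and uniform-width rules are the least obvious of these checks; one way to express them is with diff. The function name check_binlwr and the tolerance are assumptions, and "positive" is read here as non-negative so that empty bins are allowed, which is also an assumption:

```r
# Hypothetical validator (not part of predx) for a data.frame with
# columns lwr and prob.
check_binlwr <- function(bins, tol = 1e-8) {
  steps <- diff(bins$lwr)                    # widths between lower bounds
  !anyNA(bins$lwr) && !anyNA(bins$prob) &&
    all(bins$prob >= 0) &&                   # assumption: zeros allowed
    abs(sum(bins$prob) - 1) < tol &&
    all(steps > 0) &&                        # ascending order
    all(abs(steps - steps[1]) < tol)         # uniform bin size
}

bins <- data.frame(lwr = c(0, 0.5, 1, 1.5), prob = c(0.1, 0.2, 0.3, 0.4))
check_binlwr(bins)
```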
Binned distribution with a category for each bin.
A data.frame object with two columns:
- cat: character strings representing each possible outcome category
- prob: probabilities assigned to each bin
CSV column names: cat, prob
Validity:
- No NAs in cat or prob
- Probabilities are positive
- Probabilities sum to 1.0
Numeric samples.
CSV column name: sample
Validity:
- No NAs
Character string samples.
CSV column name: sample
Validity:
- No NAs