Verification of ensemble relative frequencies without binning #1742

RogerHar · 2022-08-11T17:32:27Z


What is the best way to produce verification measures for probabilistic forecasts for an ensemble of size $m$ when I want the probabilities to be the full set of ensemble relative frequencies $0, 1/m, 2/m, ... 1$ rather than the midpoint of binned probabilities? I can use Gen-Ens-Prod to generate the relative frequencies by including frequency = TRUE in ensemble_flag, but how should I then set up Point-Stat, Grid-Stat or Series-Analysis to produce e.g. Brier scores for these?

MET's definition of verification measures for probabilistic forecasts uses the midpoint of bins specified by user-defined thresholds to define the probabilities used in the formulae. I want to use the relative frequencies themselves, i.e. the values output by Gen-Ens-Prod without any binning. Although it seems possible to specify the bins in such a way as to make their midpoints the relative frequencies, it gets rather awkward:

For example for an ensemble size of 6, the ensemble relative frequencies are [0, 0.16667, 0.33333, 0.5, 0.66667, 0.83333, 1]. If I set the thresholds as:
cat_thresh = [>=0, >=0.00001, >=0.08333, >=0.25, >=0.41667, >=0.58333, >=0.75, >=0.91667, >=0.99999, >=1.0], I think that will give the ensemble relative frequencies as the bin midpoints, plus two empty bins in the second and second-to-last positions.
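For larger ensembles the same list could be generated rather than typed by hand. The following is a minimal Python sketch, assuming the layout described above (a narrow bin at each end, an empty bin next to each, and interior bins centred on k/m). The function names `midpoint_cat_thresh` and `format_cat_thresh` are hypothetical and not part of MET or METplus, and rounding to 5 decimal places means the printed midpoints only approximate the exact relative frequencies.

```python
from fractions import Fraction


def midpoint_cat_thresh(m, eps=Fraction(1, 100000)):
    """Return bin edges whose interior midpoints are k/m for k = 1..m-1.

    The first and last bins are narrow (width eps) so their midpoints sit
    close to 0 and 1; the second and second-to-last bins stay empty.
    """
    if m < 2:
        raise ValueError("ensemble size m must be at least 2")
    half = Fraction(1, 2 * m)                               # half a bin width
    edges = [Fraction(0), eps]                              # narrow bin near 0
    edges += [Fraction(k, m) - half for k in range(1, m)]   # left edges of interior bins
    edges.append(Fraction(m - 1, m) + half)                 # right edge of last interior bin
    edges += [1 - eps, Fraction(1)]                         # narrow bin near 1
    return edges


def format_cat_thresh(edges, digits=5):
    """Format the edges as a cat_thresh line in the style used in this thread."""
    body = ", ".join(f">={float(e):.{digits}f}" for e in edges)
    return f"cat_thresh = [{body}]"


if __name__ == "__main__":
    print(format_cat_thresh(midpoint_cat_thresh(6)))
```

For m = 6 this yields the same ten thresholds as the hand-written list above.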

This is tedious to set up by hand for large ensemble sizes though, and the two empty bins seem a bit clumsy. It would be easier if MET had an option to use the full set of probabilities found in the forecast without binning, equivalent to bins=FALSE in the brier functions in the verification R package (and also in its attribute and verify functions). Alternatively, might it be possible to create or modify a METplus wrapper to set up the cat_thresh values in this way?

Thanks for your time,

Roger

Answered by j-opatz


j-opatz · 2022-08-12T20:41:14Z

j-opatz
Aug 12, 2022
Collaborator

Hi Roger,

Thank you for your question. As you've indicated, the quickest way to obtain an uncalibrated probability forecast is to use the Gen-Ens-Prod tool with the frequency ensemble_flag setting turned on.

Unfortunately, from my experience with MET and METplus (especially with probability forecasts), there is no way that I'm aware of to dynamically set the categorical thresholds for any of the subsequent analysis tools (Grid-Stat, Point-Stat, Series-Analysis, etc.). As background: the Brier Score is traditionally calculated directly from the forecast probability (f) and observation (o) values (e.g. the sum of (f_t - o_t) squared, divided by the total number of pairs), but MET still relies on Nx2 contingency tables and thus requires thresholds. As you correctly linked and provided an example of, the best way to make these two approaches equivalent is to make the midpoint of each bin the value you desire for f_t. But this isn't without its drawbacks, as it's finicky (finding the exact half of a repeating decimal value) and becomes tedious with larger numbers of thresholds.
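To illustrate why the bin midpoints matter, here is a rough Python sketch comparing the direct Brier Score with one computed from an Nx2 table where every forecast in a bin is replaced by the bin midpoint. This is only an illustration of the idea, not MET's actual implementation; the function names are made up, and the sketch assumes forecasts lie in [0, 1] with the first edge at 0 and the last at 1.

```python
def brier_direct(forecasts, outcomes):
    """Mean of (f_t - o_t)^2 over all forecast/observation pairs."""
    pairs = list(zip(forecasts, outcomes))
    return sum((f - o) ** 2 for f, o in pairs) / len(pairs)


def brier_binned(forecasts, outcomes, edges):
    """Brier Score from an Nx2 table, using each bin's midpoint as its probability."""
    nbins = len(edges) - 1
    table = [[0, 0] for _ in range(nbins)]  # per bin: [event count, non-event count]
    for f, o in zip(forecasts, outcomes):
        # Bin j covers [edges[j], edges[j+1]); a forecast equal to the final
        # edge matches no bin here and is placed in the last bin.
        row = next((j for j in range(nbins) if edges[j] <= f < edges[j + 1]),
                   nbins - 1)
        table[row][0 if o == 1 else 1] += 1
    score = 0.0
    for row, (n_event, n_nonevent) in enumerate(table):
        p = 0.5 * (edges[row] + edges[row + 1])  # bin midpoint stands in for f_t
        score += n_event * (p - 1.0) ** 2 + n_nonevent * p ** 2
    return score / len(forecasts)


if __name__ == "__main__":
    f = [0.0, 0.5, 0.5, 1.0]   # relative frequencies from a 2-member ensemble
    o = [0, 0, 1, 1]
    print(brier_direct(f, o))                          # 0.125
    print(brier_binned(f, o, [0.0, 0.25, 0.75, 1.0]))  # 0.1328125
```

In this small example the end bins have midpoints 0.125 and 0.875 rather than 0 and 1, which is what shifts the binned score away from the direct one.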

I'm not sure how easy it would be for METplus to be coded to dynamically set the bin width based on the input or a set value, or to take on the (potentially) larger task of allowing direct sampling of probabilities for BSS calculation. @JohnHalleyGotway, do you have any insight into how this could be done in MET, to dynamically set the cat_thresh options so that the midpoints of a given list act as the f_t in a Brier Score calculation? Is there a possibility of doing something similar to cdf_bins, where we tell cat_thresh how many bins we desire and MET creates regularly spaced intervals for it? It already has something similar to that with the == prompt, but I think that has to evenly divide into 1.0.

3 replies

JohnHalleyGotway Aug 23, 2022
Maintainer

@RogerHar and @j-opatz, searching through the MET code, I want to call your attention to these 2 lines:

if ( use_center ) x = 0.5*(Thresholds[row] + Thresholds[row + 1]);
else              x = Thresholds[row];

When use_center is true, the center of each probability bin is used. When false, the value from the left side is used instead. The trouble is that use_center is just hard-coded at the top of that file as true!

static const int use_center = 1;

Given this, it seems pretty obvious we should just make this a configurable option instead of hard-coding it to true. Do you agree that that's how we should handle this?

And I'm wondering, we clearly want the option of using the center point of each probability bin as well as the value from the left side of the bin. Is there ever a scenario in which we'd want to use the value from the right side of each bin? I'm trying to figure out if the options should be "true/false" or "left/center/right".
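To make the "left/center/right" idea concrete, here is a small Python sketch of what such a selection could look like. It is purely illustrative: the function name `bin_value` and the mode strings are hypothetical, and this is not how MET is (or would necessarily be) implemented.

```python
def bin_value(thresholds, row, mode="center"):
    """Return the probability value representing bin `row`.

    `thresholds` holds the bin edges in increasing order, so bin `row`
    spans thresholds[row] to thresholds[row + 1].
    """
    left, right = thresholds[row], thresholds[row + 1]
    if mode == "left":
        return left
    if mode == "center":
        return 0.5 * (left + right)
    if mode == "right":
        return right
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    edges = [0.0, 0.25, 0.5, 1.0]
    for mode in ("left", "center", "right"):
        print(mode, bin_value(edges, 1, mode))
```

A three-way setting like this would cover the existing center behaviour and the left-side value already present in the code, with the right side available if anyone needs it.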

I'll wait to hear back from you before writing this up as a new development issue for MET. I just want to confirm that this is how you'd like to proceed.

RogerHar Aug 24, 2022
Author

Hi John, that's interesting and does look like it could provide a solution.

Continuing my example above of an ensemble of 6 members giving relative frequencies [0, 1/6, 2/6, 3/6, 4/6, 5/6, 1]: Using the left sides of the bins I'd guess I'd need to set cat_thresh = [>=0.0, >=0.16666, >=0.33333, >=0.5, >=0.66666, >=0.83333, >=0.99999, >=1.0]. Or possibly I could miss out the >=0.99999 or include >=1.0 twice instead so that the left and right side of the last bin are the same? (I was under the impression that the last entry has to be >=1.0 to indicate the right side of the final bin but this FAQ appears to disagree so maybe that has changed since I did the MET 8.0 tutorial?)

I think I'd also need to be a bit careful to use e.g. 0.16666 rather than 0.16667 (assuming the relative frequencies are output from Gen-Ens-Prod to higher precision than 5 d.p.) so that might require documenting somewhere.
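As a rough sketch of that rounding concern, the left-edge thresholds could be generated by truncating k/m downwards instead of rounding to nearest, so that each printed threshold never exceeds the relative frequency it is meant to capture. The Python below follows my guess above for the final bin (>=0.99999 followed by >=1.0); whether that is the right way to close the last bin is still an open question, and the function names are hypothetical.

```python
import math
from fractions import Fraction


def left_edge_cat_thresh(m, digits=5):
    """Return bin edges whose left sides are k/m truncated down to `digits` d.p."""
    scale = 10 ** digits
    # Floor (not round) so that every edge is <= the exact value k/m.
    edges = [Fraction(math.floor(Fraction(k, m) * scale), scale) for k in range(m)]
    # Final narrow bin for a relative frequency of exactly 1 (assumed layout).
    edges += [Fraction(scale - 1, scale), Fraction(1)]
    return edges


def format_cat_thresh(edges, digits=5):
    """Format the edges as a cat_thresh line in the style used in this thread."""
    body = ", ".join(f">={float(e):.{digits}f}" for e in edges)
    return f"cat_thresh = [{body}]"


if __name__ == "__main__":
    m = 6
    edges = left_edge_cat_thresh(m)
    print(format_cat_thresh(edges))
    # Each k/m should sit at or above its own left edge and below the next one.
    print(all(edges[k] <= Fraction(k, m) < edges[k + 1] for k in range(m)))
```

With rounding to nearest, 1/6 would become 0.16667, which is slightly larger than 1/6, so a forecast of exactly 1/6 would drop into the bin below.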

I can't myself think of a scenario that would require the flexibility to use the right side of each bin, but perhaps you might as well include it in case it turns out to be useful to someone? Up to you.

JohnHalleyGotway Sep 27, 2022
Maintainer

@RogerHar I realize that I've failed to follow up on your response! I'll use this discussion to create this GitHub issue dtcenter/MET#2280. I propose that we mark this discussion as being answered and that we move future conversation over to comments on that issue.
