
Quantitative comparison between autocorrelation function and block averaging #53

Open
ocmadin opened this issue Sep 6, 2018 · 16 comments


@ocmadin

ocmadin commented Sep 6, 2018

In section 7.3, there is a discussion of autocorrelation analysis and block averaging as methods for estimating the number of independent samples, but the discussion does not make any recommendation about which method is preferable in specific cases.

Have there been any studies that quantitatively compare these two measures? For example, testing the minimum number of samples below which each method becomes unreliable, or whether the extra information used in a block-averaging scheme makes a difference to the uncertainty estimate. We are working on a best-practices document for property calculation from MD and are interested in the effect of the choice of method.

@mrshirts

mrshirts commented Sep 6, 2018

I'll just echo this (Owen is a student in my lab)! We are willing to do some tests, but want to make sure that we aren't duplicating work that other people have done previously!

@agrossfield
Collaborator

agrossfield commented Sep 6, 2018 via email

@dmzuckerman
Owner

I'm not aware of a head-to-head comparison, but perhaps @mangiapasta may know. I would be quite interested in your results if you obtain any.

One advantage of block-averaging is that it gives you an uncertainty estimate directly, without the need to separately compute the number of samples.

@SeroNISTPI
Collaborator

I don't know offhand if there has been a comparison. I have a suspicion that under certain assumptions (e.g. finite correlation time) there may be simple relationships between estimates of the variances computed from each method, but don't quote me on that.

@dwsideriusNIST
Collaborator

dwsideriusNIST commented Sep 6, 2018 via email

@ajschult
Collaborator

ajschult commented Sep 6, 2018

We tried computing uncertainties using the autocorrelation function recently and found the results to be quite poor in comparison to block averaging, especially when the autocorrelation fluctuates out to long times where it cannot be computed precisely. Block averaging also has more trouble in such situations than when the correlations decay quickly, but not nearly as much as the autocorrelation approach.
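A minimal sketch of the ACF-based approach under discussion, illustrating why the noisy tail must be truncated. The function name, the windowing rule (Sokal-style self-consistent truncation), and the cutoff factor are illustrative choices, not code from any of the discussants:

```python
import numpy as np

def integrated_autocorr_time(x, cutoff_factor=5.0):
    """Estimate the integrated autocorrelation time of a 1-D series.

    Sums the normalized autocorrelation function with Sokal-style
    self-consistent windowing: the sum is truncated at the first lag m
    where m >= cutoff_factor * tau_int(m), suppressing the noisy tail
    that makes the naive full sum unreliable.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    dx = x - x.mean()
    # autocovariance by direct products (O(n^2); fine for a sketch)
    acf = np.array([np.dot(dx[:n - k], dx[k:]) / (n - k)
                    for k in range(n // 2)])
    acf /= acf[0]  # normalize so acf[0] == 1
    tau = 0.5
    for m in range(1, len(acf)):
        tau += acf[m]
        if m >= cutoff_factor * tau:
            break
    return max(tau, 0.5)
```

The effective number of independent samples then follows as `n_eff = n / (2 * tau)`.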

@dmzuckerman
Owner

Regarding @dwsideriusNIST's comment "block averaging became popular because it was cheap and could be done on the fly if the block size was preselected":

The proper/informative way to do block averaging explicitly requires checking all possible block sizes, not pre-selecting a single block size. This is key. Is it 'rigorous'? I think it's close enough for physicists. Certainly the ACF tail challenge noted above and in prior discussions is quite tricky.
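The block-size scan described here can be sketched as follows; this is an illustrative implementation (function name and block-size grid are my own choices), in which one looks for a plateau in the blocked standard error as the block size grows past the correlation time:

```python
import numpy as np

def block_std_errors(x, n_sizes=20):
    """Blocked standard error of the mean over a range of block sizes.

    For each block size b, the series is cut into len(x)//b
    non-overlapping blocks and the standard error is computed from the
    scatter of the block means.  A plateau in bse vs. b indicates blocks
    longer than the correlation time; the plateau value is the estimated
    uncertainty of the mean.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    sizes = np.unique(np.geomspace(1, n // 4, n_sizes).astype(int))
    bse = []
    for b in sizes:
        nb = n // b
        means = x[: nb * b].reshape(nb, b).mean(axis=1)
        bse.append(means.std(ddof=1) / np.sqrt(nb))
    return sizes, np.array(bse)
```

For uncorrelated data the curve is flat from the start; for correlated data it rises before leveling off.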

@mrshirts

mrshirts commented Sep 6, 2018

Glad to get all the discussion! This just SEEMS like a question that the statistics community must have answered at some point, doesn't it? Understanding what you can conclude from time series is their bread and butter, right?

If not, some clear guidance seems badly needed, especially on how to determine the number of independent samples (either explicitly, via autocorrelation, or implicitly, via block averaging) as reliably as possible under standard conditions. When one transitions from "we have collected thousands of correlation times of data, everything works" to "we have just a few correlation times," presumably one method breaks before the other. Below "we have just a few correlation times" I would imagine everything fails. But the most interesting question is at which point before that each method fails (where "fails" needs to be defined a bit better).

@mrshirts

mrshirts commented Sep 6, 2018

Also, block averaging vs. block bootstrapping is of interest . . .

@dmzuckerman
Owner

I think block bootstrapping implicitly assumes you have multiple independent samples ... which is (part of) what one is trying to figure out.

FYI, a caution on bootstrapping: http://statisticalbiophysicsblog.org/?p=213
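To make the circularity concrete, here is a sketch of a moving-block bootstrap of the standard error of the mean. The function name is illustrative; note that the user must supply `block_len`, which only gives a meaningful answer if it already exceeds the correlation time, which is exactly the quantity in question:

```python
import numpy as np

def block_bootstrap_sem(x, block_len, n_boot=1000, rng=None):
    """Moving-block bootstrap estimate of the SEM of a correlated series.

    Resamples whole contiguous blocks (with replacement) so that
    within-block correlation is preserved.  The result is only reliable
    when block_len exceeds the correlation time -- a quantity one is
    usually still trying to determine.
    """
    rng = np.random.default_rng(rng)
    x = np.asarray(x, dtype=float)
    n = len(x)
    n_blocks = n // block_len
    starts = rng.integers(0, n - block_len + 1, size=(n_boot, n_blocks))
    # gather blocks: index array of shape (n_boot, n_blocks, block_len)
    idx = starts[..., None] + np.arange(block_len)
    boot_means = x[idx].mean(axis=(1, 2))
    return boot_means.std(ddof=1)
```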

@dwsideriusNIST
Collaborator

in reply to @dmzuckerman "The proper/informative way to do block averaging explicitly requires checking all possible block sizes and not pre-selecting the block size. This is key. Is it 'rigorous'? I think it's close enough for physicists. Certainly the tail challenge in ACF noted above and in prior discussions is quite tricky."

I think this gets back to one of our purposes in writing the paper: education about the underlying assumptions / statistical foundation of uncertainty-estimation techniques. I'm entirely on board with the need to "check your block sizes" in post-processing, hence the inclusion of that exact point in our paper. But I'll also echo something @mangiapasta said earlier: "... you'd be surprised what people do ..." Particularly when using codes traceable to old editions of Allen & Tildesley or Frenkel & Smit that used pre-set block sizes.

@richardjgowers

It's not a full investigation of the two approaches, but we briefly compared them in Figs 2&3 here:

https://www.tandfonline.com/doi/pdf/10.1080/08927022.2017.1375492?needAccess=true

We found that autocorrelation plots were much easier to read (and to automate reading) than blocking plots. It's also a little strange to measure the autocorrelation by searching for a block size at which you no longer see its effects (i.e., Eq. 25 in the Flyvbjerg paper).

I think @dwsideriusNIST is correct about why block averaging became popular. When I was scanning multiple block sizes to find the smallest adequate one, I found that just directly calculating the autocorrelation (with FFTs) was faster anyway.
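The FFT route mentioned here can be sketched as follows (an illustrative implementation, not @richardjgowers' actual code): zero-padding avoids the circular wrap-around of a plain FFT convolution, bringing the cost down to O(n log n) versus O(n^2) for direct products:

```python
import numpy as np

def acf_fft(x):
    """Normalized autocorrelation function via FFT (O(n log n)).

    Zero-pads to twice the series length to avoid circular wrap-around,
    applies the unbiased 1/(n-k) normalization per lag, and scales so
    that acf[0] == 1.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    dx = x - x.mean()
    f = np.fft.rfft(dx, 2 * n)              # zero-padded transform
    acov = np.fft.irfft(f * np.conj(f))[:n]  # autocovariance, lags 0..n-1
    acov /= np.arange(n, 0, -1)              # unbiased 1/(n-k) factor
    return acov / acov[0]
```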

@dmzuckerman
Owner

There are two goals in this type of error analysis: (1) determine the autocorrelation time and (2) determine the uncertainty. For (1) I guess the autocorrelation plot is better. But for (2), which arguably is more bottom-line, I would put my money on blocking. In Fig. 3 of @richardjgowers' paper linked above, the BSE seems to be convincingly estimated. I'm sure both approaches struggle with insufficient data - no surprise there.

@mrshirts

So I guess I'd leave this for the next version of this document in 1-2 years; it would be super useful to the field as a whole to have a more quantitative answer to this question.

@davidlmobley

I think Mike Gilson's group has done some work trying to look at some of this. Someone may want to rope him in for a discussion.

With respect to Michael's comment:

So I guess I'd leave this for the next version of this document in 1-2 years; it would be super useful to the field as a whole to have a more quantitative answer to this question.

Note that this can be addressed in the repo as soon as anyone wants to, and then those changes just naturally roll into the next peer-reviewed version when they are ready. :)

@ppernot

ppernot commented Dec 5, 2018

There is yet another method, based on time-series analysis, which is commonly used in statistics for analyzing Markov chain Monte Carlo samples: fitting the sample with an autoregressive (AR) process. See e.g. Thompson 2010. This might be an interesting addition to the discussion in Sect. 7.3...
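A sketch of the simplest version of this AR-based idea, using an AR(1) fit (the function name and the restriction to first order are illustrative; real data with slower-than-exponential correlations would need a higher-order AR model, e.g. via statsmodels):

```python
import numpy as np

def ar1_effective_samples(x):
    """Effective sample size from an AR(1) fit.

    Estimates the lag-1 coefficient phi by least squares, then applies
    the AR(1) statistical inefficiency g = (1 + phi) / (1 - phi), so
    that n_eff = n / g.
    """
    x = np.asarray(x, dtype=float)
    dx = x - x.mean()
    phi = np.dot(dx[:-1], dx[1:]) / np.dot(dx[:-1], dx[:-1])
    g = (1 + phi) / (1 - phi)
    return len(x) / g
```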


10 participants