Issue #267: Refactor to allow custom event priors and marginalise the latent likelihood #474
Force-pushed from aabcfeb to b8e61e9.
Noting that this has slowed down because I realised that reparameterisation currently doesn't work with general families and
Something else to note here is that the matrix multiplication and the handling of the uniform priors appear to be very inefficient, so in its current form this doesn't really make a great deal of sense.
I don't think the matrix multiplication issue can be worked around, so I would argue that this PR should be repurposed to address the issues found while exploring this, and that we hit pause on custom priors and look for another method for correcting the likelihood.
This PR now:
I also added #478, #476, and #477 to enhance functionality from this PR. In general, I found quite a few edge cases (especially in the prior handling) while looking at this that I think might need ongoing work. Because the windows-as-formulas approach is inefficient, this PR implements the log likelihood with the windows marginalised out. I think this is more correct than the random unlinked sample version, but it is potentially not ideal. The only way I can think of to write down the full likelihood is to somehow switch the model from the direct version to the formula version when doing posterior prediction. That feels kind of bonkers though. The marginalised likelihood is currently very slow (due to the issues noted above), so this may feel like less of a problem once those are resolved.
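To make "windows marginalised out" concrete, here is a minimal sketch in Python rather than the package's Stan code. It assumes a lognormal delay distribution, raw window parameters that are uniform on [0, 1], and a delay of the form `obs_delay + s_raw * swindow - p_raw * pwindow`. All of these, along with every function name, are illustrative assumptions and not the implementation in this PR.

```python
import math


def lognormal_logpdf(x, mu, sigma):
    """Log density of a lognormal distribution (assumed delay distribution)."""
    if x <= 0:
        return -math.inf
    z = (math.log(x) - mu) / sigma
    return -math.log(x * sigma * math.sqrt(2 * math.pi)) - 0.5 * z * z


def logsumexp(values):
    """Numerically stable log(sum(exp(values)))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def marginal_loglik(obs_delay, pwindow, swindow, mu, sigma, n_grid=50):
    """Average the likelihood over uniform raw windows on a midpoint grid.

    Hypothetical helper: approximates
    log of the double integral of p(delay | p_raw, s_raw) over [0, 1]^2.
    """
    grid = [(i + 0.5) / n_grid for i in range(n_grid)]
    terms = []
    for p_raw in grid:
        for s_raw in grid:
            # Assumed latent delay given the position of each event in its window.
            delay = obs_delay + s_raw * swindow - p_raw * pwindow
            terms.append(lognormal_logpdf(delay, mu, sigma))
    # A uniform density of 1 on the unit square means the integral is a plain average.
    return logsumexp(terms) - math.log(len(terms))


print(marginal_loglik(3.0, pwindow=1.0, swindow=1.0, mu=1.0, sigma=0.5))
```

The nested grid costs `n_grid ** 2` density evaluations per observation, which gives a feel for why a marginalised likelihood can be slow when it is evaluated for every posterior draw.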
Force-pushed from 2dc537a to 5beebde.
- I think the way to do the reparameterisation is cool. I do worry about it being a bit hacky and relying on regex. Perhaps, though, the regex is relatively robust and the pattern it's looking for in `brms` Stan code can be relied on not to change. Also, I'm unsure why this feature is part of this PR. (Is it related to the refactor to use formulas for `pwindow` and `swindow`?) Also, if it's not using S3, I suggest dropping S3.
- On `log_lik`, `posterior_predict` and `posterior_epred`, I think the generalisation beyond being about the latent model is good. I wonder how we will extend this to work with the `marginal_model` (PR "Issue #221: Add marginal model" #426). The naive model already has these methods, as it doesn't need a custom family. One could wonder about simulating from the right likelihood after using the naive model, but perhaps it's not going to happen or be a useful feature.
- It's exciting to have priors on `pwindow_raw` and `swindow_raw`. I wonder how easy it will be for users to change those priors (knowing which parameters to set).
- The prior infrastructure seems to be getting quite complicated and/or brittle. Perhaps you disagree. I wonder what we might be able to do to simplify it. At some level I feel quite bad about including an argument like `merge_priors` in the main `epidist()` function, as it shows our approach isn't really working that well. I wonder what use cases the "built in approach isn't flexible enough for": is it mainly this one about a prior for all coefficients? I was a bit confused by that because in the previous approach I thought we were matching on all entries in the dataframe. Now it seems like we are doing some regex and it's a different approach.
- Is there some way we can document how users can make use of the priors on window functionality, and when they would do it? Inclusion in some vignette? Create a new issue on this? Otherwise I think it is quite a challenging thing to figure out.
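One way to reduce the risk raised in the first point (a regex silently drifting out of sync with generated Stan code) is to make the substitution fail loudly unless the pattern matches exactly once. The sketch below is in Python purely for illustration. The Stan snippet, the pattern, and `hypothetical_shift` are made-up stand-ins, not the code that `brms` generates or the regex used in this PR.

```python
import re


def replace_exactly_once(code, pattern, replacement):
    """Apply a regex substitution, raising if it did not match exactly once."""
    new_code, n_matches = re.subn(pattern, replacement, code)
    if n_matches != 1:
        raise ValueError(
            f"expected exactly 1 match for {pattern!r}, found {n_matches}"
        )
    return new_code


# Made-up stand-in for a line of generated Stan code.
STAN_SNIPPET = "mu += Intercept;\n"
# Hypothetical pattern; '+' is escaped because it is a regex metacharacter.
PATTERN = r"mu \+= Intercept;"

patched = replace_exactly_once(
    STAN_SNIPPET, PATTERN, "mu += Intercept + hypothetical_shift;"
)
print(patched)
```

With a guard like this, a change in the generated code surfaces as an immediate error at model-building time rather than as a model that quietly skips the reparameterisation.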
Thanks for the review, @athowes. I think I have addressed your specific concerns, and I discuss your points below. At a high level, I agree that some parts of this are a bit complex, but I think we might need some more user testing to see how it all shakes out before making updates.
The S3 option is there in case the regex isn't working well enough, so that you can write an S3 method as a workaround. If we didn't have this, you would just be stuck. As I said in the specific comment, I think we should monitor this.
It's part of this PR because I refactored the Stan code section of the latent model, and this was one part of that (it also needed to be done in order to make the model work at all when we were passing the window parameters as dpars, which we no longer are).
Yes, I agree. I think we should make sure that the marginal model has the vreals etc. that this needs, and then it should just be plug and play.
As above, and yes, I think so. I think we might want to wait to see what custom priors look like in the marginal model (if and when that goes in), as that will have other limitations/features we should probably discuss. I'll make an issue now though.
I found multiple instances where this didn't work (some of them are in the new tests). Working with this more generally made me think we might need a list argument per low-level function in epidist to pass optional parameters, as it felt pretty limited when playing around.
It got so complicated because the current implementation was pretty brittle when actually trying to use it (mostly when trying to pass pwindow and swindow in the formula, which a future model might need to do). I think we might need some more user testing to see where we are, but I really can't see how we can do away with handling manual priors as a special case, for example.
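For readers following the prior-merging discussion, the general idea of matching user priors against defaults on identifying columns can be sketched as below. This is a hedged illustration in Python and not how `.replace_prior` or `merge_priors` behave in epidist. The function name `merge_prior_tables`, the key fields, and the example priors are all assumptions made for the example.

```python
def merge_prior_tables(defaults, user, key_fields=("class", "coef", "dpar")):
    """Let user priors override defaults that share the same key fields.

    Returns the merged list plus the user priors that matched no default,
    so the caller can warn about them instead of silently accepting them.
    """
    def key(prior):
        return tuple(prior.get(field, "") for field in key_fields)

    # Dicts keep insertion order, so overridden priors stay in their slot.
    merged = {key(prior): prior for prior in defaults}
    unmatched = []
    for prior in user:
        if key(prior) not in merged:
            unmatched.append(prior)
        merged[key(prior)] = prior
    return list(merged.values()), unmatched


# Illustrative priors only; these are not the package defaults.
DEFAULT_PRIORS = [
    {"class": "Intercept", "coef": "", "dpar": "", "prior": "normal(1, 1)"},
    {"class": "Intercept", "coef": "", "dpar": "sigma", "prior": "normal(0, 1)"},
]
USER_PRIORS = [
    {"class": "Intercept", "coef": "", "dpar": "", "prior": "normal(2, 0.5)"},
    {"class": "b", "coef": "z", "dpar": "", "prior": "normal(0, 2)"},
]

merged, unmatched = merge_prior_tables(DEFAULT_PRIORS, USER_PRIORS)
print(merged)
print(unmatched)
```

Returning the unmatched priors separately is one possible guard rail: a prior meant for all coefficients, or one with a mistyped parameter name, shows up explicitly rather than being dropped or duplicated.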
Not at all easy, I think, and there are lots of difficult edge cases (also, posterior_predict etc. is uniform only). I think we might need another issue to expand on this and add some guard rails.
… latent likelihood (#474)

* first pass at refactoring latent model to use window formulas
* add docs to stan function
* check getting started -drive by fix plotting
* update approach to handling formulas
* get reparameterisation from brms itself vs enforcing manual declaration
* work on regexing:
* test manually setting new priors
* fix .replace_prior
* reset for pause
* add back in lower bounds
* revert pass in via formula
* add custom priors pass in
* write priors down more neatly
* add manual prior mode and optout
* clean up easy test failures
* use marginalised log likelihood
* debug marginalised likelihood
* workaround for liklihood vectorisation
* further increase prior complexity options
* update prior ordering
* catch printing issue for .replace_prior
* add news iteem
* add PR links
* speeed up test
* code read through
* clean up precommit
* turn off priorsense to check theory its numerical instability for extreme log lik values
* review comments

Former-commit-id: 5f6baaa183c156a9d981a669f3717daed3806468 [formerly 9869c9563464c7752c57123f718bd66b3b08d5a3]
Former-commit-id: b1b61cbc1600a76b14924cf86eacf66c5acefbcb
Description
This PR closes #267 by moving to a formula approach for `pwindow` and `swindow`. On top of allowing for the correct likelihood, this also allows users to set priors on `pwindow` and `swindow`. This could be extended to allow for custom formulas (for `pwindow` this would be a useful feature, but not for `swindow`).

Checklist