Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Notes on LMH implementation #9

Open
dtolpin opened this issue Aug 30, 2017 · 3 comments
Open

Notes on LMH implementation #9

dtolpin opened this issue Aug 30, 2017 · 3 comments
Assignees

Comments

@dtolpin
Copy link

dtolpin commented Aug 30, 2017

Provide notes about LMH implementation in Anglican and how it can be done in CPProb

@dtolpin dtolpin self-assigned this Aug 30, 2017
@dtolpin
Copy link
Author

dtolpin commented Aug 30, 2017

the only problem which cannot be avoided with single-site MH (and which
is not in the paper Mario cited --- http://okmij.org/ftp/kakuritu/Hakaru10/PPS2016.pdf)
is that if a trace has two variables
and both must be resampled to get an acceptable sample (think about a
program implementing XOR) single-site MH cannot do the inference. Hard
inference is in general problematic, that's why Anglican relies on ABC
(approximate bayesian computation) via 'observe'.

The problem of having a sample with 0 probability is solved in Anglican
by resampling instead of rescoring. These are lines 75-81 in
src/anglican/lmh.clj:

    75          value (if (< (/ -1. 0.) log-p (/ 1. 0.)) value
    76                  ;; The retained value is not in support, resample
    77                  ;; the value from the prior.  When the value is
    78                  ;; resampled, log-p is no longer valid, but log-p
    79                  ;; of a resampled value is ignored anyway (see
    80                  ;; `utility' below).
    81                  (sample* (:dist smp)))

Which means that sometimes more than a single site is resampled, which is also true if the control flow of the program changes. It is safe.

You can use any addressing for LMH, the only thing you should ensure to keep it sound is that different sample points at run time always get different addresses. Some sampling schemes will be more efficient than others, leading to more reuse and better acceptance rate, but any addressing giving distinct addresses is sound.

Regarding acceptance rate and state transition. If you look at the paper by Wood etc A new approach to ..., then there is a derivation of acceptance rate for LMH, making clear that the acceptance rate depends on two transitions:

  • from current state to next state
  • from next state to current state

Because random database (the table of random choices) is collected and maintained between states, the current state reached from the next state is different from the current state reached from the previous-state.

Hence, the code from line 93 on ensures that the transition probability is computed correctly by comparing the so-called 'utility' of the next state relative to the current state and of the current state relative to the next state. One trick here is that resampled values do not affect acceptance. This is somewhat unexpected but you can do the derivation and see for yourself. Therefore, utility (lines 133-139) is computed as the sum of log-weight (log-likelihood due to observe) and of log-likelihood of rescored variables only.

So the steps to implement LMH, following the code in Anglican, would be

  • run the particle and produce a new state with the random database
    * if while running the particle a retained value has zero probability, resample it and mark as such
  • reconstruct the current state from the new state (the random database will reflect transitions from the next state to the current state, rather from the previous one).
  • compute the 'utilities' of the next-state and reconstructed current state.
  • accept the new state or retain the current state accordingly.

I think that two non-trivial things in the whole (rather short) implementation of LMH in Anglican are the reconstruction of current state from new-state (function 'prev-state') and the way the utility is computed.

@dtolpin
Copy link
Author

dtolpin commented Aug 30, 2017

I also found my old notes from early days of developing the current Anglican code base (the notes are relative to Interpreted Anglican). This is the derivation of acceptance rate, which may either help or be misleading :-) but I think the derivation can be re-done easily. The notes mention m!. "m!" was the internal development name of the new version of Anglican (pronounced "embang", embedded anglican).

Acceptance ratio in RDB:

Pr-new = + spent-new observed-new restored-new
Pr-old = + spent-old observed-old restored-old
rcp-old = - $ log N-old
rcp-new = - $ log N-new
rnd-old = spent-old
rnd-new = spent-new
old-to-new = + rcp-old val-new rnd-new
new-ro-old = + rcp-new val-old rnd-old

ratio = - (+ pr-new  new-to-old ) (+ pr-old old-to-new)

= - (+ observed-new restored-new (log |old|) val-old )
  (+ observed-old restored-old (log |new|) val-new )

= - (+ log-weight-new (- restored-new val-new) (log |old|))
    (+ log-weight-old (- restored-old val-old) (log |new|))

;; in m! would be

= - (+ log-weight-new rescored-new (log |old|))
        (- log-weight-old rescored-old (new |new|))


@dtolpin dtolpin assigned lezcano and gbaydin and unassigned lezcano and gbaydin Aug 30, 2017
@dtolpin
Copy link
Author

dtolpin commented Aug 30, 2017

@gbaydin @lezcano for your info

@lezcano lezcano assigned lezcano and unassigned dtolpin Sep 4, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants