SBI with noisy exponential simulator: convergence of posteriors for SNPE_C, SNLE and SNRE #1333
Replies: 10 comments
-
You are currently using 100 simulations (i.e. datapoints) to estimate a 50D distribution (with NLE). I would suggest to increase the number of simulations to at least 1000. |
Beta Was this translation helpful? Give feedback.
-
What you suggested does seem to help but it doesn't fully address the issue:
I changed the no. of simulations to 1500/2000 |
Beta Was this translation helpful? Give feedback.
-
As general advice, if you perform such studies, use the same simulator for x_o as you use in training. In your case simulator = lambda theta: ((1/initial_condition)*np.exp(-t/theta) + 0.05*torch.rand_like(theta)).float() Which uses a uniform noise range from 0. to 0.05. On the other hand, the observation uses This adds the complication of model-misspecification to your problem, which can be problematic for the neural nets involved (i.e. x_o falls out of training distribution). But also leads to the fact that the true parameter does not need to be covered by the estimated posterior. I suppose this was not intentional. Specifically, in this case, the observation might not even be within the support of your data distribution implied by the simulator, and no parameter might exist to reproduce the observation (and Bayesian inference is the ill-defined anyway). Next as general advice for (multi-round) training:
Applying this to NPE will lead to consistently good posterior estimates on my machine: import numpy as np
import torch
from torch.distributions import Uniform, Exponential
from sbi.inference import NPE
from sbi.analysis import pairplot
# ## Function to introduce noise into the exponential:
def noisy_exponential(initial_cond, scale, input):
x_noise = np.zeros((len(input)),)
for i in range(len(input)):
value = 1/(initial_cond)*np.exp(-input[i]/scale) + 0.05*np.random.normal(0,1)
x_noise[i] = value
return x_noise
# Define the prior
class WrappedExponential(Exponential):
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
def log_prob(*args, **kwargs):
log_probs = Exponential.log_prob(*args, **kwargs)
return log_probs.squeeze()
exp_prior_mod = WrappedExponential(torch.tensor([2.0]))
n = 50
t = np.linspace(0,10,n)
num_dims = 1
num_sims = 1000
num_rounds = 2
theta_true = torch.tensor([1.])
initial_condition = 1.0
#x_o = torch.tensor(noisy_exponential(1.0, theta_true,t)).float()
simulator = lambda theta: ((1/initial_condition)*np.exp(-t/theta) + 0.05*torch.rand_like(theta)).float()
x_o = simulator(theta_true)
inference = NPE(prior=exp_prior_mod, density_estimator="nsf")
proposal = exp_prior_mod
# Later rounds use the APT loss.
for _ in range(num_rounds):
theta = proposal.sample((num_sims,))
x = simulator(theta)
x[x>10] = 10
_ = inference.append_simulations(theta, x, proposal=proposal).train(use_combined_loss = False, retrain_from_scratch=False)
posterior = inference.build_posterior().set_default_x(x_o)
proposal = posterior |
Beta Was this translation helpful? Give feedback.
-
Beta Was this translation helpful? Give feedback.
-
Hey, In the end, you are working with few model simulations in a relatively high dimensional x space. So, it is expected that there will be deviations in the posteriors returned by the different methods. Particularly, NL(R)E will have to estimate a 50d density with just a few samples. A general rule of thumb is that NPE > N(L)RE if dim(x) >> dim(theta). Feel free to close the issue if this resolves your problem. Kind regards, |
Beta Was this translation helpful? Give feedback.
-
Beta Was this translation helpful? Give feedback.
-
In general, under the true posterior, we would expect that within the 95% highest density region (i.e., the peak in your density plots), the true parameters is contained 95% of the time. So if you say that "sometimes" it lies outside of the posterior, this does not necessarily indicate an error. In fact, given the true posterior, we would expect it to lay outside 5% of independent trials (following the example above). A way to characterize if there is a problem is simulation-based calibration analysis. This, in fact e.g., it simply checks if the above is true for your approximation (for all x% levels). There are a bunch of diagnostic tools implemented in SBI, that only require a test set of new thetas:
If you have rather high-dimensional data, then NPE does give you the opportunity to include an embedding net. This can be jointly trained with the density estimator or/and include certain regularity constraints you might want to impose. It is helpful if you have very high-dimensional data. For example if you have images, it makes sense to use a convolutional neural network as an embedding net, that compresses the image into e.g a 100 dim "summary" vector. |
Beta Was this translation helpful? Give feedback.
-
@paarth-dudani did the suggestions from Manuel help? |
Beta Was this translation helpful? Give feedback.
-
I am moving this to discussions as it seems to be related to a more specific use-case. Please continue the discussion there. Thanks 🙏 |
Beta Was this translation helpful? Give feedback.
-
Hi @paarth-dudani has this been resolved? Can you please post an update here if not? Thanks! |
Beta Was this translation helpful? Give feedback.
-
Describe the bug
SNLE, SNRE and SNPE_C converge arbitrarily for a simple simulator model with just one parameter to infer.
To Reproduce
All else same
All else is same
Examples of pathological convergence:
Expected behavior
A proper and robust convergence to the true theta value with low variability across iterations and across various true values of the parameter to be inferred.
Additional context
I would appreciate suggestions as it is an urgent issue on which I have been stuck for more than a week now.
Beta Was this translation helpful? Give feedback.
All reactions