Does the bisim loss influence the convergence of the forward and reward dynamics model? #21

chaobiubiu · 2022-03-10T06:46:55Z

Hello, as I understand it, the bisimulation loss pushes the distance between any two latents toward reward_dist + \gamma * transition_distribution_dist, which approximates the bisimulation metric. However, this means the latents keep changing during training, so the forward model (z_t, a_t --> z_{t+1}) may be regressing to a time-varying target z_{t+1}. Could this conflict between the two losses hurt the final performance or affect the convergence of the forward dynamics model? I am looking forward to your reply, thank you!
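To make the question concrete, the two objectives can be sketched roughly as below. This is a minimal sketch and not the repository's code: all function names are hypothetical, and it assumes an L1 distance between latents and a diagonal-Gaussian transition model, for which the 2-Wasserstein distance has a closed form. The point is that `next_z` in `forward_loss` comes from the same encoder that `bisim_loss` is reshaping.

```python
import math

def l1(a, b):
    # L1 distance between two latent vectors (assumed latent metric).
    return sum(abs(x - y) for x, y in zip(a, b))

def w2_diag_gaussian(mu_i, sigma_i, mu_j, sigma_j):
    # Closed-form 2-Wasserstein distance between two diagonal Gaussians.
    sq = sum((a - b) ** 2 for a, b in zip(mu_i, mu_j))
    sq += sum((a - b) ** 2 for a, b in zip(sigma_i, sigma_j))
    return math.sqrt(sq)

def bisim_loss(z_i, z_j, r_i, r_j, pred_i, pred_j, gamma=0.99):
    # pred_i and pred_j are (mu, sigma) tuples from the transition model.
    # Target: reward_dist + gamma * transition_distribution_dist.
    target = abs(r_i - r_j) + gamma * w2_diag_gaussian(*pred_i, *pred_j)
    return (l1(z_i, z_j) - target) ** 2

def forward_loss(pred_mu, next_z):
    # Mean squared error between the predicted next latent and the
    # encoder's output for the next observation. next_z moves whenever
    # the encoder is updated, which is the "time-varying target" concern.
    return sum((m - z) ** 2 for m, z in zip(pred_mu, next_z)) / len(next_z)
```

In this sketch, any encoder update driven by `bisim_loss` changes `z_i`, `z_j`, and also the `next_z` that `forward_loss` regresses to.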
