This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

Does the bisim loss influence the convergence of the forward and reward dynamics model? #21

Open
chaobiubiu opened this issue Mar 10, 2022 · 0 comments

Comments

@chaobiubiu
Hello. As I understand it, the bisimulation loss drives the distance between any two latents toward the target (reward_dist + \gamma * transition_distribution_dist), so that the latent distance approximates the bisimulation metric. However, this update makes the latents shift over time, so the forward model (z_t, a_t --> z_{t+1}) may end up regressing to a time-varying target z_{t+1}. Could this conflict between the two losses hurt final performance or affect the convergence of the forward dynamics model? I look forward to your reply. Thank you!
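
To make the question concrete, here is a minimal pure-Python sketch of the two losses as I understand them. It is not the repository's code. The function names, the L1 latent distance, the squared-error form, and the closed-form 2-Wasserstein distance between diagonal Gaussians are all assumptions made for illustration.

```python
import math

def l1_distance(z_i, z_j):
    """L1 distance between two latent vectors (assumed latent metric)."""
    return sum(abs(a - b) for a, b in zip(z_i, z_j))

def w2_diag_gaussian(mu_i, sigma_i, mu_j, sigma_j):
    """Closed-form 2-Wasserstein distance between two diagonal Gaussians."""
    sq = sum((a - b) ** 2 for a, b in zip(mu_i, mu_j))
    sq += sum((a - b) ** 2 for a, b in zip(sigma_i, sigma_j))
    return math.sqrt(sq)

def bisim_loss(z_i, z_j, r_i, r_j, next_i, next_j, gamma=0.99):
    """Squared gap between the latent distance and the bisimulation target.

    next_i and next_j are hypothetical (mu, sigma) pairs predicted by the
    forward model for each sample.
    """
    target = abs(r_i - r_j) + gamma * w2_diag_gaussian(*next_i, *next_j)
    return (l1_distance(z_i, z_j) - target) ** 2

def forward_loss(pred_mu, next_z):
    """Mean squared error between the predicted and the encoded next latent.

    next_z comes from the encoder, which the bisimulation loss keeps
    updating, so this regression target is not fixed during training.
    """
    return sum((p - z) ** 2 for p, z in zip(pred_mu, next_z)) / len(next_z)
```

In this sketch, `bisim_loss` moves the encoder outputs `z_i` and `z_j`, while `forward_loss` regresses toward `next_z`, which is itself an encoder output. That coupling is the source of the time-varying target I am asking about.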
