np.argmax bias #3

Open
kvkpraneeth opened this issue Sep 24, 2020 · 0 comments

kvkpraneeth commented Sep 24, 2020

The function np.argmax does not break ties randomly:

e.g.:
Q = [1, 0, 1]
np.argmax(Q) always returns index 0, the first of the two 1s.

Ties during action selection should instead be broken uniformly at random. Otherwise the agent is biased toward lower-indexed actions.

np.argmax is used in bandit.py in the chapter 2 folder.
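
One possible fix, as a minimal sketch: collect every index that attains the maximum and sample one of them uniformly. The helper name `argmax_random_tie` and its `rng` parameter are hypothetical and not taken from bandit.py; the exact place to call it depends on how that file selects actions.

```python
import numpy as np

def argmax_random_tie(q_values, rng=None):
    """Return the index of a maximal entry, chosen uniformly among ties.

    Hypothetical helper for illustration; not part of bandit.py.
    """
    rng = np.random.default_rng() if rng is None else rng
    q_values = np.asarray(q_values)
    # Indices of every entry equal to the maximum value.
    best = np.flatnonzero(q_values == q_values.max())
    # Pick one of the tied indices uniformly at random.
    return int(rng.choice(best))

Q = [1, 0, 1]
first_only = int(np.argmax(Q))  # np.argmax returns the first maximal index

rng = np.random.default_rng(0)
picks = [argmax_random_tie(Q, rng) for _ in range(1000)]
print(first_only, sorted(set(picks)))
```

With Q = [1, 0, 1], repeated calls should return both index 0 and index 2, while index 1 is never chosen.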
