[Research] Inventory-next anonymous user sees dataset page #2314

Closed
FuhuXia opened this issue Oct 21, 2020 · 18 comments

@FuhuXia
Member

FuhuXia commented Oct 21, 2020

Inventory-next displays dataset page to anonymous user.

How to reproduce

Open Chrome in incognito mode and go to the inventory-next page on sandbox or staging.

Expected behavior

See message "You are not authorized to do this" and a Log in button.

Actual behavior

See the dataset list.

First Step:

  • As a team, decide if we want actual behavior or expected behavior
@mogul
Contributor

mogul commented Oct 22, 2020

Needs discussion as to whether this is more desirable behavior in any case. This may be more about documenting acceptable/recommended usage of inventory.data.gov for users who might be trying to use it for distributing private data.

@mogul
Contributor

mogul commented Oct 22, 2020

Summary: We need to investigate why the behavior is different from the previous version, but we also need to decide if the previous behavior is what we wanted.

@adborden
Contributor

adborden commented Oct 22, 2020

These are alternative issues that I think we should consider instead of this issue. We know the existing behavior on inventory-classic is incorrect/incomplete. Instead I think we want to consider making datasets explicitly public and tweak the draft behavior:

#2096
#2095
#1863

@mogul
Contributor

mogul commented Oct 22, 2020

We should also research how/if the redaction feature is being used as-is.

@jbrown-xentity
Contributor

Questions:

  • Are agencies putting any private data (metadata fields or data files or existence of data itself) on inventory?
    • In your use of inventory, do you have any data or metadata that should not be viewable by the public?
    • Would you utilize a feature in the future that could restrict access to the public for specific metadata fields or data files?
  • If inventory will be publicly available, should we add a robots.txt file to limit/stop search engines from indexing?
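On the robots.txt question: a minimal sketch, assuming we would want to ask all crawlers to stay out of the whole site. The blanket `Disallow: /` rule, the helper name `crawler_may_fetch`, and the example URL path are assumptions for illustration, not decisions from this thread. Python's standard-library `urllib.robotparser` can check how a compliant crawler would read such a file:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body: ask every crawler to stay out of the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def crawler_may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a compliant crawler would be allowed to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Example path is an assumption; only the host comes from this thread.
    print(crawler_may_fetch(ROBOTS_TXT, "Googlebot", "https://inventory.data.gov/dataset"))
```

Note that robots.txt is advisory only: it keeps well-behaved crawlers from indexing pages but does nothing to restrict access to the data itself.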

@hkdctol
Contributor

hkdctol commented Nov 2, 2020

@jbrown-xentity I think the two questions you have as sub-bullets are good to send out as-is--does that sound ok? I can get it to inventory users today

@jbrown-xentity
Contributor

@hkdctol sounds good to me!

@hkdctol
Contributor

hkdctol commented Nov 10, 2020

@jbrown-xentity reviewed 12 Google Form responses and a few others that came in as Word docs. We also sent the questions again today to the open data listserv. There is a fair amount of interest in being able to store non-public information going forward, but it doesn't seem that anyone is using inventory for non-public information right now. Let's see if we get more responses in the next few days.

@mogul
Contributor

mogul commented Nov 12, 2020

Hyon's giving this a couple more days before we make some decisions.

@mogul
Contributor

mogul commented Nov 16, 2020

Survey results summary:

  • People are interested in using inventory for non-public information.
  • People seem confused about what that actually means.

We are all still confused about which settings apply, and what the norms should be, so Hyon is scheduling a meeting for us to talk in more detail.

@adborden
Contributor

Questions to be answered:

  • Where are all the bits of code in inventory-classic that are affecting this observed behavior (links to commits in USMetadata, ckan, ckanext-saml2)?
  • Does CKAN configuration support defaulting new datasets to private? Does the new metadata form use CKAN's default value for private?
  • Do we need to fix the Draft status behavior?

@jbrown-xentity
Contributor

jbrown-xentity commented Nov 19, 2020

Found this discussion detailing how to allow access to CKAN datasets only when logged in, leaving it here in case we need it.
CKAN does not have a configuration option that controls whether newly created packages are public or private by default. I'm not sure what the default is in code...

@chris-macdermaid chris-macdermaid self-assigned this Dec 3, 2020
@adborden
Contributor

adborden commented Dec 3, 2020

@adborden and @chris-macdermaid to circle up on next steps.

@jbrown-xentity
Contributor

Questions to be answered:

* Where are all the bits of code in inventory-classic that are affecting this observed behavior (links to commits in USMetadata, ckan, ckanext-saml2)?

TODO

* Does CKAN configuration support defaulting new datasets to private? Does the new metadata form use CKAN's default value for `private`?

No, this is not supported in CKAN configuration. The logic that makes datasets public is part of both the classic and next inventory backends; inventory classic also shows a standard disclaimer banner at the top of the resource edit page (I don't believe there is a similar feature in inventory-next).
It is still unknown whether the new metadata form defaults datasets to private.

* Do we need to fix the [Draft status behavior](https://github.com/GSA/datagov-deploy/issues/2095)?

Already addressed.

@jbrown-xentity
Contributor

Items left for discovery:

  • Identify how datasets were created as private by default in classic, and verify the same in inventory-next (including that the dataset is made public when data is added)
  • An anonymous user should be redirected to a login-required or access-denied page

@pjsharpe07 pjsharpe07 self-assigned this Dec 17, 2020
@jbrown-xentity
Contributor

The code that was handling the redirects and locking out anonymous users is here.
This code was first commented out and then removed in the usmetadata ckan-2-8 branch.
I believe it was removed because this is no longer how we should handle authorization.
The next step should be to read through the CKAN docs on the IAuthFunctions interface and establish where this code should be implemented (new extension, etc).
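For reference while reading those docs, a minimal sketch of the shape such an auth function might take. It follows the CKAN convention that an auth function takes `context` and `data_dict` and returns a dict with a boolean `success` key and an optional `msg`. The function name `dataset_list_auth`, the choice of `package_search` as the action to guard, and the assumption that anonymous requests arrive with an empty or missing `user` are all illustrative, not the code that was removed. It is written as plain Python so it runs without CKAN installed; in a real extension the mapping would be returned from the plugin's `get_auth_functions()` method.

```python
from typing import Any, Callable, Dict

AuthFunction = Callable[[Dict[str, Any], Dict[str, Any]], Dict[str, Any]]


def dataset_list_auth(context: Dict[str, Any], data_dict: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical auth function: only logged-in users may list datasets."""
    # Assumption: anonymous requests have an empty or missing 'user' value.
    if context.get("user"):
        return {"success": True}
    return {"success": False, "msg": "You are not authorized to do this"}


def get_auth_functions() -> Dict[str, AuthFunction]:
    """Mapping that a plugin implementing IAuthFunctions could return.

    Guarding 'package_search' is an assumption about which action
    backs the dataset list page.
    """
    return {"package_search": dataset_list_auth}


if __name__ == "__main__":
    print(dataset_list_auth({"user": ""}, {}))       # anonymous request
    print(dataset_list_auth({"user": "alice"}, {}))  # logged-in request
```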

@jbrown-xentity
Contributor

jbrown-xentity commented Dec 21, 2020

This logic switches a private dataset to public when a resource is loaded onto inventory (so that users can view/download the resource).
Combined with the old template, it defaulted all datasets to private and hid the setting from users. We will need to do something similar in the dcat_usmetadata extension, so that when the package_create API is called we default to private=True. The logic would seem to belong here, but some of the existing logic seems wrong (why are we setting bureau_code and program_code at this level? Shouldn't those be user defined?).
We will let this work be completed before suggesting any changes.
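To make the intended behavior concrete, here is a rough sketch of the two rules described above, written as plain functions over the package dict. The function names are hypothetical and this is not the code in either extension; it only illustrates "default to private on create" and "make public once a resource is attached".

```python
from typing import Any, Dict


def default_to_private(data_dict: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a package_create payload with private=True unless already set."""
    result = dict(data_dict)
    result.setdefault("private", True)
    return result


def publish_if_has_resources(package: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the package, made public once it has at least one resource."""
    result = dict(package)
    if result.get("resources"):
        result["private"] = False
    return result


if __name__ == "__main__":
    draft = default_to_private({"name": "example-dataset"})
    print(draft["private"])
    loaded = publish_if_has_resources({**draft, "resources": [{"url": "https://example.com/data.csv"}]})
    print(loaded["private"])
```

Using `setdefault` keeps an explicit `private` value from the caller, so the default only applies when the form or API client does not send one.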

@jbrown-xentity
Contributor

Research is complete, new tickets created as necessary.

@mogul mogul added this to the Sprint 20201223 milestone Dec 23, 2020
@jbrown-xentity jbrown-xentity changed the title Inventory-next anonymous user sees dataset page [Research] Inventory-next anonymous user sees dataset page Dec 23, 2020
@mogul mogul closed this as completed Dec 23, 2020
7 participants