Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

authorize requests to ml-pipeline endpoint if contains any trusted principals #2753

Conversation

kromanow94
Copy link
Contributor

Pull Request Template for Kubeflow manifests Issues

Summary

Allow access to ml-pipeline endpoint if the request contains any trusted principal.

Due to changes introduced in #2544, the kubeflow-userid header is always populated with user/machine details. This is in conflict with the AuthorizationPolicy/ml-pipeline which is configured to only allow access from istio-ingressgateway and also block if the kubeflow-userid header is present. In the past it was made like so for security reason to block header injection, which might a form of attacking the KFP API.

With the changes introduced in the beforementioned PR, Istio will now always verify if the request is authenticated through JWT and will always put the email or sub claim in the header, mitigating the attack through header injection and enforcing user authentication.

Because of that, the AuthorizationPolicy/ml-pipeline has to be updated to allow API Access from in-cluster KF Notebook from any user authorized with JWT.

✏️ A brief description of the changes

  • I'm changing the spec of AuthorizationPolicy/ml-pipeline to allow access if the request contains any trusted principal.
    • There are more instances of AuthorizationPolicy/ml-pipeline across the kubeflow/manifests repository so this has to be made in a few places.
  • I'm adding tests through gh-workflows in which a KF Notebook should start a KF Pipeline Run. This KF Notebook will use PodDefaults allowing access to ml-pipeline.

🐛 If this PR is related to an issue, please put the link of the issue here.

✅ Unit Test Checklist

  • 🛠️ Make sure you have installed kustomize == 5.2.1+
  • ✍️ Have you written new tests for your core changes, as applicable?
  • 🔄 Have you successfully run existing tests with your changes ?
  • 🚀 Have you successfully run existing and new tests with your changes ?

✅ Contributor checklist

@rimolive
Copy link
Member

Looks like it's good to merge, but can you sign-off the commit?

@kromanow94
Copy link
Contributor Author

@rimolive I'm adding tests for gh-actions to run a test pipeline from kf notebook. I have things worked out already, just doing some small cosmetical changes. I'll have it ready in a few hours. Then I'll push to this PR and we should be good to merge.

@kromanow94 kromanow94 force-pushed the fix-ml-pipeline-access-from-kf-notebook branch from 7797226 to e5a45ef Compare June 21, 2024 08:21
@google-oss-prow google-oss-prow bot added size/L and removed size/XS labels Jun 21, 2024
Signed-off-by: Krzysztof Romanowski <[email protected]>
Signed-off-by: Krzysztof Romanowski <[email protected]>
Signed-off-by: Krzysztof Romanowski <[email protected]>
Signed-off-by: Krzysztof Romanowski <[email protected]>
Signed-off-by: Krzysztof Romanowski <[email protected]>
Signed-off-by: Krzysztof Romanowski <[email protected]>
Signed-off-by: Krzysztof Romanowski <[email protected]>
@kromanow94 kromanow94 force-pushed the fix-ml-pipeline-access-from-kf-notebook branch from 1e5473a to 10134df Compare June 21, 2024 09:48
@kromanow94
Copy link
Contributor Author

Hey, so the scope of changes and acceptance criteria seems to be complete but it seems we've hit some limitation of the underlying infra...

Warning  WorkflowNodeError      6s    workflow-controller  Error node addition-pipeline-fnrwn.root.add.executor: Error (exit code 1): failed to put file: Storage backend has reached its minimum free disk threshold. Please delete a few objects to proceed.

https://github.com/kubeflow/manifests/actions/runs/9611836057/job/26511255711?pr=2753

I'm not sure how to proceed from this moment. The test right now fails if the pipeline fails but the test itself can as well just focus on if the pipeline run was successfully created from KF Notebook with plain client = kfp.Client() (so no additional credentials or any configuration to run the test).

@rimolive , @juliusvonkohout , any advice? Do you consider we can drop the requirement to fail the gh action if kf pipeline failed?

@kromanow94
Copy link
Contributor Author

@rimolive , @juliusvonkohout, I know the time is short so I changed the test for now to accept failure of pipeline run. The main goal of verifying if Pipeline Run can be started from KF Notebook without any additional configuration to the kfp.Client has succeeded.

If you accept it in this form, feel free to merge. If not, let me know so I can revert the last commit. Also, I added @juliusvonkohout to my repository because in just a few hours I'm going for a little weekend retreat and I'm not going to be available. If time is critical, feel free to make changes on the branch/PR directly.

@kromanow94
Copy link
Contributor Author

Oh, but now the test has succeeded and the pipeline run didn't fail... Ok, if this fails only sometimes, I prefer to have the test failed if the pipeline failed. We can always retry.

2024-06-21 11:21:35,027 - run_and_wait_for_pipeline - INFO - Pipeline Run finished in state: SUCCEEDED.
2024-06-21 11:21:35,027 - run_and_wait_for_pipeline - INFO - Pipeline Run finished with error: None.

https://github.com/kubeflow/manifests/actions/runs/9612748813/job/26514028934?pr=2753

This reverts commit 32323a4.

Signed-off-by: Krzysztof Romanowski <[email protected]>
@kromanow94 kromanow94 force-pushed the fix-ml-pipeline-access-from-kf-notebook branch from 1bc3d72 to 205db93 Compare June 21, 2024 11:27
@kromanow94
Copy link
Contributor Author

Now it randomly failed again.

System.IO.IOException: No space left on device : '/home/runner/runners/2.317.0/_diag/Worker_20240621-112805-utc.log'
You are running out of disk space. The runner will stop working when the machine runs out of disk space. Free space left: 7 MB

https://github.com/kubeflow/manifests/actions/runs/9612888226?pr=2753

Can somebody retry, please?

@juliusvonkohout
Copy link
Member

Thank you very much. At least we have a test now, which is better than the previous state.

@juliusvonkohout
Copy link
Member

@kimwnasptd any objections?

notValues: ['*']
- from:
- source:
requestPrincipals: ["*"] # allow access by any trusted principal
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the important thing to approve. The rest is rather straightforward.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kimwnasptd wants it in #2747 (comment) so it should be approved by him.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with the changes, although these should be reflected in the kubeflow/pipelines project and not directly in the manifests repo

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, can you track that with them? Otherwise it will break in the next synchronization.

@juliusvonkohout
Copy link
Member

juliusvonkohout commented Jun 24, 2024

@rimolive i will review this PR until Monday July 1 first.

@juliusvonkohout
Copy link
Member

@KRomanov we also need negative test to check that it does not work without the proper token, but just the faked userid header.
Lets do that in a follow up PR.

/lgtm
/approve

since we have to get RC.2 ready and Kimonas approved it in #2747 (comment). We have to do a full assessment next week.

Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: juliusvonkohout

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-prow google-oss-prow bot merged commit 43eec94 into kubeflow:master Jun 25, 2024
5 checks passed
@kromanow94 kromanow94 deleted the fix-ml-pipeline-access-from-kf-notebook branch July 8, 2024 07:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Status: Done
Development

Successfully merging this pull request may close these issues.

4 participants