Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve the KFP / User Guides / Core Functions docs #3795

Merged

Conversation

thesuperzapper
Copy link
Member

@thesuperzapper thesuperzapper commented Jul 7, 2024

This PR makes significant formatting updates and rewording to the Kubeflow Pipelines / User Guides / Core Functions, and Migrate to Kubeflow Pipelines v2 guides.

For the most part this is a formatting update, but I draw reviwers attention to a content change I made on the Connect the SDK to the API page. I have updated the Python script given as an example for connecting the KFP SDK with Dex. I have used the same reference example as in the deployKF docs. (Note, it will work for all other distributions, including manifests, as long as they use Dex).

The most important changes in this PR are:

  • Re-naming the Core Functions pages to have shorter titles to be more clear
  • Re-ordering the Core Functions pages to better tell a story and be readable from top to bottom
  • Significantly improving the Migrate to Kubeflow Pipelines v2 page which was very hard to read before.

The easiest way to review this PR will be to look at the Netlify preview and review the following pages:

After we merge this, we will need to do the same thing for the "Create Components" and "Data Handling" sections.

@google-oss-prow google-oss-prow bot requested review from chensun and hbelmiro July 7, 2024 01:44
@thesuperzapper
Copy link
Member Author

@hbelmiro I would appreciate your review of this.

I believe it significantly improves the readability and usefulness of the "Core Functions" and "Migrate to Kubeflow Pipelines v2" guides.

/assign @hbelmiro

@hbelmiro
Copy link
Contributor

hbelmiro commented Jul 8, 2024

It would be nice if @rimolive and @diegolovison could also take a look.


One of the benefits of KFP is cross-platform portability. The KFP SDK compiles pipeline definitions to [IR YAML][ir-yaml] which can be read and executed by different backends, including the [Kubeflow Pipelines open source backend][oss-be] and [Vertex AI Pipelines](https://cloud.google.com/vertex-ai/docs/pipelines/introduction).
One of the benefits of KFP is cross-platform portability.
The KFP SDK compiles pipeline definitions to [IR YAML][ir-yaml] which can be read and executed by different backends, including the Kubeflow Pipelines [open source backend][oss-be] and [Vertex AI Pipelines](https://cloud.google.com/vertex-ai/docs/pipelines/introduction).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the docs must be doc-neutral. May we remove Vertex AI ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see #3795 (comment)


```python
client.create_run_from_pipeline_package('pipeline.yaml', arguments={'param': 'a', 'other_param': 2})
#from kfp import compiler, dsl
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you like to keep the code commented?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the idea is to highlight that there are two stages to this:

  1. Create the YAML
  2. Submit the YAML

I also dont want people accidentially overwiting files by bildly runing the compile code.

```

See the [KFP SDK Client reference documentation][kfp-sdk-api-ref-client] for a detailed description of the `Client` constructor and method parameters.
## Run Pipeline - KFP CLI
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the first step should be how to connect KFP CLI to the host

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's create a follow up issue, because I am not trying to add significant content in this PR, only improve formatting and layout.

I also don't have that information readily available to write up.

@diegolovison
Copy link
Contributor

Hi @thesuperzapper

I left some comments. It will be good if you agree with my suggestion do it in a new commit. Easy for review :)
We can squash and merge later.

Tks

@@ -26,7 +26,7 @@ be marked with a green "arrow from cloud" icon.
## How to use caching

Caching is enabled by default for all components in KFP. You can disable caching
for a component by calling `.set_caching_options(False)` on a task object.
for a component by calling [`.set_caching_options(enable_caching=False)`](https://kubeflow-pipelines.readthedocs.io/en/latest/source/dsl.html#kfp.dsl.PipelineTask.set_caching_options) on a task object.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't safe to pin this link to a specific SDK version?

@rimolive
Copy link
Member

rimolive commented Jul 8, 2024

I also added some comments but overall lgtm.

* [Using the Kubeflow Pipelines SDK](/docs/components/pipelines/legacy-v1/tutorials/sdk-examples/)
* [Kubeflow Pipelines SDK Reference](https://kubeflow-pipelines.readthedocs.io/en/stable/)
* [Experiment with the Kubeflow Pipelines API](/docs/components/pipelines/legacy-v1/tutorials/api-pipelines/)
<br>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need this <br>?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I think the page is very hard to read without it, it provides separation between two very large sections.


In the following example, we use the `DockerRunner` type. Runner types are covered in more detail below.
Local execution comes with several limitations:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Local execution comes with several limitations:
It comes with several limitations:

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this proposed change makes it unclear what the following bullet points are talking about if you are just skimming the page.


In general, platform-specific plugin libraries provide functions that act on tasks similarly to [task-level configuration methods][task-level-config-methods] provided by the KFP SDK directly. Platform-specific plugin libraries may also provide pre-baked components.
<!-- TODO: add docs on how to create a platform-specific authoring library -->
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than adding a TODO comment, please create an issue.

--- | ---
SDK v1 | The `1.x.x` versions of the [`kfp`](https://pypi.org/project/kfp/1.8.22/) Python SDK.
SDK v2 | The `2.x.x` versions of the [`kfp`](https://pypi.org/project/kfp/) Python SDK.
SDK v1 (v2-namespace) | The preview V2 module that was available in the V1 SDK (e.g. `from kfp.v2 import *`).<br>_Only ever used by [Google Cloud Vertex AI Pipelines][vertex-pipelines] users._
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would avoid mentioning products to be more vendor-neutral.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see #3795 (comment)


## SDK v1 to SDK v2
### **Migrate from 'SDK v1' to 'SDK v2'**

KFP SDK v2 is generally not backward compatible with user code that uses the KFP SDK v1 main namespace. This section describes some of the important breaking changes and migration steps to upgrade to KFP SDK v2.

We indicate whether each breaking change affects [KFP OSS backend][oss-be-v1] users or [Google Cloud Vertex AI Pipelines][vertex-pipelines] users.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would avoid mentioning products to be more vendor-neutral.
Breaking changes that affect Vertex AI should be on the Vertex AI documentation. Can we mention only the breaking changes that affect KFP (and remove the "**Affects:** KFP OSS users" warning)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see #3795 (comment)

@thesuperzapper thesuperzapper force-pushed the improve-kfp-v2-user-guides branch from 4ee4c51 to e740394 Compare August 19, 2024 18:20
@thesuperzapper
Copy link
Member Author

A few people raised concerns about keeping the content which talks about Vertex AI (on the migration guide). Let's discuss this in a follow up issue, the existing page already mentions Vertex AI, and I don't feel comfortable removing it without a larger community discussion that should not block this PR.

Personally, I think it's important to clarify that there are some parts of the Kubeflow Pipelines SDK are only for Vertex AI (and not used by the open source Kubeflow Pipelines). If we as a community disagree with the presence of this code, it should be removed from the SDK, not silently dropped from the documentation.

@thesuperzapper
Copy link
Member Author

I have also fixed an issue which is being reported since the release of 1.9.0 in the Kubeflow Pipelines "Connect the SDK to the API - Outside Cluster" guide:

The issue is simply that Kubeflow 1.9.0 introduced a OIDC "approve grant" screen (which the old script was getting stuck on), see the preview site for the updated code:

@thesuperzapper
Copy link
Member Author

@james-jwu or @chensun or @terrytangyuan can you please approve this important update to the KFP docs.

Among other things, it fixes two critical issues impacting users:

  1. Fixes the "connect SDK outside cluster" script, which is no longer working on Kubeflow 1.9.0:
  2. Improves the KFP V2 migration guide, as users are struggling to follow it right now:

@thesuperzapper thesuperzapper force-pushed the improve-kfp-v2-user-guides branch from f0ddc5b to 0264512 Compare August 20, 2024 02:54

### Example for Dex
{{% alert title="Tip" color="info" %}}
The process to authenticate the Pipelines SDK from outside the cluster will vary by [Kubeflow Distribution](/docs/started/installing-kubeflow/#packaged-distributions) and identity provider.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this DEX part should link to the Kubeflow Platform page in general https://www.kubeflow.org/docs/started/installing-kubeflow/#kubeflow-platform instead of directly to the distributions. Furthermore we should say that below DEX there is oauth2-proxy that supports machine to machine authentication with Kubernetes Tokens.

Copy link
Member Author

@thesuperzapper thesuperzapper Aug 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated link in 95d20ea.

Let's raise a separate PR which adds a "Kubeflow Platform - Dex Token Exchange", or whatever you want to call it, presuming you are talking about the ability of Dex to exchange external JWTs for its own ones:

If you are talking about suggesting people exfiltrate their Kubernetes ServiceAccount tokens to use off-cluster I strongly recommend against that.

@juliusvonkohout
Copy link
Member

juliusvonkohout commented Aug 20, 2024

I think the DEX part in comnect-api.md should link to the Kubeflow Platform page in general https://www.kubeflow.org/docs/started/installing-kubeflow/#kubeflow-platform instead of directly to the distributions. Furthermore we should say that below DEX there is oauth2-proxy that supports machine to machine authentication with Kubernetes tokens. For machine accounts you should rely on service accounts and not usernames+passwords. Oauth2-proxy should have a more stable interface and be more platform/distribution agnostic in the future. But for some experimentation Dex might be enough for new users.

CC also @kimwnasptd for platform and @andreyvelich @terrytangyuan from the KSC since it is a large change.

@juliusvonkohout
Copy link
Member

juliusvonkohout commented Aug 20, 2024

The question is when and how do we want to address the Dex/oauth2-proxy changes in kubeflow/manifests#2844 (comment)

@kimwnasptd @thesuperzapper
@kromanow94 I am fine with presenting both in the documentation and adjusting it slightly as proposed in the comment.

@juliusvonkohout
Copy link
Member

juliusvonkohout commented Aug 20, 2024

@thesuperzapper maybe it makes sense to link to the recently added https://github.com/kubeflow/manifests/blob/master/tests%2Fgh-actions%2Ftest_dex_login.py and adjust it if needed. Because this way we always have it up to date and it is tested regularly. Code on the website can soon deprecate or get out of synchronization.

@thesuperzapper
Copy link
Member Author

@juliusvonkohout we need to focus on getting this merged so the Dex script on the website actually works with Kubeflow 1.9.0. All previous users will be using the dex approach.

Can you confirm that the new script under "Kubeflow Platform - Outside the Cluster" works on your test 1.9.0 clusters?


In terms of future updates like including the m2m stuff on this page, let's do that in a follow-up PR, so we don't block the fix to the dex script.

PS: I think it's more user friendly to include all the information directly on the website, so we can add explanations and comments, and not break if you update your test script.

PSS: once we merge this PR, please update your test login script to the new one from this PR, because its currently disabled on your repo (hence why the script on the website is broken).

@thesuperzapper
Copy link
Member Author

@james-jwu or @chensun can you please approve this critical update to the KFP docs?

It fixes two critical issues impacting users:

  1. Fixes the "connect SDK outside cluster" script, which is no longer working on Kubeflow 1.9.0:
  2. Improves the KFP V2 migration guide, as users are struggling to follow it right now:

@rimolive
Copy link
Member

/lgtm

@juliusvonkohout
Copy link
Member

juliusvonkohout commented Aug 21, 2024

@juliusvonkohout we need to focus on getting this merged so the Dex script on the website actually works with Kubeflow 1.9.0. All previous users will be using the dex approach.

/lgtm
with the linking to Kubeflow platform

Can you confirm that the new script under "Kubeflow Platform - Outside the Cluster" works on your test 1.9.0 clusters?

there is positive user feedback in kubeflow/manifests#2844 (comment) but i did not have the time to test it myself yet

In terms of future updates like including the m2m stuff on this page, let's do that in a follow-up PR, so we don't block the fix to the dex script.

Yes that is fine.

PS: I think it's more user friendly to include all the information directly on the website, so we can add explanations and comments, and not break if you update your test script.

PSS: once we merge this PR, please update your test login script to the new one from this PR, because its currently disabled on your repo (hence why the script on the website is broken).

Please comment in the follow up kubeflow/manifests#2854. Note this does not yet contain your changes here.

@thesuperzapper
Copy link
Member Author

Can @andreyvelich or @terrytangyuan please approve?

We need a root approver for this because we are also changing content from the Katib section (to fix broken links).

We have 2 LGTMs already, and given the very important changes (including fixes the KFP "out of cluster" script for 1.9.0), I think we should merge ASAP.

Copy link
Member

@andreyvelich andreyvelich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/assign @kubeflow/wg-pipeline-leads @james-jwu @zijianjoy
Please check this PR, so we can merge it

@andreyvelich
Copy link
Member

/cc @connor-mccarthy

@james-jwu
Copy link

/lgtm
/approve

Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: james-jwu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-prow google-oss-prow bot merged commit d3ca1b1 into kubeflow:master Aug 27, 2024
6 checks passed
@thesuperzapper thesuperzapper deleted the improve-kfp-v2-user-guides branch August 27, 2024 16:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants