Sync with upstream main #456
…rve#4012) * Fix readiness probe logic and update test scenarios for HTTPGet, TCPSocket, and Exec handling Signed-off-by: Snehomoy <[email protected]> * Update: Refactor logic for readiness probe handling Signed-off-by: Snehomoy <[email protected]> * Apply gofmt formatting to agent_injector.go Signed-off-by: Snehomoy <[email protected]> * Added logger to replace fmt.Printf for better consistency and observability Signed-off-by: Snehomoy <[email protected]> * Formatted file using goimports with -local Signed-off-by: Snehomoy <[email protected]> --------- Signed-off-by: Snehomoy <[email protected]>
) (kserve#4018) * Feat: Fix memory issue by replacing io.ReadAll with io.Copy (kserve#4017) Previously, io.ReadAll was causing out-of-memory problems when downloading large files from GCS. This change replaces io.ReadAll() with io.Copy() to stream data and prevent excessive memory usage. Signed-off-by: ops-jaeha <[email protected]> * Feat: Fix add newline at end of file to satisfy golang lint Signed-off-by: ops-jaeha <[email protected]> * Feat: Refact log Info for golang lint (kserve#4017) Signed-off-by: ops-jaeha <[email protected]> --------- Signed-off-by: ops-jaeha <[email protected]>
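The commit above swaps Go's io.ReadAll for io.Copy so that large GCS objects are streamed rather than held in memory. The same idea can be sketched in Python, used here only as an illustration language: `shutil.copyfileobj` plays the role of io.Copy, and the function names `download_buffered` and `download_streaming` are hypothetical, not taken from the KServe code.

```python
import io
import shutil


def download_buffered(src, dst):
    """Read the whole source into memory, then write it (the pattern being replaced)."""
    data = src.read()  # memory use grows with the object size
    dst.write(data)


def download_streaming(src, dst, chunk_size=64 * 1024):
    """Copy src to dst in fixed-size chunks (the streaming pattern)."""
    # At most chunk_size bytes are requested from src per read.
    shutil.copyfileobj(src, dst, chunk_size)


if __name__ == "__main__":
    payload = b"x" * 200_000
    out = io.BytesIO()
    download_streaming(io.BytesIO(payload), out)
    print(len(out.getvalue()))
```

Both functions produce the same bytes; the difference is that the streaming version never asks for more than one chunk at a time, which is what keeps memory bounded for large files.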
chore: Fix CVE-2024-26130 - NULL Pointer Dereference - Upgrade cryptography to version 42.0.4 or higher. Update Python version to match KServe 0.14.0 Update tensorflow, tensorflow-io-gcs-filesystem and dill libraries Signed-off-by: Spolti <[email protected]>
Signed-off-by: Sivanantham Chinnaiyan <[email protected]>
Signed-off-by: Sivanantham Chinnaiyan <[email protected]>
Signed-off-by: Kursat Aktas <[email protected]>
…rve#4024) * Fix huggingface server not working with return_probabilities Signed-off-by: oplushappy <[email protected]> * Fix pytest huggingface server assertion error Signed-off-by: oplushappy <[email protected]> * Fix the lint error and Add approx for assertion Signed-off-by: oplushappy <[email protected]> * Parse string output to dictionary for accurate assertion Signed-off-by: oplushappy <[email protected]> * Fix linting error Signed-off-by: oplushappy <[email protected]> --------- Signed-off-by: oplushappy <[email protected]>
* Add deeper readiness and liveness check for transformer Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Add unit tests Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * put the feature behind flag Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Update tests Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * resolve comments Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Make use of inference client Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Add e2e test Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Make inference client singleton and lazy initialize Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Raise 503 If server is not ready / live Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Add test for custom transformer with rest protocol Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Fix CI running out of space Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Increase memory limit Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Check for model ready Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Webhook debug Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Address reviews Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Check for retry count in grpc client Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Update python/kserve/kserve/model_server.py Co-authored-by: Dan Sun <[email protected]> Signed-off-by: Sivanantham <[email protected]> --------- Signed-off-by: Sivanantham Chinnaiyan <[email protected]> Signed-off-by: Sivanantham <[email protected]> Co-authored-by: Dan Sun <[email protected]>
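A minimal sketch of the behaviour this commit series describes: a transformer whose readiness optionally depends on its predictor, with the check behind a flag and a 503 when the server is not ready. The function name `readiness`, its parameters, and the response body are assumptions for illustration; the actual KServe model server code is not shown in this PR page.

```python
from http import HTTPStatus
from typing import Callable, Tuple


def readiness(
    local_ready: bool,
    predictor_ready: Callable[[], bool],
    deep_check: bool,
) -> Tuple[int, dict]:
    """Return (status_code, body) for a readiness probe.

    deep_check stands in for the feature flag: when it is off, only the
    transformer's own state is considered and the predictor is never called.
    """
    ready = local_ready
    if ready and deep_check:
        # Hypothetical callable that asks the predictor whether it is ready.
        ready = predictor_ready()
    if not ready:
        return HTTPStatus.SERVICE_UNAVAILABLE.value, {"ready": False}
    return HTTPStatus.OK.value, {"ready": True}


if __name__ == "__main__":
    print(readiness(True, lambda: False, deep_check=True))
    print(readiness(True, lambda: False, deep_check=False))
```

Passing the predictor check as a callable keeps it lazy, so the downstream call only happens when the flag is on and the transformer itself is already ready.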
…#4006) chore: Fix CVE-2024-47874 Signed-off-by: Spolti <[email protected]>
remove duplicated import Signed-off-by: carlory <[email protected]>
Signed-off-by: Sivanantham Chinnaiyan <[email protected]>
* add storageaccesskey to azure env builder Signed-off-by: bentohset <[email protected]> * update integration and unit test for azure storage access key Signed-off-by: bentohset <[email protected]> * fix formatting Signed-off-by: bentohset <[email protected]> --------- Signed-off-by: bentohset <[email protected]>
* support single digit azure zone id Signed-off-by: bentohset <[email protected]> * add single digit azure dns zone id tests Signed-off-by: bentohset <[email protected]> * fix formatting Signed-off-by: bentohset <[email protected]> --------- Signed-off-by: bentohset <[email protected]>
* Fix trust_remote_code not passed in encoder model Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Add test Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Fix name conflict in e2e test Signed-off-by: Sivanantham Chinnaiyan <[email protected]> --------- Signed-off-by: Sivanantham Chinnaiyan <[email protected]> Signed-off-by: Sivanantham <[email protected]>
* introduce the prepare-for-release.sh script chore: The purpose of this script is to facilitate the release process by updating the KServe version everywhere that is necessary. fixes kserve#3399 Signed-off-by: Spolti <[email protected]> * review - update release_process_v2.md Signed-off-by: Spolti <[email protected]> * Update hack/prepare-for-release.sh Co-authored-by: Dan Sun <[email protected]> Signed-off-by: Filippe Spolti <[email protected]> * Update hack/prepare-for-release.sh Signed-off-by: Filippe Spolti <[email protected]> * Update hack/prepare-for-release.sh Signed-off-by: Filippe Spolti <[email protected]> --------- Signed-off-by: Spolti <[email protected]> Signed-off-by: Filippe Spolti <[email protected]> Signed-off-by: Dan Sun <[email protected]> Co-authored-by: Dan Sun <[email protected]>
* LocalModelNode Daemonset Controller Skeleton (kserve#4026) * hello world controller Signed-off-by: Gavin Li <[email protected]> * go fmt Signed-off-by: Gavin Li <[email protected]> * daemonset Signed-off-by: Gavin Li <[email protected]> * Update Makefile Co-authored-by: Jin Dong <[email protected]> Signed-off-by: Gavin Li <[email protected]> * make generate Signed-off-by: Gavin Li <[email protected]> * install LocalModelNode CRD Signed-off-by: Gavin Li <[email protected]> * feedback Signed-off-by: Gavin Li <[email protected]> * make manifests Signed-off-by: Gavin Li <[email protected]> * agent Signed-off-by: Gavin Li <[email protected]> Co-authored-by: Jin Dong <[email protected]> * LocalModelController creates LocalModelNode resource for ready nodes (kserve#4036) * Manage localmodelNode Signed-off-by: Jin Dong <[email protected]> * Update patch Signed-off-by: Jin Dong <[email protected]> * Fix rbac Signed-off-by: Jin Dong <[email protected]> * Add a test to controller_test.go Signed-off-by: Jin Dong <[email protected]> * Update pkg/controller/v1alpha1/localmodel/controller.go Co-authored-by: Dan Sun <[email protected]> Signed-off-by: Jin Dong <[email protected]> --------- Signed-off-by: Jin Dong <[email protected]> Co-authored-by: Dan Sun <[email protected]> * Delete from LocalModelNode when the localmodel is deleted (kserve#4053) * Delete model from LocalModelNode Signed-off-by: Jin Dong <[email protected]> * Cleanup code Signed-off-by: Jin Dong <[email protected]> * Cleanup code Signed-off-by: Jin Dong <[email protected]> * Fix lint Signed-off-by: Jin Dong <[email protected]> * Initializer node status map Signed-off-by: Jin Dong <[email protected]> * Address comments Signed-off-by: Jin Dong <[email protected]> --------- Signed-off-by: Jin Dong <[email protected]> * Update Model status from LocalModelNode status (kserve#4056) * Delete model from LocalModelNode Signed-off-by: Jin Dong <[email protected]> * Cleanup code Signed-off-by: Jin Dong <[email protected]> 
* Cleanup code Signed-off-by: Jin Dong <[email protected]> * Fix lint Signed-off-by: Jin Dong <[email protected]> * Initializer node status map Signed-off-by: Jin Dong <[email protected]> * Update status Signed-off-by: Jin Dong <[email protected]> * Update localmodel node status Signed-off-by: Jin Dong <[email protected]> * Remove job dependency from localmodel controller Signed-off-by: Jin Dong <[email protected]> * Remove some unused lines Signed-off-by: Jin Dong <[email protected]> * Add comments Signed-off-by: Jin Dong <[email protected]> --------- Signed-off-by: Jin Dong <[email protected]> * LocalModelNode Agent that creates download jobs and update statuses from jobs (kserve#4075) * download working Signed-off-by: Gavin Li <[email protected]> * delete working Signed-off-by: Gavin Li <[email protected]> * cleanup Signed-off-by: Gavin Li <[email protected]> * gofmt Signed-off-by: Gavin Li <[email protected]> * Delete model from LocalModelNode Signed-off-by: Jin Dong <[email protected]> * Cleanup code Signed-off-by: Jin Dong <[email protected]> * Fix lint Signed-off-by: Jin Dong <[email protected]> * Initializer node status map Signed-off-by: Jin Dong <[email protected]> * Update status Signed-off-by: Jin Dong <[email protected]> * Update localmodel node status Signed-off-by: Jin Dong <[email protected]> * Remove job dependency from localmodel controller Signed-off-by: Jin Dong <[email protected]> * Remove some unused lines Signed-off-by: Jin Dong <[email protected]> * Add comments Signed-off-by: Jin Dong <[email protected]> * Update manager Signed-off-by: Jin Dong <[email protected]> * Update rbac Signed-off-by: Jin Dong <[email protected]> * Add tests and temporarily remove delete models code Signed-off-by: Jin Dong <[email protected]> * Do not create download jobs if model is already downloaded Signed-off-by: Jin Dong <[email protected]> * remove misleading log line Signed-off-by: Jin Dong <[email protected]> * Clean up code a little bit Signed-off-by: Jin
Dong <[email protected]> * Update configurations Signed-off-by: Jin Dong <[email protected]> * update test Signed-off-by: Jin Dong <[email protected]> * Use a fixed name for the download container Signed-off-by: Jin Dong <[email protected]> --------- Signed-off-by: Gavin Li <[email protected]> Signed-off-by: Jin Dong <[email protected]> Co-authored-by: Gavin Li <[email protected]> * Delete models from local disk when they are not in LocalModelNode spec (kserve#4084) * download working Signed-off-by: Gavin Li <[email protected]> * delete working Signed-off-by: Gavin Li <[email protected]> * Delete model from LocalModelNode Signed-off-by: Jin Dong <[email protected]> * Initializer node status map Signed-off-by: Jin Dong <[email protected]> * Update status Signed-off-by: Jin Dong <[email protected]> * Update localmodel node status Signed-off-by: Jin Dong <[email protected]> * Update manager Signed-off-by: Jin Dong <[email protected]> * Update rbac Signed-off-by: Jin Dong <[email protected]> * Add tests and temporarily remove delete models code Signed-off-by: Jin Dong <[email protected]> * Do not create download jobs if model is already downloaded Signed-off-by: Jin Dong <[email protected]> * Delete function Signed-off-by: Jin Dong <[email protected]> * Update configurations Signed-off-by: Jin Dong <[email protected]> * Add test and Fix deletion code Signed-off-by: Jin Dong <[email protected]> * Use a fixed name for the download container Signed-off-by: Jin Dong <[email protected]> * Remove deleted models from status and periodically trigger reconciliation Signed-off-by: Jin Dong <[email protected]> * Fix storagecontainer permissions and a minor change Signed-off-by: Jin Dong <[email protected]> --------- Signed-off-by: Gavin Li <[email protected]> Signed-off-by: Jin Dong <[email protected]> Co-authored-by: Gavin Li <[email protected]> --------- Signed-off-by: Jin Dong <[email protected]> Signed-off-by: Gavin Li <[email protected]> Co-authored-by: Gavin Li <[email 
protected]> Co-authored-by: Jin Dong <[email protected]>
storage containers typo fix Signed-off-by: Andrews Arokiam <[email protected]>
Support datetime object in v1/v2 response Signed-off-by: Sivanantham Chinnaiyan <[email protected]>
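One common way to let a JSON response carry datetime values is a fallback encoder that emits ISO 8601 strings. The sketch below uses only the standard library `json` module; it is an assumption about the general approach, not the encoder KServe actually uses, and `encode_response` is a hypothetical name.

```python
import json
from datetime import datetime, timezone


def _default(obj):
    """Fallback for objects the json module cannot serialize natively."""
    if isinstance(obj, datetime):
        return obj.isoformat()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


def encode_response(payload: dict) -> str:
    """Serialize a response body, converting datetime values to ISO 8601 strings."""
    return json.dumps(payload, default=_default)


if __name__ == "__main__":
    ts = datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
    print(encode_response({"predictions": [ts]}))
```

Types other than datetime still raise TypeError, so unsupported objects fail loudly instead of being silently stringified.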
Signed-off-by: Sivanantham Chinnaiyan <[email protected]>
Signed-off-by: Sivanantham Chinnaiyan <[email protected]>
* Update ClusterLocalModel to LocalModelCache Signed-off-by: Dan Sun <[email protected]> * Fix generation fmt Signed-off-by: Dan Sun <[email protected]> * black fmt Signed-off-by: Dan Sun <[email protected]> * Fix generated code Signed-off-by: Dan Sun <[email protected]> * Run go mod tidy Signed-off-by: Dan Sun <[email protected]> * Fix model status Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: Dan Sun <[email protected]>
* Fix LocalModel controller reconciles deleted resource Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Rebase Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Fix path base routing e2e workflow Signed-off-by: Sivanantham Chinnaiyan <[email protected]> --------- Signed-off-by: Sivanantham Chinnaiyan <[email protected]>
…erve#4003) * Requeue and then double check the Pending status Signed-off-by: Hannah DeFazio <[email protected]> * Add test case, fix old tests Signed-off-by: Hannah DeFazio <[email protected]> * Check the return value for PropagateModelStatus, add knative failure case Signed-off-by: Hannah DeFazio <[email protected]> --------- Signed-off-by: Hannah DeFazio <[email protected]> Co-authored-by: Hannah DeFazio <[email protected]>
* init Signed-off-by: Gavin Li <[email protected]> * broken code Signed-off-by: Gavin Li <[email protected]> * register webhook Signed-off-by: Gavin Li <[email protected]> * rename + working Signed-off-by: Gavin Li <[email protected]> * pass in client Signed-off-by: Gavin Li <[email protected]> * check storageURI Signed-off-by: Gavin Li <[email protected]> --------- Signed-off-by: Gavin Li <[email protected]>
…art (kserve#4111) add localmodelnode agent image Signed-off-by: Rituraj Singh <[email protected]> Co-authored-by: Rituraj Singh <[email protected]>
* added vllm cpu image dockerfile Signed-off-by: ayush <[email protected]> * updated predictor controller to add '-gpu' suffix to huggingfaceserver image tag for GPU deployments Signed-off-by: ayush <[email protected]> * cleanup Signed-off-by: ayush <[email protected]> * added unit testcase for UpdateImageTag util Signed-off-by: ayush <[email protected]> * added documentation for vLLM CPU support Signed-off-by: ayush <[email protected]> * updated vllm-cpu example with llama 3.1 model Signed-off-by: ayush <[email protected]> * modified dockerfile to use vllm requirements-build to install dependencies Signed-off-by: ayush <[email protected]> * shifted to use vLLM with OpenVINO for CPU workloads Signed-off-by: ayush <[email protected]> * upgraded vllm and torch versions for huggingfaceserver Signed-off-by: ayush <[email protected]> * change base image to ubuntu Signed-off-by: ayush <[email protected]> * addressed comments in dockerfile and github workflow Signed-off-by: ayush <[email protected]> * added e2e test case Signed-off-by: ayush <[email protected]> * added huggingface_server_cpu_openvino image build in CI Signed-off-by: ayush <[email protected]> * updated poetry version Signed-off-by: ayush <[email protected]> * done linting Signed-off-by: ayush <[email protected]> * ran poetry lock --no-update Signed-off-by: ayush <[email protected]> * ran black formatting Signed-off-by: ayush <[email protected]> * removed huggingface server gpu image build in e2e tests Signed-off-by: ayush <[email protected]> * made separate job for e2e test of huggingface server vllm backend Signed-off-by: ayush <[email protected]> * updated vllm completion response in test Signed-off-by: ayush <[email protected]> * added vllm marker in pytest.ini file Signed-off-by: ayush <[email protected]> * reverted to vLLM v0.6.3.post1 Signed-off-by: ayush <[email protected]> * added vllm-openvino limitations in documentation Signed-off-by: ayush <[email protected]> * updated poetry lock 
Signed-off-by: ayush <[email protected]> --------- Signed-off-by: ayush <[email protected]> Signed-off-by: Ayush Sawant <[email protected]>
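The commit list above mentions adding a '-gpu' suffix to the huggingfaceserver image tag for GPU deployments, with a unit test for an UpdateImageTag util. The logic might look roughly like the following Python sketch. The real util lives in the Go controller and is not shown here, so the function name `update_image_tag`, its parameters, and the handling of untagged or digest-pinned images are all assumptions.

```python
def update_image_tag(image: str, gpu_requested: bool, suffix: str = "-gpu") -> str:
    """Append suffix to the image tag when a GPU is requested (hypothetical helper)."""
    if not gpu_requested:
        return image
    if "@" in image:
        # Digest-pinned reference (name@sha256:...); leave untouched.
        return image
    name, sep, tag = image.rpartition(":")
    if not sep or "/" in tag:
        # No tag present; a colon followed by a path is a registry port.
        return image
    if tag.endswith(suffix):
        return image  # already a GPU tag, stay idempotent
    return f"{name}:{tag}{suffix}"


if __name__ == "__main__":
    print(update_image_tag("kserve/huggingfaceserver:v0.14.0", gpu_requested=True))
```

Leaving untagged and digest-pinned images alone is a conservative choice made for this sketch, since rewriting them would change which image gets pulled in ways the commit message does not describe.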
Signed-off-by: datta0 <[email protected]>
* chore: use patch instead of update for finalizer changes Signed-off-by: Derek Wang <[email protected]> * go mod tidy Signed-off-by: Derek Wang <[email protected]> * lint Signed-off-by: Derek Wang <[email protected]> --------- Signed-off-by: Derek Wang <[email protected]> Signed-off-by: Dan Sun <[email protected]> Co-authored-by: Dan Sun <[email protected]>
* Fix localmodelcache permission for isvc Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Patch localmodelcache webhook for kubeflow overlay Signed-off-by: Sivanantham Chinnaiyan <[email protected]> --------- Signed-off-by: Sivanantham Chinnaiyan <[email protected]>
831042d to 4ad78f2
/rerun-all
test
Signed-off-by: Edgar Hernández <[email protected]>
@hdefazio: The following tests failed, say
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: hdefazio, israel-hdez
What this PR does / why we need it:
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged): Fixes #
Type of changes
Please delete options that are not relevant.
Feature/Issue validation/testing:
Please describe the tests that you ran to verify your changes and relevant result summary. Provide instructions so it can be reproduced.
Please also list any relevant details for your test configuration.
Test A
Test B
Logs
Special notes for your reviewer:
Checklist:
Release note:
Re-running failed tests
/rerun-all - rerun all failed workflows.
/rerun-workflow <workflow name> - rerun a specific failed workflow. Only one workflow name can be specified. Multiple /rerun-workflow commands are allowed per comment.