Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved invoking documentation #287

Merged
merged 1 commit into from
Jan 21, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions docs/invoking-async.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,4 +9,6 @@ using asynchronous requests for every file uploaded to the bucket, which generat

These jobs will be queued up in the Kubernetes scheduler and will be processed whenever there are resources available. If OSCAR cluster has been deployed as an elastic Kubernetes cluster (see [Deployment with IM](https://docs.oscar.grycap.net/deploy-im-dashboard/)), then new Virtual Machines will be provisioned (up to the maximum number of nodes defined) in the underlying Cloud platform and seamlessly integrated in the Kubernetes clusters to proceed with the execution of jobs. These nodes will be terminated as the worload is reduced. Notice that the output files can be stores in MinIO or in any other storage back-end supported by the [FaaS supervisor](oscar-service.md#faas-supervisor).

Note that if your OSCAR service runs an AI model for inference, each job will load the AI model weights before performing the inference. You can mitigate this penalty by adjusting the inference code to process a compressed file with several images.

If you want to process a large number of data files, consider using [OSCAR Batch](https://github.com/grycap/oscar-batch), a tool designed to perform batch-based processing in OSCAR clusters. It includes a coordinator tool where the user provides a MinIO bucket containing files for processing. This service calculates the optimal number of parallel service invocations that can be accommodated within the cluster, according to its current status, and distributes the image processing workload accordingly among the service invocations. This is mainly intended to process large amounts of files, for example, historical data.
4 changes: 3 additions & 1 deletion docs/invoking.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,9 +9,11 @@ OSCAR services can be executed:

After reading the different service execution types, take into account the following considerations to better decide the most appropriate execution type for your use case:

* **Scalability**: Asynchronous invocations provide the best throughput when dealing with multiple concurrent data processing requests, since these are processed by independent jobs which are managed by the Kubernetes scheduler. A two-level elasticity approach is used (increase in the number of pods and increase in the number of Virtual Machines, if the OSCAR cluster was configured to be elastic). This is the recommended approach when each processing request exceeds the order of tens of seconds.
* **Scalability**: Asynchronous invocations provide the best throughput when dealing with multiple concurrent data processing requests, since these are processed by independent jobs which are managed by the Kubernetes scheduler. A two-level elasticity approach is used (increase in the number of pods and increase in the number of Virtual Machines, if the OSCAR cluster was configured to be elastic). This is the recommended approach when each processing request exceeds the order of tens of seconds.

* **Reduced Latency** Synchronous invocations are oriented for short-lived (< tens of seconds) bursty requests. A certain number of containers can be configured to be kept alive to avoid the performance penalty of spawning new ones while providing an upper bound limit (see [`min_scale` and `max_scale` in the FDL](fdl.md#synchronoussettings), at the expense of always consuming resources in the OSCAR cluster. If the processing file is in the order of several MBytes it may not fit in the payload of the HTTP request.

* **Easy Access** For services that provide their own user interface or their own API, exposed services provide the ability to execute them in OSCAR and benefit for an auto-scaled configuration in case they are [stateless](https://en.wikipedia.org/wiki/Service_statelessness_principle). This way, users can directly access the service using its well-known interfaces by the users.

If you want to process a large number of files, please consider using [OSCAR-batch](https://github.com/grycap/oscar-batch).