TELCODOCS-1788: OpenShift 4.15 secrets disabled #27

Merged
merged 3 commits into from
Mar 15, 2024
4 changes: 4 additions & 0 deletions gpu-operator/release-notes.rst
@@ -107,6 +107,10 @@ Fixed issues
Known Limitations
------------------

* On Red Hat OpenShift Container Platform 4.15 clusters that disable the integrated image registry,
  secrets are no longer generated automatically, which causes installation of the Operator to stall.
  Refer to :ref:`special considerations for openshift 4.15` for more information.

* The ``1g.12gb`` MIG profile does not operate as expected on the NVIDIA GH200 GPU when the MIG configuration is set to ``all-balanced``.
* The GPU Driver container does not run on hosts that have a custom kernel with the SEV-SNP CPU feature
  because of the missing ``kernel-headers`` package within the container.
31 changes: 26 additions & 5 deletions openshift/steps-overview.rst
@@ -41,13 +41,34 @@ You can deploy the Operator on a a newly deployed cluster that was not upgraded
.. * OpenShift 4.8.22 and above z-streams
.. * All the versions of OpenShift 4.9 except 4.9.8

.. note::
=========================================
Special Considerations for OpenShift 4.15
=========================================

The Driver Toolkit, which enables entitlement-free deployments of the Operator, is available for certain z-streams on OpenShift
4.8 and all z-streams on OpenShift 4.9. However, some Driver Toolkit images are broken, so we recommend maintaining entitlements for
all OpenShift versions prior to 4.9.9. See :ref:`broken driver toolkit <broken-dtk>` for more information.
In OpenShift 4.15, secrets are no longer automatically generated when the integrated OpenShift image registry is disabled.
Member Author

I honestly got lost about whether the issue was the registry is disabled or if it was that storage for the registry wasn't configured.

For more information, refer to the `OpenShift 4.15 Release Notes <https://docs.openshift.com/container-platform/4.15/release_notes/ocp-4-15-release-notes.html#ocp-4-15-auth-generated-secrets>`__.

You do not need an entitlement on OpenShift Container Platform versions greater than 4.9.9.
This change affects the installation of the NVIDIA GPU Operator.
During installation, the Driver Toolkit daemon set checks for the existence of a ``build-dockercfg`` secret for the Driver Toolkit service account.
When the secret does not exist, the installation stalls.

Run the following command to determine whether your cluster is affected.

.. code-block:: console

   $ oc get configs.imageregistry.operator.openshift.io cluster -o jsonpath='{.spec.storage}{"\n"}'

If the output from the preceding command is empty, ``{}``, then your cluster is affected and you must configure your registry to use storage.
Refer to `Configuring the registry for bare metal <https://docs.openshift.com/container-platform/latest/registry/configuring_registry_storage/configuring-registry-storage-baremetal.html>`__
for information about configuring the registry with a PVC.
For platforms other than bare metal, refer to the additional resources section of the `Image Registry Operator in OpenShift Container Platform <https://docs.openshift.com/container-platform/latest/registry/configuring-registry-operator.html>`__ page.

If the output from the preceding command is anything other than ``{}``, your cluster is not affected.
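
The check described above can be scripted. The following is a minimal sketch, not part of the Operator or of ``oc``: the helper name ``classify_registry_storage`` is hypothetical, and treating a completely empty string the same as ``{}`` is an assumption rather than something the text above states. The helper only classifies the output of the ``oc get`` command shown earlier, so it can be exercised without access to a cluster.

```shell
#!/bin/sh
# Hypothetical helper: classify the jsonpath output of the registry
# storage query shown above. Prints "affected" when no storage is
# configured and "not affected" otherwise.
classify_registry_storage() {
  case "$1" in
    # "{}" is the empty value described in the text; treating an
    # empty string the same way is an assumption of this sketch.
    ""|"{}") echo "affected" ;;
    *)       echo "not affected" ;;
  esac
}

# Usage against a live cluster (requires a logged-in oc session):
# storage="$(oc get configs.imageregistry.operator.openshift.io cluster -o jsonpath='{.spec.storage}')"
# classify_registry_storage "$storage"
```

Command substitution strips the trailing newline, so the helper compares against ``{}`` directly. If the helper reports ``affected``, configure registry storage as described in the linked OpenShift documentation before installing the Operator.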


*********************************
Preparing to Install the Operator
*********************************

- Verify your cluster has the OpenShift Driver Toolkit:
