DOCA Platform Framework v24.10.0

This is the first release of the DOCA Platform Framework (DPF), which focuses on streamlining the provisioning and orchestration of NVIDIA BlueField DPUs in Kubernetes environments.

Features

  • Kubernetes integration - extends K8s API to manage BlueField DPUs and DPU services in a declarative manner.
  • Automated cluster-wide orchestration - enables discovery, provisioning, and configuration of BlueField DPU devices in K8s clusters.
  • Software Consistency - synchronizes firmware, OS, DOCA, and DPU services.
  • Lifecycle Management - manages lifecycles of hosts and DPUs seamlessly.
  • Service Orchestration - handles installation, upgrades, and removal of DPU services.
  • Service Function Chaining (SFC) - supports chaining of DPU services on DPF-provisioned DPUs.

Supported DPU Services:

Dependencies

Hardware and Software Requirements

  • DPU Hardware: NVIDIA BlueField-3 DPUs
  • Minimal DOCA BFB Image: DOCA v2.5 or higher (must be pre-installed on DPUs to support DPF provisioning)
  • Provisioned DOCA BFB Image: DOCA v2.9

Refer to Prerequisites for detailed requirements.

Known Issues and Limitations

  1. Title: On initial deployment, DPF CRs may be stuck in an initial state (Pending, Initializing, etc.) and not progress
    Description:
    If DPF CRs are created before the DPF components are running, they may remain stuck in their initial state. Create DPF CRs only after the DPF components have been deployed.
    Internal Ref #4241297
    Workaround:
    Delete any CRs that were created before the DPF system components were deployed, then recreate them.
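
    The delete-and-recreate workaround can be sketched as a small shell helper. This is a minimal sketch, not an official procedure: the function name is hypothetical, and it assumes that the DPF components run as Deployments in the dpf-operator-system namespace and that the affected CRs are all defined in one manifest file.

    ```shell
    # Hypothetical helper: waits until every Deployment in the DPF namespace is
    # Available, then deletes and re-applies the CRs in the given manifest file.
    # Assumption: the DPF components run as Deployments in dpf-operator-system.
    recreate_crs_when_ready() {
      manifest="$1"
      kubectl -n dpf-operator-system wait deployment --all \
        --for=condition=Available --timeout=300s || return 1
      kubectl delete -f "$manifest" --ignore-not-found || return 1
      kubectl apply -f "$manifest"
    }

    # Usage (the file name is a placeholder for your own CR manifest):
    # recreate_crs_when_ready my-dpf-crs.yaml
    ```

    Waiting first keeps the recreated CRs from hitting the same ordering problem again.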

  2. Title: DPUService stuck in its Deleting phase
    Description:
    A DPUService can be stuck in its Deleting phase when a pod on the DPU that was created as part of the DPUService cannot be deleted.
    Internal Ref #4213229
    Workaround:
    Reprovision the DPU where the DPUService is deployed.

  3. Title: DPU Cluster CNI not deployed when hitting docker rate limit
    Description:
    Flannel, the default DPU Cluster CNI, can fail to deploy when the Docker Hub rate limit has been exceeded. This can happen because Flannel is pulled directly from Docker Hub.
    Internal Ref #4214500
    Workaround:
    Use a personal Docker account to pull the images by following the process below:

    1. Create the docker pull image secret:
      1. Create the secret (add your personal user and access token):
        kubectl -n dpf-operator-system create secret docker-registry dockerio-secret --docker-server=docker.io --docker-username=<username> --docker-password=<personal access token>
      2. Label the secret as a DPF image pull secret:
        kubectl -n dpf-operator-system label secret dockerio-secret dpu.nvidia.com/image-pull-secret=""
    2. Disable flannel:
      1. Option 1: On a clean install (before deploying DPF controllers)
        1. Add the secret to the dpfoperatorconfig.spec.ImagePullSecrets list
        2. Disable flannel in the config by setting spec.flannel.disable: true
      2. Option 2: Running cluster
        1. Patch the DPFOperatorConfig object to disable the flannel DPUService:
          kubectl -n dpf-operator-system patch dpfoperatorconfig dpfoperatorconfig -p '{"spec":{"flannel":{"disable":true}}}' --type=merge
        2. Remove the existing flannel DPUService
    3. Deploy Flannel as a DPUService with the correct image pull secret by creating the following DPUService:
      apiVersion: svc.dpu.nvidia.com/v1alpha1
      kind: DPUService
      metadata:
        name: flannel
        namespace: dpf-operator-system
      spec:
        helmChart:
          source:
            repoURL: https://flannel-io.github.io/flannel
            chart: flannel
            version: v0.25.1
          values:
            global:
              imagePullSecrets:
              - name: dockerio-secret
  4. Title: Host DPU PF0 shouldn’t be assigned to workload pods
    Description:
    Creating a NetworkAttachmentDefinition CR with the host-device plugin to assign the DPU physical network device (PF) to a pod breaks the host, and DPF will no longer be able to manage it.
    Internal Ref #4217296
    Workaround:
    Avoid creating a NetworkAttachmentDefinition CR with the host-device plugin to assign the DPU physical network device on the host to a pod. Only assign SR-IOV Virtual Functions (VFs) to workload pods as host-device interfaces.

  5. Title: Incompatible DPU flavor can cause DPU to get into an unstable state
    Description:
    Using an incompatible DPU flavor can cause the DPU Device to get into an error state which requires manual intervention.
    For example, allocating 14GB of hugepages on a DPU with 16GB of memory.
    Internal Ref #4200717
    Workaround:
    Manually provision the DPU, or follow the DOCA troubleshooting documentation to return the DPU to an operational state: https://docs.nvidia.com/networking/display/bfswtroubleshooting

  6. Title: DPUServiceInterface is not Immutable
    Description:
    DPUServiceInterfaces that are used by a DPUDeployment can be manually deleted, which leads to unsupported behavior.
    Internal Ref #4217384
    Workaround:
    Redeploy the DPUDeployment object.

  7. Title: DPUSet which is managed by DPUDeployment can be modified
    Description:
    A DPUSet that is managed by a DPUDeployment can be modified, which leads to a mismatch with the DPUDeployment and to recreation of the old DPUSet when the DPUDeployment is updated.
    Internal Ref #4217345
    Workaround:
    Avoid manually updating a DPUSet that was created from a DPUDeployment.

  8. Title: DPFOperatorConfig CR deletion hangs during uninstallation
    Description:
    When uninstalling DPF, the DPFOperatorConfig CR might hang and not get deleted due to leftover DPUServiceChains objects that need to be manually deleted.
    Internal Ref #4214285
    Workaround:
    Remove all remaining DPUServiceInterfaces.

  9. Title: Traffic loss after reconfiguration of DPUServices with a chain between them
    Description:
    Reconfiguring DPUServices that have a chain between them may cause traffic loss due to outdated service chains.
    Internal Ref #4178445
    Workaround:
    Recreate the SFC object between the services.

  10. Title: Stale ports after DPU reboot
    Description:
    When a DPU is rebooted, the old DPU service ports are not deleted from the DPU’s OVS and remain stale.
    Internal Ref #4174183
    Workaround:
    No workaround, known issue. It should not affect performance.

  11. Title: BFB filename must be unique
    Description:
    If the bfb.spec.filename of BFB CR#1 is the same as the bfb.spec.filename of BFB CR#2, but the two CRs reference different URLs (the actual bfb file to download), then BFB CR#1 would reference the wrong bfb.
    Internal Ref #4143309
    Workaround:
    Use a unique bfb.spec.filename when creating new BFB CRs.
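
    One way to spot this conflict is to list the filename and URL of every existing BFB CR and flag any filename paired with more than one URL. The helper below is a minimal sketch with a hypothetical name. The JSONPath field names in the commented kubectl line are assumptions and should be confirmed against the installed CRD (for example with kubectl explain bfb.spec).

    ```shell
    # Hypothetical helper: reads "filename url" pairs on stdin and prints every
    # filename that is paired with more than one distinct URL.
    find_conflicting_filenames() {
      # sort -u drops exact duplicate pairs, so a count above 1 means distinct URLs
      sort -u | awk '{ count[$1]++ } END { for (f in count) if (count[f] > 1) print f }' | sort
    }

    # Example feed (field paths are assumptions; verify them before use):
    # kubectl get bfb -A -o jsonpath='{range .items[*]}{.spec.fileName}{" "}{.spec.url}{"\n"}{end}' | find_conflicting_filenames
    ```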

  12. Title: DPU Cluster control-plane connectivity is lost when physical port P0 is down on the worker node
    Description:
    A link-down on the p0 port of the DPU results in loss of DPU control plane connectivity for the DPU components.
    Internal Ref #3751863
    Workaround:
    Make sure the P0 link is up on the DPU. If it is down, either restart the DPU or refer to the DOCA troubleshooting documentation: https://docs.nvidia.com/networking/display/bfswtroubleshooting

  13. Title: DPU Provisioning operations are not retried
    Description:
    DPU provisioning operations are not retried. A small environment glitch can therefore leave a DPU object in an error phase, even though a retry would have succeeded.
    Internal Ref #4202272
    Workaround:
    Delete the DPU object that is in an error phase. It is then recreated and provisioning starts again from scratch.
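
    Finding the DPU objects to delete can be sketched as a filter over name and phase pairs. The helper name is hypothetical. The phase string "Error", the .status.phase field, and the namespace in the commented kubectl lines are all assumptions to check against your cluster.

    ```shell
    # Hypothetical helper: reads "name phase" pairs on stdin and prints the names
    # whose phase equals $1 (default "Error"; the exact phase string is an assumption).
    names_in_phase() {
      awk -v want="${1:-Error}" '$2 == want { print $1 }'
    }

    # Example feed (namespace and status field are assumptions; adjust as needed):
    # kubectl -n dpf-operator-system get dpu -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.phase}{"\n"}{end}' | names_in_phase Error
    ```

    Review the printed names before passing them to kubectl delete dpu, since deleting a DPU object restarts its provisioning.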

  14. Title: Changes to a DPU Object will trigger reprovisioning
    Description:
    Any edit to a DPU object causes it to be recreated, even if the change is small and doesn’t require DPU reprovisioning, such as changing the labels on the Node for the DPU in the DPU Cluster.
    Internal Ref #4206910
    Workaround:
    No workaround, known limitation.

  15. Title: Cluster MTU value cannot be dynamically changed
    Description:
    It is possible to deploy a DPF cluster with a custom MTU value. However, once the cluster is deployed, the MTU value, which is applied on multiple distributed components, cannot be modified.
    Internal Ref #4241316
    Workaround:
    Uninstall DPF and re-install from scratch using the new MTU value.

  16. Title: nvidia-k8s-ipam and servicechainset-controller DPF system DPUServices are in “Pending” phase
    Description:
    As long as there are no provisioned DPUs in the system, the nvidia-k8s-ipam and servicechainset-controller will appear as not ready / pending when querying dpuservices.
    This has no impact on performance or functionality since DPF system components are only relevant when there are DPUs to provision services on.
    Internal Ref #4241324
    Workaround: No workaround, known issue

  17. Title: DPU object in pending phase without clear reason
    Description:
    A DPU object waits in the pending phase if the BFB object it references doesn't exist or is not in the Ready phase. This is the required behavior, but there is no clear indication of it on the DPU object.
    Internal Ref #4203694
    Workaround:
    Fix the referenced BFB object.

  18. Title: [OVN Kubernetes DPUService] Nodes marked as NotReady
    Description:
    When installing OVN Kubernetes as a CNI on a node running containerd version 1.7.0 or above, the Node never becomes ready.
    Internal Ref #4241282
    Workaround:
    Option 1: Use containerd version below v1.7.0 when using OVN Kubernetes as a primary CNI.
    Option 2: Manually restart containerd on the host.
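
    To tell whether a node falls in the affected range, the containerd version can be compared against 1.7.0. The helper below is a minimal sketch with a hypothetical name. It relies on GNU sort -V for version ordering, and the commented usage assumes a systemd-managed containerd whose version string is the third field of the containerd --version output.

    ```shell
    # Hypothetical helper: succeeds (exit 0) when the given containerd version is
    # 1.7.0 or newer, i.e. in the range this issue describes.
    containerd_is_affected() {
      version="${1#v}"   # accept "v1.7.2" as well as "1.7.2"
      lowest=$(printf '%s\n%s\n' "1.7.0" "$version" | sort -V | head -n 1)
      [ "$lowest" = "1.7.0" ]
    }

    # Usage on a node (output format and systemd unit name are assumptions):
    # if containerd_is_affected "$(containerd --version | awk '{print $3}')"; then
    #   sudo systemctl restart containerd   # Option 2 above
    # fi
    ```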

  19. Title: [OVN Kubernetes DPUService] Fragmented packets are silently dropped when using custom MTU
    Description:
    When using custom MTU configuration, workload packets which require IP fragmentation (with size exceeding configured custom MTU) will be silently dropped.
    Internal Ref #4241299
    Workaround:
    None. For custom MTU configuration follow the DPF Reference Deployment Guide.

  20. Title: [OVN Kubernetes DPUService] control plane node is not functional after reboot or network restart
    Description:
    During OVN Kubernetes CNI installation on the control plane nodes, the management interface is moved with its IP into a newly created OVS bridge.
    Since this network configuration is not persistent, it is lost when the node or its network is restarted.
    Internal Ref #4241306
    Workaround:

    1. Pre-define the OVS bridge on each control plane node with the OOB port MAC and IP address and ensure it gets a persistent IP

      # Ubuntu example for netplan persistent network configuration:
      network:
        ethernets:
          oob:
            match:
              # the mac address of the oob
              macaddress: xx:xx:xx:xx:xx:xx
            set-name: oob
        bridges:
          br0:
            # netplan expects a list of addresses
            addresses: [x.x.x.x/x]
            interfaces: [oob]
            # the mac address of the oob
            macaddress: xx:xx:xx:xx:xx:xx
            openvswitch: {}
        version: 2
    2. Set OVS bridge "bridge-uplink" in OVS metadata.

      ovs-vsctl br-set-external-id br0 bridge-id br0 -- br-set-external-id br0 bridge-uplink oob
  21. Title: [OVN Kubernetes DPUService] Only a single OVN-Kubernetes DPU service version can be deployed across the cluster
    Description:
    OVN-Kubernetes service does not fully support customization using Helm parameters, meaning we only support a single OVN-Kubernetes DPU service across the entire cluster.
    Internal Ref #4209524
    Workaround:
    No workaround, known limitation.

  22. Title: [OVN Kubernetes DPUService] Lost traffic from workloads to control plane components or K8S services after DPU reboot, port flapping, OVS restart or manual network configuration
    Description:
    Connectivity issues between workload pods and control plane components or K8S services may occur after the following events: DPU reboot without host reboot, high-speed port flapping (link down/up), OVS restart, or a DPU network configuration change (for example, using the "netplan apply" command on the DPU).
    After any of these events, the network configuration that was applied by the OVN CNI components on the DPU may be lost and is not reapplied automatically.
    Internal Ref #4191019
    Workaround:
    Recreate the ovnk8s node pod on the host to reapply the configuration.

  23. Title: [OVN Kubernetes DPUService] host network configuration may result in lost traffic from host workloads (on overlay)
    Description:
    When the host network is changed (for example, with netplan apply), the custom network configuration applied by the host CNI components may be lost and is not reapplied automatically.
    Internal Ref #4188044
    Workaround:
    Recreate the ovnk8s node pod on the host to reapply the configuration.

  24. Title: [OVN DPUService] Traffic between workloads stops working after 5 minutes
    Description:
    Traffic between workloads (not hostnetwork) stops working 5 minutes after workload creation.
    Internal Ref #4224295
    Workaround:
    Set the lease time of the DHCP server for the high-speed network fabric (the fabric connected to the DPU ports) to a short interval (less than 5 minutes).

  25. Title: [OVN DPUService] No network connectivity for SR-IOV accelerated workload pods after DPU reboot
    Description:
    An SR-IOV accelerated workload pod loses its VF interface upon DPU reboot. The VF is available on the host but is not injected back into the pod.
    Internal Ref #4234474
    Workaround:
    Recreate the SR-IOV accelerated workload pods.

  26. Title: [OVN DPUService] DPU re-provisioning or VTEP IP change may result in lost traffic between cluster components
    Description:
    In an L2 setup with OVN, a DPU VTEP IP address change caused by DPU re-provisioning or DHCP IP re-assignment may lead to lost traffic between control plane, workers and K8s services components, due to stale DPU VTEP IP information on ovnk8s components across the cluster.
    Internal Ref #4238288, #4241388
    Workaround:
    Recreate the ovnkube-node pods on all control plane nodes and the ovnk8s DPUService to make sure all ovnk8s components are aligned with the new VTEP IPs.

  27. Title: [HBN DPUService] HBN DPUService cannot dynamically reload configurations
    Description:
    When the HBN configuration is updated through a configmap, the running HBN container won’t reload it and needs to be restarted with the new configuration.
    Internal Ref #4217788
    Workaround:
    Recreate the HBN DPUService after changing the configuration.

  28. Title: [HBN DPUService] Invalid HBN configuration is not reflected to user in case it is syntactically valid
    Description:
    If the HBN YAML configuration is syntactically valid but contains values that are illegal from an NVUE perspective, the HBN service will start with the last known valid configuration, and the problem won’t be reflected to the end user.
    Internal Ref #
    Workaround:
    No workaround, known issue.

  29. Title: [HBN + OVN DPUServices] HBN service restart on DPU causes worker to lose traffic
    Description:
    If the HBN pod on the DPU resets, the workloads on the host (any traffic on the OVN overlay) will not receive traffic.
    Internal Ref #4220185, #4223176
    Workaround:
    Power cycle the host.

  30. Title: [DTS DPUService] DTS appears as OutOfSync
    Description:
    When creating a DPUDeployment for DTS DPU service, the DPUService object can be marked as OutOfSync although the pods are running on the DPUs.
    Internal Ref #4182929
    Workaround:
    No workaround, known issue.