
Cannot run benchmarks in k8s due to excessive spilling & OOM #44

Closed
andygrove opened this issue Nov 14, 2024 · 3 comments

Comments

@andygrove
Member

andygrove commented Nov 14, 2024

I cannot get benchmarks running in k8s. I suspect that too many tasks are being scheduled in parallel.

I added resource constraints in the code:

@ray.remote(num_cpus=1)
def execute_query_stage(
    ...

@ray.remote(num_cpus=1)
def execute_query_partition(
    ...
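Declaring `num_cpus=1` tells Ray's scheduler to admit a task only when a full logical CPU is free on some node, so parallelism is capped at the node's advertised CPU count. A minimal stdlib sketch of that admission behavior (this is an illustration, not Ray's scheduler; all names here are made up for the example):

```python
import threading
from concurrent.futures import ThreadPoolExecutor

NODE_CPUS = 4       # matches rayStartParams num-cpus: "4" below
TASK_NUM_CPUS = 1   # the @ray.remote(num_cpus=1) declaration

# Ray admits floor(node_cpus / task_num_cpus) tasks concurrently per node.
max_concurrent = NODE_CPUS // TASK_NUM_CPUS

running = 0
peak = 0
lock = threading.Lock()

def execute_query_partition(i):
    """Stand-in for the remote task; tracks concurrent executions."""
    global running, peak
    with lock:
        running += 1
        peak = max(peak, running)
    result = i  # placeholder for real partition work
    with lock:
        running -= 1
    return result

# A pool sized to the CPU budget mimics the admission cap.
with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
    results = list(pool.map(execute_query_partition, range(16)))

assert peak <= max_concurrent
```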

I am running the benchmark with

RAY_ADDRESS='http://localhost:8265' ray job submit --working-dir `pwd` -- python3 tpcbench.py --benchmark tpch --queries /home/ray/datafusion-benchmarks/tpch/queries/ --data /mnt/bigdata/tpch/sf100  --concurrency 4

My cluster definition is:

apiVersion: ray.io/v1alpha1
kind: RayCluster
metadata:
  name: datafusion-ray-cluster
spec:
  headGroupSpec:
    rayStartParams:
      num-cpus: "0"
    template:
      spec:
        containers:
          - name: ray-head
            image: andygrove/datafusion-ray-tpch:latest
            imagePullPolicy: Always
            resources:
              limits:
                cpu: 2
                memory: 8Gi
              requests:
                cpu: 2
                memory: 8Gi
            volumeMounts:
              - mountPath: /mnt/bigdata  # Mount path inside the container
                name: ray-storage
        volumes:
          - name: ray-storage
            persistentVolumeClaim:
              claimName: ray-pvc  # Reference the PVC name here
  workerGroupSpecs:
    - replicas: 2
      groupName: "datafusion-ray"
      rayStartParams:
        num-cpus: "4"
      template:
        spec:
          containers:
            - name: ray-worker
              image: andygrove/datafusion-ray-tpch:latest
              imagePullPolicy: Always
              resources:
                limits:
                  cpu: 5
                  memory: 64Gi
                requests:
                  cpu: 5
                  memory: 64Gi
              volumeMounts:
                - mountPath: /mnt/bigdata
                  name: ray-storage
          volumes:
            - name: ray-storage
              persistentVolumeClaim:
                claimName: ray-pvc
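With this spec, the head advertises `num-cpus: "0"` so no tasks run there, and Ray sees 2 workers × 4 CPUs = 8 schedulable CPUs. At `num_cpus=1` per task, at most 8 partitions can execute in parallel cluster-wide. A quick back-of-the-envelope check (values copied from the YAML above):

```python
# CPU budget implied by the RayCluster spec above.
head_cpus = 0            # rayStartParams num-cpus: "0" on the head
workers = 2              # workerGroupSpecs replicas
cpus_per_worker = 4      # rayStartParams num-cpus: "4" per worker
task_num_cpus = 1        # @ray.remote(num_cpus=1)

cluster_cpus = head_cpus + workers * cpus_per_worker
max_parallel_tasks = cluster_cpus // task_num_cpus

print(cluster_cpus, max_parallel_tasks)  # 8 8
```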

I build my image with this Dockerfile, which extends the datafusion-ray image built from the repo.

FROM andygrove/datafusion-ray

RUN sudo apt update && \
    sudo apt install -y git

RUN git clone https://github.com/apache/datafusion-benchmarks.git
@andygrove
Member Author

I tried running locally rather than in k8s, using ray.init() to create the cluster. The issue is that we are using too much object store memory. For TPC-H q2 @ 100GB, it consumes all the memory on my workstation (128 GB) and then crashes. I tried limiting object store memory with ray.init(num_cpus=concurrency, object_store_memory=512 * 1024 * 1024); it ran longer, but spilled huge amounts of data to disk and took an unreasonable amount of time.

Here is an example where it is spilling a huge amount of data.

(raylet) Spilled 35419 MiB, 1062 objects, write throughput 1534 MiB/s.
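For scale, the figures in that single log line work out to roughly 23 seconds of pure spill I/O and about 33 MiB per spilled object:

```python
# Numbers taken directly from the raylet spill log line above.
spilled_mib = 35419
objects = 1062
throughput_mib_s = 1534

spill_seconds = spilled_mib / throughput_mib_s   # time spent just writing spill data
mib_per_object = spilled_mib / objects           # average size of a spilled object

print(round(spill_seconds, 1), round(mib_per_object, 1))  # 23.1 33.4
```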

@andygrove
Member Author

Root cause is #46

@andygrove
Member Author

This is resolved now that we reverted to disk-based shuffle.
