I tried running locally rather than in k8s, using ray.init() to create the cluster. The issue is that we are using too much object store memory. For TPC-H q2 @ 100GB, it consumed all the memory on my workstation (128 GB) and then crashed. I tried limiting object store memory with ray.init(num_cpus=concurrency, object_store_memory=512 * 1024 * 1024) and it ran longer, but it spilled huge amounts of data to disk and took an unreasonable amount of time.
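For anyone trying to reproduce the local setup, here is a minimal sketch of the capped ray.init() call described above. The helper name ray_init_kwargs and the concurrency value of 8 are my own placeholders, not something from the datafusion-ray code; only num_cpus and object_store_memory are actual ray.init() parameters.

```python
MIB = 1024 * 1024


def ray_init_kwargs(concurrency: int, object_store_mib: int) -> dict:
    """Build keyword arguments for ray.init() with a capped object store.

    Hypothetical helper for illustration; object_store_memory is in bytes.
    """
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    if object_store_mib < 1:
        raise ValueError("object_store_mib must be at least 1")
    return {
        "num_cpus": concurrency,
        "object_store_memory": object_store_mib * MIB,
    }


if __name__ == "__main__":
    # Imported here so the helper above can be used without Ray installed.
    import ray

    # Assumed values: 8 CPUs and the 512 MiB cap mentioned above.
    ray.init(**ray_init_kwargs(concurrency=8, object_store_mib=512))
    print(ray.cluster_resources())
    ray.shutdown()
```

With a cap this small, Ray spills objects to disk once the store fills up, which matches the behavior I saw: the query no longer exhausts RAM, but the runtime is dominated by spilling.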
Here is an example where it is spilling a huge amount of data.
I cannot get benchmarks running in k8s. I suspect that too many tasks are being scheduled in parallel.
I added resource constraints in the code:
I am running the benchmark with
My cluster definition is:
I build my image with this Dockerfile, which extends the datafusion-ray image built from the repo.