Automatic h5_garbage_collect() garbage collection #1186

Open
denglerchr opened this issue Jan 20, 2025 · 6 comments

Comments

@denglerchr
Contributor

Good afternoon,

There might be a memory leak in HDF5 related to using driver=Drivers.Core(; backing_store=false).
I created a reduced example that can be reproduced as follows:

  1. Generate a Dockerfile that includes HDF5:
# build with -> docker build -t hdf5test:1.0 .
FROM julia:1.11.2

RUN julia -e "import Pkg; Pkg.add([\"HDF5\", \"H5Zblosc\"])"
ENTRYPOINT ["julia"]
  2. Run the following code in the Docker container (e.g., run sudo docker run -it --memory=500m hdf5test:1.0 and paste the code). It will be OOM-killed sooner or later:
using HDF5

function main()
    while true
        h5open("abc.h5", "w"; driver=Drivers.Core(; backing_store=false)) do fid
            fid["M"] = randn(1000, 1000)
            return Vector{UInt8}(fid)
        end
        # GC.gc() # enabling or disabling this doesn't change much
    end
    return nothing
end

main()

The container's memory usage immediately jumps close to the limit and stays there for a while. With a higher memory cap, it takes longer for the container to be killed. Once the container is killed, you can confirm that memory was the cause with docker inspect <containerid>.

Best regards,
Christian Dengler

@mkitti
Member

mkitti commented Jan 20, 2025

Could you see if invoking HDF5.API.h5_garbage_collect() helps?

https://github.com/JuliaIO/HDF5.jl/blob/master/src%2Fapi%2Ffunctions.jl#L67

@denglerchr
Contributor Author

I did a quick test: calling this inside the loop seems to stabilize the memory usage.
I guess this is not a bug then? Or should this be called automatically somehow?
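
For reference, a minimal sketch of the workaround described above, assuming the call is placed right after each in-memory file is closed. The function names in_memory_file_image and run_with_cleanup are made up for illustration, and the loop is bounded so that it terminates; only h5open, Drivers.Core and HDF5.API.h5_garbage_collect come from this thread.

```julia
using HDF5

# Build a small HDF5 file purely in memory and return its bytes.
# (Hypothetical helper name; same body as the reproducer above.)
function in_memory_file_image()
    h5open("abc.h5", "w"; driver=Drivers.Core(; backing_store=false)) do fid
        fid["M"] = randn(1000, 1000)
        return Vector{UInt8}(fid)
    end
end

# Repeat the write `n` times and ask the HDF5 library to garbage collect
# its internal free lists after each iteration (the suggested workaround).
function run_with_cleanup(n::Integer)
    total = 0
    for _ in 1:n
        total += length(in_memory_file_image())
        HDF5.API.h5_garbage_collect()
    end
    return total  # total number of bytes produced, just so the work is observable
end
```

Running run_with_cleanup with a large n inside the memory-capped container is one way to check whether usage stays flat. This was only verified with a quick test, so treat it as a workaround rather than a fix.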

@mkitti
Member

mkitti commented Jan 20, 2025

I would consider this to be a workaround for now.

I need to investigate further how well this is documented upstream in HDF5 itself, and when it would be appropriate to call this automatically.

Perhaps an HDF5.gc() would be warranted if this needs to be called by the user.

@denglerchr
Contributor Author

OK, I'll keep this ticket open in that case.

@denglerchr denglerchr changed the title Possible memory leak when creating files in-memory Automatic h5_garbage_collect() garbage collection Jan 20, 2025
@simonbyrne
Collaborator

Ideally we should call this when the Julia GC is invoked, but we probably don't want to call it every time an object is freed.

One way to do this would be to add a callback into the Julia GC (so it gets called after the Julia GC is invoked). This can be done by calling jl_gc_set_cb_post_gc with a function pointer. The downside is that we can't call actual Julia code, so we would have to write a C shim around it. This is what I did for NVTX.jl:
https://github.com/JuliaGPU/NVTX.jl/blob/main/src/julia.jl

@mkitti
Member

mkitti commented Jan 22, 2025

In this case, with the do syntax, I think we could call the HDF5 GC when closing the "file", when we know that the file is backed by allocated memory.
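
As a rough user-side illustration of that idea (not how HDF5.jl implements or would implement it), one could wrap the do-block form of h5open so that the library-level garbage collection runs once the in-memory file has been closed. The name with_core_file is hypothetical and not part of HDF5.jl.

```julia
using HDF5

# Hypothetical wrapper, NOT part of HDF5.jl: open an in-memory (Core driver)
# file, run `f` on it, and garbage collect HDF5's free lists after the file
# has been closed, even if `f` throws.
function with_core_file(f, name::AbstractString)
    try
        # The do-block form of h5open closes the file before returning.
        return h5open(f, name, "w"; driver=Drivers.Core(; backing_store=false))
    finally
        HDF5.API.h5_garbage_collect()
    end
end

# Usage: same body as the reproducer, cleanup happens on close.
bytes = with_core_file("abc.h5") do fid
    fid["M"] = randn(1000, 1000)
    Vector{UInt8}(fid)
end
```

Doing this inside the package itself would need a way to know that the file uses the Core driver, which this sketch sidesteps by hard-coding the driver.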
