closing and reopening eleveldb may deadlock #71

Closed
@uwiger

Description

When testing a patched mnesia using eleveldb as a backend, I noticed that some test cases could hang forever. I believe the problem is as follows:

  1. The test process creates a table (leveldb database instance) and does some reads and writes
  2. The database is closed and deleted (eleveldb:destroy/2 followed by rm -rf ... just to be sure)
  3. The same process reopens the database. In this particular case, open() consistently hangs on an I/O error.

The key is that the 'client' process reads from the database, i.e. it uses the Ref. If the Ref remains as garbage on that process's heap when mnesia is restarted (which triggers a lot of work, but not in the calling process), the Ref will not be freed, as the destructor isn't called until the GC clears out the last reference.
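To illustrate the mechanism, here is a minimal C++ sketch that models it with `std::shared_ptr` standing in for the garbage-collected Ref. This is not eleveldb code: `DbHandle`, `g_locked_paths`, `TryOpen`, `CanReopenAfterClose` and the path are all hypothetical names, and the model reports a failed reopen where the real system hangs.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the per-directory lock that a database holds.
static std::set<std::string> g_locked_paths;

// Hypothetical stand-in for the object behind a Ref: the lock is taken in
// the constructor and released only in the destructor.
class DbHandle {
public:
    explicit DbHandle(std::string path) : path_(std::move(path)) {
        if (!g_locked_paths.insert(path_).second)
            throw std::runtime_error("lock already held: " + path_);
    }
    ~DbHandle() { g_locked_paths.erase(path_); }

    // Marks the handle closed but does NOT release the lock.
    void RequestClose() { closed_ = true; }
    bool closed() const { return closed_; }

private:
    std::string path_;
    bool closed_ = false;
};

// Tries to take the lock for `path`; the probe releases it again on success.
bool TryOpen(const std::string& path) {
    try {
        DbHandle probe(path);
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}

// Returns true if the database can be reopened after a close request.
// `stale_ref_collected` models whether the GC has already dropped the
// copy of the Ref left on the client process's heap.
bool CanReopenAfterClose(bool stale_ref_collected) {
    const std::string path = "/tmp/example_db";  // hypothetical path
    auto owner = std::make_shared<DbHandle>(path);
    std::shared_ptr<DbHandle> stale = owner;  // garbage copy on the client heap

    owner->RequestClose();
    owner.reset();  // the owner lets go, but `stale` keeps the object alive
    if (stale_ref_collected) stale.reset();  // models an explicit GC

    bool ok = TryOpen(path);
    stale.reset();  // clean up so repeated calls start from a free lock
    return ok;
}
```

In this model, `CanReopenAfterClose(false)` fails because the stale copy keeps the destructor from running, while `CanReopenAfterClose(true)` succeeds, which corresponds to forcing a GC before the reopen.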

Calling erlang:garbage_collect() in the test process before restarting mnesia fixes the problem in this particular case (with luck, adding debug printouts can achieve the same thing by triggering the GC). But it's not safe to assume that the Ref will ever be completely freed by GC, as some processes may perform work and then idle forever without performing the final GC.

One idea is to let a worker thread call AwaitCloseAndDestructor() [1] right after InitiateCloseRequest() has been called, then have it remove the LevelDB env from the magic binary. I assume this would release the LevelDB lock entry?

[1] https://github.com/basho/eleveldb/blob/master/c_src/refobjects.cc#L137
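The idea above can be sketched in the same spirit: release the lock-holding resource as soon as close is requested, and leave the handle behind as an empty shell for the GC to collect whenever it gets around to it. The sketch below only models that ownership pattern under stated assumptions; `DirLock`, `EagerHandle`, `IsLocked` and `LockFreedWhileStaleRefAlive` are hypothetical names, and it does not reproduce the threading or the actual code in refobjects.cc.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>

// Hypothetical stand-in for the per-directory lock table.
static std::set<std::string> g_eager_locked_paths;

// Hypothetical lock-holding resource (the role played by the LevelDB env
// and database objects in the proposal).
class DirLock {
public:
    explicit DirLock(std::string path) : path_(std::move(path)) {
        g_eager_locked_paths.insert(path_);
    }
    ~DirLock() { g_eager_locked_paths.erase(path_); }

private:
    std::string path_;
};

// Hypothetical handle: Close() drops the lock immediately, so the handle's
// own destructor has nothing left to release when it finally runs.
class EagerHandle {
public:
    explicit EagerHandle(const std::string& path) : lock_(new DirLock(path)) {}
    void Close() { lock_.reset(); }
    bool is_open() const { return lock_ != nullptr; }

private:
    std::unique_ptr<DirLock> lock_;
};

bool IsLocked(const std::string& path) {
    return g_eager_locked_paths.count(path) > 0;
}

// Returns true if the lock is free after Close() even though a stale
// reference to the handle is still alive.
bool LockFreedWhileStaleRefAlive() {
    const std::string path = "/tmp/example_db";  // hypothetical path
    auto owner = std::make_shared<EagerHandle>(path);
    std::shared_ptr<EagerHandle> stale = owner;  // garbage copy, never collected

    bool locked_before = IsLocked(path);
    owner->Close();
    owner.reset();

    // `stale` is still alive here, yet the lock has already been released.
    return locked_before && !IsLocked(path) && !stale->is_open();
}
```

With this arrangement a reopen no longer depends on when, or whether, the last reference is collected; any later use of the stale handle would need to check `is_open()` first.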
