mv async close
- merged to master
- code complete October 14, 2013
- development started October 4, 2013
This branch contains the first of two fixes first released in the 1.4.2-turner branch. The symptom was fadvise and fdatasync error messages in the syslog.
This branch targets two independent problems:
- PosixMmapFile has a race condition related to close operations when the recovery log uses it asynchronously. This race caused the syslog error messages. It also implies the recovery log likely has minor corruption at the end of the file (which is critical only if the file is actually used in a database recovery).
The PosixMmapFile async problem was partially addressed in a previous patch. That patch corrected PosixMmapFile::UnmapCurrentRegion() to use async unmapping only if the file was a recovery log. The problem that remained (undetected) was that Env::Default()->Schedule() did not place sequential operations for one file on the same thread. It was therefore possible for an unmap-with-close operation to run before the simple unmap operations that were scheduled ahead of it.
This branch adds logic to maintain a reference count of the active objects related to a file. The PosixMmapFile object is one type of "active object". A background BGCloseInfo object is the other. The reference count lives in malloc'd memory and works similarly to a std::shared_ptr. The difference is that std::shared_ptr is not guaranteed thread safe and this reference counter is. The object that decrements the reference counter to zero closes the file. This guarantees that all pending writes have completed, in whatever order they ran, before the close happens.
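The counting rule described above can be sketched as follows. This is a minimal illustration, not the code from env_posix.cc: the `FileRefCount` struct, the `AddRef()` helper, the `ReleaseRef()` signature, and `WhichReleaseCloses()` are all hypothetical names chosen for the example, and the "close" is only recorded in a counter rather than performed on a real file.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the shared counter; not the layout used in env_posix.cc.
struct FileRefCount {
    std::atomic<uint64_t> count{1};  // starts at 1: the file object itself
    int close_calls = 0;             // touched only by the holder that reaches zero
};

// Register one more active object, e.g. a background task about to be scheduled.
inline void AddRef(FileRefCount* ref) {
    ref->count.fetch_add(1);
}

// Drop one reference. Returns true only for the caller that moved the count
// from 1 to 0; that caller is the one responsible for closing the file.
inline bool ReleaseRef(FileRefCount* ref) {
    uint64_t previous = ref->count.fetch_sub(1);
    assert(previous > 0);  // releasing more often than acquiring is a bug
    if (previous != 1) {
        return false;      // other active objects still hold the file
    }
    ++ref->close_calls;    // real code would resize and close the file here
    return true;
}

// Take one reference per background task, then release everything.
// Returns the 1-based index of the release call that "closed" the file,
// or -1 if the close did not happen exactly once.
inline int WhichReleaseCloses(int background_tasks) {
    FileRefCount ref;
    for (int i = 0; i < background_tasks; ++i) {
        AddRef(&ref);
    }
    int closer = 0;
    const int total = background_tasks + 1;  // tasks plus the file object
    for (int i = 1; i <= total; ++i) {
        if (ReleaseRef(&ref)) {
            closer = i;
        }
    }
    return (ref.close_calls == 1) ? closer : -1;
}
```

Because the decrement is a single atomic read-modify-write, exactly one caller observes the transition to zero no matter which thread runs it or in what order the releases arrive. With three background tasks, `WhichReleaseCloses(3)` reports that the fourth and final release is the one that closes.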
The malloc'd ref_count_ is actually a two-element uint64_t array. The first element is the reference count. The count is incremented and decremented via atomic operations. The second element is the final file size. It is set when PosixMmapFile::Close() is called. The file is extended in chunks during write operations to facilitate memory mapped operations, so its size on disk can exceed the bytes actually written. The object that closes the file also adjusts the file size to match the actual count of bytes used.
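A rough sketch of the two-slot layout and the resize-on-last-release step follows. It assumes a POSIX system and is not the actual implementation: `RefBlock`, `AddFileRef()`, `SetFinalSize()`, `ReleaseFileRef()`, and `DemoFinalSize()` are hypothetical names, `std::atomic` stands in for the helpers in atomics.h, and the block is allocated with `new` instead of malloc to keep the example short.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical layout: slot[0] = reference count, slot[1] = final file size.
struct RefBlock {
    std::atomic<uint64_t> slot[2];
};

// Create the shared block with one reference for the file object.
inline RefBlock* NewRefBlock() {
    RefBlock* block = new RefBlock;
    block->slot[0].store(1);
    block->slot[1].store(0);
    return block;
}

inline void AddFileRef(RefBlock* block) { block->slot[0].fetch_add(1); }

// Recorded at Close() time: the count of bytes actually used.
inline void SetFinalSize(RefBlock* block, uint64_t bytes_used) {
    block->slot[1].store(bytes_used);
}

// Drop one reference. The caller that reaches zero trims the file to the
// recorded size, closes it, and frees the block. Returns true in that case.
inline bool ReleaseFileRef(RefBlock* block, int fd) {
    assert(block != nullptr);
    if (block->slot[0].fetch_sub(1) != 1) {
        return false;  // another active object still holds the file
    }
    const uint64_t final_size = block->slot[1].load();
    // Error handling is omitted for brevity in this sketch.
    (void)ftruncate(fd, static_cast<off_t>(final_size));
    (void)close(fd);
    delete block;
    return true;
}

// Demo: extend a temp file to one chunk, pretend `used` bytes were written,
// then release the file object before the background task finishes.
// Returns the size on disk after the last release, or -1 on any failure.
inline long long DemoFinalSize(uint64_t chunk, uint64_t used) {
    char path[] = "/tmp/refblock_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0) return -1;
    if (ftruncate(fd, static_cast<off_t>(chunk)) != 0) {
        close(fd);
        unlink(path);
        return -1;
    }
    RefBlock* block = NewRefBlock();  // count = 1 (file object)
    AddFileRef(block);                // count = 2 (pending background task)
    SetFinalSize(block, used);        // what Close() would record
    bool first = ReleaseFileRef(block, fd);   // file object goes away first
    bool second = ReleaseFileRef(block, fd);  // background task finishes last
    struct stat st;
    long long size = (stat(path, &st) == 0) ? static_cast<long long>(st.st_size) : -1;
    unlink(path);
    return (!first && second) ? size : -1;
}
```

Keeping the final size next to the count means whichever object finishes last has everything it needs to trim and close the file, even when the object that called Close() has already been destroyed.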
Files edited for this fix:
- leveldb/include/leveldb/atomics.h: this file originated in eleveldb for the 1.3 release. It is also part of pending changes to the leveldb 2.0 release. While it is "new" to 1.4.2, it is an already reviewed, production file.
- leveldb/util/env_posix.cc: logic changes were made to the PosixMmapFile class relating to the use of ref_count_. Previously there was a mess of four special purpose background task functions: BGFileCloser(), BGFileCloser2(), BGFileUnmapper(), and BGFileUnmapper2(). The mess is now reduced to one general purpose function, BGFileUnmapper2(). The asynchronous close is contained within a new function, ReleaseRef(), that manages ref_count_ atomically and also performs the file resize and close when the count reaches zero.