
Conversation

@jiangphcn commented Aug 22, 2019

Overview

The problem was originally exposed when updating a search index failed with the following error:

{error, "*** eval: {'EXIT',{{badmatch,{error,<<"Lock obtain timed out: NativeFSLock@/srv/search_index/shards/20000000-3fffffff/user1/db1.1509715496/ec8bc1dc690a15d795e65f7dce409e2c/write.lock">>}},\n                   [{erl_eval,expr,3,[]}]}}"}

It turns out that the soft-deletion path is missing a call to evict the shard from the LRU (and therefore release the associated locks). As a result, the lock file was left in the working directory and the search index could not be updated.

To address this, the fix is to close the corresponding index in the LRU before the index is renamed/deleted.
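
In other words, before renaming the directory, walk the LRU and close every open index whose path falls under the database path, so the Lucene write.lock is released first. Below is a minimal sketch of that eviction step; the LRU shape, `IndexHandle`, and `closeByDbPath` are illustrative names, not the actual clouseau structures:

```scala
import scala.collection.mutable

// Hypothetical handle for a running index; in practice closing it would
// close the underlying Lucene IndexWriter and release its write.lock.
trait IndexHandle {
  def close(): Unit
}

// Hypothetical LRU keyed by index path,
// e.g. "shards/20000000-3fffffff/user1/db1.1509715496/<signature>".
class IndexLRU {
  private val entries = mutable.LinkedHashMap[String, IndexHandle]()

  def put(path: String, handle: IndexHandle): Unit = entries.put(path, handle)

  // Close and drop every index that lives under the given database path,
  // so no lock file is left behind when the directory is renamed/deleted.
  def closeByDbPath(dbPath: String): Unit = {
    val prefix = if (dbPath.endsWith("/")) dbPath else dbPath + "/"
    val victims = entries.keys.filter(p => p == dbPath || p.startsWith(prefix)).toList
    victims.foreach { path =>
      entries.remove(path).foreach(_.close())
    }
  }
}
```

The important property is that the close happens before the rename is issued, so the NativeFSLock is already gone by the time the directory moves.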

Testing recommendations

@theburge did excellent troubleshooting and analysis, and also wrote the script below to reproduce the problem.

repro2.sh.txt

The problem no longer occurs with the fix here and apache/couchdb#2130.

Related Issues or Pull Requests

apache/couchdb#2130

Checklist

  • Code is written and works correctly
  • Changes are covered by tests
  • Any new configurable parameters are documented in rel/overlay/etc/default.ini
  • A PR for documentation changes has been made in https://github.com/apache/couchdb-documentation

@theburge

@jiangphcn Thanks, that looks much more suitable for closing an individual shard's indexes prior to renaming the shard's directory.

I'm a little concerned about the performance impact of walking the whole LRU, but in practice I've never seen it exceed 5000 items. I think this is fine for now.

I remain more concerned about the race condition that exists in dreyfus: close-then-rename as two separate operations is a potential problem, and the current implementation in dreyfus does not serialise attempts to open and calls to close_lru/1. (Note that close_lru/1 is called in the caller's process, while the rename message is handled by the dreyfus_index_manager gen_server, so an open operation could easily sneak in before the rename message is issued but after the close_lru/1 succeeds partially.)

I don't propose attempting to close that race condition, but I think the implementation here ought to include some defensive logic to clean up properly when closing, specifically in the code that logs this:

Jul 19 13:32:55 db1.testy013 local5: 2019-07-19 13:32:55,741 clouseau.cleanup [INFO] Renaming '/srv/search_index/shards/00000000-ffffffff/test-user/test-db.1563543159' to '/srv/search_index/shards/00000000-ffffffff/test-user/test-db.20190719.133255.deleted.1563543159'
Jul 19 13:32:55 db1.testy013 local5: 2019-07-19 13:32:55,741 clouseau [WARN] shards/00000000-ffffffff/test-user/test-db.1563543159/b7449da0be4816aa552bd089d7a3f51c Error while closing writer
Jul 19 13:32:55 db1.testy013 org.apache.lucene.store.NoSuchDirectoryException: directory '/srv/search_index/shards/00000000-ffffffff/test-user/test-db.1563543159/b7449da0be4816aa552bd089d7a3f51c' does not exist
Jul 19 13:32:55 db1.testy013     at org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:219)
Jul 19 13:32:55 db1.testy013     at org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:243)
Jul 19 13:32:55 db1.testy013     at org.apache.lucene.index.IndexFileDeleter.refresh(IndexFileDeleter.java:324)
Jul 19 13:32:55 db1.testy013     at org.apache.lucene.index.IndexFileDeleter.refresh(IndexFileDeleter.java:360)
Jul 19 13:32:55 db1.testy013     at org.apache.lucene.index.IndexWriter.rollbackInternal(IndexWriter.java:2095)
Jul 19 13:32:55 db1.testy013     at org.apache.lucene.index.IndexWriter.rollback(IndexWriter.java:2033)
Jul 19 13:32:55 db1.testy013     at com.cloudant.clouseau.IndexService.exit(IndexService.scala:220)
Jul 19 13:32:55 db1.testy013     at scalang.Process.trapExit(Process.scala:58)
Jul 19 13:32:55 db1.testy013     at scalang.Process.handleExit(Process.scala:46)
Jul 19 13:32:55 db1.testy013     at scalang.node.ProcessHolder$$anon$2.onMessage(ProcessAdapter.scala:55)
Jul 19 13:32:55 db1.testy013     at scalang.node.ProcessHolder$$anon$2.onMessage(ProcessAdapter.scala:52)
Jul 19 13:32:55 db1.testy013     at org.jetlang.channels.ChannelSubscription$1.run(ChannelSubscription.java:31)
Jul 19 13:32:55 db1.testy013     at org.jetlang.core.BatchExecutorImpl.execute(BatchExecutorImpl.java:11)
Jul 19 13:32:55 db1.testy013     at org.jetlang.fibers.PoolFiber.flush(PoolFiber.java:82)
Jul 19 13:32:55 db1.testy013     at org.jetlang.fibers.PoolFiber.access$100(PoolFiber.java:21)
Jul 19 13:32:55 db1.testy013     at org.jetlang.fibers.PoolFiber$1.run(PoolFiber.java:38)
Jul 19 13:32:55 db1.testy013     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
Jul 19 13:32:55 db1.testy013     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
Jul 19 13:32:55 db1.testy013     at java.lang.Thread.run(Thread.java:748)
Jul 19 13:32:55 db1.testy013     at overlock.threadpool.ErrorLoggedThread.run(NamedThreadFactory.scala:40)

The writer would need to be explicitly closed/released here when the rollback fails to avoid the writer lock being held indefinitely.

(Naturally, that close/release code should not throw an additional exception under any circumstances: In general, it's a wise precaution to have clean-up code never fail and leak resources, even if it means suppressing an otherwise fault-indicating exception. Also, in the clouseau/dreyfus context, it's usually fine to throw away changes during a failure, since it's secondary data and ought to be rebuilt later.)
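
To make that concrete, here's a minimal sketch of the kind of defensive shutdown I have in mind, written against a Lucene 4.x-style IndexWriter; `shutdownWriterQuietly` and the logging callback are illustrative, not the existing clouseau code:

```scala
import org.apache.lucene.index.IndexWriter

object WriterCleanup {
  // Best-effort shutdown: try to roll back pending changes; if that fails
  // (e.g. because the directory was already renamed), still try to close the
  // writer so its write.lock is released. The cleanup path never rethrows.
  def shutdownWriterQuietly(writer: IndexWriter, logWarn: (String, Throwable) => Unit): Unit = {
    try {
      writer.rollback()
    } catch {
      case e: Exception =>
        logWarn("Error while rolling back writer", e)
        try {
          writer.close()
        } catch {
          case e2: Exception =>
            // Suppress: throwing here would leave the lock held and abort the
            // rest of the cleanup; the index can be rebuilt later anyway.
            logWarn("Error while closing writer after failed rollback", e2)
        }
    }
  }
}
```

The key point is just that the failure path still releases the writer and then stops escalating.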

What do you think?

@jiangphcn changed the title from "[WIP] close LRU by database name" to "Close LRU by database path for deleted database/index" on Nov 1, 2019

@theburge left a comment


+1

@jiangphcn

Thanks again to @theburge. As we discussed in apache/couchdb#2130 (review), let's try the current approach.
