Skip to content

Conversation

dnhatn
Copy link
Member

@dnhatn dnhatn commented May 30, 2018

When the last indexing operation is completed, we will fire a global
checkpoint sync. Since a global checkpoint sync request is a
replication request, it will acquire an index shard permit on the
primary when executing. If this happens at the same time while we are
issuing the synced-flush, the synced-flush request will fail as it
thinks there are in-flight operations. We can avoid such situation by
retrying another synced-flush if the current request fails due to
ongoing operations on the primary.

Closes #29392

When the last indexing operation is completed, we will fire a global
checkpoint sync.  Since a global checkpoint sync request is a
replication request, it will acquire an index shard permit on the
primary when executing.  If this happens at the same time while we are
issuing the synced-flush, the synced-flush request will fail as it
thinks there are in-flight operations. We can avoid such situation by
not issue the synced-flush until the global checkpoint on the primary is
propagated to replicas.
@dnhatn dnhatn added >non-issue >test Issues or PRs that are addressing/adding tests v7.0.0 :Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. v6.4.0 labels May 30, 2018
@dnhatn dnhatn requested a review from bleskes May 30, 2018 21:01
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

@dnhatn
Copy link
Member Author

dnhatn commented May 30, 2018

I traced the list of index shard permit acquirers when issuing the synced-flush.

1> [2018-05-30T04:00:48,248][TRACE][o.e.i.f.SyncedFlushService] [node_s1] in flight operations 1, acquirers [GlobalCheckpointSyncAction.Request{shardId=[test][0], timeout=1m, index='test', waitForActiveShards=1}
1> 	at org.elasticsearch.index.shard.IndexShardOperationPermits.acquire(IndexShardOperationPermits.java:229)
1> 	at org.elasticsearch.index.shard.IndexShard.acquirePrimaryOperationPermit(IndexShard.java:2178)
1> 	at org.elasticsearch.action.support.replication.TransportReplicationAction.acquirePrimaryShardReference(TransportReplicationAction.java:968)
1> 	at org.elasticsearch.action.support.replication.TransportReplicationAction.access$500(TransportReplicationAction.java:98)
1> 	at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.doRun(TransportReplicationAction.java:318)
1> 	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
1> 	at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:293)
1> 	at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:280)
1> 	at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66)
1> 	at org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:656)
1> 	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:724)
1> 	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
1> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
1> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
1> 	at java.lang.Thread.run(Thread.java:748)]
1> [2018-05-30T04:00:48,248][INFO ][o.e.i.f.FlushIT          ] Third seal: Total shards: [2], failed: [true], reason: [[1] ongoing operations on primary], detail: []

@dnhatn
Copy link
Member Author

dnhatn commented May 31, 2018

run sample packaging tests

Copy link
Member

@jasontedor jasontedor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@bleskes
Copy link
Contributor

bleskes commented May 31, 2018

I wonder if we still have a race condition - we periodically acquire a permit to do a background global checkpoint sync . Since synced flush is a best effort anyway (other stuff may fail in the future), maybe a better direction is to repeat it until successful?

@dnhatn
Copy link
Member Author

dnhatn commented May 31, 2018

@bleskes

maybe a better direction is to repeat it until successful?

Unfortunately, this is not the case because we are testing a different kind of results: unchanged, partially successful, and entirely successful in these tests.

we periodically acquire a permit to do a background global checkpoint sync

This is fired for every 30 seconds which I think is long enough for the synced-flush tests. How about disable the background global checkpoint sync in the synced-flush tests?

@dnhatn
Copy link
Member Author

dnhatn commented May 31, 2018

Talked to Boaz, we decided to take a slightly different approach for this. We will retry another synced-flush if the current request fails due to ongoing operations on the primary.

@bleskes and @jasontedor Can you please take another look?

@dnhatn dnhatn changed the title TEST: Only issue synced-flush after global checkpoint synced TEST: Retry synced-flush if ongoing ops on primary May 31, 2018
@dnhatn
Copy link
Member Author

dnhatn commented May 31, 2018

I should have provided the reasons for this change. The current approach might not cover these cases:

  • Background global checkpoint sync which should be unlikely but is not eliminated entirely in the current solution.
  • Global checkpoint sync request is async, thus it's possible that we check an old global checkpoint value instead of the latest one. Here we can not use the max_seqno as we have an artificial out-of-sync replicas test.

} catch (InterruptedException e) {
Thread.currentThread().interrupt();
if (listener.error != null) {
return; // stop here so that we can preserve the error
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: just fold this into an if else?

Copy link
Contributor

@bleskes bleskes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

}
if (listener.result.failed()) {
// only retry if request failed due to ongoing operations on primary
assertThat(listener.result.failureReason(), not(containsString("ongoing operations on primary")));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally favor something like:

if (listener.result != null && listener. result.failureReason != null && 
     listener. result.failureReason.contains("ongoing operations on primary") {
    throw new AsserionError(listener.reasult.failureReason); // cause the assert busy to retry
}

instead of these two ifs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@dnhatn
Copy link
Member Author

dnhatn commented Jun 5, 2018

Thanks @jasontedor and @bleskes for reviewing!

@dnhatn dnhatn merged commit 4b893c1 into elastic:master Jun 5, 2018
@dnhatn dnhatn deleted the fix-flush-it branch June 5, 2018 13:02
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Jun 5, 2018
* elastic/master:
  [Tests] Muting RatedRequestsTests#testXContentParsingIsNotLenient
  TEST:  Retry synced-flush if ongoing ops on primary (elastic#30978)
  Fix docs build.
  Only auto-update license signature if all nodes ready (elastic#30859)
  Add BlobContainer.writeBlobAtomic() (elastic#30902)
  Add a doc value format to binary fields. (elastic#30860)
  Take into account the return value of TcpTransport.readMessageLength(...) in Netty4SizeHeaderFrameDecoder
  Move caching of the size of a directory to `StoreDirectory`. (elastic#30581)
  Clarify docs about boolean operator precedence. (elastic#30808)
  Docs: remove notes on sparsity. (elastic#30905)
  Fix MatchPhrasePrefixQueryBuilderTests#testPhraseOnFieldWithNoTerms
  run overflow forecast a 2nd time as regression test for elastic/ml-cpp#110 (elastic#30969)
  Improve documentation of dynamic mappings. (elastic#30952)
  Decouple MultiValueMode. (elastic#31075)
  Docs: Clarify constraints on scripted similarities. (elastic#31076)
dnhatn added a commit that referenced this pull request Jun 5, 2018
* master:
  Removing erroneous repeat
  Adapt bwc versions after backporting #30983 to 6.4
  [Tests] Muting RatedRequestsTests#testXContentParsingIsNotLenient
  TEST:  Retry synced-flush if ongoing ops on primary (#30978)
  Fix docs build.
  Only auto-update license signature if all nodes ready (#30859)
  Add BlobContainer.writeBlobAtomic() (#30902)
  Add a doc value format to binary fields. (#30860)
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Jun 5, 2018
* ccr:
  [DOCS] Creates rest-api folder in docs
  [Rollup] Disallow index patterns that match the rollup index (elastic#30491)
  Replace exact numDocs by soft-del count in SegmentInfo (elastic#31086)
  Upgrade to Lucene-7.4.0-snapshot-0a7c3f462f (elastic#31073)
  Add cors support to NioHttpServerTransport (elastic#30827)
  [DOCS] Fixes security example (elastic#31082)
  Allow terms query in _rollup_search (elastic#30973)
  Removing erroneous repeat
  Adapt bwc versions after backporting elastic#30983 to 6.4
  [Tests] Muting RatedRequestsTests#testXContentParsingIsNotLenient
  TEST:  Retry synced-flush if ongoing ops on primary (elastic#30978)
  Fix docs build.
  Only auto-update license signature if all nodes ready (elastic#30859)
  Add BlobContainer.writeBlobAtomic() (elastic#30902)
  Add a doc value format to binary fields. (elastic#30860)
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Jun 5, 2018
* ccr:
  [DOCS] Creates rest-api folder in docs
  [Rollup] Disallow index patterns that match the rollup index (elastic#30491)
  Replace exact numDocs by soft-del count in SegmentInfo (elastic#31086)
  Upgrade to Lucene-7.4.0-snapshot-0a7c3f462f (elastic#31073)
  Add cors support to NioHttpServerTransport (elastic#30827)
  [DOCS] Fixes security example (elastic#31082)
  Allow terms query in _rollup_search (elastic#30973)
  Removing erroneous repeat
  Adapt bwc versions after backporting elastic#30983 to 6.4
  [Tests] Muting RatedRequestsTests#testXContentParsingIsNotLenient
  TEST:  Retry synced-flush if ongoing ops on primary (elastic#30978)
  Fix docs build.
  Only auto-update license signature if all nodes ready (elastic#30859)
  Add BlobContainer.writeBlobAtomic() (elastic#30902)
  Add a doc value format to binary fields. (elastic#30860)
dnhatn added a commit that referenced this pull request Jun 10, 2018
When the last indexing operation is completed, we will fire a global
checkpoint sync. Since a global checkpoint sync request is a replication
request, it will acquire an index shard permit on the primary when
executing. If this happens at the same time while we are issuing the
synced-flush, the synced-flush request will fail as it thinks there are
in-flight operations. We can avoid such situation by retrying another
synced-flush if the current request fails due to ongoing operations on
the primary.

Closes #29392
dnhatn added a commit that referenced this pull request Jun 10, 2018
When the last indexing operation is completed, we will fire a global
checkpoint sync. Since a global checkpoint sync request is a replication
request, it will acquire an index shard permit on the primary when
executing. If this happens at the same time while we are issuing the
synced-flush, the synced-flush request will fail as it thinks there are
in-flight operations. We can avoid such situation by retrying another
synced-flush if the current request fails due to ongoing operations on
the primary.

Closes #29392
dnhatn added a commit that referenced this pull request Jun 14, 2018
* 6.x:
  SQL: Fix build on Java 10
  [Tests] Mutualize fixtures code in BaseHttpFixture (#31210)
  [TEST] Fix RemoteClusterClientTests#testEnsureWeReconnect
  [ML] Update test thresholds to account for changes to memory control (#31289)
  Reenable Checkstyle's unused import rule (#31270)
  [ML] Check licence when datafeeds use cross cluster search  (#31247)
  Fix non-REST doc snippet
  [DOC] Extend SQL docs
  [DOCS] Shortens ML API intros
  Use quotes in the call invocation (#31249)
  move security ingest processors to a sub ingest directory (#31306)
  SQL: Whitelist SQL utility class for better scripting (#30681)
  Add 5.6.11 version constant.
  Fix version detection.
  [Docs] All Rollup docs experimental, agg limitations, clarify DeleteJob (#31299)
  Add missing release notes.
  Security: fix token bwc with pre 6.0.0-beta2 (#31254)
  Fix compilation error in UpdateSettingsIT (#31304)
  Test: Remove broken yml test feature (#31255)
  Add unreleased version 6.3.1
  [Rollup] Metric config parser must use builder so validation runs (#31159)
  Removes experimental tag from scripted_metric aggregation (#31298)
  [DOCS] Removes coming tag from 6.3.0 release notes
  6.3 release notes.
  Add notion of internal index settings (#31286)
  REST high-level client: add Cluster Health API (#29331)
  Remove leftover usage of deprecated client API
  SyncedFlushResponse to implement ToXContentObject (#31155)
  Add Get Aliases API to the high-level REST client (#28799)
  HLRest: Add get index templates API (#31161)
  Log warnings when cluster state publication failed to some nodes (#31233)
  Fix AntFixture waiting condition (#31272)
  [TEST] Mute RecoveryIT.testHistoryUUIDIsGenerated
  Ignore numeric shard count if waiting for ALL (#31265)
  Update checkstyle to 8.10.1 (#31269)
  Set analyzer version in PreBuiltAnalyzerProviderFactory (#31202)
  Revert upgrade to Netty 4.1.25.Final (#31282)
  Use armored input stream for reading public key (#31229)
  [DOCS] Added 'fail_on_unsupported_field' param to MLT. Closes #28008 (#31160)
  Fix Netty 4 Server Transport tests. Again.
  [DOCS] Fixed typo.
  [DOCS] Added release highlights for 6.3 (#31256)
  [DOCS] Mark SQL feature as experimental
  [DOCS] Updates machine learning custom URL screenshots (#31222)
  Fix naming conventions check for XPackTestCase
  Fix security Netty 4 transport tests
  Fix race in clear scroll (#31259)
  [DOCS] Clarify audit index settings when remote indexing (#30923)
  [ML][TEST] Mute tests using rules (#31204)
  Support RequestedAuthnContext (#31238)
  Validate xContentType in PutWatchRequest. (#31088)
  [INGEST] Interrupt the current thread if evaluation grok expressions take too long (#31024)
  Upgrade to Netty 4.1.25.Final (#31232)
  Suppress extras FS on caching directory tests
  Revert "[DOCS] Added 6.3 info & updated the upgrade table. (#30940)"
  Revert "Fix snippets in upgrade docs"
  Fix snippets in upgrade docs
  [DOCS] Added 6.3 info & updated the upgrade table. (#30940)
  Enable custom credentials for core REST tests (#31235)
  Move ESIndexLevelReplicationTestCase to test framework (#31243)
  Encapsulate Translog in Engine (#31220)
  [DOCS] Adds machine learning 6.3.0 release notes (#31217)
  Remove all unused imports and fix CRLF (#31207)
  [TEST] Fix testRecoveryAfterPrimaryPromotion
  [Docs] Remove mention pattern files in Grok processor (#31170)
  Use stronger write-once semantics for Azure repository (#30437)
  Don't swallow exceptions on replication (#31179)
  Compliant SAML Response destination check (#31175)
  Move java version checker back to its own jar (#30708)
  TEST:  Retry synced-flush if ongoing ops on primary (#30978)
  [test] add fix for rare virtualbox error (#31212)
@jimczi jimczi added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. >non-issue >test Issues or PRs that are addressing/adding tests v6.3.0 v6.4.0 v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants