Add internal _primary_term field to store shard's primary term #21480

Closed

dakrone (Member) commented Nov 10, 2016

This adds the `_primary_term` field internally to the mappings. The
field is populated with the current shard's primary term.

It is intended to be used for collision resolution when two document
copies have the same sequence id. Doc values for the field are
therefore stored, but the field itself is not indexed.

This also fixes the `_seq_no` field so that its doc values are
retrievable (they were previously stored but irretrievable), and changes
the `stats` implementation to use the points API to retrieve the
min/max, which is more efficient than iterating over every doc value.

Relates to #10708

dakrone (Member Author) commented Nov 11, 2016

Closing this for now; I'll reopen it once `seq_no` has been merged to master and this is rebased.

dakrone closed this Nov 11, 2016

dakrone added a commit to dakrone/elasticsearch that referenced this pull request Dec 9, 2016

This adds the `_primary_term` field internally to the mappings. This field is
populated with the current shard's primary term.

It is intended to be used for collision resolution when two document copies have
the same sequence id. Doc values for the field are therefore stored, but the
field itself is not indexed.
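
As a rough illustration of the tie-breaking described above, here is a minimal, self-contained sketch. It is not code from this commit: the `SeqId` holder, the `resolve` helper, and the "higher primary term wins on a sequence-number tie" rule are all assumptions made for the example.

```java
import java.util.Comparator;

public class SeqIdTieBreak {

    /** Hypothetical value holder for illustration; not an Elasticsearch class. */
    static final class SeqId {
        final long seqNo;
        final long primaryTerm;

        SeqId(long seqNo, long primaryTerm) {
            this.seqNo = seqNo;
            this.primaryTerm = primaryTerm;
        }
    }

    // Compare by sequence number first; the primary term is consulted
    // only when the sequence numbers are equal.
    static final Comparator<SeqId> ORDER =
        Comparator.<SeqId>comparingLong(s -> s.seqNo)
                  .thenComparingLong(s -> s.primaryTerm);

    /** Returns the copy that wins under the assumed "higher wins" rule. */
    static SeqId resolve(SeqId a, SeqId b) {
        return ORDER.compare(a, b) >= 0 ? a : b;
    }

    public static void main(String[] args) {
        // Two copies with the same sequence number, written under different primaries.
        SeqId fromOldPrimary = new SeqId(42, 1);
        SeqId fromNewPrimary = new SeqId(42, 2);

        SeqId winner = resolve(fromOldPrimary, fromNewPrimary);
        System.out.println(winner.seqNo + "/" + winner.primaryTerm); // prints "42/2"
    }
}
```

Because the term only needs to be read back and compared per document, and never searched on, storing it as doc values without indexing it is sufficient for this kind of check.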

This also fixes the `_seq_no` field so that its doc values are retrievable (they
were previously stored but irretrievable), and changes the `stats` implementation
to use the points API to retrieve the min/max, which is more efficient than
iterating over every doc value. Additionally, although we intend to be able to
search on the `_seq_no` field, it was previously not searchable. This commit
makes it searchable.

There is no user-visible `_primary_term` field. Instead, the fields are
updated by calling:

```java
index.parsedDoc().updateSeqID(seqNum, primaryTerm);
```

This includes example methods in `Versions` and `Engine` for retrieving the
sequence id values from the index (see `Engine.getSequenceID`) that are only
used in unit tests. These will be extended/replaced by actual implementations
once we make use of sequence numbers as a conflict resolution measure.

Relates to elastic#10708
Supersedes elastic#21480

P.S. As a side effect of this commit, `SlowCompositeReaderWrapper` (SCRW) cannot
be used for documents that contain `_seq_no`: the field is a Point value, and
SCRW cannot wrap documents with points. The tests have been updated to loop
through the `LeafReaderContext`s instead.

clintongormley added :Engine :Distributed Indexing/Engine and removed :Sequence IDs labels Feb 14, 2018