Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Remote Segment Store] Add RemoteDirectory interface to copy segment files to/from remote store #3102

Conversation

sachinpkale
Copy link
Member

@sachinpkale sachinpkale commented Apr 28, 2022

Description

As described in the #2700, all the read/write operations to remote segment store will happen via RemoteDirectory interface. This change adds the RemoteDirectory interface along with corresponding RemoteIndexInput (extends IndexInput) and RemoteIndexOutput (extends IndexOutput) classes.

Issues Resolved

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@sachinpkale sachinpkale requested a review from a team as a code owner April 28, 2022 11:24
@sachinpkale sachinpkale force-pushed the feature/durability-enhancements branch from fa3a260 to 9a6812d Compare April 28, 2022 11:33
@opensearch-ci-bot
Copy link
Collaborator

❌   Gradle Check failure fa3a260fcd8fe5c27a8227545fe743a3339b4210
Log 4852

Reports 4852

@opensearch-ci-bot
Copy link
Collaborator

❌   Gradle Check failure 9a6812d9447c580d2b11deda20db9a09614ad5d9
Log 4853

Reports 4853

@sachinpkale
Copy link
Member Author

Checking the failures.

@opensearch-ci-bot
Copy link
Collaborator

❌   Gradle Check failure 91ebb0f57a39d402ef5957b61496503744e1e36a
Log 4856

Reports 4856

Copy link
Collaborator

@Bukhtawar Bukhtawar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Sachin, this is a good start, few minor sugestions/clarifications

Comment on lines +211 to +191
@Override
public Lock obtainLock(String name) throws IOException {
throw new UnsupportedOperationException();
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to acquire a lock before segment uploads or can we support concurrent uploads as well

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as segments in the remote store are uploaded and downloaded for backup and restore part, I don't think locking is required. Once we start writing to remote store via IndexWriter, locking would be required.

Open to suggestions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we extend BaseDirectory, it will provide locking functionality with the provided LockFactory.

Comment on lines +60 to +41
@Override
public void copyBytes(DataInput input, long numBytes) throws IOException {
assert input instanceof IndexInput : "input should be instance of IndexInput";
blobContainer.writeBlob(getName(), new InputStreamIndexInput((IndexInput) input, numBytes), numBytes, false);
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets say if we go with concurrent writes to local and remote store would we still need a copyBytes or writeByte

Copy link
Collaborator

@Bukhtawar Bukhtawar May 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we also use ByteBuffer for leveraging I/O optimizations?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets say if we go with concurrent writes to local and remote store would we still need a copyBytes or writeByte

For concurrent writes, we will need writeByte/s

Do we also use ByteBuffer for leveraging I/O optimizations?

Right. Current implementation may not be optimized. It also depends on how a given repository implementation is handling writeBlob.
We will be having a continuous performance test setup which will provide insight into the performance issues.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another case, let's say it's not concurrent, but we want writes to only happen on remote which can be directly searchable we might want to just do writeBytes to remote directly

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, the idea is to open up the methods as per the requirement. As you said, writeByte/writeBytes methods will be required when IndexWriter is using RemoteDirectory to create the segments (concurrent writes or writing only to remote store).

Comment on lines +48 to +35
public class RemoteIndexInput extends IndexInput {

private final InputStream inputStream;
private final long size;

public RemoteIndexInput(String name, InputStream inputStream, long size) {
super(name);
this.inputStream = inputStream;
this.size = size;
}

@Override
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also have a VerifyingIndexInput we might want to consider that handles the checksums

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, either VerifyingIndexInput or ChecksumIndexInput can be used. I have added it as a ToDo item in the javadoc of the class.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue is created for Checksum. Tagging this comment on the issue.

Signed-off-by: Sachin Kale <kalsac@amazon.com>
@opensearch-ci-bot
Copy link
Collaborator

❌   Gradle Check failure d11d190
Log 4925

Reports 4925

@sachinpkale
Copy link
Member Author

@dblock @nknize Please review.

@dblock
Copy link
Member

dblock commented May 6, 2022

You have failing tests here, do you still want to merge? (feature branch)

@sachinpkale
Copy link
Member Author

You have failing tests here, do you still want to merge? (feature branch)

This seem to be like a flaky test. There was a fix provided for it as a part of #2110. As this feature-branch was created before the fix, test is failing. Changes in this PR should not impact any existing tests.
Once this PR is merged, I will get main branch changes merged to this feature branch to keep it updated.

@dblock
Copy link
Member

dblock commented May 6, 2022

start gradle check

@opensearch-ci-bot
Copy link
Collaborator

❌   Gradle Check failure d11d190
Log 5091

Reports 5091

@Bukhtawar Bukhtawar added the Storage:Durability Issues and PRs related to the durability framework label May 6, 2022
@Bukhtawar
Copy link
Collaborator

start gradle check

@Bukhtawar
Copy link
Collaborator

@sachinpkale @dblock I have rebased the feature branch from main and tried triggering a fresh build

@owaiskazi19
Copy link
Member

start gradle check

2 similar comments
@dblock
Copy link
Member

dblock commented May 6, 2022

start gradle check

@peterzhuamazon
Copy link
Member

start gradle check

@peterzhuamazon
Copy link
Member

start gradle check try one more time.

@opensearch-ci-bot
Copy link
Collaborator

❌   Gradle Check failure d11d190
Log 5108

Reports 5108

@sachinpkale
Copy link
Member Author

Checked the logs of last run. The failing test is: :rest-api-spec:yamlRestTest. This is a flaky test, more details at: #1817

@Bukhtawar
Copy link
Collaborator

start gradle check

@opensearch-ci-bot
Copy link
Collaborator

✅   Gradle Check success d11d190
Log 5124

Reports 5124

@Bukhtawar Bukhtawar merged commit 0f587d2 into opensearch-project:feature/durability-enhancements May 7, 2022
@peterzhuamazon
Copy link
Member

Does this needs to be backported to 2.0/2.x or is this purely a 3.0 change?

@dblock
Copy link
Member

dblock commented May 10, 2022

@sachinpkale @Bukhtawar ^

@sachinpkale
Copy link
Member Author

Does this needs to be backported to 2.0/2.x or is this purely a 3.0 change?

@peterzhuamazon This would be a 3.0 change.

andrross pushed a commit to andrross/OpenSearch that referenced this pull request May 27, 2022
…ore (opensearch-project#3102)

Signed-off-by: Sachin Kale <kalsac@amazon.com>

Co-authored-by: Sachin Kale <kalsac@amazon.com>
sachinpkale added a commit to sachinpkale/OpenSearch that referenced this pull request May 27, 2022
…ore (opensearch-project#3102)

Signed-off-by: Sachin Kale <kalsac@amazon.com>

Co-authored-by: Sachin Kale <kalsac@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Storage:Durability Issues and PRs related to the durability framework
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants