Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Remote Store] Add Segment download stats to remotestore stats API #8718

Merged
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
Show all changes
52 commits
Select commit Hold shift + click to select a range
d1e688a
Adding Segment download stats to remotestore stats API
shourya035 Jul 5, 2023
8086eb9
Merge branch 'opensearch-project:main' into segment-download-stats
shourya035 Jul 5, 2023
97b733a
Removed unused loggers and updated upload stats field names
shourya035 Jul 5, 2023
e667270
Fixing spotless errors
shourya035 Jul 5, 2023
743c09a
Adding JavaDocs and fixing divide by zero errors on UTs
shourya035 Jul 5, 2023
94f0f58
Merge branch 'opensearch-project:main' into segment-download-stats
shourya035 Jul 6, 2023
ac3fbc0
Excluding stats publishing on non-remote store indices
shourya035 Jul 6, 2023
bb20d64
Merge branch 'opensearch-project:main' into segment-download-stats
shourya035 Jul 6, 2023
b42912f
Excluding stats publishing on non-remote store indices
shourya035 Jul 6, 2023
83f2bd6
Merge branch 'opensearch-project:main' into segment-download-stats
shourya035 Jul 7, 2023
80cf62b
Removing unused logger
shourya035 Jul 10, 2023
d826579
Addressing comments
shourya035 Jul 11, 2023
2213186
Merge branch 'opensearch-project:main' into segment-download-stats
shourya035 Jul 11, 2023
f282d54
Fixing API field name from routing to shards
shourya035 Jul 11, 2023
331cb81
Merge branch 'opensearch-project:main' into segment-download-stats
shourya035 Jul 11, 2023
8de6d00
Changing file upload stats to sync level from file level
shourya035 Jul 11, 2023
9daeb74
Retrigger integs
shourya035 Jul 11, 2023
d507ca8
Empty Commit
shourya035 Jul 13, 2023
8c4d189
Removing unused toString method
shourya035 Jul 13, 2023
27b6ad0
Merge from previous branch
shourya035 Jul 13, 2023
cdf9a49
Removing unused window size vars
shourya035 Jul 13, 2023
c4393b0
Merge branch 'opensearch-project:main' into segment-download-stats-2
shourya035 Jul 17, 2023
b60bb40
Adding Integ tests for download stats
shourya035 Jul 17, 2023
67ca0b6
Merge branch 'main' into segment-download-stats-2
shourya035 Jul 18, 2023
c62926f
Addressing comments
shourya035 Jul 18, 2023
cceb294
Merge branch 'main' into segment-download-stats-2
shourya035 Jul 18, 2023
0d00a79
Adding more integ tests on stats correctness
shourya035 Jul 18, 2023
41497f3
Fixed Spotless checks
shourya035 Jul 18, 2023
7a4d993
Addressing comments
ashking94 Jul 19, 2023
ac7df33
Retrigger Integs
shourya035 Jul 19, 2023
70a60f2
Retrigger Integs 2
shourya035 Jul 19, 2023
2bcf467
Merge branch 'opensearch-project:main' into segment-download-stats-2
shourya035 Jul 19, 2023
df43b9c
Fixing assertion on Stats UT
shourya035 Jul 19, 2023
6e3d1da
Fixing stats correctness test cases for RemoteStoreStatsIT
shourya035 Jul 19, 2023
bf43cb1
Fixing stats IT flakyness
shourya035 Jul 19, 2023
b171ce9
Merge branch 'opensearch-project:main' into segment-download-stats-2
shourya035 Jul 20, 2023
6db95df
Changing replica count on IT
shourya035 Jul 20, 2023
2d552b9
Changing upper and lower bounds of random doc ingestion in IT
shourya035 Jul 20, 2023
edbfde3
Addressing comments
shourya035 Jul 20, 2023
c1b447b
Addressing comments
shourya035 Jul 25, 2023
684d463
Merge branch 'opensearch-project:main' into segment-download-stats-2
shourya035 Jul 25, 2023
d380d1d
Adding JavaDoc for new interface
shourya035 Jul 25, 2023
4a866ae
Manually invoking refresh on ITs
shourya035 Jul 25, 2023
91378d1
Retrigger tests
shourya035 Jul 25, 2023
e843adc
Abstracting out stats population logic to different class
shourya035 Jul 27, 2023
7a2439d
Fixing UTs and removing repeated isRemoteStoreEnabled checks
shourya035 Jul 27, 2023
877e98e
Merge branch 'opensearch-project:main' into segment-download-stats-2
shourya035 Jul 28, 2023
0d248cb
Merge branch 'opensearch-project:main' into segment-download-stats-2
shourya035 Jul 31, 2023
8f8ed29
Moving download stats population to StoreDirectory
shourya035 Jul 31, 2023
c54cd01
Fixing pressure service UTs and updating tracker javadocs
shourya035 Jul 31, 2023
6ce96a8
Merge branch 'opensearch-project:main' into segment-download-stats-2
shourya035 Aug 1, 2023
508eb22
Addressing comments
shourya035 Aug 1, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
Moving download stats population to StoreDirectory
Signed-off-by: Shourya Dutta Biswas <114977491+shourya035@users.noreply.github.com>
  • Loading branch information
shourya035 committed Jul 31, 2023
commit 8f8ed298a9a4b879b645d4d4d9abbc25c5c80efd
Original file line number Diff line number Diff line change
Expand Up @@ -72,7 +72,7 @@ public void testStatsResponseFromAllNodes() {
assertEquals(1, matches.size());
RemoteSegmentTransferTracker.Stats stats = matches.get(0).getStats();
validateUploadStats(stats);
assertEquals(0, stats.downloadBytesStarted);
assertEquals(0, stats.directoryFileTransferTrackerStats.downloadBytesStarted);
}

// Step 3 - Enable replicas on the existing indices and ensure that download
Expand All @@ -93,7 +93,7 @@ public void testStatsResponseFromAllNodes() {
RemoteSegmentTransferTracker.Stats stats = stat.getStats();
if (routing.primary()) {
validateUploadStats(stats);
assertEquals(0, stats.downloadBytesStarted);
assertEquals(0, stats.directoryFileTransferTrackerStats.downloadBytesStarted);
} else {
validateDownloadStats(stats);
assertEquals(0, stats.totalUploadsStarted);
Expand Down Expand Up @@ -124,7 +124,7 @@ public void testStatsResponseAllShards() {
assertTrue(response.getRemoteStoreStats() != null && response.getRemoteStoreStats().length == 3);
RemoteSegmentTransferTracker.Stats stats = response.getRemoteStoreStats()[0].getStats();
validateUploadStats(stats);
assertEquals(0, stats.downloadBytesStarted);
assertEquals(0, stats.directoryFileTransferTrackerStats.downloadBytesStarted);

// Step 3 - Enable replicas on the existing indices and ensure that download
// stats are being populated as well
Expand All @@ -138,7 +138,7 @@ public void testStatsResponseAllShards() {
stats = stat.getStats();
if (routing.primary()) {
validateUploadStats(stats);
assertEquals(0, stats.downloadBytesStarted);
assertEquals(0, stats.directoryFileTransferTrackerStats.downloadBytesStarted);
} else {
validateDownloadStats(stats);
assertEquals(0, stats.totalUploadsStarted);
Expand Down Expand Up @@ -171,7 +171,7 @@ public void testStatsResponseFromLocalNode() {
assertTrue(response.getRemoteStoreStats() != null && response.getRemoteStoreStats().length == 1);
RemoteSegmentTransferTracker.Stats stats = response.getRemoteStoreStats()[0].getStats();
validateUploadStats(stats);
assertEquals(0, stats.downloadBytesStarted);
assertEquals(0, stats.directoryFileTransferTrackerStats.downloadBytesStarted);
}
changeReplicaCountAndEnsureGreen(1);
for (String node : nodes) {
Expand All @@ -188,7 +188,7 @@ public void testStatsResponseFromLocalNode() {
RemoteSegmentTransferTracker.Stats stats = stat.getStats();
if (routing.primary()) {
validateUploadStats(stats);
assertEquals(0, stats.downloadBytesStarted);
assertEquals(0, stats.directoryFileTransferTrackerStats.downloadBytesStarted);
} else {
validateDownloadStats(stats);
assertEquals(0, stats.totalUploadsStarted);
Expand Down Expand Up @@ -240,7 +240,10 @@ public void testDownloadStatsCorrectnessSinglePrimarySingleReplica() throws Exce
.collect(Collectors.toList())
.get(0)
.getStats();
assertTrue(zeroStateReplicaStats.downloadBytesStarted == 0 && zeroStateReplicaStats.downloadBytesSucceeded == 0);
assertTrue(
zeroStateReplicaStats.directoryFileTransferTrackerStats.downloadBytesStarted == 0
&& zeroStateReplicaStats.directoryFileTransferTrackerStats.downloadBytesSucceeded == 0
);

// Index documents
for (int i = 1; i <= randomIntBetween(5, 10); i++) {
Expand All @@ -267,17 +270,18 @@ public void testDownloadStatsCorrectnessSinglePrimarySingleReplica() throws Exce
assertTrue(primaryStats.totalUploadsStarted > 0);
assertTrue(primaryStats.totalUploadsSucceeded > 0);
assertTrue(
replicaStats.downloadBytesStarted > 0
&& primaryStats.uploadBytesStarted - zeroStatePrimaryStats.uploadBytesStarted == replicaStats.downloadBytesStarted
replicaStats.directoryFileTransferTrackerStats.downloadBytesStarted > 0
&& primaryStats.uploadBytesStarted
- zeroStatePrimaryStats.uploadBytesStarted == replicaStats.directoryFileTransferTrackerStats.downloadBytesStarted
);
assertTrue(
replicaStats.downloadBytesSucceeded > 0
replicaStats.directoryFileTransferTrackerStats.downloadBytesSucceeded > 0
&& primaryStats.uploadBytesSucceeded
- zeroStatePrimaryStats.uploadBytesSucceeded == replicaStats.downloadBytesSucceeded
- zeroStatePrimaryStats.uploadBytesSucceeded == replicaStats.directoryFileTransferTrackerStats.downloadBytesSucceeded
);
// Assert zero failures
assertEquals(0, primaryStats.uploadBytesFailed);
assertEquals(0, replicaStats.downloadBytesFailed);
assertEquals(0, replicaStats.directoryFileTransferTrackerStats.downloadBytesFailed);
}, 60, TimeUnit.SECONDS);
}
}
Expand Down Expand Up @@ -328,7 +332,10 @@ public void testDownloadStatsCorrectnessSinglePrimaryMultipleReplicaShards() thr
.filter(remoteStoreStats -> !remoteStoreStats.getShardRouting().primary())
.collect(Collectors.toList());
zeroStateReplicaStats.forEach(stats -> {
assertTrue(stats.getStats().downloadBytesStarted == 0 && stats.getStats().downloadBytesSucceeded == 0);
assertTrue(
stats.getStats().directoryFileTransferTrackerStats.downloadBytesStarted == 0
&& stats.getStats().directoryFileTransferTrackerStats.downloadBytesSucceeded == 0
);
});

int currentNodesInCluster = client().admin().cluster().prepareHealth().get().getNumberOfDataNodes();
Expand All @@ -353,9 +360,9 @@ public void testDownloadStatsCorrectnessSinglePrimaryMultipleReplicaShards() thr
uploadBytesSucceeded = stats.uploadBytesSucceeded;
uploadBytesFailed = stats.uploadBytesFailed;
} else {
downloadBytesStarted.add(stats.downloadBytesStarted);
downloadBytesSucceeded.add(stats.downloadBytesSucceeded);
downloadBytesFailed.add(stats.downloadBytesFailed);
downloadBytesStarted.add(stats.directoryFileTransferTrackerStats.downloadBytesStarted);
downloadBytesSucceeded.add(stats.directoryFileTransferTrackerStats.downloadBytesSucceeded);
downloadBytesFailed.add(stats.directoryFileTransferTrackerStats.downloadBytesFailed);
}
}

Expand Down Expand Up @@ -452,6 +459,7 @@ public void testStatsOnRemoteStoreRestore() throws IOException {

// Index some docs to ensure segments being uploaded to remote store
indexDocs();
refresh(INDEX_NAME);

// Stop one data node to force the index into a red state
internalCluster().stopRandomDataNode();
Expand All @@ -461,7 +469,6 @@ public void testStatsOnRemoteStoreRestore() throws IOException {
internalCluster().startDataOnlyNode();

// Restore index from remote store
assertAcked(client().admin().indices().prepareClose(INDEX_NAME).get());
client().admin().cluster().restoreRemoteStore(new RestoreRemoteStoreRequest().indices(INDEX_NAME), PlainActionFuture.newFuture());

// Ensure that the index is green
Expand All @@ -481,7 +488,10 @@ public void testStatsOnRemoteStoreRestore() throws IOException {
assertTrue(
segmentTracker.totalUploadsStarted > 0 && segmentTracker.totalUploadsSucceeded > 0 && segmentTracker.totalUploadsFailed == 0
);
assertTrue(segmentTracker.downloadBytesStarted > 0 && segmentTracker.downloadBytesSucceeded > 0);
assertTrue(
segmentTracker.directoryFileTransferTrackerStats.downloadBytesStarted > 0
&& segmentTracker.directoryFileTransferTrackerStats.downloadBytesSucceeded > 0
);
});
}

Expand Down Expand Up @@ -510,7 +520,10 @@ public void testNonZeroPrimaryStatsOnNewlyCreatedIndexWithZeroDocs() throws Exce
&& segmentTracker.totalUploadsFailed == 0
);
} else {
assertTrue(segmentTracker.downloadBytesStarted == 0 && segmentTracker.downloadBytesSucceeded == 0);
assertTrue(
segmentTracker.directoryFileTransferTrackerStats.downloadBytesStarted == 0
&& segmentTracker.directoryFileTransferTrackerStats.downloadBytesSucceeded == 0
);
}
});
}, 5, TimeUnit.SECONDS);
Expand Down Expand Up @@ -564,13 +577,13 @@ private void validateUploadStats(RemoteSegmentTransferTracker.Stats stats) {
}

private void validateDownloadStats(RemoteSegmentTransferTracker.Stats stats) {
assertTrue(stats.lastDownloadTimestampMs > 0);
assertTrue(stats.downloadBytesStarted > 0);
assertTrue(stats.downloadBytesSucceeded > 0);
assertEquals(stats.downloadBytesFailed, 0);
assertTrue(stats.lastSuccessfulSegmentDownloadBytes > 0);
assertTrue(stats.downloadBytesMovingAverage > 0);
assertTrue(stats.downloadBytesPerSecMovingAverage > 0);
assertTrue(stats.directoryFileTransferTrackerStats.lastDownloadTimestampMs > 0);
assertTrue(stats.directoryFileTransferTrackerStats.downloadBytesStarted > 0);
assertTrue(stats.directoryFileTransferTrackerStats.downloadBytesSucceeded > 0);
assertEquals(stats.directoryFileTransferTrackerStats.downloadBytesFailed, 0);
assertTrue(stats.directoryFileTransferTrackerStats.lastSuccessfulSegmentDownloadBytes > 0);
assertTrue(stats.directoryFileTransferTrackerStats.downloadBytesMovingAverage > 0);
assertTrue(stats.directoryFileTransferTrackerStats.downloadBytesPerSecMovingAverage > 0);
}

// Validate if the shardRouting obtained from cluster state contains the exact same routing object
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -54,7 +54,7 @@ public XContentBuilder toXContent(XContentBuilder builder, Params params) throws
builder.startObject(Fields.SEGMENT);
builder.startObject(SubFields.DOWNLOAD);
// Ensuring that we are not showing 0 metrics to the user
if (remoteSegmentShardStats.downloadBytesStarted != 0) {
if (remoteSegmentShardStats.directoryFileTransferTrackerStats.downloadBytesStarted != 0) {
buildDownloadStats(builder);
}
builder.endObject();
Expand Down Expand Up @@ -105,18 +105,21 @@ private void buildUploadStats(XContentBuilder builder) throws IOException {
}

private void buildDownloadStats(XContentBuilder builder) throws IOException {
builder.field(DownloadStatsFields.LAST_SYNC_TIMESTAMP, remoteSegmentShardStats.lastDownloadTimestampMs);
builder.field(
DownloadStatsFields.LAST_SYNC_TIMESTAMP,
remoteSegmentShardStats.directoryFileTransferTrackerStats.lastDownloadTimestampMs
);
builder.startObject(DownloadStatsFields.TOTAL_DOWNLOADS_IN_BYTES)
.field(SubFields.STARTED, remoteSegmentShardStats.downloadBytesStarted)
.field(SubFields.SUCCEEDED, remoteSegmentShardStats.downloadBytesSucceeded)
.field(SubFields.FAILED, remoteSegmentShardStats.downloadBytesFailed);
.field(SubFields.STARTED, remoteSegmentShardStats.directoryFileTransferTrackerStats.downloadBytesStarted)
.field(SubFields.SUCCEEDED, remoteSegmentShardStats.directoryFileTransferTrackerStats.downloadBytesSucceeded)
.field(SubFields.FAILED, remoteSegmentShardStats.directoryFileTransferTrackerStats.downloadBytesFailed);
builder.endObject();
builder.startObject(DownloadStatsFields.DOWNLOAD_SIZE_IN_BYTES)
.field(SubFields.LAST_SUCCESSFUL, remoteSegmentShardStats.lastSuccessfulSegmentDownloadBytes)
.field(SubFields.MOVING_AVG, remoteSegmentShardStats.downloadBytesMovingAverage);
.field(SubFields.LAST_SUCCESSFUL, remoteSegmentShardStats.directoryFileTransferTrackerStats.lastSuccessfulSegmentDownloadBytes)
.field(SubFields.MOVING_AVG, remoteSegmentShardStats.directoryFileTransferTrackerStats.downloadBytesMovingAverage);
builder.endObject();
builder.startObject(DownloadStatsFields.DOWNLOAD_SPEED_IN_BYTES_PER_SEC)
.field(SubFields.MOVING_AVG, remoteSegmentShardStats.downloadBytesPerSecMovingAverage);
.field(SubFields.MOVING_AVG, remoteSegmentShardStats.directoryFileTransferTrackerStats.downloadBytesPerSecMovingAverage);
builder.endObject();
}

Expand Down Expand Up @@ -211,11 +214,6 @@ static final class DownloadStatsFields {
*/
static final String LAST_SYNC_TIMESTAMP = "last_sync_timestamp";

/**
* Total number of sync from the remote store for a specific shard
*/
static final String TOTAL_SYNCS_FROM_REMOTE = "total_syncs_from_remote";

/**
* Total bytes of segment files downloaded from the remote store for a specific shard
*/
Expand All @@ -230,11 +228,6 @@ static final class DownloadStatsFields {
* Speed (in bytes/sec) for segment file downloads
*/
static final String DOWNLOAD_SPEED_IN_BYTES_PER_SEC = "download_speed_in_bytes_per_sec";

/**
* Time taken (in millis) for each segment file downloaded
*/
static final String DOWNLOAD_LATENCY_IN_MILLIS = "download_latency_in_millis";
}

/**
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -76,6 +76,7 @@ public void afterIndexShardCreated(IndexShard indexShard) {
shardId,
new RemoteSegmentTransferTracker(
shardId,
indexShard.store().getDirectoryFileTransferTracker(),
pressureSettings.getUploadBytesMovingAverageWindowSize(),
pressureSettings.getUploadBytesPerSecMovingAverageWindowSize(),
pressureSettings.getUploadTimeMovingAverageWindowSize()
Expand Down
Loading