Add tracing instrumentation for indexing paths #10273

Merged


rayshrey
Contributor

@rayshrey rayshrey commented Sep 29, 2023

Description

Added tracing instrumentation for indexing paths.

Sample spans:

[Four screenshots, captured 2023-10-19 between 10:01:46 PM and 10:03:30 PM]

{
    "traceID": "47a376fb3e446903a8c1100c38edfde1",
    "spanID": "5f3b3a3227c7c87d",
    "operationName": "bulkShardAction",
    "references": [
        {
            "refType": "CHILD_OF",
            "traceID": "47a376fb3e446903a8c1100c38edfde1",
            "spanID": "67488bb3020a56ff"
        }
    ],
    "startTime": 1697732790132198,
    "duration": 162497,
    "tags": [
        {
            "key": "otel.library.name",
            "type": "string",
            "value": "org.opensearch.telemetry"
        },
        {
            "key": "refresh_policy",
            "type": "string",
            "value": "false"
        },
        {
            "key": "bulk_request_items",
            "type": "int64",
            "value": 6
        },
        {
            "key": "index",
            "type": "string",
            "value": "moviess"
        },
        {
            "key": "shard_id",
            "type": "int64",
            "value": 0
        },
        {
            "key": "thread.name",
            "type": "string",
            "value": "opensearch[runTask-0][transport_worker][T#5]"
        },
        {
            "key": "node_id",
            "type": "string",
            "value": "KHWh8yCPSC-cT0CfS8AgRA"
        },
        {
            "key": "span.kind",
            "type": "string",
            "value": "server"
        },
        {
            "key": "internal.span.format",
            "type": "string",
            "value": "proto"
        }
    ],
    "logs": [],
    "processID": "p1",
    "warnings": null
}

{
    "traceID": "47a376fb3e446903a8c1100c38edfde1",
    "spanID": "7ff2c5c2910a11c7",
    "operationName": "dispatchedShardOperationOnPrimary",
    "references": [
        {
            "refType": "CHILD_OF",
            "traceID": "47a376fb3e446903a8c1100c38edfde1",
            "spanID": "0fa4e84f91da6d81"
        }
    ],
    "startTime": 1697732790135496,
    "duration": 137502,
    "tags": [
        {
            "key": "otel.library.name",
            "type": "string",
            "value": "org.opensearch.telemetry"
        },
        {
            "key": "refresh_policy",
            "type": "string",
            "value": "false"
        },
        {
            "key": "bulk_request_items",
            "type": "int64",
            "value": 6
        },
        {
            "key": "index",
            "type": "string",
            "value": "moviess"
        },
        {
            "key": "shard_id",
            "type": "int64",
            "value": 0
        },
        {
            "key": "thread.name",
            "type": "string",
            "value": "opensearch[runTask-0][write][T#1]"
        },
        {
            "key": "node_id",
            "type": "string",
            "value": "KHWh8yCPSC-cT0CfS8AgRA"
        },
        {
            "key": "span.kind",
            "type": "string",
            "value": "server"
        },
        {
            "key": "internal.span.format",
            "type": "string",
            "value": "proto"
        }
    ],
    "logs": [],
    "processID": "p1",
    "warnings": null
}

{
    "traceID": "47a376fb3e446903a8c1100c38edfde1",
    "spanID": "7c925fc99accc3fb",
    "operationName": "dispatchedShardOperationOnReplica",
    "references": [
        {
            "refType": "CHILD_OF",
            "traceID": "47a376fb3e446903a8c1100c38edfde1",
            "spanID": "cf1e69713797cc94"
        }
    ],
    "startTime": 1697732790269806,
    "duration": 22670,
    "tags": [
        {
            "key": "otel.library.name",
            "type": "string",
            "value": "org.opensearch.telemetry"
        },
        {
            "key": "shard_id",
            "type": "int64",
            "value": 0
        },
        {
            "key": "refresh_policy",
            "type": "string",
            "value": "false"
        },
        {
            "key": "thread.name",
            "type": "string",
            "value": "opensearch[runTask-2][write][T#3]"
        },
        {
            "key": "node_id",
            "type": "string",
            "value": "prD53FEMSKam7RdppmaR_Q"
        },
        {
            "key": "index",
            "type": "string",
            "value": "moviess"
        },
        {
            "key": "bulk_request_items",
            "type": "int64",
            "value": 6
        },
        {
            "key": "span.kind",
            "type": "string",
            "value": "server"
        },
        {
            "key": "internal.span.format",
            "type": "string",
            "value": "proto"
        }
    ],
    "logs": [],
    "processID": "p1",
    "warnings": [
        "clock skew adjustment disabled; not applying calculated delta of -3.266521ms"
    ]
}
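
The three sample spans above can be compared more easily once their tags are flattened and their timestamps put on a common timeline. The following is a minimal sketch, not part of the PR. It assumes the spans are a Jaeger-style JSON export in which `startTime` is microseconds since the Unix epoch and `duration` is in microseconds. The helper names (`flatten_tags`, `to_utc`, `summarize`) are hypothetical, and only the fields used here are copied from the spans.

```python
from datetime import datetime, timedelta, timezone

# Fields copied from the three sample spans above; tags not used here are omitted.
SPANS = [
    {
        "operationName": "bulkShardAction",
        "spanID": "5f3b3a3227c7c87d",
        "references": [{"refType": "CHILD_OF", "spanID": "67488bb3020a56ff"}],
        "startTime": 1697732790132198,
        "duration": 162497,
        "tags": [
            {"key": "index", "type": "string", "value": "moviess"},
            {"key": "node_id", "type": "string", "value": "KHWh8yCPSC-cT0CfS8AgRA"},
        ],
    },
    {
        "operationName": "dispatchedShardOperationOnPrimary",
        "spanID": "7ff2c5c2910a11c7",
        "references": [{"refType": "CHILD_OF", "spanID": "0fa4e84f91da6d81"}],
        "startTime": 1697732790135496,
        "duration": 137502,
        "tags": [
            {"key": "index", "type": "string", "value": "moviess"},
            {"key": "node_id", "type": "string", "value": "KHWh8yCPSC-cT0CfS8AgRA"},
        ],
    },
    {
        "operationName": "dispatchedShardOperationOnReplica",
        "spanID": "7c925fc99accc3fb",
        "references": [{"refType": "CHILD_OF", "spanID": "cf1e69713797cc94"}],
        "startTime": 1697732790269806,
        "duration": 22670,
        "tags": [
            {"key": "index", "type": "string", "value": "moviess"},
            {"key": "node_id", "type": "string", "value": "prD53FEMSKam7RdppmaR_Q"},
        ],
    },
]


def flatten_tags(span):
    """Turn the list of {key, type, value} tag objects into a plain dict."""
    return {tag["key"]: tag["value"] for tag in span["tags"]}


def to_utc(microseconds):
    """Convert epoch microseconds (assumed unit) to an aware UTC datetime."""
    seconds, remainder = divmod(microseconds, 1_000_000)
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return base + timedelta(microseconds=remainder)


def summarize(spans):
    """Return one row per span, ordered by start time, with times in milliseconds."""
    origin = min(span["startTime"] for span in spans)
    rows = []
    for span in sorted(spans, key=lambda s: s["startTime"]):
        tags = flatten_tags(span)
        parents = [r["spanID"] for r in span["references"] if r["refType"] == "CHILD_OF"]
        rows.append({
            "name": span["operationName"],
            "node_id": tags.get("node_id"),
            "parent": parents[0] if parents else None,
            "offset_ms": (span["startTime"] - origin) / 1000,
            "duration_ms": span["duration"] / 1000,
        })
    return rows


if __name__ == "__main__":
    print("earliest span starts at", to_utc(min(s["startTime"] for s in SPANS)).isoformat())
    for row in summarize(SPANS):
        print(f'{row["name"]:<36} node={row["node_id"]} '
              f'+{row["offset_ms"]:.3f} ms, {row["duration_ms"]:.3f} ms')
```

Read this way, the primary write starts a few milliseconds into `bulkShardAction` on the same node, and the replica write runs on a different node. Because the replica's timestamps come from another machine's clock, the clock skew warning on that span should be kept in mind when comparing offsets across nodes.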

Related Issues

Partially resolves #8555

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT:
  • URL:
  • CommitID: 6a39f5c
    Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green.
    Is the failure a flaky test unrelated to your change?

@github-actions
Contributor

github-actions bot commented Sep 29, 2023

Compatibility status:

Checks if related components are compatible with change 533c716

Incompatible components

  • https://github.com/opensearch-project/cross-cluster-replication.git

Skipped components

Compatible components

  • https://github.com/opensearch-project/security.git
  • https://github.com/opensearch-project/alerting.git
  • https://github.com/opensearch-project/index-management.git
  • https://github.com/opensearch-project/anomaly-detection.git
  • https://github.com/opensearch-project/job-scheduler.git
  • https://github.com/opensearch-project/asynchronous-search.git
  • https://github.com/opensearch-project/sql.git
  • https://github.com/opensearch-project/common-utils.git
  • https://github.com/opensearch-project/observability.git
  • https://github.com/opensearch-project/k-nn.git
  • https://github.com/opensearch-project/reporting.git
  • https://github.com/opensearch-project/security-analytics.git
  • https://github.com/opensearch-project/custom-codecs.git
  • https://github.com/opensearch-project/performance-analyzer.git
  • https://github.com/opensearch-project/opensearch-oci-object-storage.git
  • https://github.com/opensearch-project/ml-commons.git
  • https://github.com/opensearch-project/performance-analyzer-rca.git
  • https://github.com/opensearch-project/geospatial.git
  • https://github.com/opensearch-project/notifications.git
  • https://github.com/opensearch-project/neural-search.git

@sarthakaggarwal97
Contributor

@rayshrey I think the PR needs refactoring. Can you please run ./gradlew :server:precommit and ./gradlew :server:spotlessApply to check for issues?
Please add an entry to the changelog as well!

@sarthakaggarwal97
Contributor

@rayshrey Can you also please attach the output of the traces?

@Gaganjuneja
Contributor

Please also make the span name meaningful, so that somebody just looking at the span can recognise what is happening in it.

@rayshrey
Contributor Author

rayshrey commented Oct 3, 2023

Please also make the span name meaningful, so that somebody just looking at the span can recognise what is happening in it.

@Gaganjuneja We are currently using shardPrimaryWrite, shardReplicaWrite and bulkShardAction as the span names.
In my view, these clearly convey what is happening in the span from an indexing perspective.

Do you have any suggestions or references for the span names?

@github-actions
Contributor

github-actions bot commented Oct 3, 2023

Gradle Check (Jenkins) Run Completed with:

@sarthakaggarwal97
Contributor

@reta lgtm!

@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreIT.testRestartPrimary

Signed-off-by: Shreyansh Ray <rayshrey@amazon.com>
@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

Signed-off-by: Shreyansh Ray <rayshrey@amazon.com>
@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

@rayshrey
Contributor Author

@reta Thanks for the approval. Can we resolve the open conversations now and merge this PR, since we have approvals from everyone concerned?

@reta
Collaborator

reta commented Oct 20, 2023

@reta Thanks for the approval. Can we resolve the open conversations now and merge this PR, since we have approvals from everyone concerned?

One minor thing, sorry about that, we missed it

Signed-off-by: Shreyansh Ray <rayshrey@amazon.com>
@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.cluster.allocation.ClusterRerouteIT.testDelayWithALargeAmountOfShards

@reta reta merged commit e691df0 into opensearch-project:main Oct 20, 2023
16 checks passed
@reta reta added the backport 2.x Backport to 2.x branch label Oct 20, 2023
@opensearch-trigger-bot
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-10273-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 e691df09c66dcc1693897543fd7633c4b208ce48
# Push it to GitHub
git push --set-upstream origin backport/backport-10273-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-10273-to-2.x.

@reta
Collaborator

reta commented Oct 20, 2023

@rayshrey please send manual backport to 2.x, thank you

austintlee pushed a commit to austintlee/OpenSearch that referenced this pull request Oct 23, 2023
)

* Add tracing instrumentation for indexing paths
* Fix failing tests and review changes
* Fix test failures due to Span not being properly closed
* Changes to spans in primary and replica actions
* Review comments fixes and refactoring
* Precommit auto-changes
* Add refresh policy as attribute
* Fix changelog entry
* Instrument primary/replica write in TransportWriteAction instead of TransportShardBulkAction
* Modify SpanBuilder
* spotlessApply and precommit
* Change span names
* Pass Noop Tracer instead of injected tracer
* Reverting previous changes
* Remove tracer variable from TransportShardBulkAction

---------

Signed-off-by: Shreyansh Ray <rayshrey@amazon.com>
@monusingh-1

@rayshrey please send manual backport to 2.x, thank you

@rayshrey backport to 2.x

rayshrey added a commit to rayshrey/OpenSearch that referenced this pull request Oct 27, 2023
reta pushed a commit that referenced this pull request Oct 27, 2023
@rayshrey rayshrey deleted the indexing-tracing-instrumentation branch November 2, 2023 09:50
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Labels
  • backport 2.x: Backport to 2.x branch
  • backport-failed
  • enhancement: Enhancement or improvement to existing feature or request
  • Indexing: Indexing, Bulk Indexing and anything related to indexing
  • Search: Search query, autocomplete ...etc
  • v2.12.0: Issues and PRs related to version 2.12.0
  • v3.0.0: Issues and PRs related to version 3.0.0
Development

Successfully merging this pull request may close these issues.

[Tracing Instrumentation] Add instrumentation at deep indexing level
6 participants