Skip to content

Conversation

@jiekun
Copy link
Member

@jiekun jiekun commented Jul 14, 2025

Describe Your Changes

Fix #21

Checklist

The following checks are mandatory:

@jiekun jiekun marked this pull request as draft July 14, 2025 03:58
@jiekun jiekun requested a review from Copilot July 26, 2025 15:50
@jiekun jiekun marked this pull request as ready for review July 26, 2025 15:50

This comment was marked as outdated.

@jiekun jiekun requested a review from Copilot July 28, 2025 01:42
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds a trace_id index stream to accelerate query execution for trace lookups. The feature introduces an index stream that stores trace IDs with their start timestamps to enable faster trace retrieval by avoiding full time range scans.

  • Adds trace_id index constants and partitioning configuration
  • Implements fast path trace lookup using the index stream with fallback to step-based search
  • Refactors field name generation for events and links to append index suffixes correctly

Reviewed Changes

Copilot reviewed 6 out of 15 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
lib/protoparser/opentelemetry/pb/trace_fields.go Defines constants for trace_id index stream name, field name, and partition count
go.mod Adds fastcache dependency for trace ID deduplication
app/vtselect/traces/query/query.go Implements fast path trace lookup using index stream and refactors existing search logic
app/vtselect/traces/jaeger/model_test.go Adds test for removeArrayIndex function
app/vtselect/traces/jaeger/model.go Refactors field parsing to extract attribute names and indices correctly
app/vtinsert/opentelemetry/opentelemetry.go Creates trace_id index entries and refactors field generation with suffix support
Comments suppressed due to low confidence (1)

app/vtselect/traces/jaeger/model.go:275

  • The function name 'extraAttributeNameAndIndex' is unclear. It should be named more descriptively, such as 'extractAttributeNameAndIndex' or 'parseAttributeNameAndIndex'.
func extraAttributeNameAndIndex(input string) (string, string) {

qStr := fmt.Sprintf(otelpb.TraceIDField+": %q", traceID)
// possible partition
// query: {trace_id_idx="xx"} AND trace_id:traceID
qStr := fmt.Sprintf(`{%s="%d"} AND %s:=%q | fields _time`, otelpb.TraceIDIndexStreamName, xxhash.Sum64String(traceID)%otelpb.TraceIDIndexPartitionCount, otelpb.TraceIDIndexFieldName, traceID)
Copy link

Copilot AI Jul 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The query uses ':=' operator but the constant name suggests it should be ':'. The operator ':=' is typically for assignment, not field matching. This should likely be ':' for field matching.

Suggested change
qStr := fmt.Sprintf(`{%s="%d"} AND %s:=%q | fields _time`, otelpb.TraceIDIndexStreamName, xxhash.Sum64String(traceID)%otelpb.TraceIDIndexPartitionCount, otelpb.TraceIDIndexFieldName, traceID)
qStr := fmt.Sprintf(`{%s="%d"} AND %s:%q | fields _time`, otelpb.TraceIDIndexStreamName, xxhash.Sum64String(traceID)%otelpb.TraceIDIndexPartitionCount, otelpb.TraceIDIndexFieldName, traceID)

Copilot uses AI. Check for mistakes.
return time.Time{}, nil
}

// findSpanByTraceID searches for spans from now to 0 time with steps.
Copy link

Copilot AI Jul 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says 'findSpanByTraceID' but the actual function name is 'findSpansByTraceID' (plural). The comment should match the function name.

Suggested change
// findSpanByTraceID searches for spans from now to 0 time with steps.
// findSpansByTraceID searches for spans from now to 0 time with steps.

Copilot uses AI. Check for mistakes.
}

// found result, perform extra search for traceMaxDurationWindow and then break.
return time.Unix(traceIDStartTimeInt/1e9, traceIDStartTimeInt%1e9), nil
Copy link

Copilot AI Jul 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The magic numbers '1e9' should be replaced with a named constant like 'time.Second.Nanoseconds()' or '1000000000' with a comment explaining the nanosecond conversion.

Suggested change
return time.Unix(traceIDStartTimeInt/1e9, traceIDStartTimeInt%1e9), nil
return time.Unix(traceIDStartTimeInt/time.Second.Nanoseconds(), traceIDStartTimeInt%time.Second.Nanoseconds()), nil

Copilot uses AI. Check for mistakes.
@jiekun jiekun merged commit 5787221 into master Jul 28, 2025
8 checks passed
@jiekun jiekun deleted the feature/trace-id-stream branch July 29, 2025 01:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bad performance when query with trace_id only

1 participant