Skip to content

[BUG] PPL parser rejects new join type=left syntax with ON clause due to incorrect ambiguity check #4459

@alexey-temnikov

Description

@alexey-temnikov

Query Information

PPL Command/Query:

source=product_reviews
| stats COUNT() AS RatingCount, AVG(rating) AS Avg_ComputedRating by product_id
| where RatingCount >= 100
| join type=left left=RATING right=PRODUCT ON RATING.product_id = PRODUCT.product_id
[
  source=product_info | fields product_id, average_rating, title
]
| eval DiffRating=Avg_ComputedRating - PRODUCT.average_rating
| fields RATING.product_id, Avg_ComputedRating, PRODUCT.average_rating, DiffRating, PRODUCT.title

Expected Result:
The query should execute successfully using the PPL 3.3.0+ extended syntax with type=left option, as documented in the join command documentation.

Actual Result:

{
  "error": {
    "reason": "Invalid Query",
    "details": "Join type is ambiguous, remove either the join type before JOIN keyword or 'type=' option.",
    "type": "SemanticCheckException"
  },
  "status": 400
}

The error message incorrectly suggests that both syntaxes are present (join type before JOIN keyword AND type= option), when only the type= option is used.

Dataset Information

Dataset/Schema Type

  • OpenTelemetry (OTEL)
  • Simple Schema for Observability (SS4O)
  • Open Cybersecurity Schema Framework (OCSF)
  • Custom (details below)

Index Mapping

product_reviews:

{
  "mappings": {
    "properties": {
      "product_id": { "type": "text" },
      "rating": { "type": "long" }
    }
  }
}

product_info:

{
  "mappings": {
    "properties": {
      "product_id": { "type": "text" },
      "average_rating": { "type": "float" },
      "title": { "type": "text" }
    }
  }
}

Sample Data

product_reviews (105 documents with product_id="P001" and ratings 3-5):

{"product_id": "P001", "rating": 4}

product_info:

{"product_id": "P001", "average_rating": 4.2, "title": "Sample Product"}

Bug Description

Issue Summary:
PPL parser incorrectly rejects the new PPL 3.3.0+ join syntax join type=left ... ON ... with a misleading error about ambiguous join type, even though only the type= option is specified and no join type appears before the JOIN keyword.

Steps to Reproduce:

  1. Create two indices with sample data (product_reviews and product_info)
  2. Execute a PPL query using the new syntax: join type=left left=RATING right=PRODUCT ON ...
  3. Observe the SemanticCheckException error

Impact:

  • Users cannot use the documented PPL 3.3.0+ extended syntax with type= option when using ON/WHERE join criteria
  • The error message is misleading, suggesting both syntaxes are present when only one is used
  • Forces users to use the older syntax (e.g., left join instead of join type=left)

Tentative Root Cause:
Based on preliminary analysis, the issue appears to be in /ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java at lines 214-232:

public UnresolvedPlan visitJoinCommand(OpenSearchPPLParser.JoinCommandContext ctx) {
  boolean sqlLike = ctx.joinCriteria() != null;
  Join.JoinType joinType = null;
  if (sqlLike) {
    joinType = ArgumentFactory.getJoinType(ctx.sqlLikeJoinType());  // Line 219
  }
  // ...
  if (argumentMap.get("type") != null) {
    Join.JoinType joinTypeFromArgument = ArgumentFactory.getJoinType(argumentMap);
    if (sqlLike && joinType != joinTypeFromArgument) {  // Line 226
      throw new SemanticCheckException(
          "Join type is ambiguous, remove either the join type before JOIN keyword or 'type='"
              + " option.");
    }
    joinType = joinTypeFromArgument;
  }
}

The problem occurs because:

  1. When ON clause is present, sqlLike is set to true
  2. ArgumentFactory.getJoinType(ctx.sqlLikeJoinType()) is called with null context
  3. In /ppl/src/main/java/org/opensearch/sql/ppl/utils/ArgumentFactory.java , when context is null, it defaults to Join.JoinType.INNER
  4. The comparison at line 226 compares this default INNER with the explicit LEFT from type=left, finds they differ, and throws the ambiguity error

The logic incorrectly treats the absence of a join type before JOIN as an implicit INNER join, then compares it against the explicit type= option.

Tentative Proposed Fix:
This is a preliminary analysis and requires further validation
The fix should distinguish between an explicitly specified join type and a null/absent join type. One approach:

In AstBuilder.java, modify the logic to only check for ambiguity when sqlLikeJoinType context is explicitly provided (not null):

if (sqlLike) {
  if (ctx.sqlLikeJoinType() != null) {
    joinType = ArgumentFactory.getJoinType(ctx.sqlLikeJoinType());
  }
}
// ...
if (argumentMap.get("type") != null) {
  Join.JoinType joinTypeFromArgument = ArgumentFactory.getJoinType(argumentMap);
  if (sqlLike && joinType != null && joinType != joinTypeFromArgument) {
    throw new SemanticCheckException(
        "Join type is ambiguous, remove either the join type before JOIN keyword or 'type='"
            + " option.");
  }
  joinType = joinTypeFromArgument;
}
if (joinType == null) {
  joinType = Join.JoinType.INNER;  // Apply default only if no type was specified
}

Workaround:
Use the older PPL 3.0.0 syntax with join type before the JOIN keyword

Environment Information

OpenSearch Version:
OpenSearch 3.3.0-SNAPSHOT (build hash: e6382333112715aee285cb81097a6ee51fb6802f, build date: 2025-10-07T01:45:05.033269Z)

Additional Details:

  • Tested on local environment
  • SQL plugin version: bundled with OpenSearch 3.3.0-SNAPSHOT
  • The issue affects the PPL parser/semantic checker, not the execution engine

Additional Notes

Documentation Clarity Issue:
The documentation at /docs/user/ppl/cmd/join.rst uses type=outer in examples, which is ambiguous. The term "outer" doesn't clearly indicate whether it's a left outer join or a right outer join. According to the code, OUTER is treated as an alias for LEFT, but this should be explicitly documented. Consider using more explicit type names, such as type=left or type=right, in documentation examples to avoid confusion.

Metadata

Metadata

Assignees

No one assigned

    Labels

    PPLPiped processing languagebugSomething isn't workingcalcitecalcite migration releatedv3.3.0

    Type

    No type

    Projects

    Status

    Done

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions