[RFC] Add nomv command in PPL #5082

@srikanthpadakanti

Description

Proposal — Command syntax, options, and precise semantics

Command
nomv <field>

Arguments

field (required)

The name of a multivalue field whose values will be converted into a single value.

Constraints

  1. Must be a direct field reference.
  2. Must be a multivalue (array) field.
  3. If the field does not exist in the current schema, the command fails.

Precise semantics

nomv operates on the current result set produced by the pipeline. The command is row-local and does not perform grouping or aggregation.

Per-row transformation

For each input row:

  1. Read the value of <field>.
  2. If the value is a multivalue (array), convert it into a single scalar string.
  3. Replace <field> with the converted scalar value.
  4. Preserve all other fields unchanged.

Value rendering

Array elements are joined using a newline delimiter ("\n"). The delimiter is fixed and not configurable.

Ordering

  1. Value order in the resulting string follows the existing order of elements in the array at the point where nomv executes.
  2. If deterministic ordering is required, users must ensure ordering before nomv by controlling how the multivalue field is constructed.
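
As a rough illustration of the ordering rule, the Python sketch below models only the join step and shows that the output string mirrors whatever order the array already has. The helper name `join_values` is hypothetical and not part of PPL, and sorting upstream is just one possible way to fix the order before nomv runs.

```python
def join_values(values):
    """Join array elements with the fixed newline delimiter, keeping their order."""
    return "\n".join(values)

# Same elements in a different array order give a different output string.
print(repr(join_values(["b", "a"])))          # 'b\na'
# Ordering the array before the join makes the result deterministic.
print(repr(join_values(sorted(["b", "a"]))))  # 'a\nb'
```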

Null and missing handling

Missing field (schema-level)

The command fails with an error if <field> does not exist in the current schema.

Null field value (row-level)

If <field> exists but its value is null for a given row, the output value remains null.

Empty array

If <field> is an empty array, the output value is an empty string ("").

Null elements inside array

Null elements are ignored and do not produce delimiters in the output.
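
The row-level rules above can be sketched in Python as a single value-conversion function. This is a minimal model of the proposed semantics, not the plugin implementation: `nomv_value` and `DELIMITER` are hypothetical names, Python's `None` stands in for null, and the schema-level missing-field check is left out because it happens before any row is read.

```python
from typing import List, Optional

DELIMITER = "\n"  # fixed by the proposal, not configurable


def nomv_value(value: Optional[List[Optional[str]]]) -> Optional[str]:
    """Convert one row's array value to a single string per the rules above."""
    if value is None:
        return None  # null field value stays null
    # Drop null elements so they contribute neither text nor delimiters.
    present = [str(v) for v in value if v is not None]
    return DELIMITER.join(present)  # an empty array yields ""
```

Under this model, `nomv_value(["a", None, "b"])` returns `"a\nb"`, `nomv_value([])` returns `""`, and `nomv_value(None)` returns `None`.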

Type transformation

Input type: ARRAY
Output type: STRING

Error handling

The command fails if:

  1. The target field does not exist in the current schema.
  2. The target is not a direct field reference.
  3. The target field is not a multivalue (array) type.
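
The three failure conditions could be checked before execution along the following lines. This Python sketch makes several assumptions: the schema is modeled as a plain name-to-type mapping, a "direct field reference" is approximated as a bare identifier, the order of the checks is arbitrary, and the wording of the first error message is invented for illustration. The other two messages follow the examples in this proposal.

```python
import re
from typing import Dict

# Hypothetical: a "direct field reference" is modeled as a plain identifier.
_FIELD_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def validate_nomv_target(target: str, schema: Dict[str, str]) -> None:
    """Raise ValueError if target is not a valid nomv argument for schema."""
    if not _FIELD_NAME.fullmatch(target):
        # Hypothetical message; the proposal does not specify this wording.
        raise ValueError(f"[{target}] is not a direct field reference")
    if target not in schema:
        raise ValueError(f"field [{target}] not found in schema")
    if schema[target] != "ARRAY":
        raise ValueError(f"field [{target}] is not a multivalue field")


schema = {"user": "STRING", "tags": "ARRAY", "status": "STRING"}
validate_nomv_target("tags", schema)  # passes silently
```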

Examples

1. Basic

Input:

| user | tags |
| joe | ["a","b"] |
| sam | ["x"] |

Query:
... | nomv tags

Output:

| user | tags |
| joe | a\nb |
| sam | x |

2. Multiple rows (row-local behavior)

Input:
| user | tags |
| joe | ["a","b"] |
| joe | ["c","d"] |

Query:
... | nomv tags

Output:

| user | tags |
| joe | a\nb |
| joe | c\nd |

3. Non-consecutive rows (no grouping)

Input:

| user | tags |
| joe | ["a"] |
| sam | ["x"] |
| joe | ["b"] |

Query:
... | nomv tags

Output:
| user | tags |
| joe | a |
| sam | x |
| joe | b |

4. Empty array

Input:

| user | tags |
| joe | [] |

Query:
... | nomv tags

Output:
| user | tags |
| joe | |

5. Null elements inside array

Input:

| user | tags |
| joe | ["a", null, "b"] |

Query:
... | nomv tags

Output:
| user | tags |
| joe | a\nb |

6. Null field value

Input:

| user | tags |
| joe | null |

Query:

... | nomv tags

Output:
| user | tags |
| joe | null |

7. Missing target field

Input:

| user | action |
| joe | login |

Query:
... | nomv tags

Result:
Error: field [tags] not found in schema

8. Scalar field (invalid target)

Input:

| user | status |
| joe | ok |

Query:
... | nomv status

Result:
Error: field [status] is not a multivalue field

Approach

Implement nomv as a streaming projection command. No grouping, buffering, or state is required. The command rewrites the logical plan by replacing <field> with a scalar expression derived from the array value.

Relationship to existing commands

  1. nomv is not an aggregation command.
  2. nomv does not modify row cardinality.
  3. nomv is a per-row transformation and can be modeled as a projection.
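
To make the streaming, cardinality-preserving behavior concrete, here is a small Python model of nomv as a lazy per-row projection. It is only a sketch under assumptions: rows are modeled as dictionaries, the function name `nomv` mirrors the command but is not the plugin's API, and schema validation is assumed to have already happened.

```python
from typing import Any, Dict, Iterable, Iterator


def nomv(rows: Iterable[Dict[str, Any]], field: str) -> Iterator[Dict[str, Any]]:
    """Lazily yield one output row per input row, with row[field] flattened."""
    for row in rows:
        value = row[field]
        if value is not None:
            value = "\n".join(str(v) for v in value if v is not None)
        # Copy the row so every other field passes through unchanged.
        yield {**row, field: value}


rows = [
    {"user": "joe", "tags": ["a"]},
    {"user": "sam", "tags": ["x"]},
    {"user": "joe", "tags": ["b"]},
]
out = list(nomv(rows, "tags"))
# out has three rows in input order: joe/a, sam/x, joe/b
```

Because the generator holds no state between rows, the two joe rows are never merged, which matches example 3.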
