Skip to content

[Feature] Add Arrow Flight SQL transport for StarRocks OLAP connector #8911

@EdwardArchive

Description

@EdwardArchive

Is your feature request related to a problem? Please describe.

The StarRocks connector currently uses only the MySQL wire protocol (port 9030) for all query execution. The MySQL protocol serializes data row-by-row in a text-based format, which becomes a significant performance bottleneck for large result sets commonly encountered in dashboard queries:

  • 10K+ rows: MySQL serialization overhead dominates query latency, with ~54ms for 100K rows at full table scan.
  • High memory allocations: The row-based deserialization path produces ~1.2M allocations per 100K-row query, increasing GC pressure in the Rill runtime process.
  • Throughput ceiling: MySQL protocol tops out at ~1.8M rows/sec for typical analytical queries, limiting dashboard responsiveness for large datasets.

StarRocks 3.0+ natively supports Apache Arrow Flight SQL, a columnar transport protocol designed for high-throughput analytical workloads, but Rill does not currently leverage it.

Describe the solution you'd like

Add Arrow Flight SQL as an opt-in query transport for the StarRocks connector, configurable via a transport: "flight_sql" property. The implementation should:

  1. Route Query() and QuerySchema() through Arrow Flight SQL when configured, while keeping Exec() and DryRun on MySQL (DDL/DML operations don't benefit from columnar transfer).

  2. Automatically fall back to MySQL for parameterized queries (stmt.Args), since Arrow Flight SQL does not support query parameter binding at the protocol level.

  3. Handle StarRocks' FE/BE split architecture: FE handles Execute() (query planning), while BE nodes serve DoGet() (data retrieval). The connector must parse endpoint Location URIs from FlightInfo to route DoGet to the correct BE node.

  4. Pool BE clients to avoid per-query gRPC connection overhead, with eviction on connection failure for resilience during rolling restarts.

  5. Limit concurrency via a configurable semaphore (flight_sql_max_conns) to prevent exhausting StarRocks FE's per-user connection limit.

  6. Support ExecutionTimeout via context.WithTimeout, consistent with other OLAP drivers (ClickHouse, Druid, Pinot).

  7. Provide a flight_sql_be_addr override for environments where BE nodes advertise internal addresses not reachable from the client (e.g., Docker, NAT).

Expected configuration

type: connector
driver: starrocks

host: "starrocks-fe.example.com"
port: 9030
username: "analyst"
password: "{{ .env.STARROCKS_PASSWORD }}"
database: "my_database"
transport: "flight_sql"          # "mysql" (default) or "flight_sql"
flight_sql_port: 9408            # FE Arrow Flight SQL port (default: 9408)
flight_sql_be_addr: "host:port"  # Optional: override BE address for NAT/Docker
flight_sql_max_conns: 100        # Optional: max concurrent Flight SQL queries (default: 100)

Expected performance improvement

Query Size MySQL Flight SQL Improvement
1 row 0.67ms 2.28ms MySQL 3.4x faster (gRPC overhead)
100 rows 2.75ms 2.94ms ~same
1K rows 3.69ms 2.97ms Flight SQL 1.2x faster
10K rows 9.55ms 4.87ms Flight SQL 2.0x faster
100K rows 54.5ms 19.5ms Flight SQL 2.8x faster

Crossover point is ~1K rows. Below this threshold, MySQL is faster due to gRPC connection overhead.

Describe alternatives you've considered

  1. MySQL protocol with compression: MySQL supports zlib and zstd compression, which could reduce network overhead. However, this does not address the row-by-row serialization cost on the client side, nor the high allocation count. Benchmarks with compression showed <10% improvement for typical analytical queries.

  2. JDBC with Arrow extensions: Some JDBC drivers (e.g., Snowflake) support fetching results as Arrow batches. StarRocks' JDBC driver does not currently support this, and JDBC adds Java dependency overhead.

Additional context

Requirements

  • StarRocks 4.0.4+: Earlier versions have Arrow Flight serialization bugs for certain types (e.g., VARBINARY). See StarRocks PR #65889.
  • FE/BE config: arrow_flight_port must be set in both fe.conf (default 9408) and be.conf (default 9419).
  • Network: FE Arrow Flight port must be accessible from the Rill runtime. BE Arrow Flight port must also be accessible unless FE proxy mode is enabled (arrow_flight_proxy_enabled = true in fe.conf).

Architecture

Query() / QuerySchema()
  |
  +-- ExecutionTimeout -> context.WithTimeout (if set)
  |
  +-- transport=mysql (default) -> MySQL protocol (port 9030)
  |
  +-- transport=flight_sql
       +-- stmt.Args non-empty? -> fallback to MySQL
       +-- flightSem.Acquire() -> concurrency limit
       +-- Execute() -> FE (port 9408)
       +-- DoGet()   -> BE (via endpoint Location URI, with pooled clients)
                         +-- on failure: evict cached client, next query reconnects

Exec() / DryRun -> Always MySQL

Full benchmark results (100K rows, full table scan)

Metric MySQL Flight SQL Improvement
Throughput 1.79M rows/sec 4.77M rows/sec 2.7x faster
Memory 17.2MB/op 12.7MB/op 26% less
Allocations 1.2M allocs/op 469K allocs/op 61% fewer

Scope

In scope:

  • transport config property (mysql / flight_sql)
  • Flight SQL client initialization with Basic Auth over gRPC
  • DoGet BE routing via endpoint Location URIs
  • BE client pooling with eviction on failure
  • Arrow RecordBatch to drivers.Result type conversion
  • flightRows implementing Rows interface (Next/MapScan/Scan/Close)
  • Concurrency limiter (flight_sql_max_conns semaphore)
  • ExecutionTimeout support
  • Automatic MySQL fallback for parameterized queries
  • flight_sql_be_addr override for Docker/NAT environments
  • Unified integration tests (MySQL + FlightSQL, single container)
  • Performance benchmarks

Out of scope:

  • Routing Exec() through Flight SQL (no performance benefit for DDL/DML)
  • Multi-endpoint consumption (StarRocks returns single endpoint for normal SQL queries)
  • TLS certificate pinning for BE connections (follows FE SSL setting)

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions