[Feature] Add Arrow Flight SQL transport for StarRocks connector #8910

Open
EdwardArchive wants to merge 6 commits into rilldata:main from EdwardArchive:feat/starrocks-arrow-flight-sql

@EdwardArchive commented Feb 23, 2026

Summary

Issue: #8911

  • Add Arrow Flight SQL as an alternative query transport for the StarRocks OLAP connector
  • Route Query()/QuerySchema() through Arrow Flight SQL when transport: "flight_sql" is configured; Exec()/DryRun always use MySQL (Arrow Flight SQL supports DDL/DML but offers no performance benefit for non-data-returning operations)
  • Implement DoGet BE routing: FE handles Execute(), DoGet is routed to BE nodes via endpoint Location URIs
  • Add BE client pooling with automatic eviction on connection failure for resilience to BE restarts
  • Add flight_sql_max_conns semaphore to prevent exhausting StarRocks FE's per-user connection limit (default 1024), following the same pattern as the DuckDB driver
  • Apply ExecutionTimeout via context.WithTimeout, consistent with ClickHouse, Druid, Pinot, and DuckDB drivers
  • Fix flightRows.Scan to write through pointers via sqlconvert.ConvertAssign, matching Redshift/Athena/BigQuery drivers
  • Automatically fall back to MySQL for parameterized queries (stmt.Args), since Flight SQL does not support query parameters
  • Upgrade test image to StarRocks 4.0.4 (fixes VARBINARY Arrow Flight serialization via StarRocks PR #65889)
  • Add Arrow Flight SQL documentation to StarRocks connector docs
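
The BE client pooling with eviction described in the list above could look roughly like the sketch below. It is an illustration under assumptions, not the code in this PR: `bePool`, `beClient`, `errReset`, and the `dial` callback are hypothetical names, and `beClient` is only a placeholder for a real Flight SQL client.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errReset simulates a failed DoGet call (hypothetical, for the demo only).
var errReset = errors.New("connection reset")

// beClient is a placeholder for a Flight SQL client connected to one BE node.
type beClient struct {
	addr string
}

// bePool caches one client per BE address (hypothetical type for illustration).
type bePool struct {
	mu      sync.Mutex
	clients map[string]*beClient
	dial    func(addr string) (*beClient, error)
}

func newBEPool(dial func(addr string) (*beClient, error)) *bePool {
	return &bePool{clients: make(map[string]*beClient), dial: dial}
}

// get returns the cached client for addr, dialing a new one on a cache miss.
func (p *bePool) get(addr string) (*beClient, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if c, ok := p.clients[addr]; ok {
		return c, nil
	}
	c, err := p.dial(addr)
	if err != nil {
		return nil, err
	}
	p.clients[addr] = c
	return c, nil
}

// evict drops the cached client so the next get reconnects.
func (p *bePool) evict(addr string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	delete(p.clients, addr)
}

// doGet runs fetch against the pooled client and evicts the client on failure.
func (p *bePool) doGet(addr string, fetch func(*beClient) error) error {
	c, err := p.get(addr)
	if err != nil {
		return err
	}
	if err := fetch(c); err != nil {
		p.evict(addr)
		return err
	}
	return nil
}

func main() {
	dials := 0
	pool := newBEPool(func(addr string) (*beClient, error) {
		dials++
		return &beClient{addr: addr}, nil
	})
	_ = pool.doGet("be-1:9419", func(*beClient) error { return nil })
	_ = pool.doGet("be-1:9419", func(*beClient) error { return errReset })
	_ = pool.doGet("be-1:9419", func(*beClient) error { return nil })
	fmt.Println("dials:", dials)
}
```

In the demo the second call fails and evicts the cached client, so the third call dials again. The sketch holds the mutex while dialing for simplicity; a production pool might prefer to dial outside the lock.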

Benchmark Results

Environment: AMD RYZEN AI MAX+ 395 (32 cores), StarRocks 4.0.4, -benchtime=50x -count=3

Query Size Comparison (ad_bids table, 100K rows)

| Query | Rows | MySQL | FlightSQL | Ratio | MySQL allocs | FlightSQL allocs |
|---|---|---|---|---|---|---|
| SmallLiteral (SELECT 1) | 1 | 0.67ms | 2.28ms | MySQL 3.4x faster | 40 | 273 |
| SingleRow (LIMIT 1) | 1 | 3.72ms | 4.19ms | ~same | 190 | 620 |
| AdBids100 (LIMIT 100) | 100 | 2.75ms | 2.94ms | ~same | 953 | 674 |
| AdBids1K (LIMIT 1000) | 1,000 | 3.69ms | 2.97ms | FlightSQL 1.2x faster | 11,459 | 4,732 |
| AdBids10K (LIMIT 10000) | 10,000 | 9.55ms | 4.87ms | FlightSQL 2.0x faster | 119,681 | 46,930 |
| AdBidsAll (100K rows) | 100,000 | 54.5ms | 19.5ms | FlightSQL 2.8x faster | 1,200,752 | 468,549 |
| AdBidsAgg (GROUP BY) | 14 | 9.22ms | 9.36ms | ~same | 168 | 348 |

Full Table Scan Throughput (100K rows)

| Metric | MySQL | FlightSQL | Ratio |
|---|---|---|---|
| Query time | 56.0ms | 21.0ms | FlightSQL 2.7x faster |
| Throughput | 1.79M rows/sec | 4.77M rows/sec | FlightSQL 2.7x faster |
| Memory (B/op) | 17.2MB | 12.7MB | FlightSQL 26% less |
| Allocations | 1,200,752 | 468,543 | FlightSQL 61% fewer |

Key Observations

  • Crossover point: ~1K rows (FlightSQL becomes faster)
  • Large result sets (100K rows): FlightSQL is 2.8x faster with 61% fewer allocations
  • Aggregation queries: Similar performance (server-side processing dominates; result set is small)
  • Memory efficiency: Arrow columnar batches vs row-by-row MySQL scanning reduces GC pressure on the Rill runtime
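
For readers who want to check the ratios, the arithmetic behind the full-table-scan figures can be reproduced with a few lines of Go. The helper names are hypothetical and the inputs are the rounded values from the tables, so results can differ from the reported numbers in the last digit (for example, the FlightSQL throughput, which was presumably computed from unrounded timings).

```go
package main

import "fmt"

// speedup returns how many times faster the candidate timing is than the baseline.
func speedup(baselineMs, candidateMs float64) float64 {
	return baselineMs / candidateMs
}

// reductionPct returns the percentage by which candidate is smaller than baseline.
func reductionPct(baseline, candidate float64) float64 {
	return (1 - candidate/baseline) * 100
}

// rowsPerSec converts a row count and a duration in milliseconds to rows per second.
func rowsPerSec(rows, ms float64) float64 {
	return rows / (ms / 1000)
}

func main() {
	fmt.Printf("speedup: %.1fx\n", speedup(56.0, 21.0))
	fmt.Printf("MySQL throughput: %.2fM rows/sec\n", rowsPerSec(100_000, 56.0)/1e6)
	fmt.Printf("memory reduction: %.0f%%\n", reductionPct(17.2, 12.7))
	fmt.Printf("allocation reduction: %.0f%%\n", reductionPct(1_200_752, 468_543))
}
```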

Configuration

```yaml
type: connector
driver: starrocks

host: "starrocks-fe.example.com"
port: 9030
username: "analyst"
password: "{{ .env.STARROCKS_PASSWORD }}"
database: "my_database"

# Arrow Flight SQL settings
transport: "flight_sql"           # "mysql" (default) or "flight_sql"
flight_sql_port: 9408             # FE Arrow Flight port (default: 9408)
flight_sql_be_addr: "host:port"   # Optional: override BE address for NAT/Docker
flight_sql_max_conns: 100         # Max concurrent Flight SQL queries (default: 100)
```
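
As a rough illustration of how these settings might be consumed, the sketch below applies the documented defaults and sizes a concurrency limiter from `flight_sql_max_conns`. Everything here is an assumption for illustration: `configSketch`, `withDefaults`, and `querySem` are hypothetical names, and the buffered-channel semaphore is a standard-library stand-in rather than whatever semaphore type the driver actually uses.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// configSketch mirrors the YAML keys above; the struct itself is hypothetical.
type configSketch struct {
	Transport         string // "mysql" or "flight_sql"
	FlightSQLPort     int
	FlightSQLBEAddr   string
	FlightSQLMaxConns int
}

// withDefaults fills unset fields with the defaults listed in the config comments.
func (c configSketch) withDefaults() (configSketch, error) {
	if c.Transport == "" {
		c.Transport = "mysql"
	}
	if c.Transport != "mysql" && c.Transport != "flight_sql" {
		return c, fmt.Errorf("unknown transport %q", c.Transport)
	}
	if c.FlightSQLPort == 0 {
		c.FlightSQLPort = 9408
	}
	if c.FlightSQLMaxConns == 0 {
		c.FlightSQLMaxConns = 100
	}
	return c, nil
}

// querySem bounds the number of in-flight Flight SQL queries.
type querySem chan struct{}

func newQuerySem(n int) querySem { return make(querySem, n) }

// acquire blocks until a slot is free or ctx is done.
func (s querySem) acquire(ctx context.Context) error {
	select {
	case s <- struct{}{}:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// release frees a slot; it would be called when the result set is closed.
func (s querySem) release() { <-s }

func main() {
	cfg, err := configSketch{Transport: "flight_sql"}.withDefaults()
	if err != nil {
		panic(err)
	}
	sem := newQuerySem(cfg.FlightSQLMaxConns)

	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	if err := sem.acquire(ctx); err != nil {
		panic(err)
	}
	defer sem.release()

	fmt.Printf("%+v\n", cfg)
}
```

Tying `release` to closing the result set, rather than to the return of the query call, keeps a slot occupied for as long as rows are still being streamed.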

Architecture

```text
Query() / QuerySchema()
  │
  ├─ ExecutionTimeout → context.WithTimeout (if set)
  │
  ├─ transport=mysql (default) → MySQL protocol (port 9030)
  │
  └─ transport=flight_sql
       ├─ stmt.Args non-empty? → fallback to MySQL (Flight SQL has no parameter support)
       ├─ Execute() → FE (port 9408)
       └─ DoGet()   → BE (via endpoint Location URI, with pooled clients)
                         └─ on failure: evict cached client, next query reconnects

Exec() / DryRun → Always MySQL (no performance benefit for DDL/DML)
```
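
The routing rules in the diagram can be summarized in a small decision function. This is a simplified sketch, not the dispatch code from `olap.go`: the `statement` struct, `pickTransport`, and `withExecutionTimeout` are hypothetical stand-ins for the driver's real types and helpers.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// statement is a simplified stand-in for the driver's statement type.
type statement struct {
	Query            string
	Args             []any
	ExecutionTimeout time.Duration
}

// pickTransport returns the protocol a query would use under the diagram's rules.
func pickTransport(configured string, stmt statement) string {
	if configured != "flight_sql" {
		return "mysql" // default transport
	}
	if len(stmt.Args) > 0 {
		return "mysql" // parameterized queries fall back to MySQL
	}
	return "flight_sql"
}

// withExecutionTimeout derives a deadline-bound context only when a timeout is set.
func withExecutionTimeout(ctx context.Context, stmt statement) (context.Context, context.CancelFunc) {
	if stmt.ExecutionTimeout > 0 {
		return context.WithTimeout(ctx, stmt.ExecutionTimeout)
	}
	return ctx, func() {}
}

func main() {
	stmts := []statement{
		{Query: "SELECT 1", ExecutionTimeout: 5 * time.Second},
		{Query: "SELECT * FROM ad_bids WHERE id = ?", Args: []any{42}},
	}
	for _, s := range stmts {
		ctx, cancel := withExecutionTimeout(context.Background(), s)
		_, hasDeadline := ctx.Deadline()
		fmt.Println(pickTransport("flight_sql", s), hasDeadline)
		cancel()
	}
}
```

Returning a no-op cancel function when no timeout is set lets callers always `defer cancel()` without a nil check.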

Files Changed

| File | Action | Description |
|---|---|---|
| starrocks.go | Modified | Add Transport, FlightSQLPort, FlightSQLBEAddr, FlightSQLMaxConns config; BE client pool; semaphore init |
| olap.go | Modified | Transport dispatch in Query()/QuerySchema(); ExecutionTimeout support; extract queryMySQL()/querySchemaMySQL() |
| flight.go | New | Flight SQL client init, BE routing with pooling/eviction, auth interceptors, IPv6-safe parsing |
| flight_rows.go | New | Arrow RecordBatch → drivers.Result conversion; flightRows (Next/MapScan/Scan/Close); semaphore lifecycle |
| flight_test.go | New | TestParseFlightLocation (7 cases), Scan pointer tests, NULL handling, parameter fallback |
| olap_test.go | Modified | Unified test structure: MySQL + FlightSQL share one container |
| bench_test.go | New | MySQL vs FlightSQL benchmarks with connection limit retry |
| teststarrocks.go | Modified | Expose FE/BE Flight ports, mount custom configs, StarRocks 4.0.4 |
| testdata/be.conf | New | BE config with arrow_flight_port = 9419 |
| testdata/fe.conf | New | FE config with arrow_flight_port = 9408 |
| docs/.../starrocks.md | Modified | Arrow Flight SQL section, config properties, troubleshooting, version requirement |
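
To give a sense of what IPv6-safe location parsing and the `flight_sql_be_addr` override involve, here is a minimal sketch built on the standard library. It is not the `parseFlightLocation` implementation from `flight.go`, whose signature is not shown here; `parseLocationSketch` and `resolveBEAddr` are hypothetical names, and the accepted scheme list is an assumption.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// parseLocationSketch turns a Flight endpoint Location URI into a host:port
// dial address, keeping IPv6 literals bracketed.
func parseLocationSketch(uri string) (string, error) {
	u, err := url.Parse(uri)
	if err != nil {
		return "", fmt.Errorf("parse location %q: %w", uri, err)
	}
	switch u.Scheme {
	case "grpc", "grpc+tcp", "grpc+tls": // assumed set of accepted schemes
	default:
		return "", fmt.Errorf("unsupported scheme %q in %q", u.Scheme, uri)
	}
	host, port := u.Hostname(), u.Port()
	if host == "" || port == "" {
		return "", fmt.Errorf("location %q needs both host and port", uri)
	}
	// JoinHostPort re-adds brackets around IPv6 hosts such as ::1.
	return net.JoinHostPort(host, port), nil
}

// resolveBEAddr prefers an explicit override (for NAT/Docker setups) over the
// address advertised in the endpoint location.
func resolveBEAddr(override, location string) (string, error) {
	if override != "" {
		return override, nil
	}
	return parseLocationSketch(location)
}

func main() {
	for _, loc := range []string{"grpc://be-1:9419", "grpc+tcp://[::1]:9419", "http://be-1:9419"} {
		addr, err := resolveBEAddr("", loc)
		fmt.Println(addr, err)
	}
	addr, _ := resolveBEAddr("localhost:19419", "grpc://172.18.0.3:9419")
	fmt.Println(addr)
}
```

Splitting the address on the last colon by hand would mis-handle IPv6 literals, which is why the sketch relies on `url.URL.Hostname`, `url.URL.Port`, and `net.JoinHostPort`.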

Checklist:

  • Covered by tests
  • Ran it and it works as intended
  • Reviewed the diff before requesting a review
  • Checked for unhandled edge cases
  • Linked the issues it closes
  • Checked if the docs need to be updated. If so, create a separate Linear DOCS issue
  • Intend to cherry-pick into the release branch
  • I'm proud of this work!

- Route Query/QuerySchema via Arrow Flight SQL when transport=flight_sql; fall back to MySQL for parameterized queries
- Implement DoGet BE routing with client pooling and flight_sql_be_addr override for Docker/NAT
- Upgrade test image to StarRocks 4.0.4; unified test structure running both transports in a single container
- Add flight_sql_max_conns semaphore to prevent FE connection exhaustion
- Add Arrow Flight SQL section to StarRocks connector docs
- Add connection limit retry logic in benchmarks
- Use sqlconvert.ConvertAssign instead of direct assignment
- Add unit tests for parseFlightLocation (7 cases)
- Add integration tests for Scan, NULL handling, and parameter fallback
- Apply context.WithTimeout in Query() when ExecutionTimeout is set
- Evict cached BE client on DoGet failure for automatic reconnection
Keep transport-agnostic date/time assertions (fmt.Sprintf) for MySQL + FlightSQL compatibility.
@EdwardArchive force-pushed the feat/starrocks-arrow-flight-sql branch from f6a5056 to 8d7f94d on February 23, 2026 11:06
@EdwardArchive

The job failed because the Druid test DSN was not configured.

@EdwardArchive

Hey @k-anshul, I know you’re busy, but if you get a chance, could you take a look at the code? Thanks.
