Releases: StarRocks/starrocks

Release notes 2.1.7

26 May 12:45
140673d

Release date: May 26, 2022

Improvements

For window functions whose frame is set to ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, StarRocks previously cached all data of a partition before it performed the calculation, which consumed a large amount of memory when the partition was large. StarRocks has been optimized so that it no longer caches all data of the partition in this situation. #5829
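
The idea behind this improvement can be pictured with a small Python sketch. It is not StarRocks code: the row shape `(partition_key, value)` and the function name `running_sum` are assumptions made for illustration. It shows why a frame that ends at the current row can be evaluated in a streaming fashion, keeping one accumulator per partition instead of the whole partition.

```python
from itertools import groupby
from operator import itemgetter
from typing import Iterable, Iterator, Tuple

Row = Tuple[str, int]  # hypothetical row shape: (partition_key, value)


def running_sum(rows: Iterable[Row]) -> Iterator[Tuple[str, int, int]]:
    """Yield (key, value, running_sum) for rows already sorted by partition key.

    Only one accumulator is kept per partition, so memory use does not
    grow with the number of rows in the partition.
    """
    for key, group in groupby(rows, key=itemgetter(0)):
        total = 0  # reset at each partition boundary
        for _, value in group:
            total += value
            yield key, value, total


rows = [("a", 1), ("a", 2), ("a", 3), ("b", 10), ("b", 5)]
result = list(running_sum(rows))
```

Because each output row depends only on the rows seen so far, nothing needs to be buffered beyond the running total.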

Bug Fixes

The following bugs are fixed:

  • When data is loaded into a table that uses the Primary Key model, data processing errors may occur if the creation time of each data version stored in the system does not monotonically increase due to reasons such as backward-moved system time and related unknown bugs. Such data processing errors cause backends (BEs) to stop. #6046
  • Some graphical user interface (GUI) tools automatically configure the set_sql_limit variable. As a result, the LIMIT in ORDER BY LIMIT statements is ignored, and an incorrect number of rows is returned for queries. #5966
  • If the DROP SCHEMA statement is executed on a database, the database is forcibly deleted and cannot be restored. #6201
  • When JSON-formatted data is loaded, BEs stop if the data contains JSON format errors. For example, key-value pairs are not separated by commas (,). #6098
  • When a large amount of data is being loaded in a highly concurrent manner, tasks that are run to write data to disks are piled up on BEs. In this situation, the BEs may stop. #3877
  • StarRocks estimates the amount of memory that is required before it performs a schema change on a table. If the table contains a large number of STRING fields, the memory estimation result may be inaccurate. In this situation, if the estimated amount of memory that is required exceeds the maximum memory that is allowed for a single schema change operation, schema change operations that are supposed to be properly run encounter errors. #6322
  • After a schema change is performed on a table that uses the Primary Key model, a "duplicate key xxx" error may occur when data is loaded into that table. #5878
  • If low-cardinality optimization is performed during Shuffle Join operations, partitioning errors may occur. #4890
  • If a colocation group (CG) contains a large number of tables and data is frequently loaded into the tables, the CG may not be able to stay in the stable state. In this case, the JOIN statement does not support Colocate Join operations. StarRocks has been optimized to wait a little longer during data loading so that as many of the tablet replicas being loaded as possible remain intact.
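
For the JSON loading fix in the list above, the kind of format error involved (a missing comma between key-value pairs) can be reproduced with Python's standard `json` module. The sketch below is a client-side pre-check, not the fix that StarRocks applied; the function name `split_valid_json` and the sample records are assumptions.

```python
import json
from typing import Iterable, List, Tuple


def split_valid_json(lines: Iterable[str]) -> Tuple[List[object], List[str]]:
    """Separate lines that parse as JSON from lines that do not."""
    good: List[object] = []
    bad: List[str] = []
    for line in lines:
        try:
            good.append(json.loads(line))
        except json.JSONDecodeError:
            bad.append(line)  # keep the raw text for inspection
    return good, bad


records = [
    '{"id": 1, "name": "a"}',
    '{"id": 2 "name": "b"}',  # key-value pairs not separated by a comma
]
good, bad = split_valid_json(records)
```

Filtering malformed records before a load is one way to see which rows would have triggered the error.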

Full Changelog: 2.1.6...2.1.7

Thanks to

@Astralidea, @HangyuanLiu, @Linkerist, @Youngwb, @chaoyli, @decster, @dirtysalt, @gengjun-git, @meegoo, @rickif, @sevev, @stdpain, @trueeyu, @xiaoyong-z

Release notes 2.0.6

26 May 11:57
8bac338

Release date: May 25, 2022

Bug Fixes

The following bugs are fixed:

  • Some graphical user interface (GUI) tools automatically configure the set_sql_limit variable. As a result, the LIMIT in ORDER BY LIMIT statements is ignored, and an incorrect number of rows is returned for queries. #5966
  • If a colocation group (CG) contains a large number of tables and data is frequently loaded into the tables, the CG may not be able to stay in the stable state. In this case, the JOIN statement does not support Colocate Join operations. StarRocks has been optimized to wait a little longer during data loading so that as many of the tablet replicas being loaded as possible remain intact.
  • If a few replicas fail to be loaded due to reasons such as heavy loads or high network latencies, cloning on these replicas is triggered. In this case, deadlocks may occur, which may cause a situation in which the loads on processes are low but a large number of requests time out. #5646 #6290
  • After the schema of a table that uses the Primary Key model is changed, a "duplicate key xxx" error may occur when data is loaded into that table. #5878
  • If the DROP SCHEMA statement is executed on a database, the database is forcibly deleted and cannot be restored. #6201

Full Changelog: 2.0.5...2.0.6

Thanks to

@Astralidea, @Linkerist, @chaoyli, @decster, @dirtysalt, @gengjun-git, @sevev, @stdpain

Release notes 2.2.0

25 May 13:13
671149f

New Features

  • [Preview] Resource groups are supported. By using resource groups to control CPU and memory usage, StarRocks can achieve resource isolation and rational use of resources when different tenants perform complex and simple queries in the same cluster.
  • [Preview] Java UDFs (user-defined functions) are supported. StarRocks supports writing UDFs in Java, extending StarRocks' functions.
  • [Preview] The Primary Key model supports partial updates when data is loaded by using Stream Load, Broker Load, or Routine Load. In real-time data update scenarios such as updating orders and joining multiple streams, partial updates allow users to update only a few columns.
  • [Preview] JSON data types and JSON functions are supported.
  • External tables based on Apache Hudi are supported, which further improves the data lake analytics experience.
  • The following functions are supported:
    • ARRAY functions, including array_agg, array_sort, array_distinct, array_join, reverse, array_slice, array_concat, array_difference, array_overlap, and array_intersect
    • BITMAP functions, including bitmap_max and bitmap_min
    • Other functions, including retention and square
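
The partial-update feature listed above can be pictured as merging a few columns into an existing row that is identified by its primary key. The following Python sketch models a table as a dictionary; it is an illustration only, not how StarRocks stores data, and the names `apply_partial_update` and `orders` are assumptions. For a key that is not yet present, the sketch simply stores the supplied columns, which is also an assumption of the example.

```python
from typing import Dict, Hashable

Row = Dict[str, object]


def apply_partial_update(table: Dict[Hashable, Row], key: Hashable, changes: Row) -> None:
    """Merge only the supplied columns into the row with the given primary key."""
    existing = table.setdefault(key, {})  # hypothetical handling of a missing key
    existing.update(changes)  # columns not named in `changes` keep their values


orders = {1001: {"status": "created", "amount": 25.0, "address": "A St"}}
apply_partial_update(orders, 1001, {"status": "paid"})
```

After the call, only `status` has changed; `amount` and `address` are left as they were.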

Improvements

  • The Parser and Analyzer of the cost-based optimizer (CBO) are restructured, the code structure is optimized, and syntax such as INSERT with CTE is supported. As a result, the performance of complex queries, such as those that reuse common table expressions (CTEs), is improved.
  • The query performance of Apache Hive external tables based on object storage (AWS S3, Alibaba Cloud OSS, Tencent COS) is optimized. After optimization, the performance of object storage-based queries is comparable to that of HDFS-based queries. Late materialization of ORC files is also supported, which improves query performance on small files.
  • When external tables are used to query Apache Hive, StarRocks can automatically and incrementally update cached metastore data by consuming Hive Metastore events, such as data changes and partition changes. It also supports querying DECIMAL and ARRAY data in Apache Hive.
  • The performance of the UNION ALL operator is optimized, delivering an improvement of 2 to 25 times.
  • The pipeline engine, which can adaptively adjust query parallelism, is released, and its profile is optimized. The pipeline engine can improve performance for small queries in high-concurrency scenarios.
  • StarRocks supports the loading of CSV files with multi-character row delimiters.
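
To make the multi-character row delimiter item above concrete, here is a minimal Python sketch of what such a file looks like and how it splits into rows. The delimiter `|#|`, the sample data, and the function `parse_rows` are assumptions for illustration; the sketch does not handle quoted fields and says nothing about how StarRocks parses files internally.

```python
from typing import List


def parse_rows(data: str, row_delimiter: str, column_separator: str = ",") -> List[List[str]]:
    """Split raw text into rows on a multi-character delimiter, then into columns."""
    # Drop empty segments, such as the one left after a trailing delimiter.
    rows = [segment for segment in data.split(row_delimiter) if segment]
    return [row.split(column_separator) for row in rows]


raw = "1,alice|#|2,bob|#|3,carol|#|"
parsed = parse_rows(raw, row_delimiter="|#|")
```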

Bug Fixes

The following bugs are fixed:

  • Deadlocks occur when data is loaded and changes are committed into tables based on Primary Key model. #4998
  • Some FE (including BDBJE) stability issues. #4428, #4666, #2
  • The return value overflows when the SUM function is used to calculate a large amount of data. #3944
  • The return values of ROUND and TRUNCATE functions have precision issues. #4256
  • Some bugs detected by SQLancer. Please see SQLancer related issues.

Others

  • The Flink connector flink-connector-starrocks supports Flink 1.14.

Release notes 2.0.5

14 May 03:56

Release date: May 13, 2022
Upgrade recommendation: Some critical bugs related to the correctness of stored data or data queries have been fixed in this version. It is recommended that you upgrade your StarRocks cluster as soon as possible.

Bug Fixes

The following bugs are fixed:

  • [Critical Bug] Data may be lost as a result of BE failures. This bug is fixed by introducing a mechanism that is used to publish a specific version to multiple BEs at a time. #3140
  • [Critical Bug] If tablets are migrated in specific data ingestion phases, data continues to be written to the original disk on which the tablets are stored. As a result, data is lost, and queries cannot be run properly. #5160
  • [Critical Bug] When you run queries after you perform multiple DELETE operations, you may obtain incorrect query results if optimization on low-cardinality columns is performed for the queries. #5712
  • [Critical Bug] If a query contains a JOIN clause that is used to combine a column with DOUBLE values and a column with VARCHAR values, the query result may be incorrect. #5809
  • In certain circumstances, when you load data into your StarRocks cluster, some replicas of specific versions are marked as valid by the FEs before taking effect. At this time, if you query data of the specific versions, StarRocks cannot find the data and reports errors. #5153
  • If a parameter in the SPLIT function is set to NULL, the BEs of your StarRocks cluster may stop running. #4092
  • After your cluster is upgraded from Apache Doris 0.13 to StarRocks 1.19.x and keeps running for a period of time, a further upgrade to StarRocks 2.0.1 may fail. #5309

Thanks to:

@ABingHuang, @Astralidea, @HangyuanLiu, @Pslydhh, @Seaven, @Youngwb, @adzfolc, @decster, @gengjun-git, @kangkaisen, @mergify, @miomiocat, @mofeiatwork, @rickif, @satanson, @sevev, @stdpain

Release notes 2.1.6

11 May 14:13
d44c230

Release date: May 10, 2022

Bug Fixes

The following bugs are fixed:

  • When you run queries after you perform multiple DELETE operations, you may obtain incorrect query results if optimization on low-cardinality columns is performed for the queries. #5712
  • If tablets are migrated in specific data ingestion phases, data continues to be written to the original disk on which the tablets are stored. As a result, data is lost, and queries cannot be run properly. #5160
  • If you convert values between the DECIMAL and STRING data types, the return values may have an unexpected precision. #5608
  • If you multiply a DECIMAL value by a BIGINT value, an arithmetic overflow may occur. A few adjustments and optimizations are made to fix this bug. #4211

Thanks to

@ABingHuang, @Astralidea, @HangyuanLiu, @Seaven, @ZiheLiu, @caneGuy, @gengjun-git, @mergify, @satanson, @sevev, @silverbullet233, @stdpain

Release notes 2.1.5

27 Apr 12:10
2a5c43f

Release date: April 27, 2022

Bug Fixes

The following bugs are fixed:

  • The calculation result is not correct when decimal multiplication overflows. After the bug is fixed, NULL is returned when decimal multiplication overflows.
  • When collected statistics deviate considerably from the actual statistics, the priority of Colocate Join can be lower than that of Broadcast Join. As a result, the query planner may not choose Colocate Join even when it is the more appropriate Join strategy. #4817
  • Queries fail because the plan for complex expressions is wrong when more than four tables are joined.
  • BEs may stop working under Shuffle Join when the shuffle column is a low-cardinality column. #4890
  • BEs may stop working when the SPLIT function uses a NULL parameter. #4092
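
The decimal multiplication fix in the list above (NULL is returned on overflow instead of an incorrect result) can be sketched in Python with the standard `decimal` module. The limit of 38 significant digits and the function name `checked_multiply` are assumptions made for the example, and `None` stands in for SQL NULL; this is not StarRocks's implementation.

```python
from decimal import Decimal, localcontext
from typing import Optional

MAX_PRECISION = 38  # assumed maximum number of significant digits


def checked_multiply(a: Decimal, b: Decimal) -> Optional[Decimal]:
    """Return a * b, or None (standing in for SQL NULL) if the product overflows."""
    with localcontext() as ctx:
        # Wide enough to hold the exact product of two in-range operands.
        ctx.prec = 2 * MAX_PRECISION
        product = a * b
    digits = len(product.as_tuple().digits)
    return None if digits > MAX_PRECISION else product


small = checked_multiply(Decimal("12.5"), Decimal("4"))
too_big = checked_multiply(Decimal("9" * 20), Decimal("9" * 20))  # exact product has 40 digits
```

Returning an explicit "no value" marker makes the overflow visible to the caller, whereas a silently truncated product would look like a valid result.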

Thanks to:

@ABingHuang, @Astralidea, @HangyuanLiu, @Linkerist, @Seaven, @Youngwb, @adzfolc, @chaoyli, @decster, @gengjun-git, @kangkaisen, @liuyehcf, @meegoo, @mergify, @miomiocat, @mofeiatwork, @rickif, @satanson, @sevev, @stdpain, @trueeyu, @wyb

Release notes 2.0.4

18 Apr 03:07
ca947b1

Release date: April 18, 2022

Bug Fixes

The following bugs are fixed:

  • After columns are deleted, new partitions are added, and tablets are cloned, the unique IDs of columns in old and new tablets may differ, which may cause BEs to stop working because the system uses a shared tablet schema. #4514
  • When data is loaded into a StarRocks external table, the FE stops working if the configured FE of the target StarRocks cluster is not the Leader. #4573
  • Query results may be incorrect when a schema change and the creation of a materialized view are performed on a Duplicate Key table at the same time. #4839
  • The problem of possible data loss due to BE failure (solved by using Batch publish version). #3140

Release notes 2.1.4

12 Apr 09:56
d965a4f

Release date: April 8, 2022

New Feature

  • The UUID_NUMERIC function is supported, which returns a LARGEINT value. The UUID_NUMERIC function performs nearly two orders of magnitude better than the UUID function.
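
The difference between a string UUID and a numeric one can be seen with Python's standard `uuid` module. This is an analogy only: it does not call StarRocks, and it does not show how UUID_NUMERIC generates its value or how that value maps onto LARGEINT. The helper names are assumptions.

```python
import uuid


def uuid_as_text() -> str:
    """Return a UUID in its 36-character hyphenated text form."""
    return str(uuid.uuid4())


def uuid_as_int() -> int:
    """Return a UUID as a single non-negative integer below 2**128."""
    return uuid.uuid4().int


text_id = uuid_as_text()
numeric_id = uuid_as_int()
```

A numeric identifier avoids formatting the value as text, which is one plausible reason a numeric variant can be cheaper to produce.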

Bug Fixes

The following bugs are fixed:

  • After columns are deleted, new partitions are added, and tablets are cloned, the unique IDs of columns in old and new tablets may differ, which may cause BEs to stop working because the system uses a shared tablet schema. #4514
  • When data is loaded into a StarRocks external table, the FE stops working if the configured FE of the target StarRocks cluster is not the Leader. #4573
  • The results of the CAST function differ between StarRocks versions 1.19 and 2.1. #4701
  • Query results may be incorrect when a schema change and the creation of a materialized view are performed on a Duplicate Key table at the same time. #4839

Release notes 2.1.3

12 Apr 09:54
0881cb2

Release date: March 19, 2022

Bug Fixes

The following bugs are fixed:

  • The problem of possible data loss due to BE failure (solved by using Batch publish version). #3140
  • Some queries may cause memory limit exceeded errors due to inappropriate execution plans.
  • The checksum between replicas may be inconsistent in different compaction processes. #3438
  • Queries may fail in some situations when JSON reorder projection is not processed correctly. #4056

Release notes 2.0.3

14 Mar 08:26

Release date: March 14, 2022

Bug Fixes

The following bugs are fixed:

  • Queries fail when BE nodes hang, that is, when the processes are still alive but unresponsive.
  • Query fails when there is no appropriate execution plan for single-tablet table joins. #3854
  • A deadlock problem may occur when an FE node collects information to build a global dictionary for low-cardinality optimization. #3839