---
title: TiDB 2.0 Release Notes
category: Releases
---

# TiDB 2.0 Release Notes

On April 27, 2018, TiDB 2.0 GA was released! Compared with TiDB 1.0, this release brings great improvements in MySQL compatibility, the SQL optimizer, the executor, and stability.

## TiDB

- SQL Optimizer
    - Use a more compact data structure to reduce the memory usage of statistics information
    - Speed up loading statistics information when starting a tidb-server process
    - Support updating statistics information dynamically [experimental]
    - Optimize the cost model to provide more accurate query cost evaluation
    - Use `Count-Min Sketch` to estimate the cost of point queries more accurately
    - Support analyzing more complex conditions to make full use of indexes
    - Support manually specifying the `Join` order using the `STRAIGHT_JOIN` syntax (see the SQL sketch after this list)
    - Use the Stream Aggregation operator when the `GROUP BY` clause is empty to improve performance
    - Support using indexes for the `MAX/MIN` function
    - Optimize the processing algorithms for correlated subqueries to support decorrelating more types of correlated subqueries and transforming them into `Left Outer Join`
    - Extend `IndexLookupJoin` to be used in matching the index prefix
- SQL Execution Engine
    - Refactor all operators using the Chunk architecture, improve the execution performance of analytical queries, and reduce memory usage. There is a significant improvement in the TPC-H benchmark result.
    - Support pushing down the Stream Aggregation operator
    - Optimize the `Insert Into Ignore` statement to improve the performance by over 10 times
    - Optimize the `Insert On Duplicate Key Update` statement to improve the performance by over 10 times
    - Optimize `Load Data` to improve the performance by over 10 times
    - Push down more data types and functions to TiKV
    - Support computing the memory usage of physical operators and specifying, in the configuration file and system variables, how to proceed when the memory usage exceeds the threshold
    - Support limiting the memory usage of a single SQL statement to reduce the risk of OOM (see the configuration sketch after this list)
    - Support using implicit RowID in CRUD operations
    - Improve the performance of point queries
- Server
    - Support the Proxy Protocol
    - Add more monitoring metrics and refine the log
    - Support validating the configuration files
    - Support obtaining the information of TiDB parameters through the HTTP API
    - Resolve locks in the Batch mode to speed up garbage collection
    - Support multi-threaded garbage collection
    - Support TLS
- Compatibility
    - Support more MySQL syntaxes
    - Support modifying the `lower_case_table_names` system variable in the configuration file to support the OGG data synchronization tool
    - Improve compatibility with the Navicat management tool
    - Support displaying the table creation time in `Information_Schema`
    - Fix the issue that the return types of some functions/expressions differ from MySQL
    - Improve compatibility with JDBC
    - Support more SQL Modes
- DDL
    - Optimize the `Add Index` operation to greatly improve the execution speed in some scenarios
    - Attach a lower priority to the `Add Index` operation to reduce the impact on online business
    - Output more detailed status information of the DDL jobs in `Admin Show DDL Jobs`
    - Support querying the original statements of currently running DDL jobs using `Admin Show DDL Job Queries JobID`
    - Support recovering the index data using `Admin Recover Index` for disaster recovery (see the SQL sketch after this list)
    - Support modifying Table Options using the `Alter` statement
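
As a quick, hedged illustration of the syntax called out above, here is the `STRAIGHT_JOIN` hint together with the new `ADMIN` statements. The tables `t1`/`t2`, the index `idx_a`, and the job ID `51` are hypothetical placeholders:

```sql
-- Force the join order to follow the written order rather than the optimizer's choice
SELECT * FROM t1 STRAIGHT_JOIN t2 ON t1.id = t2.t1_id;

-- List DDL jobs with their detailed status information
ADMIN SHOW DDL JOBS;

-- Show the original statement of a running DDL job (51 is a placeholder job ID)
ADMIN SHOW DDL JOB QUERIES 51;

-- Rebuild inconsistent index data for disaster recovery
ADMIN RECOVER INDEX t1 idx_a;
```

The per-statement memory limit can be sketched as follows, assuming the `mem-quota-query` and `oom-action` items from the 2.x configuration layout (names and defaults may differ slightly across 2.0.x releases):

```toml
# tidb.toml (sketch): cap the memory a single SQL statement may use
mem-quota-query = 34359738368   # quota in bytes (32 GB here); exceeding it triggers oom-action
oom-action = "log"              # "log" records the event; "cancel" aborts the statement
```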

## PD

- Support `Region Merge`, to merge empty Regions after deleting data [experimental]
- Support `Raft Learner` [experimental]
- Optimize the scheduler
    - Make the scheduler adapt to different Region sizes
    - Improve the priority and speed of restoring data during a TiKV outage
    - Speed up data transferring when removing a TiKV node
    - Optimize the scheduling policies to prevent the disks from becoming full when the space of TiKV nodes is insufficient
    - Improve the scheduling efficiency of the balance-leader scheduler
    - Reduce the scheduling overhead of the balance-region scheduler
    - Optimize the execution efficiency of the hot-region scheduler
- Operations interface and configuration
    - Support TLS
    - Support prioritizing the PD leaders
    - Support configuring the scheduling policies based on labels
    - Support configuring stores with a specific label not to schedule the Raft leader
    - Support manually splitting a Region to handle a hotspot in a single Region (see the pd-ctl sketch after this list)
    - Support scattering a specified Region to manually adjust Region distribution in some cases
    - Add check rules for configuration parameters and improve the validity check of the configuration items
- Debugging interface
    - Add the `Drop Region` debugging interface
    - Add the interfaces to enumerate the health status of each PD
- Statistics
    - Add statistics about abnormal Regions
    - Add statistics about the Region isolation level
    - Add scheduling related metrics
- Performance
    - Keep the PD leader and the etcd leader together on the same node to improve write performance
    - Optimize the performance of Region heartbeat
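
The label-based scheduling and manual Region handling above can be sketched with `pd-ctl`. This is a hedged example: the store label `zone=z1` and Region ID `1` are placeholders, and subcommand details may vary slightly across 2.0.x releases:

```bash
# Run inside an interactive pd-ctl session, e.g. ./pd-ctl -u http://127.0.0.1:2379
config set label-property reject-leader zone z1   # keep Raft leaders off stores labeled zone=z1
operator add split-region 1                       # manually split a single hot Region
operator add scatter-region 1                     # scatter a specified Region to rebalance it
```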

## TiKV

- Features
    - Protect critical configuration from incorrect modification
    - Support `Region Merge` [experimental]
    - Add the `Raw DeleteRange` API
    - Add the `GetMetric` API
    - Add the `Raw Batch Put`, `Raw Batch Get`, `Raw Batch Delete` and `Raw Batch Scan` APIs
    - Add Column Family options for the RawKV API and support executing operations on a specific Column Family
    - Support Streaming and Streaming Aggregation in Coprocessor
    - Support configuring the request timeout of Coprocessor (see the configuration sketch after this list)
    - Carry timestamps with Region heartbeats
    - Support modifying some RocksDB parameters online, such as `block-cache-size`
    - Support configuring the behavior of Coprocessor when it encounters warnings or errors
    - Support starting in the data importing mode to reduce write amplification during the data importing process
    - Support manually splitting a Region in half
    - Improve the data recovery tool `tikv-ctl`
    - Return more statistics in Coprocessor to guide the behavior of TiDB
    - Support the `ImportSST` API to import SST files [experimental]
    - Add the TiKV Importer binary to integrate with TiDB Lightning to import data quickly [experimental]
- Performance
    - Optimize read performance using `ReadPool` and increase the performance of `raw_get/get/batch_get` by 30%
    - Improve metrics performance
    - Inform PD immediately once the Raft snapshot process is completed to speed up balancing
    - Solve performance jitter caused by RocksDB flushing
    - Optimize the space reclaiming mechanism after deleting data
    - Speed up garbage cleaning while starting the server
    - Reduce the I/O overhead during replica migration using `DeleteFilesInRanges`
- Stability
    - Fix the issue that gRPC calls do not return when the PD leader switches
    - Fix the issue that taking nodes offline is slow due to snapshots
    - Limit the temporary space usage consumed by migrating replicas
    - Report the Regions that cannot elect a leader for a long time
    - Update the Region size information in time according to compaction events
    - Limit the size of the scan lock to avoid request timeouts
    - Limit the memory usage when receiving snapshots to avoid OOM
    - Increase the speed of CI tests
    - Fix the OOM issue caused by too many snapshots
    - Configure `keepalive` of gRPC (see the configuration sketch after this list)
    - Fix the OOM issue caused by an increase of the Region number
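
A hedged TiKV configuration sketch touching the items above. The values are illustrative, and the key names follow the 2.x TOML layout, so they may differ slightly in your release:

```toml
# tikv.toml (sketch)
[server]
grpc-keepalive-time = "10s"                      # gRPC keepalive probe interval
grpc-keepalive-timeout = "3s"                    # time to wait before closing a dead connection
end-point-request-max-handle-duration = "60s"    # Coprocessor request timeout

[rocksdb.defaultcf]
block-cache-size = "1GB"    # one of the RocksDB parameters that can now be modified online
```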

## TiSpark

TiSpark uses a separate version number; the current release is TiSpark 1.0 GA. TiSpark 1.0 provides distributed computing over TiDB data using Apache Spark.

- Provide a gRPC communication framework to read data from TiKV
- Provide encoding and decoding of TiKV component data and the communication protocol
- Provide calculation pushdown, which includes:
    - Aggregate pushdown
    - Predicate pushdown
    - TopN pushdown
    - Limit pushdown
- Provide index related support:
    - Transform predicates into Region key ranges or secondary indexes
    - Optimize `Index Only` queries
    - Optimize table scan when the index degenerates at runtime
- Provide cost-based optimization:
    - Support statistics
    - Select indexes
    - Estimate the broadcast table cost
- Provide support for multiple Spark interfaces:
    - Support Spark Shell (see the sketch after this list)
    - Support ThriftServer/JDBC
    - Support Spark-SQL interaction
    - Support PySpark Shell
    - Support SparkR
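
For example, a minimal Spark Shell session with TiSpark 1.0 might look like the following sketch. `TiContext` and `tidbMapDatabase` are the TiSpark 1.x entry points; the jar name, PD address, and the `tpch` database with its `lineitem` table are assumptions:

```scala
// Launched as: spark-shell --jars tispark-core-1.0-jar-with-dependencies.jar
import org.apache.spark.sql.TiContext

// Bind TiSpark to the current SparkSession; it reads Region data from TiKV over gRPC
val ti = new TiContext(spark)

// Map a TiDB database into the Spark catalog (database name is an assumption)
ti.tidbMapDatabase("tpch")

// Predicates, aggregates, TopN, and Limit in queries like this can be pushed down to TiKV
spark.sql("SELECT l_returnflag, count(*) FROM lineitem GROUP BY l_returnflag").show()
```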