TiDB 6.6.0 Release Notes
Release date: February 20, 2023
TiDB version: 6.6.0-DMR
Quick access: Quick start | Installation package
In v6.6.0-DMR, the key new features and improvements are as follows:
Category | Feature | Description |
---|---|---|
Scalability and Performance | TiKV supports Partitioned Raft KV storage engine (experimental) | TiKV introduces the Partitioned Raft KV storage engine, and each Region uses an independent RocksDB instance, which can easily expand the storage capacity of the cluster from TB to PB and provide more stable write latency and stronger scalability. |
Scalability and Performance | TiKV supports batch aggregating data requests | This enhancement significantly reduces total RPCs in TiKV batch-get operations. In situations where data is highly dispersed and the gRPC thread pool has insufficient resources, batching coprocessor requests can improve performance by more than 50%. |
Scalability and Performance | TiFlash supports Stale Read and compression exchange | TiFlash supports the Stale Read feature, which can improve query performance in scenarios where real-time requirements are not strict. TiFlash also supports data compression to improve the efficiency of parallel data exchange: overall TPC-H performance improves by 10%, and network usage can be reduced by more than 50%. |
Reliability and availability | Resource control (experimental) | Support resource management based on resource groups, which maps database users to the corresponding resource groups and sets quotas for each resource group based on actual needs. |
Reliability and availability | Historical SQL binding | Support binding historical execution plans and quickly binding execution plans on TiDB Dashboard. |
SQL functionalities | Foreign key (experimental) | Support MySQL-compatible foreign key constraints to maintain data consistency and improve data quality. |
SQL functionalities | Multi-valued indexes (experimental) | Introduce MySQL-compatible multi-valued indexes and enhance the JSON type to improve TiDB's compatibility with MySQL 8.0. |
DB operations and observability | DM supports physical import (experimental) | TiDB Data Migration (DM) integrates TiDB Lightning's physical import mode to improve the performance of full data migration, with performance being up to 10 times faster. |
Feature details
Scalability
Support Partitioned Raft KV storage engine (experimental) #11515 #12842 @busyjay @tonyxuqqi @tabokie @bufferflies @5kbpers @SpadeA-Tang @nolouch
Before TiDB v6.6.0, TiKV's Raft-based storage engine used a single RocksDB instance to store the data of all Regions of the TiKV instance. To support larger clusters more stably, TiDB v6.6.0 introduces a new TiKV storage engine that uses multiple RocksDB instances to store TiKV Region data, with the data of each Region stored independently in a separate RocksDB instance. The new engine can better control the number and level of files in each RocksDB instance, achieve physical isolation of data operations between Regions, and stably manage more data. You can think of it as TiKV managing multiple RocksDB instances through partitioning, which is why the feature is named Partitioned-Raft-KV. The main advantages of this feature are better write performance, faster scaling, and a larger volume of data supported with the same hardware. It can also support larger cluster scales.
Currently, this feature is experimental and not recommended for use in production environments.
For more information, see documentation.
Support the distributed parallel execution framework for DDL operations (experimental) #37125 @zimulala
In previous versions, only one TiDB instance in the entire TiDB cluster was allowed to handle schema change tasks as a DDL owner. To further improve concurrency for DDL operations on large tables, TiDB v6.6.0 introduces the distributed parallel execution framework for DDL, through which all TiDB instances in the cluster can concurrently execute the StateWriteReorganization phase of the same task to speed up DDL execution. This feature is controlled by the system variable tidb_ddl_distribute_reorg and is currently only supported for Add Index operations.
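As a minimal sketch of how the variable might be used, the statements below enable the framework and then add an index. The table t, column c, and index name idx_c are hypothetical, and the feature is experimental, so treat this as an illustration rather than a recommended procedure.

```sql
-- Enable distributed execution of the DDL reorg phase (experimental).
SET GLOBAL tidb_ddl_distribute_reorg = ON;

-- Hypothetical table and index: only ADD INDEX is covered by this feature.
ALTER TABLE t ADD INDEX idx_c (c);
```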
Performance
Support a stable wake-up model for pessimistic lock queues #13298 @MyonKeminta
If an application encounters frequent single-point pessimistic lock conflicts, the existing wake-up mechanism cannot guarantee the time for transactions to acquire locks, which causes high long-tail latency and even lock acquisition timeout. Starting from v6.6.0, you can enable a stable wake-up model for pessimistic locks by setting the value of the system variable tidb_pessimistic_txn_aggressive_locking to ON. In this wake-up model, the wake-up sequence of a queue can be strictly controlled to avoid the waste of resources caused by invalid wake-ups. In scenarios with serious lock conflicts, the stable wake-up model can reduce long-tail latency and the P99 response time.
For more information, see documentation.
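A short sketch of turning on the stable wake-up model, assuming you want it cluster-wide for new sessions:

```sql
-- Enable the stable wake-up model for pessimistic lock queues.
SET GLOBAL tidb_pessimistic_txn_aggressive_locking = ON;

-- Confirm the current value.
SHOW VARIABLES LIKE 'tidb_pessimistic_txn_aggressive_locking';
```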
Batch aggregate data requests #39361 @cfzjywxk @you06
When TiDB sends a data request to TiKV, TiDB compiles the request into different sub-tasks according to the Region where the data is located, and each sub-task only processes the request of a single Region. When the data to be accessed is highly dispersed, even if the size of the data is not large, many sub-tasks will be generated, which in turn will generate many RPC requests and consume extra time. Starting from v6.6.0, TiDB supports partially merging data requests that are sent to the same TiKV instance, which reduces the number of sub-tasks and the overhead of RPC requests. In the case of high data dispersion and insufficient gRPC thread pool resources, batching requests can improve performance by more than 50%.
This feature is enabled by default. You can set the batch size of requests using the system variable tidb_store_batch_size.
Remove the limit on LIMIT clauses #40219 @fzzf678
Starting from v6.6.0, TiDB plan cache supports caching execution plans with a variable as the LIMIT parameter, such as LIMIT ? or LIMIT 10, ?. This feature allows more SQL statements to benefit from plan cache, thus improving execution efficiency. Currently, for security considerations, TiDB can only cache execution plans with ? not greater than 10000.
For more information, see documentation.
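The following sketch shows one way to observe this behavior with a prepared statement. The table t and column a are hypothetical. The session variable last_plan_from_cache can be queried after the second execution to check whether the cached plan was reused.

```sql
-- Hypothetical table t(a). The LIMIT value is passed as a parameter.
PREPARE st FROM 'SELECT * FROM t ORDER BY a LIMIT ?';
SET @n = 10;

EXECUTE st USING @n;  -- first execution builds the plan
EXECUTE st USING @n;  -- second execution is a candidate for a cache hit

-- Check whether the previous statement used a cached plan.
SELECT @@last_plan_from_cache;
```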
TiFlash supports data exchange with compression #6620 @solotzg
To cooperate with multiple nodes for computing, the TiFlash engine needs to exchange data among different nodes. When the size of the data to be exchanged is very large, the performance of data exchange might affect the overall computing efficiency. In v6.6.0, the TiFlash engine introduces a compression mechanism to compress the data that needs to be exchanged when necessary, and then to perform the exchange, thereby improving the efficiency of data exchange.
For more information, see documentation.
TiFlash supports the Stale Read feature #4483 @hehechen
The Stale Read feature has been generally available (GA) since v5.1.1. It allows you to read historical data at a specific timestamp or within a specified time range. Stale Read can reduce read latency and improve query performance by reading data from local TiKV replicas directly. Before v6.6.0, TiFlash did not support Stale Read. Even if a table had TiFlash replicas, Stale Read could only read its TiKV replicas.
Starting from v6.6.0, TiFlash supports the Stale Read feature. When you query the historical data of a table using the AS OF TIMESTAMP syntax or the tidb_read_staleness system variable, if the table has a TiFlash replica, the optimizer can now choose to read the corresponding data from the TiFlash replica, thus further improving query performance.
For more information, see documentation.
Support pushing down the regexp_replace string function to TiFlash #6115 @xzhangxian1008
Reliability
Support resource control based on resource groups (experimental) #38825 @nolouch @BornChanger @glorv @tiancaiamao @Connor1996 @JmPotato @hnes @CabinfeverB @HuSharp
Now you can create resource groups for a TiDB cluster, bind different database users to corresponding resource groups, and set quotas for each resource group according to actual needs. When the cluster resources are limited, all resources used by sessions in the same resource group will be limited to the quota. In this way, even if a resource group is over-consumed, the sessions in other resource groups are not affected. TiDB provides a built-in view of the actual usage of resources on Grafana dashboards, assisting you to allocate resources more rationally.
The introduction of the resource control feature is a milestone for TiDB. It can divide a distributed database cluster into multiple logical units. Even if an individual unit overuses resources, it does not crowd out the resources needed by other units.
With this feature, you can:
- Combine multiple small and medium-sized applications from different systems into a single TiDB cluster. When the workload of an application grows larger, it does not affect the normal operation of other applications. When the system workload is low, busy applications can still be allocated the required system resources even if they exceed the set read and write quotas, so as to achieve the maximum utilization of resources.
- Choose to combine all test environments into a single TiDB cluster, or group the batch tasks that consume more resources into a single resource group. It can improve hardware utilization and reduce operating costs while ensuring that critical applications can always get the necessary resources.
In addition, the rational use of the resource control feature can reduce the number of clusters, ease the difficulty of operation and maintenance, and save management costs.
In v6.6.0, you need to enable both TiDB's global variable tidb_enable_resource_control and the TiKV configuration item resource-control.enabled to enable resource control. Currently, the supported quota method is based on "Request Unit (RU)". RU is TiDB's unified abstraction unit for system resources such as CPU and IO.
For more information, see documentation.
Binding historical execution plans is GA #39199 @fzzf678
In v6.5.0, TiDB extends the binding targets in the CREATE [GLOBAL | SESSION] BINDING statements and supports creating bindings according to historical execution plans. In v6.6.0, this feature is GA. The selection of execution plans is not limited to the current TiDB node. Any historical execution plan generated by any TiDB node can be selected as the target of SQL binding, which further improves the feature usability.
For more information, see documentation.
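As a rough sketch, a binding can be created from a plan digest found in the statements summary. The filter on digest_text and the digest placeholder below are illustrative only. Substitute the digest of the plan you actually want to bind.

```sql
-- Look up the plan digest of a historical statement (illustrative filter).
SELECT digest_text, plan_digest
FROM information_schema.statements_summary
WHERE digest_text LIKE 'select%from%t%';

-- Bind the statement to that historical plan. Replace the placeholder digest.
CREATE GLOBAL BINDING FROM HISTORY USING PLAN DIGEST '<plan_digest>';
```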
Add several optimizer hints #39964 @Reminiscent
TiDB adds several optimizer hints in v6.6.0 to control the execution plan selection of LIMIT operations.
- ORDER_INDEX(): tells the optimizer to use the specified index, to keep the order of the index when reading data, and generates plans similar to Limit + IndexScan(keep order: true).
- NO_ORDER_INDEX(): tells the optimizer to use the specified index, not to keep the order of the index when reading data, and generates plans similar to TopN + IndexScan(keep order: false).
Continuously introducing optimizer hints provides users with more intervention methods, helps solve SQL performance issues, and improves the stability of overall performance.
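The two hints can be compared with EXPLAIN, as in this sketch. The table t, column a, and index idx_a are hypothetical names chosen for illustration.

```sql
-- Hypothetical table t with an index idx_a on column a.
-- Keep index order while scanning, so a Limit can sit on top of the scan.
EXPLAIN SELECT /*+ ORDER_INDEX(t, idx_a) */ a FROM t ORDER BY a LIMIT 10;

-- Scan the index without keeping order, leaving sorting to a TopN operator.
EXPLAIN SELECT /*+ NO_ORDER_INDEX(t, idx_a) */ a FROM t ORDER BY a LIMIT 10;
```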
Support dynamically managing the resource usage of DDL operations (experimental) #38025 @hawkingrei
TiDB v6.6.0 introduces resource management for DDL operations to reduce the impact of DDL changes on online applications by automatically controlling the CPU usage of these operations. This feature is effective only after the DDL distributed parallel execution framework is enabled.
Availability
Support configuring SURVIVAL_PREFERENCES for placement rules in SQL #38605 @nolouch
SURVIVAL_PREFERENCES provides data survival preference settings to increase the disaster survivability of data. By specifying SURVIVAL_PREFERENCES, you can control the following:
- For TiDB clusters deployed across cloud regions, when a cloud region fails, the specified databases or tables can survive in another cloud region.
- For TiDB clusters deployed in a single cloud region, when an availability zone fails, the specified databases or tables can survive in another availability zone.
For more information, see documentation.
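A minimal sketch of a placement policy that uses this option is shown below. It assumes the TiKV nodes carry region, zone, and host labels. The policy name multiaz and the table t are hypothetical.

```sql
-- Prefer surviving a region failure first, then a zone failure, then a host failure.
-- Assumes TiKV stores are labeled with region, zone, and host.
CREATE PLACEMENT POLICY multiaz SURVIVAL_PREFERENCES="[region, zone, host]";

-- Attach the policy to a hypothetical table.
ALTER TABLE t PLACEMENT POLICY=multiaz;
```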
Support rolling back DDL operations via the FLASHBACK CLUSTER TO TIMESTAMP statement #14088 @Defined2014 @JmPotato
The FLASHBACK CLUSTER TO TIMESTAMP statement supports restoring the entire cluster to a specified point in time within the Garbage Collection (GC) lifetime. In TiDB v6.6.0, this feature adds support for rolling back DDL operations. This can be used to quickly undo a DML or DDL misoperation on a cluster, roll back a cluster within minutes, and roll back a cluster multiple times on the timeline to determine when specific data changes occurred.
For more information, see documentation.
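For illustration, the statement takes a timestamp literal. The timestamp below is an arbitrary example and must fall within the GC lifetime of the cluster. Because the statement rewinds the whole cluster, try it on a test cluster first.

```sql
-- Restore the entire cluster to an example point in time (must be within GC lifetime).
FLASHBACK CLUSTER TO TIMESTAMP '2023-02-20 12:00:00';
```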
SQL
Support MySQL-compatible foreign key constraints (experimental) #18209 @crazycs520
TiDB v6.6.0 introduces the foreign key constraints feature, which is compatible with MySQL. This feature supports referencing within a table or between tables, constraints validation, and cascade operations. This feature helps to migrate applications to TiDB, maintain data consistency, improve data quality, and facilitate data modeling.
For more information, see documentation.
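The sketch below illustrates a cross-table reference with a cascade operation. The parent and child tables are hypothetical. With ON DELETE CASCADE, deleting the parent row is expected to remove the referencing child row as well.

```sql
-- Hypothetical parent and child tables.
CREATE TABLE parent (id INT PRIMARY KEY);
CREATE TABLE child (
  id INT PRIMARY KEY,
  pid INT,
  INDEX idx_pid (pid),
  FOREIGN KEY (pid) REFERENCES parent (id) ON DELETE CASCADE
);

INSERT INTO parent VALUES (1);
INSERT INTO child VALUES (1, 1);

-- The delete cascades to child rows that reference parent id 1.
DELETE FROM parent WHERE id = 1;
SELECT * FROM child;
```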
Support MySQL-compatible multi-valued indexes (experimental) #39592 @xiongjiwei @qw4990
TiDB introduces MySQL-compatible multi-valued indexes in v6.6.0. Filtering the values of an array in a JSON column is a common operation, but normal indexes cannot help speed up such an operation. Creating a multi-valued index on an array can greatly improve filtering performance. If an array in the JSON column has a multi-valued index, you can use the multi-valued index to filter the retrieval conditions with the MEMBER OF(), JSON_CONTAINS(), and JSON_OVERLAPS() functions, thereby reducing I/O consumption and improving operation speed.
Introducing multi-valued indexes further enhances TiDB's support for the JSON data type and also improves TiDB's compatibility with MySQL 8.0.
For more information, see documentation.
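As a sketch of the idea, the following statements define a multi-valued index over a JSON array and filter on it. The customers table, the zips index, and the sample data are hypothetical.

```sql
-- Hypothetical table with a multi-valued index on the zipcode array.
CREATE TABLE customers (
  id BIGINT PRIMARY KEY,
  custinfo JSON,
  INDEX zips ((CAST(custinfo->'$.zipcode' AS UNSIGNED ARRAY)))
);

INSERT INTO customers VALUES (1, '{"zipcode": [94582, 94536]}');

-- Filters of this shape are candidates for the multi-valued index.
SELECT * FROM customers WHERE 94536 MEMBER OF (custinfo->'$.zipcode');
SELECT * FROM customers WHERE JSON_CONTAINS(custinfo->'$.zipcode', '[94582]');
SELECT * FROM customers WHERE JSON_OVERLAPS(custinfo->'$.zipcode', '[94536, 10001]');
```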
DB operations
Support configuring read-only storage nodes for resource-consuming tasks @v01dstar
In production environments, some read-only operations, such as backups and large-scale data reading and analysis, might regularly consume a large number of resources and affect the performance of the entire cluster. TiDB v6.6.0 supports configuring read-only storage nodes for resource-consuming read-only tasks to reduce the impact on the online application. Currently, TiDB, TiSpark, and BR support reading data from read-only storage nodes. You can configure read-only storage nodes by following the steps in the documentation, and then specify where data is read through the system variable tidb_replica_read, the TiSpark configuration item spark.tispark.replica_read, or the br command line argument --replica-read-label, to ensure the stability of cluster performance.
For more information, see documentation.
Support dynamically modifying store-io-pool-size #13964 @LykxSassinator
The TiKV configuration item raftstore.store-io-pool-size specifies the allowable number of threads that process Raft I/O tasks, which can be adjusted when tuning TiKV performance. Before v6.6.0, this configuration item could not be modified dynamically. Starting from v6.6.0, you can modify this configuration without restarting the server, which means more flexible performance tuning.
For more information, see documentation.
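One way to change the item online is the SET CONFIG statement, sketched below. The value 2 is an arbitrary example rather than a tuning recommendation.

```sql
-- Change the Raft I/O thread pool size on all TiKV instances without a restart.
SET CONFIG tikv `raftstore.store-io-pool-size` = 2;

-- Verify the value currently in effect.
SHOW CONFIG WHERE type = 'tikv' AND name = 'raftstore.store-io-pool-size';
```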
Support specifying the SQL script executed upon TiDB cluster initialization #35624 @morgo
When you start a TiDB cluster for the first time, you can specify the SQL script to be executed by configuring the command line parameter --initialize-sql-file. You can use this feature when you need to perform such operations as modifying the value of a system variable, creating a user, or granting privileges.
For more information, see documentation.
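A hypothetical initialization script covering the three kinds of operations mentioned above might look like the following. The user name, password, database name, and chosen variable are placeholders for illustration.

```sql
-- Contents of a hypothetical script passed via --initialize-sql-file.

-- Modify the value of a system variable.
SET GLOBAL tidb_enable_telemetry = OFF;

-- Create an application user (placeholder credentials).
CREATE USER 'app_user'@'%' IDENTIFIED BY 'change_me';

-- Grant privileges on a placeholder database.
GRANT SELECT, INSERT, UPDATE, DELETE ON appdb.* TO 'app_user'@'%';
```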
TiDB Data Migration (DM) integrates with TiDB Lightning's physical import mode for up to a 10x performance boost for full migration (experimental) @lance6716
In v6.6.0, DM full migration capability integrates with physical import mode of TiDB Lightning, which enables DM to improve the performance of full data migration by up to 10 times, greatly reducing the migration time in large data volume scenarios.
Before v6.6.0, for large data volume scenarios, you were required to configure physical import tasks in TiDB Lightning separately for fast full data migration, and then use DM for incremental data migration, which was a complex configuration. Starting from v6.6.0, you can migrate large data volumes without the need to configure TiDB Lightning tasks; one DM task can accomplish the migration.
For more information, see documentation.
TiDB Lightning adds a new configuration parameter "header-schema-match" to address the issue of mismatched column names between the source file and the target table @dsdashun
In v6.6.0, TiDB Lightning adds a new configuration parameter "header-schema-match". The default value is true, which means the first row of the source CSV file is treated as the column names, and these names are expected to match the column names of the target table. If the field names in the CSV table header do not match the column names of the target table, you can set this configuration to false. TiDB Lightning will then ignore the error and continue to import the data in the order of the columns in the target table.
For more information, see documentation.
TiDB Lightning supports enabling compressed transfers when sending key-value pairs to TiKV #41163 @sleepymole
Starting from v6.6.0, TiDB Lightning supports compressing locally encoded and sorted key-value pairs for network transfer when sending them to TiKV, thus reducing the amount of data transferred over the network and lowering the network bandwidth overhead. In earlier versions without this feature, TiDB Lightning required relatively high network bandwidth and incurred high traffic charges for large data volumes.
This feature is disabled by default. To enable it, you can set the compress-kv-pairs configuration item of TiDB Lightning to "gzip" or "gz".
For more information, see documentation.
The TiKV-CDC tool is now GA and supports subscribing to data changes of RawKV #48 @zeminzhou @haojinming @pingyu
TiKV-CDC is a CDC (Change Data Capture) tool for TiKV clusters. TiKV and PD can constitute a KV database when used without TiDB, which is called RawKV. TiKV-CDC supports subscribing to data changes of RawKV and replicating them to a downstream TiKV cluster in real time, thus enabling cross-cluster replication of RawKV.
For more information, see documentation.
TiCDC supports scaling out a single table on Kafka changefeeds and distributing the changefeed to multiple TiCDC nodes (experimental) #7720 @overvenus
Before v6.6.0, when a table in the upstream accepted a large amount of writes, the replication capability of this table could not be scaled out, resulting in an increase in the replication latency. Starting from TiCDC v6.6.0, the changefeed of an upstream table can be distributed to multiple TiCDC nodes in a Kafka sink, which means the replication capability of a single table is scaled out.
For more information, see documentation.
GORM adds TiDB integration tests. Now TiDB is the default database supported by GORM. #6014 @Icemap
- In v1.4.6, GORM MySQL driver adapts to the AUTO_RANDOM attribute of TiDB #104
- In v1.4.6, GORM MySQL driver fixes the issue that when connecting to TiDB, the Unique attribute of the Unique field cannot be modified during AutoMigrate #105
- GORM documentation mentions TiDB as the default database #638
For more information, see GORM documentation.
Observability
Support quickly creating SQL binding on TiDB Dashboard #781 @YiniXu9506
TiDB v6.6.0 supports creating SQL binding from statement history, which allows you to quickly bind a SQL statement to a specific plan on TiDB Dashboard.
By providing a user-friendly interface, this feature simplifies the process of binding plans in TiDB, reduces the operation complexity, and improves the efficiency and user experience of the plan binding process.
For more information, see documentation.
Add warning for caching execution plans @qw4990
When an execution plan cannot be cached, TiDB indicates the reason in a warning to make diagnostics easier. For example:
mysql> PREPARE st FROM 'SELECT * FROM t WHERE a<?';
Query OK, 0 rows affected (0.00 sec)
mysql> SET @a='1';
Query OK, 0 rows affected (0.00 sec)
mysql> EXECUTE st USING @a;
Empty set, 1 warning (0.01 sec)
mysql> SHOW WARNINGS;
+---------+------+----------------------------------------------+
| Level   | Code | Message                                      |
+---------+------+----------------------------------------------+
| Warning | 1105 | skip plan-cache: '1' may be converted to INT |
+---------+------+----------------------------------------------+
In the preceding example, the optimizer converts a non-INT type to an INT type, and the execution plan might change with the change of the parameter, so TiDB does not cache the plan.
For more information, see documentation.
Add a Warnings field to the slow query log #39893 @time-and-fate
TiDB v6.6.0 adds a Warnings field to the slow query log to help diagnose performance issues. This field records warnings generated during the execution of a slow query. You can also view the warnings on the slow query page of TiDB Dashboard.
For more information, see documentation.
Automatically capture the generation of SQL execution plans #38779 @Yisaer
In the process of troubleshooting execution plan issues, PLAN REPLAYER can help preserve the scene and improve the efficiency of diagnosis. However, in some scenarios, the generation of some execution plans cannot be reproduced freely, which makes the diagnosis work more difficult.
To address such issues, in TiDB v6.6.0, PLAN REPLAYER extends the capability of automatic capture. With the PLAN REPLAYER CAPTURE command, you can register the target SQL statement in advance and also specify the target execution plan at the same time. When TiDB detects the SQL statement or the execution plan that matches the registered target, it automatically generates and packages the PLAN REPLAYER information. When the execution plan is unstable, this feature can improve diagnostic efficiency.
To use this feature, set the value of tidb_enable_plan_replayer_capture to ON.
For more information, see documentation.
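The registration flow described above can be sketched as follows. The digest values are placeholders that you would take from the statements summary or the slow query log.

```sql
-- Turn on automatic capture.
SET GLOBAL tidb_enable_plan_replayer_capture = ON;

-- Register a target SQL digest together with a target plan digest (placeholders).
PLAN REPLAYER CAPTURE '<sql_digest>' '<plan_digest>';

-- Alternatively, capture any plan generated for the registered SQL digest.
PLAN REPLAYER CAPTURE '<sql_digest>' '*';
```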
Support persisting statements summary (experimental) #40812 @mornyx
Before v6.6.0, statements summary data was kept only in memory and was lost when a TiDB server restarted. Starting from v6.6.0, TiDB supports enabling statements summary persistence, which allows historical data to be written to disk on a regular basis. With persistence enabled, queries on the system tables read their results from disk instead of memory. After TiDB restarts, all historical data remains available.
For more information, see documentation.
Security
TiFlash supports automatic rotations of TLS certificates #5503 @ywqzzy
In v6.6.0, TiDB supports automatic rotations of TiFlash TLS certificates. For a TiDB cluster with encrypted data transmission between components enabled, when a TLS certificate of TiFlash expires and needs to be reissued with a new one, the new TiFlash TLS certificate can be automatically loaded without restarting the TiDB cluster. In addition, the rotation of a TLS certificate between components within a TiDB cluster does not affect the use of the TiDB cluster, which ensures high availability of the cluster.
For more information, see documentation.
TiDB Lightning supports accessing Amazon S3 data via AWS IAM role keys and session tokens #4075 @okJiang
Before v6.6.0, TiDB Lightning only supported accessing S3 data via an AWS IAM user's access keys (each access key consists of an access key ID and a secret access key), so you could not use a temporary session token to access S3 data. Starting from v6.6.0, TiDB Lightning also supports accessing S3 data via an AWS IAM role's access keys plus session tokens, which improves data security.
For more information, see documentation.
Telemetry
- Starting from February 20, 2023, the telemetry feature is disabled by default in new versions of TiDB and TiDB Dashboard (including v6.6.0). If you upgrade from a previous version that uses the default telemetry configuration, the telemetry feature is disabled after the upgrade. For the specific versions, see TiDB Release Timeline.
- Starting from v1.11.3, the telemetry feature is disabled by default in newly deployed TiUP. If you upgrade from a previous version of TiUP to v1.11.3 or a later version, the telemetry feature keeps the same status as before the upgrade.
Compatibility changes
MySQL compatibility
Support MySQL-compatible foreign key constraints (experimental) #18209 @crazycs520
For more information, see the SQL section in this document and documentation.
Support the MySQL-compatible multi-valued indexes (experimental) #39592 @xiongjiwei @qw4990
For more information, see the SQL section in this document and documentation.
System variables
Variable name | Change type | Description |
---|---|---|
tidb_enable_amend_pessimistic_txn | Deleted | Starting from v6.5.0, this variable is deprecated. Starting from v6.6.0, this variable and the AMEND TRANSACTION feature are deleted. TiDB will use meta lock to avoid the Information schema is changed error. |
tidb_enable_concurrent_ddl | Deleted | This variable controls whether to allow TiDB to use concurrent DDL statements. When this variable is disabled, TiDB uses the old DDL execution framework, which provides limited support for concurrent DDL execution. Starting from v6.6.0, this variable is deleted and TiDB no longer supports the old DDL execution framework. |
tidb_ttl_job_run_interval | Deleted | This variable is used to control the scheduling interval of TTL jobs in the background. Starting from v6.6.0, this variable is deleted, because TiDB provides the TTL_JOB_INTERVAL attribute for every table to control the TTL runtime, which is more flexible than tidb_ttl_job_run_interval. |
foreign_key_checks | Modified | This variable controls whether to enable the foreign key constraint check. The default value changes from OFF to ON, which means enabling the foreign key check by default. |
tidb_enable_foreign_key | Modified | This variable controls whether to enable the foreign key feature. The default value changes from OFF to ON, which means enabling foreign key by default. |
tidb_enable_general_plan_cache | Modified | This variable controls whether to enable General Plan Cache. Starting from v6.6.0, this variable is renamed to tidb_enable_non_prepared_plan_cache. |
tidb_enable_historical_stats | Modified | This variable controls whether to enable historical statistics. The default value changes from OFF to ON, which means that historical statistics are enabled by default. |
tidb_enable_telemetry | Modified | The default value changes from ON to OFF, which means that telemetry is disabled by default in TiDB. |
tidb_general_plan_cache_size | Modified | This variable controls the maximum number of execution plans that can be cached by General Plan Cache. Starting from v6.6.0, this variable is renamed to tidb_non_prepared_plan_cache_size. |
tidb_replica_read | Modified | A new value option learner is added for this variable to specify the learner replicas with which TiDB reads data from read-only nodes. |
tidb_replica_read | Modified | A new value option prefer-leader is added for this variable to improve the overall read availability of TiDB clusters. When this option is set, TiDB prefers to read from the leader replica. When the performance of the leader replica significantly decreases, TiDB automatically reads from follower replicas. |
tidb_store_batch_size | Modified | This variable controls the batch size of the Coprocessor Tasks of the IndexLookUp operator. 0 means to disable batch. Starting from v6.6.0, the default value is changed from 0 to 4, which means 4 Coprocessor tasks will be batched into one task for each batch of requests. |
mpp_exchange_compression_mode | Newly added | This variable specifies the data compression mode of the MPP Exchange operator. It takes effect when TiDB selects the MPP execution plan with the version number 1. The default value UNSPECIFIED means that TiDB automatically selects the FAST compression mode. |
mpp_version | Newly added | This variable specifies the version of the MPP execution plan. After a version is specified, TiDB selects the specified version of the MPP execution plan. The default value UNSPECIFIED means that TiDB automatically selects the latest version 1. |
tidb_ddl_distribute_reorg | Newly added | This variable controls whether to enable distributed execution of the DDL reorg phase to accelerate this phase. The default value OFF means not to enable distributed execution of the DDL reorg phase by default. Currently, this variable takes effect only for ADD INDEX. |
tidb_enable_historical_stats_for_capture | Newly added | This variable controls whether the information captured by PLAN REPLAYER CAPTURE includes historical statistics by default. The default value OFF means that historical statistics are not included by default. |
tidb_enable_plan_cache_for_param_limit | Newly added | This variable controls whether Prepared Plan Cache caches execution plans that contain COUNT after Limit. The default value is ON, which means Prepared Plan Cache supports caching such execution plans. Note that Prepared Plan Cache does not support caching execution plans with a COUNT condition that counts a number greater than 10000. |
tidb_enable_plan_replayer_capture | Newly added | This variable controls whether to enable the PLAN REPLAYER CAPTURE feature. The default value OFF means to disable the PLAN REPLAYER CAPTURE feature. |
tidb_enable_resource_control | Newly added | This variable controls whether to enable the resource control feature. The default value is OFF. When this variable is set to ON, the TiDB cluster supports resource isolation of applications based on resource groups. |
tidb_historical_stats_duration | Newly added | This variable controls how long the historical statistics are retained in storage. The default value is 7 days. |
tidb_index_join_double_read_penalty_cost_rate | Newly added | This variable controls whether to add some penalty cost to the selection of index join. The default value 0 means that this feature is disabled by default. |
tidb_pessimistic_txn_aggressive_locking | Newly added | This variable controls whether to use enhanced pessimistic locking wake-up model for pessimistic transactions. The default value OFF means not to use such a wake-up model for pessimistic transactions by default. |
tidb_stmt_summary_enable_persistent | Newly added | This variable is read-only. It controls whether to enable statements summary persistence. The value of this variable is the same as that of the configuration item tidb_stmt_summary_enable_persistent. |
tidb_stmt_summary_filename | Newly added | This variable is read-only. It specifies the file to which persistent data is written when statements summary persistence is enabled. The value of this variable is the same as that of the configuration item tidb_stmt_summary_filename. |
tidb_stmt_summary_file_max_backups | Newly added | This variable is read-only. It specifies the maximum number of data files that can be persisted when statements summary persistence is enabled. The value of this variable is the same as that of the configuration item tidb_stmt_summary_file_max_backups. |
tidb_stmt_summary_file_max_days | Newly added | This variable is read-only. It specifies the maximum number of days to keep persistent data files when statements summary persistence is enabled. The value of this variable is the same as that of the configuration item tidb_stmt_summary_file_max_days. |
tidb_stmt_summary_file_max_size | Newly added | This variable is read-only. It specifies the maximum size of a persistent data file when statements summary persistence is enabled. The value of this variable is the same as that of the configuration item tidb_stmt_summary_file_max_size. |
Configuration file parameters
Configuration file | Configuration parameter | Change type | Description |
---|---|---|---|
TiKV | rocksdb.enable-statistics | Deleted | This configuration item specifies whether to enable RocksDB statistics. Starting from v6.6.0, this item is deleted. RocksDB statistics are enabled for all clusters by default to help diagnostics. For details, see #13942. |
TiKV | raftdb.enable-statistics | Deleted | This configuration item specifies whether to enable Raft RocksDB statistics. Starting from v6.6.0, this item is deleted. Raft RocksDB statistics are enabled for all clusters by default to help diagnostics. For details, see #13942. |
TiKV | storage.block-cache.shared | Deleted | Starting from v6.6.0, this configuration item is deleted, and the block cache is enabled by default and cannot be disabled. For details, see #12936. |
DM | on-duplicate | Deleted | This configuration item controls the methods to resolve conflicts during the full import phase. In v6.6.0, new configuration items `on-duplicate-logical` and `on-duplicate-physical` are introduced to replace `on-duplicate`. |
TiDB | enable-telemetry | Modified | Starting from v6.6.0, the default value changes from `true` to `false`, which means that telemetry is disabled by default in TiDB. |
TiKV | rocksdb.defaultcf.block-size and rocksdb.writecf.block-size | Modified | The default values change from `64K` to `32K`. |
TiKV | rocksdb.defaultcf.block-cache-size, rocksdb.writecf.block-cache-size, rocksdb.lockcf.block-cache-size | Deprecated | Starting from v6.6.0, these configuration items are deprecated. For details, see #12936. |
PD | enable-telemetry | Modified | Starting from v6.6.0, the default value changes from `true` to `false`, which means that telemetry is disabled by default in TiDB Dashboard. |
DM | import-mode | Modified | The possible values of this configuration item are changed from `"sql"` and `"loader"` to `"logical"` and `"physical"`. The default value is `"logical"`, which means using TiDB Lightning's logical import mode to import data. |
TiFlash | profile.default.max_memory_usage_for_all_queries | Modified | Specifies the memory usage limit for the generated intermediate data in all queries. Starting from v6.6.0, the default value changes from `0` to `0.8`, which means the limit is 80% of the total memory. |
TiCDC | consistent.storage | Modified | This configuration item specifies the path under which redo log backup is stored. Two more value options are added for `scheme`: GCS and Azure. |
TiDB | initialize-sql-file | Newly added | This configuration item specifies the SQL script to be executed when the TiDB cluster is started for the first time. The default value is empty. |
TiDB | tidb_stmt_summary_enable_persistent | Newly added | This configuration item controls whether to enable statements summary persistence. The default value is `false`, which means this feature is not enabled by default. |
TiDB | tidb_stmt_summary_file_max_backups | Newly added | When statements summary persistence is enabled, this configuration specifies the maximum number of data files that can be persisted. 0 means no limit on the number of files. |
TiDB | tidb_stmt_summary_file_max_days | Newly added | When statements summary persistence is enabled, this configuration specifies the maximum number of days to keep persistent data files. |
TiDB | tidb_stmt_summary_file_max_size | Newly added | When statements summary persistence is enabled, this configuration specifies the maximum size of a persistent data file (in MiB). |
TiDB | tidb_stmt_summary_filename | Newly added | When statements summary persistence is enabled, this configuration specifies the file to which persistent data is written. |
TiKV | resource-control.enabled | Newly added | Whether to enable scheduling for user foreground read/write requests according to the Request Unit (RU) of the corresponding resource groups. The default value is `false`, which disables scheduling according to the RU of the corresponding resource groups. |
TiKV | storage.engine | Newly added | This configuration item specifies the type of the storage engine. Value options are `"raft-kv"` and `"partitioned-raft-kv"`. This configuration item can only be specified when creating a cluster and cannot be modified once specified. |
TiKV | rocksdb.write-buffer-flush-oldest-first | Newly added | This configuration item specifies the flush strategy used when the memory usage of memtable of the current RocksDB reaches the threshold. |
TiKV | rocksdb.write-buffer-limit | Newly added | This configuration item specifies the limit on total memory used by memtable of all RocksDB instances in a single TiKV. The default value is 25% of the total machine memory. |
PD | pd-server.enable-gogc-tuner | Newly added | This configuration item controls whether to enable the GOGC tuner, which is disabled by default. |
PD | pd-server.gc-tuner-threshold | Newly added | This configuration item specifies the maximum memory threshold ratio for tuning GOGC. The default value is `0.6`. |
PD | pd-server.server-memory-limit-gc-trigger | Newly added | This configuration item specifies the threshold ratio at which PD tries to trigger GC. The default value is `0.7`. |
PD | pd-server.server-memory-limit | Newly added | This configuration item specifies the memory limit ratio for a PD instance. The value 0 means no memory limit. |
TiCDC | scheduler.region-per-span | Newly added | This configuration item controls whether to split a table into multiple replication ranges based on the number of Regions, and these ranges can be replicated by multiple TiCDC nodes. The default value is `50000`. |
TiDB Lightning | compress-kv-pairs | Newly added | This configuration item controls whether to enable compression when sending KV pairs to TiKV in the physical import mode. The default value is empty, meaning that the compression is not enabled. |
DM | checksum-physical | Newly added | This configuration item controls whether DM performs `ADMIN CHECKSUM TABLE <table>` for each table to verify data integrity after the import. The default value is `"required"`, which performs admin checksum after the import. If checksum fails, DM pauses the task and you need to manually handle the failure. |
DM | disk-quota-physical | Newly added | This configuration item sets the disk quota. It corresponds to the disk-quota configuration of TiDB Lightning. |
DM | on-duplicate-logical | Newly added | This configuration item controls how DM resolves conflicting data in the logical import mode. The default value is `"replace"`, which means using the new data to replace the existing data. |
DM | on-duplicate-physical | Newly added | This configuration item controls how DM resolves conflicting data in the physical import mode. The default value is `"none"`, which means not resolving conflicting data. `"none"` has the best performance, but might lead to inconsistent data in the downstream database. |
DM | sorting-dir-physical | Newly added | This configuration item specifies the directory used for local KV sorting in the physical import mode. The default value is the same as the dir configuration. |
sync-diff-inspector | skip-non-existing-table | Newly added | This configuration item controls whether to skip checking upstream and downstream data consistency when tables in the downstream do not exist in the upstream. |
TiSpark | spark.tispark.replica_read | Newly added | This configuration item controls the type of replicas to be read. The value options are `leader`, `follower`, and `learner`. |
TiSpark | spark.tispark.replica_read.label | Newly added | This configuration item is used to set labels for the target TiKV node. |
Others
- Support dynamically modifying `store-io-pool-size`. This facilitates more flexible TiKV performance tuning.
- Remove the limit on `LIMIT` clauses, thus improving the execution performance.
- Starting from v6.6.0, BR does not support restoring data to clusters earlier than v6.1.0.
- Starting from v6.6.0, TiDB no longer supports modifying column types on partitioned tables because of potential correctness issues.
Improvements
TiDB
- Improve the scheduling mechanism of TTL background cleaning tasks to allow the cleaning task of a single table to be split into several sub-tasks and scheduled to run on multiple TiDB nodes simultaneously #40361 @YangKeao
- Optimize the column name display of the result returned by running multi-statements after setting a non-default delimiter #39662 @mjonss
- Optimize the execution efficiency of statements after warning messages are generated #39702 @tiancaiamao
- Support distributed data backfill for `ADD INDEX` (experimental) #37119 @zimulala
- Support using `CURDATE()` as the default value of a column #38356 @CbcWestwolf
- `partial order prop push down` now supports the LIST-type partitioned tables #40273 @winoros
- Add error messages for conflicts between optimizer hints and execution plan bindings #40910 @Reminiscent
- Optimize the plan cache strategy to avoid non-optimal plans when using plan cache in some scenarios #40312 #40218 #40280 #41136 #40686 @qw4990
- Clear expired region cache regularly to avoid memory leak and performance degradation #40461 @sticnarf
- `MODIFY COLUMN` is not supported on partitioned tables #39915 @wjhuang2016
- Disable renaming of columns that partitioned tables depend on #40150 @mjonss
- Refine the error message reported when a column that a partitioned table depends on is deleted #38739 @jiyfhust
- Add a mechanism that `FLASHBACK CLUSTER` retries when it fails to check the `min-resolved-ts` #39836 @Defined2014
TiKV
- Optimize the default values of some parameters in partitioned-raft-kv mode: the default value of the TiKV configuration item `storage.block-cache.capacity` is adjusted from 45% to 30%, and the default value of `region-split-size` is adjusted from `96MiB` to `10GiB`. In raft-kv mode, when `enable-region-bucket` is `true`, `region-split-size` is adjusted to 1 GiB by default. #12842 @tonyxuqqi
- Support priority scheduling in Raftstore asynchronous writes #13730 @Connor1996
- Support starting TiKV on a CPU with less than 1 core #13586 #13752 #14017 @andreid-db
- Optimize the new detection mechanism of Raftstore slow score and add `evict-slow-trend-scheduler` #14131 @innerr
- Force the block cache of RocksDB to be shared and no longer support setting the block cache separately according to CF #12936 @busyjay
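
The default-value changes for `storage.block-cache.capacity` and `region-split-size` described in this list can be summarized as a small lookup. The following is only a sketch of the rules as stated in these notes, not TiKV code: the function names are hypothetical, and it assumes that the 10 GiB default in partitioned-raft-kv mode applies regardless of `enable-region-bucket`.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def default_region_split_size(engine: str, enable_region_bucket: bool = False) -> int:
    """Return the default region-split-size in bytes (hypothetical helper)."""
    if engine == "partitioned-raft-kv":
        # Assumption: enable-region-bucket does not change this default.
        return 10 * GIB
    if engine == "raft-kv":
        # With buckets enabled the default becomes 1 GiB, otherwise 96 MiB.
        return 1 * GIB if enable_region_bucket else 96 * MIB
    raise ValueError(f"unknown storage engine: {engine!r}")


def default_block_cache_ratio(engine: str) -> float:
    """Return the default storage.block-cache.capacity as a fraction of memory."""
    if engine == "partitioned-raft-kv":
        return 0.30
    if engine == "raft-kv":
        return 0.45
    raise ValueError(f"unknown storage engine: {engine!r}")
```

For example, `default_region_split_size("raft-kv", enable_region_bucket=True)` evaluates to 1 GiB expressed in bytes.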
PD
- Support managing the global memory threshold to alleviate the OOM problem (experimental) #5827 @hnes
- Add the GC Tuner to alleviate the GC pressure (experimental) #5827 @hnes
- Add the `evict-slow-trend-scheduler` scheduler to detect and schedule abnormal nodes #5808 @innerr
- Add the keyspace manager to manage keyspace #5293 @AmoebaProtozoa
TiFlash
- Support an independent MVCC bitmap filter that decouples the MVCC filtering operations in the TiFlash data scanning process, which provides the foundation for future optimization of the data scanning process #6296 @JinheLin
- Reduce the memory usage of TiFlash by up to 30% when there is no query #6589 @hongyunyan
Tools
Backup & Restore (BR)
TiCDC
TiDB Data Migration (DM)
- Optimize DM alert rules and content #7376 @D3Hunter

  Previously, alerts similar to "DM_XXX_process_exits_with_error" were raised whenever a related error occurred. But some alerts are caused by idle database connections, which can be recovered after reconnecting. To reduce these kinds of alerts, DM divides errors into two types: automatically recoverable errors and unrecoverable errors.

  - For an error that is automatically recoverable, DM reports the alert only if the error occurs more than 3 times within 2 minutes.
  - For an error that is not automatically recoverable, DM maintains the original behavior and reports the alert immediately.

- Optimize relay performance by adding the async/batch relay writer #4287 @GMHDBJD
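
The alert rule for automatically recoverable errors (more than 3 occurrences within 2 minutes) behaves like a sliding-window counter. The sketch below illustrates that rule only. It is not how DM implements its alerts, and the class name, method name, and the choice to treat the 2-minute boundary as inclusive are assumptions made for the example.

```python
from collections import deque


class RecoverableErrorAlerter:
    """Hypothetical sliding-window model of the alert rule described above."""

    def __init__(self, threshold: int = 3, window_seconds: float = 120.0):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record(self, now: float, recoverable: bool) -> bool:
        """Record one error at time `now` (seconds); return True if an alert fires."""
        if not recoverable:
            # Unrecoverable errors keep the original behavior: alert immediately.
            return True
        self._timestamps.append(now)
        # Drop occurrences that fell out of the window (boundary kept inclusive).
        while now - self._timestamps[0] > self.window_seconds:
            self._timestamps.popleft()
        # Alert only when the error occurred MORE than `threshold` times.
        return len(self._timestamps) > self.threshold
```

With errors recorded at 0, 30, 60, and 90 seconds, only the fourth call returns `True`, because it is the first time more than 3 occurrences fall inside one 2-minute window.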
TiDB Lightning
- Physical Import Mode supports keyspace #40531 @iosmanthus
- Support setting the maximum number of conflicts by `lightning.max-error` #40743 @dsdashun
- Support importing CSV data files with BOM headers #40744 @dsdashun
- Optimize the processing logic when encountering TiKV flow-limiting errors and try other available regions instead #40205 @lance6716
- Disable checking the table foreign keys during import #40027 @sleepymole
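
To show why BOM headers in CSV files need special handling, the following sketch builds a small UTF-8 CSV payload with a byte order mark and reads its header row twice. It only illustrates the general problem using the Python standard library and says nothing about how TiDB Lightning handles it internally. The sample data and the `read_header` helper are invented for the example.

```python
import codecs
import csv
import io

# A CSV payload that starts with the UTF-8 byte order mark (BOM).
raw = codecs.BOM_UTF8 + "id,name\n1,alice\n".encode("utf-8")


def read_header(data: bytes, encoding: str) -> list:
    """Decode `data` with `encoding` and return the first CSV row."""
    text = io.StringIO(data.decode(encoding), newline="")
    return next(csv.reader(text))


# Plain UTF-8 decoding keeps the BOM as part of the first column name.
naive_header = read_header(raw, "utf-8")
# The "utf-8-sig" codec strips a leading BOM if one is present.
clean_header = read_header(raw, "utf-8-sig")
```

Here `naive_header[0]` is `"\ufeffid"` while `clean_header[0]` is `"id"`, so a reader that does not account for the BOM would fail to match the first column by name.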
Dumpling
sync-diff-inspector
- Add a new parameter `skip-non-existing-table` to control whether to skip checking upstream and downstream data consistency when tables in the downstream do not exist in the upstream #692 @lichunzhu @liumengya94
Bug fixes
TiDB
- Fix the issue that a statistics collection task fails due to an incorrect `datetime` value #39336 @xuyifangreeneyes
- Fix the issue that `stats_meta` is not created following table creation #38189 @xuyifangreeneyes
- Fix frequent write conflicts in transactions when performing DDL data backfill #24427 @mjonss
- Fix the issue that sometimes an index cannot be created for an empty table using ingest mode #39641 @tangenta
- Fix the issue that `wait_ts` in the slow query log is the same for different SQL statements within the same transaction #39713 @TonsnakeLin
- Fix the issue that the `Assertion Failed` error is reported when adding a column during the process of deleting a row record #39570 @wjhuang2016
- Fix the issue that the `not a DDL owner` error is reported when modifying a column type #39643 @zimulala
- Fix the issue that no error is reported when inserting a row after exhaustion of the auto-increment values of the `AUTO_INCREMENT` column #38950 @Dousir9
- Fix the issue that the `Unknown column` error is reported when creating an expression index #39784 @Defined2014
- Fix the issue that data cannot be inserted into a renamed table when the generated expression includes the name of this table #39826 @Defined2014
- Fix the issue that the `INSERT ignore` statement cannot fill in default values when the column is write-only #40192 @YangKeao
- Fix the issue that resources are not released when disabling the resource management module #40546 @zimulala
- Fix the issue that TTL tasks cannot trigger statistics updates in time #40109 @YangKeao
- Fix the issue that unexpected data is read because TiDB improperly handles `NULL` values when constructing key ranges #40158 @tiancaiamao
- Fix the issue that illegal values are written to a table when the `MODIFY COLUMN` statement also changes the default value of a column #40164 @wjhuang2016
- Fix the issue that the adding index operation is inefficient due to invalid Region cache when there are many Regions in a table #38436 @tangenta
- Fix the data race that occurs when allocating auto-increment IDs #40584 @Dousir9
- Fix the issue that the implementation of the not operator in JSON is incompatible with the implementation in MySQL #40683 @YangKeao
- Fix the issue that concurrent view might cause DDL operations to be blocked #40352 @zeminzhou
- Fix data inconsistency caused by concurrently executing DDL statements to modify columns of partitioned tables #40620 @mjonss
- Fix the issue that "Malformed packet" is reported when using `caching_sha2_password` for authentication without specifying a password #40831 @dveeden
- Fix the issue that a TTL task fails if the primary key of the table contains an `ENUM` column #40456 @lcwangchao
- Fix the issue that some DDL operations blocked by MDL cannot be queried in `mysql.tidb_mdl_view` #40838 @YangKeao
- Fix the issue that data race might occur during DDL ingestion #40970 @tangenta
- Fix the issue that TTL tasks might delete some data incorrectly after the time zone changes #41043 @lcwangchao
- Fix the issue that `JSON_OBJECT` might report an error in some cases #39806 @YangKeao
- Fix the issue that TiDB might deadlock during initialization #40408 @Defined2014
- Fix the issue that the value of system variables might be incorrectly modified in some cases due to memory reuse #40979 @lcwangchao
- Fix the issue that data might be inconsistent with the index when a unique index is created in the ingest mode #40464 @tangenta
- Fix the issue that some truncate operations cannot be blocked by MDL when truncating the same table concurrently #40484 @wjhuang2016
- Fix the issue that the `SHOW PRIVILEGES` statement returns an incomplete privilege list #40591 @CbcWestwolf
- Fix the issue that TiDB panics when adding a unique index #40592 @tangenta
- Fix the issue that executing the `ADMIN RECOVER` statement might cause the index data to be corrupted #40430 @xiongjiwei
- Fix the issue that a query might fail when the queried table contains a `CAST` expression in the expression index #40130 @xiongjiwei
- Fix the issue that a unique index might still produce duplicate data in some cases #40217 @tangenta
- Fix the PD OOM issue when there is a large number of Regions but the table ID cannot be pushed down when querying some virtual tables using `Prepare` or `Execute` #39605 @djshow832
- Fix the issue that data race might occur when an index is added #40879 @tangenta
- Fix the `can't find proper physical plan` issue caused by virtual columns #41014 @AilinKid
- Fix the issue that TiDB cannot restart after global bindings are created for partition tables in dynamic trimming mode #40368 @Yisaer
- Fix the issue that `auto analyze` causes graceful shutdown to take a long time #40038 @xuyifangreeneyes
- Fix the panic of the TiDB server when the IndexMerge operator triggers memory limiting behaviors #41036 @guo-shaoge
- Fix the issue that the `SELECT * FROM table_name LIMIT 1` query on partitioned tables is slow #40741 @solotzg
TiKV
- Fix an error that occurs when casting the `const Enum` type to other types #14156 @wshwsh12
- Fix the issue that Resolved TS causes higher network traffic #14092 @overvenus
- Fix the data inconsistency issue caused by network failure between TiDB and TiKV during the execution of a DML after a failed pessimistic DML #14038 @MyonKeminta
PD
- Fix the issue that the Region Scatter task generates redundant replicas unexpectedly #5909 @HundunDM
- Fix the issue that the Online Unsafe Recovery feature would get stuck and time out in `auto-detect` mode #5753 @Connor1996
- Fix the issue that the execution of `replace-down-peer` slows down under certain conditions #5788 @HundunDM
- Fix the PD OOM issue that occurs when the calls of `ReportMinResolvedTS` are too frequent #5965 @HundunDM
TiFlash
- Fix the issue that querying TiFlash-related system tables might get stuck #6745 @lidezhu
- Fix the issue that semi-joins use excessive memory when calculating Cartesian products #6730 @gengliqi
- Fix the issue that the result of the division operation on the DECIMAL data type is not rounded #6393 @LittleFall
- Fix the issue that `start_ts` cannot uniquely identify an MPP query in TiFlash queries, which might cause an MPP query to be incorrectly canceled #43426 @hehechen
Tools
Backup & Restore (BR)
- Fix the issue that when restoring log backup, hot Regions cause the restore to fail #37207 @Leavrth
- Fix the issue that restoring data to a cluster on which the log backup is running causes the log backup file to be unrecoverable #40797 @Leavrth
- Fix the issue that the PITR feature does not support CA-bundles #38775 @YuJuncen
- Fix the panic issue caused by duplicate temporary tables during recovery #40797 @joccau
- Fix the issue that PITR does not support configuration changes for PD clusters #14165 @YuJuncen
- Fix the issue that the connection failure between PD and tidb-server causes PITR backup progress not to advance #41082 @YuJuncen
- Fix the issue that TiKV cannot listen to PITR tasks due to the connection failure between PD and TiKV #14159 @YuJuncen
- Fix the issue that the frequency of `resolve lock` is too high when there is no PITR backup task in the TiDB cluster #40759 @joccau
- Fix the issue that when a PITR backup task is deleted, the residual backup data causes data inconsistency in new tasks #40403 @joccau
TiCDC
- Fix the issue that `transaction_atomicity` and `protocol` cannot be updated via the configuration file #7935 @CharlesCheung96
- Fix the issue that precheck is not performed on the storage path of redo log #6335 @CharlesCheung96
- Fix the issue of insufficient duration that redo log can tolerate for S3 storage failure #8089 @CharlesCheung96
- Fix the issue that changefeed might get stuck in special scenarios such as when scaling in or scaling out TiKV or TiCDC nodes #8174 @hicqu
- Fix the issue of too high traffic among TiKV nodes #14092 @overvenus
- Fix the performance issues of TiCDC in terms of CPU usage, memory control, and throughput when the pull-based sink is enabled #8142 #8157 #8001 #5928 @hicqu @hi-rustin
TiDB Data Migration (DM)
- Fix the issue that the `binlog-schema delete` command fails to execute #7373 @liumengya94
- Fix the issue that the checkpoint does not advance when the last binlog is a skipped DDL #8175 @D3Hunter
- Fix a bug that when the expression filters of both "update" and "non-update" types are specified in one table, all `UPDATE` statements are skipped #7831 @lance6716
- Fix a bug that when only one of `update-old-value-expr` or `update-new-value-expr` is set for a table, the filter rule does not take effect or DM panics #7774 @lance6716
TiDB Lightning
- Fix the issue that TiDB Lightning timeout hangs due to TiDB restart in some scenarios #33714 @lichunzhu
- Fix the issue that TiDB Lightning might incorrectly skip conflict resolution when all but the last TiDB Lightning instance encounters a local duplicate record during a parallel import #40923 @lichunzhu
- Fix the issue that precheck cannot accurately detect the presence of a running TiCDC in the target cluster #41040 @lance6716
- Fix the issue that TiDB Lightning panics in the split-region phase #40934 @lance6716
- Fix the issue that the conflict resolution logic (`duplicate-resolution`) might lead to inconsistent checksums #40657 @sleepymole
- Fix a possible OOM problem when there is an unclosed delimiter in the data file #40400 @buchuitoudegou
- Fix the issue that the file offset in the error report exceeds the file size #40034 @buchuitoudegou
- Fix an issue with the new version of PDClient that might cause parallel import to fail #40493 @AmoebaProtozoa
- Fix the issue that TiDB Lightning prechecks cannot find dirty data left by previously failed imports #39477 @dsdashun
Contributors
We would like to thank the following contributors from the TiDB community: