title | summary |
---|---|
TiFlash Query Result Materialization |
Learn how to save the query results of TiFlash in a transaction. |
Warning:
This is an experimental feature, which might be changed or removed without prior notice. The syntax and implementation might be modified before GA. If you encounter any problems, you can report an issue on GitHub.
This document introduces how to save the TiFlash query result to a specified TiDB table in an INSERT INTO SELECT
transaction.
Starting from v6.5.0, TiDB supports saving TiFlash query results in a table, that is, TiFlash query result materialization. During the execution of the INSERT INTO SELECT
statement, if TiDB pushes down the SELECT
subquery to TiFlash, the TiFlash query result can be saved to a TiDB table specified in the INSERT INTO
clause. For TiDB versions earlier than v6.5.0, the TiFlash query results are read-only, so if you want to save TiFlash query results, you have to obtain them from the application level, and then save them in a separate transaction or process.
Note:
- By default (
tidb_allow_mpp = ON
), the TiDB optimizer intelligently chooses to push down queries to TiKV or TiFlash based on the query cost. To enforce that the queries are pushed down to TiFlash, you can set the system variabletidb_enforce_mpp
toON
.- During the experimental phase, this feature is disabled by default. To enable this feature, you can set the system variable
tidb_enable_tiflash_read_for_write_stmt
toON
.
The syntax of INSERT INTO SELECT
is as follows.
INSERT [LOW_PRIORITY | HIGH_PRIORITY] [IGNORE]
[INTO] tbl_name
[PARTITION (partition_name [, partition_name] ...)]
[(col_name [, col_name] ...)]
SELECT ...
[ON DUPLICATE KEY UPDATE assignment_list]value:
{expr | DEFAULT}
assignment:
col_name = valueassignment_list:
assignment [, assignment] ...
For example, you can save the query result from table t1
in the SELECT
clause to table t2
using the following INSERT INTO SELECT
statement:
INSERT INTO t2 (name, country)
SELECT app_name, country FROM t1;
-
Efficient BI solutions
For many BI applications, the analysis query requests are very heavy. For example, when a lot of users access and refresh a report at the same time, a BI application needs to handle a lot of concurrent query requests. To deal with this situation effectively, you can use
INSERT INTO SELECT
to save the query results of the report in a TiDB table. Then, the end users can query data directly from the result table when the report is refreshed, which avoids multiple repeated computations and analyses. Similarly, by saving historical analysis results, you can further reduce the computation volume for long-time historical data analysis. For example, if you have a reportA
that is used to analyze daily sales profit, you can save the results of reportA
to a result tableT
usingINSERT INTO SELECT
. Then, when you need to generate a reportB
to analyze the sales profit of the past month, you can directly use the daily analysis results in tableT
. This way not only greatly reduces the computation volume but also improves the query response speed and reduces the system load. -
Serving online applications with TiFlash
The number of concurrent requests supported by TiFlash depends on the volume of data and complexity of the queries, but it typically does not exceed 100 QPS. You can use
INSERT INTO SELECT
to save TiFlash query results, and then use the query result table to support highly concurrent online requests. The data in the result table can be updated in the background at a low frequency (for example, at an interval of 0.5 second), which is well below the TiFlash concurrency limit, while still maintaining a high level of data freshness.
- During the execution of the
INSERT INTO SELECT
statement, TiFlash first returns the query results of theSELECT
clause to a TiDB server in the cluster, and then writes the results to the target table, which can have a TiFlash replica. - The execution of the
INSERT INTO SELECT
statement guarantees ACID properties.
-
The TiDB memory limit on the
INSERT INTO SELECT
statement can be adjusted using the system variabletidb_mem_quota_query
. Starting from v6.5.0, it is not recommended to usetxn-total-size-limit
to control transaction memory size.For more information, see TiDB memory control.
-
The TiDB memory limit on the
INSERT INTO SELECT
statement can be adjusted using the system variabletidb_mem_quota_query
. Starting from v6.5.0, it is not recommended to usetxn-total-size-limit
to control transaction memory size.For more information, see TiDB memory control.
-
TiDB has no hard limit on the concurrency of the
INSERT INTO SELECT
statement, but it is recommended to consider the following practices:- When a "write transaction" is large, such as close to 1 GiB, it is recommended to control concurrency to no more than 10.
- When a "write transaction" is small, such as less than 100 MiB, it is recommended to control concurrency to no more than 30.
- Determine the concurrency based on testing results and specific circumstances.