Issue 29: Index Design (Monitoring Full Text Index)

Then talk about MySQL full-text index, this article mainly discusses the monitoring of MySQL full-text index.

MySQL has a very complete metadata table to monitor the insertion, update, and deletion of the full-text index table; even the data tracking of the full-text index table and auxiliary tables.

This is divided into three parts:

The first part introduces monitoring related parameters;

The second part introduces the monitoring-related metadata table;

In the third part, examples demonstrate how to monitor.

innodb_ft_aux_table : Dynamically set the name of the full-text index table to be monitored. This parameter must be explicitly set in order to monitor the full-text index normally. The value is generally: database name/table name, such as ytt/ft_sample.

innodb_ft_cache_size : Used to cache INSERT/UPDATE data for a single full-text index table, and start flushing when the cache is full. The default is 8M, and the range is 1.6M-80M.

innodb_ft_total_cache_size : Different from the parameter innodb_ft_cache_size , the scope extends to a single instance instead of a single table. The default is 640M, and the range is 32M-1600M.

innodb_ft_result_cache_limit : Set the full-text index search result set cache, and an error will be reported if this value is exceeded. The default is 2G, the minimum is 1M, and the maximum is 2 to the 32nd power-1 byte.

innodb_ft_enable_diag_print : Print additional full-text index diagnostic information, output to the error log, it is disabled by default.

innodb_optimize_fulltext_only : Disabled by default, full-text index is not sorted, set to ON, only full-text index is sorted.

innodb_ft_num_word_optimize : The minimum number of word segmentation operations that need to be performed when optimizing the full-text index table.

innodb_ft_sort_pll_degree : The number of threads when creating full-text indexing and word segmentation. The default is 2, and the range is 1-32.

The second part, the full-text index data monitoring metadata table:

MySQL currently provides the following dictionary tables to monitor full-text index information

INNODB_FT_CONFIG : Store full-text index metadata and related internal processing data. There are only two fields, KEY and VALUE. The first two lines of record explain:

optimize_checkpoint_limit specifies the maximum time for the optimize table statement to execute.

synced_doc_id: The next DOC_ID to be processed.

mysql> select * from information_schema.innodb_ft_config;
+---------------------------+-------+
| KEY                       | VALUE |
+---------------------------+-------+
| optimize_checkpoint_limit | 180   |
| synced_doc_id             | 10    |
| stopword_table_name       |       |
| use_stopword              | 1     |
+---------------------------+-------+
4 rows in set (0.00 sec)

INNODB_FT_INDEX_TABLE : Store inverted index data table.

INNODB_FT_INDEX_CACHE : Store the newly inserted keyword information. In order to avoid frequent DML operations to rebuild the full-text index, MySQL stores the newly inserted keywords in this table. When the newly inserted keywords fill up the cache (single-table cache: innodb_ft_cache_size /instance shared cache: innodb_ft_total_cache_size ), then connect to the disk On the auxiliary table to merge. Or after the MySQL instance is shut down and the optimize table statement is executed manually, it is merged with the auxiliary table on the disk.

INNODB_FT_DELETED/INNODB_FT_BEING_DELETED

These two tables are used to save the DOC_ID that has been deleted from the full-text index. Only after the optimize table is performed, DOC_ID will be completely deleted from the auxiliary table. The table INNODB_FT_BEING_DELETED is a snapshot of the table INNODB_FT_DELETED. When optimize table is performed on the full-text index table, the data in the table INNODB_FT_DELETED is snapshotted. When this statement is executed, the snapshot table is emptied. The data retention time of the snapshot table is very short. If you want to observe the data in this table, you have to prepare a very large full-text index table.

The third part, full-text index monitoring demonstration:

1. ##### Create a sample table ft1.

 mysql> create table ft1(id int not null auto_increment primary key, s1 text,fulltext ft_s1 (s1));
 Query OK, 0 rows affected (0.39 sec)
 
 mysql> insert into ft1 values (1,'mysql oracle'),(2,'xfs ext3'),(3,'dml ddl dql others'),(4,'oltp olap'),(5,'sqlserver oracle'),(6,'ntfs gfs'),(7,'insert delete update select alter'),(8,'windows linux unix solaris');
 Query OK, 8 rows affected (0.01 sec)
 Records: 8  Duplicates: 0  Warnings: 0

2. ##### Put the target list into the monitoring

Set the parameter innodb_ft_aux_table

 mysql> set global innodb_ft_aux_table ='ytt/ft1';
 Query OK, 0 rows affected (0.01 sec)

3. ##### Query the 8 records just inserted. At this time, if you do not optimize the table, close the MySQL instance, and the buffer pool is not full, the data in this table will always exist.

You can see that this table contains information such as the document ID of the keyword in the document, and the document location.

 mysql> select * from information_schema.innodb_ft_index_cache;
 +-----------+--------------+-------------+-----------+--------+----------+
 | WORD      | FIRST_DOC_ID | LAST_DOC_ID | DOC_COUNT | DOC_ID | POSITION |
 +-----------+--------------+-------------+-----------+--------+----------+
 | alter     |            8 |           8 |         1 |      8 |       28 |
 | ddl       |            4 |           4 |         1 |      4 |        4 |
 | delete    |            8 |           8 |         1 |      8 |        7 |
 | dml       |            4 |           4 |         1 |      4 |        0 |
 | dql       |            4 |           4 |         1 |      4 |        8 |
 | ext3      |            3 |           3 |         1 |      3 |        4 |
 | gfs       |            7 |           7 |         1 |      7 |        5 |
 | insert    |            8 |           8 |         1 |      8 |        0 |
 | linux     |            9 |           9 |         1 |      9 |        8 |
 | mysql     |            2 |           2 |         1 |      2 |        0 |
 | ntfs      |            7 |           7 |         1 |      7 |        0 |
 | olap      |            5 |           5 |         1 |      5 |        5 |
 | oltp      |            5 |           5 |         1 |      5 |        0 |
 | oracle    |            2 |           6 |         2 |      2 |        6 |
 | oracle    |            2 |           6 |         2 |      6 |       10 |
 | others    |            4 |           4 |         1 |      4 |       12 |
 | select    |            8 |           8 |         1 |      8 |       21 |
 | solaris   |            9 |           9 |         1 |      9 |       19 |
 | sqlserver |            6 |           6 |         1 |      6 |        0 |
 | unix      |            9 |           9 |         1 |      9 |       14 |
 | update    |            8 |           8 |         1 |      8 |       14 |
 | windows   |            9 |           9 |         1 |      9 |        0 |
 | xfs       |            3 |           3 |         1 |      3 |        0 |
 +-----------+--------------+-------------+-----------+--------+----------+
 23 rows in set (0.00 sec)

4. ##### Now perform any of the above conditions to trigger the flash disk, the newly inserted data is merged into the index file, and the cache table data is cleared at this time.

   mysql> SET GLOBAL innodb_optimize_fulltext_only=ON;
   Query OK, 0 rows affected (0.00 sec)
   
   mysql> optimize table ft1;
   +---------+----------+----------+----------+
   | Table   | Op       | Msg_type | Msg_text |
   +---------+----------+----------+----------+
   | ytt.ft1 | optimize | status   | OK       |
   +---------+----------+----------+----------+
   1 row in set (0.02 sec)
   
   mysql> select * from information_schema.innodb_ft_index_cache;
   Empty set (0.00 sec)
   
   mysql> SET GLOBAL innodb_optimize_fulltext_only=Off;
   Query OK, 0 rows affected (0.00 sec)
   

5. ##### Now check the full-text index keyword table data and find that the data has been merged into the index file.

   mysql> SELECT * FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_TABLE  limit 4;
   +--------+--------------+-------------+-----------+--------+----------+
   | WORD   | FIRST_DOC_ID | LAST_DOC_ID | DOC_COUNT | DOC_ID | POSITION |
   +--------+--------------+-------------+-----------+--------+----------+
   | alter  |            8 |           8 |         1 |      8 |       28 |
   | ddl    |            4 |           4 |         1 |      4 |        4 |
   | delete |            8 |           8 |         1 |      8 |        7 |
   | dml    |            4 |           4 |         1 |      4 |        0 |
   +--------+--------------+-------------+-----------+--------+----------+
   4 rows in set (0.00 sec)

6. ##### Try to delete a few records

   mysql> delete from ft1 where id in (2,3,4);
   Query OK, 3 rows affected (0.02 sec)

Check the deleted table again, and record the ID of the deleted document.

mysql> SELECT * FROM INFORMATION_SCHEMA.INNODB_FT_DELETED;
+--------+
| DOC_ID |
+--------+
|      3 |
|      4 |
|      5 |
+--------+
3 rows in set (0.00 sec)

7. ##### Trigger delete record flashing again, execute optimize table, table INNODB_FT_DELETED is cleared.

   mysql> SET GLOBAL innodb_optimize_fulltext_only=ON;
   Query OK, 0 rows affected (0.00 sec)
   
   mysql> optimize table ft1;
   +---------+----------+----------+----------+
   | Table   | Op       | Msg_type | Msg_text |
   +---------+----------+----------+----------+
   | ytt.ft1 | optimize | status   | OK       |
   +---------+----------+----------+----------+
   1 row in set (0.02 sec)
   
   mysql> SELECT * FROM INFORMATION_SCHEMA.INNODB_FT_DELETED;
   Empty set (0.00 sec)
   mysql> SET GLOBAL innodb_optimize_fulltext_only=Off;
   Query OK, 0 rows affected (0.00 sec)

8. ##### Now turn on the parameter innodb_ft_enable_diag_print to print more diagnostic data:

Necessary prerequisites: set the parameter innodb_ft_cache_size= to the smallest, turn on the parameter innodb_ft_enable_diag_print, and set the log level to the largest
   mysql> select @@innodb_ft_cache_size;
   +------------------------+
   | @@innodb_ft_cache_size |
   +------------------------+
   |                1600000 |
   +------------------------+
   1 row in set (0.00 sec) 
   
   mysql> set global innodb_ft_enable_diag_print=on;
   Query OK, 0 rows affected (0.00 sec)
   
   mysql> set global log_error_verbosity=3;
   Query OK, 0 rows affected (0.01 sec)

1) When innodb_ft_cache_size is filled, the cache is cleared and the data is flushed:

   mysql> insert into ft1(s1) select replace(uuid(),'-',' ') from ft1 ;
   Query OK, 256 rows affected (0.07 sec)
   Records: 256  Duplicates: 0  Warnings: 0

Observation log data, table ytt.ft1, synchronizes data to the auxiliary table twice, with 371 keywords in the first time and 516 keywords in the second time.

[Note] [MY-012299] [InnoDB] FTS SYNC for table `ytt`.`ft1`, deleted count: 0 size: 160010 bytes
[Note] [MY-012300] [InnoDB] SYNC words: 371
[Note] [MY-012300] [InnoDB] SYNC words: 516

2) Rebuild the full-text index

  mysql> SET GLOBAL innodb_optimize_fulltext_only=On;
  Query OK, 0 rows affected (0.00 sec)
  
  mysql> optimize table ft1;
  +---------+----------+----------+----------+
  | Table   | Op       | Msg_type | Msg_text |
  +---------+----------+----------+----------+
  | ytt.ft1 | optimize | status   | OK       |
  +---------+----------+----------+----------+
  1 row in set (0.01 sec)
  
  mysql> SET GLOBAL innodb_optimize_fulltext_only=Off;
  Query OK, 0 rows affected (0.00 sec)

Observe the log data, the data shows the start time and completion time of the optimize table statement, and the number of synchronized keywords.

[Note] [MY-012299] [InnoDB] FTS SYNC for table `ytt`.`ft1`, deleted count: 0 size: 0 bytes
[Note] [MY-012300] [InnoDB] SYNC words: 0
[Note] [MY-012323] [InnoDB] FTS start optimize `ytt`.`ft1`
[Note] [MY-012325] [InnoDB] FTS end optimize `ytt`.`ft1

3) Clear table ft1

mysql> truncate ft1;
Query OK, 0 rows affected (0.59 sec)

Observing the log data, the table corresponding to the table ID is removed.

[Note] [MY-012329] [InnoDB] FTS Optimize Removing table 1363

4) Make a query

mysql> select count(*) from ft1 where match(s1) against('806d');
+----------+
| count(*) |
+----------+
|      512 |
+----------+
1 row in set (0.01 sec)

Observe log data and output information such as query time, number of rows, and memory size to be allocated.

[Note] [MY-013045] [InnoDB] keynr=1, '806d
[Note] [MY-012388] [InnoDB] NL search
[Note] [MY-012340] [InnoDB] '806d' -> 512/512 4.3427e-05
[Note] [MY-012341] [InnoDB] FTS Search Processing time: 0 secs: 6 millisec: row(s) 512
[Note] [MY-012342] [InnoDB] Full Search Memory: 66196 (bytes),  Row: 512.

5) Rebuild the clustered index.

mysql> alter table ft1 engine innodb;
Query OK, 512 rows affected (0.78 sec)
Records: 512  Duplicates: 0  Warnings: 0

Observe log data, output the number of concurrency required to rebuild the clustered index, the number of data page splits, the depth of the B+ tree, and the number of records.

[Note] [MY-011825] [InnoDB] Parallel scan: 4
[Note] [MY-011825] [InnoDB] ranges: 3 max_threads: 4 split: 4 depth: 1
[Note] [MY-011825] [InnoDB] n: 512

I believe that after understanding the method of observing full-text indexing, you will have a more thorough understanding of how MySQL handles full-text indexing.


What else do you want to know about the technical content of MySQL? Hurry up and leave a message to tell the editor!