MySQL knowledge learning-ninth day: search with regular expressions

Preface

On the ninth day, continue to rush~

What I learned today is to use regular expressions to search. The content is a bit too much, so try more

Introduction to regular expressions

There are so many regular expressions, so I will only talk about some basic concepts here, just know how to use them.

> Regular expression is a special string (character set) used to match text.

Use MySQL regular expressions

The role of regular expressions is to match text and compare a pattern (regular expression) with a string of text.

Basic character matching

For the following table (num), the data type of the first column is character, and the data type of the second column is numeric

Insert picture description here

Look at the following example.

Insert picture description here


This is very similar to the LIKE statement we learned yesterday, but here we borrow keywords REGEXP(regular expression) to achieve regular matching.

The above example says such a thing.
It tells MySQL: What follows REGEXP is treated as a regular expression (a regular expression that matches 1000 in the text body).

Note that all regular expressions should be quoted , (single quotation marks and double quotation marks are acceptable) because the essence of regular expressions is a string, and it can be understood as a string. Although the result of the following operation is correct, when the regular expression becomes complicated, this approach will cause errors. Therefore, we must not forget the quotation marks when writing the regular expression in the future.

Insert picture description here


Look at the following example

Insert picture description here

And

Insert picture description here


the top two results are not the same, and why? Let's talk about expression first .000, which .is a special character in regular expression language. It means to match any character . That is to say, it cannot be 0 characters (somewhat similar to the wildcard _ we learned before). If all 000 does not match, it will not be filtered out.

Let’s talk about the difference between LIKEand REGEXP.

Insert picture description here


Insert picture description here


We see that the results are different. Why? LIKE matches the entire column, and REGEXP matches within the column value. In layman's terms, LIKE is a complete match, and REGEXP is a partial match . Just explain the above example. Although other records contain 000, 1000 is different from 000. Yes, it doesn't work for LIKE. It must be exactly the same to be filtered by LIKE, but REGEXP will not. It only needs to verify that your column value contains 000. No matter whether you are 1000 or 000, it will be filtered out.

One more thing to note

Insert picture description here

Finally, when retrieving data in regular expressions, there is no requirement on the data type of the column, that is to say, whether it is a character or a numeric type can be retrieved (the above example tests the numeric type, the same You can try the character type, that is, replace the number column behind where with new_number), but when you use the keyword BINARY, the retrieved data column can't be a numeric type. Pay special attention to this.

OR matching

The previous introduction is the case of one string, so is it possible to have two strings or multiple strings? Of course it is possible.

We generally use |to OR match for one of the two search strings

Insert picture description here


The above example means to match records containing "1000" or "2000", as long as it contains one of them.

Similar can be matched with

Insert picture description here


more than one is also possible

Matches one of several characters

What if we only want to match certain characters?

We can use the characters enclosed in [and] to achieve

Insert picture description here


this. The above means to match any character in [123], which is equivalent to matching 1000 or 2000 or 3000. We can find that it []is another kind of OR statement, which [123]is [1|2|3]a shorthand. . It should be noted that [] cannot be omitted, [1|2|3]000and 1|2|3000is not the same. The latter means to retrieve records containing 1 or 2 or 3000.

The character set can also be negated. We need to use it ^. We can put it at the beginning of the set to realize the negation of the set. See

Insert picture description here


the example below the table below to negate the character set. Note that ^ should be placed in [] and put At the very beginning, the

Insert picture description here


above example is to select records that do not contain 1, 2, and 3. Then
look at the following example.

Insert picture description here


Why is this record like this? In fact, it [^123]means to match records with values ​​other than 1, 2, 3, 12, 13, 23, and 123. For example, 3000 is OK but 3 is not.

Match range

The set can be used to define one or more characters to be matched. For example, [123] can be written as [1-3], [0-9] means [0123456789], but [6-3] is illegal, the following value If it is not less than the previous value, it can be equal to.

Insert picture description here


The above example is to match any character in [123456].

Match special characters

Some special characters were mentioned earlier, and they all have special meanings, so how to match them? At this time, you need to escape (similar to other languages), and use \ to realize the escape. For example, for the following table

Insert picture description here


Insert picture description here


Insert picture description here


Insert picture description here

Match character class

To make it easier to work, you can use a predefined character set called a character class.
Insert picture description here


Insert picture description here

Match multiple instances

What if I want to find records where at least two 1s appear?

Insert picture description here


Insert picture description here


The following {2,} means that the preceding character appears no less than twice, and there are other options

Insert picture description here

Locator

All the previous matches are at any position of the match, what if I want to match the specified position?

Insert picture description here


We use the above metacharacters to achieve

Insert picture description here


this means to retrieve the record whose last digit is 1.

Insert picture description here


Insert picture description here

There is a lot of content in this section, so the writing is relatively brief. If you don’t understand, you can communicate together, good night~