Application of Redis data structure-GEO [Search for "nearby restaurants", call a taxi on taxi software]

Searching for " nearby restaurants " and requesting taxis on taxi-hailing software are all inseparable from applications based on location -based services (LBS). The data accessed by the LBS application is a set of longitude and latitude information associated with people or things, and it must be able to query the adjacent longitude and latitude ranges

  • Take car-hailing service as an example to analyze the access characteristics of longitude and latitude in the LBS application.
  1. Each online car-hailing has a number (for example, 33), and the online car-hailing needs to send its own longitude information (for example, 116.034579) and latitude information (for example, 39.000452) to the car-hailing application.
  2. When the user calls a car, the car-hailing application will find the user's nearby vehicles and match them according to the user's latitude and longitude location (for example, longitude 116.054579, latitude 39.030452).
  3. After matching users with similar locations with the vehicle, the car-hailing application will obtain the vehicle information according to the vehicle number and return it to the user.

It can be seen that a car (or a user) corresponds to a set of latitude and longitude, and as the position of the car (or user) moves, the corresponding latitude and longitude will also change.

Use hash for key: vehicle ID, value: vehicle latitude and longitude

This data recording mode belongs to a key (such as a car ID) corresponding to a value (a set of latitude and longitude). When there is a lot of vehicle information to be saved, a collection is needed to save a series of keys and values. The Hash collection type can quickly access a series of keys and values, which can be used to record the correspondence between a series of vehicle IDs and latitudes and longitudes. Therefore, we can store the IDs of different vehicles and their corresponding latitudes and longitudes in the Hash collection, as follows As shown in the figure:

Insert picture description here


Hash type HSET operation command will set the corresponding value value according to the key, so we can use it to quickly update the latitude and longitude information of the vehicle change.

problem:

For an LBS application, in addition to recording the latitude and longitude information, it is also necessary to perform a range query in the vehicle's Hash collection based on the user's latitude and longitude information . Once the range query is involved, it means that the elements in the collection need to be ordered , but the elements of the Hash type are unordered, which obviously cannot meet our requirements.

Sorted Set type to record

The underlying data structure of the GEO type is implemented by Sorted Set

The Sorted Set type also supports a record mode in which a key corresponds to a value , where the key is the element in the Sorted Set, and the value is the weight score of the element . More importantly, Sorted Set can sort the elements according to their weight scores, and supports range queries . This can meet the needs of finding neighboring locations in the LBS service.

When using Sorted Set to save the latitude and longitude information of the vehicle, the element of the Sorted Set is the vehicle ID, and the weight score of the element is the latitude and longitude information, as shown in the following figure:

Insert picture description here


The weight score of the Sorted Set elements is a floating point number (float type), And a set of latitude and longitude contains two values ​​of longitude and latitude, which cannot be directly stored as a floating point number.How to save it?

  • GeoHash encoding in GEO type

Redis uses the GeoHash encoding method widely used in the industry. The basic principle of this method is "interval encoding between two partitions" .

For a geographic location information, its longitude range is [-180,180]. GeoHash encoding will encode a longitude value into an N-bit binary value. Let's do N second partition operations on the longitude range [-180,180], where N can be customized.

In the first two divisions, the longitude range [-180,180] will be divided into two sub-intervals: [-180,0) and [0,180] (I call it the left and right divisions). At this point, we can check whether the longitude value to be encoded falls in the left partition or the right partition. If it falls in the left partition, we use 0 to indicate; if it falls in the right partition, we use 1 to indicate. In this way, we can get a 1-bit coded value every time we finish the second partition.

Then, we will do a second partition for the partition to which the longitude value belongs, and at the same time again check whether the longitude value falls in the left partition or the right partition after the second partition, and then perform 1-bit encoding according to the previous rule. After finishing the second partition N times, the longitude value can be represented by an N bit number.

For example, suppose that the longitude value we want to encode is 116.37, and we use a 5-bit encoding value (that is, N=5, do 5 partitions).

Let's do the first two-partition operation first, and divide the longitude interval [-180,180] into the left partition [-180,0) and the right partition [0,180]. At this time, the longitude value 116.37 belongs to the right partition [0,180], so, We use 1 to represent the code value after the first second partition.

Next, we do the second second partition: divide the [0,180] interval to which the longitude value 116.37 belongs, into [0,90) and [90, 180]. At this time, the longitude value 116.37 still belongs to the right partition [90,180], so the code value after the second partition is still 1. Wait until the second partition of [90,180] is performed for the third time, the longitude value 116.37 falls in the left partition [90, 135) after the partition, so the code value after the third partition is 0.

According to this method, after completing 5 partitions, we locate the longitude value 116.37 in the interval [112.5, 123.75], and obtain the 5-bit encoding value of the longitude value, which is 11010. The encoding process is shown in the following table:

Insert picture description here

The coding method for latitude is the same as for longitude, except that the range of latitude is [-90, 90]. The following table shows the coding process for latitude value 39.86.

Insert picture description here

When a group of latitude and longitude values ​​are coded, we then combine their respective coded values ​​together. The combination rule is: the even digits of the final coded value are sequentially coded for longitude, and the odd digits are sequentially coded for latitude Value, where even-numbered bits start from 0, and odd-numbered bits start from 1.

The respective code values ​​of the latitude and longitude (116.37, 39.86) we just calculated are 11010 and 10111. After the combination, the 0th digit is the 0th digit 1 of the longitude, the 1st digit is the latitude 0th digit 1, and the second digit is the longitude The first digit is 1, the third digit is the latitude's first digit 0, and so on, you can get the final code value 1110011101, as shown in the following figure:

Insert picture description here

Of course, after using GeoHash encoding, we are equivalent to dividing the entire geographic space into grids, and each grid corresponds to a partition in GeoHash.