Summary of the first interview-index, sort

table of Contents

1. Index

*1. Indexes are the mechanism and structure for accelerating the retrieval of data in the table

*2. Suitable for adding indexes

*3. Cases that are not suitable for adding indexes

4. The case of index failure

5. Choice of B tree and B+ tree

6. Why is it recommended that InnoDB tables must have a primary key?

7. Why is it recommended to use an auto-incrementing primary key

8. Why InnoDB non-primary key indexes store primary key values

*2. Sorting algorithm

1. Bubble sort

2. Quick sort

3. Simple selection sort-select the smallest each time (unstable, 552)

4. Direct insertion sort-insert new elements into an already ordered queue

5. Merge sort


1. Index

*1. Indexes are the mechanism and structure for accelerating the retrieval of data in the table

In addition to data, the database also maintains data structures that meet specific search algorithms. These data structures refer to the data in a certain way, so that efficient search can be implemented on these data structures. These data structures are indexes.

*2. Suitable for adding indexes

  1. The primary key automatically creates a unique index.
  2. Fields frequently queried as where conditional statements
  3. Related fields need to be indexed, such as foreign key fields, classid in the student table, schoolid in the classes table, etc.
  4. Sort fields can be indexed
  5. Grouping fields can be indexed, because the premise of grouping is sorting
  6. Statistics fields can be indexed, such as count(), max()

*3. Cases that are not suitable for adding indexes

1) Too little table data and poorly unique fields

2) Frequently added, deleted and modified fields

3) Fields not used after where

4. The case of index failure

1. If there is or in the condition, it will not be used even if there is a condition with index (this is the reason why or is used as little as possible)
Note: If you want to use or and want the index to take effect, you can only change each of the conditions in or Indexes are added to all columns
  2. For multi-column indexes, the index is not used if the first part is not used
  3. Like queries start with%
  4. If the column type is a string, then the data must be used in the condition Quote in quotation marks, otherwise do not use index

5. When there is an implicit conversion in the query condition, the index will become invalid. When querying without single quotes

6. IS NULL or IS NOT NULL is used in the where statement, which will cause the index to fail

7. A function is used in the where column, and the column is calculated

8. If mysql estimates that using a full table scan is faster than using an index, then not using the index? ? ? ?

5. Choice of B tree and B+ tree

Features of B+Tree:

1) Non-leaf nodes do not store data, only indexes (redundancy), which can ensure that more indexes are stored

2) Leaf nodes store all index fields

3) Leaf nodes are connected with pointers to improve interval access performance

The query efficiency of the B+ tree is more stable, because the data is placed on the leaf node. The
B+ tree can improve the efficiency of the range query, because the leaf node points to the next leaf node

1. The disk read and write cost of the B+ tree is lower: the internal node of the B+ tree does not have a pointer to the specific information of the keyword, so its internal node is smaller than the B tree. If the keywords of all the same internal nodes are stored in the same In a disk block, the more keywords that the disk block can hold, the more keywords that need to be searched when read into the memory at a time, and the relative IO read and write times are reduced.

2. The query efficiency of the B+ tree is more stable: because the non-terminal point is not the node that ultimately points to the file content, but only the index of the keyword in the leaf node. Therefore, any keyword search must take a path from the root node to the leaf node. The path length of all keyword queries is the same, resulting in the same query efficiency for each data.

First of all, we know that InnoDB uses a B+ tree as a storage structure, so a column must be used as a key. What is a key?

If we define the primary key (PRIMARY KEY), then InnoDB will choose the primary key as the clustered index. If the primary key is not explicitly defined, InnoDB will choose the first unique index that does not contain NULL values ​​as the primary key index. If there is no such Unique index, InnoDB will choose the built-in 6-byte long ROWID as the implicit clustered index (ROWID increases with the writing of the row record and the primary key, this ROWID is not as referenceable as ORACLE ROWID, it is implicit).

Reduce the overhead of the database?

The data record itself is stored in the leaf node of the main index (a B+Tree). This requires that the data records in the same leaf node (the size of a memory page or disk page) are stored in the order of the primary key, so whenever a new record is inserted, MySQL will insert it into the appropriate node according to its primary key And position, if the page reaches the load factor (InnoDB defaults to 15/16), then a new page (node) is opened.

If the table uses an auto-incrementing primary key, then every time a new record is inserted, the record will be added to the current index node in order Subsequent position, when a page is full, a new page will be opened automatically

If you use a non-incremental primary key (such as ID card number or student number, etc.), since the value of the primary key inserted each time is approximately random, each new record must be inserted into a certain position in the middle of the existing index page. MySQL has to move the data in order to insert the new record into the appropriate position, and even the target page may have been written back to the disk and cleared from the cache. At this time, it has to be read back from the disk, which adds a lot of overhead, and at the same time Frequent movement and paging operations caused a lot of fragmentation, resulting in an insufficiently compact index structure. Later, OPTIMIZE TABLE had to be used to rebuild the table and optimize the page filling.

8. Why InnoDB non-primary key indexes store primary key values

Reduce the maintenance of the secondary index when the row moves or the data page is split (when the data needs to be updated, the secondary index does not need to be modified, only the clustered index needs to be modified, a table can only have one clustered index, and the others All are secondary indexes, so only the clustered index needs to be modified, and the secondary index does not need to be rebuilt)

Summary of MySQL Index Interview Questions

*2. Sorting algorithm

The implementation of the top ten sorting algorithms and their respective advantages and disadvantages, applicability_derbi123123的博客-CSDN blog_ advantages and disadvantages of sorting algorithms

1. Bubble sort

    public void BubbleSort(int[] arr){        if(arr==null)   return;        for(int i=0;i<arr.length;i++){            for(int j=i+1;j<arr.length;j++){                if(arr[i]>arr[j]) {                    swap(arr, i, j);                }            }        }    }
    public void BubbleSort(int[] arr){        if(arr==null)   return;        for(int i=0;i<arr.length;i++){            boolean flag=false;            for(int j=i+1;j<arr.length;j++){                if(arr[i]>arr[j]) {                    flag=true;                    swap(arr, i, j);                }            }            if(!flag) return;            System.out.println(i);        }    }

Applicable situation: the scale of the elements to be sorted is small, (PS is stable)

Already ordered situation-O(N)

2. Quick sort

    public void quickSort(int[] arr,int l,int r){        if(l<r){            int a=l+(int)(Math.random()*(r-l+1));            System.out.println(a);            swap(arr,a,l);            int mid=patition(arr,l,r);            quickSort(arr,l,mid-1);            quickSort(arr,mid+1,r);        }    }     public int patition(int[] arr,int l,int r){        int i=l,j=r;        int key=arr[l];        while(i<j){            while(arr[j]>=key && i<j)  j--;            if(i<j) arr[i++]=arr[j];            while(arr[i]<key && i<j)    i++;            if(i<j) arr[j--]=arr[i];        }        arr[i]=key;        return i;    }

Dutch flag issue

    public void hollandFlag(int[] arr,int l,int r,int value){        int less=l-1;        int more=r+1;        int cur=l;        while(cur<r){            if(arr[cur]<value)                swap(arr,cur++,++less);            else if(arr[cur]>value)                swap(arr,--more,cur);            else cur++;        }        System.out.println(Arrays.toString(arr));    }

Quick sort is an improved version of bubble sort

Advantages: faster (in the case of disorder), less data movement;

Disadvantages: unstable. In the case of order, the number of comparisons will increase (positive order, reverse order, equal)

Application: the data distribution is relatively even

3. Simple selection sort-select the smallest each time (unstable, 552)

void SimpleChooseSort(int A[],int N)//序列A及元素个数{	for(int i=0;i<N;i++){		int min=i;		for(int j=i+1;j<N;j++){//寻找i后面的最小元素下标 			if(A[min]>A[j])				min=j;		}		if(min==i) continue;//如果i后面的元素都大于等于i,那么 min还是i,就不用了交换了 		swap(A[i],A[min]); //交换 	}	} 

4. Direct insertion sort-insert new elements into an already ordered queue

 public void simpleInsertSort(int[] arr){        for (int i = 1; i < arr.length; i++) {            int tmp=arr[i];            int j;            for(j=i;j>0&&arr[j-1]>tmp;j--)                arr[j]=arr[j-1];            arr[j]=tmp;        }    }

5. Merge sort

    public void sort(int[] arr){        if(arr==null||arr.length<2) return;        sort1(arr,0,arr.length-1);     }    public void sort1(int[] arr,int l,int r){        if(l<r){            int m=l+((r-l)>>1);            sort1(arr,l,m);            sort1(arr,m+1,r);            merge(arr,l,m,r);        }    }    public void merge(int[] arr,int l,int m,int r){        int[] tmp=new int[r-l+1];        int i=0;        int p1=l,p2=m+1;         while(p1<=m&&p2<=r){            tmp[i++]=arr[p1]<=arr[p2]?arr[p1++]:arr[p2++];        }        while(p1<=m)            tmp[i++]=arr[p1++];        while(p2<=r)            tmp[i++]=arr[p2++];        i=0;        while(l<=r)            arr[l++]=tmp[i++];    }