CMU 15445 database design


The process of capacity expansion is to re-take the remainder of the capacity of the first-dimensional array for each hash value.
Assuming that the capacity is increased from 8 to 16, then the hash values ​​3 (0x0011) and 11 (0x1011) stored in the original slot 3 (011) are allocated to slots 3 and 11, respectively.
Features: If the slots are traversed in the order of high-order carry, assuming that the current traversal is to the 110 slot, then after the capacity is expanded from the capacity 8 to the capacity 16, the new slot corresponding to all the elements on the 110 slot is 0110 or 1110, yes Adjacent, and all slots before 0110 have been traversed at capacity 8

Insert picture description here

The remainder can also be understood as intercepting the binary hash value, such as intercepting the i bit as a slot, resulting in multiple hash values ​​corresponding to a slot. In addition, you can let multiple slots correspond to a bucket. The method is to continue to intercept the slots, assuming that the number of intercepted bits is i j i_j ij​, Then at most 2 (i − i j) 2^(i-i_j) 2(i−ij​)Each slot corresponds to a bucket j.

After insertion causes bucket j to exceed its capacity, if i= i j i_j ij​, It means that a slot corresponds to a bucket. At this time, the capacity of the first array must be doubled and i=i+1;

When i> i j i_j ij​ Explain that at this time bucket j corresponds to multiple slots. At this time, there is no need to update the entire one-bit array, just divide bucket j into two buckets j and z, and the new i j i_j ij​with i z i_z iz​Equal to old i j i_j ij​+1.

To determine whether an element exists in the hash table, one idea is to use the bloom filter, but it does not support deletion operations, because (a bit in the bitmap is shared by multiple elements, and it is impossible to determine whether the deleted element is in the bitmap )

C++ implementation

void ExtendibleHash<K, V>::Insert(const K &key, const V &value) {
        Node<K,V>* newNode=new Node<K,V>(key,value);
        size_t h=HashKey(key);
//        std::cout<<"insert"<<h<<" "<<std::endl;
        size_t cutH=CutBits(h,_numBits);
        std::shared_ptr<BucketList<K,V>> sharedBucket=_table[cutH];
        BucketList<K,V>* bucket = sharedBucket.get();
        Node<K,V>& result=bucket->_Find(key);
            delete &result;
template<typename K, typename V>
void ExtendibleHash<K, V>::DoubleSize() {
   std::shared_ptr<BucketList<K,V>>* newTable=new std::shared_ptr<BucketList<K,V>>[1<<(_numBits+1)];
   for(int i=0;i<(1<<_numBits);i++){
//    std::cout<<"double"<<" ";
   delete [] _table;

template<typename K, typename V>
void ExtendibleHash<K, V>::SplitBucket(BucketList<K, V> *bucket) {
//    std::cout<<"split"<<" ";
   int oldBits=bucket->_numBits;
   size_t  oldIndex=bucket->_tableIndex;
   BucketList<K,V>* newBucket=new BucketList<K,V>(oldBits+1,(oldIndex<<1)+1,BUCKET_LEN);
   for(Node<K,V>* cur=bucket->;cur->next!=NULL;){
       size_t h=HashKey(cur->_key);
       auto next=cur->next;
//            std::cout<<"Trans"<<cur->_key<<" "<<std::endl;
   int sub=_numBits-oldBits;
   std::shared_ptr<BucketList<K,V>> newShared_ptr=std::shared_ptr<BucketList<K,V>>(newBucket);
   for(size_t i=oldIndex<<sub;i<((oldIndex+1)<<sub);i++){
   return ;


DoubleSize is used to double the capacity of a one-bit array, and SplitBucket is used to split a bucket.
The implementation process needs to pay attention to the deletion of linked list elements when traversing the linked list, which will cause the next element to be found.

Storage and index

LRU (least recently used)
maintains a linked list. Assuming that the value stored by the node node of the linked list is v, a hash table is created with v as the key and node as the value. When deleting the first node of the linked list, delete the corresponding element of the hash table. If you want to delete any v value, first find the corresponding node through the hash table, and then delete it from the linked list.


Page *BufferPoolManager::FetchPage(page_id_t page_id) {
   Page* result;
   bool flag=page_table_->Find(page_id,result);
       return result;
           return result;
                   return result;
   return nullptr;

Manage memory to disk mapping in page units. Each page in the disk has a page_id. Memory Page is allocated to a certain number of disk pages. When the user wants to access the disk page corresponding to a certain id, he can access the corresponding memory page instead. The mapping between the two is stored in HashTable<page_id_t, Page *>. The freelist saves the unallocated memory Page, and Replacer<Page *> saves the memory page marked with 0 in the hashtable (somewhat similar to the reference count). If the freelist is not enough, select a memory page from the replacer according to the LRU, write it back to the disk (the s_dirty flag determines whether to write back) and allocate it to a new disk page.

The advantages of using column storage are convenient for cpu to cache, improve compression efficiency, and reduce the number of I/Os, because there is no need to retrieve unnecessary columns, and it is convenient for the cpu to perform vectorization processing.

In row storage, a record is stored in a tuple, and the values ​​of each column are spliced ​​together. For a variable-length value, such as a string, save its length in front of the value. First, replace them with the offset of the variable-length value in the tuple and splice them with the fixed-length value, and then save the specific value of the variable-length value.

The storage form of tuples in a page: first save the meta information, such as page_id, next_page_id, pre_page_id, the number of tuples and the starting offset of the free space, and then the offset and length of each tuple. Then save the data of each tuple from back to front. Each tuple is assigned a fixed rid when inserted into the page, which is composed of page_id and slot_num, and slot_num can be used to retrieve meta-information. When deleting a tuple, you need to move all the tuples whose offsets are before it back to fill its gaps and retain their slot_num, but you need to set their offset and length in the meta information to zero. In the next insertion, you can reuse this slot_num, and then allocate a new space in the free space. In other words, the order of slot_num does not represent the order of the tuple in space.

table_heap is responsible for managing and connecting pages. When inserting a tuple into the current cur_page, if the space is not enough, table_heap will access the next page recorded in the cur_page until it finds one with enough space. If cur_page has no next_page record at this time, table_heap will give it a new one from buffer_pool. Insert a tuple, the reference count of the page will increase by 1, otherwise it will decrease by 1.

table_iterator is responsible for traversing the tuples in all pages managed by table_heap.

Each page of the b+ tree uses an array to store key/value pairs. The key of array[0] in internal_page is useless, while the key of array[0] in leaf_page is useful.

Insert realizes
that each node in the b+ tree occupies a page. Since the value stored in the leaf node is of the RID type, and the value stored in the internal node is of the pafe_id type, the two classes need to be implemented separately. Then use a tree type to manage these nodes (application and use page), and implement the insertion operation. Directly inserting key-value pairs into leaf nodes is different from the subsequent operations caused by inserting key-values ​​into internal nodes due to overflow. For this, two member functions InsertIntoLeaf and InsertToParentWithIndex are implemented in the tree type. For each insert operation, InsertIntoLeaf is called first, and if an overflow occurs internally, InsertToParentWithIndex is called. If an internal node also overflows, InsertToParentWithIndex is called recursively.
Pit: For internal nodes, the child nodes that are transferred to the new node after splitting need to modify their parent node to be the new node. For leaf nodes, after splitting, the next_page_id attributes of the old node and the new node need to be updated respectively. The number of key-value pairs of internal nodes must be greater than or equal to 2, and the number of key-value pairs of leaf nodes must be greater than or equal to 1, so as to ensure the correctness of key value retrieval.

delete implementation
When the number of key-value pairs in a node is greater than or equal to 1/2 of the maximum capacity, it is said to meet the fill factor. During the deletion process, only one node in a layer may not meet the filling factor, so the node that does not meet the filling factor must borrow a node from its neighbor node to satisfy it, or merge with the neighbor node. The node without neighbor nodes must be the root node. Pit: When merging nodes, use the MoveAllTo function to move all the remaining key-value pairs of the current node to the left neighbor or the right neighbor. Unlike the MoveHalfTo used in splitting, here is not moving the key value to a new node, but inserting the left and right neighbors that already have the key value. So for the left and right neighbors, you need to consider the difference between inserting from the end and inserting from the beginning. After deleting the root node, in addition to updating the root node of the tree, you also need to update the parent node of its child nodes to INVALID_PAGE_ID. An unpin (page_id) function is used to notify buffer_pool to add the page_id page to the LRU queue. In addition to manually calling unpin, you can also use shared_ptr to automatically call unpin in the page exit scope. The method is to pass in a deleter object at the same time when initializing shared_ptr:

class Deleter {
    explicit Deleter(BufferPoolManager * buffer_pool_manager){
    void operator ()(BPlusTreePage* page) {

    BufferPoolManager *buffer_pool_manager_;

After applying for a new page from the buffer_pool, be sure to clear the data inside.