Evolution of tree data structures for indexing

# · 🔥 101 · 💬 12 · one month ago · erthalion.info · baotiao
B-tree is a self-balancing tree data structure that maintains sorted data and allows searches, sequential access, insertions, and deletions in logarithmic time. Every node of this tree is usually a page of some certain size and contains keys and pointers to other nodes. While obviously introducing a bit memory overhead those two changes make it possible to detect a concurrent page split by checking the page high key, which allows the tree to be searched without holding any read locks. As you can imagine this optimization is about trade-off between consuming less space on leave pages, but doing more job at run-time. Another interesting approach I find somehow similar is overflow pages, when only a fixed number of payload bytes is actually stored in the page directly and the rest goes to the overflow page. Memory footprint and insert performance are located on different sides of balance, we can improve inserts by preallocating pages and keep track of free space on the page which requires more memory. The authors of say more, what if we replace the whole index structure with a neural network that will learn the data distribution to "Predict" position of any entry in a database? In this case we get something similar to B-tree, except searching through a tree we find a page where the key is located and predicting the key location we get its position with some predefined error range.
Evolution of tree data structures for indexing



Archive | Send Feedback | WebAssembly Version (beta)