C++ containers that save memory and time
January 31st, 2013 | Published in Google Open Source
We’re pleased to announce C++ B-Tree, a C++ template library that implements B-tree containers with an analogous interface to the standard STL map, set, multimap, and multiset containers. B-trees are well-known data structures for organizing secondary storage, because they are optimized for reading and writing large blocks of data. But the same property that makes B-trees appropriate for use with databases and file systems also makes them appropriate for use in main-memory, just with smaller blocks.
C++ standard libraries commonly implement the map and set data structures using Red-Black trees, which store one element per node, requiring three pointers plus one bit of information per element to keep the tree balanced. By comparison, B-tree containers store a number of elements per node, thereby reducing pointer overhead and saving a significant amount of memory. For small data types, B-tree containers typically reduce memory use by 50 to 80% compared with Red-Black tree containers.
insertion time as a function of tree size
(256 byte node)
Storing multiple elements per node can also improve performance, especially in large containers with inexpensive key-compare functions (e.g., integer keys using less-than), relative to Red-Black tree containers. This is because performance in these cases is effectively governed by the number of cache misses, not the number of key-compare operations. For large data sets, using these B-tree containers will save memory and improve performance.
insertion time as a function of node size
(10M element tree)