R-Tree

R-tree

From Wikipedia, the free encyclopedia

Jump to: navigation, search

This article is about the data structure. For the type of metric space, see Real tree.

Simple example of an R-tree for 2D rectangles

Visualization of an R*-tree for 3D cubes using ELKI

R-trees are tree data structures that are similar to B-trees, but are used for spatial access methods, i.e., for indexing multi-dimensional information; for example, the (X, Y) coordinates of geographical data. A common real-world usage for an R-tree might be: "Find all museums within 2 kilometres (1.2 mi) of my current location".

The data structure splits space with hierarchically nested, and possibly overlapping, minimum bounding rectangles (MBRs, otherwise known as bounding boxes, i.e. "rectangle", what the "R" in R-tree stands for).

Each node of an R-tree has a variable number of entries (up to some pre-defined maximum). Each entry within a non-leaf node stores two pieces of data: a way of identifying a child node, and the bounding box of all entries within this child node.

The insertion and deletion algorithms use the bounding boxes from the nodes to ensure that "nearby" elements are placed in the same leaf node (in particular, a new element will go into the leaf node that requires the least enlargement in its bounding box). Each entry within a leaf node stores two pieces of information; a way of identifying the actual data element (which, alternatively, may be placed directly in the node), and the bounding box of the data element.

Similarly, the searching algorithms (e.g., intersection, containment, nearest) use the bounding boxes to decide whether or not to search inside a child node. In this way, most of the nodes in the tree are never "touched" during a search. Like B-trees, this makes R-trees suitable for databases, where nodes can be paged to memory when needed.

Different algorithms can be used to split nodes when they become too full, resulting in the quadratic and linear R-tree sub-types.

R-trees do not historically guarantee good worst-case performance, but generally perform well with real-world data.^{[citation needed]} However, a new algorithm was published in 2004 that defines the Priority R-Tree, which claims to be as efficient as the most efficient methods of 2004 and is at the same time worst-case optimal.^[1]

When data is organized in an R-Tree, the k nearest neighbors (for any L^p-Norm) of all points can efficiently be computed using a spatial join^[2]. This is beneficial for many algorithms based on the k nearest neighbors, for example the Local Outlier Factor.

[hide]

[edit] Variants

R* tree
R+ tree
Hilbert R-tree
Priority R-Tree (PR-Tree) - The PR-tree performs similarly to the best known R-tree variants on real-life and relatively evenly distributed data, but outperforms them significantly on more extreme data.^[1]

[edit] Algorithm

[edit] Search

The input is a search rectangle (Query box). Searching is quite similar to searching in a B+tree. The search starts from the root node of the tree. Every internal node contains a set of rectangles and pointers to the corresponding child node and every leaf node contains the rectangles of spatial objects (the pointer to some spatial object can be there). For every rectangle in a node, it has to be decided if it overlaps the search rectangle or not. If yes, the corresponding child node has to be searched also. Searching is done like this in a recursive manner until all overlapping nodes have been traversed. When a leaf node is reached, the contained bounding boxes (rectangles) are tested against the search rectangle and their objects (if there are any) are put into the result set if they lie within the search rectangle.

[edit] Insertion

To insert an object, the tree is traversed recursively from the root node. All rectangles in the current internal node are examined. The constraint of least coverage is employed to insert an object, i.e., the box that needs least enlargement to enclose the new object is selected. In the case where there is more than one rectangle that meets this criterion, the one with the smallest area is chosen. Inserting continues recursively in the chosen node. Once a leaf node is reached, a straightforward insertion is made if the leaf node is not full. If the leaf node is full, it must be split before the insertion is made. A few splitting algorithms have been proposed for good R-tree performance.

This section requires expansion.

[edit] Bulk-loading

Sort-Tile-Recursive (STR) ^[3]
Packed Hilbert R-Tree - Uses the Hilbert value of the center of a rectangle to sort the leaf nodes and recursively builds the tree.
Nearest-X - Rectangles are sorted on the x-coordinate and nodes are created.

This section requires expansion.

[edit] See also

Segment tree
Interval tree - A degenerate R-Tree for 1 dimension (usually time).
Bounding volume hierarchy
Spatial index
GiST

[edit] References

^ ^a ^b Lars Arge, Mark de Berg, Herman J. Haverkort, Ke Yi: ""The Priority R-Tree: A Practically Efficient and Worst-Case Optimal RTree"", SIGMOD 2004, June 13–18, Paris, France
^ Brinkhoff, T.; Kriegel, H. P.; Seeger, B. (1993). "Efficient processing of spatial joins using R-trees". ACM SIGMOD Record 22: 237. doi:10.1145/170036.170075. edit
^ Scott T. Leutenegger, Jeffrey M. Edgington and Mario A. Lopez: STR: A Simple and Efficient Algorithm for R-Tree Packing

Antonin Guttman: R-Trees: A Dynamic Index Structure for Spatial Searching, Proc. 1984 ACM SIGMOD International Conference on Management of Data, pp. 47–57. ISBN 0-89791-128-8
Yannis Manolopoulos, Alexandros Nanopoulos, Apostolos N. Papadopoulos, Yannis Theodoridis: R-Trees: Theory and Applications, Springer, 2005. ISBN 1-85233-977-2
N. Beckmann, H.-P. Kriegel, R. Schneider, B. Seeger: The R*-Tree: An Efficient and Robust Access Method for Points and Rectangles. SIGMOD Conference 1990: 322-331

[edit] External links

R-tree portal
R-Tree implementations: C & C++, Java applet, Common Lisp, Python, Javascript.

[hide] v • d • e Trees in computer science

Binary trees	Binary search tree (BST) · Van Emde Boas tree · Cartesian tree · Top Tree · T-tree

Self-balancing binary search trees	Red-black tree · AVL tree · AA tree · Splay tree · Scapegoat tree · Treap

B-trees	B+ tree · B*-tree · UB-tree · 2-3 tree · 2-3-4 tree · (a,b)-tree · Dancing tree · Htree · B^x-tree

Tries	Suffix tree · Radix tree · Ternary search tree

Binary space partitioning (BSP) trees	Quadtree · Octree · kd-tree (implicit) · VP-tree

Non-binary trees	Exponential tree · Fusion tree · Interval tree · PQ tree · Range tree · SPQR tree

Spatial data partitioning trees	*R-tree · R+ tree · R tree · X-tree · M-tree · Segment tree · Fenwick tree · Hilbert R-tree**

Other trees	Heap · Hash tree · Finger tree · Metric tree · Cover tree · BK-tree · Doubly-chained tree · iDistance

Retrieved from "http://en.wikipedia.org/wiki/R-tree"

Categories: R-tree | Database index techniques

posted on 2010-12-01 11:46 JonsenElizee 阅读(2550) 评论(0) 编辑收藏引用

只有注册用户登录后才能发表评论。
【推荐】100%开源！大型工业跨平台软件C++源码提供，建模，组态！



网站导航: 博客园 IT新闻 BlogJava 博问 Chat2DB 管理

JonsenElizee