Computing and Information Systems - Theses

Permanent URI for this collection

Search Results

Now showing 1 - 1 of 1
  • Item
    Thumbnail Image
    A study of recursive partitioning technique for indexing spatial objects and trajectories
    Antoine, Elizabeth Arockiarani ( 2009)
    The requirement to store and manipulate data that represents the location and extent of objects, like roads, cities, rivers, etc. (spatial data), led to the evolution of spatial database systems. The domains that led to an increased interest in spatial database systems are earth science, robotics, resource management, urban planning, autonomous navigation and geographic information systems (GIS). To handle the spatial data efficiently, spatial database systems require indexing mechanisms that can retrieve spatial objects based on their locations using direct look-ups as opposed to the sequential search. Indexing structures designed for relational database systems cannot be used for objects with non-zero size. The fact that there is no total ordering of objects in space makes the conventional indexes, such as the B+ tree, incapable of handling spatial data. Extensive work has been done on spatial indexing and indexing methods are categorized in terms of their efficiency for the type of spatial objects or the type of queries. Queries in spatial database system are classified as single-scan and multi-scan queries. Spatial join is the most important multi-scan query in a spatial database system and the execution time of such queries is super linear to the number of objects. Among the indexing structures available for spatial join queries, Filter trees perform better than its counterparts, such as Hilbert R-trees. Filter tree join algorithm outperforms the R-tree join algorithm by reading each block of entities at most once. Filter trees combine the recursive partitioning, size separation and space filling curves to achieve this efficiency. However, for data sets of low join selectivity, the number of blocks processed for Filter trees is excessive compared to the number of blocks that have intersecting entities. The goal of this work is to provide a method for accelerating spatial join operations by using Spatial Join Bitmap (SJB) indices. The file organization is based on the concepts introduced in Filter trees. The SJB indices keep track of blocks that have intersecting entities and make the algorithm process only those blocks. We provide algorithms for generating SJB indices dynamically and for maintaining SJB indices when the data sets are updated. Although maintaining SJB indices for updates increases the cost in terms of response time, the cost saving in terms of the join operation is substantial and this makes the overall behaviour of the spatial system very efficient. We have performed an extensive study using both real and synthetic data sets of various data distributions. The results show that the use of SJB indices produces a substantial speed-up, ranging from 25% to 150% when compared to Filter trees. This method is highly beneficial in a real world scenario, as the number of times the data set is updated is fairly low when compared to the number of times the join processing is done on the data sets. The spatial indexing structures can be extended to handle data of higher dimensions including time. The position of the geometries, like points, lines, areas or volumes changing over time, represents moving objects. The need for storing and processing moving object data arises in a wide range of applications, including digital battlefields (battlefield simulators), air-traffic control, and mobile communication systems. The successive locations of the object are gathered as the object moves around the space and the locations that are ordered in time are interpolated to obtain the movement of the object: this is called as trajectory of the object. R-tree variations, such as the three dimensional R-trees, TB-trees, FNR trees, STR trees, MON trees and SETI trees, are found to be effective for storing and manipulating past locations of moving objects. The SETI tree is a combination of the R-tree in the time dimension and the partition based technique in the space dimension, and outperforms the other R-tree indexing structures in handling coordinate based queries. However, SETI increases the computational time when handling trajectory queries that retrieve the whole or part of the trajectories. We propose a methodology for using the recursive partitioning technique for indexing trajectories, called the Recursively Partitioned Trajectory Index (RPTI). RPTI uses a two-level indexing structure that is similar to the SETI and maintains separate indices for the space and time dimensions. However, the splitting of trajectory segments in SETI, which increases the computational time, does not arise in RPTI. We present the algorithms for constructing the RPTI and the algorithms for updates, which include insertion and deletion. We have conducted an experimental study of this method and have demonstrated that RPTI is better than SETI in handling trajectory queries and is competitive with SETI in handling coordinate based queries. Contrary to the SETI structure, RPTI recursively partitions the space and avoids the splitting of line segments, making it efficient for query processing. Deletion is often ignored while proposing a trajectory index as a result of the assumption that deleting the trajectory of a moving object is meaningless after the transmitted positions are recorded. However, deletions are necessary when the trajectory of a moving object is no longer useful. We have also shown that deletion of a trajectory can be efficiently done using the RPTI structure. The structure of RPTI can be easily implemented by using any of the existing spatial indexing structures. The only design parameters required are the standard disk page size and maximum level of recursive partitioning. However, in SETI, the number of spatial partitions, which is a crucial parameter in any spatial partitioning strategy, is highly dependent on the distribution of data sets.