Please use this identifier to cite or link to this item: http://hdl.handle.net/1893/27668
Appears in Collections:Computing Science and Mathematics Conference Papers and Proceedings
Peer Review Status: Refereed
Author(s): Connor, Richard
Contact Email: richard.connor@stir.ac.uk
Title: Reference point hyperplane trees
Editor(s): Amsaleg, L
Houle, ME
Schubert, E
Citation: Connor R (2016) Reference point hyperplane trees. In: Amsaleg L, Houle ME & Schubert E (eds.) Similarity Search and Applications. SISAP 2016. Lecture Notes in Computer Science, 9939. International Conference on Similarity Search and Applications, SISAP 2016, Tokyo, Japan, 24.10.2016-26.10.2016. Cham, Switzerland: Springer, pp. 65-78. https://doi.org/10.1007/978-3-319-46759-7_5
Issue Date: 31-Dec-2016
Date Deposited: 16-Aug-2018
Series/Report no.: Lecture Notes in Computer Science, 9939
Conference Name: International Conference on Similarity Search and Applications, SISAP 2016
Conference Dates: 2016-10-24 - 2016-10-26
Conference Location: Tokyo, Japan
Abstract: Our context of interest is tree-structured exact search in metric spaces. We make the simple observation that the deeper a data item is within the tree, the higher the probability of that item being excluded from a search. Assuming a fixed and independent probability p of any subtree being excluded at query time, the probability of an individual data item being accessed is (1−p)^d for a node at depth d. In a balanced binary tree half of the data will be at the maximum depth of the tree, so this effect should be significant and observable. We test this hypothesis with two experiments on partition trees. First, we force a balance by adjusting the partition/exclusion criteria, and compare this with unbalanced trees where the mean data depth is greater. Second, we compare a generic hyperplane tree with a monotone hyperplane tree, where the mean depth is also greater. In both cases the tree with the greater mean data depth performs better in high-dimensional spaces. We then experiment with increasing the mean depth of nodes by using a small, fixed set of reference points to make exclusion decisions over the whole tree, so that almost all of the data resides at the maximum depth. Again, this can be seen to reduce the overall cost of indexing. Furthermore, we observe that, having already calculated reference point distances for all data, a final filtering can be applied if the distance table is retained. This further reduces the number of distance calculations required, whilst retaining scalability. The final structure can in fact be viewed as a hybrid between a generic hyperplane tree and a LAESA search structure.
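To make the depth argument in the abstract concrete, the short Python sketch below computes the expected number of data items accessed under the stated model, in which each subtree is excluded independently with probability p and a node at depth d is therefore reached with probability (1−p)^d. The tree shapes and the value of p used here are illustrative assumptions for the sketch, not parameters or results taken from the paper.

    def expected_accesses(depths, p):
        """Expected number of data items visited, given each item's depth and
        an independent per-subtree exclusion probability p."""
        return sum((1 - p) ** d for d in depths)

    def balanced_depths(levels):
        """Node depths of a complete binary tree: 2**d nodes at each depth d."""
        return [d for d in range(levels) for _ in range(2 ** d)]

    def max_depth_only(n_items, max_depth):
        """The same number of items, but all held at the maximum depth, loosely
        mimicking a tree in which the data resides at the deepest level."""
        return [max_depth] * n_items

    if __name__ == "__main__":
        p = 0.3                             # assumed per-subtree exclusion probability
        levels = 16
        balanced = balanced_depths(levels)  # 65,535 items, mean depth about 14
        deep = max_depth_only(len(balanced), levels - 1)
        print("items:", len(balanced))
        print("balanced tree, expected accesses: %.0f" % expected_accesses(balanced, p))
        print("all data at max depth, expected accesses: %.0f" % expected_accesses(deep, p))

Even though the maximum depth is the same in both cases, holding the data at the deepest level lowers the expected number of items visited, which is the effect the abstract reports for trees with greater mean data depth.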
Status: AM - Accepted Manuscript
Rights: This is a post-peer-review, pre-copyedit version of a paper published in Amsaleg L, Houle ME & Schubert E (eds.) Similarity Search and Applications. The final authenticated version is available online at: https://doi.org/10.1007/978-3-319-46759-7_5

Files in This Item:
File: Connor_SISAP2016_Reference_point_hyperplane_trees.pdf
Description: Fulltext - Accepted Version
Size: 577.87 kB
Format: Adobe PDF



This item is protected by original copyright



Items in the Repository are protected by copyright, with all rights reserved, unless otherwise indicated.

The metadata of the records in the Repository are available under the CC0 public domain dedication: No Rights Reserved https://creativecommons.org/publicdomain/zero/1.0/

If you believe that any material held in STORRE infringes copyright, please contact library@stir.ac.uk providing details and we will remove the Work from public display in STORRE and investigate your claim.