Please use this identifier to cite or link to this item: http://hdl.handle.net/1893/30024
Appears in Collections:Computing Science and Mathematics Conference Papers and Proceedings
Author(s): Chávez, Edgar
Connor, Richard
Vadicamo, Lucia
Contact Email: richard.connor@stir.ac.uk
Title: Query Filtering with Low-Dimensional Local Embeddings
Citation: Chávez E, Connor R & Vadicamo L (2019) Query Filtering with Low-Dimensional Local Embeddings. In: SISAP 2019: Similarity Search and Applications. Lecture Notes in Computer Science, 11807. SISAP2019: International Conference on Similarity Search and Applications, Newark NJ, USA, 02.10.2019-04.10.2019. Cham, Switzerland: Springer, pp. 233-246. https://doi.org/10.1007/978-3-030-32047-8_21
Issue Date: 2019
Date Deposited: 19-Aug-2019
Series/Report no.: Lecture Notes in Computer Science, 11807
Conference Name: SISAP2019: International Conference on Similarity Search and Applications
Conference Dates: 2019-10-02 - 2019-10-04
Conference Location: Newark NJ, USA
Abstract: The concept of local pivoting is to partition a metric space so that each element in the space is associated with precisely one of a fixed set of reference objects or pivots. The idea is that each object of the data set is associated with the reference object that is best suited to filter that particular object if it is not relevant to a query, maximising the probability of excluding it from a search. The notion does not in itself lead to a scalable search mechanism, but instead gives a good chance of exclusion based on a tiny memory footprint and a fast calculation. It is therefore most useful in contexts where main memory is at a premium, or in conjunction with another, scalable, mechanism. In this paper we apply similar reasoning to metric spaces which possess the four-point property, which notably include Euclidean, Cosine, Triangular, Jensen-Shannon, and Quadratic Form. In this case, each element of the space can be associated with two reference objects, and a four-point lower-bound property is used instead of the simple triangle inequality. The probability of exclusion is strictly greater than with simple local pivoting; the space required per object and the calculation are again tiny in relative terms. We show that the resulting mechanism can be very effective. A consequence of using the four-point property is that, for m reference points, there arèarè m 2 ´ pivot pairs to choose from, giving a very good chance of a good selection being available from a small number of distance calculations. Finding the best pair has a quadratic cost with the number of references ; however, we provide experimental evidence that good heuristics exist. Finally, we show how the resulting mechanism can be integrated with a more scalable technique to provide a very significant performance improvement, for a very small overhead in build-time and memory cost. Keywords: metric search · extreme pivoting · supermetric space · four-point property · pivot based index 2 Chávez et al.
Status: AM - Accepted Manuscript
Rights: This is a post-peer-review, pre-copyedit version of a paper published in Amato G., Gennaro C., Oria V., Radovanović M. (eds) Similarity Search and Applications. SISAP 2019. Lecture Notes in Computer Science, vol 11807. Springer, Cham. The final authenticated version is available online at: https://doi.org/10.1007/978-3-030-32047-8_21

Files in This Item:
File Description SizeFormat 
superExtremePaper.pdfFulltext - Accepted Version745.34 kBAdobe PDFView/Open



This item is protected by original copyright



Items in the Repository are protected by copyright, with all rights reserved, unless otherwise indicated.

The metadata of the records in the Repository are available under the CC0 public domain dedication: No Rights Reserved https://creativecommons.org/publicdomain/zero/1.0/

If you believe that any material held in STORRE infringes copyright, please contact library@stir.ac.uk providing details and we will remove the Work from public display in STORRE and investigate your claim.