Bounded Similarity Querying for Time-Series Data

Information and Computation (I&C), 194(2):203-241, November 2004.
Dina Goldin, Todd Millstein, Ayferi Kutlu
We define the problem of bounded similarity querying in time-series databases, which generalizes earlier notions of similarity querying. Given a (sub)sequence S, a query sequence Q, lower and upper bounds on shifting and scaling parameters, and a tolerance &epsilon, S is considered boundedly similar to Q if S can be shifted and scaled within the specified bounds to produce a modified sequence S' whose distance from Q is within &epsilon. We use similarity transformation to formalize the notion of bounded similarity. We then describe a framework that supports the resulting set of queries; it is based on a fingerprint method that normalizes the data and saves the normalization parameters. For off-line data, we provide an indexing method with a single index structure and search technique for handling all the special cases of bounded similarity querying. Experimental investigations find the performance of our method to be competitive with earlier, less general approaches.