Repetitive strings such as periods have been studied vigorously in so diverse fields as data compression, computer-assisted music analysis, bioinformatics, and etc. In bioinformatics, periods are highly related to repetitive patterns in DNA sequences ...
Repetitive strings such as periods have been studied vigorously in so diverse fields as data compression, computer-assisted music analysis, bioinformatics, and etc. In bioinformatics, periods are highly related to repetitive patterns in DNA sequences so called tandem repeats. In some cases, quite similar but not the same patterns are repeated and thus we need approximate string matching algorithms to study tandem repeats in DNA sequences. In this paper, we propose a new definition of approximate periods of strings based on distance sum. Given two strings p (|p|=m) and x (|X||=n), we propose an algorithm that computes the minimum approximate period distance based on distance sum. Our algorithm runs in O (mn²) time for the weighted edit distance, and runs in O(m n) time for the edit distance, and runs in O(n) time for the Hamming distance.