
LongestCommonSubstringBase Class

Base class for problem solvers for the longest common substring problem.
Inheritance Hierarchy

Namespace: Altaxo.Collections.Text
Assembly: AltaxoCore (in AltaxoCore.dll) Version: 4.8.3261.0 (4.8.3261.0)
Syntax
C#
public class LongestCommonSubstringBase

The LongestCommonSubstringBase type exposes the following members.

Constructors
Name | Description
Public method LongestCommonSubstringBase | Initializes a new instance of the problem solver for the longest common substring problem.
Properties
Name | Description
Public property CommonSubstringPositionsForMaximumNumberOfWords | Returns the positions of the common substrings for the maximum number of words that have at least one common substring. The result is identical to a call of GetSubstringPositionsCommonToTheNumberOfWords(Int32) with the argument MaximumNumberOfWordsWithCommonSubstring.
Public property MaximumNumberOfWordsWithCommonSubstring | Gets or sets the maximum number of words with a common substring.
Public property StoreVerboseResults | Gets or sets a value indicating whether to store all longest common substrings for a given number of words or just one.
Methods
Name | Description
Public method Equals | Determines whether the specified object is equal to the current object. (Inherited from Object)
Protected method Finalize | Allows an object to try to free resources and perform other cleanup operations before it is reclaimed by garbage collection. (Inherited from Object)
Public method GetHashCode | Serves as the default hash function. (Inherited from Object)
Public method GetSubstringPositionsCommonToTheNumberOfWords | Returns the positions of the common substrings for the given number of words.
Public method GetType | Gets the Type of the current instance. (Inherited from Object)
Protected method MemberwiseClone | Creates a shallow copy of the current Object. (Inherited from Object)
Protected method StoreVerboseResult | Stores a common substring occurrence.
Public method ToString | Returns a string that represents the current object. (Inherited from Object)
Fields
Name | Description
Protected field _LCP | Stores the length of the Longest Common Prefix of the lexicographically i-th suffix and its lexicographical predecessor (the lexicographically (i-1)-th suffix). The element at index 0 is always 0.
Protected field _LCPS | Stores the length of the Longest Common Prefix as in _LCP, but only if the two adjacent suffixes belong to the same word. Otherwise, i.e. if the suffix _suffixArray[i-1] belongs to a different word than the suffix _suffixArray[i], _LCPS[i] is zero.
Protected field _lcsOfNumberOfWords | Stores in element idx the length of the longest substring that is common to idx words (it follows that indices 0 and 1 are unused here).
Protected field _maximumLcp | Maximum of all values in the _LCP array.
Protected field _maximumNumberOfWordsWithCommonSubstring | Contains the maximum number of words that have a common substring.
Protected field _numberOfWords | Number of words the text was separated into.
Protected field _singleResultOfNumberOfWords | If _useVerboseResults is false, stores only the first report of a longest common substring for the given number of words. Each element contains the beginning and the end index in the suffix array that delimit all suffixes that have this substring in common. The length of this substring is stored in _lcsOfNumberOfWords at the same index. If _useVerboseResults is true, this array is not used and is set to null.
Protected field _suffixArray | Maps the lexicographical order position i of a suffix to the starting position of the suffix in the text, which is the value of the i-th element of this array.
Protected field _useVerboseResults | Determines the amount of information to store during evaluation.
Protected field _verboseResultsOfNumberOfWords | If _useVerboseResults is true, this array stores, for a given number of words that have one or more substrings in common, a list with all positions where such common substrings occur. Each element of each list contains the beginning and the end index in the suffix array that delimit all suffixes that have a substring in common. The length of this substring is stored in _lcsOfNumberOfWords. If _useVerboseResults is false, this array is not used and is set to null.
Protected field _wordIndices | Maps the lexicographical order position i of a suffix to the index of the word in which this suffix starts. That is, the i-th element contains the index of the word in which the lexicographically i-th suffix, which starts at position _suffixArray[i], begins. The contents of this array are only meaningful if you provided text that was separated into words, for instance for the longest common substring problem.
Protected field _wordStartPositions | Start positions, in the concatenated text array, of the words into which the original text was separated.
Protected field, static member ERROR_NO_RESULTS_YET |
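The relationship between the suffix array, the LCP array, the word indices and the word start positions described above can be illustrated with a small, deliberately naive sketch. It is not the AltaxoCore implementation: all type and method names below are hypothetical, the suffix array is built by plain sorting rather than in linear time, and it assumes that each word is terminated by a unique separator character that does not occur in any word.

```csharp
using System;
using System.Linq;
using System.Text;

// Illustrative sketch only; all names here are hypothetical and not part of AltaxoCore.
public static class SuffixArraySketch
{
    // Builds the concatenated text. Word w is terminated by the unique separator (char)(w + 1);
    // assumption: no word contains such low control characters.
    public static string Concatenate(string[] words, out int[] wordStartPositions)
    {
        wordStartPositions = new int[words.Length];
        var sb = new StringBuilder();
        for (int w = 0; w < words.Length; w++)
        {
            wordStartPositions[w] = sb.Length;
            sb.Append(words[w]).Append((char)(w + 1));
        }
        return sb.ToString();
    }

    // Naive suffix array: sort all suffix start positions by ordinal comparison of the suffixes.
    public static int[] BuildSuffixArray(string text)
    {
        var sa = Enumerable.Range(0, text.Length).ToArray();
        Array.Sort(sa, (a, b) => string.CompareOrdinal(text, a, text, b, text.Length));
        return sa;
    }

    // lcp[i] = length of the longest common prefix of the suffixes sa[i-1] and sa[i]; lcp[0] = 0.
    public static int[] BuildLcp(string text, int[] sa)
    {
        var lcp = new int[sa.Length];
        for (int i = 1; i < sa.Length; i++)
        {
            int a = sa[i - 1], b = sa[i], k = 0;
            while (a + k < text.Length && b + k < text.Length && text[a + k] == text[b + k])
                k++;
            lcp[i] = k;
        }
        return lcp;
    }

    // wordIndices[i] = index of the word in which the suffix starting at sa[i] begins.
    public static int[] BuildWordIndices(int[] sa, int[] wordStartPositions)
    {
        var result = new int[sa.Length];
        for (int i = 0; i < sa.Length; i++)
        {
            int w = Array.BinarySearch(wordStartPositions, sa[i]);
            // If not found, ~w is the index of the first start position greater than sa[i].
            result[i] = w >= 0 ? w : ~w - 1;
        }
        return result;
    }
}
```

Because every separator is unique, a common prefix of two suffixes can never extend across a word boundary, so the plain LCP values are directly usable for comparing words. In this sketch a separator is counted as belonging to the word it terminates.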
Remarks
For details of the algorithm, see the very nice paper by Michael Arnold and Enno Ohlebusch, 'Linear Time Algorithms for Generalizations of the Longest Common Substring Problem', Algorithmica (2011) 60: 806-818, DOI: 10.1007/s00453-009-9369-1. This code was adapted from the C++ sources on the authors' web site at http://www.uni-ulm.de/in/theo/research/sequana.html.
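To make the problem statement concrete, the following brute-force sketch computes, for every k from 2 up to the number of words, the length of the longest substring that occurs in at least k of the words. It is a slow reference under stated assumptions, not the linear-time algorithm from the paper and not the AltaxoCore code; the class and method names are hypothetical, and the "at least k words" reading of the result array is an assumption about how _lcsOfNumberOfWords is meant.

```csharp
using System;
using System.Collections.Generic;

// Brute-force reference; hypothetical helper, not part of AltaxoCore.
public static class NaiveCommonSubstring
{
    // result[k] = length of the longest substring that occurs in at least k of the words.
    // Indices 0 and 1 stay 0, mirroring the convention described for _lcsOfNumberOfWords.
    public static int[] LongestCommonLengths(string[] words)
    {
        // For every distinct substring, collect the set of words it occurs in.
        var occurrences = new Dictionary<string, HashSet<int>>();
        for (int w = 0; w < words.Length; w++)
        {
            for (int start = 0; start < words[w].Length; start++)
            {
                for (int len = 1; start + len <= words[w].Length; len++)
                {
                    string s = words[w].Substring(start, len);
                    if (!occurrences.TryGetValue(s, out var set))
                        occurrences[s] = set = new HashSet<int>();
                    set.Add(w);
                }
            }
        }

        // A substring found in c words is a candidate for every k between 2 and c.
        var result = new int[words.Length + 1];
        foreach (var pair in occurrences)
        {
            for (int k = 2; k <= pair.Value.Count; k++)
                result[k] = Math.Max(result[k], pair.Key.Length);
        }
        return result;
    }
}
```

For the words "abcab", "bcabx" and "cab", this yields 4 for k = 2 (the substring "bcab") and 3 for k = 3 (the substring "cab"). A reference of this kind is mainly useful as a test oracle for small inputs, since it enumerates every substring of every word.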
See Also