
LongestCommonSubstringA Class

Problem solver for the longest common substring problem. It operates in O(N) time, where N is the text length, and keeps its linked-list elements as structures in a linear array rather than as linked class instances. This code runs slightly faster than LongestCommonSubstringL and avoids creating a large number of linked-list nodes, which reduces the load on the garbage collector.
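
The idea of storing linked-list elements as structures in one array can be illustrated with a minimal sketch. The names `Node` and `ArrayLinkedList` are hypothetical and are not part of Altaxo; the sketch only assumes that links are kept as integer indices into a single pre-allocated array, so that no per-node heap objects are created.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only; this is not the Altaxo implementation.
public struct Node
{
    public int Value;
    public int Previous; // index of the previous node, or -1 if none
    public int Next;     // index of the next node, or -1 if none
}

public sealed class ArrayLinkedList
{
    private readonly Node[] _nodes; // all nodes live in this single array
    private int _count;             // number of array slots handed out so far
    private int _head = -1;
    private int _tail = -1;

    public ArrayLinkedList(int capacity)
    {
        _nodes = new Node[capacity]; // one allocation instead of one object per node
    }

    // Appends a value and returns the array index of its node.
    public int AddLast(int value)
    {
        if (_count == _nodes.Length)
            throw new InvalidOperationException("Capacity exceeded.");
        int idx = _count++;
        _nodes[idx] = new Node { Value = value, Previous = _tail, Next = -1 };
        if (_tail >= 0) _nodes[_tail].Next = idx; else _head = idx;
        _tail = idx;
        return idx;
    }

    // Unlinks the node at the given index by rewiring its neighbours.
    // The slot is not reused, and each node must be removed at most once.
    public void Remove(int idx)
    {
        int prev = _nodes[idx].Previous;
        int next = _nodes[idx].Next;
        if (prev >= 0) _nodes[prev].Next = next; else _head = next;
        if (next >= 0) _nodes[next].Previous = prev; else _tail = prev;
    }

    // Enumerates the values in list order by following the index links.
    public IEnumerable<int> Values()
    {
        for (int i = _head; i >= 0; i = _nodes[i].Next)
            yield return _nodes[i].Value;
    }
}
```

Because `Node` is a value type, the array holds the nodes inline, and the garbage collector sees a single array object rather than many small linked instances.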
Inheritance Hierarchy

Namespace: Altaxo.Collections.Text
Assembly: AltaxoCore (in AltaxoCore.dll) Version: 4.8.3179.0 (4.8.3179.0)
Syntax
C#
public class LongestCommonSubstringA : LongestCommonSubstringBaseA

The LongestCommonSubstringA type exposes the following members.

Constructors
| | Name | Description |
|---|---|---|
| Public method | LongestCommonSubstringA | Initializes a new instance of the problem solver for the longest common substring problem. |
Properties
| | Name | Description |
|---|---|---|
| Public property | CommonSubstringPositionsForMaximumNumberOfWords | Returns the positions of the common substrings for the maximum number of words that have at least one common substring. The result is identical to a call of GetSubstringPositionsCommonToTheNumberOfWords(Int32) with the argument MaximumNumberOfWordsWithCommonSubstring. (Inherited from LongestCommonSubstringBase) |
| Public property | MaximumNumberOfWordsWithCommonSubstring | Gets or sets the maximum number of words with a common substring. (Inherited from LongestCommonSubstringBase) |
| Public property | StoreVerboseResults | Gets or sets a value indicating whether to store all longest common substrings for a given number of words or just one. (Inherited from LongestCommonSubstringBase) |
Methods
| | Name | Description |
|---|---|---|
| Protected method | CleanIntermediates | Cleans up the intermediates so that the garbage collector can reclaim them. |
| Public method | Equals | Determines whether the specified object is equal to the current object. (Inherited from Object) |
| Public method | Evaluate | Evaluates the longest common substring. After evaluation, the results can be accessed through the properties of this instance. Note that the amount of resulting information depends on the value of StoreVerboseResults. |
| Protected method | EvaluateMaximumNumberOfWordsWithCommonSubstring | Post-processes the results. Here the maximum number of words that have at least one common substring is evaluated. |
| Protected method | Finalize | Allows an object to try to free resources and perform other cleanup operations before it is reclaimed by garbage collection. (Inherited from Object) |
| Public method | GetHashCode | Serves as the default hash function. (Inherited from Object) |
| Public method | GetSubstringPositionsCommonToTheNumberOfWords | Returns the positions of the common substrings for the given number of words. (Inherited from LongestCommonSubstringBase) |
| Public method | GetType | Gets the Type of the current instance. (Inherited from Object) |
| Protected method | MemberwiseClone | Creates a shallow copy of the current Object. (Inherited from Object) |
| Protected method | print_debug | Prints all linked list items for debugging purposes. (Inherited from LongestCommonSubstringBaseA) |
| Protected method | StoreVerboseResult | Stores a common substring occurrence. (Inherited from LongestCommonSubstringBase) |
| Public method | ToString | Returns a string that represents the current object. (Inherited from Object) |
Fields
| | Name | Description |
|---|---|---|
| Protected field | _ddlList | Keeps a linked list of LongestCommonSubstringBaseALLElements. (Inherited from LongestCommonSubstringBaseA) |
| Protected field | _items | |
| Protected field | _lastLcp | (Inherited from LongestCommonSubstringBaseA) |
| Protected field | _LCP | Stores the length of the Longest Common Prefix of the lexicographically i-th suffix and its lexicographical predecessor (the lexicographically (i-1)-th suffix). The element at index 0 is always 0. (Inherited from LongestCommonSubstringBase) |
| Protected field | _LCPS | Stores the length of the Longest Common Prefix as in _LCP, but only if the two adjacent suffixes belong to the same word. Otherwise, i.e. if the suffix _suffixArray[i-1] belongs to a different word than the suffix _suffixArray[i], _LCPS[i] is zero. (Inherited from LongestCommonSubstringBase) |
| Protected field | _lcsOfNumberOfWords | Stores in element idx the length of the longest substring that is common to idx words (it follows that indices 0 and 1 are unused). (Inherited from LongestCommonSubstringBase) |
| Protected field | _maximumLcp | Maximum of all values in the _LCP array. (Inherited from LongestCommonSubstringBase) |
| Protected field | _maximumNumberOfWordsWithCommonSubstring | Contains the maximum number of words that have a common substring. (Inherited from LongestCommonSubstringBase) |
| Protected field | _numberOfWords | Number of words the text was separated into. (Inherited from LongestCommonSubstringBase) |
| Protected field | _singleResultOfNumberOfWords | If _useVerboseResults is false, stores only the first report of a longest common substring for the given number of words. Each element holds the beginning and the end index in the suffix array that delimit all suffixes sharing this substring. The length of this substring is stored in _lcsOfNumberOfWords at the same index. If _useVerboseResults is true, this array is not used and is set to null. (Inherited from LongestCommonSubstringBase) |
| Protected field | _suffixArray | Maps the lexicographical order position i of a suffix to the starting position of the suffix in the text, which is the value of the i-th element of this array. (Inherited from LongestCommonSubstringBase) |
| Protected field | _useVerboseResults | Determines the amount of information to store during evaluation. (Inherited from LongestCommonSubstringBase) |
| Protected field | _verboseResultsOfNumberOfWords | If _useVerboseResults is true, this array stores, for a given number of words that have one or more substrings in common, a list of all positions where such common substrings occur. Each list element holds the beginning and the end index in the suffix array that delimit all suffixes sharing a substring. The length of this substring is stored in _lcsOfNumberOfWords. If _useVerboseResults is false, this array is not used and is set to null. (Inherited from LongestCommonSubstringBase) |
| Protected field | _wordIndices | Maps the lexicographical order position i of a suffix to the index of the word in which this suffix starts. For instance, the i-th element contains the index of the word in which the lexicographically i-th suffix, which starts at position _suffixArray[i], begins. The contents of this array are only meaningful if the provided text was separated into words, for instance for the longest common substring problem. (Inherited from LongestCommonSubstringBase) |
| Protected field | _wordStartPositions | Start positions, in the concatenated text array, of the words into which the original text was separated. (Inherited from LongestCommonSubstringBase) |
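
To make the field descriptions above more concrete, the following sketch builds arrays with the same meaning as _wordStartPositions, _suffixArray, _LCP, _wordIndices and _LCPS for a small input. It is a naive illustration and not the Altaxo code: the class `SuffixArrayDemo` and its methods are hypothetical, the suffix array is built by plain sorting rather than in linear time, and it assumes that each word is terminated by its own separator character (here the character with code w + 1 for word w), which only works for a few words whose text does not contain those characters.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical illustration only; names and separator scheme are assumptions.
public static class SuffixArrayDemo
{
    // Concatenates the words, terminating word w with the separator (char)(w + 1).
    public static string Concatenate(string[] words, out int[] wordStartPositions)
    {
        wordStartPositions = new int[words.Length];
        var sb = new StringBuilder();
        for (int w = 0; w < words.Length; w++)
        {
            wordStartPositions[w] = sb.Length;
            sb.Append(words[w]).Append((char)(w + 1));
        }
        return sb.ToString();
    }

    // Element i is the start position of the lexicographically i-th suffix
    // (naive construction by sorting all suffixes with ordinal comparison).
    public static int[] BuildSuffixArray(string text)
    {
        return Enumerable.Range(0, text.Length)
            .OrderBy(p => text.Substring(p), StringComparer.Ordinal)
            .ToArray();
    }

    // Element i is the common prefix length of suffix i and suffix i-1; element 0 is 0.
    public static int[] BuildLcp(string text, int[] suffixArray)
    {
        var lcp = new int[suffixArray.Length];
        for (int i = 1; i < suffixArray.Length; i++)
        {
            int a = suffixArray[i - 1], b = suffixArray[i], n = 0;
            while (a + n < text.Length && b + n < text.Length && text[a + n] == text[b + n])
                n++;
            lcp[i] = n;
        }
        return lcp;
    }

    // Element i is the index of the word in which the i-th suffix starts.
    public static int[] BuildWordIndices(int[] suffixArray, int[] wordStartPositions)
    {
        return suffixArray
            .Select(p => wordStartPositions.Count(s => s <= p) - 1)
            .ToArray();
    }

    // Like the LCP array, but zero wherever adjacent suffixes start in different words.
    public static int[] BuildLcps(int[] lcp, int[] wordIndices)
    {
        var lcps = new int[lcp.Length];
        for (int i = 1; i < lcp.Length; i++)
            lcps[i] = wordIndices[i] == wordIndices[i - 1] ? lcp[i] : 0;
        return lcps;
    }
}
```

For the two words "ab" and "b", the last two entries of the sorted suffixes start in different words and share a prefix of length 1, which corresponds to the common substring "b".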
Remarks
For details of the algorithm, see the paper by Michael Arnold and Enno Ohlebusch, 'Linear Time Algorithms for Generalizations of the Longest Common Substring Problem', Algorithmica (2011) 60: 806-818; DOI: 10.1007/s00453-009-9369-1. This code was adapted by D. Lellinger from the C++ sources on the authors' web site at http://www.uni-ulm.de/in/theo/research/sequana.html.
See Also