Package org.apache.nutch.util
Class TrieStringMatcher
java.lang.Object
    org.apache.nutch.util.TrieStringMatcher
Direct Known Subclasses:
PrefixStringMatcher, SuffixStringMatcher
public abstract class TrieStringMatcher extends Object
TrieStringMatcher is a base class for simple tree-based string matching. This class is thread-safe during string matching but not when adding strings to the trie.
-
-
Nested Class Summary
Modifier and Type: protected class
Class: TrieStringMatcher.TrieNode
Description: Node class for the character tree.
-
Field Summary
Modifier and Type: protected TrieStringMatcher.TrieNode
Field: root
-
Constructor Summary
Modifier: protected
Constructor: TrieStringMatcher()
-
Method Summary
protected void addPatternBackward(String s)
    Adds any necessary nodes to the trie so that the given String can be decoded in reverse and the first character is represented by a terminal node.
protected void addPatternForward(String s)
    Adds any necessary nodes to the trie so that the given String can be decoded and the last character is represented by a terminal node.
abstract String longestMatch(String input)
    Returns the longest substring of input that is matched by a pattern in the trie, or null if no match exists.
protected TrieStringMatcher.TrieNode matchChar(TrieStringMatcher.TrieNode node, String s, int idx)
    Gets the next TrieStringMatcher.TrieNode visited, given that you are at node and that the next character in the input is the idx'th character of s.
abstract boolean matches(String input)
    Returns true if the given String is matched by a pattern in the trie.
abstract String shortestMatch(String input)
    Returns the shortest substring of input that is matched by a pattern in the trie, or null if no match exists.
-
-
-
Field Detail
-
root
protected TrieStringMatcher.TrieNode root
-
-
Method Detail
-
matchChar
protected final TrieStringMatcher.TrieNode matchChar(TrieStringMatcher.TrieNode node, String s, int idx)
Gets the next TrieStringMatcher.TrieNode visited, given that you are at node and that the next character in the input is the idx'th character of s. Can return null.
Parameters:
node - Input TrieStringMatcher.TrieNode containing child nodes
s - String to match character at indexed position
idx - Indexed position in input string
Returns:
child TrieStringMatcher.TrieNode
See Also:
TrieStringMatcher.TrieNode.getChild(char)
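The single-step contract described above can be illustrated with a small standalone sketch. It does not use the Nutch classes: `MatchCharSketch`, its `Node` type, and the `walk` helper are hypothetical stand-ins, and the map-based child lookup is an assumption rather than a description of how `TrieStringMatcher.TrieNode` stores its children.

```java
import java.util.HashMap;
import java.util.Map;

public class MatchCharSketch {
    // Hypothetical stand-in for TrieStringMatcher.TrieNode (not the Nutch class).
    static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean terminal;

        // Returns the child for c, or null when there is none.
        Node getChild(char c) {
            return children.get(c);
        }
    }

    // One step through the trie: from node, follow the idx'th character of s.
    // Like the documented method, this can return null.
    static Node matchChar(Node node, String s, int idx) {
        return node.getChild(s.charAt(idx));
    }

    // Counts how many leading characters of s can be followed from root.
    static int walk(Node root, String s) {
        Node node = root;
        int consumed = 0;
        while (consumed < s.length()) {
            Node next = matchChar(node, s, consumed);
            if (next == null) {
                break; // dead end: no child for this character
            }
            node = next;
            consumed++;
        }
        return consumed;
    }

    public static void main(String[] args) {
        Node root = new Node();
        Node h = new Node();
        Node t = new Node();
        root.children.put('h', h);
        h.children.put('t', t);
        t.terminal = true;
        System.out.println(walk(root, "http")); // prints 2
    }
}
```

Callers in this sketch always check the result for null before continuing, which is the pattern the "Can return null" note implies.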
-
addPatternForward
protected final void addPatternForward(String s)
Adds any necessary nodes to the trie so that the given String can be decoded and the last character is represented by a terminal node. Zero-length Strings are ignored.
Parameters:
s - String to be decoded.
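As a rough illustration of forward insertion, the sketch below adds one node per character from first to last and marks the node for the last character as terminal. `ForwardTrieSketch`, `Node`, and `find` are hypothetical names introduced here; the real Nutch implementation may organize its nodes differently.

```java
import java.util.HashMap;
import java.util.Map;

public class ForwardTrieSketch {
    // Hypothetical node type; not TrieStringMatcher.TrieNode.
    static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean terminal;
    }

    final Node root = new Node();

    // Inserts s front to back, creating nodes only where they are missing.
    void addPatternForward(String s) {
        if (s.isEmpty()) {
            return; // zero-length strings are ignored, as documented
        }
        Node node = root;
        for (int i = 0; i < s.length(); i++) {
            node = node.children.computeIfAbsent(s.charAt(i), c -> new Node());
        }
        node.terminal = true; // node for the last character
    }

    // Helper for inspection: the node reached by following path, or null.
    Node find(String path) {
        Node node = root;
        for (int i = 0; i < path.length() && node != null; i++) {
            node = node.children.get(path.charAt(i));
        }
        return node;
    }

    public static void main(String[] args) {
        ForwardTrieSketch trie = new ForwardTrieSketch();
        trie.addPatternForward("ab");
        trie.addPatternForward("abc");
        System.out.println(trie.find("a").terminal);   // false: "a" was never added
        System.out.println(trie.find("ab").terminal);  // true
        System.out.println(trie.find("abc").terminal); // true
    }
}
```

Patterns that share a prefix share nodes, so "ab" and "abc" need three nodes in total below the root.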
-
addPatternBackward
protected final void addPatternBackward(String s)
Adds any necessary nodes to the trie so that the given String can be decoded in reverse and the first character is represented by a terminal node. Zero-length Strings are ignored.
Parameters:
s - String to be decoded.
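Backward insertion can be sketched the same way, walking the pattern from its last character to its first so that the trie can later be followed from the end of an input string. The class below is a hypothetical illustration, not the Nutch code; `hasSuffixOf` is an invented helper showing why reverse storage suits suffix matching.

```java
import java.util.HashMap;
import java.util.Map;

public class BackwardTrieSketch {
    // Hypothetical node type; not TrieStringMatcher.TrieNode.
    static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean terminal;
    }

    final Node root = new Node();

    // Inserts s back to front; the node for the first character is terminal.
    void addPatternBackward(String s) {
        if (s.isEmpty()) {
            return; // zero-length strings are ignored, as documented
        }
        Node node = root;
        for (int i = s.length() - 1; i >= 0; i--) {
            node = node.children.computeIfAbsent(s.charAt(i), c -> new Node());
        }
        node.terminal = true;
    }

    // True if some stored pattern is a suffix of input (read input in reverse).
    boolean hasSuffixOf(String input) {
        Node node = root;
        for (int i = input.length() - 1; i >= 0; i--) {
            node = node.children.get(input.charAt(i));
            if (node == null) {
                return false;
            }
            if (node.terminal) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        BackwardTrieSketch trie = new BackwardTrieSketch();
        trie.addPatternBackward(".org");
        trie.addPatternBackward(".com");
        System.out.println(trie.hasSuffixOf("apache.org")); // true
        System.out.println(trie.hasSuffixOf("apache.net")); // false
    }
}
```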
-
matches
public abstract boolean matches(String input)
Returns true if the given String is matched by a pattern in the trie.
Parameters:
input - A String to be matched by a pattern
Returns:
true if there is a match, false otherwise
-
shortestMatch
public abstract String shortestMatch(String input)
Returns the shortest substring of input that is matched by a pattern in the trie, or null if no match exists.
Parameters:
input - A String to be matched by a pattern
Returns:
shortest string match or null if no match is made
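What counts as a match is left to the subclass. The sketch below assumes prefix semantics (a pattern matches when it is a prefix of the input) to show how the shortest and longest matches can differ for the same input. `PrefixMatchSketch` and its members are hypothetical and only mirror the documented method names; this is not the code of PrefixStringMatcher.

```java
import java.util.HashMap;
import java.util.Map;

public class PrefixMatchSketch {
    // Hypothetical node type; not TrieStringMatcher.TrieNode.
    static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean terminal;
    }

    private final Node root = new Node();

    void add(String pattern) {
        if (pattern.isEmpty()) {
            return;
        }
        Node node = root;
        for (int i = 0; i < pattern.length(); i++) {
            node = node.children.computeIfAbsent(pattern.charAt(i), c -> new Node());
        }
        node.terminal = true;
    }

    // Stops at the first terminal node reached while reading input.
    String shortestMatch(String input) {
        Node node = root;
        for (int i = 0; i < input.length(); i++) {
            node = node.children.get(input.charAt(i));
            if (node == null) {
                return null;
            }
            if (node.terminal) {
                return input.substring(0, i + 1);
            }
        }
        return null;
    }

    // Keeps reading past terminal nodes and remembers the last one seen.
    String longestMatch(String input) {
        Node node = root;
        String best = null;
        for (int i = 0; i < input.length(); i++) {
            node = node.children.get(input.charAt(i));
            if (node == null) {
                break;
            }
            if (node.terminal) {
                best = input.substring(0, i + 1);
            }
        }
        return best;
    }

    boolean matches(String input) {
        return shortestMatch(input) != null;
    }

    public static void main(String[] args) {
        PrefixMatchSketch m = new PrefixMatchSketch();
        m.add("ab");
        m.add("abcd");
        System.out.println(m.shortestMatch("abcdef")); // ab
        System.out.println(m.longestMatch("abcdef"));  // abcd
        System.out.println(m.shortestMatch("xyz"));    // null
    }
}
```

In this sketch `matches` is defined in terms of `shortestMatch`, since any match at all implies a shortest one.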
-
-