Package org.apache.nutch.parse
Class ParseResult
java.lang.Object
    org.apache.nutch.parse.ParseResult
public class ParseResult extends Object implements Iterable<Map.Entry<Text,Parse>>
A utility class that stores the result of a parse. Internally, a ParseResult stores <Text, Parse> pairs.
Parsers may return multiple results, which correspond to parts or other associated documents related to the original URL.
There is usually one parse result that corresponds directly to the original URL, and possibly many (or none) that correspond to derived URLs (or sub-URLs).
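The container idea described above can be sketched with plain JDK types. This is a simplified stand-in, not the Nutch implementation: the class name `ParseResultSketch`, the use of `String` in place of `Text` and `Parse`, the insertion-ordered map, and the example URLs are all assumptions made for illustration.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for org.apache.nutch.parse.ParseResult; not the Nutch class.
class ParseResultSketch implements Iterable<Map.Entry<String, String>> {
    private final String originalUrl;
    // Maps a URL or sub-URL to its parse output (a plain String here for brevity).
    // LinkedHashMap is a choice of this sketch so that iteration order is predictable.
    private final Map<String, String> parseMap = new LinkedHashMap<>();

    ParseResultSketch(String originalUrl) {
        this.originalUrl = originalUrl;
    }

    String originalUrl() {
        return originalUrl;
    }

    void put(String key, String parseOutput) {
        parseMap.put(key, parseOutput);
    }

    // Returns the output stored under key, or null if there is none.
    String get(String key) {
        return parseMap.get(key);
    }

    int size() {
        return parseMap.size();
    }

    boolean isEmpty() {
        return parseMap.isEmpty();
    }

    @Override
    public Iterator<Map.Entry<String, String>> iterator() {
        return parseMap.entrySet().iterator();
    }

    public static void main(String[] args) {
        ParseResultSketch result = new ParseResultSketch("http://example.com/feed.xml");
        // One entry for the original URL, one for a derived sub-URL.
        result.put(result.originalUrl(), "feed overview text");
        result.put("http://example.com/item/1", "first item text");
        for (Map.Entry<String, String> entry : result) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```

Because the sketch implements `Iterable`, callers can walk all stored outputs with an enhanced `for` loop, which mirrors the `Iterable<Map.Entry<Text,Parse>>` declaration in the class signature.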
-
-
Constructor Summary
ParseResult(String originalUrl)
    Create a container for parse results.
-
Method Summary
static ParseResult createParseResult(String url, Parse parse)
    Convenience method for obtaining a ParseResult from a single Parse output.
void filter()
    Remove all results whose status is not successful (as determined by ParseStatus.isSuccess()).
Parse get(String key)
    Retrieve a single parse output.
Parse get(Text key)
    Retrieve a single parse output.
boolean isAnySuccess()
    A convenience method which returns true if at least one of the parses is successful.
boolean isEmpty()
    Checks whether the result is empty.
boolean isSuccess()
    A convenience method which returns true only if all parses are successful.
Iterator<Map.Entry<Text,Parse>> iterator()
    Iterate over all entries in the <url, Parse> map.
void put(String key, ParseText text, ParseData data)
    Store a result of parsing.
void put(Text key, ParseText text, ParseData data)
    Store a result of parsing.
int size()
    Return the number of parse outputs (both successful and failed).
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface java.lang.Iterable
forEach, spliterator
-
-
-
-
Constructor Detail
-
ParseResult
public ParseResult(String originalUrl)
Create a container for parse results.
Parameters:
originalUrl - the original URL from which all parse results have been obtained.
-
-
Method Detail
-
createParseResult
public static ParseResult createParseResult(String url, Parse parse)
Convenience method for obtaining a ParseResult from a single Parse output.
Parameters:
url - canonical URL.
parse - single parse output.
Returns:
result containing the single parse output.
-
isEmpty
public boolean isEmpty()
Checks whether the result is empty.
Returns:
true if empty, false otherwise
-
size
public int size()
Return the number of parse outputs (both successful and failed).
Returns:
an int representing the parse map size
-
get
public Parse get(String key)
Retrieve a single parse output.
Parameters:
key - sub-URL under which the parse output is stored.
Returns:
parse output corresponding to this sub-URL, or null.
-
get
public Parse get(Text key)
Retrieve a single parse output.
Parameters:
key - sub-URL under which the parse output is stored.
Returns:
parse output corresponding to this sub-URL, or null.
-
put
public void put(Text key, ParseText text, ParseData data)
Store a result of parsing.
Parameters:
key - URL or sub-URL of this parse result
text - plain text result
data - corresponding parse metadata of this result
-
put
public void put(String key, ParseText text, ParseData data)
Store a result of parsing.
Parameters:
key - URL or sub-URL of this parse result
text - plain text result
data - corresponding parse metadata of this result
-
iterator
public Iterator<Map.Entry<Text,Parse>> iterator()
Iterate over all entries in the <url, Parse> map.
-
filter
public void filter()
Remove all results whose status is not successful (as determined by ParseStatus.isSuccess()). Note that the effects of this operation cannot be reversed.
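The in-place removal described above can be illustrated with a minimal sketch. It assumes a plain `Map<String, Boolean>` in which the value stands in for the outcome of ParseStatus.isSuccess(); the class name `FilterSketch` and the example URLs are hypothetical, and this is not the Nutch code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of in-place filtering; not the Nutch implementation.
class FilterSketch {
    // Removes every entry whose success flag is false. The map itself is
    // modified, so the dropped entries are gone once this returns.
    static void filter(Map<String, Boolean> successByUrl) {
        successByUrl.values().removeIf(success -> !success);
    }

    public static void main(String[] args) {
        Map<String, Boolean> successByUrl = new LinkedHashMap<>();
        successByUrl.put("http://example.com/archive.zip", true);
        successByUrl.put("http://example.com/archive.zip!/broken.bin", false);

        filter(successByUrl);

        // Only the successful entry remains.
        System.out.println(successByUrl.keySet()); // prints [http://example.com/archive.zip]
    }
}
```

Since nothing keeps a copy of the removed entries, a caller that still needs the failed outputs (for example, to log their status) would have to read them before filtering.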
-
isSuccess
public boolean isSuccess()
A convenience method which returns true only if all parses are successful. Parse success is determined by ParseStatus.isSuccess().
Returns:
true if overall result is a success, false otherwise
-
isAnySuccess
public boolean isAnySuccess()
A convenience method which returns true if at least one of the parses is successful. Parse success is determined by ParseStatus.isSuccess().
Returns:
true if at least one result is a success, false otherwise
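The difference between the "all successful" and "at least one successful" checks can be sketched as follows. The sketch assumes each parse is reduced to a Boolean success flag; the class name `SuccessCheckSketch` is hypothetical, and the stream-based logic is an illustration rather than the Nutch source.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical illustration of the two success checks; not the Nutch implementation.
class SuccessCheckSketch {
    // True only if every flag is true (the "all parses successful" case).
    static boolean isSuccess(Collection<Boolean> statuses) {
        return statuses.stream().allMatch(Boolean::booleanValue);
    }

    // True if at least one flag is true.
    static boolean isAnySuccess(Collection<Boolean> statuses) {
        return statuses.stream().anyMatch(Boolean::booleanValue);
    }

    public static void main(String[] args) {
        // One successful parse and one failed parse.
        List<Boolean> mixed = Arrays.asList(true, false);
        System.out.println(isSuccess(mixed));    // prints false
        System.out.println(isAnySuccess(mixed)); // prints true
    }
}
```

For a result that mixes successful and failed parses, the two checks disagree, which is the case where a caller has to decide whether partial output is acceptable.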
-
-