Package org.apache.nutch.metadata
Class SpellCheckedMetadata
- java.lang.Object
  - org.apache.nutch.metadata.Metadata
    - org.apache.nutch.metadata.SpellCheckedMetadata
- All Implemented Interfaces:
Writable, CreativeCommons, DublinCore, Feed, HttpHeaders, Nutch
public class SpellCheckedMetadata extends Metadata
A decorator to Metadata that adds spellchecking capabilities to property names. The spelling vocabulary currently contains only the HTTP headers from the HttpHeaders class.
-
-
Field Summary
-
Fields inherited from interface org.apache.nutch.metadata.CreativeCommons
LICENSE_LOCATION, LICENSE_URL, WORK_TYPE
-
Fields inherited from interface org.apache.nutch.metadata.DublinCore
CONTRIBUTOR, COVERAGE, CREATOR, DATE, DESCRIPTION, FORMAT, IDENTIFIER, LANGUAGE, MODIFIED, PUBLISHER, RELATION, RIGHTS, SOURCE, SUBJECT, TITLE, TYPE
-
Fields inherited from interface org.apache.nutch.metadata.Feed
FEED, FEED_AUTHOR, FEED_PUBLISHED, FEED_TAGS, FEED_UPDATED
-
Fields inherited from interface org.apache.nutch.metadata.HttpHeaders
CLIENT_TRANSFER_ENCODING, CONTENT_DISPOSITION, CONTENT_ENCODING, CONTENT_LANGUAGE, CONTENT_LENGTH, CONTENT_LOCATION, CONTENT_MD5, CONTENT_TYPE, IF_MODIFIED_SINCE, LAST_MODIFIED, LOCATION, TRANSFER_ENCODING, USER_AGENT, WRITABLE_CONTENT_TYPE
-
Fields inherited from interface org.apache.nutch.metadata.Nutch
ARG_CRAWLDB, ARG_HOSTDB, ARG_LINKDB, ARG_SEEDDIR, ARG_SEEDNAME, ARG_SEGMENTDIR, ARG_SEGMENTS, CACHING_FORBIDDEN_ALL, CACHING_FORBIDDEN_CONTENT, CACHING_FORBIDDEN_KEY, CACHING_FORBIDDEN_NONE, CHAR_ENCODING_FOR_CONVERSION, CRAWL_ID_KEY, FETCH_EVENT_CONTENTLANG, FETCH_EVENT_CONTENTTYPE, FETCH_EVENT_FETCHTIME, FETCH_EVENT_SCORE, FETCH_EVENT_TITLE, FETCH_STATUS_KEY, FETCH_TIME_KEY, FIXED_INTERVAL_KEY, GENERATE_TIME_KEY, ORIGINAL_CHAR_ENCODING, PROTO_STATUS_KEY, PROTOCOL_STATUS_CODE_KEY, REPR_URL_KEY, ROBOTS_METATAG, SCORE_KEY, SEGMENT_NAME_KEY, SIGNATURE_KEY, STAT_PROGRESS, VAL_RESULT, WRITABLE_FIXED_INTERVAL_KEY, WRITABLE_GENERATE_TIME_KEY, WRITABLE_PROTO_STATUS_KEY, WRITABLE_REPR_URL_KEY
-
-
Constructor Summary
Constructors
SpellCheckedMetadata()
-
Method Summary
void add(String name, String value)
Add a metadata name/value mapping.
String get(String name)
Get the value associated to a metadata name.
static String getNormalizedName(String name)
Get the normalized name of metadata attribute name.
String[] getValues(String name)
Get the values associated to a metadata name.
void remove(String name)
Remove a metadata and all its associated values.
void set(String name, String value)
Set metadata name/value.
-
-
-
Method Detail
-
getNormalizedName
public static String getNormalizedName(String name)
Get the normalized name of metadata attribute name. This method tries to find a well-known metadata name (one of the metadata names defined in this class) that matches the specified name. The matching is error tolerant. For instance,
- content-type gives Content-Type
- CoNtEntType gives Content-Type
- ConTnTtYpe gives Content-Type
Parameters:
name - HTTP header name to normalize
Returns:
normalized HTTP header name
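The error-tolerant matching described above can be illustrated with a small self-contained sketch. It is not the Nutch implementation: the class name HeaderNameNormalizerSketch, the five-entry vocabulary, the letters-only simplification, the edit-distance threshold of one third of the name length, and returning the input unchanged when nothing matches are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of error-tolerant header name matching.
// Not the Nutch source; vocabulary and threshold are assumptions.
class HeaderNameNormalizerSketch {

    // Canonical header names keyed by their simplified (letters-only, lowercase) form.
    private static final Map<String, String> VOCABULARY = new LinkedHashMap<>();

    static {
        String[] canonicalNames = {
            "Content-Type", "Content-Length", "Content-Encoding", "Last-Modified", "Location"
        };
        for (String canonical : canonicalNames) {
            VOCABULARY.put(simplify(canonical), canonical);
        }
    }

    // Keep letters only and lowercase them, so "Content-Type" becomes "contenttype".
    static String simplify(String name) {
        StringBuilder sb = new StringBuilder();
        for (char c : name.toCharArray()) {
            if (Character.isLetter(c)) {
                sb.append(Character.toLowerCase(c));
            }
        }
        return sb.toString();
    }

    // Classic dynamic-programming Levenshtein edit distance using two rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Return the closest canonical name, or the input unchanged if nothing is close enough.
    static String normalize(String name) {
        String key = simplify(name);
        String exact = VOCABULARY.get(key);
        if (exact != null) {
            return exact;
        }
        int threshold = key.length() / 3; // assumed tolerance
        String best = name;
        int bestDistance = Integer.MAX_VALUE;
        for (Map.Entry<String, String> entry : VOCABULARY.entrySet()) {
            int distance = levenshtein(key, entry.getKey());
            if (distance < threshold && distance < bestDistance) {
                best = entry.getValue();
                bestDistance = distance;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(normalize("content-type"));
        System.out.println(normalize("CoNtEntType"));
        System.out.println(normalize("ConTnTtYpe"));
    }
}
```

Under these assumptions, the first two inputs match exactly once case and punctuation are ignored, and the third is one edit away from the simplified form of Content-Type, which is within the assumed tolerance.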
-
remove
public void remove(String name)
Description copied from class: Metadata
Remove a metadata and all its associated values.
-
add
public void add(String name, String value)
Description copied from class: Metadata
Add a metadata name/value mapping. Add the specified value to the list of values associated to the specified metadata name.
-
getValues
public String[] getValues(String name)
Description copied from class: Metadata
Get the values associated to a metadata name.
-
get
public String get(String name)
Description copied from class: Metadata
Get the value associated to a metadata name. If many values are associated to the specified name, then the first one is returned.
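The multi-valued behaviour of add, get, getValues and remove described above can be sketched with a plain map of lists. This is a stand-in for illustration, not the Metadata class itself: the class name MultiValuedMetadataSketch is hypothetical, and returning null or an empty array for an unknown name, as well as set replacing any existing values, are assumptions that this page does not state.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in showing multi-valued name/value semantics.
class MultiValuedMetadataSketch {

    private final Map<String, List<String>> values = new LinkedHashMap<>();

    // Append a value to the list of values stored under the name.
    void add(String name, String value) {
        values.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    // Assumption: set discards earlier values and stores a single one.
    void set(String name, String value) {
        List<String> single = new ArrayList<>();
        single.add(value);
        values.put(name, single);
    }

    // Return the first value if several are stored; null for an unknown name (assumption).
    String get(String name) {
        List<String> list = values.get(name);
        return (list == null || list.isEmpty()) ? null : list.get(0);
    }

    // Return all values in insertion order; empty array for an unknown name (assumption).
    String[] getValues(String name) {
        List<String> list = values.get(name);
        return list == null ? new String[0] : list.toArray(new String[0]);
    }

    // Remove the name together with all of its values.
    void remove(String name) {
        values.remove(name);
    }

    public static void main(String[] args) {
        MultiValuedMetadataSketch meta = new MultiValuedMetadataSketch();
        meta.add("Content-Type", "text/html");
        meta.add("Content-Type", "text/plain");
        System.out.println(meta.get("Content-Type"));
        System.out.println(meta.getValues("Content-Type").length);
    }
}
```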
-
-