Interface Nutch

    • Field Detail

      • WRITABLE_GENERATE_TIME_KEY

        static final Text WRITABLE_GENERATE_TIME_KEY
      • PROTOCOL_STATUS_CODE_KEY

        static final Text PROTOCOL_STATUS_CODE_KEY
      • WRITABLE_PROTO_STATUS_KEY

        static final Text WRITABLE_PROTO_STATUS_KEY
      • CACHING_FORBIDDEN_KEY

        static final String CACHING_FORBIDDEN_KEY
        Sites may request that search engines don't provide access to cached documents.
        See Also:
        Constant Field Values
      • CACHING_FORBIDDEN_NONE

        static final String CACHING_FORBIDDEN_NONE
        Show both original forbidden content and summaries (default).
        See Also:
        Constant Field Values
      • CACHING_FORBIDDEN_ALL

        static final String CACHING_FORBIDDEN_ALL
        Don't show either original forbidden content or summaries.
        See Also:
        Constant Field Values
      • CACHING_FORBIDDEN_CONTENT

        static final String CACHING_FORBIDDEN_CONTENT
        Don't show original forbidden content, but show summaries.
        See Also:
        Constant Field Values
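        The three CACHING_FORBIDDEN_* values above describe a small display policy. The following is a minimal sketch of how a consumer might interpret them. The string values are placeholders (the real ones are listed under Constant Field Values), and the class name, helper methods, plain Map metadata, and handling of unrecognised values are all assumptions made for illustration, not part of Nutch.

```java
import java.util.Map;

final class CachingPolicySketch {
    // Placeholder stand-ins, NOT the real Nutch values; in real code, reference
    // the Nutch.CACHING_FORBIDDEN_* constants directly.
    static final String CACHING_FORBIDDEN_KEY = "placeholder.caching.forbidden";
    static final String CACHING_FORBIDDEN_NONE = "placeholder-none";
    static final String CACHING_FORBIDDEN_ALL = "placeholder-all";
    static final String CACHING_FORBIDDEN_CONTENT = "placeholder-content";

    // Hypothetical helper: read the policy, falling back to NONE (the documented default).
    static String policy(Map<String, String> metadata) {
        return metadata.getOrDefault(CACHING_FORBIDDEN_KEY, CACHING_FORBIDDEN_NONE);
    }

    // Cached original content is shown only under NONE.
    // Assumption: unrecognised values hide the content.
    static boolean showCachedContent(Map<String, String> metadata) {
        return CACHING_FORBIDDEN_NONE.equals(policy(metadata));
    }

    // Summaries are shown under NONE and CONTENT, and hidden only under ALL.
    static boolean showSummary(Map<String, String> metadata) {
        return !CACHING_FORBIDDEN_ALL.equals(policy(metadata));
    }

    public static void main(String[] args) {
        Map<String, String> page = Map.of(CACHING_FORBIDDEN_KEY, CACHING_FORBIDDEN_CONTENT);
        System.out.println("show cached content: " + showCachedContent(page));
        System.out.println("show summary: " + showSummary(page));
    }
}
```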
      • WRITABLE_REPR_URL_KEY

        static final Text WRITABLE_REPR_URL_KEY
      • FIXED_INTERVAL_KEY

        static final String FIXED_INTERVAL_KEY
        Used by AdaptiveFetchSchedule to maintain a custom fetch interval.
        See Also:
        Constant Field Values
      • WRITABLE_FIXED_INTERVAL_KEY

        static final Text WRITABLE_FIXED_INTERVAL_KEY
      • ARG_SEEDDIR

        static final String ARG_SEEDDIR
        Argument key to specify the location of the seed URL directory for the REST endpoints
        See Also:
        Constant Field Values
      • ARG_SEEDNAME

        static final String ARG_SEEDNAME
        Argument key to specify name of a seed list for the REST endpoints
        See Also:
        Constant Field Values
      • ARG_CRAWLDB

        static final String ARG_CRAWLDB
        Argument key to specify the location of crawldb for the REST endpoints
        See Also:
        Constant Field Values
      • ARG_LINKDB

        static final String ARG_LINKDB
        Argument key to specify the location of linkdb for the REST endpoints
        See Also:
        Constant Field Values
      • VAL_RESULT

        static final String VAL_RESULT
        Name of the key used in the Result Map sent back by the REST endpoint
        See Also:
        Constant Field Values
      • ARG_SEGMENTDIR

        static final String ARG_SEGMENTDIR
        Argument key to specify the location of a directory of segments for the REST endpoints. Similar to the -dir option in the bin/nutch script.
        See Also:
        Constant Field Values
      • ARG_SEGMENTS

        static final String ARG_SEGMENTS
        Argument key to specify the location of an individual segment or a list of segments for the REST endpoints. The behavior differs between endpoints: the CrawlDb, LinkDb and Indexing jobs take a list of segments, while the Fetcher and Parse Segment jobs take a single segment.
        See Also:
        Constant Field Values
      • ARG_HOSTDB

        static final String ARG_HOSTDB
        Argument key to specify the location of hostdb for the REST endpoints
        See Also:
        Constant Field Values
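        The ARG_* keys above name entries in an argument map passed to a REST job. The following is a rough sketch of the list-versus-single-segment distinction described for ARG_SEGMENTS. The key strings are placeholders, the paths are made up, and the Java types used for the values (a List for several segments, a String for one) are assumptions rather than the documented contract.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RestArgsSketch {
    // Placeholder stand-ins, NOT the real Nutch values; in real code, reference
    // Nutch.ARG_CRAWLDB and Nutch.ARG_SEGMENTS directly.
    static final String ARG_CRAWLDB = "placeholder-crawldb";
    static final String ARG_SEGMENTS = "placeholder-segments";

    // Hypothetical helper for jobs described as taking a list of segments
    // (CrawlDb, LinkDb, Indexing).
    static Map<String, Object> listJobArgs(String crawlDb, List<String> segments) {
        Map<String, Object> args = new LinkedHashMap<>();
        args.put(ARG_CRAWLDB, crawlDb);
        args.put(ARG_SEGMENTS, new ArrayList<>(segments)); // defensive copy
        return args;
    }

    // Hypothetical helper for jobs described as taking one segment
    // (Fetcher, Parse Segment).
    static Map<String, Object> singleSegmentArgs(String segment) {
        Map<String, Object> args = new LinkedHashMap<>();
        args.put(ARG_SEGMENTS, segment);
        return args;
    }

    public static void main(String[] args) {
        // Example paths are invented for illustration.
        System.out.println(listJobArgs("crawl/crawldb",
                List.of("crawl/segments/a", "crawl/segments/b")));
        System.out.println(singleSegmentArgs("crawl/segments/a"));
    }
}
```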
      • FETCH_EVENT_TITLE

        static final String FETCH_EVENT_TITLE
        Title key in the Pub/Sub event metadata for the title of the parsed page
        See Also:
        Constant Field Values
      • FETCH_EVENT_CONTENTTYPE

        static final String FETCH_EVENT_CONTENTTYPE
        Content-type key in the Pub/Sub event metadata for the content-type of the parsed page
        See Also:
        Constant Field Values
      • FETCH_EVENT_SCORE

        static final String FETCH_EVENT_SCORE
        Score key in the Pub/Sub event metadata for the score of the parsed page
        See Also:
        Constant Field Values
      • FETCH_EVENT_FETCHTIME

        static final String FETCH_EVENT_FETCHTIME
        Fetch time key in the Pub/Sub event metadata for the fetch time of the parsed page
        See Also:
        Constant Field Values
      • FETCH_EVENT_CONTENTLANG

        static final String FETCH_EVENT_CONTENTLANG
        Content-language key in the Pub/Sub event metadata for the content-language of the parsed page
        See Also:
        Constant Field Values
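        The FETCH_EVENT_* keys above together describe the metadata attached to a fetch event. The following is a minimal sketch of assembling such a map. The key strings are placeholders, and the class name, method, and value types (for example a float score and an epoch-millisecond fetch time) are assumptions for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class FetchEventSketch {
    // Placeholder stand-ins, NOT the real Nutch values; in real code, reference
    // the Nutch.FETCH_EVENT_* constants directly.
    static final String FETCH_EVENT_TITLE = "placeholder-title";
    static final String FETCH_EVENT_CONTENTTYPE = "placeholder-content-type";
    static final String FETCH_EVENT_SCORE = "placeholder-score";
    static final String FETCH_EVENT_FETCHTIME = "placeholder-fetch-time";
    static final String FETCH_EVENT_CONTENTLANG = "placeholder-content-language";

    // Hypothetical helper: collect the five documented fields into one map.
    static Map<String, Object> metadata(String title, String contentType,
                                        float score, long fetchTime, String contentLang) {
        Map<String, Object> event = new LinkedHashMap<>();
        event.put(FETCH_EVENT_TITLE, title);
        event.put(FETCH_EVENT_CONTENTTYPE, contentType);
        event.put(FETCH_EVENT_SCORE, score);
        event.put(FETCH_EVENT_FETCHTIME, fetchTime);
        event.put(FETCH_EVENT_CONTENTLANG, contentLang);
        return event;
    }

    public static void main(String[] args) {
        // Sample values are invented for illustration.
        System.out.println(metadata("Example page", "text/html", 1.0f, 0L, "en"));
    }
}
```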