Package org.apache.nutch.fetcher
The Nutch multi-threaded fetching module
Class Summary

Class                    Description
Fetcher                  A queue-based fetcher.
Fetcher.FetcherRun
Fetcher.InputFormat
FetcherOutputFormat      Splits FetcherOutput entries into multiple map files.
FetcherThread            This class picks items from queues and fetches the pages.
FetcherThreadEvent       This class is used to capture the various events occurring at fetch time.
FetcherThreadPublisher   This class handles the publishing of the events to the queue implementation.
FetchItem                This class describes the item to be fetched.
FetchItemQueue           This class handles FetchItems which come from the same host ID (be it a proto/hostname or proto/IP pair).
FetchItemQueues          A collection of queues that keeps track of the total number of items, and provides items eligible for fetching from any queue.
FetchNode
FetchNodeDb
QueueFeeder              This class feeds the queues with input items, and re-fills them as items are consumed by FetcherThread-s.
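The queue model described above (one queue per host ID, a collection that tracks the total item count and hands out the next item from any queue) can be illustrated with a minimal sketch. This is not the Nutch implementation: the class name HostQueueSketch and its methods queueId, add, next and size are hypothetical, the host ID is assumed to be a lowercase "proto://hostname" string, and "eligible" here only means "the queue is non-empty". Any scheduling rules the real classes apply are left out.

```java
import java.net.URI;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration of per-host fetch queues; not the Nutch classes.
class HostQueueSketch {
    // One FIFO queue per host ID, kept in rotation order. Empty queues are never stored.
    private final Map<String, Deque<String>> queues = new LinkedHashMap<>();
    private int totalSize = 0;

    // Assumed host ID format: "proto://hostname", lowercased.
    // Expects an absolute URL that has a host component.
    static String queueId(String url) {
        URI uri = URI.create(url);
        return (uri.getScheme() + "://" + uri.getHost()).toLowerCase(Locale.ROOT);
    }

    // Append a URL to the queue of its host, creating the queue on first use.
    void add(String url) {
        queues.computeIfAbsent(queueId(url), id -> new ArrayDeque<>()).addLast(url);
        totalSize++;
    }

    // Take one URL from the queue at the front of the rotation, then move that
    // queue to the back so that hosts are served in turn. Returns null when empty.
    String next() {
        Iterator<Map.Entry<String, Deque<String>>> it = queues.entrySet().iterator();
        if (!it.hasNext()) {
            return null;
        }
        Map.Entry<String, Deque<String>> entry = it.next();
        it.remove();
        Deque<String> queue = entry.getValue();
        String url = queue.pollFirst();
        totalSize--;
        if (!queue.isEmpty()) {
            queues.put(entry.getKey(), queue); // re-insert at the end of the rotation
        }
        return url;
    }

    // Total number of items across all queues.
    int size() {
        return totalSize;
    }
}
```

In this sketch, adding two URLs for http://a.example and then one for https://b.example yields them in alternating host order: the first a.example URL, the b.example URL, then the second a.example URL. A feeder in the spirit of QueueFeeder would call add whenever size() drops below some threshold, while worker threads would call next; both would need synchronization, which is omitted here.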
Enum Summary

Enum                                  Description
FetcherThreadEvent.PublishEventType   Type of event to specify start, end or reporting of a fetch item.