Package org.apache.nutch.fetcher
Class Fetcher.InputFormat
- java.lang.Object
  - org.apache.hadoop.mapreduce.InputFormat<K,V>
    - org.apache.hadoop.mapreduce.lib.input.FileInputFormat<K,V>
      - org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat<Text,CrawlDatum>
        - org.apache.nutch.fetcher.Fetcher.InputFormat
- Enclosing class:
  Fetcher
public static class Fetcher.InputFormat extends SequenceFileInputFormat<Text,CrawlDatum>
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
FileInputFormat.Counter
-
Field Summary
-
Fields inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
DEFAULT_LIST_STATUS_NUM_THREADS, INPUT_DIR, INPUT_DIR_NONRECURSIVE_IGNORE_SUBDIRS, INPUT_DIR_RECURSIVE, LIST_STATUS_NUM_THREADS, NUM_INPUT_FILES, PATHFILTER_CLASS, SPLIT_MAXSIZE, SPLIT_MINSIZE
-
Constructor Summary
Constructors
Constructor      Description
InputFormat()
-
Method Summary
Modifier and Type: List<InputSplit>
Method: getSplits(JobContext job)
Description: Don't split inputs, to keep things polite: a single fetch list must be processed in one fetcher task.
Methods inherited from class org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat
createRecordReader, getFormatMinSplitSize, listStatus
-
Methods inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
addInputPath, addInputPathRecursively, addInputPaths, computeSplitSize, getBlockIndex, getInputDirRecursive, getInputPathFilter, getInputPaths, getMaxSplitSize, getMinSplitSize, isSplitable, makeSplit, makeSplit, setInputDirRecursive, setInputPathFilter, setInputPaths, setInputPaths, setMaxInputSplitSize, setMinInputSplitSize
-
Method Detail
-
getSplits
public List<InputSplit> getSplits(JobContext job) throws IOException
Don't split inputs, to keep things polite: a single fetch list must be processed in one fetcher task. A fetch list is never split into parts that are assigned to multiple parallel tasks.
- Overrides:
getSplits in class FileInputFormat<Text,CrawlDatum>
- Throws:
IOException
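The effect of not splitting inputs can be illustrated with a simplified model in plain Java. This is a sketch only, not the Nutch or Hadoop implementation: the classes `Split` and `FetchListFile`, both methods, and the file names are hypothetical stand-ins, and the size-based variant ignores details of Hadoop's real split computation. It contrasts one split per fetch list file with size-based splitting, which would hand parts of the same fetch list to different tasks.

```java
import java.util.ArrayList;
import java.util.List;

public class WholeFileSplitSketch {

    /** Hypothetical stand-in for an input split: a byte range of one file. */
    public static final class Split {
        public final String path;
        public final long start;
        public final long length;

        public Split(String path, long start, long length) {
            this.path = path;
            this.start = start;
            this.length = length;
        }
    }

    /** Hypothetical stand-in for one fetch list file and its size in bytes. */
    public static final class FetchListFile {
        public final String path;
        public final long length;

        public FetchListFile(String path, long length) {
            this.path = path;
            this.length = length;
        }
    }

    /** One split per file, covering the whole file, so one task reads one fetch list. */
    public static List<Split> wholeFileSplits(List<FetchListFile> files) {
        List<Split> splits = new ArrayList<>();
        for (FetchListFile f : files) {
            splits.add(new Split(f.path, 0, f.length));
        }
        return splits;
    }

    /** For contrast: fixed-size splitting, which spreads one file over several splits. */
    public static List<Split> sizeBasedSplits(List<FetchListFile> files, long splitSize) {
        List<Split> splits = new ArrayList<>();
        for (FetchListFile f : files) {
            for (long start = 0; start < f.length; start += splitSize) {
                long len = Math.min(splitSize, f.length - start);
                splits.add(new Split(f.path, start, len));
            }
        }
        return splits;
    }

    public static void main(String[] args) {
        // Assumed example data: two fetch list files of 300 and 120 bytes.
        List<FetchListFile> files = List.of(
                new FetchListFile("part-r-00000", 300),
                new FetchListFile("part-r-00001", 120));

        System.out.println("whole-file splits: " + wholeFileSplits(files).size());
        System.out.println("size-based splits: " + sizeBasedSplits(files, 128).size());
    }
}
```

In this model the number of whole-file splits always equals the number of fetch list files, so the number of fetcher tasks follows the number of fetch lists rather than their size.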