Package org.apache.nutch.crawl
Class Injector.InjectMapper
- java.lang.Object
-
- org.apache.hadoop.mapreduce.Mapper<Text,Writable,Text,CrawlDatum>
-
- org.apache.nutch.crawl.Injector.InjectMapper
-
- Enclosing class:
- Injector
public static class Injector.InjectMapper extends Mapper<Text,Writable,Text,CrawlDatum>
InjectMapper reads
- the CrawlDb that the seeds are injected into
- the plain-text seed files, parsing each line into the URL and metadata. Seed URLs are passed to the reducer with STATUS_INJECTED.
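The seed-line parsing described above can be illustrated with a minimal, Hadoop-free sketch. It is not the Nutch implementation: the class name SeedLine and its parse method are hypothetical, and the values given to TAB_CHARACTER ("\t") and EQUAL_CHARACTER ("=") are assumptions, since this page lists the constants without their values. The sketch assumes each seed line holds a URL followed by optional tab-separated key=value metadata pairs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: splits one seed line into a URL and its metadata. */
final class SeedLine {
    // Assumed values; the page above names these constants but not their contents.
    static final String TAB_CHARACTER = "\t";
    static final String EQUAL_CHARACTER = "=";

    final String url;
    final Map<String, String> metadata;

    private SeedLine(String url, Map<String, String> metadata) {
        this.url = url;
        this.metadata = metadata;
    }

    /** Parses "url<TAB>key=value<TAB>key=value..." into its parts. */
    static SeedLine parse(String line) {
        String[] fields = line.trim().split(TAB_CHARACTER);
        String url = fields[0].trim();
        Map<String, String> metadata = new LinkedHashMap<>();
        for (int i = 1; i < fields.length; i++) {
            int eq = fields[i].indexOf(EQUAL_CHARACTER);
            if (eq <= 0) {
                continue; // skip tokens that have no key before the separator
            }
            String key = fields[i].substring(0, eq).trim();
            String value = fields[i].substring(eq + 1).trim();
            metadata.put(key, value);
        }
        return new SeedLine(url, metadata);
    }
}
```

For example, SeedLine.parse("http://example.com/\tcategory=news") yields the URL http://example.com/ with a single metadata entry mapping category to news. In the real mapper, the parsed URL would additionally be normalized and filtered before being emitted with STATUS_INJECTED; those steps are omitted here.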
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.Mapper
Mapper.Context
-
-
Field Summary
Fields
Modifier and Type    Field                    Description
static String        EQUAL_CHARACTER
static String        TAB_CHARACTER
static String        URL_NORMALIZING_SCOPE
-
Constructor Summary
Constructors
Constructor       Description
InjectMapper()
-
Method Summary
Modifier and Type    Method                                                   Description
void                 map(Text key, Writable value, Mapper.Context context)
void                 setup(Mapper.Context context)
-
-
-
Field Detail
-
URL_NORMALIZING_SCOPE
public static final String URL_NORMALIZING_SCOPE
- See Also:
- Constant Field Values
-
TAB_CHARACTER
public static final String TAB_CHARACTER
- See Also:
- Constant Field Values
-
EQUAL_CHARACTER
public static final String EQUAL_CHARACTER
- See Also:
- Constant Field Values
-
-
Method Detail
-
setup
public void setup(Mapper.Context context)
-
map
public void map(Text key, Writable value, Mapper.Context context) throws IOException, InterruptedException
- Overrides:
map
in class Mapper<Text,Writable,Text,CrawlDatum>
- Throws:
IOException
InterruptedException
-
-