Package org.apache.nutch.crawl
Class URLPartitioner
java.lang.Object
  org.apache.hadoop.mapreduce.Partitioner<Text,Writable>
    org.apache.nutch.crawl.URLPartitioner

All Implemented Interfaces: Configurable

public class URLPartitioner extends Partitioner<Text,Writable> implements Configurable
Partitions URLs by host, domain name, or IP address, depending on the value of the 'partition.url.mode' parameter, which can be 'byHost', 'byDomain', or 'byIP'.

Field Summary

Modifier and Type    Field                    Description
static String        PARTITION_MODE_DOMAIN
static String        PARTITION_MODE_HOST
static String        PARTITION_MODE_IP
static String        PARTITION_MODE_KEY

Constructor Summary

Constructor          Description
URLPartitioner()

Method Summary

Modifier and Type    Method                                                        Description
Configuration        getConf()
int                  getPartition(Text key, Writable value, int numReduceTasks)   Hash by host or domain name or IP address.
void                 setConf(Configuration conf)

Field Detail

PARTITION_MODE_KEY
public static final String PARTITION_MODE_KEY
See Also: Constant Field Values

PARTITION_MODE_HOST
public static final String PARTITION_MODE_HOST
See Also: Constant Field Values

PARTITION_MODE_DOMAIN
public static final String PARTITION_MODE_DOMAIN
See Also: Constant Field Values

PARTITION_MODE_IP
public static final String PARTITION_MODE_IP
See Also: Constant Field Values

Method Detail

setConf
public void setConf(Configuration conf)
Specified by: setConf in interface Configurable

getConf
public Configuration getConf()
Specified by: getConf in interface Configurable

getPartition
public int getPartition(Text key, Writable value, int numReduceTasks)
Hash by host or domain name or IP address.
Specified by: getPartition in class Partitioner<Text,Writable>
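
The idea of hashing a URL by its host can be illustrated with a minimal, dependency-free sketch. This is not the Nutch implementation: the class HostPartitionSketch and the method partitionFor are hypothetical names, only the 'byHost' case is covered, and the fallback of hashing the whole string when no host can be parsed is an assumption made for the example.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

// Hypothetical illustration only; not part of org.apache.nutch.crawl.
class HostPartitionSketch {

    /**
     * Maps a URL string to a partition index in [0, numReduceTasks),
     * so that all URLs sharing a host land in the same partition.
     */
    static int partitionFor(String urlString, int numReduceTasks) {
        // Assumption: fall back to the full string if no host can be extracted.
        String hashKey = urlString;
        try {
            String host = new URI(urlString).getHost();
            if (host != null) {
                hashKey = host.toLowerCase(Locale.ROOT);
            }
        } catch (URISyntaxException e) {
            // Malformed URL: keep the full string as the hash key.
        }
        // Clear the sign bit so the modulo result is never negative.
        return (hashKey.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int reducers = 4;
        String[] urls = {
            "http://example.com/a",
            "http://example.com/b?x=1",
            "http://example.org/"
        };
        for (String url : urls) {
            System.out.println(url + " -> partition " + partitionFor(url, reducers));
        }
    }
}
```

In this sketch the two example.com URLs always receive the same partition index, which is the property that lets a single reduce task handle all URLs of one host.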