Package org.apache.nutch.hostdb
Class UpdateHostDb
java.lang.Object
  org.apache.hadoop.conf.Configured
    org.apache.nutch.hostdb.UpdateHostDb

All Implemented Interfaces:
Configurable, Tool

public class UpdateHostDb extends Configured implements Tool
Tool to create a HostDB from the CrawlDB. It aggregates fetch status values by host and checks DNS entries for hosts.
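
The aggregation step described above can be illustrated with a minimal, self-contained sketch. It is not the Nutch implementation, which runs as a Hadoop job over the CrawlDB. The names `HostStatusSketch`, `Entry`, and `countByHost`, and the status strings used here, are hypothetical and exist only for illustration. The sketch groups URLs by host and counts how often each fetch status occurs per host; the DNS check is left out.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class HostStatusSketch {

    /** Hypothetical stand-in for one CrawlDB record: a URL and its fetch status. */
    static final class Entry {
        final String url;
        final String status;

        Entry(String url, String status) {
            this.url = url;
            this.status = status;
        }
    }

    /**
     * Counts fetch statuses per host. Entries whose URL has no host
     * component (for example mailto: URIs) are skipped. A syntactically
     * invalid URL makes URI.create throw IllegalArgumentException.
     */
    static Map<String, Map<String, Integer>> countByHost(List<Entry> entries) {
        Map<String, Map<String, Integer>> byHost = new TreeMap<>();
        for (Entry e : entries) {
            String host = URI.create(e.url).getHost();
            if (host == null) {
                continue; // no host to aggregate under
            }
            byHost.computeIfAbsent(host, h -> new TreeMap<>())
                  .merge(e.status, 1, Integer::sum);
        }
        return byHost;
    }

    public static void main(String[] args) {
        // Illustrative input only; the status names are assumptions.
        List<Entry> entries = Arrays.asList(
            new Entry("https://example.org/a", "fetched"),
            new Entry("https://example.org/b", "fetched"),
            new Entry("https://example.org/c", "gone"),
            new Entry("https://nutch.apache.org/", "fetched"));
        System.out.println(countByHost(entries));
    }
}
```

A sorted map is used so that the printed result is deterministic; a real aggregation would also track the numeric and string fields configured for the HostDB.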

Field Summary

Modifier and Type    Field
static String        HOSTDB_CHECK_FAILED
static String        HOSTDB_CHECK_KNOWN
static String        HOSTDB_CHECK_NEW
static String        HOSTDB_CRAWLDATUM_PROCESSORS
static String        HOSTDB_FORCE_CHECK
static String        HOSTDB_NUM_RESOLVER_THREADS
static String        HOSTDB_NUMERIC_FIELDS
static String        HOSTDB_PERCENTILES
static String        HOSTDB_PURGE_FAILED_HOSTS_THRESHOLD
static String        HOSTDB_RECHECK_INTERVAL
static String        HOSTDB_STRING_FIELDS
static String        HOSTDB_URL_FILTERING
static String        HOSTDB_URL_NORMALIZING
static String        LOCK_NAME

Constructor Summary

Constructor
UpdateHostDb()

Method Summary

Modifier and Type    Method
static void          main(String[] args)
int                  run(String[] args)

Methods inherited from class org.apache.hadoop.conf.Configured
getConf, setConf

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface org.apache.hadoop.conf.Configurable
getConf, setConf

Field Detail

LOCK_NAME
public static final String LOCK_NAME
See Also: Constant Field Values

HOSTDB_PURGE_FAILED_HOSTS_THRESHOLD
public static final String HOSTDB_PURGE_FAILED_HOSTS_THRESHOLD
See Also: Constant Field Values

HOSTDB_NUM_RESOLVER_THREADS
public static final String HOSTDB_NUM_RESOLVER_THREADS
See Also: Constant Field Values

HOSTDB_RECHECK_INTERVAL
public static final String HOSTDB_RECHECK_INTERVAL
See Also: Constant Field Values

HOSTDB_CHECK_FAILED
public static final String HOSTDB_CHECK_FAILED
See Also: Constant Field Values

HOSTDB_CHECK_NEW
public static final String HOSTDB_CHECK_NEW
See Also: Constant Field Values

HOSTDB_CHECK_KNOWN
public static final String HOSTDB_CHECK_KNOWN
See Also: Constant Field Values

HOSTDB_FORCE_CHECK
public static final String HOSTDB_FORCE_CHECK
See Also: Constant Field Values

HOSTDB_URL_FILTERING
public static final String HOSTDB_URL_FILTERING
See Also: Constant Field Values

HOSTDB_URL_NORMALIZING
public static final String HOSTDB_URL_NORMALIZING
See Also: Constant Field Values

HOSTDB_NUMERIC_FIELDS
public static final String HOSTDB_NUMERIC_FIELDS
See Also: Constant Field Values

HOSTDB_STRING_FIELDS
public static final String HOSTDB_STRING_FIELDS
See Also: Constant Field Values

HOSTDB_PERCENTILES
public static final String HOSTDB_PERCENTILES
See Also: Constant Field Values

HOSTDB_CRAWLDATUM_PROCESSORS
public static final String HOSTDB_CRAWLDATUM_PROCESSORS
See Also: Constant Field Values