Package org.apache.nutch.crawl
Class LinkDbReader
java.lang.Object
  org.apache.hadoop.conf.Configured
    org.apache.nutch.util.AbstractChecker
      org.apache.nutch.crawl.LinkDbReader
-
- All Implemented Interfaces:
Closeable, AutoCloseable, Configurable, Tool
public class LinkDbReader extends AbstractChecker implements Closeable
Read utility for the LinkDb.
-
-
Nested Class Summary
Modifier and Type    Class
static class         LinkDbReader.LinkDBDumpMapper
-
Field Summary
-
Fields inherited from class org.apache.nutch.util.AbstractChecker
keepClientCnxOpen, stdin, tcpPort, usage
-
-
Constructor Summary
Constructor
LinkDbReader()
LinkDbReader(Configuration conf, Path directory)
-
Method Summary
Modifier and Type    Method
void                 close()
String[]             getAnchors(Text url)
Inlinks              getInlinks(Text url)
void                 init(Path directory)
static void          main(String[] args)
void                 openReaders()
protected int        process(String line, StringBuilder output)
void                 processDumpJob(String linkdb, String output, String regex)
int                  run(String[] args)
-
Methods inherited from class org.apache.nutch.util.AbstractChecker
getProtocolOutput, parseArgs, processSingle, processStdin, processTCP, run
-
Methods inherited from class org.apache.hadoop.conf.Configured
getConf, setConf
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.hadoop.conf.Configurable
getConf, setConf
-
Constructor Detail
-
LinkDbReader
public LinkDbReader()
-
LinkDbReader
public LinkDbReader(Configuration conf, Path directory) throws Exception
- Throws:
Exception
-
-
Method Detail
-
openReaders
public void openReaders() throws IOException
- Throws:
IOException
-
getAnchors
public String[] getAnchors(Text url) throws IOException
- Throws:
IOException
-
getInlinks
public Inlinks getInlinks(Text url) throws IOException
- Throws:
IOException
-
close
public void close() throws IOException
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface Closeable
- Throws:
IOException
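
Because LinkDbReader implements Closeable (and therefore AutoCloseable), it can be managed with a try-with-resources statement so that close() runs even if a lookup throws. The sketch below illustrates only that pattern. DemoReader, lookup, and useAndClose are hypothetical stand-ins, not part of Nutch, so the example runs without Nutch or Hadoop on the classpath. With the real class, the resource expression would presumably be new LinkDbReader(conf, directory).

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

class CloseSketch {
  // Hypothetical stand-in for a reader that holds open resources.
  // It is NOT the real LinkDbReader; it only mimics the Closeable contract.
  static class DemoReader implements Closeable {
    boolean closed = false;

    // Hypothetical lookup that fails once the reader has been closed.
    String lookup(String url) throws IOException {
      if (closed) {
        throw new IOException("reader is closed");
      }
      return "inlinks for " + url;
    }

    @Override
    public void close() throws IOException {
      closed = true;
    }
  }

  // Uses the reader inside try-with-resources and reports whether
  // close() had been called by the time the block was left.
  static boolean useAndClose() {
    DemoReader reader = new DemoReader();
    try (DemoReader r = reader) {
      r.lookup("https://example.org/");
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return reader.closed;
  }

  public static void main(String[] args) {
    System.out.println("closed after try block: " + useAndClose());
  }
}
```

The same shape applies to any Closeable: the caller does not need an explicit finally block to release the reader.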
-
processDumpJob
public void processDumpJob(String linkdb, String output, String regex) throws IOException, InterruptedException, ClassNotFoundException
- Throws:
IOException
InterruptedException
ClassNotFoundException
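
This page does not describe what the regex parameter does. One plausible reading, treated here strictly as an assumption, is that it restricts the dump to entries whose URL matches the pattern in full. The sketch below shows that kind of filter using only java.util.regex. DumpRegexSketch and filterUrls are hypothetical names and are not part of Nutch, and the real job may apply the pattern differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class DumpRegexSketch {
  // Hypothetical helper, not part of Nutch: keeps only the URLs that the
  // pattern matches in full (Matcher.matches(), not Matcher.find()).
  // A null regex is treated as "no filter" (an assumption of this sketch).
  static List<String> filterUrls(List<String> urls, String regex) {
    if (regex == null) {
      return new ArrayList<>(urls);
    }
    Pattern pattern = Pattern.compile(regex);
    List<String> kept = new ArrayList<>();
    for (String url : urls) {
      if (pattern.matcher(url).matches()) {
        kept.add(url);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    List<String> urls = List.of(
        "https://example.org/docs/a.html",
        "https://example.org/blog/b.html",
        "https://example.com/docs/c.html");
    // Only URLs under https://example.org/docs/ pass this filter.
    System.out.println(filterUrls(urls, "https://example\\.org/docs/.*"));
  }
}
```

Under the full-match assumption, a pattern such as `docs` alone would match nothing, so a caller would need to write `.*docs.*` to select by substring.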
-
process
protected int process(String line, StringBuilder output) throws Exception
- Specified by:
process in class AbstractChecker
- Throws:
Exception
-
-