Package org.apache.nutch.tools
Class ResolveUrls
- java.lang.Object
  - org.apache.nutch.tools.ResolveUrls
public class ResolveUrls extends Object
A simple tool that spins up multiple threads to resolve URLs to IP addresses. It can be used to verify that pages failing with UnknownHostException during fetching are actually bad, rather than failing because of a DNS problem at fetch time.
Constructor Summary
Constructor                                   Description
ResolveUrls(String urlsFile)                  Create a new ResolveUrls with a file from the local file system.
ResolveUrls(String urlsFile, int numThreads)  Create a new ResolveUrls with a URL file and a number of threads for the thread pool.
Method Summary
Modifier and Type  Method               Description
static void        main(String[] args)  Runs the resolve URLs tool.
void               resolveUrls()        Creates a thread pool for resolving URLs.
Constructor Detail
ResolveUrls
public ResolveUrls(String urlsFile)
Create a new ResolveUrls with a file from the local file system.
Parameters:
urlsFile - The local URL file, one URL per line.
ResolveUrls
public ResolveUrls(String urlsFile, int numThreads)
Create a new ResolveUrls with a URL file and a number of threads for the thread pool. The number of threads defaults to 100.
Parameters:
urlsFile - The local URL file, one URL per line.
numThreads - The number of threads used to resolve URLs in parallel.
Method Detail
resolveUrls
public void resolveUrls()
Creates a thread pool for resolving URLs and reads the URL file from the local file system. It attempts to resolve each URL, keeping a running total of the number resolved, the number that errored, and the elapsed time.
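The behaviour described above can be illustrated with a minimal, self-contained sketch. This is not the Nutch implementation: the class UrlResolveSketch, the HostResolver hook, the Totals holder, and the resolveAll method are hypothetical names introduced only for illustration, and the one-hour wait is an arbitrary assumption. The sketch takes an in-memory list rather than reading a file, and it shows only the general technique of a fixed thread pool, one lookup per URL, and thread-safe counters.

```java
import java.net.InetAddress;
import java.net.URI;
import java.net.UnknownHostException;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, not the Nutch source.
final class UrlResolveSketch {

    /** Hypothetical hook so the lookup can be replaced, for example in tests. */
    interface HostResolver {
        void resolve(String host) throws UnknownHostException;
    }

    /** Totals gathered over one run. */
    static final class Totals {
        final int resolved;
        final int errored;
        final long elapsedMillis;

        Totals(int resolved, int errored, long elapsedMillis) {
            this.resolved = resolved;
            this.errored = errored;
            this.elapsedMillis = elapsedMillis;
        }
    }

    /** Default lookup: ask the JVM's resolver for the host's IP address. */
    static final HostResolver DNS = host -> InetAddress.getByName(host);

    /** Resolves the host of every URL on a fixed-size pool and returns the totals. */
    static Totals resolveAll(List<String> urls, int numThreads, HostResolver resolver) {
        AtomicInteger resolved = new AtomicInteger();
        AtomicInteger errored = new AtomicInteger();
        long start = System.currentTimeMillis();

        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        for (String url : urls) {
            pool.execute(() -> {
                try {
                    // URI.create throws IllegalArgumentException on malformed input.
                    String host = URI.create(url.trim()).getHost();
                    if (host == null) {
                        throw new UnknownHostException(url);
                    }
                    resolver.resolve(host);
                    resolved.incrementAndGet();
                } catch (UnknownHostException | IllegalArgumentException e) {
                    errored.incrementAndGet();
                }
            });
        }

        // Stop accepting work, then wait for the queued lookups to finish.
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.HOURS); // assumed upper bound
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new Totals(resolved.get(), errored.get(),
                System.currentTimeMillis() - start);
    }
}
```

Calling UrlResolveSketch.resolveAll(urls, 100, UrlResolveSketch.DNS) would perform real DNS lookups. Passing a different HostResolver keeps the counting logic testable without network access.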
main
public static void main(String[] args)
Runs the resolve URLs tool.
Parameters:
args - The input arguments for this tool. Running with 'help' will print parameter options.