Package org.apache.nutch.tools
Class CommonCrawlFormatFactory
- java.lang.Object
  - org.apache.nutch.tools.CommonCrawlFormatFactory
-
public class CommonCrawlFormatFactory extends Object
Factory class that creates new CommonCrawlFormat objects (a.k.a. formatters)
that map crawled files to the CommonCrawl format.
-
Constructor Summary
- CommonCrawlFormatFactory()
-
Method Summary
- static CommonCrawlFormat getCommonCrawlFormat(String formatType, String url, Content content, Metadata metadata, Configuration nutchConf, CommonCrawlConfig config)
  Deprecated.
- static CommonCrawlFormat getCommonCrawlFormat(String formatType, Configuration nutchConf, CommonCrawlConfig config)
-
Method Detail
-
getCommonCrawlFormat
public static CommonCrawlFormat getCommonCrawlFormat(String formatType, String url, Content content, Metadata metadata, Configuration nutchConf, CommonCrawlConfig config) throws IOException
Deprecated.
Returns a new CommonCrawlFormat instance of the specified formatter type.
- Parameters:
formatType - the type of formatter to be created.
url - the URL.
content - the content.
metadata - the metadata.
nutchConf - the configuration.
config - the CommonCrawl output configuration.
- Returns:
the new CommonCrawlFormat object.
- Throws:
IOException - if any I/O error occurs.
-
getCommonCrawlFormat
public static CommonCrawlFormat getCommonCrawlFormat(String formatType, Configuration nutchConf, CommonCrawlConfig config) throws IOException
- Throws:
IOException
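The factory idea described on this page, where a string selects which formatter implementation is returned, can be illustrated with a small self-contained sketch. Every name in it (FormatFactorySketch, Formatter, JsonLikeFormatter, PlainFormatter, getFormatter, and the "json" and "plain" type strings) is hypothetical and chosen only for illustration. This page does not list the formatType values that CommonCrawlFormatFactory accepts, and the sketch does not reproduce the Nutch implementation or its behavior for unknown types.

```java
import java.util.Locale;

// Illustrative sketch of a string-keyed factory. All types here are
// hypothetical stand-ins, NOT the Nutch CommonCrawlFormat classes.
public class FormatFactorySketch {

    /** Stand-in for a formatter: maps a crawled record to an output string. */
    interface Formatter {
        String format(String url, String body);
    }

    /** Hypothetical formatter that emits a one-line JSON-like record. */
    static final class JsonLikeFormatter implements Formatter {
        @Override
        public String format(String url, String body) {
            return "{\"url\":\"" + url + "\",\"length\":" + body.length() + "}";
        }
    }

    /** Hypothetical formatter that emits a plain-text record. */
    static final class PlainFormatter implements Formatter {
        @Override
        public String format(String url, String body) {
            return url + " " + body.length();
        }
    }

    /**
     * Returns a new formatter for the given type string (case-insensitive).
     * Rejecting unknown types with an exception is a choice made for this
     * sketch only; it is an assumption, not documented Nutch behavior.
     */
    static Formatter getFormatter(String formatType) {
        if (formatType == null) {
            throw new IllegalArgumentException("formatType must not be null");
        }
        switch (formatType.toLowerCase(Locale.ROOT)) {
            case "json":
                return new JsonLikeFormatter();
            case "plain":
                return new PlainFormatter();
            default:
                throw new IllegalArgumentException("Unknown format type: " + formatType);
        }
    }

    public static void main(String[] args) {
        Formatter formatter = getFormatter("json");
        System.out.println(formatter.format("http://example.com/", "hello"));
    }
}
```

Callers depend only on the Formatter interface, so adding another output format in this sketch means adding one class and one case label without touching calling code.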