Uses of Class org.apache.nutch.crawl.CrawlDatum (apache-nutch 1.20 API)

Packages that use CrawlDatum
Package	Description
org.apache.nutch.analysis.lang	Text document language identifier.
org.apache.nutch.crawl	Crawl control code and tools to run the crawler.
org.apache.nutch.fetcher	The Nutch multi-threaded fetching module
org.apache.nutch.hostdb
org.apache.nutch.indexer	Index content, configure and run indexing and cleaning jobs to add, update, and delete documents from an index.
org.apache.nutch.indexer.anchor	An indexing plugin for inbound anchor text.
org.apache.nutch.indexer.arbitrary	Indexing filter to add document arbitrary data to the index from the output of a user-specified class.
org.apache.nutch.indexer.basic	A basic indexing plugin, adds basic fields: url, host, title, content, etc.
org.apache.nutch.indexer.feed	Indexing filter to index meta data from RSS feeds.
org.apache.nutch.indexer.filter
org.apache.nutch.indexer.geoip	This plugin implements an indexing filter which takes advantage of the GeoIP2-java API.
org.apache.nutch.indexer.jexl	This plugin implements a dynamic indexing filter which uses JEXL expressions to allow filtering based on the page's metadata
org.apache.nutch.indexer.links
org.apache.nutch.indexer.metadata	Indexing filter to add document metadata to the index.
org.apache.nutch.indexer.more	A more indexing plugin, adds "more" index fields: last modified date, MIME type, content length.
org.apache.nutch.indexer.replace	Indexing filter to allow pattern replacements on metadata.
org.apache.nutch.indexer.staticfield	A simple plugin called at indexing that adds fields with static data.
org.apache.nutch.indexer.subcollection	Indexing filter to assign documents to subcollections.
org.apache.nutch.indexer.tld	Top Level Domain Indexing plugin.
org.apache.nutch.indexer.urlmeta	URL Meta Tag Indexing Plugin
org.apache.nutch.microformats.reltag	A microformats Rel-Tag Parser/Indexer/Querier plugin.
org.apache.nutch.protocol	Classes related to the `Protocol` interface, see also `org.apache.nutch.net.protocols`.
org.apache.nutch.protocol.file	Protocol plugin which supports retrieving local file resources.
org.apache.nutch.protocol.ftp	Protocol plugin which supports retrieving documents via the ftp protocol.
org.apache.nutch.protocol.htmlunit	Protocol plugin which supports retrieving documents via HTTP/HTTPS using Selenium and the HtmlUnitDriver web driver for the HtmlUnit headless browser.
org.apache.nutch.protocol.http	Protocol plugin which supports retrieving documents via the http protocol.
org.apache.nutch.protocol.http.api	Common API used by HTTP plugins (`http`, `httpclient`, etc.)
org.apache.nutch.protocol.httpclient	Protocol plugin which supports retrieving documents via the HTTP and HTTPS protocols, optionally with Basic, Digest and NTLM authentication schemes for web server as well as proxy server.
org.apache.nutch.protocol.interactiveselenium	Protocol plugin which supports retrieving documents using and interacting with Selenium.
org.apache.nutch.protocol.okhttp	Protocol plugin for HTTP/HTTPS based on okhttp, supports HTTP/1.1 and/or HTTP/2.
org.apache.nutch.protocol.selenium	Protocol plugin which supports retrieving documents via Selenium.
org.apache.nutch.scoring	The `ScoringFilter` interface.
org.apache.nutch.scoring.depth	Scoring filter to stop crawling at a configurable depth (number of "hops" from seed URLs).
org.apache.nutch.scoring.link	Scoring filter used in conjunction with `WebGraph`.
org.apache.nutch.scoring.metadata	Metadata Scoring Plugin
org.apache.nutch.scoring.opic	Scoring filter implementing a variant of the Online Page Importance Computation (OPIC) algorithm.
org.apache.nutch.scoring.orphan	Scoring filter to modify score or status of orphaned pages (no inlinks found for a configurable amount of time).
org.apache.nutch.scoring.similarity
org.apache.nutch.scoring.similarity.cosine	Implements the cosine similarity metric for scoring relevant documents
org.apache.nutch.scoring.tld	Top Level Domain Scoring plugin.
org.apache.nutch.scoring.urlmeta	URL Meta Tag Scoring Plugin
org.apache.nutch.segment	A segment stores all data from one generate/fetch/update cycle: fetch list, protocol status, raw content, parsed content, and extracted outgoing links.
org.apache.nutch.util	Miscellaneous utility classes.
org.creativecommons.nutch	Sample plugins that parse and index Creative Commons metadata.

Uses of CrawlDatum in org.apache.nutch.analysis.lang

Methods in org.apache.nutch.analysis.lang with parameters of type CrawlDatum
Modifier and Type	Method	Description
`NutchDocument`	LanguageIndexingFilter.`filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks)`

Uses of CrawlDatum in org.apache.nutch.crawl

Fields in org.apache.nutch.crawl declared as CrawlDatum
Modifier and Type	Field	Description
`CrawlDatum`	Generator.SelectorEntry.`datum`

Methods in org.apache.nutch.crawl that return CrawlDatum
Modifier and Type	Method	Description
`CrawlDatum`	AbstractFetchSchedule.`forceRefetch(Text url, CrawlDatum datum, boolean asap)`	This method resets fetchTime, fetchInterval, modifiedTime, retriesSinceFetch and page signature, so that it forces refetching.
`CrawlDatum`	FetchSchedule.`forceRefetch(Text url, CrawlDatum datum, boolean asap)`	This method resets fetchTime, fetchInterval, modifiedTime and page signature, so that it forces refetching.
`CrawlDatum`	CrawlDbReader.`get(String crawlDb, String url, Configuration config)`
`protected CrawlDatum`	DeduplicationJob.DedupReducer.`getDuplicate(CrawlDatum existingDoc, CrawlDatum newDoc)`
`CrawlDatum`	AbstractFetchSchedule.`initializeSchedule(Text url, CrawlDatum datum)`	Initialize fetch schedule related data.
`CrawlDatum`	FetchSchedule.`initializeSchedule(Text url, CrawlDatum datum)`	Initialize fetch schedule related data.
`static CrawlDatum`	CrawlDatum.`read(DataInput in)`
`CrawlDatum`	AbstractFetchSchedule.`setFetchSchedule(Text url, CrawlDatum datum, long prevFetchTime, long prevModifiedTime, long fetchTime, long modifiedTime, int state)`	Sets the `fetchInterval` and `fetchTime` on a successfully fetched page.
`CrawlDatum`	AdaptiveFetchSchedule.`setFetchSchedule(Text url, CrawlDatum datum, long prevFetchTime, long prevModifiedTime, long fetchTime, long modifiedTime, int state)`
`CrawlDatum`	DefaultFetchSchedule.`setFetchSchedule(Text url, CrawlDatum datum, long prevFetchTime, long prevModifiedTime, long fetchTime, long modifiedTime, int state)`
`CrawlDatum`	FetchSchedule.`setFetchSchedule(Text url, CrawlDatum datum, long prevFetchTime, long prevModifiedTime, long fetchTime, long modifiedTime, int state)`	Sets the `fetchInterval` and `fetchTime` on a successfully fetched page.
`CrawlDatum`	MimeAdaptiveFetchSchedule.`setFetchSchedule(Text url, CrawlDatum datum, long prevFetchTime, long prevModifiedTime, long fetchTime, long modifiedTime, int state)`
`CrawlDatum`	AbstractFetchSchedule.`setPageGoneSchedule(Text url, CrawlDatum datum, long prevFetchTime, long prevModifiedTime, long fetchTime)`	This method specifies how to schedule refetching of pages marked as GONE.
`CrawlDatum`	FetchSchedule.`setPageGoneSchedule(Text url, CrawlDatum datum, long prevFetchTime, long prevModifiedTime, long fetchTime)`	This method specifies how to schedule refetching of pages marked as GONE.
`CrawlDatum`	AbstractFetchSchedule.`setPageRetrySchedule(Text url, CrawlDatum datum, long prevFetchTime, long prevModifiedTime, long fetchTime)`	This method adjusts the fetch schedule if fetching needs to be re-tried due to transient errors.
`CrawlDatum`	FetchSchedule.`setPageRetrySchedule(Text url, CrawlDatum datum, long prevFetchTime, long prevModifiedTime, long fetchTime)`	This method adjusts the fetch schedule if fetching needs to be re-tried due to transient errors.
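
The schedule methods listed above share one shape: they take a datum, adjust its timing fields, and return it. The sketch below illustrates that pattern for `forceRefetch` only. It uses a hypothetical stand-in class (`ForceRefetchSketch.Datum`), not the real `CrawlDatum`, and the concrete reset values (a fixed default interval, zero, null, and "now" when `asap` is true) are assumptions for illustration rather than the Nutch implementation.

```java
/** Illustrative sketch only; none of these types are part of Nutch. */
class ForceRefetchSketch {

  /** Hypothetical stand-in holding the fields named in the forceRefetch description. */
  static class Datum {
    long fetchTime;
    int fetchInterval;
    long modifiedTime;
    int retriesSinceFetch;
    byte[] signature;
  }

  /** Assumed default interval in seconds (30 days), chosen for illustration. */
  static final int DEFAULT_INTERVAL = 30 * 24 * 60 * 60;

  /** Resets the scheduling fields in place and returns the same instance. */
  static Datum forceRefetch(Datum datum, boolean asap, long now) {
    datum.fetchInterval = DEFAULT_INTERVAL;
    datum.modifiedTime = 0L;
    datum.retriesSinceFetch = 0;
    datum.signature = null;
    if (asap) {
      // Assumption: "asap" means the page becomes due immediately.
      datum.fetchTime = now;
    }
    return datum;
  }

  public static void main(String[] args) {
    Datum d = new Datum();
    d.fetchTime = 5_000L;
    d.retriesSinceFetch = 3;
    d.signature = new byte[] {1, 2, 3};
    Datum reset = forceRefetch(d, true, System.currentTimeMillis());
    System.out.println("same instance: " + (reset == d));
    System.out.println("retries: " + reset.retriesSinceFetch);
  }
}
```

Returning the mutated argument, as the signatures in the table suggest, lets callers chain or reassign the result without allocating a new object.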

Methods in org.apache.nutch.crawl that return types with arguments of type CrawlDatum
Modifier and Type	Method	Description
`RecordWriter<Text,CrawlDatum>`	CrawlDbReader.CrawlDatumCsvOutputFormat.`getRecordWriter(TaskAttemptContext context)`
`RecordWriter<Text,CrawlDatum>`	CrawlDbReader.CrawlDatumJsonOutputFormat.`getRecordWriter(TaskAttemptContext context)`

Methods in org.apache.nutch.crawl with parameters of type CrawlDatum
Modifier and Type	Method	Description
`long`	AbstractFetchSchedule.`calculateLastFetchTime(CrawlDatum datum)`	This method returns the last fetch time of the CrawlDatum.
`long`	FetchSchedule.`calculateLastFetchTime(CrawlDatum datum)`	Calculates last fetch time of the given CrawlDatum.
`int`	CrawlDatum.`compareTo(CrawlDatum that)`	Sort two `CrawlDatum` objects by decreasing score.
`CrawlDatum`	AbstractFetchSchedule.`forceRefetch(Text url, CrawlDatum datum, boolean asap)`	This method resets fetchTime, fetchInterval, modifiedTime, retriesSinceFetch and page signature, so that it forces refetching.
`CrawlDatum`	FetchSchedule.`forceRefetch(Text url, CrawlDatum datum, boolean asap)`	This method resets fetchTime, fetchInterval, modifiedTime and page signature, so that it forces refetching.
`protected CrawlDatum`	DeduplicationJob.DedupReducer.`getDuplicate(CrawlDatum existingDoc, CrawlDatum newDoc)`
`static boolean`	CrawlDatum.`hasDbStatus(CrawlDatum datum)`
`static boolean`	CrawlDatum.`hasFetchStatus(CrawlDatum datum)`
`CrawlDatum`	AbstractFetchSchedule.`initializeSchedule(Text url, CrawlDatum datum)`	Initialize fetch schedule related data.
`CrawlDatum`	FetchSchedule.`initializeSchedule(Text url, CrawlDatum datum)`	Initialize fetch schedule related data.
`void`	CrawlDbFilter.`map(Text key, CrawlDatum value, Mapper.Context context)`
`void`	CrawlDbReader.CrawlDbDumpMapper.`map(Text key, CrawlDatum value, Mapper.Context context)`
`void`	CrawlDbReader.CrawlDbStatMapper.`map(Text key, CrawlDatum value, Mapper.Context context)`
`void`	CrawlDbReader.CrawlDbTopNMapper.`map(Text key, CrawlDatum value, Mapper.Context context)`
`void`	DeduplicationJob.DBFilter.`map(Text key, CrawlDatum value, Mapper.Context context)`
`void`	Generator.CrawlDbUpdater.CrawlDbUpdateMapper.`map(Text key, CrawlDatum value, Mapper.Context context)`
`void`	Generator.SelectorMapper.`map(Text key, CrawlDatum value, Mapper.Context context)`
`void`	CrawlDatum.`putAllMetaData(CrawlDatum other)`	Add all metadata from other CrawlDatum to this CrawlDatum.
`void`	CrawlDatum.`set(CrawlDatum that)`	Copy the contents of another instance into this instance.
`CrawlDatum`	AbstractFetchSchedule.`setFetchSchedule(Text url, CrawlDatum datum, long prevFetchTime, long prevModifiedTime, long fetchTime, long modifiedTime, int state)`	Sets the `fetchInterval` and `fetchTime` on a successfully fetched page.
`CrawlDatum`	AdaptiveFetchSchedule.`setFetchSchedule(Text url, CrawlDatum datum, long prevFetchTime, long prevModifiedTime, long fetchTime, long modifiedTime, int state)`
`CrawlDatum`	DefaultFetchSchedule.`setFetchSchedule(Text url, CrawlDatum datum, long prevFetchTime, long prevModifiedTime, long fetchTime, long modifiedTime, int state)`
`CrawlDatum`	FetchSchedule.`setFetchSchedule(Text url, CrawlDatum datum, long prevFetchTime, long prevModifiedTime, long fetchTime, long modifiedTime, int state)`	Sets the `fetchInterval` and `fetchTime` on a successfully fetched page.
`CrawlDatum`	MimeAdaptiveFetchSchedule.`setFetchSchedule(Text url, CrawlDatum datum, long prevFetchTime, long prevModifiedTime, long fetchTime, long modifiedTime, int state)`
`CrawlDatum`	AbstractFetchSchedule.`setPageGoneSchedule(Text url, CrawlDatum datum, long prevFetchTime, long prevModifiedTime, long fetchTime)`	This method specifies how to schedule refetching of pages marked as GONE.
`CrawlDatum`	FetchSchedule.`setPageGoneSchedule(Text url, CrawlDatum datum, long prevFetchTime, long prevModifiedTime, long fetchTime)`	This method specifies how to schedule refetching of pages marked as GONE.
`CrawlDatum`	AbstractFetchSchedule.`setPageRetrySchedule(Text url, CrawlDatum datum, long prevFetchTime, long prevModifiedTime, long fetchTime)`	This method adjusts the fetch schedule if fetching needs to be re-tried due to transient errors.
`CrawlDatum`	FetchSchedule.`setPageRetrySchedule(Text url, CrawlDatum datum, long prevFetchTime, long prevModifiedTime, long fetchTime)`	This method adjusts the fetch schedule if fetching needs to be re-tried due to transient errors.
`boolean`	AbstractFetchSchedule.`shouldFetch(Text url, CrawlDatum datum, long curTime)`	This method provides information whether the page is suitable for selection in the current fetchlist.
`boolean`	FetchSchedule.`shouldFetch(Text url, CrawlDatum datum, long curTime)`	This method provides information whether the page is suitable for selection in the current fetchlist.
`void`	CrawlDbReader.CrawlDatumCsvOutputFormat.LineRecordWriter.`write(Text key, CrawlDatum value)`
`void`	CrawlDbReader.CrawlDatumJsonOutputFormat.LineRecordWriter.`write(Text key, CrawlDatum value)`
`protected void`	DeduplicationJob.DedupReducer.`writeOutAsDuplicate(CrawlDatum datum, Reducer.Context context)`
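
The `compareTo` entry above describes sorting by decreasing score. A minimal sketch of that ordering follows, using a hypothetical stand-in class (`ScoreOrderSketch.Datum`) with only a URL and a score. The real `CrawlDatum` carries more fields and may use some of them as tie-breakers; this sketch compares the score alone.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustrative sketch only; none of these types are part of Nutch. */
class ScoreOrderSketch {

  /** Hypothetical stand-in with just the fields needed to show the ordering. */
  static class Datum implements Comparable<Datum> {
    final String url;
    final float score;

    Datum(String url, float score) {
      this.url = url;
      this.score = score;
    }

    /** Higher scores sort first, so the arguments to Float.compare are swapped. */
    @Override
    public int compareTo(Datum that) {
      return Float.compare(that.score, this.score);
    }
  }

  /** Returns the URLs ordered by decreasing score, leaving the input list untouched. */
  static List<String> sortedUrls(List<Datum> data) {
    List<Datum> copy = new ArrayList<>(data);
    Collections.sort(copy);
    List<String> urls = new ArrayList<>();
    for (Datum d : copy) {
      urls.add(d.url);
    }
    return urls;
  }

  public static void main(String[] args) {
    List<Datum> data = new ArrayList<>();
    data.add(new Datum("http://example.org/a", 0.5f));
    data.add(new Datum("http://example.org/b", 2.0f));
    data.add(new Datum("http://example.org/c", 1.0f));
    System.out.println(sortedUrls(data));
  }
}
```

With this ordering, a plain ascending sort puts the highest-scoring entries first, which is convenient when selecting the top N entries.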

Method parameters in org.apache.nutch.crawl with type arguments of type CrawlDatum
Modifier and Type	Method	Description
`void`	CrawlDbMerger.Merger.`reduce(Text key, Iterable<CrawlDatum> values, Reducer.Context context)`
`void`	CrawlDbReducer.`reduce(Text key, Iterable<CrawlDatum> values, Reducer.Context context)`
`void`	DeduplicationJob.DedupReducer.`reduce(K key, Iterable<CrawlDatum> values, Reducer.Context context)`
`void`	DeduplicationJob.StatusUpdateReducer.`reduce(Text key, Iterable<CrawlDatum> values, Reducer.Context context)`
`void`	Generator.CrawlDbUpdater.CrawlDbUpdateReducer.`reduce(Text key, Iterable<CrawlDatum> values, Reducer.Context context)`
`void`	Injector.InjectReducer.`reduce(Text key, Iterable<CrawlDatum> values, Reducer.Context context)`	Merges the input records of one URL.

Uses of CrawlDatum in org.apache.nutch.fetcher

Methods in org.apache.nutch.fetcher that return CrawlDatum
Modifier and Type	Method	Description
`CrawlDatum`	FetchItem.`getDatum()`

Methods in org.apache.nutch.fetcher with parameters of type CrawlDatum
Modifier and Type	Method	Description
`org.apache.nutch.fetcher.FetchItemQueues.QueuingStatus`	FetchItemQueues.`addFetchItem(Text url, CrawlDatum datum)`
`static FetchItem`	FetchItem.`create(Text url, CrawlDatum datum, String queueMode)`	Create an item.
`static FetchItem`	FetchItem.`create(Text url, CrawlDatum datum, String queueMode, int outlinkDepth)`	Create an item.

Constructors in org.apache.nutch.fetcher with parameters of type CrawlDatum
Constructor	Description
`FetchItem(Text url, URL u, CrawlDatum datum, String queueID)`
`FetchItem(Text url, URL u, CrawlDatum datum, String queueID, int outlinkDepth)`

Uses of CrawlDatum in org.apache.nutch.hostdb

Fields in org.apache.nutch.hostdb declared as CrawlDatum
Modifier and Type	Field	Description
`protected CrawlDatum`	UpdateHostDbMapper.`crawlDatum`

Methods in org.apache.nutch.hostdb with parameters of type CrawlDatum
Modifier and Type	Method	Description
`void`	CrawlDatumProcessor.`count(CrawlDatum crawlDatum)`	Process a single crawl datum instance to aggregate custom counts.
`void`	FetchOverdueCrawlDatumProcessor.`count(CrawlDatum crawlDatum)`

Uses of CrawlDatum in org.apache.nutch.indexer

Methods in org.apache.nutch.indexer with parameters of type CrawlDatum
Modifier and Type	Method	Description
`NutchDocument`	IndexingFilter.`filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks)`	Adds fields or otherwise modifies the document that will be indexed for a parse.
`NutchDocument`	IndexingFilters.`filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks)`	Run all defined filters.
`void`	CleaningJob.DBFilter.`map(Text key, CrawlDatum value, Mapper.Context context)`
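
`IndexingFilters.filter` is described above as running all defined filters. The sketch below shows one way such a chain can work, using a hypothetical `DocFilter` interface and a plain `Map` in place of `NutchDocument`. Treating a null return as "drop this document and stop" is an assumption of the sketch, not a statement about the Nutch implementation.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; none of these types are part of Nutch. */
class FilterChainSketch {

  /** Hypothetical stand-in for a single indexing filter. */
  interface DocFilter {
    Map<String, String> filter(Map<String, String> doc, String url);
  }

  /** Applies each filter in order; stops and returns null once a filter rejects the document. */
  static Map<String, String> runAll(List<DocFilter> filters, Map<String, String> doc, String url) {
    for (DocFilter f : filters) {
      doc = f.filter(doc, url);
      if (doc == null) {
        return null;
      }
    }
    return doc;
  }

  public static void main(String[] args) {
    // Adds a field and passes the document on.
    DocFilter addLang = (d, u) -> {
      d.put("lang", "en");
      return d;
    };
    // Rejects documents under a hypothetical /private/ path.
    DocFilter dropPrivate = (d, u) -> u.contains("/private/") ? null : d;

    Map<String, String> doc = new LinkedHashMap<>();
    doc.put("url", "http://example.org/page");
    System.out.println(runAll(Arrays.asList(addLang, dropPrivate), doc, "http://example.org/page"));
  }
}
```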

Uses of CrawlDatum in org.apache.nutch.indexer.anchor

Methods in org.apache.nutch.indexer.anchor with parameters of type CrawlDatum
Modifier and Type	Method	Description
`NutchDocument`	AnchorIndexingFilter.`filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks)`	The `AnchorIndexingFilter` filter object which supports boolean configuration settings for the deduplication of anchors.

Uses of CrawlDatum in org.apache.nutch.indexer.arbitrary

Methods in org.apache.nutch.indexer.arbitrary with parameters of type CrawlDatum
Modifier and Type	Method	Description
`NutchDocument`	ArbitraryIndexingFilter.`filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks)`	The `ArbitraryIndexingFilter` filter object uses reflection to instantiate the configured class and invoke the configured method.

Uses of CrawlDatum in org.apache.nutch.indexer.basic

Methods in org.apache.nutch.indexer.basic with parameters of type CrawlDatum
Modifier and Type	Method	Description
`NutchDocument`	BasicIndexingFilter.`filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks)`	The `BasicIndexingFilter` filter object which supports few configuration settings for adding basic searchable fields.

Uses of CrawlDatum in org.apache.nutch.indexer.feed

Methods in org.apache.nutch.indexer.feed with parameters of type CrawlDatum
Modifier and Type	Method	Description
`NutchDocument`	FeedIndexingFilter.`filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks)`	Extracts the relevant fields (FEED_AUTHOR, FEED_TAGS, FEED_PUBLISHED, FEED_UPDATED, FEED) and sends them to the `Indexer` for indexing within the Nutch index.

Uses of CrawlDatum in org.apache.nutch.indexer.filter

Methods in org.apache.nutch.indexer.filter with parameters of type CrawlDatum
Modifier and Type	Method	Description
`NutchDocument`	MimeTypeIndexingFilter.`filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks)`

Uses of CrawlDatum in org.apache.nutch.indexer.geoip

Methods in org.apache.nutch.indexer.geoip with parameters of type CrawlDatum
Modifier and Type	Method	Description
`NutchDocument`	GeoIPIndexingFilter.`filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks)`

Uses of CrawlDatum in org.apache.nutch.indexer.jexl

Methods in org.apache.nutch.indexer.jexl with parameters of type CrawlDatum
Modifier and Type	Method	Description
`NutchDocument`	JexlIndexingFilter.`filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks)`

Uses of CrawlDatum in org.apache.nutch.indexer.links

Methods in org.apache.nutch.indexer.links with parameters of type CrawlDatum
Modifier and Type	Method	Description
`NutchDocument`	LinksIndexingFilter.`filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks)`

Uses of CrawlDatum in org.apache.nutch.indexer.metadata

Methods in org.apache.nutch.indexer.metadata with parameters of type CrawlDatum
Modifier and Type	Method	Description
`NutchDocument`	MetadataIndexer.`filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks)`

Uses of CrawlDatum in org.apache.nutch.indexer.more

Methods in org.apache.nutch.indexer.more with parameters of type CrawlDatum
Modifier and Type	Method	Description
`NutchDocument`	MoreIndexingFilter.`filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks)`

Uses of CrawlDatum in org.apache.nutch.indexer.replace

Methods in org.apache.nutch.indexer.replace with parameters of type CrawlDatum
Modifier and Type	Method	Description
`NutchDocument`	ReplaceIndexer.`filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks)`

Uses of CrawlDatum in org.apache.nutch.indexer.staticfield

Methods in org.apache.nutch.indexer.staticfield with parameters of type CrawlDatum
Modifier and Type	Method	Description
`NutchDocument`	StaticFieldIndexer.`filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks)`	The `StaticFieldIndexer` filter object which adds fields as per configuration setting.

Uses of CrawlDatum in org.apache.nutch.indexer.subcollection

Methods in org.apache.nutch.indexer.subcollection with parameters of type CrawlDatum
Modifier and Type	Method	Description
`NutchDocument`	SubcollectionIndexingFilter.`filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks)`

Uses of CrawlDatum in org.apache.nutch.indexer.tld

Methods in org.apache.nutch.indexer.tld with parameters of type CrawlDatum
Modifier and Type	Method	Description
`NutchDocument`	TLDIndexingFilter.`filter(NutchDocument doc, Parse parse, Text urlText, CrawlDatum datum, Inlinks inlinks)`

Uses of CrawlDatum in org.apache.nutch.indexer.urlmeta

Methods in org.apache.nutch.indexer.urlmeta with parameters of type CrawlDatum
Modifier and Type	Method	Description
`NutchDocument`	URLMetaIndexingFilter.`filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks)`	Takes the metatags listed in your "urlmeta.tags" property and looks for them inside the CrawlDatum object.

Uses of CrawlDatum in org.apache.nutch.microformats.reltag

Methods in org.apache.nutch.microformats.reltag with parameters of type CrawlDatum
Modifier and Type	Method	Description
`NutchDocument`	RelTagIndexingFilter.`filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks)`

Uses of CrawlDatum in org.apache.nutch.protocol

Methods in org.apache.nutch.protocol with parameters of type CrawlDatum
Modifier and Type	Method	Description
`ProtocolOutput`	Protocol.`getProtocolOutput(Text url, CrawlDatum datum)`	Get the `ProtocolOutput` for a given url and crawldatum
`crawlercommons.robots.BaseRobotRules`	Protocol.`getRobotRules(Text url, CrawlDatum datum, List<Content> robotsTxtContent)`	Retrieve robot rules applicable for this URL.

Uses of CrawlDatum in org.apache.nutch.protocol.file

Methods in org.apache.nutch.protocol.file with parameters of type CrawlDatum
Modifier and Type	Method	Description
`ProtocolOutput`	File.`getProtocolOutput(Text url, CrawlDatum datum)`	Creates a `FileResponse` object corresponding to the url and return a `ProtocolOutput` object as per the content received
`crawlercommons.robots.BaseRobotRules`	File.`getRobotRules(Text url, CrawlDatum datum, List<Content> robotsTxtContent)`	No robots parsing is done for file protocol.

Constructors in org.apache.nutch.protocol.file with parameters of type CrawlDatum
Constructor	Description
`FileResponse(URL url, CrawlDatum datum, File file, Configuration conf)`	Default public constructor

Uses of CrawlDatum in org.apache.nutch.protocol.ftp

Methods in org.apache.nutch.protocol.ftp with parameters of type CrawlDatum
Modifier and Type	Method	Description
`ProtocolOutput`	Ftp.`getProtocolOutput(Text url, CrawlDatum datum)`	Creates a `FtpResponse` object corresponding to the url and returns a `ProtocolOutput` object as per the content received
`crawlercommons.robots.BaseRobotRules`	Ftp.`getRobotRules(Text url, CrawlDatum datum, List<Content> robotsTxtContent)`	Get the robots rules for a given url

Constructors in org.apache.nutch.protocol.ftp with parameters of type CrawlDatum
Constructor	Description
`FtpResponse(URL url, CrawlDatum datum, Ftp ftp, Configuration conf)`

Uses of CrawlDatum in org.apache.nutch.protocol.htmlunit

Methods in org.apache.nutch.protocol.htmlunit with parameters of type CrawlDatum
Modifier and Type	Method	Description
`protected Response`	Http.`getResponse(URL url, CrawlDatum datum, boolean redirect)`

Constructors in org.apache.nutch.protocol.htmlunit with parameters of type CrawlDatum
Constructor	Description
`HttpResponse(HttpBase http, URL url, CrawlDatum datum)`	Default public constructor.

Uses of CrawlDatum in org.apache.nutch.protocol.http

Methods in org.apache.nutch.protocol.http with parameters of type CrawlDatum
Modifier and Type	Method	Description
`protected Response`	Http.`getResponse(URL url, CrawlDatum datum, boolean redirect)`

Constructors in org.apache.nutch.protocol.http with parameters of type CrawlDatum
Constructor	Description
`HttpResponse(HttpBase http, URL url, CrawlDatum datum)`	Default public constructor.

Uses of CrawlDatum in org.apache.nutch.protocol.http.api

Methods in org.apache.nutch.protocol.http.api with parameters of type CrawlDatum
Modifier and Type	Method	Description
`ProtocolOutput`	HttpBase.`getProtocolOutput(Text url, CrawlDatum datum)`
`protected abstract Response`	HttpBase.`getResponse(URL url, CrawlDatum datum, boolean followRedirects)`
`crawlercommons.robots.BaseRobotRules`	HttpBase.`getRobotRules(Text url, CrawlDatum datum, List<Content> robotsTxtContent)`

Uses of CrawlDatum in org.apache.nutch.protocol.httpclient

Methods in org.apache.nutch.protocol.httpclient with parameters of type CrawlDatum
Modifier and Type	Method	Description
`protected Response`	Http.`getResponse(URL url, CrawlDatum datum, boolean redirect)`	Fetches the `url` with a configured HTTP client and gets the response.

Uses of CrawlDatum in org.apache.nutch.protocol.interactiveselenium

Methods in org.apache.nutch.protocol.interactiveselenium with parameters of type CrawlDatum
Modifier and Type	Method	Description
`protected Response`	Http.`getResponse(URL url, CrawlDatum datum, boolean redirect)`

Constructors in org.apache.nutch.protocol.interactiveselenium with parameters of type CrawlDatum
Constructor	Description
`HttpResponse(Http http, URL url, CrawlDatum datum)`

Uses of CrawlDatum in org.apache.nutch.protocol.okhttp

Methods in org.apache.nutch.protocol.okhttp with parameters of type CrawlDatum
Modifier and Type	Method	Description
`protected Response`	OkHttp.`getResponse(URL url, CrawlDatum datum, boolean redirect)`

Constructors in org.apache.nutch.protocol.okhttp with parameters of type CrawlDatum
Constructor	Description
`OkHttpResponse(OkHttp okhttp, URL url, CrawlDatum datum)`

Uses of CrawlDatum in org.apache.nutch.protocol.selenium

Methods in org.apache.nutch.protocol.selenium with parameters of type CrawlDatum
Modifier and Type	Method	Description
`protected Response`	Http.`getResponse(URL url, CrawlDatum datum, boolean redirect)`

Constructors in org.apache.nutch.protocol.selenium with parameters of type CrawlDatum
Constructor	Description
`HttpResponse(Http http, URL url, CrawlDatum datum)`

Uses of CrawlDatum in org.apache.nutch.scoring

Methods in org.apache.nutch.scoring that return CrawlDatum
Modifier and Type	Method	Description
`CrawlDatum`	AbstractScoringFilter.`distributeScoreToOutlinks(Text fromUrl, ParseData parseData, Collection<Map.Entry<Text,CrawlDatum>> targets, CrawlDatum adjust, int allCount)`
`CrawlDatum`	ScoringFilter.`distributeScoreToOutlinks(Text fromUrl, ParseData parseData, Collection<Map.Entry<Text,CrawlDatum>> targets, CrawlDatum adjust, int allCount)`	Distribute score value from the current page to all its outlinked pages.
`CrawlDatum`	ScoringFilters.`distributeScoreToOutlinks(Text fromUrl, ParseData parseData, Collection<Map.Entry<Text,CrawlDatum>> targets, CrawlDatum adjust, int allCount)`

Methods in org.apache.nutch.scoring with parameters of type CrawlDatum
Modifier and Type	Method	Description
`CrawlDatum`	AbstractScoringFilter.`distributeScoreToOutlinks(Text fromUrl, ParseData parseData, Collection<Map.Entry<Text,CrawlDatum>> targets, CrawlDatum adjust, int allCount)`
`CrawlDatum`	ScoringFilter.`distributeScoreToOutlinks(Text fromUrl, ParseData parseData, Collection<Map.Entry<Text,CrawlDatum>> targets, CrawlDatum adjust, int allCount)`	Distribute score value from the current page to all its outlinked pages.
`CrawlDatum`	ScoringFilters.`distributeScoreToOutlinks(Text fromUrl, ParseData parseData, Collection<Map.Entry<Text,CrawlDatum>> targets, CrawlDatum adjust, int allCount)`
`float`	AbstractScoringFilter.`generatorSortValue(Text url, CrawlDatum datum, float initSort)`
`float`	ScoringFilter.`generatorSortValue(Text url, CrawlDatum datum, float initSort)`	This method prepares a sort value for the purpose of sorting and selecting top N scoring pages during fetchlist generation.
`float`	ScoringFilters.`generatorSortValue(Text url, CrawlDatum datum, float initSort)`	Calculate a sort value for Generate.
`float`	AbstractScoringFilter.`indexerScore(Text url, NutchDocument doc, CrawlDatum dbDatum, CrawlDatum fetchDatum, Parse parse, Inlinks inlinks, float initScore)`
`float`	ScoringFilter.`indexerScore(Text url, NutchDocument doc, CrawlDatum dbDatum, CrawlDatum fetchDatum, Parse parse, Inlinks inlinks, float initScore)`	This method calculates an indexed document score/boost.
`float`	ScoringFilters.`indexerScore(Text url, NutchDocument doc, CrawlDatum dbDatum, CrawlDatum fetchDatum, Parse parse, Inlinks inlinks, float initScore)`
`void`	AbstractScoringFilter.`initialScore(Text url, CrawlDatum datum)`
`void`	ScoringFilter.`initialScore(Text url, CrawlDatum datum)`	Set an initial score for newly discovered pages.
`void`	ScoringFilters.`initialScore(Text url, CrawlDatum datum)`	Calculate a new initial score, used when adding newly discovered pages.
`void`	AbstractScoringFilter.`injectedScore(Text url, CrawlDatum datum)`
`void`	ScoringFilter.`injectedScore(Text url, CrawlDatum datum)`	Set an initial score for newly injected pages.
`void`	ScoringFilters.`injectedScore(Text url, CrawlDatum datum)`	Calculate a new initial score, used when injecting new pages.
`default void`	ScoringFilter.`orphanedScore(Text url, CrawlDatum datum)`	This method may change the score or status of CrawlDatum during CrawlDb update, when the URL is neither fetched nor has any inlinks.
`void`	ScoringFilters.`orphanedScore(Text url, CrawlDatum datum)`	Calculate orphaned page score during CrawlDb.update().
`void`	AbstractScoringFilter.`passScoreBeforeParsing(Text url, CrawlDatum datum, Content content)`
`void`	ScoringFilter.`passScoreBeforeParsing(Text url, CrawlDatum datum, Content content)`	This method takes all relevant score information from the current datum (coming from a generated fetchlist) and stores it into `Content` metadata.
`void`	ScoringFilters.`passScoreBeforeParsing(Text url, CrawlDatum datum, Content content)`
`void`	AbstractScoringFilter.`updateDbScore(Text url, CrawlDatum old, CrawlDatum datum, List<CrawlDatum> inlinked)`
`void`	ScoringFilter.`updateDbScore(Text url, CrawlDatum old, CrawlDatum datum, List<CrawlDatum> inlinked)`	This method calculates a new score of CrawlDatum during CrawlDb update, based on the initial value of the original CrawlDatum, and also score values contributed by inlinked pages.
`void`	ScoringFilters.`updateDbScore(Text url, CrawlDatum old, CrawlDatum datum, List<CrawlDatum> inlinked)`	Calculate updated page score during CrawlDb.update().

Method parameters in org.apache.nutch.scoring with type arguments of type CrawlDatum
Modifier and Type	Method	Description
`CrawlDatum`	AbstractScoringFilter.`distributeScoreToOutlinks(Text fromUrl, ParseData parseData, Collection<Map.Entry<Text,CrawlDatum>> targets, CrawlDatum adjust, int allCount)`
`CrawlDatum`	ScoringFilter.`distributeScoreToOutlinks(Text fromUrl, ParseData parseData, Collection<Map.Entry<Text,CrawlDatum>> targets, CrawlDatum adjust, int allCount)`	Distribute score value from the current page to all its outlinked pages.
`CrawlDatum`	ScoringFilters.`distributeScoreToOutlinks(Text fromUrl, ParseData parseData, Collection<Map.Entry<Text,CrawlDatum>> targets, CrawlDatum adjust, int allCount)`
`void`	AbstractScoringFilter.`updateDbScore(Text url, CrawlDatum old, CrawlDatum datum, List<CrawlDatum> inlinked)`
`void`	ScoringFilter.`updateDbScore(Text url, CrawlDatum old, CrawlDatum datum, List<CrawlDatum> inlinked)`	This method calculates a new score of CrawlDatum during CrawlDb update, based on the initial value of the original CrawlDatum, and also score values contributed by inlinked pages.
`void`	ScoringFilters.`updateDbScore(Text url, CrawlDatum old, CrawlDatum datum, List<CrawlDatum> inlinked)`	Calculate updated page score during CrawlDb.update().
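
`distributeScoreToOutlinks` is described above as distributing score from the current page to all its outlinked pages. The sketch below shows one simple policy, an equal split across targets, using a `Map` from URL to score in place of `Collection<Map.Entry<Text,CrawlDatum>>`. The equal-split rule and the simplified signature are assumptions for illustration; actual scoring filters may weight or adjust the shares differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative sketch only; none of these types are part of Nutch. */
class ScoreDistributionSketch {

  /**
   * Adds an equal share of pageScore to every target's current score, in place.
   * Returns the share given to each target, or 0 when there are no targets.
   */
  static float distribute(float pageScore, Map<String, Float> targets) {
    if (targets.isEmpty()) {
      return 0f;
    }
    final float share = pageScore / targets.size();
    targets.replaceAll((url, score) -> score + share);
    return share;
  }

  public static void main(String[] args) {
    Map<String, Float> targets = new LinkedHashMap<>();
    targets.put("http://example.org/a", 0.0f);
    targets.put("http://example.org/b", 0.5f);
    float share = distribute(1.0f, targets);
    System.out.println("share=" + share + " targets=" + targets);
  }
}
```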

Uses of CrawlDatum in org.apache.nutch.scoring.depth

Methods in org.apache.nutch.scoring.depth that return CrawlDatum
Modifier and Type	Method	Description
`CrawlDatum`	DepthScoringFilter.`distributeScoreToOutlinks(Text fromUrl, ParseData parseData, Collection<Map.Entry<Text,CrawlDatum>> targets, CrawlDatum adjust, int allCount)`

Methods in org.apache.nutch.scoring.depth with parameters of type CrawlDatum
Modifier and Type	Method	Description
`CrawlDatum`	DepthScoringFilter.`distributeScoreToOutlinks(Text fromUrl, ParseData parseData, Collection<Map.Entry<Text,CrawlDatum>> targets, CrawlDatum adjust, int allCount)`
`float`	DepthScoringFilter.`generatorSortValue(Text url, CrawlDatum datum, float initSort)`
`float`	DepthScoringFilter.`indexerScore(Text url, NutchDocument doc, CrawlDatum dbDatum, CrawlDatum fetchDatum, Parse parse, Inlinks inlinks, float initScore)`
`void`	DepthScoringFilter.`initialScore(Text url, CrawlDatum datum)`
`void`	DepthScoringFilter.`injectedScore(Text url, CrawlDatum datum)`
`void`	DepthScoringFilter.`passScoreBeforeParsing(Text url, CrawlDatum datum, Content content)`
`void`	DepthScoringFilter.`updateDbScore(Text url, CrawlDatum old, CrawlDatum datum, List<CrawlDatum> inlinked)`

Method parameters in org.apache.nutch.scoring.depth with type arguments of type CrawlDatum
Modifier and Type	Method	Description
`CrawlDatum`	DepthScoringFilter.`distributeScoreToOutlinks(Text fromUrl, ParseData parseData, Collection<Map.Entry<Text,CrawlDatum>> targets, CrawlDatum adjust, int allCount)`
`void`	DepthScoringFilter.`updateDbScore(Text url, CrawlDatum old, CrawlDatum datum, List<CrawlDatum> inlinked)`

Uses of CrawlDatum in org.apache.nutch.scoring.link

Methods in org.apache.nutch.scoring.link with parameters of type CrawlDatum
Modifier and Type	Method	Description
`float`	LinkAnalysisScoringFilter.`generatorSortValue(Text url, CrawlDatum datum, float initSort)`
`float`	LinkAnalysisScoringFilter.`indexerScore(Text url, NutchDocument doc, CrawlDatum dbDatum, CrawlDatum fetchDatum, Parse parse, Inlinks inlinks, float initScore)`
`void`	LinkAnalysisScoringFilter.`initialScore(Text url, CrawlDatum datum)`
`void`	LinkAnalysisScoringFilter.`passScoreBeforeParsing(Text url, CrawlDatum datum, Content content)`

Uses of CrawlDatum in org.apache.nutch.scoring.metadata

Methods in org.apache.nutch.scoring.metadata that return CrawlDatum
Modifier and Type	Method	Description
`CrawlDatum`	MetadataScoringFilter.`distributeScoreToOutlinks(Text fromUrl, ParseData parseData, Collection<Map.Entry<Text,CrawlDatum>> targets, CrawlDatum adjust, int allCount)`	Takes the metadata listed in your "scoring.parse.md" property and looks for it inside the parseData object.

Methods in org.apache.nutch.scoring.metadata with parameters of type CrawlDatum
Modifier and Type	Method	Description
`CrawlDatum`	MetadataScoringFilter.`distributeScoreToOutlinks(Text fromUrl, ParseData parseData, Collection<Map.Entry<Text,CrawlDatum>> targets, CrawlDatum adjust, int allCount)`	Takes the metadata listed in your "scoring.parse.md" property and looks for it inside the parseData object.
`void`	MetadataScoringFilter.`passScoreBeforeParsing(Text url, CrawlDatum datum, Content content)`	Takes the metadata, specified in your "scoring.db.md" property, from the datum object and injects it into the content.

Method parameters in org.apache.nutch.scoring.metadata with type arguments of type CrawlDatum
Modifier and Type	Method	Description
`CrawlDatum`	MetadataScoringFilter.`distributeScoreToOutlinks(Text fromUrl, ParseData parseData, Collection<Map.Entry<Text,CrawlDatum>> targets, CrawlDatum adjust, int allCount)`	Takes the metadata listed in your "scoring.parse.md" property and looks for it inside the parseData object.
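
The `passScoreBeforeParsing` description above (take the metadata named in a property from the datum and inject it into the content) can be pictured with a minimal sketch. This is not the Nutch implementation: `MetadataCopySketch`, `copyListedKeys`, the plain `Map<String, String>` stand-ins for the datum and content metadata, and the comma-separated property value are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in; not part of the Nutch API.
class MetadataCopySketch {

    /**
     * Copies only the listed keys that are present in the source map
     * into the target map. Keys missing from the source are skipped.
     */
    static void copyListedKeys(String[] keys,
                               Map<String, String> source,
                               Map<String, String> target) {
        for (String key : keys) {
            String name = key.trim();
            String value = source.get(name);
            if (value != null) {
                target.put(name, value);
            }
        }
    }

    public static void main(String[] args) {
        // Assumed property value; the real format of "scoring.db.md" may differ.
        String[] keys = "lang,topic".split(",");

        Map<String, String> datumMeta = new HashMap<>();
        datumMeta.put("lang", "en");
        datumMeta.put("other", "not listed, so not copied");

        Map<String, String> contentMeta = new HashMap<>();
        copyListedKeys(keys, datumMeta, contentMeta);

        // Only the "lang" entry is copied: "topic" is absent, "other" is unlisted.
        System.out.println(contentMeta);
    }
}
```

The sketch only models the selection step; how the real filter reads the property and which metadata containers it writes to should be checked against the `MetadataScoringFilter` source.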

Uses of CrawlDatum in org.apache.nutch.scoring.opic

Methods in org.apache.nutch.scoring.opic that return CrawlDatum
Modifier and Type	Method	Description
`CrawlDatum`	OPICScoringFilter.`distributeScoreToOutlinks(Text fromUrl, ParseData parseData, Collection<Map.Entry<Text,CrawlDatum>> targets, CrawlDatum adjust, int allCount)`	Get a float value from Fetcher.SCORE_KEY, divide it by the number of outlinks and apply.

Methods in org.apache.nutch.scoring.opic with parameters of type CrawlDatum
Modifier and Type	Method	Description
`CrawlDatum`	OPICScoringFilter.`distributeScoreToOutlinks(Text fromUrl, ParseData parseData, Collection<Map.Entry<Text,CrawlDatum>> targets, CrawlDatum adjust, int allCount)`	Get a float value from Fetcher.SCORE_KEY, divide it by the number of outlinks and apply.
`float`	OPICScoringFilter.`generatorSortValue(Text url, CrawlDatum datum, float initSort)`	Use `getScore()`.
`float`	OPICScoringFilter.`indexerScore(Text url, NutchDocument doc, CrawlDatum dbDatum, CrawlDatum fetchDatum, Parse parse, Inlinks inlinks, float initScore)`	Dampen the boost value by scorePower.
`void`	OPICScoringFilter.`initialScore(Text url, CrawlDatum datum)`	Set to 0.0f (unknown value) - inlink contributions will bring it to a correct level.
`void`	OPICScoringFilter.`injectedScore(Text url, CrawlDatum datum)`
`void`	OPICScoringFilter.`passScoreBeforeParsing(Text url, CrawlDatum datum, Content content)`	Store a float value of CrawlDatum.getScore() under Fetcher.SCORE_KEY.
`void`	OPICScoringFilter.`updateDbScore(Text url, CrawlDatum old, CrawlDatum datum, List<CrawlDatum> inlinked)`	Increase the score by a sum of inlinked scores.

Method parameters in org.apache.nutch.scoring.opic with type arguments of type CrawlDatum
Modifier and Type	Method	Description
`CrawlDatum`	OPICScoringFilter.`distributeScoreToOutlinks(Text fromUrl, ParseData parseData, Collection<Map.Entry<Text,CrawlDatum>> targets, CrawlDatum adjust, int allCount)`	Get a float value from Fetcher.SCORE_KEY, divide it by the number of outlinks and apply.
`void`	OPICScoringFilter.`updateDbScore(Text url, CrawlDatum old, CrawlDatum datum, List<CrawlDatum> inlinked)`	Increase the score by a sum of inlinked scores.
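
The OPIC descriptions above amount to three small pieces of arithmetic: split a page's score across its outlinks, add inlinked scores to the stored score, and dampen the index boost by `scorePower`. The sketch below illustrates that arithmetic with plain floats. It is not the `OPICScoringFilter` code: the class and method names are hypothetical, and reading "dampen by scorePower" as raising the score to that power is an assumption.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in; not part of the Nutch API.
class OpicSketch {

    /** Score share passed to each outlink: the page score divided evenly. */
    static float sharePerOutlink(float pageScore, int outlinkCount) {
        if (outlinkCount <= 0) {
            return 0.0f; // nothing to distribute to
        }
        return pageScore / outlinkCount;
    }

    /** New stored score: the old score increased by the sum of inlinked scores. */
    static float updatedScore(float oldScore, List<Float> inlinkedScores) {
        float sum = 0.0f;
        for (float s : inlinkedScores) {
            sum += s;
        }
        return oldScore + sum;
    }

    /** Assumed dampening: score raised to scorePower, times the initial boost. */
    static float dampenedBoost(float dbScore, float scorePower, float initScore) {
        return (float) Math.pow(dbScore, scorePower) * initScore;
    }

    public static void main(String[] args) {
        System.out.println(sharePerOutlink(1.0f, 4));
        System.out.println(updatedScore(1.0f, Arrays.asList(0.25f, 0.5f)));
        System.out.println(dampenedBoost(4.0f, 0.5f, 2.0f));
    }
}
```

With a `scorePower` below 1, large scores are compressed more than small ones, which is one way to keep heavily linked pages from dominating the index boost.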

Uses of CrawlDatum in org.apache.nutch.scoring.orphan

Methods in org.apache.nutch.scoring.orphan with parameters of type CrawlDatum
Modifier and Type	Method	Description
`void`	OrphanScoringFilter.`orphanedScore(Text url, CrawlDatum datum)`
`void`	OrphanScoringFilter.`updateDbScore(Text url, CrawlDatum old, CrawlDatum datum, List<CrawlDatum> inlinks)`	Used for orphan control.

Method parameters in org.apache.nutch.scoring.orphan with type arguments of type CrawlDatum
Modifier and Type	Method	Description
`void`	OrphanScoringFilter.`updateDbScore(Text url, CrawlDatum old, CrawlDatum datum, List<CrawlDatum> inlinks)`	Used for orphan control.

Uses of CrawlDatum in org.apache.nutch.scoring.similarity

Methods in org.apache.nutch.scoring.similarity that return CrawlDatum
Modifier and Type	Method	Description
`CrawlDatum`	SimilarityModel.`distributeScoreToOutlinks(Text fromUrl, ParseData parseData, Collection<Map.Entry<Text,CrawlDatum>> targets, CrawlDatum adjust, int allCount)`
`CrawlDatum`	SimilarityScoringFilter.`distributeScoreToOutlinks(Text fromUrl, ParseData parseData, Collection<Map.Entry<Text,CrawlDatum>> targets, CrawlDatum adjust, int allCount)`

Methods in org.apache.nutch.scoring.similarity with parameters of type CrawlDatum
Modifier and Type	Method	Description
`CrawlDatum`	SimilarityModel.`distributeScoreToOutlinks(Text fromUrl, ParseData parseData, Collection<Map.Entry<Text,CrawlDatum>> targets, CrawlDatum adjust, int allCount)`
`CrawlDatum`	SimilarityScoringFilter.`distributeScoreToOutlinks(Text fromUrl, ParseData parseData, Collection<Map.Entry<Text,CrawlDatum>> targets, CrawlDatum adjust, int allCount)`

Method parameters in org.apache.nutch.scoring.similarity with type arguments of type CrawlDatum
Modifier and Type	Method	Description
`CrawlDatum`	SimilarityModel.`distributeScoreToOutlinks(Text fromUrl, ParseData parseData, Collection<Map.Entry<Text,CrawlDatum>> targets, CrawlDatum adjust, int allCount)`
`CrawlDatum`	SimilarityScoringFilter.`distributeScoreToOutlinks(Text fromUrl, ParseData parseData, Collection<Map.Entry<Text,CrawlDatum>> targets, CrawlDatum adjust, int allCount)`

Uses of CrawlDatum in org.apache.nutch.scoring.similarity.cosine

Methods in org.apache.nutch.scoring.similarity.cosine that return CrawlDatum
Modifier and Type	Method	Description
`CrawlDatum`	CosineSimilarity.`distributeScoreToOutlinks(Text fromUrl, ParseData parseData, Collection<Map.Entry<Text,CrawlDatum>> targets, CrawlDatum adjust, int allCount)`

Methods in org.apache.nutch.scoring.similarity.cosine with parameters of type CrawlDatum
Modifier and Type	Method	Description
`CrawlDatum`	CosineSimilarity.`distributeScoreToOutlinks(Text fromUrl, ParseData parseData, Collection<Map.Entry<Text,CrawlDatum>> targets, CrawlDatum adjust, int allCount)`

Method parameters in org.apache.nutch.scoring.similarity.cosine with type arguments of type CrawlDatum
Modifier and Type	Method	Description
`CrawlDatum`	CosineSimilarity.`distributeScoreToOutlinks(Text fromUrl, ParseData parseData, Collection<Map.Entry<Text,CrawlDatum>> targets, CrawlDatum adjust, int allCount)`

Uses of CrawlDatum in org.apache.nutch.scoring.tld

Methods in org.apache.nutch.scoring.tld with parameters of type CrawlDatum
Modifier and Type	Method	Description
`float`	TLDScoringFilter.`indexerScore(Text url, NutchDocument doc, CrawlDatum dbDatum, CrawlDatum fetchDatum, Parse parse, Inlinks inlinks, float initScore)`

Uses of CrawlDatum in org.apache.nutch.scoring.urlmeta

Methods in org.apache.nutch.scoring.urlmeta that return CrawlDatum
Modifier and Type	Method	Description
`CrawlDatum`	URLMetaScoringFilter.`distributeScoreToOutlinks(Text fromUrl, ParseData parseData, Collection<Map.Entry<Text,CrawlDatum>> targets, CrawlDatum adjust, int allCount)`	Takes the metatags listed in your "urlmeta.tags" property and looks for them inside the parseData object.

Methods in org.apache.nutch.scoring.urlmeta with parameters of type CrawlDatum
Modifier and Type	Method	Description
`CrawlDatum`	URLMetaScoringFilter.`distributeScoreToOutlinks(Text fromUrl, ParseData parseData, Collection<Map.Entry<Text,CrawlDatum>> targets, CrawlDatum adjust, int allCount)`	Takes the metatags listed in your "urlmeta.tags" property and looks for them inside the parseData object.
`void`	URLMetaScoringFilter.`passScoreBeforeParsing(Text url, CrawlDatum datum, Content content)`	Takes the metadata, specified in your "urlmeta.tags" property, from the datum object and injects it into the content.

Method parameters in org.apache.nutch.scoring.urlmeta with type arguments of type CrawlDatum
Modifier and Type	Method	Description
`CrawlDatum`	URLMetaScoringFilter.`distributeScoreToOutlinks(Text fromUrl, ParseData parseData, Collection<Map.Entry<Text,CrawlDatum>> targets, CrawlDatum adjust, int allCount)`	Takes the metatags listed in your "urlmeta.tags" property and looks for them inside the parseData object.

Uses of CrawlDatum in org.apache.nutch.segment

Methods in org.apache.nutch.segment with parameters of type CrawlDatum
Modifier and Type	Method	Description
`boolean`	SegmentMergeFilter.`filter(Text key, CrawlDatum generateData, CrawlDatum fetchData, CrawlDatum sigData, Content content, ParseData parseData, ParseText parseText, Collection<CrawlDatum> linked)`	The filtering method which gets all information being merged for a given key (URL).
`boolean`	SegmentMergeFilters.`filter(Text key, CrawlDatum generateData, CrawlDatum fetchData, CrawlDatum sigData, Content content, ParseData parseData, ParseText parseText, Collection<CrawlDatum> linked)`	Iterates over all `SegmentMergeFilter` extensions and returns false if any of them returns false.

Method parameters in org.apache.nutch.segment with type arguments of type CrawlDatum
Modifier and Type	Method	Description
`boolean`	SegmentMergeFilter.`filter(Text key, CrawlDatum generateData, CrawlDatum fetchData, CrawlDatum sigData, Content content, ParseData parseData, ParseText parseText, Collection<CrawlDatum> linked)`	The filtering method which gets all information being merged for a given key (URL).
`boolean`	SegmentMergeFilters.`filter(Text key, CrawlDatum generateData, CrawlDatum fetchData, CrawlDatum sigData, Content content, ParseData parseData, ParseText parseText, Collection<CrawlDatum> linked)`	Iterates over all `SegmentMergeFilter` extensions and returns false if any of them returns false.
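
The `SegmentMergeFilters.filter` behaviour described above (reject as soon as any extension returns false) is a plain "all must accept" chain. A minimal sketch follows, using `java.util.function.Predicate` as a stand-in for `SegmentMergeFilter` and a single `String` key in place of the real eight-argument signature. `MergeFilterChainSketch` and `acceptedByAll` are hypothetical names, and whether the real class stops at the first rejection is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in; not part of the Nutch API.
class MergeFilterChainSketch {

    /**
     * Returns false as soon as one filter rejects the record,
     * and true only if every filter accepts it.
     */
    static <T> boolean acceptedByAll(List<Predicate<T>> filters, T record) {
        for (Predicate<T> filter : filters) {
            if (!filter.test(record)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Predicate<String>> filters = new ArrayList<>();
        filters.add(url -> url.startsWith("http"));   // illustrative rule
        filters.add(url -> !url.endsWith(".tmp"));    // illustrative rule

        System.out.println(acceptedByAll(filters, "http://example.org/"));
        System.out.println(acceptedByAll(filters, "http://example.org/a.tmp"));
    }
}
```

Note that an empty filter list accepts everything in this sketch, since no filter has a chance to reject.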

Uses of CrawlDatum in org.apache.nutch.util

Methods in org.apache.nutch.util with parameters of type CrawlDatum
Modifier and Type	Method	Description
`protected ProtocolOutput`	AbstractChecker.`getProtocolOutput(String url, CrawlDatum datum, boolean checkRobotsTxt)`

Uses of CrawlDatum in org.creativecommons.nutch

Methods in org.creativecommons.nutch with parameters of type CrawlDatum
Modifier and Type	Method	Description
`NutchDocument`	CCIndexingFilter.`filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks)`
