Package org.apache.nutch.indexer.jexl
This plugin implements a dynamic indexing filter that uses JEXL expressions to filter documents based on the page's metadata.
Available primitives in the JEXL context:
- status, fetchTime, modifiedTime, retries, interval, score, signature, url, text, title
Available objects in the JEXL context:
- httpStatus - contains majorCode, minorCode, message
- documentMeta, contentMeta, parseMeta - contain all the Metadata properties.
  Each property value is always an array of Strings (so if you expect one value, use [0]).
- doc - contains all the NutchFields from the NutchDocument.
  Each property value is always an array of Objects.
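The array-valued shape of the metadata objects can be sketched in plain Java. This is only an illustration of why a single expected value has to be read with [0]; it does not use the Nutch or JEXL APIs, and the class name, the helper names, and the sample keys (lang, keywords) are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the shape of documentMeta / contentMeta / parseMeta:
// every property name maps to an array of Strings, even when there is one value.
final class MetadataShapeSketch {

    // Builds a sample metadata map; the keys and values are made up for illustration.
    static Map<String, String[]> sampleParseMeta() {
        Map<String, String[]> meta = new HashMap<>();
        meta.put("lang", new String[] {"en"});                 // single value, still an array
        meta.put("keywords", new String[] {"nutch", "jexl"});  // multiple values
        return meta;
    }

    // Returns the first value of a property, or null if the property is absent or empty.
    // This mirrors what indexing with [0] does for a property expected to hold one value.
    static String firstValue(Map<String, String[]> meta, String key) {
        String[] values = meta.get(key);
        return (values == null || values.length == 0) ? null : values[0];
    }

    public static void main(String[] args) {
        Map<String, String[]> meta = sampleParseMeta();
        System.out.println(firstValue(meta, "lang"));
        System.out.println(meta.get("keywords").length);
    }
}
```

Comparing the property itself against a String would compare an array with a scalar, which is why the [0] index is needed when a single value is expected.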
Class Summary
- JexlIndexingFilter - An IndexingFilter that allows filtering of documents based on a JEXL expression.