Package org.apache.nutch.indexer.jexl

This plugin implements a dynamic indexing filter that uses JEXL expressions to filter documents based on the page's metadata.

Available primitives in the JEXL context:

  • status, fetchTime, modifiedTime, retries, interval, score, signature, url, text, title

Available objects in the JEXL context:

  • httpStatus - contains majorCode, minorCode, message
  • documentMeta, contentMeta, parseMeta - contain all the Metadata properties.
    Each property value is always an array of Strings, so if you expect a single value, use [0].
  • doc - contains all the NutchFields from the NutchDocument.
    Each property value is always an array of Objects.
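
Example expressions:

The expressions below are a minimal sketch of how the names listed above might be combined. They assume that the filter keeps a document when the expression evaluates to true, that score and retries are numeric, and that url is a String. The metadata key lang and all threshold values are hypothetical and chosen only for illustration.

```
score > 0.5 && retries < 3
url =~ '^https://.*'
documentMeta.lang[0] == 'en'
```

The last expression uses [0] because every metadata value is exposed as an array, even when only one value is present.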