Class ExtParser

  • All Implemented Interfaces:
    Configurable, Parser, Pluggable

    public class ExtParser
    extends Object
    implements Parser
    A wrapper that invokes an external command to do the actual parsing.
    Author:
    John Xing
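The class description says ExtParser delegates parsing to an external command. A minimal sketch of that general pattern, using only the JDK (Java 9 or later), might look as follows. The class name ExternalCommandSketch, the method runCommand, the temporary-file approach, and the timeout parameter are all illustrative assumptions; this is not the ExtParser source, and it does not show how ExtParser selects or configures its command.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Nutch.
class ExternalCommandSketch {

    /**
     * Runs the given command with `input` on its standard input and returns
     * whatever the command wrote to standard output, decoded as UTF-8.
     * Input and output go through temporary files so that neither side can
     * block on a full pipe buffer.
     */
    static String runCommand(List<String> command, byte[] input, long timeoutSeconds)
            throws IOException, InterruptedException {
        Path in = Files.createTempFile("ext-in", ".bin");
        Path out = Files.createTempFile("ext-out", ".txt");
        try {
            Files.write(in, input);
            Process process = new ProcessBuilder(command)
                    .redirectInput(in.toFile())
                    .redirectOutput(out.toFile())
                    .redirectError(ProcessBuilder.Redirect.DISCARD) // Java 9+
                    .start();
            // Give up if the command does not finish within the timeout.
            if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
                process.destroyForcibly();
                throw new IOException("External command timed out: " + command);
            }
            if (process.exitValue() != 0) {
                throw new IOException("External command failed with exit code "
                        + process.exitValue() + ": " + command);
            }
            return new String(Files.readAllBytes(out), StandardCharsets.UTF_8);
        } finally {
            Files.deleteIfExists(in);
            Files.deleteIfExists(out);
        }
    }
}
```

On a system where the cat utility is available, calling runCommand(List.of("cat"), bytes, 5) would echo the input back as text. Routing the data through temporary files trades some disk I/O for simplicity; a streaming variant would need separate threads for writing stdin and reading stdout.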
    • Constructor Detail

      • ExtParser

        public ExtParser()
    • Method Detail

      • getParse

        public ParseResult getParse(Content content)
        Description copied from interface: Parser

        This method parses the given content and returns a map of <key, parse> pairs. Each Parse instance is persisted under its key.

        Note: Meta-redirects should be followed only when they come from the original URL. For example:
        Assume the fetcher is in parsing mode and is currently processing foo.bar.com/redirect.html. If this URL contains a meta redirect to another URL, the fetcher should follow the redirect only if the map contains an entry of the form <"foo.bar.com/redirect.html", Parse with a ParseStatus indicating the redirect>.

        Specified by:
        getParse in interface Parser
        Parameters:
        content - Content to be parsed
        Returns:
        a map containing <key, parse> pairs
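The meta-redirect rule in the getParse description can be sketched as a lookup keyed by the original URL. The types below (Status, ParseEntry) and the method shouldFollowRedirect are hypothetical stand-ins invented for illustration; they are not the Nutch Parse, ParseStatus, or ParseResult classes, and the real fetcher logic may differ.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration, not part of Nutch.
class MetaRedirectSketch {

    /** Stand-in for a parse status; assumed values, not the Nutch ParseStatus codes. */
    enum Status { SUCCESS, REDIRECT, FAILED }

    /** Stand-in for a parse: a status plus an optional redirect target. */
    static final class ParseEntry {
        final Status status;
        final String redirectTarget; // null unless status is REDIRECT

        ParseEntry(Status status, String redirectTarget) {
            this.status = status;
            this.redirectTarget = redirectTarget;
        }
    }

    /**
     * Returns true only if the map holds an entry under the original URL
     * whose status indicates a redirect. A redirect recorded under any
     * other key is ignored.
     */
    static boolean shouldFollowRedirect(String originalUrl, Map<String, ParseEntry> parses) {
        ParseEntry entry = parses.get(originalUrl);
        return entry != null && entry.status == Status.REDIRECT;
    }

    public static void main(String[] args) {
        Map<String, ParseEntry> parses = new HashMap<>();
        parses.put("foo.bar.com/redirect.html",
                new ParseEntry(Status.REDIRECT, "foo.bar.com/target.html"));
        parses.put("foo.bar.com/embedded.html",
                new ParseEntry(Status.SUCCESS, null));

        // Redirect is keyed by the URL being processed, so it is followed.
        System.out.println(shouldFollowRedirect("foo.bar.com/redirect.html", parses));
        // Entry exists but carries no redirect status, so nothing is followed.
        System.out.println(shouldFollowRedirect("foo.bar.com/embedded.html", parses));
    }
}
```

Keying the check on the original URL means a redirect found in a sub-document (for instance, one entry among several produced from a single fetch) does not cause the fetcher to leave the page it was asked to process.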