Class NutchTool

    • Field Detail

      • currentJob

        protected Job currentJob
      • numJobs

        protected int numJobs
      • currentJobNum

        protected int currentJobNum
    • Constructor Detail

      • NutchTool

        public NutchTool()
    • Method Detail

      • run

        public abstract Map<String,Object> run(Map<String,Object> args,
                                               String crawlId)
                                              throws Exception
        Runs the tool, using a map of arguments. May return results, or null.
        Parameters:
        args - a Map of arguments to pass to the tool
        crawlId - a crawl identifier to associate with the tool invocation
        Returns:
        a Map of results if the tool executes successfully, otherwise null
        Throws:
        Exception - if there is an error during the tool execution
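        The contract above gives a caller three outcomes to handle: a results map, null, or an exception. The following is a minimal sketch of that contract, not Nutch code. ToolSketch stands in for NutchTool so the example compiles without Nutch or Hadoop, and EchoTool, the "topN" argument, and the result keys are hypothetical names chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for NutchTool (assumption): mirrors only the run(...) signature.
abstract class ToolSketch {
    public abstract Map<String, Object> run(Map<String, Object> args, String crawlId)
            throws Exception;
}

// Hypothetical tool; the "topN" argument and the result keys are illustrative only.
class EchoTool extends ToolSketch {
    @Override
    public Map<String, Object> run(Map<String, Object> args, String crawlId) throws Exception {
        if (crawlId == null || crawlId.isEmpty()) {
            throw new IllegalArgumentException("crawlId is required");
        }
        Object topN = args.get("topN");
        if (topN == null) {
            return null; // the contract allows null when there is nothing to report
        }
        Map<String, Object> results = new HashMap<>();
        results.put("crawlId", crawlId);
        results.put("topN", topN);
        return results;
    }
}

class RunSketchDemo {
    // Handles all three outcomes: a results map, null, or an exception.
    static String describe(ToolSketch tool, Map<String, Object> args, String crawlId) {
        try {
            Map<String, Object> results = tool.run(args, crawlId);
            return results == null ? "no results" : "results: " + results.size() + " entries";
        } catch (Exception e) {
            return "failed: " + e.getMessage();
        }
    }
}
```

        Checking for null before reading the map keeps callers correct for tools that report nothing on success.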
      • getProgress

        public float getProgress()
        Get relative progress of the tool. Progress is represented as a float in range [0,1] where 1 is complete.
        Returns:
        a float in range [0,1].
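        This page does not say how the value is computed. One plausible way to derive it from the numJobs and currentJobNum fields is sketched below. Everything here is an assumption: ProgressSketch is a stand-in class, currentJobFraction is a hypothetical substitute for the running job's own progress, and currentJobNum is treated as a zero-based index.

```java
// Stand-in that mirrors only the two counter fields; not the Nutch class itself.
class ProgressSketch {
    protected int numJobs;              // total number of jobs the tool will run
    protected int currentJobNum;        // zero-based index of the running job (assumption)
    protected float currentJobFraction; // hypothetical: progress of the running job, 0..1

    ProgressSketch(int numJobs, int currentJobNum, float currentJobFraction) {
        this.numJobs = numJobs;
        this.currentJobNum = currentJobNum;
        this.currentJobFraction = currentJobFraction;
    }

    public float getProgress() {
        float res = currentJobFraction;
        // With several jobs, each one contributes an equal share of the total.
        if (numJobs > 1) {
            res = (currentJobNum + res) / numJobs;
        }
        // Clamp so callers always see a value in [0,1].
        return Math.max(0f, Math.min(1f, res));
    }
}
```

        Under these assumptions, a tool halfway through the second of four jobs reports (1 + 0.5) / 4 = 0.375.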
      • getStatus

        public Map<String,Object> getStatus()
        Returns the current status of the running tool.
        Returns:
        a populated Map, the fields of which can be accessed to obtain status.
      • stopJob

        public boolean stopJob()
                        throws Exception
        Stop the job with the possibility to resume. Subclasses should override this, since by default it calls killJob().
        Returns:
        true if succeeded, false otherwise
        Throws:
        Exception - if there is an error stopping the current Job
      • killJob

        public boolean killJob()
                        throws Exception
        Kill the job immediately. Clients should assume that any results that the job produced so far are in an inconsistent state or missing.
        Returns:
        true if succeeded, false otherwise.
        Throws:
        Exception - if there is an error killing the current Job
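        The relationship between stopJob() and killJob() can be sketched as follows: the base class delegates stopJob() to killJob(), and a subclass that can resume overrides stopJob(). This is an illustration under stated assumptions, not the Nutch implementation. StoppableSketch, ResumableTool, the running and checkpointSaved flags, and the stopQuietly helper are all hypothetical.

```java
// Stand-in base class (assumption): mirrors only the stop/kill method signatures.
class StoppableSketch {
    protected boolean running = true;

    public boolean killJob() throws Exception {
        running = false; // results so far may be inconsistent or missing
        return true;
    }

    // Default behaviour described above: stopping falls back to killing.
    public boolean stopJob() throws Exception {
        return killJob();
    }
}

// Hypothetical subclass that can resume, so it overrides stopJob().
class ResumableTool extends StoppableSketch {
    protected boolean checkpointSaved = false;

    @Override
    public boolean stopJob() throws Exception {
        checkpointSaved = true; // placeholder for persisting resumable state
        running = false;
        return true;
    }
}

class StopSketchDemo {
    // Maps an exception to false so callers get a single success flag.
    static boolean stopQuietly(StoppableSketch tool) {
        try {
            return tool.stopJob();
        } catch (Exception e) {
            return false;
        }
    }
}
```

        A tool that does not override stopJob() loses any ability to resume, which is why the description above recommends overriding it.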