Package org.apache.nutch.parse.html
Class HTMLMetaProcessor
- java.lang.Object
  - org.apache.nutch.parse.html.HTMLMetaProcessor
public class HTMLMetaProcessor extends Object
Class for parsing META directives from DOM trees. This class specifically handles Robots META directives (all, none, nofollow, noindex), BASE HREF tags, and HTTP-EQUIV no-cache instructions. All meta directives are stored in an HTMLMetaTags instance.
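As a rough illustration of the Robots directives listed above, the following sketch maps the content string of a robots META tag onto two flags. It is not taken from the Nutch source: the class RobotsDirectiveSketch, its Flags holder, and the comma-splitting approach are all assumptions made for this example, and the real class stores its results in HTMLMetaTags.

```java
import java.util.Locale;

// Hypothetical sketch; not the Nutch implementation.
final class RobotsDirectiveSketch {

    /** Hypothetical holder for the two flags; stands in for HTMLMetaTags. */
    static final class Flags {
        public boolean noIndex;
        public boolean noFollow;
    }

    /** Interprets a robots META content value such as "noindex, follow". */
    static Flags parse(String content) {
        Flags flags = new Flags();
        if (content == null) {
            return flags;
        }
        for (String token : content.toLowerCase(Locale.ROOT).split(",")) {
            switch (token.trim()) {
                case "none":      // shorthand for noindex plus nofollow
                    flags.noIndex = true;
                    flags.noFollow = true;
                    break;
                case "noindex":
                    flags.noIndex = true;
                    break;
                case "nofollow":
                    flags.noFollow = true;
                    break;
                default:          // "all", "index", "follow" keep the defaults
                    break;
            }
        }
        return flags;
    }

    public static void main(String[] args) {
        Flags f = parse("NOINDEX, follow");
        System.out.println(f.noIndex + " " + f.noFollow); // prints "true false"
    }
}
```

Matching whole tokens rather than substrings is a choice made for this sketch; it avoids treating an unrelated value that merely contains "none" as a directive.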
Constructor Summary
Constructors
HTMLMetaProcessor()
Method Summary
Modifier and Type: static void
Method: getMetaTags(HTMLMetaTags metaTags, Node node, URL currURL)
Description: Sets the indicators in robotsMeta to appropriate values, based on any META tags found under the given node.
Method Detail
getMetaTags
public static final void getMetaTags(HTMLMetaTags metaTags, Node node, URL currURL)
Sets the indicators in robotsMeta to appropriate values, based on any META tags found under the given node.
Parameters:
metaTags - an HTMLMetaTags to populate with tags discovered in the given Node
node - a DOM Node to process and extract metadata from
currURL - the canonical URL associated with the metatags and Node
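The general shape of such a method, a recursive walk over a DOM Node that records directives and resolves BASE HREF against the current URL, can be sketched with JDK classes alone. Everything below is an assumption for illustration: MetaWalkSketch, Result, collect, and parseAndCollect are hypothetical names, the input must be well-formed XML with lowercase attribute names, and the actual Nutch code may differ in every detail.

```java
import java.io.ByteArrayInputStream;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical sketch; not the Nutch implementation.
final class MetaWalkSketch {

    /** Hypothetical result holder; stands in for HTMLMetaTags. */
    static final class Result {
        public boolean noIndex, noFollow, noCache;
        public URI baseHref;
    }

    /** Visits node and all of its descendants, filling in out. */
    static void collect(Result out, Node node, URL currURL) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            Element el = (Element) node;
            String tag = el.getNodeName().toLowerCase(Locale.ROOT);
            if (tag.equals("meta")) {
                // getAttribute returns "" when the attribute is absent
                String name = el.getAttribute("name").toLowerCase(Locale.ROOT);
                String equiv = el.getAttribute("http-equiv").toLowerCase(Locale.ROOT);
                String content = el.getAttribute("content").toLowerCase(Locale.ROOT).trim();
                if (name.equals("robots")) {
                    List<String> tokens = Arrays.asList(content.split("\\s*,\\s*"));
                    boolean none = tokens.contains("none");
                    out.noIndex |= none || tokens.contains("noindex");
                    out.noFollow |= none || tokens.contains("nofollow");
                } else if (equiv.equals("pragma") && content.contains("no-cache")) {
                    out.noCache = true;
                }
            } else if (tag.equals("base") && el.hasAttribute("href")) {
                try {
                    // Resolve a possibly relative href against the page URL
                    out.baseHref = currURL.toURI().resolve(el.getAttribute("href"));
                } catch (URISyntaxException e) {
                    throw new IllegalArgumentException(e);
                }
            }
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            collect(out, c, currURL);
        }
    }

    /** Convenience wrapper: parses well-formed XHTML text and walks it. */
    static Result parseAndCollect(String xhtml, String url) {
        try {
            Node root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)));
            Result out = new Result();
            collect(out, root, URI.create(url).toURL());
            return out;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Result r = parseAndCollect(
                "<html><head><base href=\"/docs/\"/>"
                + "<meta name=\"robots\" content=\"noindex,nofollow\"/>"
                + "<meta http-equiv=\"pragma\" content=\"no-cache\"/></head></html>",
                "http://example.com/a/b.html");
        System.out.println(r.noIndex + " " + r.noFollow + " " + r.noCache + " " + r.baseHref);
    }
}
```

The sketch passes the result holder in as the first argument, mirroring the documented signature in which the caller supplies the HTMLMetaTags instance to populate.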