Class QuerystringURLNormalizer
- java.lang.Object
  - org.apache.nutch.net.urlnormalizer.querystring.QuerystringURLNormalizer
- All Implemented Interfaces: Configurable, URLNormalizer
public class QuerystringURLNormalizer extends Object implements URLNormalizer
URL normalizer plugin that normalizes query strings by sorting their parameters. Leaving query strings unsorted can produce large numbers of duplicate URLs, such as ?a=x&b=y vs ?b=y&a=x.
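The idea can be illustrated with a minimal, JDK-only sketch. This is not the plugin's actual implementation: the class `QuerySortSketch` and the helper `sortQuery` are hypothetical names introduced here, and the sketch assumes parameters are separated by `&` and that a plain lexicographic sort of the raw `key=value` pairs is acceptable.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the Nutch plugin's source.
final class QuerySortSketch {

    // Returns the URL with its query parameters sorted lexicographically.
    // Assumption: parameters are separated by '&'; pairs are compared as raw strings.
    static String sortQuery(String urlString) {
        // Split off the fragment first so a '?' inside it is never treated as a query.
        int hash = urlString.indexOf('#');
        String base = hash < 0 ? urlString : urlString.substring(0, hash);
        String fragment = hash < 0 ? "" : urlString.substring(hash);

        int q = base.indexOf('?');
        if (q < 0 || q == base.length() - 1) {
            return urlString; // no query string, or an empty one: nothing to sort
        }

        String[] params = base.substring(q + 1).split("&");
        Arrays.sort(params);

        // Reassemble: everything up to and including '?', sorted pairs, then the fragment.
        return base.substring(0, q + 1) + String.join("&", params) + fragment;
    }

    public static void main(String[] args) {
        // Both orderings collapse to the same string after sorting.
        System.out.println(sortQuery("http://example.com/page?b=y&a=x"));
        System.out.println(sortQuery("http://example.com/page?a=x&b=y"));
    }
}
```

With this sketch, both example URLs map to `http://example.com/page?a=x&b=y`, so a crawler that compares normalized URLs would treat them as one page.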
Field Summary
-
Fields inherited from interface org.apache.nutch.net.URLNormalizer
X_POINT_ID
Constructor Summary
| Constructor | Description |
| --- | --- |
| QuerystringURLNormalizer() | |
-
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| Configuration | getConf() | |
| String | normalize(String urlString, String scope) | |
| void | setConf(Configuration conf) | |
Method Detail
-
getConf
public Configuration getConf()
- Specified by: getConf in interface Configurable
-
setConf
public void setConf(Configuration conf)
- Specified by: setConf in interface Configurable
-
normalize
public String normalize(String urlString, String scope) throws MalformedURLException
- Specified by: normalize in interface URLNormalizer
- Throws: MalformedURLException