Class OutlinkExtractor

    • Constructor Detail

      • OutlinkExtractor

        public OutlinkExtractor()
    • Method Detail

      • getOutlinks

        public static Outlink[] getOutlinks(String plainText,
                                            Configuration conf)
        Extracts Outlinks from the given plain text. Applying this method to input that is not plain text can result in extremely long runtimes for pathological cases (PostScript is a known example).
        Parameters:
        plainText - the plain text from which URLs should be extracted.
        conf - a populated Configuration
        Returns:
        Array of Outlinks found in plainText
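To illustrate the general idea of pulling URLs out of plain text, here is a minimal, self-contained sketch. It is not Nutch's implementation: the class `PlainTextLinkSketch`, the method `extractUrls`, and the regular expression are all assumptions made for this example. The pattern deliberately avoids nested quantifiers, so matching stays roughly linear in the input length even on input that is not plain text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of Nutch.
class PlainTextLinkSketch {

    // Assumed, deliberately simple pattern: a scheme, "://", then a run of
    // characters that are not whitespace, angle brackets, or quotes.
    private static final Pattern URL_PATTERN =
            Pattern.compile("\\b(?:https?|ftp)://[^\\s<>\"']+");

    // Returns every substring of plainText that matches URL_PATTERN,
    // in order of appearance. A null input yields an empty list.
    static List<String> extractUrls(String plainText) {
        List<String> urls = new ArrayList<>();
        if (plainText == null) {
            return urls;
        }
        Matcher m = URL_PATTERN.matcher(plainText);
        while (m.find()) {
            urls.add(m.group());
        }
        return urls;
    }

    public static void main(String[] args) {
        String text = "See http://example.com/a and https://example.org/b?x=1 for details.";
        for (String url : extractUrls(text)) {
            System.out.println(url);
        }
    }
}
```

A real extractor would also need to validate and normalize each candidate URL; the sketch only shows the scanning step.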
      • getOutlinks

        public static Outlink[] getOutlinks(String plainText,
                                            String anchor,
                                            Configuration conf)
        Extracts Outlinks from the given plain text and adds anchor to each extracted Outlink.
        Parameters:
        plainText - the plain text from which URLs should be extracted.
        anchor - the anchor of the URL
        conf - a populated Configuration
        Returns:
        Array of Outlinks found in plainText
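The anchor-attaching step can be pictured as pairing each extracted URL with the same anchor string. The sketch below is an assumption-laden illustration, not Nutch code: `AnchoredLink` is a hypothetical stand-in for a link-plus-anchor value (it is not Nutch's `Outlink` class), and `withAnchor` is a made-up helper name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a link with anchor text; not Nutch's Outlink.
final class AnchoredLink {
    final String toUrl;
    final String anchor;

    AnchoredLink(String toUrl, String anchor) {
        this.toUrl = toUrl;
        this.anchor = anchor;
    }
}

// Hypothetical illustration class; not part of Nutch.
class AnchorSketch {

    // Pairs every URL in the list with the same anchor text,
    // preserving the order of the input list.
    static List<AnchoredLink> withAnchor(List<String> urls, String anchor) {
        List<AnchoredLink> links = new ArrayList<>();
        for (String url : urls) {
            links.add(new AnchoredLink(url, anchor));
        }
        return links;
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList("http://example.com/a", "http://example.com/b");
        for (AnchoredLink link : withAnchor(urls, "Example site")) {
            System.out.println(link.toUrl + " [" + link.anchor + "]");
        }
    }
}
```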