Class ArcRecordReader

  • All Implemented Interfaces:
    Closeable, AutoCloseable

    public class ArcRecordReader
    extends RecordReader<Text,BytesWritable>
    The ArcRecordReader class provides a record reader that reads records from arc files. Arc files are essentially tars of gzips: each record in an arc file is individually gzip-compressed, and multiple records are concatenated to form a complete arc file. Arc files are used by the Internet Archive and grub projects. For more information on the arc file format, see ArcFileFormat.
    See Also:
    archive.org, grub.org
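The "concatenated gzip records" layout described above can be illustrated with plain JDK classes. The following is a minimal sketch, not the ArcRecordReader implementation: the class name ConcatenatedGzipSketch and its methods are hypothetical, and it assumes every member has the fixed 10-byte gzip header with no optional fields, which is what GZIPOutputStream writes. It builds a byte array of back-to-back gzip members and then walks it member by member, using the inflater's consumed-byte count to find where the next record starts.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.DataFormatException;
import java.util.zip.GZIPOutputStream;
import java.util.zip.Inflater;

// Hypothetical helper for illustration only; not part of Nutch or Hadoop.
final class ConcatenatedGzipSketch {

    private static final int HEADER_LEN = 10; // fixed gzip header, assumes FLG == 0
    private static final int TRAILER_LEN = 8; // CRC32 + uncompressed size

    // Compress one record into a standalone gzip member.
    static byte[] gzip(String record) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(record.getBytes(StandardCharsets.UTF_8));
        }
        return bos.toByteArray();
    }

    // Concatenate the gzip members back to back, as in the layout described above.
    static byte[] concat(String... records) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        for (String record : records) {
            bos.write(gzip(record));
        }
        return bos.toByteArray();
    }

    // Walk the buffer one gzip member at a time and return the decompressed records.
    static List<String> readRecords(byte[] data) throws DataFormatException {
        List<String> records = new ArrayList<>();
        int pos = 0;
        while (pos < data.length) {
            if (data.length - pos < HEADER_LEN + TRAILER_LEN
                    || (data[pos] & 0xff) != 0x1f
                    || (data[pos + 1] & 0xff) != 0x8b) {
                throw new DataFormatException("no gzip member at offset " + pos);
            }
            int bodyStart = pos + HEADER_LEN;
            Inflater inflater = new Inflater(true); // raw deflate, header skipped by hand
            inflater.setInput(data, bodyStart, data.length - bodyStart);
            ByteArrayOutputStream record = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new DataFormatException("truncated member at offset " + pos);
                }
                record.write(buf, 0, n);
            }
            // Compressed bytes consumed tell us where this member's trailer begins.
            int consumed = (int) inflater.getBytesRead();
            inflater.end();
            records.add(new String(record.toByteArray(), StandardCharsets.UTF_8));
            pos = bodyStart + consumed + TRAILER_LEN;
        }
        return records;
    }
}
```

For example, readRecords(concat("alpha", "beta")) yields the two original strings as separate records. The CRC32 in each trailer is skipped rather than verified, and members with optional header fields (file name, comment, extra data) are not handled; a production reader would need both.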
    • Field Detail

      • splitStart

        protected long splitStart
      • pos

        protected long pos
      • splitEnd

        protected long splitEnd
      • splitLen

        protected long splitLen
      • fileLen

        protected long fileLen
    • Constructor Detail

      • ArcRecordReader

        public ArcRecordReader(Configuration conf,
                               FileSplit split)
                        throws IOException
        Constructor that sets the configuration and file split.
        Parameters:
        conf - The job configuration.
        split - The file split to read from.
        Throws:
        IOException - If an I/O error occurs while initializing the file split.