Class XmlInputStream

  • All Implemented Interfaces:
    java.io.Closeable, java.lang.AutoCloseable

    @NotThreadSafe
    public class XmlInputStream
    extends java.io.FilterInputStream
    Cleans up XML that is often badly malformed. Primarily, this converts named HTML entities into their numeric character references (the HTML-encoded Unicode code point).
    1. Strips leading white space
    2. Recodes named entities such as &pound; to &#...;
    3. Recodes a lone & as &amp;
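
    The three steps above can be illustrated on a plain string. The following is a minimal sketch, not the code of this class: the class name XmlCleanupSketch, the method clean and the three-entry entity table are hypothetical and exist only for illustration, and the real implementation works on a byte stream rather than on a String.

```java
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of the three clean-up steps; not the actual XmlInputStream code.
final class XmlCleanupSketch {

    // Assumed, deliberately tiny entity table (name -> Unicode code point).
    private static final Map<String, Integer> ENTITIES = Map.of(
            "pound", 163,
            "nbsp", 160,
            "copy", 169);

    // The five entities that XML itself predefines; these are already valid.
    private static final Set<String> XML_PREDEFINED =
            Set.of("amp", "lt", "gt", "quot", "apos");

    // An ampersand, optionally followed by a numeric or named reference ending in ';'.
    private static final Pattern AMP = Pattern.compile(
            "&(?:(#[0-9]+|#x[0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);)?");

    static String clean(String xml) {
        // Step 1: strip leading white space.
        String trimmed = xml.stripLeading();

        Matcher m = AMP.matcher(trimmed);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String name = m.group(1);
            String replacement;
            if (name == null) {
                // Step 3: a lone ampersand becomes &amp;
                replacement = "&amp;";
            } else if (name.startsWith("#") || XML_PREDEFINED.contains(name)) {
                // Already a valid XML reference: keep it unchanged.
                replacement = m.group();
            } else if (ENTITIES.containsKey(name)) {
                // Step 2: a known named entity becomes a numeric reference.
                replacement = "&#" + ENTITIES.get(name) + ";";
            } else {
                // Unknown name: escape the ampersand so the result stays well-formed.
                replacement = "&amp;" + name + ";";
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(clean("  <a>&pound;5 & up</a>"));
    }
}
```

    How unknown entity names are handled here (escaping the ampersand) is a choice made for this sketch; the class description does not say what XmlInputStream does in that case.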

    This is a slightly modified version (classes and methods renamed) of a Stack Overflow answer: https://stackoverflow.com/questions/7286428/help-the-java-sax-parser-to-understand-bad-xml

    Author:
    https://stackoverflow.com/users/823393/oldcurmudgeon
    • Field Summary

      • Fields inherited from class java.io.FilterInputStream

        in
    • Constructor Summary

      Constructors 
      Constructor Description
      XmlInputStream​(java.io.InputStream in)
      Constructs a new XML Input Stream.
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      int length()
      NB: This is a Troll length (i.e. it goes 1, 2, many), so a value of 2 actually means "at least 2".
      int read()
      Reads the next byte.
      int read​(@org.jetbrains.annotations.NotNull byte[] data, int offset, int length)
      Reads up to length bytes from the stream into the given byte array, starting at the given offset.
      java.lang.String toString()
      To string implementation.
      • Methods inherited from class java.io.FilterInputStream

        available, close, mark, markSupported, read, reset, skip
      • Methods inherited from class java.io.InputStream

        nullInputStream, readAllBytes, readNBytes, readNBytes, transferTo
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
    • Constructor Detail

      • XmlInputStream

        public XmlInputStream​(java.io.InputStream in)
        Constructs a new XML Input Stream.
        Parameters:
        in - the base input stream
    • Method Detail

      • length

        public int length()
        NB: This is a Troll length (i.e. it goes 1, 2, many), so a value of 2 actually means "at least 2".
        Returns:
        the length
      • read

        public int read()
                 throws java.io.IOException
        Reads the next byte.
        Overrides:
        read in class java.io.FilterInputStream
        Returns:
        the byte read
        Throws:
        java.io.IOException - thrown when there is a problem reading
      • read

        public int read(@org.jetbrains.annotations.NotNull byte[] data,
                        int offset,
                        int length)
                 throws java.io.IOException
        Reads up to length bytes from the stream into the given byte array, starting at the given offset.
        Overrides:
        read in class java.io.FilterInputStream
        Parameters:
        data - the buffer to store the data read
        offset - the offset in the buffer to start writing
        length - the length of data to read
        Returns:
        the number of bytes read
        Throws:
        java.io.IOException - thrown when there is an issue with the underlying stream
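
        A typical way to consume a stream through this three-argument read is a loop that stops at end of stream. The sketch below is self-contained, so it uses java.io.ByteArrayInputStream as a stand-in for an XmlInputStream; the class name ReadLoopSketch and the method drain are hypothetical. It assumes the usual java.io.InputStream contract, in which read returns -1 at end of stream and may return fewer bytes than requested; this page does not state that explicitly for XmlInputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper showing the standard read(byte[], int, int) loop.
final class ReadLoopSketch {

    static byte[] drain(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] data = new byte[8]; // small buffer so the loop runs more than once
        int n;
        // Assumption: -1 signals end of stream, as in the InputStream contract.
        while ((n = in.read(data, 0, data.length)) != -1) {
            // Only the first n bytes of the buffer are valid after each call.
            out.write(data, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] source = "<a>text</a>".getBytes(StandardCharsets.UTF_8);
        // Stand-in stream; with the real class this would be new XmlInputStream(...).
        try (InputStream in = new ByteArrayInputStream(source)) {
            System.out.println(new String(drain(in), StandardCharsets.UTF_8));
        }
    }
}
```

        Writing only the first n bytes on each pass matters because the return value, not the requested length, says how much of the buffer was filled.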
      • toString

        public java.lang.String toString()
        To string implementation.
        Overrides:
        toString in class java.lang.Object
        Returns:
        a string representation of the data given and read from the stream.