Releases: openpreserve/jhove

JHOVE 1.6

25 May 17:12

XML HANDLER AND TEXT HANDLER

  1. The default version of MIX is now 2.0. In earlier versions it was 0.2.
    However, MIX 2.0 still isn't supported in the text handler, so it will
    produce 1.0 output by default. The XML handler will produce MIX 2.0
    output.

TIFF MODULE

  1. JHOVE returned a "String index out of range: 4" exception during
    TIFF validation when a TIFF file contained an empty (not NULL) date/time
    field. This has been corrected so that a date/time field with
    the wrong length won't be parsed but will be reported as an error instead.
  2. If text tags contain characters which aren't printable ASCII, these
    are now output as escape sequences so that invalid XML isn't
    output.
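
Both TIFF fixes above can be illustrated with a minimal sketch. It is not JHOVE's actual code: the class name, method names, error messages, and the "\xNN" escape format are hypothetical. The only external fact relied on is that a TIFF DateTime value has the fixed form "YYYY:MM:DD HH:MM:SS" (19 characters).

```java
import java.util.ArrayList;
import java.util.List;

final class TiffTextSketch {
    // A TIFF DateTime value has the fixed form "YYYY:MM:DD HH:MM:SS".
    static final int DATE_TIME_LENGTH = 19;

    /** Returns the year, or -1 after recording an error if the field is unusable. */
    static int parseYear(String dateTime, List<String> errors) {
        // Check the length first, so substring() is never called on a short string.
        if (dateTime == null || dateTime.length() != DATE_TIME_LENGTH) {
            int len = (dateTime == null) ? 0 : dateTime.length();
            errors.add("Invalid DateTime length: " + len);
            return -1;
        }
        try {
            return Integer.parseInt(dateTime.substring(0, 4));
        } catch (NumberFormatException e) {
            errors.add("Invalid DateTime year: " + dateTime.substring(0, 4));
            return -1;
        }
    }

    /** Replaces anything outside printable ASCII with a hypothetical hex escape. */
    static String escapeNonPrintable(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c >= 0x20 && c <= 0x7E) {
                sb.append(c);
            } else {
                sb.append(String.format("\\x%02X", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> errors = new ArrayList<>();
        System.out.println(parseYear("", errors));        // empty field: no exception
        System.out.println(errors);
        System.out.println(escapeNonPrintable("a\tb"));
    }
}
```

The point of the length check is that an empty field produces an error message rather than a StringIndexOutOfBoundsException.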

UTF-8 MODULE

  1. Updated to Unicode 6.0.0.

JHOVE 1.5

25 May 17:14

PDF MODULE

  1. An ArrayIndexOutOfBoundsException was thrown on a PDF with an invalid
    object number in the cross-reference stream. In JHOVE 1.5, this is
    correctly reported as a violation of well-formedness.

UTF-8 MODULE

  1. With some very simple UTF-8 files, JHOVE handlers would throw an exception
    processing them, and the GUI would fail silently. This happened with files
    using no UTF-8 blocks. This has been fixed.

TEXTMD (multiple modules)

  1. TextMD metadata can now optionally be reported. To get this, it's
    necessary to edit jhove.conf. TextMD can be enabled on a per-module
    basis for HtmlModule, AsciiModule, Utf8Module, and XmlModule.
    The <module> element for each chosen module must contain the element
    <param>withtextmd=true</param> (no spaces).
  2. The TextMD feature was added by Thomas Ledoux.

JHOVE 1.4

25 May 17:17

PDF MODULE

  1. The PDF/A profile has been updated to the final version of
    19005-1:2005(E) and made more thorough. Among the changes:

    a. The set-state and no-op actions disqualify a PDF/A candidate.

    b. The ASCIIHexDecode and ASCII85Decode filters no longer
    disqualify a candidate.

    c. Checking of outlines has been added.

    d. Additional checking of Type 1 fonts and symbolic fonts.

    e. Bug fix in checking type 2 subfonts.

    f. An LZW filter in an image object disqualifies a candidate.

    g. The xpacket processing instruction is checked for attributes
    which disqualify from PDF/A.

    h. Conformity to implementation limits is checked as a condition
    of PDF/A conformity.

JPEG2000 MODULE

  1. The pathological case of an image with no components is checked so
    it won't cause a crash.

XML HANDLER

  1. A reset() function has been added so that if the handler is reused,
    it will return to a valid initial state.

JHOVE 1.3

25 May 17:18

GENERAL

  1. The build.xml files now force compilation to Java 1.4, preventing
    accidental distributions that aren't 1.4-compatible.
  2. Spaces are allowed in file paths on Windows, if the path is
    enclosed in quotes. This fix had been in version 1.1i, and had been
    lost since then.

PDF MODULE

  1. According to the PDF 1.6 specification, table 3.4, parameters for a
    stream filter can be either a dictionary or the null object. The null
    object was treated as an error; it is now allowed.
  2. Object stream handling was seriously buggy, causing rejection of
    well-formed and valid files. It has been improved.
  3. In PDF 1.4, an outline dictionary unconditionally must have a "First"
    and a "Last" entry. JHOVE follows this requirement, declaring a file
    invalid if it isn't met. However, PDF 1.6 relaxes the requirement,
    applying it only "if there are any open or closed outline entries."
    Thus, an empty outline dictionary with no "First" or "Last" entry
    is valid. It is now accepted (for all PDF versions).
  4. If a page number tree in a PDF file is missing an expected "Nums"
    entry, this was being reported as an invalid date. A more appropriate
    error message is now given.

TIFF MODULE

  1. TIFF tag 33723 (IPTC-NAA) was considered valid only if the data
    type is ASCII or LONG. But according to Aware Systems, the valid
    types are UNDEFINED and BYTE. All four types are now accepted.
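
As a rough illustration of the relaxed check, the sketch below accepts all four field types for tag 33723. The numeric type codes are those of the TIFF 6.0 specification; the class and method names are hypothetical and do not come from JHOVE.

```java
final class IptcTagTypeSketch {
    static final int IPTC_NAA_TAG = 33723;

    // Field type codes as defined by the TIFF 6.0 specification.
    static final int BYTE = 1;
    static final int ASCII = 2;
    static final int SHORT = 3;
    static final int LONG = 4;
    static final int UNDEFINED = 7;

    /** True if the field type is one of the four accepted for IPTC-NAA. */
    static boolean isAcceptedType(int type) {
        return type == BYTE || type == ASCII || type == LONG || type == UNDEFINED;
    }

    public static void main(String[] args) {
        System.out.println(isAcceptedType(UNDEFINED)); // accepted since this release
        System.out.println(isAcceptedType(SHORT));     // still rejected
    }
}
```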

XML HANDLER

  1. Omissions in MIX 1.0 and 2.0 output have been fixed.

JHOVE 1.2

25 May 17:21

GENERAL

  1. A bug has been fixed in CountedInputStream, which could potentially
    have caused infinite recursion in some modules.

HTML MODULE

  1. An incompatibility with Java 1.6 has been fixed.

PDF MODULE

  1. A null pointer exception would be thrown for PDF documents without a
    document root tree. This has been fixed.
  2. A source of possible false positives in PDF profiles has been fixed.
  3. Certain checks weren't being done to Type 2 fonts, and some PDF/A
    profile violations might have been missed as a result. This has
    been fixed.

WAVE MODULE

  1. Sub-chunks of the 'adtl' chunk are now constrained to even byte
    boundaries.
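
For readers unfamiliar with RIFF padding, the sketch below shows one way a reader might step from one sub-chunk to the next. It assumes the usual RIFF layout (a 4-byte ID, a 4-byte size, then the data, followed by one pad byte when the size is odd). The class and method names are hypothetical and this is not the WAVE module's code.

```java
final class RiffPaddingSketch {
    // 4-byte chunk ID plus 4-byte size field.
    static final int HEADER_BYTES = 8;

    /** Bytes occupied by the chunk data, including the pad byte if the size is odd. */
    static long paddedDataSize(long declaredSize) {
        return declaredSize + (declaredSize & 1L);
    }

    /** Offset of the chunk that follows the one starting at chunkOffset. */
    static long nextChunkOffset(long chunkOffset, long declaredSize) {
        return chunkOffset + HEADER_BYTES + paddedDataSize(declaredSize);
    }

    public static void main(String[] args) {
        // A 5-byte sub-chunk at offset 12 occupies 8 + 6 bytes.
        System.out.println(nextChunkOffset(12, 5)); // 26
    }
}
```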

XML HANDLER

  1. MIX 2.0 is now supported.
  2. The URL for the MIX 0.2 schema has changed to reflect the change
    on the LOC MIX site.
  3. The handler was sometimes incorrectly reporting whether the
    AESAudioMetadata property had an empty value or not. This has
    been fixed.

JHOVE 1.1

25 May 17:22

COMMAND-LINE INTERFACE

  1. Allow filenames with internal spaces if they are quoted on the
    command line.
  2. Corrected error setting the Classpath in the Windows Shell script
    (jhove.bat).
  3. Corrected error opening the configuration file using the default
    GCJ parser in the GNU Java Runtime Environment.

GUI (SWING) INTERFACE (JHOVE VIEW)

  1. AES metadata properties displayed in the RepInfo window rearranged
    slightly to make their ordering consistent with the Text and XML
    handlers.

  2. The JhoveView.main() method will now accept a "-c configFile" option
    on the command line. The GUI interface can now be invoked by:

      java -jar bin/JhoveView.jar -c configFile
    
  3. Corrected error opening the configuration file using the default
    GCJ parser in the GNU Java Runtime Environment.

  4. Correct recurrent problems with reading the configuration file on
    Windows installations.

AIFF MODULE

  1. Correct value for first sample offset by including the non-zero offset
    defined in the SSND chunk.
  2. Do not report bitrate reduction data for PCM data.
  3. All non-final instance fields and methods are protected, rather than
    private.

ASCII MODULE

  1. A minimal file containing no line-end characters no longer
    produces an empty ASCIIMetadata property, which is invalid against
    the JHOVE schema.
  2. Zero-length files are considered not well-formed.
  3. Issue informative message if file contains no printable characters.
  4. All non-final instance fields and methods are protected, rather than
    private.

BYTESTREAM MODULE

  1. All non-final instance fields and methods are protected, rather than
    private.

GIF MODULE

  1. All non-final instance fields and methods are protected, rather than
    private.

HTML MODULE

  1. The HTMLMetadata block in the module output is only produced if
    there is at least one actual metadata property to report.
  2. All non-final instance fields and methods are protected, rather than
    private.

JPEG MODULE

  1. The JPEG module reports the X and Y sampling frequency for files
    meeting the JFIF profile.
  2. The JPEG module reports the pixel aspect ratio for JFIF profile
    files for which it is defined.
  3. File handles were not being properly closed when processing embedded
    EXIF metadata. In cases where JHOVE was invoked against large
    numbers of objects this was causing a premature crash due to the
    resource leak.
  4. All non-final instance fields and methods are protected, rather than
    private.
  5. Correct parsing of the EXIF "subsecTimeOriginal" (37521) and
    "subsecTimeDigitized" (37522) properties.
  6. Validation errors in embedded EXIF metadata were not being fully
    reported.

JPEG 2000 MODULE

  1. All non-final instance fields and methods are protected, rather than
    private.
  2. Files generated by the LuraWave codec are no longer incorrectly identified
    as having unrecognized QCC marker segments.

PDF MODULE

  1. Date strings are now parsed with strict conformance to the ASN.1
    syntax.
  2. Destinations defined by indirect references to non-existent objects
    are assumed to have the value "null". Files containing such
    destinations are reported as "well-formed, but not valid".
  3. No attempt is made to display encrypted outline item title strings.
  4. Catch error if the Info key of the trailer dictionary is not an
    indirect reference.
  5. Read entire page tree structure, regardless of its internal
    organization. Failure to do so may previously have caused the
    under-reporting of page resources, such as fonts and images.
  6. The NISO Compression Scheme for all images using the CCITTFaxDecode
    compression filter is now reported properly; previously, the scheme
    was always reported as CCITT 1D even if the actual compression
    algorithm was CCITT Group 3 or 4.
  7. Properly parse UTF-16 escape characters encoded in double-byte form.
  8. The module properly stops looking for the header comment after 1024
    bytes.
  9. All non-final instance fields and methods are protected, rather than
    private.
  10. The number of incremental updates is now reported correctly, rather than
    the total number of file trailers, which is one greater than the number
    of updates.

  11. Only up to 1000 fonts will be reported. After that, an informative
    message will be generated. The limit can be set using the parameter
    "nxxxx" in the module-specific section of the configuration file:

      <module>
        <class>edu.harvard.hul.ois.jhove.module.PdfModule</class>
        <param>n2000</param>
      </module>

  12. Subfonts of Type 0 are now being properly reported.

  13. PDF/A-1b profile is now being properly reported.

  14. Permit trailer info key to be optional.

  15. Additional correction for outline recursion.

  16. Fix treatment of indirect object of Actions.

  17. Correctly handle trailer dictionary without Info entry.

  18. Ignore comments within dictionaries.
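
The font-limit parameter described above has the form "n" followed by a number, as in "n2000". The sketch below shows one way such a value could be interpreted. The class and method names are hypothetical, and falling back to 1000 for a missing or malformed value is an assumption, not documented JHOVE behavior.

```java
final class FontLimitParamSketch {
    // Default limit stated in the release note.
    static final int DEFAULT_MAX_FONTS = 1000;

    /** Parses a parameter such as "n2000"; falls back to the default (assumed behavior). */
    static int parseMaxFonts(String param) {
        if (param == null || param.length() < 2 || param.charAt(0) != 'n') {
            return DEFAULT_MAX_FONTS;
        }
        try {
            int n = Integer.parseInt(param.substring(1));
            return (n > 0) ? n : DEFAULT_MAX_FONTS;
        } catch (NumberFormatException e) {
            return DEFAULT_MAX_FONTS;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseMaxFonts("n2000")); // 2000
        System.out.println(parseMaxFonts(null));    // 1000
    }
}
```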

TIFF MODULE

  1. Corrected error parsing pyramidal TIFF using the SubIFDs tag with a
    type of IFD (13) rather than LONG (4).

  2. Correct parsing of the EXIF "subsecTimeOriginal" (37521) and
    "subsecTimeDigitized" (37522) properties.

  3. All sub-IFDs of a pyramidal TIFF are now properly parsed.

  4. The EXIF GainControl tag (41991) is now correctly identified as
    a SHORT, not a RATIONAL, value.

  5. Corrected error in which valid files were reported as being only
    well-formed due to an incorrect parsing of the DateTime (306) tag.

  6. Byte-aligned offsets can be considered well-formed if the module
    parameter "byteoffset=true" is set in the configuration file:

      <module>
        <class>edu.harvard.hul.ois.jhove.module.TiffModule</class>
        <param>byteoffset=true</param>
      </module>
    
  7. All non-final instance fields and methods are protected, rather than
    private.

  8. Correct parsing of the EXIF "subsecTimeOriginal" (37521) and
    "subsecTimeDigitized" (37522) properties.

  9. Using the "-s" option, the TIFF module was incorrectly reporting
    signature matches for text files starting with "II".

  10. Validation errors in embedded EXIF metadata were not being fully
    reported.

UTF8 MODULE

  1. Corrected error under which malformed UTF-8 files containing encoding
    sequences starting with a byte value in the range 0xF8 through 0xFF
    were reported as well-formed and valid.
  2. Zero-length files are considered not well-formed.
  3. Issue informative message if file contains no printable characters.
  4. All non-final instance fields and methods are protected, rather than
    private.
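
The first fix in this list concerns bytes that can never begin a UTF-8 sequence. A minimal sketch of lead-byte classification follows. It covers only the structural bit patterns, not the stricter overlong and range checks of RFC 3629, and the class and method names are hypothetical.

```java
final class Utf8LeadByteSketch {
    /** Returns the sequence length implied by a lead byte, or -1 if it cannot start one. */
    static int sequenceLength(int b) {
        b &= 0xFF;                  // treat a signed Java byte as unsigned
        if (b < 0x80) return 1;     // 0xxxxxxx: single-byte (ASCII)
        if (b < 0xC0) return -1;    // 10xxxxxx: continuation byte, not a lead byte
        if (b < 0xE0) return 2;     // 110xxxxx
        if (b < 0xF0) return 3;     // 1110xxxx
        if (b < 0xF8) return 4;     // 11110xxx
        return -1;                  // 0xF8 through 0xFF: never valid
    }

    public static void main(String[] args) {
        System.out.println(sequenceLength(0xF8)); // -1
        System.out.println(sequenceLength(0xE2)); // 3
    }
}
```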

WAVE MODULE

  1. BWF files now set the correct start time in the AES metadata.
  2. All non-final instance fields and methods are protected, rather than
    private.
  3. "cue " and "adtl" chunks are now properly read.

XML MODULE

  1. The DTD is assumed to be the first DOCTYPE system ID in the file with a
    ".dtd" extension.
  2. All non-final instance fields and methods are protected, rather than
    private.
  3. The module correctly handles schemaLocation attributes that do not
    provide two whitespace-separated URIs.

TEXT HANDLER

  1. AES audio metadata properties rearranged slightly to make their
    ordering consistent with the XML schema.

XML HANDLER

  1. Correct sample rate formatting in AES Time Code Format (TCF)
    temporal references.

  2. Correct face IDREF in AES metadata.

  3. Disallowed control characters are removed from content.

  4. Null property values no longer generate empty elements.

  5. Image technical metadata can be reported in terms of the MIX 1.0 schema,
    as opposed to the default reporting against MIX 0.2. To specify the
    1.0 schema include the directive:

      <mixVersion>1.0</mixVersion>
    

    in the configuration file.
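
The removal of disallowed control characters mentioned above can be sketched as a filter over code points. The allowed ranges are those of the Char production in the XML 1.0 specification. The class and method names are hypothetical, and the handler's real implementation may differ.

```java
final class XmlCharFilterSketch {
    /** True if the code point is permitted by the XML 1.0 Char production. */
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns the content with every disallowed code point dropped. */
    static String stripDisallowed(String content) {
        StringBuilder sb = new StringBuilder(content.length());
        for (int i = 0; i < content.length(); ) {
            int cp = content.codePointAt(i);
            if (isXmlChar(cp)) {
                sb.appendCodePoint(cp);
            }
            i += Character.charCount(cp); // step over surrogate pairs as one unit
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String dirty = "a" + (char) 0 + "b\tc" + (char) 7;
        System.out.println(stripDisallowed(dirty).length()); // 4: 'a', 'b', tab, 'c'
    }
}
```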

JHOVE API

  1. The process() and processFile() methods of the JhoveBase class are now
    public, to permit direct access to the API by applications.
  2. Checksum calculations now use buffered I/O uniformly for improved
    performance.
  3. All non-final fields and methods in the JhoveBase class are
    protected, rather than private.
  4. When invoked with the "-s" option JHOVE now reports the signature
    matched format and MIME type.
  5. The processing of files in a directory is now performed in an
    alphabetically sorted order.

ADUMP UTILITY

  1. Display the field values of known chunks.

TDUMP UTILITY

  1. New format that sorts all tag definitions by their byte offset and
    also displays the byte ranges for image data.
  2. Command line flags permit the suppression of BYTE data display (-b)
    and subIFD parsing (-s).

USERHOME UTILITY

  1. A new utility program, UserHome, is available to determine the value
    of the Java user.home property needed to know where to place the
    configuration file. This utility can be invoked by the driver scripts
    "userhome" (Bourne shell) or "userhome.bat" (Windows).