Variable-length integer used in Apple `'dcmp' (0)` and `'dcmp' (1)` compressed resource formats: C++98/STL parsing library

A variable-length integer, in the format used by the 0xfe chunks in the 'dcmp' (0) and 'dcmp' (1) resource compression formats. See the dcmp_0 and dcmp_1 specs for more information about these compression formats.

This variable-length integer format can store an integer x in any of the following ways:

  • In a single byte, if 0 <= x <= 0x7f (7-bit unsigned integer)
  • In 2 bytes, if -0x4000 <= x <= 0x3eff (15-bit signed integer with the highest 0x100 values unavailable)
  • In 5 bytes, if -0x80000000 <= x <= 0x7fffffff (32-bit signed integer)

In practice, values are always stored in the smallest possible format, but technically any of the larger formats could be used as well.
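To illustrate the "smallest possible format" rule, here is a minimal encoder sketch. It is not part of the Kaitai-generated parser (which only reads data); the function name `encode_dcmp_varint` is hypothetical, and the byte layout is inferred from the ranges listed above and from the decoding arithmetic in the generated `value()` method.

```cpp
#include <stdint.h>
#include <vector>

// Hypothetical helper, not part of the generated code: encodes x using the
// smallest of the three storage formats.
std::vector<uint8_t> encode_dcmp_varint(int32_t x) {
    std::vector<uint8_t> out;
    if (x >= 0 && x <= 0x7f) {
        // 1-byte format: the byte is the value itself.
        out.push_back(static_cast<uint8_t>(x));
    } else if (x >= -0x4000 && x <= 0x3eff) {
        // 2-byte format: big-endian (x + 0xc000), which falls in 0x8000..0xfeff,
        // so the first byte is in 0x80..0xfe and never collides with 0xff.
        uint32_t biased = static_cast<uint32_t>(x + 0xc000);
        out.push_back(static_cast<uint8_t>(biased >> 8));
        out.push_back(static_cast<uint8_t>(biased & 0xff));
    } else {
        // 5-byte format: 0xff marker, then the value as a big-endian
        // 32-bit two's complement integer.
        uint32_t bits = static_cast<uint32_t>(x);
        out.push_back(0xff);
        out.push_back(static_cast<uint8_t>(bits >> 24));
        out.push_back(static_cast<uint8_t>((bits >> 16) & 0xff));
        out.push_back(static_cast<uint8_t>((bits >> 8) & 0xff));
        out.push_back(static_cast<uint8_t>(bits & 0xff));
    }
    return out;
}
```

Under these assumptions, 0x80 is the smallest non-negative value that needs the 2-byte format (bytes `c0 80`), and 0x3f00 is the smallest that needs the 5-byte format (bytes `ff 00 00 3f 00`).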

Application

Mac OS

KS implementation details

License: MIT
Minimal Kaitai Struct required: 0.8

This page hosts a formal specification of the variable-length integer format used in Apple `'dcmp' (0)` and `'dcmp' (1)` compressed resources, written using Kaitai Struct. This specification can be automatically translated into a variety of programming languages to get a parsing library.

Usage

Runtime library

All parsing code for C++98/STL generated by Kaitai Struct depends on the C++/STL runtime library. You have to install it before you can parse data.

For C++, the easiest way is to clone the runtime library sources and build them along with your project.

Code

Using Kaitai Struct in C++/STL usually consists of 3 steps.

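  1. We need to create an STL input stream (std::istream). One can open a local file for that, or use an existing std::string or char* buffer. The three snippets below are alternatives; pick one.
    // Option A: from a local file
    #include <fstream>
    
    std::ifstream is("path/to/local/file.bin", std::ifstream::binary);
    
    // Option B: from an existing std::string named str
    #include <sstream>
    
    std::istringstream is(str);
    
    // Option C: from a char buffer (the explicit length keeps embedded NUL bytes)
    #include <sstream>
    
    const char buf[] = { ... };
    std::string str(buf, sizeof buf);
    std::istringstream is(str);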
    
  2. We need to wrap the input stream in a Kaitai stream:
    #include "kaitai/kaitaistream.h"
    
    kaitai::kstream ks(&is);
    
  3. And finally, we can invoke the parsing:
    dcmp_variable_length_integer_t data(&ks);
    

After that, one can get various attributes from the structure by invoking getter methods like:

data.first() // => The first byte of the variable-length integer.
This determines which storage format is used.

* For the 1-byte format,
  this encodes the entire value.
* For the 2-byte format,
  this encodes the high 7 bits of the value,
  minus `0xc0`.
  The highest bit of the value,
  i.e. the second-highest bit of this field,
  is the sign bit.
* For the 5-byte format,
  this is always `0xff`.

data.value() // => The decoded value of the variable-length integer.
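
To make the relationship between `first()`, `more()` and `value()` concrete, here is a minimal standalone sketch that decodes the same format from a raw byte buffer without the Kaitai runtime. The function name `decode_dcmp_varint` and its return convention are hypothetical; the arithmetic follows the generated `value()` method listed further down this page.

```cpp
#include <stddef.h>
#include <stdint.h>

// Hypothetical standalone decoder, not part of the generated code.
// Returns the number of bytes consumed, or 0 if buf is too short.
size_t decode_dcmp_varint(const uint8_t* buf, size_t len, int32_t* out) {
    if (len < 1)
        return 0;
    uint8_t first = buf[0];
    if (first < 0x80) {
        // 1-byte format: the first byte is the value.
        *out = first;
        return 1;
    }
    if (first == 0xff) {
        // 5-byte format: the next four bytes are a big-endian signed 32-bit value.
        if (len < 5)
            return 0;
        uint32_t bits = (static_cast<uint32_t>(buf[1]) << 24)
                      | (static_cast<uint32_t>(buf[2]) << 16)
                      | (static_cast<uint32_t>(buf[3]) << 8)
                      | static_cast<uint32_t>(buf[4]);
        // Conversion of values above INT32_MAX assumes a two's complement target.
        *out = static_cast<int32_t>(bits);
        return 5;
    }
    // 2-byte format: combine both bytes big-endian, then remove the 0xc000 bias.
    if (len < 2)
        return 0;
    *out = ((static_cast<int32_t>(first) << 8) | buf[1]) - 0xc000;
    return 2;
}
```

For example, the bytes `80 00` decode to -0x4000 and `fe ff` decode to 0x3eff, the two ends of the 2-byte range.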

C++98/STL source code to parse Variable-length integer used in Apple `'dcmp' (0)` and `'dcmp' (1)` compressed resource formats

dcmp_variable_length_integer.h

#ifndef DCMP_VARIABLE_LENGTH_INTEGER_H_
#define DCMP_VARIABLE_LENGTH_INTEGER_H_

// This is a generated file! Please edit source .ksy file and use kaitai-struct-compiler to rebuild

#include "kaitai/kaitaistruct.h"
#include <stdint.h>

#if KAITAI_STRUCT_VERSION < 9000L
#error "Incompatible Kaitai Struct C++/STL API: version 0.9 or later is required"
#endif

/**
 * A variable-length integer,
 * in the format used by the 0xfe chunks in the `'dcmp' (0)` and `'dcmp' (1)` resource compression formats.
 * See the dcmp_0 and dcmp_1 specs for more information about these compression formats.
 * 
 * This variable-length integer format can store an integer `x` in any of the following ways:
 * 
 * * In a single byte,
 *   if `0 <= x <= 0x7f`
 *   (7-bit unsigned integer)
 * * In 2 bytes,
 *   if `-0x4000 <= x <= 0x3eff`
 *   (15-bit signed integer with the highest `0x100` values unavailable)
 * * In 5 bytes, if `-0x80000000 <= x <= 0x7fffffff`
 *   (32-bit signed integer)
 * 
 * In practice,
 * values are always stored in the smallest possible format,
 * but technically any of the larger formats could be used as well.
 * \sa https://github.com/dgelessus/python-rsrcfork/blob/f891a6e/src/rsrcfork/compress/common.py Source
 */

class dcmp_variable_length_integer_t : public kaitai::kstruct {

public:

    dcmp_variable_length_integer_t(kaitai::kstream* p__io, kaitai::kstruct* p__parent = 0, dcmp_variable_length_integer_t* p__root = 0);

private:
    void _read();
    void _clean_up();

public:
    ~dcmp_variable_length_integer_t();

private:
    bool f_value;
    int32_t m_value;

public:

    /**
     * The decoded value of the variable-length integer.
     */
    int32_t value();

private:
    uint8_t m_first;
    int32_t m_more;
    bool n_more;

public:
    bool _is_null_more() { more(); return n_more; };

private:
    dcmp_variable_length_integer_t* m__root;
    kaitai::kstruct* m__parent;

public:

    /**
     * The first byte of the variable-length integer.
     * This determines which storage format is used.
     * 
     * * For the 1-byte format,
     *   this encodes the entire value.
     * * For the 2-byte format,
     *   this encodes the high 7 bits of the value,
     *   minus `0xc0`.
     *   The highest bit of the value,
     *   i.e. the second-highest bit of this field,
     *   is the sign bit.
     * * For the 5-byte format,
     *   this is always `0xff`.
     */
    uint8_t first() const { return m_first; }

    /**
     * The remaining bytes of the variable-length integer.
     * 
     * * For the 1-byte format,
     *   this is not present.
     * * For the 2-byte format,
     *   this encodes the low 8 bits of the value.
     * * For the 5-byte format,
     *   this encodes the entire value.
     */
    int32_t more() const { return m_more; }
    dcmp_variable_length_integer_t* _root() const { return m__root; }
    kaitai::kstruct* _parent() const { return m__parent; }
};

#endif  // DCMP_VARIABLE_LENGTH_INTEGER_H_

dcmp_variable_length_integer.cpp

// This is a generated file! Please edit source .ksy file and use kaitai-struct-compiler to rebuild

#include "dcmp_variable_length_integer.h"

dcmp_variable_length_integer_t::dcmp_variable_length_integer_t(kaitai::kstream* p__io, kaitai::kstruct* p__parent, dcmp_variable_length_integer_t* p__root) : kaitai::kstruct(p__io) {
    m__parent = p__parent;
    m__root = this;
    f_value = false;

    try {
        _read();
    } catch(...) {
        _clean_up();
        throw;
    }
}

void dcmp_variable_length_integer_t::_read() {
    m_first = m__io->read_u1();
    n_more = true;
    if (first() >= 128) {
        n_more = false;
        switch (first()) {
        case 255: {
            m_more = m__io->read_s4be();
            break;
        }
        default: {
            m_more = m__io->read_u1();
            break;
        }
        }
    }
}

dcmp_variable_length_integer_t::~dcmp_variable_length_integer_t() {
    _clean_up();
}

void dcmp_variable_length_integer_t::_clean_up() {
    if (!n_more) {
    }
}

int32_t dcmp_variable_length_integer_t::value() {
    if (f_value)
        return m_value;
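    // first == 255: 5-byte format, the value is `more` (signed 32-bit).
    // first >= 128: 2-byte format, the value is (first << 8 | more) - 0xc000 (49152).
    // otherwise:    1-byte format, the value is `first` itself.
    m_value = ((first() == 255) ? (more()) : (((first() >= 128) ? ((((first() << 8) | more()) - 49152)) : (first()))));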
    f_value = true;
    return m_value;
}