.roh file format: Python parsing library

Avantes USB spectrometers are supplied with a Windows program that generates one ROH and one RCM file when the user clicks "Save experiment". In version 6.0 of the software, the ROH file contains a header of 22 four-byte floats, followed by the spectrum as a float array and a footer of 3 floats. The first and last pixel numbers are stored in the header and determine the length of the spectral data, which is (last - first - 1) points. In the tested files this gives (2032-211-1)=1820 pixels, but Kaitai determines the length automatically anyway.
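
The layout described above can be sanity-checked with the standard library alone. The following is a minimal sketch, not part of the Kaitai specification: the helper names (`read_pixel_range`, `spectrum_length`, `expected_size`) are hypothetical, and the offsets follow the field order of the generated parser further down this page, which reads 21 floats ahead of the spectrum (1 + 5 + 9 + 2 + 4) rather than the 22 stated in the description.

```python
import struct

# Assumption: these counts mirror the field order of the generated parser
# (unknown1, five wavelength coefficients, nine unknowns, ipixfirst,
# ipixlast, four unknowns), i.e. 21 floats before the spectrum.
HEADER_FLOATS = 21
FOOTER_FLOATS = 3
PIX_FIRST_INDEX = 15  # 0-based position of ipixfirst among the header floats


def read_pixel_range(raw):
    """Return (ipixfirst, ipixlast) as integers from the raw file bytes."""
    ipixfirst, ipixlast = struct.unpack_from("<2f", raw, 4 * PIX_FIRST_INDEX)
    return int(ipixfirst), int(ipixlast)


def spectrum_length(ipixfirst, ipixlast):
    """Number of spectral points, using the same rule as the parser."""
    return ipixlast - ipixfirst - 1


def expected_size(ipixfirst, ipixlast):
    """File size in bytes implied by the pixel range (4 bytes per float)."""
    n = spectrum_length(ipixfirst, ipixlast)
    return 4 * (HEADER_FLOATS + n + FOOTER_FLOATS)
```

Comparing `expected_size(...)` with the actual file size is one way to notice a file written by a different software version before trying to parse it.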

The wavelength calibration is stored as a polynomial with coefficients 'wlintercept', 'wlx1', ... 'wlx4', whose argument is (pixel number + 1), as established by comparison with data files converted by the original Avantes software. No intensity calibration is saved, but performing one in your own program is recommended: the CCD in the spectrometer is so uneven that exact pixel-to-pixel calibration curves are needed to get reasonable spectral results.
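
The polynomial described above could be evaluated roughly as follows. This is a sketch under stated assumptions, not the author's conversion code: the function name `wavelength_axis` is hypothetical, and it assumes that the first stored spectrum sample belongs to pixel number `ipixfirst`, which the description does not state explicitly and should be verified against a file converted by the Avantes software.

```python
def wavelength_axis(wlintercept, wlx1, wlx2, wlx3, wlx4, ipixfirst, npoints):
    """Evaluate the calibration polynomial for each spectrum sample.

    Assumption: sample i corresponds to pixel number ipixfirst + i.
    The polynomial argument is (pixel number + 1), as the description says.
    """
    coeffs = (wlintercept, wlx1, wlx2, wlx3, wlx4)
    axis = []
    for i in range(npoints):
        x = int(ipixfirst) + i + 1
        # wlintercept + wlx1*x + wlx2*x**2 + wlx3*x**3 + wlx4*x**4
        axis.append(sum(c * x ** k for k, c in enumerate(coeffs)))
    return axis
```

With a parsed object, a call might look like `wavelength_axis(data.wlintercept, data.wlx1, data.wlx2, data.wlx3, data.wlx4, data.ipixfirst, len(data.spectrum))`.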

The meaning of the remaining header floats is not known to the author. Note that newer versions of the Avantes software use a different format; see also https://kr.mathworks.com/examples/matlab/community/20341-reading-spectra-from-avantes-binary-files-demonstration

The RCM file contains the user-specified comment, so it may be useful for automatic conversion of data.

Written and tested by Filip Dominec, 2017

File extension

roh

KS implementation details

License: CC0-1.0

This page hosts a formal specification of the .roh file format using Kaitai Struct. This specification can be automatically translated into a variety of programming languages to get a parsing library.

Usage

Parse a local file and get the structure in memory:

from avantes_roh60 import AvantesRoh60
data = AvantesRoh60.from_file("path/to/local/file.roh")

Or parse the structure from a bytes object:

from kaitaistruct import KaitaiStream, BytesIO

raw = b"\x00\x01\x02..."
data = AvantesRoh60(KaitaiStream(BytesIO(raw)))

After that, the parsed fields are available as plain attributes of the structure, for example:

data.unknown1 # => get unknown1
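
The named fields can be read the same way. As a small illustration, the sketch below collects a few of them into a summary and checks that the spectrum length matches the pixel range from the header. The helper `summarize` is hypothetical, and the `SimpleNamespace` object is only a stand-in with made-up values so that the example runs without a real .roh file; in practice one would pass the parsed `data` object.

```python
from types import SimpleNamespace


def summarize(data):
    """Collect basic facts about a parsed ROH structure."""
    first = int(data.ipixfirst)
    last = int(data.ipixlast)
    npoints = len(data.spectrum)
    return {
        "ipixfirst": first,
        "ipixlast": last,
        "npoints": npoints,
        # The parser reads (last - first - 1) spectrum values.
        "length_consistent": npoints == last - first - 1,
        "min": min(data.spectrum),
        "max": max(data.spectrum),
    }


# Stand-in object with invented values, used instead of a real parsed file.
fake = SimpleNamespace(ipixfirst=211.0, ipixlast=215.0, spectrum=[1.0, 3.0, 2.0])
summary = summarize(fake)
```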

Python source code to parse .roh file format

avantes_roh60.py

# This is a generated file! Please edit source .ksy file and use kaitai-struct-compiler to rebuild

from pkg_resources import parse_version
from kaitaistruct import __version__ as ks_version, KaitaiStruct, KaitaiStream, BytesIO


if parse_version(ks_version) < parse_version('0.7'):
    raise Exception("Incompatible Kaitai Struct Python API: 0.7 or later is required, but you have %s" % (ks_version))

class AvantesRoh60(KaitaiStruct):
    """Avantes USB spectrometers are supplied with a Windows binary which 
    generates one ROH and one RCM file when the user clicks "Save
    experiment". In the version of 6.0, the ROH file contains a header 
    of 22 four-byte floats, then the spectrum as a float array and a 
    footer of 3 floats. The first and last pixel numbers are specified in the 
    header and determine the (length+1) of the spectral data. In the tested 
    files, the length is (2032-211-1)=1820 pixels, but Kaitai determines this 
    automatically anyway.
    
    The wavelength calibration is stored as a polynomial with coefficients
    of 'wlintercept', 'wlx1', ... 'wlx4', the argument of which is the
    (pixel number + 1), as found out by comparing with the original 
    Avantes converted data files. There is no intensity calibration saved,
    but it is recommended to do it in your program - the CCD in the spectrometer 
    is so uneven that one should prepare exact pixel-to-pixel calibration curves 
    to get reasonable spectral results.
    
    The rest of the header floats is not known to the author. Note that the 
    newer version of Avantes software has a different format, see also
    https://kr.mathworks.com/examples/matlab/community/20341-reading-spectra-from-avantes-binary-files-demonstration
    
    The RCM file contains the user-specified comment, so it may be useful
    for automatic conversion of data.
    
    Written and tested by Filip Dominec, 2017
    """
    def __init__(self, _io, _parent=None, _root=None):
        self._io = _io
        self._parent = _parent
        self._root = _root if _root else self
        self._read()

    def _read(self):
        self.unknown1 = self._io.read_f4le()
        self.wlintercept = self._io.read_f4le()
        self.wlx1 = self._io.read_f4le()
        self.wlx2 = self._io.read_f4le()
        self.wlx3 = self._io.read_f4le()
        self.wlx4 = self._io.read_f4le()
        self.unknown2 = [None] * (9)
        for i in range(9):
            self.unknown2[i] = self._io.read_f4le()

        self.ipixfirst = self._io.read_f4le()
        self.ipixlast = self._io.read_f4le()
        self.unknown3 = [None] * (4)
        for i in range(4):
            self.unknown3[i] = self._io.read_f4le()

        self.spectrum = [None] * (((int(self.ipixlast) - int(self.ipixfirst)) - 1))
        for i in range(((int(self.ipixlast) - int(self.ipixfirst)) - 1)):
            self.spectrum[i] = self._io.read_f4le()

        self.unknown4 = [None] * (3)
        for i in range(3):
            self.unknown4[i] = self._io.read_f4le()