Variable length quantity, unsigned integer, base128, little-endian: GraphViz block diagram (.dot) source

A variable-length unsigned integer using base128 encoding. Each 1-byte group consists of a 1-bit continuation flag and a 7-bit value, and the groups are ordered "least significant group first", i.e. in a "little-endian" manner.
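As a rough illustration of the layout described above, here is a minimal Python sketch of a round trip. The function names `encode_uleb128` and `decode_uleb128` are hypothetical and are not part of the Kaitai Struct specification; the sketch assumes the continuation flag is the top bit (0x80) of each byte and places no limit on the number of groups.

```python
from typing import Tuple


def encode_uleb128(n: int) -> bytes:
    """Encode a non-negative integer as 7-bit groups, least significant first."""
    if n < 0:
        raise ValueError("only unsigned values are supported")
    out = bytearray()
    while True:
        group = n & 0x7F              # low 7 bits form the next group
        n >>= 7
        if n:
            out.append(group | 0x80)  # more groups follow: set continuation flag
        else:
            out.append(group)         # last group: flag stays clear
            return bytes(out)


def decode_uleb128(data: bytes) -> Tuple[int, int]:
    """Decode one value from the start of data; return (value, bytes consumed)."""
    value = 0
    for i, b in enumerate(data):
        value |= (b & 0x7F) << (7 * i)   # group i contributes bits 7*i .. 7*i+6
        if not (b & 0x80):               # continuation flag clear: done
            return value, i + 1
    raise ValueError("input ended while the continuation flag was still set")
```

For example, `encode_uleb128(300)` produces the two bytes `AC 02`: the low seven bits of 300 (0x2C) with the flag set, followed by the remaining bits (0x02).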

This particular encoding is specified and used in:

  • DWARF debug file format, where it's dubbed "unsigned LEB128" or "ULEB128". http://dwarfstd.org/doc/dwarf-2.0.0.pdf - page 139
  • Google Protocol Buffers, where it's called "Base 128 Varints". https://developers.google.com/protocol-buffers/docs/encoding?csw=1#varints
  • Apache Lucene, where it's called "VInt" http://lucene.apache.org/core/3_5_0/fileformats.html#VInt
  • Apache Avro uses this as a basis for integer encoding, adding ZigZag on top of it for signed ints http://avro.apache.org/docs/current/spec.html#binary_encode_primitive
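The ZigZag step mentioned for Avro can be sketched as follows. This is only an illustration of the general ZigZag mapping, assuming a 64-bit signed range; the function names are hypothetical and the code is not taken from the Avro sources. The mapping interleaves negative and non-negative numbers so that values of small magnitude become small unsigned integers, which then encode into few base128 groups.

```python
def zigzag_encode(n: int, bits: int = 64) -> int:
    """Map a signed integer to an unsigned one: 0, -1, 1, -2, ... -> 0, 1, 2, 3, ..."""
    if not -(1 << (bits - 1)) <= n < (1 << (bits - 1)):
        raise ValueError("value does not fit the assumed signed width")
    # n >> (bits - 1) is 0 for non-negative n and -1 for negative n
    # (Python's right shift on ints is arithmetic).
    return (n << 1) ^ (n >> (bits - 1))


def zigzag_decode(z: int) -> int:
    """Inverse of zigzag_encode for non-negative z."""
    return (z >> 1) ^ -(z & 1)
```

The unsigned result is what would then be written as a base128 little-endian quantity.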

More information on this encoding is available at https://en.wikipedia.org/wiki/LEB128

This particular implementation supports serialized values up to 8 bytes long.
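To make the consequence of that limit concrete, the short Python sketch below works out the arithmetic, assuming that "8 bytes" means at most 8 groups of 7 payload bits each. The names `MAX_GROUPS`, `MAX_VALUE`, and `encoded_length` are hypothetical helpers introduced only for this illustration.

```python
MAX_GROUPS = 8        # assumed from the 8-byte limit stated above
BITS_PER_GROUP = 7    # each byte carries 7 value bits plus 1 flag bit

# Largest value representable in 8 groups: 8 * 7 = 56 payload bits.
MAX_VALUE = (1 << (MAX_GROUPS * BITS_PER_GROUP)) - 1


def encoded_length(n: int) -> int:
    """Number of 1-byte groups needed to encode a non-negative integer."""
    if n < 0:
        raise ValueError("only unsigned values are supported")
    # Ceiling division of the bit length by 7; zero still takes one group.
    return max(1, -(-n.bit_length() // BITS_PER_GROUP))


def fits(n: int) -> bool:
    """True if n can be encoded within the assumed 8-group limit."""
    return encoded_length(n) <= MAX_GROUPS
```

Under this assumption the largest representable value is 2^56 - 1, so the full unsigned 64-bit range is not covered.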

KS implementation details

License: CC0-1.0
Minimal Kaitai Struct required: 0.7

This page hosts a formal specification of Variable length quantity, unsigned integer, base128, little-endian using Kaitai Struct. This specification can be automatically translated into a variety of programming languages to get a parsing library.

GraphViz block diagram source

vlq_base128_le.dot

digraph {
	rankdir=LR;
	node [shape=plaintext];
	subgraph cluster__vlq_base128_le {
		label="VlqBase128Le";
		graph[style=dotted];

		vlq_base128_le__seq [label=<<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0">
			<TR><TD BGCOLOR="#E0FFE0">pos</TD><TD BGCOLOR="#E0FFE0">size</TD><TD BGCOLOR="#E0FFE0">type</TD><TD BGCOLOR="#E0FFE0">id</TD></TR>
			<TR><TD PORT="groups_pos">0</TD><TD PORT="groups_size">1</TD><TD>Group</TD><TD PORT="groups_type">groups</TD></TR>
			<TR><TD COLSPAN="4" PORT="groups__repeat">repeat until !(_.has_next)</TD></TR>
		</TABLE>>];
		vlq_base128_le__inst__len [label=<<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0">
			<TR><TD BGCOLOR="#E0FFE0">id</TD><TD BGCOLOR="#E0FFE0">value</TD></TR>
			<TR><TD>len</TD><TD>groups.length</TD></TR>
		</TABLE>>];
		vlq_base128_le__inst__value [label=<<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0">
			<TR><TD BGCOLOR="#E0FFE0">id</TD><TD BGCOLOR="#E0FFE0">value</TD></TR>
			<TR><TD>value</TD><TD>(((((((groups[0].value + (len &gt;= 2 ? (groups[1].value &lt;&lt; 7) : 0)) + (len &gt;= 3 ? (groups[2].value &lt;&lt; 14) : 0)) + (len &gt;= 4 ? (groups[3].value &lt;&lt; 21) : 0)) + (len &gt;= 5 ? (groups[4].value &lt;&lt; 28) : 0)) + (len &gt;= 6 ? (groups[5].value &lt;&lt; 35) : 0)) + (len &gt;= 7 ? (groups[6].value &lt;&lt; 42) : 0)) + (len &gt;= 8 ? (groups[7].value &lt;&lt; 49) : 0))</TD></TR>
		</TABLE>>];
		subgraph cluster__group {
			label="VlqBase128Le::Group";
			graph[style=dotted];

			group__seq [label=<<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0">
				<TR><TD BGCOLOR="#E0FFE0">pos</TD><TD BGCOLOR="#E0FFE0">size</TD><TD BGCOLOR="#E0FFE0">type</TD><TD BGCOLOR="#E0FFE0">id</TD></TR>
				<TR><TD PORT="b_pos">0</TD><TD PORT="b_size">1</TD><TD>u1</TD><TD PORT="b_type">b</TD></TR>
			</TABLE>>];
			group__inst__has_next [label=<<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0">
				<TR><TD BGCOLOR="#E0FFE0">id</TD><TD BGCOLOR="#E0FFE0">value</TD></TR>
				<TR><TD>has_next</TD><TD>(b &amp; 128) != 0</TD></TR>
			</TABLE>>];
			group__inst__value [label=<<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0">
				<TR><TD BGCOLOR="#E0FFE0">id</TD><TD BGCOLOR="#E0FFE0">value</TD></TR>
				<TR><TD>value</TD><TD>(b &amp; 127)</TD></TR>
			</TABLE>>];
		}
	}
	vlq_base128_le__seq:groups_type -> group__seq [style=bold];
	group__inst__has_next:has_next_type -> vlq_base128_le__seq:groups__repeat [color="#404040"];
	vlq_base128_le__seq:groups_type -> vlq_base128_le__inst__len [color="#404040"];
	group__inst__value:value_type -> vlq_base128_le__inst__value [color="#404040"];
	vlq_base128_le__inst__len:len_type -> vlq_base128_le__inst__value [color="#404040"];
	group__seq:b_type -> group__inst__has_next [color="#404040"];
	group__seq:b_type -> group__inst__value [color="#404040"];
}
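
The structure shown in the diagram can be mirrored in a few lines of Python. This is a hand-written sketch for reading the diagram, not the code that Kaitai Struct generates; `read_groups` and `vlq_value` are hypothetical names. It assumes, as the `value` expression in the diagram suggests, that only the first 8 groups contribute to the result.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Group:
    b: int  # the raw u1 byte of the group

    @property
    def has_next(self) -> bool:
        # Mirrors the has_next instance: (b & 128) != 0
        return (self.b & 128) != 0

    @property
    def value(self) -> int:
        # Mirrors the value instance: (b & 127)
        return self.b & 127


def read_groups(data: bytes) -> List[Group]:
    """Read groups until one has has_next == False (the repeat-until rule)."""
    groups: List[Group] = []
    for byte in data:
        group = Group(byte)
        groups.append(group)
        if not group.has_next:
            return groups
    raise ValueError("input ended before a group with has_next == False")


def vlq_value(groups: List[Group]) -> int:
    """Sum groups[i].value << (7 * i) over at most the first 8 groups."""
    return sum(g.value << (7 * i) for i, g in enumerate(groups[:8]))
```

For instance, `vlq_value(read_groups(bytes([0xE5, 0x8E, 0x26])))` reads three groups and combines them into 624485.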