Codecs Part I - VLQ
What is VLQ?
VLQ (Variable-Length Quantity) is a method for encoding integers in a compact, variable-length format. It is widely used in file formats such as MIDI, DWARF debugging information, and Google Protocol Buffers.
How VLQ Works
VLQ encodes integers using 7-bit chunks in each byte, where:
- The first 7 bits store the number's data.
- The Most Significant Bit (MSB) acts as a continuation flag.
- If
MSB = 1
, another byte follows. - If
MSB = 0
, it's the last byte.
VLQ decoder implementation:
VLQ Implementation
There you can find a number of embedded systems programming examples. You may download the repo and
run all the examples. Please take a look at the
readme.
Encoding Example
Let's encode the number 300 using VLQ:
- Convert
300
to binary:100101100
(9 bits). - Split it into 7-bit chunks (right to left):
00101100 00000010
- Set MSB for all but the last byte:
10101100 00000010
- Encoded as two bytes:
0xAC 0x02
Decoding VLQ
To decode, remove the MSBs and concatenate the remaining 7-bit chunks:
// VLQ Decoding Example
Binary representation:
10101100 00000010 // (VLQ encoded value)
// Remove MSBs:
0101100 0000010
// Reassemble:
0000010 0101100 = 100101100 (300)
Use Cases
- MIDI Files: Used to store delta times efficiently.
- DWARF Debugging Format: Found in ELF executables.
- Google Protocol Buffers: Uses a similar technique called varint.
Advantages
- ✅ Saves space for small numbers.
- ✅ Efficient for serialization and data streaming.
- ✅ Easy to parse.
Disadvantages
- ❌ Processing overhead due to bit manipulation.
Conclusion
VLQ is an efficient way to encode integers compactly and is used in various applications. It balances space efficiency and processing overhead, making it a common choice in data serialization and file formats.