The spec references:
[VByte] Hugh E. Williams and Justin Zobel. Compressing integers for fast file access.
The paper http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.18.3782&rep=rep1&type=pdf describes:
- using bits 7-1 as payload and bit 0 (the rightmost bit) to indicate whether more bytes follow
- a flag value of 0 marks the last byte of a multi-byte value
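
To make the comparison concrete, here is a minimal Python sketch of the scheme as summarized in the two bullets above. The function names are hypothetical, and the group order (least-significant 7 bits first) is my assumption; the bullets only fix the position and meaning of the flag bit.

```python
def encode_low_flag(value: int) -> bytes:
    """Encode a non-negative integer: payload in bits 7-1, flag in bit 0.

    Flag 1 means more bytes follow, flag 0 marks the last byte.
    Assumption: 7-bit groups are emitted least-significant first.
    """
    if value < 0:
        raise ValueError("value must be non-negative")
    out = bytearray()
    while True:
        group = value & 0x7F          # next 7 payload bits
        value >>= 7
        more = 1 if value else 0      # 1 while further groups remain
        out.append((group << 1) | more)
        if not more:
            return bytes(out)


def decode_low_flag(data: bytes) -> int:
    """Decode a single integer produced by encode_low_flag."""
    value = 0
    shift = 0
    for byte in data:
        value |= (byte >> 1) << shift  # payload sits in bits 7-1
        shift += 7
        if byte & 1 == 0:              # flag 0: this was the last byte
            return value
    raise ValueError("truncated input: no terminating byte found")
```

Under these assumptions, 128 is encoded as the two bytes 0x01 0x02.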
However, the hdt-java implementation seems to use a different approach and references http://nlp.stanford.edu/IR-book/html/htmledition/variable-byte-codes-1.html:
- using bits 6-0 as payload and bit 7 (the leftmost bit) to indicate whether more bytes follow
- a flag value of 1 marks the last byte of a multi-byte value
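
For comparison, a minimal Python sketch of this second scheme, based only on the two bullets above. The function names are hypothetical, and the group order (least-significant 7 bits first) is an assumption on my side that should be checked against the hdt-java source.

```python
def encode_high_flag(value: int) -> bytes:
    """Encode a non-negative integer: payload in bits 6-0, flag in bit 7.

    Flag 1 marks the last byte, flag 0 means more bytes follow.
    Assumption: 7-bit groups are emitted least-significant first.
    """
    if value < 0:
        raise ValueError("value must be non-negative")
    out = bytearray()
    while value > 0x7F:
        out.append(value & 0x7F)   # continuation byte, bit 7 clear
        value >>= 7
    out.append(value | 0x80)       # final byte, bit 7 set
    return bytes(out)


def decode_high_flag(data: bytes) -> int:
    """Decode a single integer produced by encode_high_flag."""
    value = 0
    shift = 0
    for byte in data:
        value |= (byte & 0x7F) << shift  # payload sits in bits 6-0
        if byte & 0x80:                  # flag 1: this was the last byte
            return value
        shift += 7
    raise ValueError("truncated input: no terminating byte found")
```

Under these assumptions, 128 is encoded as 0x00 0x81, so a reader written for the paper's layout would misparse the stream. That is why it matters which document the spec points to.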
So perhaps the spec should reference the nlp.stanford.edu page instead of the original paper?