Disguising information in plain view is a toy problem I'm regularly drawn to, usually by inventing ridiculous methods to store a key. I've tried many approaches: repurposing the nanoseconds of ext4 timestamps, broadcasting via modified Bluetooth mouse firmware, modifying a home appliance, and most recently modifying Exif metadata.
Rather than adding or modifying metadata or image noise, the metadata can simply
be reordered. Exif tags are encoded lowest-first, but readers do not enforce this.
One Sony A7R photo includes 63 tags, which can be arranged in 63! ways, offering
log2(63!) ≈ 289 bits, more than an AES-256 key, simply by rearranging
existing content.
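The capacity arithmetic is easy to check. The following is a minimal sketch; `capacity_bits` is a name invented here for illustration, and it reports only whole bits, i.e. floor(log2(n!)).

```python
import math

def capacity_bits(n: int) -> int:
    """Whole bits that fit in the ordering of n distinct items."""
    # For x >= 1, x.bit_length() == floor(log2(x)) + 1.
    return math.factorial(n).bit_length() - 1

# Collection sizes mentioned in this article.
for n in (13, 63, 76, 1003):
    print(f"{n} items: {capacity_bits(n)} bits")
```

Exact integer arithmetic is used rather than `math.log2` so that a large n such as 1003 does not overflow a float.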
In practice Exif reordering isn't great, as readers print warnings for misordered tags, and tags are constrained by groupings, but implementing it introduced me to Lehmer code, which offers a way to encode permutations, and revealed its applicability to "free storage" hiding in seemingly unordered collections appearing everywhere.
Lehmer code is straightforward to understand and easy to implement. For each position in a sequence of unique items, count the later items that are smaller than the current item. The Lehmer digits for (A, B, C, D) are (0, 0, 0, 0), while for (D, C, A, B) they are (3, 2, 0, 0). The maximum digit decreases by one at each position, and with it the storage that digit requires.
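The counting rule can be sketched in a few lines. This is a quadratic-time illustration rather than an optimised implementation, and `lehmer_digits` is a name chosen here.

```python
def lehmer_digits(seq):
    """For each position, count the later items smaller than the current one."""
    return [
        sum(later < item for later in seq[i + 1:])
        for i, item in enumerate(seq)
    ]

print(lehmer_digits("ABCD"))  # [0, 0, 0, 0]
print(lehmer_digits("DCAB"))  # [3, 2, 0, 0]
```

The digit at position i can never exceed the number of items after it, which is why each successive digit has a smaller range.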
Here it is on a 13-item sequence, yielding enough digits to encode a 32-bit number:
[Interactive demo: a Key field and the resulting Sequence]
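A rough sketch of how a number might be stored in, and recovered from, the order of 13 items follows. It treats the Lehmer digits as a mixed-radix (factorial base) number; the function names and the letters used as items are illustrative assumptions, not the code behind this page.

```python
def encode(number, items):
    """Arrange items so that their order stores number (0 <= number < n!)."""
    pool = sorted(items)
    digits = []
    # Peel off factorial-base digits, least significant first (radix 1, 2, ... n).
    for radix in range(1, len(pool) + 1):
        number, digit = divmod(number, radix)
        digits.append(digit)
    if number:
        raise ValueError("number too large for this many items")
    digits.reverse()  # digits[i] is now in range(n - i): a valid Lehmer digit
    # Taking the d-th smallest remaining item leaves exactly d smaller items later.
    return [pool.pop(d) for d in digits]

def decode(seq):
    """Recover the stored number from the order of seq."""
    n = len(seq)
    number = 0
    for i, item in enumerate(seq):
        digit = sum(later < item for later in seq[i + 1:])
        number = number * (n - i) + digit
    return number

items = list("ABCDEFGHIJKLM")          # 13 unique items
shuffled = encode(0xDEADBEEF, items)   # any 32-bit value fits, as 13! > 2**32
print("".join(shuffled))
print(hex(decode(shuffled)))
```

Encoding 0 yields the sorted sequence, and the largest value, 13! - 1, yields the fully reversed one.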
Anywhere a unique sequence can be recovered, Lehmer can extract a bitstream. Such a sequence occurs on this page, which contains unique sentences, or bits worth of Lehmer digits:
It's difficult to avoid noticing these sequences after spotting one. In computing, most collections which are logically unordered assume some order during serialisation. Here is a CSS stylesheet:
td, th { text-decoration: underline; color: green; }
p { font-weight: bold; }
a { color: red; }
The order of the selector lists, properties, and rules is mostly ignored during evaluation for non-overlapping rules, yet a file order exists and can be queried via the DOM. Wikipedia's stylesheet has 1,003 rules with 1,950 selectors and 2,731 properties. It is possible to embed over a kilobyte of data just by reordering the rules, and even more by reordering selectors and properties too.
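For the three-rule example, the idea can be sketched by brute force. This assumes the rules do not overlap, so reordering leaves rendering unchanged, and it treats each rule as an opaque string. `hide` and `reveal` are hypothetical names. Enumerating every permutation is only practical for a handful of rules; a real stylesheet would need the digit-by-digit Lehmer approach.

```python
from itertools import permutations

RULES = [
    "td, th { text-decoration: underline; color: green; }",
    "p { font-weight: bold; }",
    "a { color: red; }",
]

def hide(rules, value):
    """Return the rules in the order whose lexicographic rank is value."""
    # permutations() of a sorted input yields tuples in lexicographic order.
    orders = list(permutations(sorted(rules)))
    return list(orders[value])

def reveal(rules):
    """Return the rank of the order in which the rules appear."""
    orders = list(permutations(sorted(rules)))
    return orders.index(tuple(rules))

print("\n".join(hide(RULES, 3)))
print(reveal(RULES))  # 5: the stylesheet as written is in fully reversed order
```

Three rules allow only 3! = 6 orderings, a little over two bits, which is why the rule count of a large stylesheet matters.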
A kilobyte is enough to hide malware behind a tiny and innocuous decoder. Who
said an Internet worm couldn't be written in CSS? Here's a benign demo encoded
in this page's example CSS block:
Lehmer coding seems to have a natural home in file watermarking. Otherwise identical tarballs and ZIP files might have their directories reordered in a manner easy to miss during extraction, and the watermark may even durably transfer to a target filesystem, such as by dictating the order of ext4 birth timestamps.
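As a hedged illustration of the ZIP case, the sketch below writes the same members in an order chosen by a recipient number, then reads that number back from the archive's member order. The file names, the recipient ID, and all function names are made up for the example. It relies on Python's `zipfile` listing members in the order they were written.

```python
import io
import math
import zipfile

def nth_order(names, k):
    """Return the k-th ordering (lexicographic rank k) of names."""
    pool = sorted(names)
    if not 0 <= k < math.factorial(len(pool)):
        raise ValueError("k does not fit in this many names")
    out = []
    while pool:
        index, k = divmod(k, math.factorial(len(pool) - 1))
        out.append(pool.pop(index))
    return out

def order_rank(names):
    """Inverse of nth_order: weight each Lehmer digit by a factorial."""
    rank = 0
    for i, name in enumerate(names):
        digit = sum(other < name for other in names[i + 1:])
        rank += digit * math.factorial(len(names) - 1 - i)
    return rank

def watermarked_zip(files, recipient_id):
    """Build a ZIP (as bytes) whose member order encodes recipient_id."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in nth_order(files, recipient_id):
            zf.writestr(name, files[name])
    return buf.getvalue()

def read_watermark(blob):
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        return order_rank(zf.namelist())

files = {"a.txt": b"1", "b.txt": b"2", "c.txt": b"3", "d.txt": b"4"}
print(read_watermark(watermarked_zip(files, 17)))  # 17
```

Four members give only 24 distinguishable copies; an archive with a few dozen files could identify far more recipients.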
Transferability is possible in other ways. Consider how a browser might behave during printing when rendering HTML or an SVG where absolutely positioned elements have been reordered. At least with Firefox on Linux for SVG, the PDF output follows the input order (first, second).
For program binaries, -ffunction-sections along with a GNU linker
script could reorder functions in a binary, or perhaps transform a table of
static strings used for rendering error messages.
Imagine working with a backend API that appears to generate JSON properties or Protocol Buffer fields in a random order. A first thought might attribute this to a hash table in the implementation, but what if the ordering was instead exfiltrating data? In the case of Protocol Buffers, few engineers would even examine the physical representation.
A spy armed with only a JavaScript interpreter must derive a new mutable DHT key each day to which intelligence is published. Their handler instructs them to use Lehmer on the list of article titles appearing on the BBC News web site at midday, of which there are currently 76 (= 369 bits). The world is glad I don't write spy movies.
A fun application might be to bridge Lehmer to the real world. Take this innocuous mug rack advertised on Amazon:
Ignoring the non-unique nature of these particular mugs, a vision model could identify bounding boxes, and an order could be derived from the average lightness or hue of each box. Such a rack would offer 44 bits, not quite enough for a good password by itself.
This would make a fun mobile app to build. Next time you visit someone's home, be sure to keep an eye out for any extra large racks filled with unique mugs.
David Wilson · dw@hmmz.org · Sun 5 Jul 20:31:09 2026