The Lumber Room

"Consign them to dust and damp by way of preserving them"

Archive for May 31st, 2009

Extreme image compression: the Twitter challenge


If a picture is worth a thousand words, how much of a picture can you fit in 140 characters?

Mario Klingemann (Quasimondo on Flickr) had a fascinating — call it crazy if you like — idea: can you encode an image such that it can be sent as a single Twitter message (“tweet”)? Twitter allows 140 characters, which seems like nothing. It’s pretty much guaranteed that you’ll be able to get nothing meaningful out of so few bits, right?

Well, he came up with this, using a bunch of clever tricks: using the full Unicode range for “characters” (Chinese, etc.) to squeeze a few more bits’ worth, representing colours as blends of just 8 colours (3 bits!), and arriving at a Voronoi triangulation through a genetic algorithm:


“Mona Tweeta” © Quasimondo:Flickr/CC-BY-NC (210 bytes?)
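To get a feel for the Voronoi part, here is a minimal sketch of one way such a picture could be decoded: every pixel simply takes the colour of its nearest point. This is an illustration, not Quasimondo's code. The palette (the 8 corners of the RGB cube), the `render_voronoi` name and the `(x, y, palette_index)` layout are all assumptions, and the blending step is left out.

```python
# Hypothetical 8-colour palette: the corners of the RGB cube, so a colour
# index fits in 3 bits. The palette actually used may well differ.
PALETTE = [(r * 255, g * 255, b * 255)
           for r in (0, 1) for g in (0, 1) for b in (0, 1)]


def render_voronoi(sites, width, height):
    """Colour each pixel with the palette entry of its nearest site.

    `sites` is a list of (x, y, palette_index) tuples (an assumed layout).
    Returns a list of rows, each a list of (r, g, b) tuples.
    """
    image = []
    for y in range(height):
        row = []
        for x in range(width):
            # Nearest site by squared Euclidean distance.
            nearest = min(sites,
                          key=lambda s: (s[0] - x) ** 2 + (s[1] - y) ** 2)
            row.append(PALETTE[nearest[2]])
        image.append(row)
    return image


# Three sites on an 8x8 canvas: black, white and red regions.
sites = [(0, 0, 0), (7, 0, 7), (3, 7, 4)]
image = render_voronoi(sites, 8, 8)
```

A genetic algorithm would then search over the site positions and colour indices for the set whose rendering looks most like the original.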

The one on the right is the real Mona Lisa, and the left one is what fits in 140 characters, specifically the message: “圑嘌婂搒孵怤實恄幖戰怴搝愩娻屗奊唀唭嚟帧啜徠山峔巰喜圂嗊埯廇嗕患嚵幇墥彫壛嶂壋悟声喿墰廚埽崙嫖嘵奰恛嬂啷婕媸姴嚥娐嗪嫤圣峈嬻尤囮愰啴屽嶍屽嶰寂喿嶐唥帑尸庠啞彐啯廂喪帄嗆怠嗙开唅恰唦慼啥憛幮悐喆悠喚忐嗳惐唔戠啹媊婼捐啸抃岖嗅怲幀嗈拀唹坭嵄彠喺悠單囏庰抂唋岰媮岬夣宐彋媀恦啼彐壔姩宔嬀”
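The "210 bytes?" in the caption is consistent with each character carrying 12 bits: 140 × 12 = 1680 bits = 210 bytes. Here is a rough sketch of how bytes could be packed into characters that way. The starting code point `BASE` and the `pack`/`unpack` helpers are my assumptions for illustration, not the encoding actually used for the message above.

```python
BASE = 0x4E00        # assumed start: the CJK Unified Ideographs block
BITS_PER_CHAR = 12   # 2**12 = 4096 distinct code points per character
MASK = (1 << BITS_PER_CHAR) - 1


def pack(data: bytes) -> str:
    """Pack bytes into characters carrying 12 bits each."""
    total_bits = len(data) * 8
    n_chars = -(-total_bits // BITS_PER_CHAR)  # ceiling division
    bits = int.from_bytes(data, "big")
    # Pad with zero bits on the right up to a multiple of 12.
    bits <<= n_chars * BITS_PER_CHAR - total_bits
    chars = []
    for i in range(n_chars):
        shift = (n_chars - 1 - i) * BITS_PER_CHAR
        chars.append(chr(BASE + ((bits >> shift) & MASK)))
    return "".join(chars)


def unpack(text: str, n_bytes: int) -> bytes:
    """Inverse of pack(); n_bytes says how many bytes to recover."""
    bits = 0
    for ch in text:
        bits = (bits << BITS_PER_CHAR) | (ord(ch) - BASE)
    # Drop the padding bits added by pack().
    bits >>= len(text) * BITS_PER_CHAR - n_bytes * 8
    return bits.to_bytes(n_bytes, "big")


payload = bytes(range(210))   # any 210 bytes of image data
tweet = pack(payload)         # exactly 140 characters
```

Plain ASCII at 7 bits per character would only give 140 × 7 = 980 bits, about 122 bytes, which is why the Unicode trick matters.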

This is pretty impressive, you’d think, for 140 characters. But it gets better. Brian Campbell started a contest on Stack Overflow, and some brilliant approaches turned up.

Boojum wrote nanocrunch.cpp, based on fractal compression, which can do this (on the left is the original, for comparison):

Original (left) and the result by Boojum (490 bytes)

Sam Hocevar wrote img2twit, which segments the image into square cells and tries to randomly assign points and colours to them until something is close. It can do this:

Original and decoded image: img2twit by Sam Hocevar (250 bytes?)
You can watch a movie of the image evolving; it’s pretty cool!
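The "randomly assign points and colours until something is close" idea can be sketched as a simple random search. This is not img2twit's actual algorithm (the cell segmentation is left out entirely), and the names `render`, `error` and `fit`, the greyscale target and the parameter values are all made up for illustration.

```python
import random


def render(points, width, height):
    """Each pixel takes the grey value of its nearest (x, y, value) point."""
    return [[min(points, key=lambda p: (p[0] - x) ** 2 + (p[1] - y) ** 2)[2]
             for x in range(width)]
            for y in range(height)]


def error(image, target):
    """Sum of squared pixel differences between two greyscale images."""
    return sum((a - b) ** 2
               for row_a, row_b in zip(image, target)
               for a, b in zip(row_a, row_b))


def fit(target, n_points=4, steps=200, seed=0):
    """Replace one random point at a time; keep changes that do not hurt."""
    rng = random.Random(seed)
    height, width = len(target), len(target[0])

    def random_point():
        return (rng.randrange(width), rng.randrange(height),
                rng.randrange(256))

    points = [random_point() for _ in range(n_points)]
    initial = best = error(render(points, width, height), target)
    for _ in range(steps):
        candidate = list(points)
        candidate[rng.randrange(n_points)] = random_point()
        e = error(render(candidate, width, height), target)
        if e <= best:  # accept only if the picture got no worse
            points, best = candidate, e
    return points, initial, best


# Toy 8x8 target: left half black, right half white.
target = [[0] * 4 + [255] * 4 for _ in range(8)]
points, initial_error, final_error = fit(target)
```

Because a change is only kept when the error does not go up, the picture can only get closer to the target over time, which is presumably what the movie shows.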

There were also attempts at converting the image to a vector format and encoding that instead. Needless to say, it works well for vector-like images:
Original and decoded logo (almost perfect!)

but it’s hard to even convert some images to vector form:
by autotrace (before compression!)

Finally, this is how Dennis Lee’s record-holding “optimizing general-purpose lossy image codec” DLI does:


Comparison: JPEG at 536 bytes, img2twit at 534 bytes, DLI at 534 bytes

Or if you want to be fair and compare at 250 bytes, here’s img2twit and DLI:

img2twit at ~250 bytes, DLI at 243 bytes


Amazing!

For silly amusement, you can read a liberal translation of the original message, or the Reddit thread with ASCII porn.

Disclaimer: I did not participate in any of this, and I know nothing about image compression, so no doubt there are errors in the above. Please point them out. All images are copyright the respective owners, and the quote in the first line is by Brian Campbell on Stack Overflow.


Written by S

Sun, 2009-05-31 at 13:37:44