MP3 for image compression
During a random discussion with a colleague today, I rediscovered an experiment I did quite some years ago: MP3 compression of image data. That is, feed raw 8-bit-per-pixel grayscale image data into an MP3 encoder, let it compress the data, and decode the MP3 file again. After some conversion (the decoder always produces 16-bit signed output, but 8-bit unsigned image data is needed), the whole process yields a raw image file, ready to be displayed with an arbitrary image viewer. Sounds simple, eh? :)
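The conversion step described above can be sketched in Python. This is a minimal illustration, not the code from the mp3img script: the function names are hypothetical, and the little-endian default for the sample byte order is an assumption about the decoder's raw output.

```python
import array
import sys


def pcm16_to_gray8(pcm: bytes, byteorder: str = "little") -> bytes:
    """Map signed 16-bit samples to unsigned 8-bit pixel values."""
    samples = array.array("h")  # "h" = signed 16-bit integers
    samples.frombytes(pcm)
    if byteorder != sys.byteorder:
        samples.byteswap()
    # Keep the top 8 bits (-128..127), then shift the range to 0..255.
    return bytes((s >> 8) + 128 for s in samples)


def gray8_to_pcm16(pixels: bytes, byteorder: str = "little") -> bytes:
    """Inverse mapping: unsigned 8-bit pixels to signed 16-bit samples."""
    samples = array.array("h", ((p - 128) << 8 for p in pixels))
    if byteorder != sys.byteorder:
        samples.byteswap()
    return samples.tobytes()
```

Without any lossy codec in between, the two functions are exact inverses of each other, so every difference in the final image comes from the MP3 round trip itself.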
I just reimplemented the whole thing, this time using some more-or-less standard Unix tools (bash, ImageMagick, LAME, Python) instead of Turbo Pascal and Paint Shop Pro handiwork. For the first test, I used a representative image, scaled it down to 960×720 pixels, and pretended that it was a 24 kHz mono sound file. The LAME encoder options were -q 0 -k -b $SomeBitrate (maximum quality, no lowpass filtering), with bitrates from 8 kbps upwards.
The results are not as bad as the rude misuse of an audio-specific algorithm for image compression would suggest: Above 24 kbps (about 1 bit per pixel), quality is actually quite acceptable. Of course, it doesn’t even remotely match JPEG’s quality; there are still many artifacts that look a bit like noisy analogue TV reception.
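The "about 1 bit per pixel" figure follows directly from treating each pixel as one audio sample. The following sketch spells out that arithmetic; the function names are made up for illustration, and the size estimate ignores any header or padding overhead in the MP3 stream.

```python
def bits_per_pixel(bitrate_kbps: float, sample_rate_hz: int) -> float:
    """One pixel is one sample, so bits/pixel = bits/second / samples/second."""
    return bitrate_kbps * 1000 / sample_rate_hz


def playing_time_s(width: int, height: int, sample_rate_hz: int) -> float:
    """Duration of the 'sound file' that the image pretends to be."""
    return width * height / sample_rate_hz


def approx_stream_bytes(width: int, height: int,
                        bitrate_kbps: float, sample_rate_hz: int) -> float:
    """Rough compressed size: total pixels times bits per pixel, in bytes."""
    return width * height * bits_per_pixel(bitrate_kbps, sample_rate_hz) / 8
```

For the 960×720 test image at 24 kHz, this gives a playing time of 28.8 seconds, 1 bit per pixel at 24 kbps, and a compressed size of roughly 86,400 bytes at that bitrate.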
By the way, the generated MP3 files sound a bit like recordings from space probes, so if you need some nifty sound effects, mp3img might be the right tool for you :)
Files
- mp3_vs_jpeg.png (355k) – the MP3 vs. JPEG quality assessment
- mp3img (1k) – shell script to compress image files with MP3
(This article is also available in Russian at Softdroid)
Cool!
Nice. How about AAC and OGG? :)
So I guess JPEG compression of a sound file is the next step?
@Toke, why not go for ethernet-over-audio by pumping arbitrary data through the MP3 encoder, playing it, recording it on another machine, then trying to reverse that to what the MP3 encoder saw? Purpose unknown, but sounds challenging in a fun way.
You’ve just invented dialup
Cool, so you’re now halfway to visualizing sounds from space ;)
Cool! I had a project in which the objective was sending images encoded as text. This comes in handy in remote areas where only e.g. GSM is available. The challenge is in compression. ;)
@Miguel, isn’t this just using something like PNG or JPEG compression to get a byte stream, then using something like Base64 to encode it as text? I would imagine this is doable in about 2 dozen lines of code (less if you don’t care about readability).
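A rough sketch of the idea in the comment above, using only the Python standard library. Note the substitution: zlib (the deflate compression that PNG is built on) stands in here for a real PNG or JPEG encoder, and both function names are hypothetical.

```python
import base64
import zlib


def image_bytes_to_text(data: bytes) -> str:
    """Compress a byte stream and encode it as plain ASCII text."""
    compressed = zlib.compress(data, 9)  # stand-in for PNG/JPEG encoding
    return base64.b64encode(compressed).decode("ascii")


def text_to_image_bytes(text: str) -> bytes:
    """Reverse the process: decode the text and decompress the bytes."""
    return zlib.decompress(base64.b64decode(text))
```

Base64 turns every 3 bytes into 4 characters, so the text is about a third larger than the compressed data, which is why the compression step matters on a slow link.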
Nice! Finally I’ve got something to show to people who don’t believe that MP3 lowers the quality of an audio signal :)
Scanning every other line backwards should improve quality by reducing the discontinuity. I see long ringing around sharp features like the Stop sign. Large images with more gradual transitions would work better.
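The back-and-forth scan suggested above could look like the following sketch. The function name is hypothetical, and it assumes a raw 8-bit grayscale buffer stored row by row with a known width.

```python
def serpentine(pixels: bytes, width: int) -> bytes:
    """Reverse every other row so that consecutive samples stay adjacent.

    In a plain row-by-row scan, the last pixel of a row is followed by the
    first pixel of the next row, which lies on the opposite image edge.
    Reversing the odd rows removes that jump.
    """
    rows = [pixels[i:i + width] for i in range(0, len(pixels), width)]
    return b"".join(row[::-1] if y % 2 else row for y, row in enumerate(rows))
```

The operation is its own inverse, so the same function can be applied before encoding and again after decoding to restore the normal row order.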
When experimenting with color, it’s annoying that few audio tools support 3 channels. Only vorbis can take them. And a conversion from signed (sound) to unsigned (graphics) should be done.
Opus has a highpass filter for DC (solid colors). The images look like ghosts with glowing edges as if a TV was tuned slightly off channel.
When saving music into JPEG as Left-Red, Blue, Right-Green, I hear that the Green channel gets quantized less. Direct RGB encoding without a transform into YCbCr yields better quality. Color subsampling should be avoided, so that consecutive rows are not combined.
In case you’re reading this in 2024 and getting a “UnicodeEncodeError: ‘charmap’ codec can’t encode characters in position 23-25: character maps to <undefined>” error, try replacing the python command at the end with this:
python -c "import io;import sys;sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8');sys.stdout.write( 'P5\n$Dimensions\n255\n'+''.join([chr((ord(c)+128)&255) for c in sys.stdin.read()[::2]]))" <mp3img.gray >mp3img.pgm
It basically adds this:
import io
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')
So that UTF-8 terminals can display everything correctly, I guess. The original code wouldn’t work unmodified in my terminal (Windows Terminal, MSYS2).
vort3: Thanks for the fix! Python 3 didn’t even exist when this was originally written, and nobody could anticipate that they would be so bone-headed as to make it incompatible with existing Python versions … But I digress. While your fix technically works, I’d rather recommend solving the Python 3 issue in a much more idiomatic manner:
python3 -c "open('mp3img.pgm', 'wb').write(b'P5\n$Dimensions\n255\n' + bytes((c + 128) & 255 for c in open('mp3img.gray', 'rb').read()[::2]))"