Tech Flashback: MP3 Encoding Through the Years

It may be hard to believe, but the MPEG-1/MPEG-2 Layer III standard (commonly known as MP3) is now 19 years old (from public release date, or 21 years old from the first standards release). In technology terms, this is an eternity.

The standard, developed by the Moving Picture Experts Group (that’s what MPEG stands for), provided audio compression which could reach 11:1 while still retaining “near CD quality” by using psychoacoustic principles – a type of perceptual coding which discards information that is not easily noticed or heard.
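To put the 11:1 figure in context, here is a small back-of-the-envelope sketch. It assumes standard CD audio parameters (44.1kHz, 16-bit, stereo) and a constant bit-rate MP3; the function names are just illustrative.

```python
# CD audio parameters (assumed: standard 44.1 kHz, 16-bit, 2-channel PCM).
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2


def cd_bitrate_kbps() -> float:
    """Raw PCM bit-rate of CD audio in kbit/s."""
    return SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS / 1000


def compression_ratio(mp3_kbps: float) -> float:
    """How many times smaller a constant bit-rate MP3 is than the CD source."""
    return cd_bitrate_kbps() / mp3_kbps


if __name__ == "__main__":
    print(f"CD audio: {cd_bitrate_kbps():.1f} kbit/s")
    for kbps in (128, 192, 320):
        print(f"{kbps} kbit/s MP3 -> {compression_ratio(kbps):.1f}:1")
```

At 128kbit/s the ratio works out to roughly 11:1, which is where the commonly quoted figure comes from.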

Real-time decoding initially required a decent amount of CPU power (at least 80486 class), but computing power advanced rapidly and MP3 decoding became almost trivial with the advent of the Pentium I and the integration of Floating Point Units. For many years, it was the compression standard of choice amongst consumers for its ubiquitous support in standalone flash-memory players, and later support in CD players as well. It is now an ISO standard, which gives it an advantage when it comes to archival.

Today, MP3 is exceeded in compression efficiency by MPEG-4 AAC, which is gaining wider acceptance and can deliver much higher quality, especially at low bitrates, although MP3 still sees wide use in many music libraries.

My Early MP3 Adventures

I didn’t discover MP3 until about 1998, when I was ripping CDs and exploring different compression modes. I was aware of the ACM compression modes offered with Windows, such as the GSM and ADPCM modes, but all of them featured significant audio quality impairment and didn’t really have a “middle ground” – i.e. ADPCM was too big, and ACELP voice modes were too small.

I was using a piece of shareware software called CDCopy v4.511 by Markus Barth. I found my archived copy of it the other day, and while it still runs, its dependence on ASPI means that it no longer is able to deal with ripping CDs on a modern Windows 7 x64 system.

The software itself seemed quite simple, and it even has many different formats supported, although some require external encoders to function.

I have fond memories of encoding my first MP3. It took a whole ten minutes to encode a three minute song. CDCopy had an integrated encoder, and while it was not clearly identified, it is highly probable that it is based on BladeEnc. That being said, it’s probably quite an old version, as it performs slowly (only roughly 2x real-time on my AMD Phenom II x6 1090T BE at 3.90GHz).

The output file was much smaller, having been encoded in 128kbit/s, and sounded minimally altered through the $10 “boxy beige” computer speakers I had at the time.

It was slow though – you know that when the software gives you the option of “System shut down when ready”. I couldn’t imagine what it would be like to compress a whole library of music at that rate, but I did come up with the bright idea that, with some tweaking down to mono, it became fully feasible to store a song on a floppy disk (not that anyone would want to do that nowadays). Windows Media Player (in Windows 98) already had a Fraunhofer-IIS decoder for MP3, so nobody needed to install any special software to play it back.
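The floppy disk idea is easy to sanity-check with a little arithmetic. The sketch below assumes a constant bit-rate file, ignores tags and other overhead, and assumes 1,457,664 bytes of usable space on a freshly formatted 1.44MB disk. I don't recall which bit-rate I actually used, so the values tried here are only examples.

```python
# Assumed usable space on a freshly FAT12-formatted 1.44 MB floppy disk.
FLOPPY_USABLE_BYTES = 1_457_664


def mp3_size_bytes(bitrate_kbps: float, seconds: float) -> float:
    """Approximate size of a constant bit-rate MP3 (no tags, no overhead)."""
    return bitrate_kbps * 1000 * seconds / 8


def fits_on_floppy(bitrate_kbps: float, seconds: float,
                   capacity: int = FLOPPY_USABLE_BYTES) -> bool:
    """True if a song of the given length fits on one disk at this bit-rate."""
    return mp3_size_bytes(bitrate_kbps, seconds) <= capacity


if __name__ == "__main__":
    song_seconds = 3 * 60  # the three minute song from above
    for kbps in (128, 64, 56):
        size = mp3_size_bytes(kbps, song_seconds)
        verdict = "fits" if fits_on_floppy(kbps, song_seconds) else "too big"
        print(f"{kbps} kbit/s: {size:,.0f} bytes -> {verdict}")
```

Under these assumptions, a three minute song at 128kbit/s needs about two disks' worth of space, while 64kbit/s just squeezes onto one, which is why dropping to mono made the idea workable.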

A decade later …

It wasn’t until almost a decade later that I discovered audiophile quality headphones and the deficiencies in MP3 became apparent. At most bit-rates, the quality of the MP3 is not completely transparent (although, above 256kbit/s, the differences require a lot of concentrated listening to discern).

Interestingly, it was found that despite decoders being the same, depending on which encoder was used, the perceived quality of a file at the same bit-rate is different. This can be especially true if you compare those encoded using “fast” settings versus those with “high quality”, but only if your ears and audio equipment are good enough.

Different MP3s at the same bit-rate can have different quality as well as a different character. The MP3s I encoded with the above program had a distinct swimminess to them which is not replicated by most MP3 encoders I’ve heard to date.

This in itself shows that MP3 encoding had room for improvement – the improvements were all made at the encoder end, with new and improved psychoacoustic models and encoding techniques. There was a phase in MP3’s life where people would steadfastly stick to their preferred encoder out of a belief that it was superior – some people preferred l3enc, mp3enc, bladeenc, or LAME. The consensus is now generally in LAME’s favour, especially as MP3 encoder development has generally slowed or halted.

I became aware of codec tests where the subjective quality of each encoder was tested to try and establish which encoder is best at a given bit-rate condition. There were many ties, and statistically-close results which seemed to show that many encoders were similar at the time of testing.

Encoding Through the Years

While the codec tests provide a quality metric based on mean opinion score, they don’t recognize that each person’s preference is a little bit different. To track the differences in MP3 encoding through the years, I’ve managed to obtain various MP3 encoders to test. All, surprisingly, still run on any x86 32-bit based Windows install.

  • L3ENC V0.99a Beta (1994) Fraunhofer-IIS Shareware
  • L3ENC V2.71 Fraunhofer-IIS (1996)
  • MP3ENC V3.1 Fraunhofer-IIS Demo (1998)
  • CDCopy 4.511 (1998) Shareware (Sounds identical to BladeEnc 0.94.2)
  • Helix MP3 Encoder v5.1 (2005) (Closest relative to Xing still available)
  • Poikosoft Easy CD-DA Extractor v15.3.2 (2011) (Using LAME 3.99)
  • Apple iTunes 11.1.4 (2014) (Latest Fraunhofer?)

A test track based on a 23-second lossless excerpt of Katy Perry – Walking on Air from her latest album PRISM (Deluxe) was used. As only a short excerpt is used, for the purposes of research into MP3 encoder development over the years, I believe this to be in the spirit of fair use and not a violation of copyright. Of course, if you like the song, go and buy it!

This was encoded with each encoder set to all allowable bit-rate ranges from 56kbit/s to 320kbit/s. Some encoders were unable to produce results at certain bit-rates and thus no samples are provided at those bit-rates. This ranged from error messages to crashes during encoding. Likewise, some encoders provided no control over sample rate – only files with 44100Hz sample rate were accepted. Some encoders automatically toggled between Joint-Stereo/Joint-Stereo with Mid-Side Extension/Stereo modes, and others were manually set to each to provide a comparison at how Joint-Stereo techniques can improve quality at a given bit-rate.

L3enc v0.99a is the earliest publicly released MP3 encoder, and as a result of its beta status, its bitstream does not play correctly in VLC or MPC-HC. It does play correctly in Quicktime and Winamp – the reason may be that it doesn’t strictly comply with the MP3 standard. As a result, you may hear glitching on the samples.

The Test Samples

I encourage you to explore as many test samples as you have the time or inclination to. It took a plentiful amount of time to make this happen, although I’m sure the majority won’t necessarily appreciate it.

My observations are as follows:

  • L3enc v0.99a is hilariously poor at most bitrates, although it does get better above 192kbit/s. Even at 128kbit/s, it sort of “shimmies” and has amplitude perturbations which are quite distracting. Below that, it kinda starts sounding muddy, swimmy and glitchy. It’s got a very interesting character.
  • L3enc v2.71 is markedly improved to the point it is rather acceptable. It does have harshness at the high end, but the glitchiness is mostly gone. Unfortunately, many bit-rates were not available due to an error message.
  • Blade produces a very interesting treble sing-along – it’s like a terrible pre-echo that affects all bitrates below 128kbit/s. At 128kbit/s, it tames itself into an over-noisy, quantised treble that’s distracting but very characteristic of many MP3s I came across in the early MP3 days. At 160kbit/s and above, it starts being acceptable, although it does sound a little hollow. The encoder puts itself at a big disadvantage by not supporting Joint Stereo (even when selected), and thus needs to dedicate half the bit rate to each channel (not being able to use the commonality between channels). This explains, to some degree, why the audio is poor, although not the magnitude of the swimmyness.
  • Helix produces less offensive results, and is quite acceptable at all the available bitrates. I think it’s likely that the earlier versions of Xing produced much lower quality files, and that it is only because this encoder comes from 2005 that it is competitive with modern encoders.
  • LAME produces fairly good results across the board, and it’s clear from comparing lower bitrate Joint-Stereo to Independent Stereo clips that Joint-Stereo gives the encoder an edge in preserving quality. That being said, the Independent Stereo clips at lower bitrates show quite clearly that MP3 is incapable of clearly conveying the full 44100Hz sample rate at below 96kbit/s. It seems that LAME makes good use of filtering to reduce the amount of glitches, at the expense of treble clarity.
  • iTunes likely uses the latest Fraunhofer-IIS encoder, and as a result, represents the “other” preferred encoder in use nowadays. It performs superbly when compared to earlier efforts.
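To illustrate why Joint-Stereo helps (as seen in the Blade and LAME observations above), here is a minimal sketch of the mid/side idea. It is only an illustration: real encoders apply the transform to frequency-domain data and decide frame-by-frame whether to use it, and the test signal below is entirely made up (a shared tone plus a small right-only component).

```python
import math


def mid_side(left, right):
    """Orthogonal mid/side transform: M = (L+R)/sqrt(2), S = (L-R)/sqrt(2)."""
    k = 1 / math.sqrt(2)
    mid = [(l + r) * k for l, r in zip(left, right)]
    side = [(l - r) * k for l, r in zip(left, right)]
    return mid, side


def left_right(mid, side):
    """Inverse transform; it has the same form as the forward one."""
    return mid_side(mid, side)


def energy(samples):
    """Sum of squared sample values."""
    return sum(v * v for v in samples)


def demo_signal(n=1024):
    """Hypothetical, highly correlated stereo signal for illustration only."""
    left = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]
    right = [0.9 * l + 0.05 * math.sin(2 * math.pi * 37 * i / n)
             for i, l in enumerate(left)]
    return left, right


if __name__ == "__main__":
    left, right = demo_signal()
    mid, side = mid_side(left, right)
    total = energy(left) + energy(right)
    print(f"mid carries  {energy(mid) / total:.1%} of the energy")
    print(f"side carries {energy(side) / total:.1%} of the energy")
```

When the two channels are very similar, almost all of the energy lands in the mid channel, so the encoder can spend most of its bits there instead of splitting them evenly. An encoder that always codes left and right independently cannot take advantage of this.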

Conclusion

When you listen to the files side-by-side, it’s clear that the quality of MP3 encoders has improved dramatically throughout the lifetime of the MP3 standard. Improvements in the encoders require no changes at the decoding end at all, and are solely in algorithms and psychoacoustic models. The biggest improvements have been at lower bitrates, where quality has improved through a reduction in the anomalous pre-echo, swimmyness, glitchyness or muddiness that is perceived. Different encoders do have different characteristics: for example, BladeEnc seemed to over-emphasize the treble, whereas LAME tried to tone it back slightly to reduce the likelihood that glitches would become objectionable.

I suppose this is also a good reminder to those hanging onto old encoding software that encoders have improved, and by using older software, you are not gaining the benefits of faster encoding (through better instruction set optimization for later CPUs) and higher quality through improved psychoacoustic models.

Appendix: Test Sample Data

A tool called Mr Questionman was used to try and gain insights into which encoder likely produced the given files. While the iTunes encodings seem to be occasionally mis-identified as BLADE, the rest of the encodings produced consistent results.

About lui_gough

I'm a bit of a nut for electronics, computing, photography, radio, satellite and other technical hobbies.