Codec shoot-out 2005 - Qualification

Table of contents:

1 : Introduction
2 : What's new in 2005
3 : Test setup
4 : Usability
5 : Sources, bitrates and sizes
6 : Settings
7 : Encoding speed
8 : Test 1: Matrix (normal bandwidth) -- ultra high bandwidth
9 : Conclusion
10: Outlook

Welcome, ladies and gentlemen, to yet another installment of my (in)famous codec comparisons.

Another year has elapsed and it is time to look at the current state of video codecs again. The past year saw the rise of MPEG-4 AVC codecs, most notably the open source x264 variety, as well as QuickTime, which also includes an AVC codec. AVC also made it into the first hardware devices, like the Sony PlayStation Portable and Apple's latest iPod, and with the launch of HD DVD and Blu-ray looming, hardware capable of handling high definition AVC content will soon be available.

In the MPEG-4 ASP arena, not much has happened. XviD development slowed down considerably, 3ivX didn't have much news either, so we saw only the public release of last year's first timer, HDX4, and of course the launch of DivX6.

Besides Microsoft's implementation, VC-1 is now also available in hardware form from a handful of partners, but when it comes to software, Microsoft's implementation is still the only one I know of.

What's new in 2005

This year, I'm introducing several major changes to the codec comparison. For starters, last year's 10 contestants really pushed the envelope in terms of time spent on the comparison, and at the same time, people kept asking why certain codecs were not included. Thus I started looking for a means to include more codecs without blowing up the amount of time spent, and I decided to adopt a system known from sports:

Instead of putting all codecs through the same test at the same time, I opted for a three stage test:

Stage 1 is known as the qualification round. But unlike in sports, past experience gives me a pretty good idea of how a codec can perform, and I use this knowledge to pass codecs directly to the next round. For instance, last year's winner obviously doesn't have to qualify anymore, and neither do last year's runner-up and all other codecs that left a good impression. The qualification phase is thus meant only for those codecs that did not perform well in previous tests, and for completely new entries. The qualification phase involves a direct comparison against XviD 1.1 beta 2 on the movie Matrix 1, which at this point is not a very hard source to encode. Any codec that can play in the same league will be qualified.

Stage 2 is known as the group round. All qualified codecs will be placed into three groups based on the standard they implement. The groups are MPEG-4 ASP, MPEG-4 AVC and non MPEG-4. All codecs in a group have to face off against each other, with the best two making it to the final round. This round uses Matrix Revolutions as its source - using the same settings as the 2004 comparison.

Stage 3 is the final round. In this round, the remaining 6 codecs will face off against each other using all three sources - the ones known from previous comparisons, and the new one, Steamboy - a two hour anime movie which proved to be rather hard to encode.

To level the MPEG-4 playing field somewhat, this year all MPEG-4 codecs will be using the MP4 container. Non MPEG-4 codecs will use the AVI container or their native container.

 

Having laid out how the comparison works, let's have a look at the history of the codecs that are entered in this comparison:


Similar to the comparisons in previous years, the qualification phase lasted two weeks. Codec makers had up to 1.5 weeks to provide suggested settings and binaries for the task at hand. I have decided to include QuickTime despite not getting an answer from Apple regarding participation in this comparison. Needless to say, I have to consider this a sign that Apple is afraid of competition, as all other players, major and minor, have no problem participating.

The contestants

All codecs were tested in a 2 pass setup using the settings suggested by the developers where applicable. At this time, Dirac only offers one pass encoding.

Test setup:

All testing was done on the following hardware:

AMD Athlon 64 X2 4600+
Shuttle SN25P barebone system
2x 512 MB PC3200 Kingston HyperX DDR RAM CL2
Gigabyte GeForce 7800GT GFX card
Philips 230W5B2 Display connected via DVI

To encode I used the following software:

DGIndex 1.4.5 & DGDecode
AviSynth 2.55 for frameserving (avs2yuv doesn't support 2.56)
VirtualDub 1.6.11 for VfW encoding
Nandub 1.0 RC2 lumafix to multiplex AVI
MP4Box CVS build dated November 24th to multiplex MP4

Usability and additional software

This year marks the introduction of another review category: usability. This is because the test field is very inhomogeneous and we have various means of encoding. Besides the traditional VfW codecs, more and more codecs offer a commandline utility or a DirectShow filter for encoding, or have a whole suite that does a lot more than just encoding. In this test, QuickTime is the most special tool because it's basically a player that can export a file it can play to another format.

Here is a quick table outlining which encoding methods are available:

Codec     VfW     CLI     DirectShow   Other
Dirac     -       x(2)    -            -
Elecard   -       x(3)    x(7)         x(9)
LMP4      x(1)    x(4)    -            -
ND ASP    x(3)    x(3)    x(8)         x(10)
QT7       -               -            x(11)
Snow      x(1)    x(4)    -            -
Theora    x(1)    x(5)    -
VSS       x       x       -            -
XviD      x       x(6)    -            -

(1): ffdshow
(2): ffmpeg with Dirac patch applied (generally not publicly available so you have to build it on your own), encodedirac
(3): not publicly available
(4): ffmpeg or mencoder
(5): ffmpeg2theora
(6): encraw
(7): Mainconcept H.264 encoder
(8): Early versions could be used in Graphedit if you renamed Graphedit to recode.exe; later versions put a complete stop to this
(9): Elecard MobileConverter
(10): Recode2
(11): QuickTime player

Despite the variety of tools, there are some crucial problems for certain codecs:

Elecard was kind enough to provide a commandline encoder for me. Initially it could only handle raw YUV input, but they created a special version for me that supports direct AviSynth input. End users will have to use the MainConcept / Elecard tools or encode via Graphedit.

Dirac: Those who have previously tried Dirac know why the codec is not very popular. Dirac's own encoder can only handle YUV input, and if you decode Matrix to a raw YUV file, you're dealing with a 62 GB temporary file. There is a patch for ffmpeg, and ffmpeg accepts YUV input via stdin, meaning you can use avs2yuv to decode an AviSynth script and pipe the output directly to ffmpeg to create a direct encoder chain. However, standard ffmpeg distributions don't contain this patch so you need to make your own build.

A warning to those familiar with avs2yuv and mencoder: ffmpeg requires raw data from avs2yuv or needs to be told that the input is yuv4mpeg.

Finally, QuickTime proved to be the most difficult of the bunch. It opens AVIs, but very few of them. In fact, it cannot handle AviSynth scripts wrapped into an AVI, VFAPI frameserving, or any other kind of frameserving. With an MPEG-2 plugin, it can read VOBs, but that isn't an option if you want to provide the same input to all contestants. The only other alternative is to decode your video to an AVI containing raw YUY2 data, change the fourCC to something QuickTime can handle (and which consequently no other Windows based player recognizes), and then encode. So, for QuickTime, there's effectively no alternative to the 62 GB intermediary file, and that alone is grounds for disqualification. Are you willing to decode every video source to a raw stream first, just so that QuickTime can open it, when no other codec requires this?

Playback

Codec     VfW     DirectShow   Other
Dirac     -       x(3)         x(5)
Elecard   -(1)    x            x(6)
LMP4      x(2)    x(2)         x(6)
ND ASP    -       x            x(6)
QT7       -       x(4)         x(7)
Snow      x(2)    x(2)         x(6)
Theora    x(2)    x(2)         x(6)
VSS       x       x            x(6)
XviD      x       x            x(6)

(1): If the stream is muxed into an AVI and the fourCC is set to a value a pre-installed MPEG-4 AVC VfW codec supports
(2): ffdshow
(3): only supports raw Dirac streams, without audio
(4): using QuickTime Alternative
(5): mplayer or VLC. You need to apply a patch to mplayer and create your own build, and compile VLC with Dirac support after having built the Dirac libraries.
(6): mplayer or VLC
(7): QuickTime

There is a DirectShow filter which allows playback of Dirac in all DirectShow based players. However, this only applies to raw Dirac streams. While a patched ffmpeg can mux Dirac into AVI, the only way to play those files is a specially patched build of VLC, and since this is not a standard build, once again you have to create one on your own. In addition to that, Dirac streams do not allow any seeking at all. You can pause and resume, but there's no going back or forward. Naturally, you also cannot load those AVI files containing Dirac into your favorite AVI editor because there's no Dirac VfW codec.

The only other noteworthy contestant is VSS. The VSS VfW codec uses packed bitstream, which only the VSS decoder can handle. But if you use mp4creator or ffmpeg to extract the raw data from a VSS AVI, and mux that raw stream into an MP4, the files will play fine using any other AVC capable decoder.

Last but not least, seeking in Theora will make your eyes hurt. Instead of going back to the last keyframe and decoding from there, the decoder shows you the frame as is, so the picture stays very distorted until the next keyframe. Now imagine taking screenshots like that...


In terms of flexibility, VSS, LMP4, Snow and XviD are the most flexible, then comes Elecard, and QT7 and Dirac are clearly insufficient. A 62 GB intermediary file is just as unacceptable as being forced to use a single player if you want to add audio to your video, and having no seeking. Therefore, both Dirac and QuickTime are disqualified at this point. They will be featured in this document where applicable (I decided to include them in the speed table), but there is no information about the final filesize because I never encoded the entire movie using either codec.

Sources, Bitrates and Sizes

Movies I encoded:

Matrix - Region1, NTSC, length: 2h16

Encoding parameters:

I used a 128 kbit/s CBR MP3 audio track I created using BeSweet 1.5 beta 31 for all codecs using the AVI container. I used the same BeSweet and the latest Nero 6 AAC encoder DLLs to create a 128 kbit/s CBR HE AAC audio track for all codecs using the MP4 container. As in previous years, the goal is to put Matrix onto a 700 MB CD. The audio file sizes were as follows:

The proper filesize would be 127'826 KB, so Lame is pretty much spot on, with Nero being a little too large (nothing new, it was like that last year as well) and Vorbis being even a bit higher, but they are close enough to not require another attempt at encoding.

With MP3 being muxed into an AVI and the AAC audio stream into the MP4 container, this resulted in a video bitrate of 587 kbit/s for codecs using the MP4 container, and 580 kbit/s for codecs using the AVI container. This difference may appear unfair, but let's not forget that this comparison is trying to use a realistic backup scenario, and it's simply not possible to put all codecs into the same container. Finally, I was told by people working on Theora that the Ogg overhead is 1.08%, so that gave me a video bitrate of 581 kbit/s for Theora.
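The bitrate budgeting described above can be sketched roughly as follows. This is only an illustration: the runtime of exactly 2h16, the unit conventions (1 KB = 1024 bytes, 1 kbit = 1000 bits) and the 1% overhead figure are assumptions, and the exact container overheads behind the 587 and 580 kbit/s figures are not stated here.

```python
KB = 1024  # bytes per KB, matching 700 MB = 716'800 KB


def video_bitrate_kbps(target_kb, audio_kb, duration_s, overhead=0.0):
    """Video bitrate in kbit/s (1 kbit = 1000 bits) that fills the remaining space.

    overhead is the fraction of the remaining space assumed lost to the container.
    """
    video_bytes = (target_kb - audio_kb) * KB * (1.0 - overhead)
    return video_bytes * 8 / 1000 / duration_s


target_kb = 700 * 1024            # 716'800 KB, one CD
audio_kb = 127_826                # ideal audio size quoted above
duration_s = 2 * 3600 + 16 * 60   # 2h16 as listed; the exact runtime is an assumption

ideal = video_bitrate_kbps(target_kb, audio_kb, duration_s)
# 1% container overhead is a made-up value for illustration only
with_overhead = video_bitrate_kbps(target_kb, audio_kb, duration_s, overhead=0.01)
print(round(ideal, 1), round(with_overhead, 1))
```

With these assumptions the result lands around 590 kbit/s before overhead, in the same range as the bitrates used in the test. The remaining difference comes down to the exact runtime and the per-container overhead.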

As you may know, not every rate control mechanism is perfect so here are the final movie sizes I got:

Codec     Matrix
Dirac     n/a (disqualified prior to encoding the full movie)
Elecard   717'015 KB
LMP4      716'589 KB
QT7       n/a (disqualified prior to encoding the full movie)
ND ASP    716'740 KB
Snow      720'132 KB
Theora    738'231 KB
VSS       717'395 KB
XviD      714'598 KB

Note that 700 MB equals 716'800 KB. I've marked every size in orange if it is more than 2.5 MB above target size (2.5 MB is about the oversize a CD can have while you can still burn it without having to activate overburning), and in red if it's more than 10 MB off target (grounds for disqualification). The ones marked in bright yellow are undersized by more than 2.5 MB (that points to a rate control issue as well but is less problematic than oversized files).
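As a rough sketch, the marking rules can be written down like this. The label "ok" for unmarked sizes is my own placeholder, and treating "off target" for red as applying in both directions is an assumption. The sizes are the ones from the table above.

```python
TARGET_KB = 700 * 1024        # 716'800 KB
SOFT_LIMIT_KB = 2560          # 2.5 MB
HARD_LIMIT_KB = 10 * 1024     # 10 MB


def classify_size(size_kb, target_kb=TARGET_KB):
    """Return the marking for a final file size, following the rules above."""
    diff = size_kb - target_kb
    if abs(diff) > HARD_LIMIT_KB:   # assumption: red applies to both directions
        return "red"
    if diff > SOFT_LIMIT_KB:
        return "orange"
    if diff < -SOFT_LIMIT_KB:
        return "yellow"
    return "ok"                     # hypothetical label for unmarked sizes


sizes_kb = {
    "Elecard": 717_015, "LMP4": 716_589, "ND ASP": 716_740, "Snow": 720_132,
    "Theora": 738_231, "VSS": 717_395, "XviD": 714_598,
}
markings = {codec: classify_size(kb) for codec, kb in sizes_kb.items()}
for codec, mark in markings.items():
    print(f"{codec:8s} {sizes_kb[codec] - TARGET_KB:+7d} KB  {mark}")
```

Under these rules only Snow (orange) and Theora (red) get a marking. XviD is about 2'200 KB under target, which stays just inside the 2.5 MB window.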

As you can see, this year we got really close to target. Two results are noteworthy. I got two profiles from VideoSoft, and the higher quality profile (which basically reduced encoding speed by 50%) ended up 16 MB undersized, which VideoSoft deemed acceptable, but since we switched to the profile that offers a better speed / quality tradeoff, the problem doesn't show here. Also noteworthy is that XviD undershoots the target by more than 2000 KB. It is not yet clear what caused this, but perhaps plugging a bitrate into the encoder instead of a target size like I usually do has something to do with it (target size is not an option if you're going to extract the video stream from the AVI to put it into an MP4 - there's no bitrate calculator that can give you the target size in this scenario). Since Theora has no 2 pass ratecontrol, I figured I'd still have a look at the result to see if it makes any use of the roughly 21 MB oversize.

Codec settings (applied with respect to the default codec settings)

As you can see, Snow doesn't allow regular two pass encoding. Instead, you need to make the first pass using a target quantizer that should match the average quantizer of the second pass as closely as possible. If you don't pick a proper value for the first pass, your second pass size will be off.

In addition, there's no reliable way to reach a certain target size with Dirac, making it useless when you aim for a specific file size (unless you're willing to encode repeatedly until you hit the target size, and at 5 fps, that's going to take you a long time).

Encoding speed:

Since there is only one source in this qualification, the choice was easy. As usual, I'm performing a two pass encode over the first 10'000 frames, noting the time spent from the start of the first pass till the end of the second pass, and dividing the number of frames encoded by that time to get the speed in frames per second. The usual disclaimer also applies: even though 10'000 frames should be rather representative, I only performed each measurement once and did not reset my PC in between. Looking at how major hardware sites on the web measure speed, I think my methodology still is a considerable improvement, but it in no way lives up to a scientific standard.
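In code, the speed figure boils down to something like the sketch below. The elapsed time of 108 seconds is a made-up example value, not a timing taken from the test, and counting the 10'000 frames once for the whole two pass run is an assumption.

```python
def average_fps(frames, elapsed_seconds):
    """Average encoding speed: frames divided by total wall-clock time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return frames / elapsed_seconds


# Hypothetical example: first pass start to second pass end took 108 seconds
example_fps = average_fps(10_000, 108.0)
print(f"{example_fps:.2f} fps")
```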

Codec     Speed        SMP optimized
Dirac     5.00 fps     -
Elecard   40.08 fps    x
LMP4      26.01 fps    x(1)
ND ASP    139.36 fps   x
QT7       7.45 fps     x
Snow      18.60 fps    -
Theora    24.91 fps
VSS       45.98 fps    x
XviD      92.59 fps    -(2)

(1): Not used because it trades visible quality for a rather low gain in speed
(2): XviD is not SMP optimized but because VirtualDub separates input reading (and thus decoding and resizing of the source) from encoding, it can still make good use of an SMP system. This naturally applies to other VfW codecs as well when used within VirtualDub.

In addition to disqualification due to usability issues, I put another killer requirement in the speed area. With my CPU being just 1 MB of cache away from the fastest workstation CPU there is for video encoding these days, I decided to disqualify all codecs that do not reach at least 10.0 FPS. This not only cuts down the total encoding time for the comparison; it also has to be considered that if you have a slower CPU, encoding speed will be much lower. For instance, my forum moderator Bond also made a QT test on his 833 MHz P3, and encoding proceeded at 0.7 FPS. Or take Dirac's 5 FPS and translate this into HD content. If we assume we're going to raise both horizontal and vertical resolution by a factor of two, and thus the number of pixels by a factor of four, and if we assume encoding time scales linearly with the number of pixels, that means 1.25 FPS, and encoding Matrix would take 85 hours or 3.5 days. If the result crushed the competition utterly, you might be willing to do that, but if you combine that with the usability problems and the severe video quality problems I will outline on page two, then it only makes sense not to wait that long.

I've used the following color scheme for the speed table: red for codecs disqualified due to a single digit FPS rate, orange for codecs below real-time speed (real-time means the codec encodes at least as many frames per second as the video has), and green for codecs that encode faster than 2x real-time. VSS just barely misses the green category, and NeroDigital reaches speed levels previously unheard of.
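The speed categories can be sketched the same way. Real-time is assumed to be 24000/1001 (about 23.976) frames per second, which is what force film produces from an NTSC film source. The label "unmarked" is my own placeholder for speeds between real-time and 2x real-time.

```python
REALTIME_FPS = 24000 / 1001   # assumption: force-film NTSC source, ~23.976 fps
MIN_FPS = 10.0                # disqualification threshold


def classify_speed(fps):
    """Return the color category for an encoding speed in frames per second."""
    if fps < MIN_FPS:
        return "red"
    if fps < REALTIME_FPS:
        return "orange"
    if fps > 2 * REALTIME_FPS:
        return "green"
    return "unmarked"         # hypothetical label for everything in between


speeds = {
    "Dirac": 5.00, "Elecard": 40.08, "LMP4": 26.01, "ND ASP": 139.36,
    "QT7": 7.45, "Snow": 18.60, "Theora": 24.91, "VSS": 45.98, "XviD": 92.59,
}
categories = {codec: classify_speed(fps) for codec, fps in speeds.items()}
for codec, category in categories.items():
    print(f"{codec:8s} {speeds[codec]:7.2f} fps  {category}")
```

With 2x real-time at roughly 47.95 fps, VSS at 45.98 fps falls just short of green, which matches the remark above.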

Other important stuff

Here's the AviSynth script I used to encode the movie, so that you can perfectly reproduce my results. I used force film in DGIndex for Matrix.

Matrix script:

mpeg2source("D:\DVDs\THE_MATRIX_16X9LB_N_AMERICA\VIDEO_TS\matrix.d2v")
crop(0,60,-2,-64)
LanczosResize(640,272)
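To illustrate what the crop and resize lines do to the frame size, here is a small sketch. It assumes a 720x480 NTSC DVD source, which is not stated in the script itself. In AviSynth, negative values in the last two Crop arguments are offsets from the right and bottom edges rather than a width and height.

```python
def crop_size(width, height, left, top, right, bottom):
    """Frame size after an AviSynth-style Crop with negative right/bottom offsets."""
    if right > 0 or bottom > 0:
        raise ValueError("this sketch only handles zero or negative right/bottom offsets")
    return width - left + right, height - top + bottom


# Assumed source resolution: 720x480 (NTSC DVD)
cropped = crop_size(720, 480, 0, 60, -2, -64)
resized = (640, 272)  # target of LanczosResize in the script

print("after crop:", cropped)
print("after resize:", resized, "mod 16:", resized[0] % 16 == 0 and resized[1] % 16 == 0)
```

Both target dimensions are multiples of 16, which is a common choice for macroblock based codecs.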

Playback

Because there is no fourCC in MP4, there is effectively only one playback filter for all MPEG-4 codecs this year: the ateme decoder. It supports all MPEG-4 AVC high profile features including interlaced content (for which I have zero use and which should be banned altogether but that's another issue entirely). I could've picked the libavcodec decoder instead, which is more often used. The main reason I chose the ateme decoder is that it supports frame accurate seeking, so I can tell my player to jump to frame X, and it will jump there. ffdshow can't do this properly and it's thus not a perfect implementation of a DirectShow playback filter. I also came across another issue: ffdshow brightens up the picture unnaturally when I use the VMR9 renderer (which I must in order to take screenshots). You can see that in the Theora screenshots on the next page. And just in case, I still gave ffdshow a go in the lobby shootout and found that while decoders do tend to decode a bit differently, it has no effect on how you rate codecs in relation to each other.

I enabled deblocking for MPEG-4 ASP content. There is no postprocessing for AVC content because AVC has a built-in deblocker (and those who have activated postprocessing for AVC content in ffdshow will be in for a really bad surprise... it works but completely ruins your picture). ffdshow was used for Snow, and Dirac's own DirectShow filter was used for Dirac.

All files were reviewed using Media Player Classic as it can play back pretty much everything you can throw at it. Furthermore, the player can forward to any frame in the video which literally saved me days when making screenshots. Thank you Gabest for making such a great player!

Now proceed to the test (low bandwidth JPEG version). This version loads 262 KB of images at the beginning, and the total image size for the Matrix test is 1.72 MB. If you have a lot of bandwidth and / or don't mind waiting, there's a high bandwidth PNG version (it loads 1.48 MB of images initially, and if you want to see all the images, you have to download 9.52 MB). Also note that the high bandwidth version requires that you have enough browser cache left or the images will be reloaded each time you zoom in or out.

This document was last updated on December 26, 2005