Saturday, August 27, 2011

Encoding Wars - The Return of the Revenge of the Sequel

Over the past decade, the video compression technology used in fansubs has changed several times - from DivX to XviD, from XviD to H.264 - as has the preferred container format - from AVI to OGM to MKV and MP4.  Each of these transitions has been accompanied by bitter complaints and a fair amount of inconvenience.  The transition to H.264 was very painful for viewers with slow computers.  When MKV first appeared, the comment section of BoxTorrents (now BakaBT) was filled with jeremiads against the new format.  Even the DivX/XviD transition has left scars - some encodes from that period will not play correctly, because they were done with beta versions of the codecs and contain encoding bugs.

Time has fixed most of these issues.  Today's multi-core PCs laugh off H.264.  The Combined Community Codec Project (CCCP) has provided a standardized playback kit for Windows, as well as a benchmark for testing the correctness of encodes.  Backward compatibility is quite good, barring only the occasional ancient, buggy encode.

What time hasn't fixed is the problem of encode bloat.  Most encoding advances have been introduced with the claim that they would reduce the size of the video.  When H.264 was introduced, that was true initially: H.264 episodes were smaller than their XviD counterparts.  But fairly quickly, they became equal size, and then larger: the codec improvements were used to preserve more detail rather than reduce file size.  File sizes grew, and then grew some more.  The transition to HD resolutions exacerbated the problem.

This is a circuitous introduction (tl;dr) to the latest improvement in encoding - 10-bit H.264, sometimes called Hi10P.  I don't profess to understand the technical details, but the claim is that Hi10P reduces video encode size by 30% at equal quality.  Experiments with Hi10P began in the spring.  This summer, CCCP added formal support, and now the floodgates are open.

Compared to previous transitions, this one is pretty painless.  CCCP seems to work correctly "out of the box."  Many groups are moving to Hi10P in a sensible way: shows that were started in 8-bit technology are being finished that way.  And as with previous transitions, the improved compression technology is being used to reduce file size.

Would anyone care to take a bet on how long that will last?  My prediction is that by this time next year, Hi10P encodes will be as big as their 8-bit counterparts are today, and after that, the seemingly inexorable rise in file sizes will continue.  As a (near) senior citizen with so-so eyesight, I find this baffling.  I don't need to see every imperfection in the original film stock or cels.  Personally, I think this is a form of competition among encoders: mine is bigger than yours, so to speak.  The offers section of BakaBT is filled with new encodes claiming to produce the next minute improvement in visual quality (often at the expense of subtitle legibility, timing accuracy, or typesetting fluidity).  A few encoders buck this trend: for example, Atsui strikes a fine balance between quality and size.  But for almost everyone else, it seems that bigger is better.

One final note on the stampede to Hi10P.  An encoding colleague says that at the moment, there are mutually canceling bugs in the encoding and decoding software that produce slightly "tinted" encodes and then correct it in playback (something about color-space translation).  As a result, when the software is corrected, these early Hi10P encodes will look slightly tinted.  I won't notice it of course, but I wonder if we'll see a rash of "Hi10Pv2" encodes at some point?


  1. This is one of the most sensible posts I've seen about 10-bit encoding so far.

  2. I don't foresee an across-the-board improvement in file-size/quality balance. After all, I've already read about how notorious bloat-coder Tenshi (of Coalgirls fame) plans to deliberately offset any efficiency gains from 10-bit encoding by jacking up bitrates or lowering CRFs even more.

    As the proud owner of a slow computer, I didn't find the transition to h264 that bad -- it was the transition to 720p and greater use of softsubbed typesetting/karaoke that proved problematic.

    But really, I think the best encoding advancement has not come with codecs, but with the use of CRF encoding to produce variable file sizes based on what bitrates the content actually *needs* to look good. The abandonment of CD-R based constant sizes like 175/233 MB has meant a little less pointless bitrate starvation and bloat, at least in the SD releases I typically watch. Contemporary ~220 MB 480p h264 encodes are a dramatic improvement over 175 MB XviD encodes from 2006, and the improvement is greater than what you'd get by cranking up the filesize to 220 MB with XviD.

  3. The main problem with the hype about Hi10P is misinformation.

    I understand that this is not really a blog about encoding, but the thing is that you have to see this from an encoder's viewpoint. To encoders, there are never enough bits to throw at a video. One of the most infamous things that will usually require more bits per frame is a scene involving rain or snow. Even the currently accepted encoding quality/sizes do not cover cases where there is a lot of high-frequency spatial content (thus requiring more bits to encode the same scene).

    My own group is beginning to do Hi10P encodes and releases (with a build of x264 that doesn't have the color conversion bug that was discussed). However, our Hi10P encodes are done at the same bitrate as the regular 8-bit encodes, and they are very close to the same size.
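
For readers who want to put numbers on the fixed file sizes mentioned in the second comment, here is a rough sketch in Python.  It converts a file size into the average total bitrate (video plus audio) that size allows.  The 24-minute episode length and the use of decimal megabytes are assumptions made for illustration, not figures from the post or the comments, and the function name is made up for this example.

```python
# Rough sketch: average total bitrate implied by a fixed file size.
# Assumptions (not from the post): a 24-minute episode and
# decimal megabytes (1 MB = 1,000,000 bytes).

def average_bitrate_kbps(size_mb: float, minutes: float = 24.0) -> float:
    """Return the average bitrate in kbps for a file of size_mb megabytes."""
    total_bits = size_mb * 1_000_000 * 8   # megabytes -> bits
    seconds = minutes * 60                 # running time in seconds
    return total_bits / seconds / 1000     # bits per second -> kbps


if __name__ == "__main__":
    # The three sizes mentioned in the comment above.
    for size in (175, 220, 233):
        print(f"{size} MB -> about {average_bitrate_kbps(size):.0f} kbps")
```

Under those assumptions, a 175 MB file has a little under 1000 kbps to spend on everything, no matter whether the episode is a quiet conversation or a snowstorm.  A fixed size starves the hard scenes and pads the easy ones; CRF encoding holds quality roughly constant and lets the size vary instead.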