Tuesday, October 23, 2012

Technology Marches On

When softsubbing first became possible and then fashionable, it involved a number of compromises. Fancy sign typesetting wasn't really doable, and elaborate karaokes seemed to go the way of the dinosaur. The technology for elaborate typesetting and the performance of softsubbed playback simply weren't there.

Well, time marches on. PCs have become powerful (I write this on a quad-core i5 that spends 99.9% of its time doing nothing). More importantly, the software for typesetting and subtitle playback has improved considerably. As a result, sign typesetting is approaching the glory days of hardsubbed AFX signs, and complex karaokes are rising from their graves.

For typesetting, the key development has been the use of motion tracking software, combined with automated frame-by-frame transforms that apply the tracked motion to a base sign. This allows frame-by-frame typesetting of signs that move in non-linear ways, with none of the hassle associated with the manual process. Back in 2009, I set a 50-frame moving sign in Orphan's Hand Maid May by hand, and it took me hours, with manual computation of the position deltas between frames. I made a similar effort with a non-linear sign in Orphan's Space Neko Theater; after that, I swore off the practice. But with new software technology, it's no longer necessary.

The motion tracking software is based on the sorts of techniques used in motion-capture special effects. After designating an initial set of points to be tracked, the software follows the image (in this case, a sign) through subsequent frames, generating coordinates. Those coordinates feed an Aegisub automation script that applies the coordinate changes (including changes in angles) to an initial typesetting specification. The result is a frame-by-frame sign that accurately tracks the motion on the screen.
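To make the idea concrete, here is a minimal sketch of the second half of that pipeline: turning tracked coordinates into one subtitle line per frame. It is not the actual Aegisub automation script, just an illustration in Python. The function names (`ass_time`, `track_sign`), the 23.976 fps frame rate, and the shape of the tracking data (one `(x, y, angle)` tuple per frame) are all assumptions made for the example. It also handles only translation and rotation, ignores scaling, and uses a simplified frame-to-timestamp conversion.

```python
from fractions import Fraction

# Assumed frame rate (23.976 fps); a real script would read this from the video.
FPS = Fraction(24000, 1001)


def ass_time(seconds):
    """Format a time in seconds as an ASS timestamp, H:MM:SS.cc (centiseconds)."""
    cs = int(round(seconds * 100))
    h, rem = divmod(cs, 360000)
    m, rem = divmod(rem, 6000)
    s, c = divmod(rem, 100)
    return f"{h}:{m:02d}:{s:02d}.{c:02d}"


def track_sign(text, style, first_frame, base_pos, base_angle, track):
    """Return one Dialogue line per frame, moving a base sign by tracked deltas.

    track is a hypothetical list of (x, y, angle_degrees) tuples, one per
    frame; the first entry belongs to the frame the base sign was typeset on.
    The angle is assumed to use the same sign convention as the \\frz tag.
    """
    x0, y0, a0 = track[0]
    lines = []
    for i, (x, y, a) in enumerate(track):
        frame = first_frame + i
        # Simplification: frame n is shown from n/fps to (n+1)/fps.
        start = ass_time(frame / FPS)
        end = ass_time((frame + 1) / FPS)
        # Apply the change relative to the first tracked frame.
        px = base_pos[0] + (x - x0)
        py = base_pos[1] + (y - y0)
        frz = base_angle + (a - a0)
        tags = f"{{\\pos({px:.2f},{py:.2f})\\frz{frz:.2f}}}"
        lines.append(f"Dialogue: 0,{start},{end},{style},,0,0,0,,{tags}{text}")
    return lines


if __name__ == "__main__":
    demo_track = [(10, 10, 0.0), (12, 11, 1.5), (15, 13, 3.0)]
    for line in track_sign("SIGN", "Default", 0, (100, 200), 0.0, demo_track):
        print(line)
```

A 50-frame sign like the one I set by hand would come out of this as 50 Dialogue lines, each with its own position and rotation, which is exactly the bookkeeping that used to take hours.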

Complex softsubbed karaokes are a more recent development. When Polished released a DVD version of Tokimeki Memorial a few years back, the initial version, which embodied C1's hardcoded karaoke in the script, simply wouldn't play. Polished had to redo the scripts with a simple, line-timed karaoke. The problem wasn't the speed of PCs; it was the subtitle rendering software, vsfilter, which suffered from a number of design bottlenecks. Recently, the community became sufficiently fed up to code up a replacement, called xy-filter, which is significantly more efficient.

I saw this in action with the recent DVD redo of Rescue Wings. When topf(h) added the typesetting, he simply incorporated the Ureshii karaokes verbatim, even though some of them run to thousands of lines. With xy-filter, they play back as smooth as butter on almost any modern machine. Now fancy karaokes are making a comeback. GotWoot's opening for the season's hit show, Magi, runs to 6500 lines. I'm sure more will follow.

These developments will not be without their detractors. Viewers with old PCs will be in trouble. Non-techies will have difficulty figuring out how and where to install xy-filter. As with the advent of the MKV container (replacing OGM), H.264 (replacing XviD), and 10-bit encoding (replacing 8-bit), the fansub community will by and large ignore the objections. One hopes that recoders (who typically change the original format to MP4 for playback on tablets and phones) can provide relief to technology laggards.

So with beautiful signs and complex karaokes again within reach, I think it's time for updated versions of some classics. I'd love to see Amatsuki, Yume Tsukai, Nodame Cantabile, and other favorites from Ureshii or C1 redone, with their original karaokes. (Some are probably still beyond reach: the Skip Beat OP karaoke is more than 8 MB in size.) If you've got the interest, and the raws, I have the scripts.


  1. Well, as one of those technological laggards, I doubt I'll be incorporating super-fancy karaoke and typesetting into my releases anytime soon. However, I have thought of upping my game in the karaoke department and incorporating some \t effects, rather than the 2002-era simple \k and \kf I use now. But I don't want to release files that I can't play comfortably, which is why I'm more likely to go with \an8 (or omission if it's irrelevant or duplicated in dialogue) than do 1500 lines of typesetting for a 1.4-second sign.

    1. Another flaw I've discovered with xy-vsfilter: it disables MPC's internal subtitle renderer, or at least the part of it that allows you to change subtitle styles within the player. So essentially, xy-vsfilter holds viewers hostage to the fansub group's style choices -- not a good thing for those whose eyesight is a little lacking or isn't as good as it used to be, or who have different preferences with regard to fonts, colors, sizes, borders, margins, etc.

      Though I suppose it does remain an option to extract the subs, edit the styles in Aegisub, and either remux or load the tweaked subs externally. But that's almost like requiring people to be fansubbers (or at least have fansubber software sitting around) if they want the same softsub style flexibility people enjoyed 6 or 7 years ago.