When I took up fansubbing in 2006, most groups had strong QC
teams and QC processes. The teams took pride in putting out a good product;
revisions after release were looked on with distaste. That began to change in
the later years of the decade. Groups began to compete on getting releases out
quickly – the so-called "speed-subbing" phenomenon. Nothing shortens
time-to-release more than trimming lengthy parts of the release process,
like QC. gg was among the first to go this route, dispensing with QC
altogether. Then as simultaneous streaming took over, QC seemed less important,
because the "official subs" were assumed to be decent (a dangerous
assumption, it turned out). Old-line groups tried to retain a strong QC
process, but in most groups the QC team atrophied. Recruiting QCs became more
and more difficult.
QC is probably the least understood – and least appreciated
– part of the fansubbing process. QC is all about finding mistakes, not fixing
them. If you do your work as a QC well,
no one will notice; and if you do it poorly, everyone will blame you for the
mistakes that got through. QC requires attention to detail, as well as
selflessness that is rather rare these days. It's also an excellent way to learn and
understand the fansubbing process in all its complexity.
What is QC?
QC is the process of reviewing a fansub for mistakes – in
translation, timing, editing, typesetting, or encoding – and for possible
improvements. There are two phases: script QC (SQC) and release QC (RQC). The
former is focused on the script while it is still easily changed; the latter on
the final, hopefully releasable episode. In each case, the QC's job is to write
a report detailing the errors and suggested changes; it is not to change the script.
SQC is usually done in Aegisub, the ubiquitous tool for
subtitling anime. Aegisub offers many advantages, including the ability to
replay lines easily and to step frame-by-frame when necessary. It also has a
built-in spelling checker and other helpful tools.
RQC can be done in Aegisub, but it is better done by
watching an encoded and muxed file. This allows for checks that only apply to
the released file: missing or incorrectly typed fonts; missing or incorrect
chapters; random muxing mistakes that affect the video or audio.
Script QC (SQC)
Before starting, you will need the script, the encoded file,
and the fonts used in the episode. Any
unique fonts must be installed before invoking Aegisub. They can be deleted later,
if you don't want your font folder to become unduly cluttered. It also helps to
have an editing guide, which details the conventions to be used in the
show. (See my blog entry on editing for
information about compiling an editing guide.) If the translator or editor
didn't supply one, you should compile one for yourself. This is particularly
important for long series, where inter-episode consistency is easily lost, or
for fansubs based on CrunchyRoll scripts; CR is notorious for changing
character names from script to script.
With all that in hand, it's time to fire up Aegisub and
start looking for errors.
Translation Errors
Unless you know Japanese yourself, you are unlikely to find
true translation errors, but even a non-speaker can spot certain issues:
- Discrepancies in length. Sometimes a long Japanese line is translated as a very short English sentence. (The reverse happens as well, although it's less common.) Some compression is to be expected, particularly on conventional polite phrases, but significant length discrepancies may indicate that a phrase or clause has been dropped.
- Inconsistent romanization of names. Japanese names with long vowels (Kōsaku) can be romanized either by adding extra letters (Kousaku) or by treating long vowels as normal vowels (Kosaku). Whichever is chosen, it needs to be applied consistently to all Japanese names.
- Inconsistent honorifics. If the translation includes honorifics, then it needs to include them wherever they are present, and to exclude them when absent. It is easy to confuse honorifics with Japanese particles, e.g., to hear "-no" as "-dono."
- Inconsistent character names. This is a particular hazard in long series.
Timing Errors
Timers can have different conventions for handling lead-in,
lead-out, lines that cross scene boundaries, and so on (see this blog entry).
You need to understand the timer's preferred style before flagging timing
errors.
In checking timing, it is really helpful to have a keyframes
file. Modern compression algorithms, like H.264, do not necessarily put a keyframe at every
scene change and may insert a keyframe in the middle of a long, static scene.
A keyframes file provides a better (but not foolproof) indicator of where scene
boundaries really are. There are batch scripts that will generate a keyframes
file, if the encoder does not provide one.
While it is possible to check timing as you go, I usually
make a separate pass, looking only at the audio display in Aegisub, to check
timing. Issues to look for include:
- Missing lead-in or lead-out. Unless a line abuts a scene boundary or another line, it should have both lead-in and lead-out.
- Scene shortfalls. With certain exceptions, lines should not start or stop a few frames from a scene boundary. The timer should have a standard about how many frames after the start or before the end of a scene must be present. If the line violates these standards, it should be snapped to the appropriate scene boundary.
- Scene bleeds. Sometimes, a line crosses a scene boundary by just a slight amount. The decision of whether to terminate the line at the scene boundary, or to continue into the next scene, depends on the timer's standards. Some timers cross the boundary if there's a full word in the next scene; others if there's a full syllable in the next scene.
- Gap between adjacent lines. Two adjacent but separated lines must have a minimum time between them, as established by the timer. Otherwise, they should be joined by extending the lead-out of the first line and possibly the lead-in of the second.
- Lead-out/lead-in balance between joined lines. When adjacent lines are joined, the balance between lead-out and lead-in can be tricky, particularly if the time spacing is short. If there's any spacing at all, there should be both lead-out and lead-in, even if below the normal minimums.
- Song timing errors. After the first episode in a series, the song translations are simply cut and pasted from episode to episode. A line at the start or end may be missed. Changes in keyframes may result in scene bleeds. The songs need to be checked on every episode, a tedious process.
Timing checks are complicated by the issue of false
keyframes. Sometimes, a keyframe gets generated when there is, in fact, no real
scene change. Thus, every possible timing violation involving a keyframe has to
be checked to see if the scene boundary is really there.
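Some of the timing checks above can be roughed out in code before the manual pass. What follows is a minimal sketch, not a replacement for looking at the audio display: it assumes the line times and keyframe times have already been converted to milliseconds, and the function names and the thresholds `snap_ms` and `min_gap_ms` are hypothetical placeholders for whatever the timer's standards actually are. Because of false keyframes, every flag it raises is only a candidate and still has to be checked by eye.

```python
from bisect import bisect_left


def parse_ass_time(stamp):
    """Convert an ASS timestamp such as '0:01:23.45' to milliseconds."""
    hours, minutes, seconds = stamp.split(":")
    whole, centis = seconds.split(".")
    total_seconds = (int(hours) * 60 + int(minutes)) * 60 + int(whole)
    return total_seconds * 1000 + int(centis) * 10


def nearest_keyframe(keyframes_ms, t):
    """Return the keyframe time closest to t (keyframes_ms must be sorted, non-empty)."""
    i = bisect_left(keyframes_ms, t)
    candidates = keyframes_ms[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda k: abs(k - t))


def flag_timing(lines, keyframes_ms, snap_ms=250, min_gap_ms=300):
    """Flag candidate timing issues.

    lines: list of (start_ms, end_ms) tuples, sorted by start time.
    snap_ms and min_gap_ms are illustrative thresholds, not a standard.
    Returns a list of (line_index, message) tuples.
    """
    flags = []
    for i, (start, end) in enumerate(lines):
        if keyframes_ms:
            # Near miss of a keyframe: a possible scene shortfall or bleed.
            for label, t in (("start", start), ("end", end)):
                distance = abs(nearest_keyframe(keyframes_ms, t) - t)
                if 0 < distance <= snap_ms:
                    flags.append((i, f"{label} is {distance} ms from a keyframe"))
        if i + 1 < len(lines):
            # Small gap to the next line: the two lines may need joining.
            gap = lines[i + 1][0] - end
            if 0 < gap < min_gap_ms:
                flags.append((i, f"gap of {gap} ms before next line"))
    return flags
```

The sketch deliberately reports rather than fixes, in keeping with the QC's role: whether a flagged line should be snapped, extended, or left alone depends on the timer's conventions.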
Editing Errors
This is the largest category of checks, and includes
spelling, grammar, punctuation, and style. Using tools can help to automate
editing checks, but there is still a lot of staring and thinking that has to be
done.
Automated Tools
Aegisub has a built-in spelling checker, but it gets tripped
up by Japanese names and phrases, and of course by the romaji in songs, if
included.
A different approach is to use the spelling and grammar
checker in Microsoft Word.
- Export the script as a plain text file.
- Edit the text file to remove any songs and signs.
- Join any sentences that are split across multiple lines into a single line.
- Replace all line breaks (\N) with space, and then replace any double spaces with single space.
- Save the edited file.
- Load the edited text file into Word and press F7.
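Most of the steps above can be scripted. Here is a rough sketch that works directly on the .ass script text instead of a plain-text export; it assumes the usual ten-field "Dialogue:" event layout, and the style names "Signs" and "Songs" are hypothetical, so substitute whatever styles the script actually uses. It does not join sentences that are split across lines, so that step still has to be done by hand.

```python
import re

# Override blocks such as {\i1} or {\pos(10,20)}.
OVERRIDE_TAGS = re.compile(r"\{[^}]*\}")


def dialogue_text(ass_text, skip_styles=("Signs", "Songs")):
    """Return cleaned dialogue lines from the text of an .ass script.

    skip_styles holds hypothetical style names for signs and songs.
    """
    cleaned = []
    for row in ass_text.splitlines():
        if not row.startswith("Dialogue:"):
            continue
        # Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
        fields = row[len("Dialogue:"):].split(",", 9)
        if len(fields) < 10:
            continue
        style, text = fields[3].strip(), fields[9]
        if style in skip_styles:
            continue
        text = OVERRIDE_TAGS.sub("", text)
        # Replace hard and soft line breaks with a space.
        text = text.replace("\\N", " ").replace("\\n", " ")
        # Collapse runs of spaces into a single space.
        text = re.sub(r" {2,}", " ", text).strip()
        if text:
            cleaned.append(text)
    return cleaned
```

Writing the returned lines to a text file, one per line, gives a file that can be loaded into Word as in the last step.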
Word's checker is far from perfect. In particular, it gets
grumpy about incomplete expressions and messes up on some common clichés (for
example, it doesn't like "It's all my fault"). All alleged spelling
mistakes have to be looked at; when in doubt, check the word on Google. Nonetheless,
Word will find subtle mistakes that often get missed by the eye, like repeated
articles ("the the") and "its/it's" confusion.
Problems can arise with expressions that have multiple
acceptable spellings, like "goodbye." Any of the accepted variants is
fine, but they need to be used consistently. The same applies to "Um"
vs "Umm," "Hm" vs "Hmm," and "Geez" vs
"Jeez." Hyphenation can be tricky too. Some compound English words
are now simply joined (like "heartbreak"); others are not. Again,
when you have a concern, Google is your friend.
Finally, a non-US spell checker will flag spellings that
vary between US and UK usage, like "honor/honour." Most fansub groups
use US spelling and grammar.
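Consistency of variant spellings is easy to check mechanically. The sketch below counts each variant in a group and reports only the groups where more than one variant turns up. The variant list is illustrative (the "good-bye" spelling in particular is my assumption about what an alternative might look like), and the function name is hypothetical; extend the list from the editing guide for the show.

```python
import re
from collections import Counter

# Illustrative groups of interchangeable spellings; not an exhaustive list.
VARIANT_GROUPS = [
    ("goodbye", "good-bye"),
    ("um", "umm"),
    ("hm", "hmm"),
    ("geez", "jeez"),
    ("honor", "honour"),
]


def mixed_variants(lines, groups=VARIANT_GROUPS):
    """Return {first_variant: {variant: count}} for groups used inconsistently."""
    text = " ".join(lines).lower()
    report = {}
    for group in groups:
        counts = Counter()
        for variant in group:
            # Match the variant only as a whole word, not inside a longer
            # or hyphenated word ("um" must not match "umm" or "umbrella").
            pattern = r"(?<![\w-])" + re.escape(variant) + r"(?![\w-])"
            hits = len(re.findall(pattern, text))
            if hits:
                counts[variant] = hits
        if len(counts) > 1:
            report[group[0]] = dict(counts)
    return report
```

A group that appears in the result is not necessarily wrong, since a quotation or an on-screen sign may legitimately use the other spelling, but each one deserves a look.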
Grammar and Punctuation
English grammar and punctuation are very complicated, and
you need to know the rules of the road. My blog on editing describes some of
the trickier rules, but I stumble over new ones all the time. For example,
plurals of abbreviations are made by simply adding an "s", e.g. "The
ABCs of Love" rather than "The ABC's of Love." The most common
problems seem to be:
- Singular/plural agreement. Impersonal sentences are particularly troublesome.
- Commas after "Then" or "So," before "too," or in interjections beginning with "Oh." Both including the comma and omitting it are acceptable; including it is more formal, omitting it more conversational. Whatever choices are made, they need to be used consistently.
- Commas in compound sentences (and not in compound predicates). Compound sentences (two complete sentences joined by "and" or "or") must have a comma between them. Compound predicates (a sentence with one subject and two verbs, joined by "and" or "or") must not have a comma between the verbs. This rule is frequently violated in streaming scripts.
- Subjunctive conjugation. The English subjunctive is a swamp and can result in some quite peculiar sounding phrases, e.g. "If he be…" I generally prefer to ignore subjunctive conjugations, but if one is used, it needs to be right.
- Punctuation of quotations. US and UK usage differ here. In the US, a concluding comma or period is placed inside the closing quotation mark, while an exclamation point or question mark goes inside only if it belongs to the quoted material. In the UK, punctuation is generally placed outside the closing quotation mark unless it belongs to the quoted material.
- Overuse of ellipses. Don't get me started on this one.
Style
Style issues are really nebulous, and it's all too easy for
a QC to turn into a "back-seat editor" (which will really tick off the
editor, by the way). Still, there are
style issues that the QC should look at and potentially flag:
- Inconsistent use of contractions. Most anime dialog is conversational speech. In English, conversational speech uses contractions. Formal speech may be appropriate in some cases (for example, an elderly servant, a snooty ojou-sama), but the formal versus informal distinction needs to be consistent. Teenagers rarely speak formally, so their speech should use contractions.
- "Will" versus "Shall." This is a particular instance of formal versus informal speech. The word "shall" rarely appears in US English conversation; its use is largely reserved for legal documents ("Congress shall make no law…"). The most common violation is "Shall we go?" In conversational speech, a person would say "Let's go, okay?" or "Should we go now?"
- Impersonals. Translations from Japanese are often full of impersonal phrases: "It seems…" or "There are…" Overuse makes the dialog stilted.
- Repeated words. If the same word appears in successive lines, it can be very jarring, unless the repetition is intended as reinforcement or is a quotation. "Just" gets thrown in way too often.
The list of potential style problems is endless; see my blog
on editing for a more comprehensive discussion.
Typesetting Errors
Typesetting must be inspected visually. To do that
correctly, all fonts used in the episode must be installed prior to running
Aegisub. Common problems include:
- Styling errors. If the script uses different styles for dialog versus thought, or present time versus flashbacks, each line must be checked for use of the correct style. Application of a "thought" style can be tricky if the character involved is not on-screen or is turned away from the viewer.
- 3-liners. If a line is too long, it may occupy three lines instead of two. Alternatively, a 3-liner may be created if a two-line sub overlaps with another line. (Make sure you've installed the dialog font before flagging these kinds of errors.)
- Italics errors in fonts without true italics. If a font lacks true italics, the subtitle renderer creates pseudo-italics by leaning the font to the right. This causes crowding between an italicized word and a subsequent non-italicized word. The typesetter must provide padding (e.g. {\i1}word{\i0\fscx130} {\fscx}word).
- Crowding in fonts with true italics. Even with true italics, an italicized word that abuts an exclamation mark or question mark may look crowded. The typesetter must provide padding (e.g. {\i1}word{\i0\fscx30} {\fscx}!).
- Sign/dialog overlap. Signs may occur in any part of the screen and can overlap the dialog. If the dialog is not assigned to a higher layer than the sign, the dialog will be "under" the sign. The dialog may need to be moved to the top of the screen in order not to conflict with the sign.
- Incorrect start or end time. Every sign needs to be inspected for correct start and end time.
- Missing signs. Sometimes, signs that seem germane may not be typeset. This may be a deliberate decision on the part of the translator or typesetter, or it may be inadvertent.
Encoding Errors
As part of SQC, the QC must actually watch the episode from
end to end in order to check for mistakes in the video and audio. It's all too
easy to skip from line to line, but in that case, errors between lines will be
missed.
The SQC Report
The QC provides a written report of suggested changes back
to the team. Comments can be sorted by translation, timing, editing,
typesetting; if not sorted, then the comment needs to indicate who in the team
needs to look at the issue:
TL: line (with time references; simply cut and paste from the script)
issue
For editing comments, where the QC has a suggestion to make,
the comment can be:
Edit: line
suggested new line
issue
Now, if there are a lot of changes, generating a
comprehensive report may be really tedious. One shortcut I use is to "fix"
a script as I go, save it under a new name, and then generate a
"differences" report using Linux diff or Windows WinMerge. This
differences report includes the old and new lines, with time references. It's
then very easy to annotate each change with a rationale or a description of the
underlying issue.
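For anyone without diff or WinMerge handy, a comparable differences report can be approximated with Python's standard difflib module. This is only a sketch of the idea, not the tool I actually use; the function name and the OLD/NEW/WHY layout are made up for illustration. It compares only the "Dialogue:" lines, which already carry their time references.

```python
import difflib


def differences_report(original_text, fixed_text):
    """Build an annotatable report of changed Dialogue lines.

    original_text and fixed_text are the full texts of the two .ass scripts.
    """
    old = [row for row in original_text.splitlines() if row.startswith("Dialogue:")]
    new = [row for row in fixed_text.splitlines() if row.startswith("Dialogue:")]
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    report = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        # Removed or changed lines from the original script.
        report.extend("OLD: " + row for row in old[i1:i2])
        # Added or changed lines from the "fixed" script.
        report.extend("NEW: " + row for row in new[j1:j2])
        # Blank slot for the rationale, to be filled in by hand.
        report.append("WHY: ")
    return "\n".join(report)
```

Each "WHY:" slot is where the rationale or the description of the underlying issue goes before the report is sent back to the team.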
Release QC (RQC)
RQC differs from SQC in two significant ways. First, it
is done on a finished file, rather than by using Aegisub. Second, it flags only grievous errors, such as missing fonts, bad chapters, and so on. I've
already described my release checking process in this blog entry, so I won't
repeat the detailed checklist. For comprehensive checking, you will need the
final script as well as the finished episode. Here are a few of the more
critical steps in RQC:
- Load the final script into Aegisub and use the "Font Collector" feature to compile a list of required fonts. Check for errors (missing fonts, missing glyphs in fonts). Note that lack of italics or a bold font variant is not a fatal problem; the subtitle renderer compensates.
- Use Linux diff or Windows WinMerge to compare the initial and final scripts. Check that all changes were done correctly, e.g. with proper spelling.
- Use "mkvmerge -i" to get a list of fonts attached to the file. Make sure that every font has the correct MIME type (x-truetype-font). Check that all fonts are included. Check that the chapter file is included (if the episode is chaptered).
- Spot check the episode. Check that the correct script was muxed in. Check that tracks are properly labeled. If chapters are included, check that the chapter timing points are correct.
- Play the episode from end to end. Pay particular attention to songs and signs, and look for any encoding problems. If there are multiple scripts (for example, honorifics and no honorifics), you will have to watch the episode twice (gag).
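The attachment and chapter checks on the "mkvmerge -i" listing can also be scripted. The sketch below works on the text that "mkvmerge -i" prints (captured, for example, with Python's subprocess module). The exact wording of the "Attachment ID" and "Chapters:" rows is my assumption about that output and may differ between mkvmerge versions, so treat the regular expression as something to adjust. The function name is hypothetical. It does not verify that every required font is present; that still means comparing against the Font Collector list.

```python
import re

# Assumed shape of an attachment row in the "mkvmerge -i" listing.
ATTACHMENT_ROW = re.compile(
    r"^Attachment ID (\d+): type '([^']*)', size (\d+) bytes, file name '(.*)'$"
)


def check_identify_output(output, font_mime_suffix="x-truetype-font",
                          expect_chapters=True):
    """Return a list of problems found in the text printed by 'mkvmerge -i'."""
    attachments = []
    has_chapters = False
    for row in output.splitlines():
        row = row.strip()
        match = ATTACHMENT_ROW.match(row)
        if match:
            # Keep (file name, MIME type) for each attachment.
            attachments.append((match.group(4), match.group(2)))
        elif row.startswith("Chapters:"):
            has_chapters = True
    problems = []
    if not attachments:
        problems.append("no attachments found (fonts missing?)")
    for name, mime in attachments:
        if not mime.endswith(font_mime_suffix):
            problems.append(f"{name}: unexpected MIME type {mime}")
    if expect_chapters and not has_chapters:
        problems.append("no chapters found")
    return problems
```

Pass expect_chapters=False for episodes that are not chaptered. An empty list means only that nothing obvious is wrong; the end-to-end watch is still required.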
Life after QC
When I was a QC, I couldn't wait to "graduate" to
a more creative position, editing in particular. Over time, I've added a
limited ability to typeset and time to my skill set. However, I still do QC, particularly
for other teams. I find that QC is a great way to avoid getting "boxed
in" by my own habits. I get to see
how other editors and typesetters work, and I always learn from them. It also
builds up goodwill, which I can draw on if I run into thorny issues in Orphan.
So whether you want to do QC forever or view it as an entry
point into fansubbing, give it a try. The mistakes are all there, waiting to be
found.