Systematic validation of localization across all languages By Martin Ørsted, Microsoft Ireland For the LRC XIII Conference October 2008


Systematic validation of localization across all languages

By Martin Ørsted, Microsoft Ireland

For the LRC XIII Conference

October 2008


Content

• The upstream effort
• Downstream bullet-proofing
• The single resource approach
• Generic rules across a group of resources
• Adding the languages
• Conclusion
• Questions?


The upstream effort

• Nothing beats fixing at dev time
– Use of newer programming languages with more built-in error checking
– Use of pseudo localization upstream
– Educating developers
– The use of controlled English
– Source reuse systems


The upstream effort

• The upstream effort won't be perfect, due to:
– Deadlines
– Tradeoffs
– The inadequacy of the development languages
– Certain issues are difficult to bullet-proof (law of diminishing returns)
– Choose your own favourite


Downstream bullet-proofing

• Downstream bullet-proofing addresses shortcomings of upstream bullet-proofing

• But it also adds further benefits

• The more languages there are, the more it makes sense to invest

• What benefit can we realize from doing many languages?


Single resource issues

• Over-localization: The string should not have been translated.
• Buffer limitation: The translation must not exceed a given number of characters; generally referred to as a string length limitation.
• Illegal characters: Certain characters may not be allowed in the string.
• Dependency: Two resources may have to be translated identically; in effect, one resource depends on (references) the other.
• Backward compatibility: A special case of dependency. Changing a string from one version to the next could break backward compatibility.
• Uniqueness: The string belongs to a group of strings that must all have unique names (translations), for example a list of commands.
• Placeholder over-localized: Some localizable strings contain placeholders. If a placeholder gets localized, the program cannot drop the information into it and display it.
• Needed string decoration: Some strings have control characters at the beginning or end that should not be localized.


Examples of single resource issues

• Rule: Over-localization
– US string: Common Files
– Issue description: Might refer to a registry string. Rather than localizing the string, the program will look up the localized name in the registry.
• Rule: Placeholder
– US string: The file %1 could not be opened because %2
– Issue description: %1 and %2 are placeholders.
• Rule: Decoration
– US string: \n\nOpen\n\n
– Issue description: \n is a new-line character, sometimes used in DOS-style applications.
• Rule: Placeholder
– US string: The file %s was last opened on %d %d
– Example loc: On %d%d the file %s was last opened
– Issue description: %s and %d are positional placeholders whose order has to be maintained. Changing it as shown will cause an intermittent memory protection fault.
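The two placeholder examples could be checked automatically along the following lines. This is only a sketch, not the tooling described in the talk: the function name and the set of printf conversion letters it recognizes are assumptions. Numbered placeholders (%1, %2) are allowed to move, while printf-style ones (%s, %d) must keep their order.

```python
import re

# Numbered placeholders (%1, %2, ...) may be reordered, but the same ones must be present.
NUMBERED = re.compile(r"%\d+")
# printf-style placeholders are consumed in order, so the sequence must match exactly.
# Assumption: only these simple conversion letters occur (no widths or flags).
PRINTF = re.compile(r"%[sdiuxXcf]")


def placeholder_issues(source: str, localized: str) -> list[str]:
    """Return a list of placeholder problems found in the localized string."""
    issues = []
    if sorted(NUMBERED.findall(source)) != sorted(NUMBERED.findall(localized)):
        issues.append("numbered placeholders differ from source")
    if PRINTF.findall(source) != PRINTF.findall(localized):
        issues.append("printf-style placeholders missing or reordered")
    return issues


if __name__ == "__main__":
    print(placeholder_issues(
        "The file %s was last opened on %d %d",
        "On %d%d the file %s was last opened",
    ))
```

Run on the last example above, the check reports the reordered %s/%d sequence.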


The LocVer rule

• A made-up example:
• String in Excel: Current Accounts
• A localized string causes a bug; we realize that the translation has to be 30 characters or less
• We create a rule: MaxLength=30
• We apply the rule to all languages
• If other languages break the rule, we will know
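A minimal sketch of the MaxLength=30 idea, applied once and checked against every language. The class name, resource ID, language codes and translations are hypothetical; the talk does not show LocVer's actual rule syntax or storage. Length is counted in characters here, which is an assumption (a real buffer limit may be in bytes).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaxLengthRule:
    resource_id: str  # resource the rule is attached to
    max_length: int   # maximum number of characters allowed


def violations(rule: MaxLengthRule, translations: dict[str, str]) -> list[str]:
    """Return the sorted language codes whose translation exceeds the limit."""
    return sorted(
        lang for lang, text in translations.items() if len(text) > rule.max_length
    )


# Illustrative data only: one rule, checked against every language at once.
rule = MaxLengthRule(resource_id="IDS_CURRENT_ACCOUNTS", max_length=30)
translations = {
    "en-US": "Current Accounts",
    "de-DE": "Girokonten",
    "xx-XX": "x" * 31,  # stand-in for a translation that is too long
}

if __name__ == "__main__":
    print(violations(rule, translations))
```

The rule is written once, when the first language breaks; every other language is then covered without further test effort.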


The approach


Benefit and cost

• + Find once, fix everywhere
• + Enables reduced testing; no need for regression against other languages
• - Management overhead: review new strings, edit rules for changed strings
• - Only viable with a good few languages
• - Manual effort: either inspect strings as they are added or add rules as bugs occur


Last words on single resource

• Very valuable approach
• But least preferred due to overhead
• Much used


Verification across a group of resources


Groups of resources

• Look for patterns, for example:
– Placeholders: %1, %2, %3
– Commands, which might be identifiable by resource name
• Apply a generic rule to them
– The rule will automatically cover new resources that match the pattern, and will automatically adjust if a resource changes
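As an illustration of a generic rule, the sketch below selects command resources by a resource-name pattern and enforces uniqueness of their translations within one language. The "IDS_CMD_" prefix is a hypothetical naming convention, and treating translations that differ only in letter case as duplicates is an assumption.

```python
import re
from collections import defaultdict

# Hypothetical naming convention: command resources share the "IDS_CMD_" prefix.
COMMAND_NAME = re.compile(r"^IDS_CMD_")


def duplicate_commands(resources: dict[str, str]) -> dict[str, list[str]]:
    """Map each translation shared by several command resources to their IDs.

    `resources` maps resource ID -> localized string for a single language.
    """
    by_text = defaultdict(list)
    for res_id, text in resources.items():
        if COMMAND_NAME.match(res_id):  # the pattern selects the group
            by_text[text.casefold()].append(res_id)
    return {text: sorted(ids) for text, ids in by_text.items() if len(ids) > 1}


if __name__ == "__main__":
    resources = {
        "IDS_CMD_OPEN": "Open",
        "IDS_CMD_LOAD": "open",   # clashes with IDS_CMD_OPEN
        "IDS_CMD_SAVE": "Save",
        "IDS_TITLE": "Open",      # not a command, so not part of the group
    }
    print(duplicate_commands(resources))
```

A command resource added later is picked up by the pattern with no rule maintenance, which is the advantage over per-resource rules.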


Groups of resources

• Positive
– Less management overhead
– Automatically adjusts to changes
– Can become quite advanced
• Limits
– Only works if you can identify a pattern
– But much preferred in those cases
– Fallback is individual resource rules


SQL queries across a pool of resources

• Same way LocVer fixes functional issues (almost)
• Query for things like:
– US contains 2007, localized doesn't
– US contains Microsoft, localized doesn't
– Localized contains Xdocs (which was the code name for the first version of InfoPath until late in development)
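The three example queries might look roughly like this against a pool of strings. The table layout, column names and sample rows are assumptions made for the sketch (the talk does not describe the actual database), and an in-memory SQLite database stands in for whatever store is really used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per resource per language.
conn.execute(
    "CREATE TABLE strings (resource_id TEXT, lang TEXT, source TEXT, localized TEXT)"
)
conn.executemany(
    "INSERT INTO strings VALUES (?, ?, ?, ?)",
    [  # illustrative rows, not real product strings
        ("IDS_1", "da-DK", "Microsoft Office 2007", "Microsoft Office 2007"),
        ("IDS_1", "fr-FR", "Microsoft Office 2007", "Office"),
        ("IDS_2", "da-DK", "Design a form", "Design a form in Xdocs"),
    ],
)


def missing_in_localized(term: str) -> list[tuple[str, str]]:
    """Rows where the US string contains `term` but the localized string does not."""
    return conn.execute(
        "SELECT resource_id, lang FROM strings "
        "WHERE instr(source, ?) > 0 AND instr(localized, ?) = 0 "
        "ORDER BY resource_id, lang",
        (term, term),
    ).fetchall()


def present_in_localized(term: str) -> list[tuple[str, str]]:
    """Rows where the localized string contains `term` (e.g. a retired code name)."""
    return conn.execute(
        "SELECT resource_id, lang FROM strings WHERE instr(localized, ?) > 0 "
        "ORDER BY resource_id, lang",
        (term,),
    ).fetchall()


if __name__ == "__main__":
    print(missing_in_localized("2007"))
    print(missing_in_localized("Microsoft"))
    print(present_in_localized("Xdocs"))
```

Hits from queries like these are candidates for review, not confirmed bugs; a product name can be legitimately transliterated or omitted in some languages.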


SQL queries

• Queries run once or twice
• Loads of false positives
• But may be worthwhile to review
• Gets smarter with the added language dimension


Adding the languages

• The more languages added, the more intelligence can be applied

• Idea: Break the linear cost dependency between the number of languages and engineering and test costs

• Several possibilities


Patterns across languages

• With 10 languages or more, you can look for patterns per resource like:
– If 9 out of 10 languages start with \n, should #10 also?
– If 9 out of 10 languages contain “Microsoft”, should #10 also?
– If 9 out of 10 languages localize two resources the same, should #10 also?
– So both linguistic and functional issues will be caught
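The "9 out of 10" question can be phrased as an outlier check per resource. The sketch below is one possible reading, with a hypothetical function name and a default threshold of 9/10 taken from the examples above; it flags only the languages that lack a property nearly all others share.

```python
from fractions import Fraction
from typing import Callable


def outliers(
    translations: dict[str, str],
    has_property: Callable[[str], bool],
    threshold: Fraction = Fraction(9, 10),
) -> list[str]:
    """Return languages lacking a property that at least `threshold` of all
    languages share for this resource. Returns [] if there is no clear majority."""
    total = len(translations)
    if total == 0:
        return []
    missing = sorted(
        lang for lang, text in translations.items() if not has_property(text)
    )
    matching = total - len(missing)
    if missing and Fraction(matching, total) >= threshold:
        return missing
    return []


if __name__ == "__main__":
    # Illustrative data: ten made-up language codes for one resource.
    langs = [f"lang{i:02d}" for i in range(10)]
    strings = {lang: "\nOpen" for lang in langs}
    strings["lang07"] = "Open"  # the one language that dropped the leading \n
    print(outliers(strings, lambda s: s.startswith("\n")))
```

The same function covers the other two examples by swapping the predicate, for instance `lambda s: "Microsoft" in s`.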


Benefits across languages

• Examples
– DAL, Dynamic Auto Layout
– Hotkey fixer, a way of programmatically assigning hotkeys per language
– Grouping languages by code page, and testing only one per group
– Make pseudo loc understand LocVer, and test on pseudo, reduce test on the languages
– Controlled English becomes viable
– Transliteration, MT
– Test case versus test design specifications, the introduction of randomness


The end result


Conclusion

• The linear dependency between cost of test + engineering versus number of languages can be broken

• At the same time the quality can be systematically improved

• The trick is to design solutions where the work effort and hence cost does not linearly grow with added languages


Conclusion continued

• DAL, SQL Queries, Generic rules, Single rules, Hotkey fixer all scale to extra languages with no extra effort

• But they come with various degrees of overhead

• Learnings across languages can introduce further efficiencies


Questions?

• Thank you for your time!
