Internationalization and Localization in iOS
iOS, and Cocoa in general, is well known for providing a wide range of internationalization support almost for free. Let's take a look at these technologies.
| API Name | Description |
|---|---|
| NSLocale | Properties related to the current region, including formats |
| NSNumberFormatter | Formats and parses numbers |
| NSDateFormatter | Formats and parses dates and times |
| NSCalendar | Support for various types of calendars and date operations |
| NSTimeZone | Current time zone and time-related operations |
| NSString | Localization support, sorting for various writing systems, UTF-8 |
Locales
Locales are the cornerstone of everything related to internationalization. A locale is, fundamentally, an encapsulation of regional formatting standards. Locales are specific to a language and, most of the time, to a region. For instance, the Brazilian Portuguese locale is not the same as the locale for the Portuguese of Portugal. We access locales through the NSLocale API, and there are two ways to obtain one. On the one hand, we can get a dynamic locale object, which tracks changes to the user's settings, through:
+ autoupdatingCurrentLocale
Or we can query static objects through:
+ currentLocale
+ systemLocale
- initWithLocaleIdentifier:
The currentLocale reflects the user's settings, while the system locale is a generic locale used to parse and format expressions that should not depend on those settings. Sometimes it is useful to create a locale for a given region in order to, let's say, normalize a set of dates. We can do that with initWithLocaleIdentifier:.
Once we have the locale, we can start querying it for a vast array of settings with - objectForKey:
| Component Key | Description |
|---|---|
| NSLocaleIdentifier | Key for the locale identifier: es_ES_PREEURO |
| NSLocaleLanguageCode | Key for the language: en |
| NSLocaleCountryCode | Country Code: AR |
| NSLocaleScriptCode | Locale script code |
| NSLocaleVariantCode | Variant for the locale: PREEURO |
| NSLocaleExemplarCharacterSet | Corresponding character set |
| NSLocaleCalendar | Calendar associated with the locale |
| NSLocaleCollationIdentifier | Collation system associated with the locale |
| NSLocaleUsesMetricSystem | Is the metric system supported in this locale? |
| NSLocaleMeasurementSystem | Measurement System: Metric / U.S. |
| NSLocaleDecimalSeparator | String with the decimal separator: . |
| NSLocaleGroupingSeparator | Numeric grouping separator: . |
| NSLocaleCurrencySymbol | Currency symbol: $ |
| NSLocaleCurrencyCode | Currency Code: USD |
| NSLocaleCollatorIdentifier | Collator identifier or nil if unknown |
| NSLocaleQuotationBeginDelimiterKey | Opening quotation mark |
| NSLocaleQuotationEndDelimiterKey | Closing quotation mark |
| NSLocaleAlternateQuotationBeginDelimiterKey | Alternate opening quotation mark |
| NSLocaleAlternateQuotationEndDelimiterKey | Alternate closing quotation mark |
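As a minimal sketch of how these keys can be queried, the helper below builds a one-line summary of a locale. DescribeLocale is a hypothetical name chosen for illustration, not part of Foundation, and the code assumes ARC.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of Foundation): summarizes a few
// settings of the given locale using the component keys.
static NSString *DescribeLocale(NSLocale *locale) {
    NSString *language = [locale objectForKey:NSLocaleLanguageCode];
    NSString *country  = [locale objectForKey:NSLocaleCountryCode];
    NSString *decimal  = [locale objectForKey:NSLocaleDecimalSeparator];
    NSString *currency = [locale objectForKey:NSLocaleCurrencyCode];
    // NSLocaleUsesMetricSystem returns an NSNumber wrapping a BOOL.
    BOOL metric = [[locale objectForKey:NSLocaleUsesMetricSystem] boolValue];
    return [NSString stringWithFormat:
            @"language=%@ country=%@ decimal=%@ currency=%@ metric=%@",
            language, country, decimal, currency, metric ? @"YES" : @"NO"];
}
```

It could be called with a fixed locale, for example `DescribeLocale([[NSLocale alloc] initWithLocaleIdentifier:@"pt_BR"])`, or with `[NSLocale currentLocale]` to inspect the user's settings.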
It is important to note that NSLocaleLanguageCode might not be the same as the language chosen by the user for the UI. These are actually two different settings, independent of one another. If you need to get the UI language, use:
[[[NSBundle mainBundle] preferredLocalizations] objectAtIndex:0];
NSBundle is the source for all the localized resources, and, as you can see, it has methods to find out which localization is preferred.
Numbers
Numbers are really important for localization, because different regions display them in different forms. One aspect we tend to forget in the West is that not everyone uses Arabic numerals. The Arab world, for instance, uses the Eastern Arabic numerals. The separators are also different, even within the same language or region. For instance, in the English-speaking world the comma is used to group thousands, while in Spain it is used to separate decimals. In France the space is used to group thousands.
NSNumberFormatter handles all these issues for us. For instance, if the locale is set to ar_BH (Arabic, Bahrain), the number 1234.56 will be displayed as ١٢٣٥ (that is, 1235: the digits are Eastern Arabic and the value has been rounded to an integer). The rule here is to never use printf, scanf, or any similar function for numbers the user will see. Another pitfall is overriding the locale's format with a fixed pattern:
[numberFormatter setPositiveFormat:@"$#,##0.00"];
This line replaces the format appropriate for the locale with a constant pattern, hard-coding choices such as the currency symbol and the number of decimal places.
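A locale-aware alternative is to pick a number style and let the formatter derive the pattern. The following is a minimal sketch; LocalizedDecimalString is a hypothetical helper name, and ARC is assumed.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: formats a number with the decimal style of the
// given locale instead of a hard-coded pattern.
static NSString *LocalizedDecimalString(NSNumber *number, NSLocale *locale) {
    NSNumberFormatter *formatter = [[NSNumberFormatter alloc] init];
    [formatter setLocale:locale];
    [formatter setNumberStyle:NSNumberFormatterDecimalStyle];
    return [formatter stringFromNumber:number];
}
```

With the en_US locale, 1234.56 comes out as 1,234.56, while de_DE yields 1.234,56. In application code you would normally pass `[NSLocale currentLocale]` rather than a fixed identifier.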
Dates and Times
When we work with dates, we want to display them in a locale-specific manner. Cocoa provides NSDateFormatter for this task. There is a handy convenience method that takes the current locale and produces an appropriate date string:
+ (NSString *)localizedStringFromDate:(NSDate *)date
                            dateStyle:(NSDateFormatterStyle)dstyle
                            timeStyle:(NSDateFormatterStyle)tstyle;
It will render, for example, September 13, 2011 10:11:36 AM GMT-03:00 with the US locale, and 13 сентября 2011 г. 10:11:36 Аргентина стандартное время with the Russian locale.
You can also override the formats but, once again, caution is advised.
[NSDateFormatter dateFormatFromTemplate:@"dMMM"
options:0
locale:[NSLocale currentLocale]];
The template is a string containing the fields we want to use, written with the pattern syntax of Unicode Technical Standard #35. The method reorders and punctuates those fields as appropriate for the locale. The options are not yet publicly defined, so you must pass 0.
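The returned format string is meant to be handed to a formatter. Here is a minimal sketch, assuming ARC; DayMonthString is a hypothetical helper name.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: renders only the day and abbreviated month of a
// date, ordered the way the given locale expects.
static NSString *DayMonthString(NSDate *date, NSLocale *locale) {
    // Ask Foundation for the locale-appropriate arrangement of the fields.
    NSString *format = [NSDateFormatter dateFormatFromTemplate:@"dMMM"
                                                       options:0
                                                        locale:locale];
    NSDateFormatter *formatter = [[NSDateFormatter alloc] init];
    [formatter setLocale:locale];
    [formatter setDateFormat:format];
    return [formatter stringFromDate:date];
}
```

For the template dMMM, the en_US locale yields the format MMM d, while en_GB yields d MMM, so the same template produces the order each reader expects.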
NSDateFormatter can also be pinned to a specific time zone:
[dateFormatter setTimeZone:[NSTimeZone timeZoneForSecondsFromGMT:3600]];
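Putting the locale and the time zone together, a formatter can be configured explicitly when the output must not depend on the device settings. This is only a sketch under that assumption; LongDateString is a hypothetical helper name, and ARC is assumed.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: long-style date (no time) for an explicit locale
// and time zone, independent of the device configuration.
static NSString *LongDateString(NSDate *date, NSLocale *locale, NSTimeZone *timeZone) {
    NSDateFormatter *formatter = [[NSDateFormatter alloc] init];
    [formatter setLocale:locale];
    [formatter setTimeZone:timeZone];
    [formatter setDateStyle:NSDateFormatterLongStyle];
    [formatter setTimeStyle:NSDateFormatterNoStyle];
    return [formatter stringFromDate:date];
}
```

For text shown to the user, passing `[NSLocale currentLocale]` and `[NSTimeZone localTimeZone]` keeps the behavior of the convenience method while still leaving room to switch either value in tests.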
Calendars
Calendars encapsulate information about systems of reckoning time in which the beginning, length, and divisions of a year are defined. They provide information about the calendar and support for calendrical computations such as determining the range of a given calendrical unit and adding units to a given absolute time.
In Lion, the set of supported calendars is much broader, including some lunar calendars, such as the Arabic (Islamic) one. In iOS, the supported calendars are Gregorian-offset ones (Thai Buddhist, Japanese, etc.).
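The two kinds of calendrical computation mentioned above can be sketched as follows. The helper names are hypothetical, ARC is assumed, and the code uses the current constant names (NSCalendarUnitDay and NSCalendarIdentifierGregorian); older SDKs spell them NSDayCalendarUnit and NSGregorianCalendar.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: adds a number of days to a date using the rules
// of the given calendar rather than raw second arithmetic.
static NSDate *DateByAddingDays(NSDate *date, NSInteger days, NSCalendar *calendar) {
    NSDateComponents *offset = [[NSDateComponents alloc] init];
    [offset setDay:days];
    return [calendar dateByAddingComponents:offset toDate:date options:0];
}

// Hypothetical helper: number of days in the month that contains the date.
static NSUInteger DaysInMonth(NSDate *date, NSCalendar *calendar) {
    NSRange days = [calendar rangeOfUnit:NSCalendarUnitDay
                                  inUnit:NSCalendarUnitMonth
                                 forDate:date];
    return days.length;
}
```

A calendar can be obtained with `[NSCalendar currentCalendar]` for the user's preference, or created explicitly with `[[NSCalendar alloc] initWithCalendarIdentifier:NSCalendarIdentifierGregorian]`.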
Strings
Most string operations are very language and locale dependent. Thus Apple recommends using the NSString APIs to handle strings in a locale-sensitive way. For instance, sorting a set of English strings is not the same as sorting a set of Chinese strings. Diacritics may or may not be significant in sorting and comparison. Sometimes even the pronunciation of a string can change how you should sort it, as in Mandarin. This whole set of issues is handled for free through:
- (NSComparisonResult)localizedStandardCompare:(NSString *)string;
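Because the method has the standard comparator shape, it can be passed directly as a sort selector. A minimal sketch, with SortedForDisplay as a hypothetical helper name:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: sorts user-visible strings the way the system
// would list them for the current locale.
static NSArray *SortedForDisplay(NSArray *strings) {
    return [strings sortedArrayUsingSelector:@selector(localizedStandardCompare:)];
}
```

One visible effect is that digit runs are compared by value, so file2 sorts before file10. The exact ordering otherwise depends on the current locale, which is why this comparison suits display lists and not persistent keys.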
There is also an API that allows searching in a locale-aware fashion.
- (NSRange)rangeOfString:(NSString *)aString
options:(NSStringCompareOptions)mask
range:(NSRange)searchRange
locale:(NSLocale *)locale;
The rules for case-insensitive and diacritic-insensitive searching vary between languages. In Lion there is a new UI class named NSTextFinder, which is the recommended way to perform searches in text.
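A minimal sketch of a case- and diacritic-insensitive search built on rangeOfString:options:range:locale: follows. ContainsIgnoringCaseAndDiacritics is a hypothetical helper name.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: reports whether `text` contains `query`, ignoring
// case and diacritics according to the rules of the given locale.
static BOOL ContainsIgnoringCaseAndDiacritics(NSString *text,
                                              NSString *query,
                                              NSLocale *locale) {
    NSStringCompareOptions options =
        NSCaseInsensitiveSearch | NSDiacriticInsensitiveSearch;
    NSRange found = [text rangeOfString:query
                                options:options
                                  range:NSMakeRange(0, [text length])
                                 locale:locale];
    return found.location != NSNotFound;
}
```

With this helper, searching for "creme" in "Crème Brûlée" succeeds. Passing `[NSLocale currentLocale]` lets the search follow the user's language rules.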
Sometimes it is also necessary to split a text into tokens, such as words, sentences, paragraphs, or lines. You can do that with:
- (void)enumerateSubstringsInRange:(NSRange)range
options:(NSStringEnumerationOptions)opts
usingBlock:(void (^)(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop))block;
This takes into account the numerous ways in which different tokens are delimited. For instance, in Chinese there are no spaces between words. In Thai, spaces separate phrases. That means you can never rely on spaces as boundaries. When tokenizing characters, one of the major errors is to assume that each glyph corresponds to a single character, which is not always the case. For instance, É can be stored as one or two unichars: U+00C9 for the Latin E with acute, or U+0045 for the capital E followed by U+0301 to combine it with the acute.
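Both points can be sketched with the enumeration API: one helper collects words, the other counts user-perceived characters. The helper names are hypothetical and ARC is assumed.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: splits text into words using the system's
// word-boundary rules instead of looking for spaces.
static NSArray *WordsInString(NSString *text) {
    NSMutableArray *words = [NSMutableArray array];
    [text enumerateSubstringsInRange:NSMakeRange(0, [text length])
                             options:NSStringEnumerationByWords
                          usingBlock:^(NSString *substring, NSRange substringRange,
                                       NSRange enclosingRange, BOOL *stop) {
        [words addObject:substring];
    }];
    return words;
}

// Hypothetical helper: counts composed character sequences, so a base
// letter plus a combining mark counts once.
static NSUInteger ComposedCharacterCount(NSString *text) {
    __block NSUInteger count = 0;
    [text enumerateSubstringsInRange:NSMakeRange(0, [text length])
                             options:NSStringEnumerationByComposedCharacterSequences
                          usingBlock:^(NSString *substring, NSRange substringRange,
                                       NSRange enclosingRange, BOOL *stop) {
        count++;
    }];
    return count;
}
```

For the decomposed string @"E\u0301", `length` returns 2 while ComposedCharacterCount returns 1, which is the number a user would expect to see.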
Check GitHub for some code.