Recently we had an issue with incorrectly detecting the language of the user’s browser.
Basically, when you want to guess which language to display your content/UI in (a.k.a. content negotiation), you should always check the Accept-Language HTTP header instead of guessing the language from the country that a GeoIP database returns for the IP address. Where a user happens to be does not tell you which language they read, while the header carries the preferences set in their browser.
The value of the header can take different forms. It can contain just a simple language code like en, or a language code and a country code like en-GB. It can also contain a comma-separated mix of these, each optionally carrying a quality value (a weight) like en-US;q=0.8.
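To make those forms concrete, here is a minimal sketch of reading such a header. It is written in Python for brevity and is not what Symfony or any particular library does internally; the function name `parse_accept_language` is made up for this post, and treating an unreadable weight as 0 is an assumption of mine.

```python
def parse_accept_language(header: str) -> list[tuple[str, float]]:
    """Return (tag, weight) pairs sorted from most to least preferred."""
    entries = []
    for position, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        # Split "en-US;q=0.8" into the tag and its optional parameters.
        tag, _, params = piece.partition(";")
        tag = tag.strip()
        weight = 1.0  # a missing q parameter means full weight
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    weight = float(value)
                except ValueError:
                    weight = 0.0  # assumption: unreadable weight ranks last
        if tag:
            entries.append((tag, weight, position))
    # Highest weight first; ties keep the order the browser sent.
    entries.sort(key=lambda entry: (-entry[1], entry[2]))
    return [(tag, weight) for tag, weight, _ in entries]


if __name__ == "__main__":
    print(parse_accept_language("sk-SK,sk;q=0.9,en-US;q=0.8,en;q=0.7"))
```

The first tag in the returned list is the one the browser prefers most, which is usually the only one a simple detector looks at.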
Our simple algorithm for detecting the language didn’t take into account the form with a country code. Silly, I know. Shut up…
The browser was sending sk-SK, Symfony’s HTTP Foundation was passing sk_SK to our app, and the app expected sk.
It got me curious about which form is proper (sk-sk, sk-SK or sk_SK) and what else I can expect. So I started digging more into this whole Accept-Language thing.
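The mismatch is easy to reproduce. The sketch below, in Python for brevity, only approximates the kind of normalization described here and is not Symfony's actual code. `normalize_tag`, `primary_language` and the `SUPPORTED` set are hypothetical names, and script subtags such as the one in zh-Hant-TW are deliberately not handled.

```python
SUPPORTED = {"sk", "en"}  # hypothetical list of languages the app knows


def normalize_tag(tag: str) -> str:
    """Turn 'sk-sk' or 'sk-SK' into 'sk_SK'; leave 'sk' as 'sk'."""
    parts = tag.replace("_", "-").split("-")
    language = parts[0].lower()
    rest = [part.upper() for part in parts[1:]]
    return "_".join([language] + rest)


def primary_language(tag: str) -> str:
    """Return only the language part, e.g. 'sk' for 'sk_SK'."""
    return normalize_tag(tag).split("_")[0]


if __name__ == "__main__":
    received = normalize_tag("sk-SK")
    print(received in SUPPORTED)                    # the naive check
    print(primary_language(received) in SUPPORTED)  # the check on the language part
```

Comparing the whole normalized tag against a list of bare language codes fails for Slovak users, while comparing only the language part succeeds.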
Things I learned or refreshed my knowledge of
- The language codes in the header are also called language tags. Useful to know as a keyword to put into Google.
- Symfony’s Request::getLanguages() method always normalizes language tags from the sk-SK form to the sk_SK form. The reasoning for using underscores instead of dashes is in this GitHub issue.
- What Is Correct Locale Tag? en_US vs. en-US
- Language tags and codes on Wikipedia
- You can find out what language tags browsers support by digging in their source code. For example, Firefox supports these and Chrome these.
- Firefox sends sk and Safari sends sk-SK. Other browsers may send similar forms.
- PHP programmers should know about PHP’s Intl extension.
- Most projects and software use ICU data for internationalization/localization stuff.
- The list of locales that GlotPress supports
- IANA Language Subtag Registry
- If you need more power with various content negotiations, look at the Negotiation PHP library.
The more you know…
The fix for our issue
The fix for our issue was simple to make. You can expect Symfony to give you the language tag in either code or code_CODE form. We just take that and map it to our own (custom, people/URL friendly) locale code.
For example: cs_CZ -> cz, es -> es, es_VE -> es, en_US -> en. And when we need to add a specific variation of some language, let’s say British English, we will just add en_GB -> en-gb to our mapping.
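A mapping like that could be sketched as follows. This is Python for brevity rather than our actual PHP code, and the same idea ports directly to a PHP array lookup. The names `LANGUAGE_MAP`, `VARIANT_MAP`, `DEFAULT_LOCALE` and `to_app_locale` are made up for illustration, and falling back to en for unknown languages is an assumption.

```python
# Language part of the tag -> our own locale code.
LANGUAGE_MAP = {"cs": "cz", "es": "es", "en": "en"}

# Full tags that need their own locale; empty until a variation is added,
# e.g. {"en_GB": "en-gb"} for British English.
VARIANT_MAP = {}

DEFAULT_LOCALE = "en"  # assumption: what to use for languages we do not map


def to_app_locale(tag, variants=VARIANT_MAP, languages=LANGUAGE_MAP,
                  default=DEFAULT_LOCALE):
    """Map a tag in code or code_CODE form to the app's locale code."""
    # A specific variation wins over the generic language entry.
    if tag in variants:
        return variants[tag]
    language = tag.split("_")[0]
    return languages.get(language, default)


if __name__ == "__main__":
    for tag in ["cs_CZ", "es", "es_VE", "en_US"]:
        print(tag, "->", to_app_locale(tag))
    print("en_GB ->", to_app_locale("en_GB", variants={"en_GB": "en-gb"}))
```

Checking the variation table first means a new entry such as en_GB can be added later without touching the generic en mapping.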