The trouble with Chromium translations

Most applications intended for a broad international audience have their UI translated into several languages. The number varies widely, depending on the resources of the vendor, especially their ability to recruit translators.

Vivaldi is currently being translated to 91 languages, a few more than Chrome.

Under the hood, Vivaldi’s UI string translation system (“string” is the term used in computer programming for a section of text) actually consists of two independent systems: the Chromium one, and the one used by Vivaldi’s native UI.

This article will only cover the Chromium system and the challenges of using it.

The Chromium string/text translation and resource system consists of two kinds of files:

  • The GRD files, which can declare the US English version of the strings, and the location of various file resources like icons and documents (HTML, JS, CSS) used by the Chromium UI, and
  • The XTB files, one for each language, which contain the various translations of the original strings in the associated GRD file.
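To make the two file types a bit more concrete, here is a minimal sketch that reads a tiny GRD-like and XTB-like sample. The samples are made up and heavily simplified (real files have many more attributes and sections, and messages can contain placeholder elements), and the function names and the IDS_EXAMPLE_WELCOME name are hypothetical, not part of the Chromium tooling.

```python
import xml.etree.ElementTree as ET

# Made-up, simplified samples; not copied from any real Chromium file.
GRD_SAMPLE = """<grit latest_public_release="0" current_release="1">
  <release seq="1">
    <messages>
      <message name="IDS_EXAMPLE_WELCOME" desc="Greeting shown on first run">
        Welcome to Chromium
      </message>
    </messages>
  </release>
</grit>"""

XTB_SAMPLE = """<translationbundle lang="nb">
  <translation id="1234567890">Velkommen til Chromium</translation>
</translationbundle>"""


def read_grd_messages(grd_text):
    """Map each message name to its US English text (whitespace collapsed)."""
    root = ET.fromstring(grd_text)
    return {m.get("name"): " ".join(m.text.split()) for m in root.iter("message")}


def read_xtb_translations(xtb_text):
    """Return the language code and a map from numeric ID to translated text."""
    root = ET.fromstring(xtb_text)
    return root.get("lang"), {t.get("id"): t.text for t in root.iter("translation")}
```

Note that the GRD entry is keyed by a name, while the XTB entry is keyed only by a numeric identifier; that difference matters for the second reason discussed further down.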

When building the product (Vivaldi, in our case) these files are processed by various scripts in the build system and converted into files that can be handled by the Chromium code handling strings and the resources.

One of the challenges for a product like Vivaldi is that the strings and resources defined by the Chromium project are very specific to Chromium and Google Chrome, such as the logos and the company and product names used in the strings.

Of course, we in Vivaldi want to use the Vivaldi logo, and to use “Vivaldi” as the name of the product and company. Duh!

That means that we have to change the resource definitions and the strings (and translations) to use Vivaldi’s preferred resources and names.

However, if you’ve read my article about maintaining a Chromium fork, you will have noticed that I said that you should never modify the Chromium translation files. Yet, I just said above that to use our preferred resources and strings we have to change the files. Why should we not change the files, and how do we work around the problem to still be able to use our chosen resources?

There are two reasons why we should not change the files:

First, the Chromium resource files are frequently updated by the Chromium team. New resources and new strings are added, old ones are changed to improve their meaning, and occasionally some are removed. All of these changes mean that when upgrading the Chromium source code, there is a significant risk that they will affect the specific lines of the file we modified, or lines close to them. That means that we would have to resolve the conflicts between the new text and our changes, which would significantly increase the time needed to complete the update.


Second, for the strings we would not just have to modify the GRD file entry; we would also have to modify the corresponding entry in each of the 80+ translation XTB files associated with each GRD file. To top it off, each of those entries has a numeric identifier calculated from the original string in the GRD file, so if you change the original string, you have to recalculate the value and update each XTB file for that entry. Ouch! Lots of work. Additionally, each of those updated entries in each file is another possible merge conflict during an update that has to be resolved manually. Double ouch!
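The effect of an identifier that is derived from the source text can be illustrated with a small sketch. The hash below is a stand-in chosen for illustration only; I am not claiming it matches the exact calculation done by the Chromium build tools, and the function names are hypothetical.

```python
import hashlib


def example_message_id(source_text):
    """Hypothetical stand-in: derive a numeric ID from the English source text."""
    digest = hashlib.md5(source_text.encode("utf-8")).hexdigest()
    # Keep 63 bits of the digest so the ID is a non-negative integer.
    return str(int(digest[:16], 16) & 0x7FFFFFFFFFFFFFFF)


def lookup_translation(xtb_entries, source_text):
    """Find a translation by the ID computed from the current source text."""
    return xtb_entries.get(example_message_id(source_text))


original = "Welcome to Chromium"
edited = "Welcome to Vivaldi"

# One language's translations, keyed by the ID of the ORIGINAL string.
norwegian = {example_message_id(original): "Velkommen til Chromium"}
```

With this setup, lookup_translation(norwegian, original) finds the translation, while lookup_translation(norwegian, edited) returns None: editing the source string changes its ID, so every language's entry would have to be re-keyed as well as re-worded.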

So, how do we resolve this problem? How do we update the resources, strings, and translations without modifying the Chromium resource files? The answer is that we do change them, but we don’t change them.

What we have done in Vivaldi is to create our own resource GRD and XTB files for each set of Chromium resource files that we want to update, and add our file resources, strings, and translations in these files. The translation files are usually used to add the translations for the extra languages we support, but in some cases we do an extensive rewrite of the original string, which requires more translations to be added in our version.

Then we have updated the project and the scripts it uses so that, while building the application, our changes are automatically inserted into the data before it is used to generate the binary files used by the application.

The result is that we don’t have to update the original files, but we can update the resources, strings, and translations.
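The general idea of leaving the original files alone and layering our changes on top at build time can be sketched as follows. This is a simplified illustration using plain dictionaries, not our actual build scripts; the function names, the string names, and the "xx" language code are all made up.

```python
def merge_strings(upstream, overrides):
    """Return the upstream entries with override entries replacing or extending them."""
    merged = dict(upstream)  # copy, so the upstream data itself is never modified
    merged.update(overrides)
    return merged


def merge_translations(upstream, overrides):
    """Merge per-language tables; languages found only in overrides are added."""
    merged = {lang: dict(table) for lang, table in upstream.items()}
    for lang, table in overrides.items():
        merged.setdefault(lang, {}).update(table)
    return merged


upstream_strings = {"IDS_EXAMPLE_WELCOME": "Welcome to Chromium",
                    "IDS_EXAMPLE_QUIT": "Quit"}
our_strings = {"IDS_EXAMPLE_WELCOME": "Welcome to Vivaldi"}

upstream_translations = {"nb": {"IDS_EXAMPLE_QUIT": "Avslutt"}}
our_translations = {"nb": {"IDS_EXAMPLE_WELCOME": "Velkommen til Vivaldi"},
                    "xx": {"IDS_EXAMPLE_QUIT": "(translation for an extra language)"}}

final_strings = merge_strings(upstream_strings, our_strings)
final_translations = merge_translations(upstream_translations, our_translations)
```

Because the merge produces new data and leaves the upstream tables untouched, a Chromium upgrade can replace the upstream files wholesale without ever conflicting with our additions.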

This process is also used to automatically replace mentions of Chromium and Google Chrome company and product names with Vivaldi’s name, both in the original US English strings and the translations. This process does have its challenges, especially since “Google” is frequently used in combination with other words to name products we don’t support, like “Google Pay”, so we have to exclude such replacements.
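A name replacement that skips compound product names like the one mentioned above might look something like this. It is only a sketch: the list of names, the exclusion list (containing just the “Google Pay” case from the text), and the function name are assumptions, and the real rules are more involved.

```python
import re

# Assumed lists for illustration; longer names come first so that
# "Google Chrome" is replaced as a whole rather than word by word.
NAMES_TO_REPLACE = ["Google Chrome", "Chromium", "Chrome", "Google"]
EXCLUDED_FOLLOWERS = ["Pay"]  # "Google Pay" must be left alone

NAME_PATTERN = re.compile(
    r"\b(?:" + "|".join(map(re.escape, NAMES_TO_REPLACE)) + r")\b"
    # Negative lookahead: do not match when an excluded word follows the name.
    r"(?! (?:" + "|".join(map(re.escape, EXCLUDED_FOLLOWERS)) + r")\b)"
)


def replace_names(text, replacement="Vivaldi"):
    """Replace upstream product names, except inside excluded compound names."""
    return NAME_PATTERN.sub(replacement, text)
```

For example, replace_names("Sign in to Google Chrome") gives "Sign in to Vivaldi", while "Google Pay" in a string is left untouched even when other names in the same string are replaced.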

Occasionally, there are strings that mention the Google, Chrome, or Chromium names where replacing them with Vivaldi is not desirable, and in these cases we exclude that particular string from being replaced. An example just showed up in the forums (https://forum.vivaldi.net/topic/77930/wtf-what-the-floc-google-s-still-at-it/2?_=1661690451983), where information about a system Google is working on said Vivaldi instead; that has now been “fixed”.

Another recent example was the string “Chrome is made possible by the Chromium open-source project”, which was automatically turned into “Vivaldi is made possible by the Vivaldi open-source project”, not “Vivaldi is made possible by the Chromium open-source project”. Oooops! That was fixed by adding a full override of the text with the correct wording.
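The two kinds of fixes described above, excluding a string from replacement and overriding it completely, suggest an order of precedence roughly like the one sketched here. The string names (IDS_EXAMPLE_...) and the table names are hypothetical, and the simple pattern stands in for the real replacement rules.

```python
import re

# Simplified stand-in for the automatic name replacement.
NAME_PATTERN = re.compile(r"\b(?:Google Chrome|Chromium|Chrome|Google)\b")

# Hypothetical tables: strings with a complete replacement text, and strings
# that must keep the upstream names as they are.
FULL_OVERRIDES = {
    "IDS_EXAMPLE_ABOUT_OPEN_SOURCE":
        "Vivaldi is made possible by the Chromium open-source project",
}
NO_REPLACE = {"IDS_EXAMPLE_FLOC_INFO"}


def final_text(name, upstream_text):
    """Pick the text to ship: full override first, then exclusion, then auto-replace."""
    if name in FULL_OVERRIDES:
        return FULL_OVERRIDES[name]
    if name in NO_REPLACE:
        return upstream_text
    return NAME_PATTERN.sub("Vivaldi", upstream_text)
```

Without the override entry, the automatic replacement would turn both “Chrome” and “Chromium” in that string into “Vivaldi”, which is exactly the funny case described above.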

Could we avoid using this kind of system? Well, it is not the only way to implement such a system.

One could add an independent set of resource files (and we have those for our own), add our replacements in those files using different identifiers, and replace the originals everywhere they are used. However, we would still have the problem of later updates, both to the strings and to their meaning, and of the original strings starting to be used elsewhere (which would have to be discovered and updated). Then there is the issue of more potential merge conflicts during updates.

Quite simply, using different identifiers would not work very well, since their use would have to be maintained continuously. Just replacing the original entries will generally work better.

And that ignores the use of product names in many strings. There are a lot of those names used around the code, and copying and modifying them into a different set of files would be a major undertaking, and would still have to be updated with new strings every Chromium upgrade.

The best way to avoid the search and replace of product names (and thus avoid the funny cases) would be for the Chromium team to stop hardcoding “Google”, “Google Chrome”, “Chromium” etc. into the strings, and instead use variables that can insert the downstream project’s own preferred name in those strings. This kind of project would be a major undertaking for the Chromium team, and I sort of doubt they would be willing to take it on.

What do the other Chromium-based browser teams do? I have absolutely no idea. Maybe they use a similar system, or they have found their own way to manage the issue.

I’m still a techie, not a nettie

Welcome to my new home on the Web.

After I left Opera a year ago, I considered moving my (old) home page to a new location, but did not find a good location to host it.

Moving to a new location became a bit more urgent when Opera announced their decision to shut down MyOpera in a couple of months.

Fortunately, my new old boss, Jon von Tetzchner, decided that with MyOpera shutting down, he would provide a new home for all the people made “homeless” by the shutdown. The new site, Vivaldi.net, went live as a beta version yesterday, and I have now started the migration here.

The small print: Opinions stated here are my own, and do not necessarily represent my employer’s views. Opinions are subject to change without notice, in particular when I find (or am pointed to) better information, unless I decide to be stubborn. Articles may contain spelling mitsakes, errors grammatical, or other mistakes; in such cases the correct meaning is what I meant to write, not what is in the text; when in doubt, ask.