Soooo … you say you want to maintain a Chromium fork?

The branches of a tree at Innovation House Magnolia

Photo by Ari Greve

(Note: this article assumes you have some familiarity with Git terminology, building Chromium, and related topics)

Building your own Chromium-based browser is a lot of work, unless you want to just ship the basic Chromium version without any changes.

If you are going to work on and release a Chromium-derived browser, on the technical side you will need a few things when you start with the serious work:

  • A Git source code repository for your changes
  • One or more developer machines, configured for each OS you want to release on
  • Test machines and devices to test your builds
  • Build machines for each platform. These should be connected to a system that automatically builds new test builds for each source update and for your work branches, as well as production (official) builds. These machines should be much more powerful than your developer machines. Official builds take several hours even on a powerful machine, and require a lot of memory and disk space. There are various cloud solutions available, but you should weigh time and (especially) cost carefully. Frankly, having your own on-premises build server rack may cost “a bit” up front, but it gives you better control of the system.
  • A web site where you can post your Official builds so that your users can download and install them

Now you are good to go, and you can start developing and releasing your browser.

Then … the Chromium team releases a new major version (which they do every 4 or 8 weeks, depending on the track) with lots of security fixes. Now your browser is buggy and insecure. How do you get your changes onto the new version?

This process can get very involved and messy, especially if you have a lot of patches on the Chromium code. These will frequently introduce merge conflicts when you update the source code to a newer Chromium version, because the upstream project has updated the code you patched, or code just nearby. There are a few things you can do to reduce the problems.

There are at least two major ways to maintain updates for a code base: A git branch, and diff patches to be applied on a clean checkout. Both have benefits and challenges, but both will have to be updated regularly to match the upstream code. The process described below is for a git branch.

The major rule is to put all (or as much as practical) of your additional independent code, meaning whole classes and functions (even extra functions in Chromium classes), in a separate repository module that has the Chromium code as a submodule. Vivaldi uses special extensions to the GN project language to update the relevant targets with the new files and dependencies.

Other rules for patches are:

  • Put all added includes/imports *after* the upstream include/import declarations.
  • Similarly, group all new functions and members in classes at the end of the section. Do the same for other declarations.
  • Any functions you have to add in a source file should always be put at the end of the file, or at the end of an internal namespace.
  • Generally, try to put an empty line above and below your patch.
  • Identify all of your patches’ start and end.
  • Don’t change indentation of unmodified original code lines, unless you have to (e.g. in Python files).
  • Repetitive patching of the same lines should be fixed up or squashed. Such repetitions have the potential to trigger multiple merge conflicts during the update, which could easily cause errors and bugs to be introduced.
  • NEVER (repeat: NEVER!!!) modify the Chromium string and translation files (GRD and XTB). You will be in for a world of hurt when strings change (and some tools can mess up these files under certain conditions). If you need to override strings, add the overrides via scripts, e.g. in the grit system, merging your own changes with the upstream ones (Vivaldi is using such a modified system; if there is enough interest from embedders we may upstream it; you can find the scripts in the Vivaldi source bundle if you want to investigate).
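The rule about identifying each patch's start and end can be checked mechanically. The following is a minimal sketch, not Vivaldi's actual tooling: the marker strings MYFORK_PATCH_BEGIN and MYFORK_PATCH_END are hypothetical, and the function only reports markers that are not properly paired in a file's text.

```python
# Hypothetical marker strings; use whatever convention your fork has chosen.
BEGIN_MARKER = "MYFORK_PATCH_BEGIN"
END_MARKER = "MYFORK_PATCH_END"


def find_unbalanced_markers(text):
    """Return a list of (line_number, problem) for unpaired patch markers."""
    problems = []
    open_line = None  # line number of the currently open BEGIN marker, if any
    for number, line in enumerate(text.splitlines(), start=1):
        if BEGIN_MARKER in line:
            if open_line is not None:
                problems.append((number, "BEGIN inside an open patch"))
            open_line = number
        elif END_MARKER in line:
            if open_line is None:
                problems.append((number, "END without a matching BEGIN"))
            open_line = None
    if open_line is not None:
        problems.append((open_line, "BEGIN without a matching END"))
    return problems
```

A check like this could run over every patched file after a rebase, since a marker lost while resolving a conflict is easy to miss by eye.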

Vivaldi uses (mostly) Git submodules to manage submodules, rather than the DEPS file system used by Chromium (some parts of Vivaldi’s upstream source code and tools are downloaded using this system, though). Our process for updating Chromium will work whichever system is used, with some modifications.

The first step of the process is identifying which upstream commit (U) you are going to move the code to, and which are the first (F) and last (L, for which you create a work branch W) commits you are going to move on top of that commit. If you have updated submodules, you do this for those as well.

(There are different ways to organize the work branch. We use a branch that is rebased for each update. A different way is to merge the upstream updates into the branch you are using, however this quickly gets even messier than rebasing branches, especially when doing major updates, and after two years of that we started rebasing branches instead.)

The second step is to check out the upstream U commit, including submodules. If you are using Git submodules you configure these at this stage. This commit should be handled as a separate commit, and not included in the F to L commits.

Then you update the submodules with any patches, and update the commit references.

The resulting Chromium checkout can be called W_0.

Now we can start moving patches on top of W_0. The git command for this is deceptively simple:

git rebase --onto W_0 F~1 W

This applies each commit F through to L (inclusive) in sequence onto the W_0 commit and names the resulting branch W.
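To make the F~1 argument concrete, here is a small sketch that models which commits the command replays. It assumes a strictly linear history given as a list of commit names (oldest first); the function names and the commit names in the usage note are made up for illustration, and real Git additionally skips commits whose changes are already present upstream.

```python
def commits_to_replay(history, first, last):
    """Return the commits F..L (inclusive) from a linear history, oldest first.

    This mirrors the range that `git rebase --onto W_0 F~1 W` replays:
    everything reachable from W (which points at L) but not from F~1.
    """
    start = history.index(first)
    end = history.index(last)
    if start > end:
        raise ValueError("first commit must not come after last commit")
    return history[start:end + 1]


def rebase_command(new_base, first, work_branch):
    """Build the argument list for the rebase; F~1 is the parent of F."""
    return ["git", "rebase", "--onto", new_base, f"{first}~1", work_branch]
```

With history = ["U_old", "F", "p2", "p3", "L"], the call commits_to_replay(history, "F", "L") returns the four commits from F to L. F itself is included because the upstream argument is F~1, the parent of F, which Git treats as the excluded boundary.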

A number of these commits (about 10% of patched files in Vivaldi’s source base) will encounter merge conflicts when they are applied, and the process will pause while you repair the conflicts.

It is important to carefully consider the conflicts and whether they may cause functionality to break, and register such possibilities in your bug tracking system.

Once the rebase has completed (a process that can take several workdays) it is time for the next step: Get the code to build again.

This is done the same way as you normally build your browser, fixing compile errors as they are encountered, and yet again registering any that could potentially break the product. This is also a step that can take several workdays. Frequent sources of build problems are API changes and retired/renamed header files.

Once you have it built and running on your machine, it is time to (finally) commit all your changes and update the work branch in the top module and push everything into your repository. My suggestion is that patches in Chromium are mostly committed as “fixups” of the original patch; this will reduce the merge conflict potential, and keeps your patch in one piece.
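In Git, a fixup is created with git commit --fixup=<commit>, which produces a commit whose subject is "fixup! " followed by the original subject; git rebase -i --autosquash then moves it next to its target and folds it in. The sketch below is a simplified model of that reordering only. It matches on the exact subject line, and it appends fixups without a matching target at the end, which differs from Git, which leaves them in place.

```python
FIXUP_PREFIX = "fixup! "


def autosquash_order(subjects):
    """Reorder commit subjects so each 'fixup! X' directly follows 'X'."""
    ordered = []
    fixups = {}  # target subject -> fixup subjects, in their original order
    for subject in subjects:
        if subject.startswith(FIXUP_PREFIX):
            target = subject[len(FIXUP_PREFIX):]
            fixups.setdefault(target, []).append(subject)
        else:
            ordered.append(subject)
    result = []
    for subject in ordered:
        result.append(subject)
        # Attach all fixups for this subject to its first occurrence.
        result.extend(fixups.pop(subject, []))
    # Simplification: fixups with no matching target are appended at the end.
    for leftover in fixups.values():
        result.extend(leftover)
    return result
```

Keeping each patch and its fixups adjacent is what allows them to be squashed into one commit before the next update, so every patched region conflicts at most once.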

Then you should try compiling it on your other delivery platforms, and fix any compile errors there.

Once you have it built, and preferably running, on the other platforms, you can have your autobuilders build the product for each platform and start more detailed testing, fixing the outstanding issues and regressions that might have been introduced by the update. Depending on your project’s complexity, this can take several weeks to complete.

This entire sequence can be partially automated; you still have to manually fix merge conflicts and compile errors, as well as test and fix the resulting executable.

At the time of writing, Vivaldi has just integrated Chromium 104 into our code base, a process that took just over two weeks (the process may take longer at times). Vivaldi is only using the 8-week-cycle Extended Stable releases of Chromium due to the time needed to update the code base and stabilize the product afterwards. In our opinion, if you have a significant number of patches, the only way you can follow the 4 week cycle is to have at least two full teams for upgrades and development, and very likely the upgrade process will have to update weekly to the most recent dev or canary release.

Once you get your browser into production, every couple of weeks you are going to encounter a slightly different problem: keeping the browser up to date with the (security) patches applied to the upstream version you are basing your fork on. This means, again, that you have to update the code base, but these changes are usually not as extensive as they are for a major version upgrade. A slightly modified, less complicated variant of the above process can be used to perform such minor version updates, and in our case this smaller process usually takes just a few hours.

Good luck with your brand new browser fork!

Microsoft! You broke my backup system!

Backing up the data on your computer is one of the most frequently given pieces of advice to computer owners, and there are a number of ways to accomplish it.

The oldest way is to copy the data to external media. Originally this was tapes; today it will frequently be one or more external hard drives or SSDs. Swapping between at least two complete backups is recommended, with the inactive drives stored off-site to avoid destruction or loss in case of fire, theft, or other disasters (and if your area is prone to major disasters, it might be an idea to occasionally store a backup copy in a safe location hundreds of kilometers away; storage over a network connection could be an option for this).

More recently, online backup storage has become more common. Personally, I am slightly skeptical of these, mostly due to the loss of access control, but also because cloud services occasionally have service disruptions, and in some cases lose the data entrusted to them. In case you use such a service, my recommendation is to make sure the data are encrypted locally with a key not known to the service before they are uploaded; this prevents the service from accidentally or intentionally accessing your data, as well as preventing other unauthorized access. Another problem with such services is that they occasionally shut down business with little or no warning, so even if you use such a service, a local backup is recommended anyway. Backing up locally is also recommended when using online application services; these services are useful for working with others, but you might lose access when you most need the access.

There are various ways to perform a backup, from just using a simple copy command, to using more advanced backup applications in the OS, to purchasing commercial backup tools. Trial or freeware versions of many such tools are frequently included on external hard drives.

My backup system

In my system at home I swap between two external SSD drives, and use Windows’s Backup software to manage the backup. Previously, I used a similar system with a commercial tool, but once I moved to Windows 10, I found that the Backup software in Windows seemed to work better for my purposes and I switched to it.

Better does not mean “perfect”, though. There are a few issues, but they are reasonably minor: 1) Swapping drives destroys the backup configuration, so I have to re-enter it when connecting the second drive. 2) The software does not resume backing up data from where it left off on the reconnected drive, so it uses a lot more disk space and requires occasional cleanup to remove old backups.

All this was manageable. At least until last week.

Microsoft breaks the backup

Recently, I finally caved in and allowed Windows 10 on my home computer to be updated to Feature Update 2004. Considering the problems that had been reported about loss of data in Chromium-based browsers, maybe I shouldn’t have, but Windows was now insisting on updating.

A couple of days after the update I switched backup disks, cleaned up some very old backups that were no longer needed, and set up the backup configuration again, and started a backup. A backup that failed! No data was copied to the drive.

I found no errors reported in the normal Event Viewer logs, until I dug down into the application specific logs for “File History backup”, where I found this meaningless warning: “Unusual condition was encountered during scanning user libraries for changes and performing backup of modified files for configuration <name of configuration file>”, with no information about what the “unusual condition” was.

As I usually do when having a problem like this, in order to find out what caused the problem, I started to test with the default configuration and then add more source drives for the backup to see which one broke the system.
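The procedure can be written down as a small loop. This is only a sketch of the testing method, not something that drives File History: backup_succeeds is a hypothetical callback that you would implement yourself (for example by running a backup by hand and checking the result), and the drive names in the usage note are placeholders.

```python
def first_breaking_source(default_sources, extra_sources, backup_succeeds):
    """Add extra sources one at a time; return the first one that breaks the backup.

    `backup_succeeds` is a hypothetical callback taking a list of sources and
    returning True if a backup with that configuration works.
    Returns None if every extra source can be added without a failure.
    """
    current = list(default_sources)
    if not backup_succeeds(current):
        raise RuntimeError("the default configuration already fails")
    for source in extra_sources:
        candidate = current + [source]
        if backup_succeeds(candidate):
            current = candidate  # keep the working source and move on
        else:
            return source
    return None
```

For example, with a stand-in callback such as lambda sources: "D:/" not in sources, calling the function with the extra sources ["E:/", "D:/", "F:/"] returns "D:/".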

The default configuration did copy those files, but it also copied a directory from one of my other drives, the main data drive. The copied directory is where I store all my photos, and it was not part of the configuration. It may have been included because it is the configured default destination folder for the Windows photo import software.

However, when I added the rest of that drive to the list of folders to copy, no further files were copied (although a couple of days later some of the upper level folders did get backed up, none of the important folders were copied).

Removing that drive from the list, and adding the other drives I have for various tasks, projects, and software, those drives did get copied properly.

Going back to the problematic drive, further experiments did not succeed at backing up more of that drive than the mentioned top level folders. Even experimentally adding some sibling folders of the Photo folder did not work; they weren’t even added to the list of folders to back up.

Eventually, I was forced to do a manual copy of that drive to a separate area of the backup drive, to make sure I did have a copy of it.

At present my conclusion is that in Feature Update 2004 Microsoft did something to the Backup/File History software, and it broke my system for backups.

My initial guess at the cause of this problem is that the addition of the photo folder conflicts with adding the rest of the same drive to the list of files and folders to back up. Such overlapping lists should be merged, not create a fatal error.
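Merging such overlapping entries is not difficult. The sketch below shows one possible way to do it and says nothing about how File History is actually implemented: it drops every folder that is already covered by another folder in the list. The folder names in the usage note are made-up examples.

```python
from pathlib import PureWindowsPath


def merge_overlapping(folders):
    """Drop every folder that is already covered by another folder in the list."""
    # Shorter paths first, so an ancestor is always seen before its descendants.
    paths = sorted(
        {PureWindowsPath(folder) for folder in folders},
        key=lambda path: (len(path.parts), str(path)),
    )
    merged = []
    for path in paths:
        # Keep the path only if no already-kept path is one of its ancestors.
        if not any(kept in path.parents for kept in merged):
            merged.append(path)
    return [str(path) for path in merged]
```

Given the list ["D:/Photos", "D:/", "E:/Projects"], only the D: drive root and the E: project folder remain, because the photo folder is already covered by its drive.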

A backup problem like this may not be a Security Vulnerability(TM), but it is definitely a Security Problem.

I have reported this via the Windows Feedback App, as well as to the @MicrosoftHelps Twitter account, but have so far not received any information about how to fix this problem (so, no help, so far).

Microsoft, there are some systems that should never break in production systems. The file system is one, account storage is another, and the backup software is one of the others that should never break. In this release it looks like you broke two such systems. And at least one is still broken 5-6 months after the public release!

Please fix this. Immediately!

Photo by Markus Spiske on Unsplash