Chapter 1. Introduction

Table of Contents

1.1. Description
1.2. Definition
1.3. Use Cases

1.1. Description

transactional-update is an application that updates a Linux system and its applications atomically: the update is performed in the background and does not influence the currently running system. It is activated by a reboot. This concept is similar to rpm-ostree or CoreOS' previous Container OS. However, transactional-update is not another package manager; it reuses existing system tools such as RPM as the packaging format and zypper as the package manager. It depends on Btrfs for its snapshotting and copy-on-write features.

The reason for building on existing tools is the ability to continue using existing packages and tool chains for the delivery and application of updates. While currently only implemented for (open)SUSE environments, the concept is vendor independent and may also be implemented for other package managers, package formats and file systems. The project consists of the (open)SUSE specific transactional-update script and the generic tukit library.

Conceptually, transactional-update creates a new Btrfs snapshot before performing any update and applies all modifications to that snapshot. Since Btrfs snapshots contain only the difference between two versions and are thus usually very small, updates done with transactional-update are very space efficient. This also means several snapshots can be kept on the system at the same time without a problem.
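
The space argument can be illustrated with a small toy model. The sketch below is not how Btrfs works internally (Btrfs shares data at the extent level, and later changes to the original do not leak into a snapshot); it only assumes that a snapshot can be pictured as a thin layer of changed files on top of its parent. The file names and versions are made up for illustration.

```python
from collections import ChainMap

# Toy stand-in for the running system: path -> content.
running = ChainMap({
    "/usr/bin/foo": "foo-1.0",
    "/usr/lib/libbar.so": "bar-2.3",
    "/etc/os-release": "release-1",
})

# A "snapshot" is a new, initially empty layer over the running system.
new_snapshot = running.new_child()

# The update writes into the new layer only.
new_snapshot["/usr/bin/foo"] = "foo-1.1"

# Only the changed file is stored in the snapshot's own layer.
delta = new_snapshot.maps[0]
print(len(delta))                          # 1
print(running["/usr/bin/foo"])             # foo-1.0 (running system untouched)
print(new_snapshot["/usr/bin/foo"])        # foo-1.1
print(new_snapshot["/usr/lib/libbar.so"])  # bar-2.3 (shared with the parent)
```

In this picture the cost of a snapshot grows with the number of changed files, not with the size of the whole system, which is why keeping several snapshots is cheap.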

1.2. Definition

A transactional update (also known as atomic upgrade) is an update that

  • is atomic:

    • The update does not influence the running system.

    • The machine can be powered off at any time. When powered on again, either the unmodified old state or the new state is active. It is not possible to have a running system in an intermediate state.

  • can be rolled back:

    • If the upgrade fails or if a newer software version turns out to not be compatible with your infrastructure, the system can quickly be restored to a previous state.
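
The two properties above can be sketched as a minimal in-memory model. This is an illustration under stated assumptions, not the transactional-update or tukit implementation: the class ToySystem, its methods and the package versions are all hypothetical, and basing each new snapshot on the current default is a choice made for this sketch.

```python
class ToySystem:
    """Hypothetical in-memory model; not the tukit API."""

    def __init__(self, packages):
        self.snapshots = {1: dict(packages)}  # snapshot id -> package versions
        self.default = 1  # snapshot the next boot will use
        self.running = 1  # snapshot the system currently runs from

    def update(self, apply_changes):
        """Apply changes to a copy; switch the default only on success."""
        new_id = max(self.snapshots) + 1
        candidate = dict(self.snapshots[self.default])
        try:
            apply_changes(candidate)
        except Exception:
            return None  # candidate is discarded, default stays unchanged
        self.snapshots[new_id] = candidate
        self.default = new_id  # the only step that changes what boots next
        return new_id

    def reboot(self):
        """Activate whatever snapshot is currently the default."""
        self.running = self.default

    def rollback(self, snapshot_id):
        """Make an earlier snapshot the default for the next boot."""
        if snapshot_id not in self.snapshots:
            raise KeyError(snapshot_id)
        self.default = snapshot_id


def good_update(pkgs):
    pkgs["kernel"] = "6.1"


def broken_update(pkgs):
    pkgs["kernel"] = "6.2"
    raise RuntimeError("post script failed")


system = ToySystem({"kernel": "6.0"})

first = system.update(good_update)
before_reboot = system.snapshots[system.running]["kernel"]  # still "6.0"
system.reboot()
after_reboot = system.snapshots[system.running]["kernel"]   # now "6.1"

failed = system.update(broken_update)  # None: the snapshot was discarded

system.rollback(1)
system.reboot()
after_rollback = system.snapshots[system.running]["kernel"]  # back to "6.0"
```

The running state only ever changes in reboot(), and only to a snapshot that was completed successfully, so no intermediate state is ever active in this model.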

1.3. Use Cases

As Linux distributions evolve, new concepts are emerging, such as rolling releases, containers, embedded systems or long-term support releases. While the classical update mechanisms are probably perfectly fine for a regular desktop user or a conventional server system, the following example use cases may indicate why an even more error-proof system may be desirable:

Distributions with rolling updates face the problem of how intrusive updates should be applied to a running system without breaking the update mechanism itself. Examples like the migration from SysV init to systemd, a major version update of a desktop environment while the desktop is still running, or even just a small update to D-Bus give a good idea of the problem. The desktop environment may simply terminate, killing the update process and leaving the system in a broken, undefined state. If any update breaks such a system, there needs to be a quick way to roll back the system to the last working state.

On mission critical systems or embedded systems, one will usually want to make sure that no service or user behaviour interferes with the update of the system. Moreover, the update should not modify the running system, e.g. by uncontrolled restarts of services or unexpected modifications in post scripts. Potential interruptions are deferred to a defined maintenance window instead. For really critical systems, the update can be verified (e.g. using snapper diff) or discarded before actually booting into the new system. If an update encounters an error, the new snapshot will be discarded automatically.

For cluster nodes, it is important that the system is always in a consistent state, requires no manual interaction and is able to recover itself from error conditions. For these systems, transactional-update provides automatic updates; snapshots with failed updates will be automatically removed. Automatic reboots can be triggered using a variety of different reboot methods (e.g. rebootmgr, notify, kured or systemd), making the application of the updates cluster-aware.

To summarize: the update should only be applied if there were no errors during the update. If it turns out that the update is causing errors (e.g. because of a new kernel version incompatible with the hardware), there should be a quick and easy way to roll back to the state before the update was applied.