V18 Upgrade Aborted: Common Pitfalls & Remediation

31 Jan 2022

To provide useful feedback and service to our customers, we continuously analyze how migrations from 3CX V16 (Debian 9) to V18 (Debian 10) are progressing, looking for common pitfalls and errors that could affect the upgrade. Our recent analysis identified the two main reasons systems fail to upgrade: APT being locked, and unsupported third-party source lists.

APT lock

On 66% of the systems where the upgrade gracefully aborts, the cause is a stuck APT process. Specifically, most of these systems have an apt-get update process that started in 2021 and has remained stuck ever since. The solution is simply to restart the system before attempting the upgrade.

You can identify whether your system falls into the 66% described above by checking the first few lines of the log sent to the admin’s email upon an unsuccessful upgrade. If the log contains entries similar to the ones below, then rebooting your system should resolve the issue and allow for the upgrade to proceed.

Example of aborted upgrades due to APT being locked:

[19:06:24] Failed: Output of ps aux (apt lock)

root 15914 0.0 0.1 45940 7548 ? S 2021 14:40 | \_ apt-get update
root 11289 0.0 0.0 12780 916 ? S 19:06 0:0   | \_ grep 15914

[08:14:52] Failed: Output of ps aux (apt lock)

root 11550 0.0 0.1 45976 7816 ? S Sep24 4:24 | \_ apt-get update
root 28281 0.0 0.0 12780 928 ? S 08:14 0:00  | \_ grep 11550

[14:27:53] Failed: Output of ps aux (apt lock)

root 14734 0.0 0.4 46288 8336 ? S Sep26 6:04 | \_ apt-get update
root 26262 0.0 0.0 12780 984 ? S 14:27 0:00  | \_ grep 14734
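
The same check can be done by hand before starting the upgrade. The following is a minimal Python sketch, not part of the 3CX upgrade script, that scans ps aux output for APT processes that did not start recently. It assumes the usual ps aux column layout (START is the ninth column) and that ps prints START as HH:MM only for recently started processes, with older ones showing a date such as Sep24 or a year such as 2021, as in the excerpts above. The names find_stale_apt and read_ps_aux are hypothetical.

```python
import re
import subprocess

# START looks like "19:06" for recent processes, "Sep24" or "2021" for older ones.
RECENT_START = re.compile(r"^\d{1,2}:\d{2}$")
# Matches "apt" or "apt-get" as a whole word in the command column.
APT_COMMAND = re.compile(r"\bapt(-get)?\b")


def find_stale_apt(ps_lines):
    """Return PIDs of apt/apt-get processes whose START is not a plain HH:MM."""
    stale = []
    for line in ps_lines:
        # Columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header line or malformed row
        pid, start, command = int(fields[1]), fields[8], fields[10]
        if APT_COMMAND.search(command) and not RECENT_START.match(start):
            stale.append(pid)
    return stale


def read_ps_aux():
    """Return the lines of 'ps aux' from the local machine."""
    result = subprocess.run(
        ["ps", "aux"], capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()
```

Calling find_stale_apt(read_ps_aux()) on the PBX host returns a list of PIDs; a non-empty list suggests that a reboot is needed before retrying the upgrade.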

Third-party source lists

18% of the aborted upgrades are due to third-party source lists under the /etc/apt/sources.list.d/ directory. To ensure a smooth upgrade, 3CX must minimize the chance of a third-party repository or package breaking the process, so we only allow a small number of third-party source lists that our team has tested. When the upgrade script detects any source list other than the permitted ones, it aborts the upgrade without making any changes.

Identifying third-party source lists for removal

As with the APT lock described above, it is easy to identify third-party source lists that interfere with the upgrade. If the log sent to your email contains entries similar to the ones below, you have third-party source lists that must be moved or removed before the upgrade script can proceed.

  • Preparation: Found an uncommon source list in /etc/apt/sources.list.d/: 3xcpbx.list
  • Preparation: Found an uncommon source list in /etc/apt/sources.list.d/: 3cxpbx.listecho
  • Preparation: Found an uncommon source list in /etc/apt/sources.list.d/: hetzner-mirror.list

The example above shows three source lists: hetzner-mirror.list, and two that look like 3CX lists but are misspelled (3xcpbx.list, 3cxpbx.listecho). Any one of them will cause the upgrade to abort.
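
If the emailed log is long, the flagged file names can be pulled out automatically. The following is a minimal Python sketch that assumes the log uses exactly the message wording shown in the excerpt above; the function name flagged_source_lists is hypothetical.

```python
import re

# Matches the preparation-step message shown in the log excerpt above
# and captures the file name that follows it.
UNCOMMON_LIST = re.compile(
    r"Found an uncommon source list in /etc/apt/sources\.list\.d/:\s*(\S+)"
)


def flagged_source_lists(log_text):
    """Return the file names the upgrade log reports as uncommon source lists."""
    return UNCOMMON_LIST.findall(log_text)
```

Passing the full text of the emailed log to flagged_source_lists returns the file names in the order they appear, which are the files to move or remove.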

Permitted source files list

  • google-cloud.list
  • google-cloud-sdk.list
  • backports.list
  • gce_sdk.list
  • 3cxpbx.list
  • 3cxpbx-testing.list
  • rasp.list
  • digitalocean-agent.list
  • google_osconfig_managed.list
  • google-cloud-monitoring.list
  • google-cloud-logging.list
  • droplet-agent.list

Make sure every file under /etc/apt/sources.list.d/ appears in the list above.
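
This comparison can also be scripted. The following is a minimal Python sketch, not the 3CX upgrade script's own logic, that lists every entry in the directory whose name is not on the permitted list. It assumes that backports.list and gce_sdk.list are two separate entries and that every file name in the directory counts, not only those ending in .list (the 3cxpbx.listecho example suggests so). The names PERMITTED and find_unpermitted are hypothetical.

```python
import os

# Permitted source files, copied from the list above.
# Assumption: backports.list and gce_sdk.list are separate entries.
PERMITTED = {
    "google-cloud.list",
    "google-cloud-sdk.list",
    "backports.list",
    "gce_sdk.list",
    "3cxpbx.list",
    "3cxpbx-testing.list",
    "rasp.list",
    "digitalocean-agent.list",
    "google_osconfig_managed.list",
    "google-cloud-monitoring.list",
    "google-cloud-logging.list",
    "droplet-agent.list",
}


def find_unpermitted(directory="/etc/apt/sources.list.d/", permitted=PERMITTED):
    """Return the sorted names in `directory` that are not on the permitted list."""
    if not os.path.isdir(directory):
        return []
    return sorted(name for name in os.listdir(directory) if name not in permitted)
```

An empty result from find_unpermitted() means nothing in the directory should trip this particular check. Any name it returns is a candidate to move out of the directory before retrying.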

My system doesn’t fall into the above issues

The two issues described above account for 84% of all aborted upgrades and do not affect a running system in any way, because the upgrade script exits before making any changes. If your system falls under the remaining 16%, please refer to this document.

General tips:

  • Schedule the upgrade outside business hours.
  • Before upgrading any software, always take a full backup of your VM as well as of 3CX, and store it outside the machine.
  • Ensure the system has enough resources (especially memory and disk space).
  • If you are deploying to a new machine, shut down the old one before restoring the backup. Otherwise you will end up with FQDN DNS problems, with the FQDN constantly switching between the old and the new server.
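
For the resource tip, free disk space and available memory can be read quickly before the upgrade. The following is a minimal Python sketch for a Linux host. It deliberately applies no thresholds, since none are stated here, and the function names free_disk_gib and mem_available_gib are hypothetical.

```python
import shutil

GIB = 1024 ** 3


def free_disk_gib(path="/"):
    """Return the free space, in GiB, on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free / GIB


def mem_available_gib(meminfo_text):
    """Return MemAvailable in GiB from /proc/meminfo-style text, or None if absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            # The value is reported in kB (units of 1024 bytes).
            return int(line.split()[1]) / (1024 ** 2)
    return None
```

On the PBX host, mem_available_gib can be given the contents of /proc/meminfo, and free_disk_gib("/") reports the root filesystem.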