Ubuntu 20.04 apt update breaks DNS #429

Closed
CecileRobertMichon opened this issue Nov 9, 2020 · 15 comments · Fixed by #443

@CecileRobertMichon
Contributor

#426 disabled Ubuntu 20.04 because its builds started failing on apt updates. This issue tracks investigating the failure and re-enabling 20.04.

/assign

@kkeshavamurthy
Member

This seems to be related to a DNS resolution issue.

On a 20.04 machine on Azure:

packer@pkrvmfiv41tmnbe:~$ uname -a
Linux pkrvmfiv41tmnbe 5.4.0-1031-azure #32-Ubuntu SMP Tue Oct 6 09:47:33 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
packer@pkrvmfiv41tmnbe:~$ cat /etc/issue
Ubuntu 20.04.1 LTS \n \l

packer@pkrvmfiv41tmnbe:~$ cat /etc/resolv.conf
nameserver 127.0.0.53
options edns0 trust-ad

packer@pkrvmfiv41tmnbe:~$ ping www.google.com
ping: www.google.com: Temporary failure in name resolution
packer@pkrvmfiv41tmnbe:~$

On an 18.04 machine, /etc/resolv.conf looks like this:

nameserver 127.0.0.53
options edns0
search gzwhnwhgwakubkuho0xo4vbl0b.xx.internal.cloudapp.net
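
The visible difference between the two listings is the missing search line on the 20.04 machine. A minimal Python sketch of that check follows. The function names (parse_resolv_conf, has_search_domain) are made up for illustration, and the sample strings are copied from the listings above. A missing search line is only flagged as a symptom here; the sketch does not diagnose why it is missing.

```python
def parse_resolv_conf(text):
    """Collect the arguments of the nameserver, search, and options directives."""
    directives = {"nameserver": [], "search": [], "options": []}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments (resolv.conf allows '#' and ';').
        if not line or line.startswith(("#", ";")):
            continue
        parts = line.split()
        keyword, values = parts[0], parts[1:]
        if keyword in directives:
            directives[keyword].extend(values)
    return directives


def has_search_domain(text):
    """True if at least one search domain is configured."""
    return bool(parse_resolv_conf(text)["search"])


# Sample contents copied from the two machines shown in this thread.
BROKEN_2004 = """nameserver 127.0.0.53
options edns0 trust-ad
"""

WORKING_1804 = """nameserver 127.0.0.53
options edns0
search gzwhnwhgwakubkuho0xo4vbl0b.xx.internal.cloudapp.net
"""

if __name__ == "__main__":
    print(has_search_domain(BROKEN_2004))   # False
    print(has_search_domain(WORKING_1804))  # True
```

On a real machine the text would come from reading /etc/resolv.conf instead of the sample strings.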

@kkeshavamurthy
Member

After a reboot, resolv.conf is populated with the search domain again and everything works.

packer@pkrvmfiv41tmnbe:~$ cat /etc/resolv.conf
nameserver 127.0.0.53
options edns0 trust-ad
search p1bbnpkytbnexleipe4eijnfec.xx.internal.cloudapp.net

packer@pkrvmfiv41tmnbe:~$ ping www.google.com
PING www.google.com (172.217.3.164) 56(84) bytes of data.
64 bytes from sea15s11-in-f4.1e100.net (172.217.3.164): icmp_seq=1 ttl=116 time=4.82 ms
^C
--- www.google.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 4.820/4.820/4.820/0.000 ms

packer@pkrvmfiv41tmnbe:~$ sudo apt update
Get:1 http://security.ubuntu.com/ubuntu focal-security InRelease [107 kB]
Hit:2 http://us.archive.ubuntu.com/ubuntu focal InRelease
Get:3 http://us.archive.ubuntu.com/ubuntu focal-updates InRelease [111 kB]
Get:4 http://us.archive.ubuntu.com/ubuntu focal-backports InRelease [98.3 kB]
Fetched 317 kB in 1s (336 kB/s)
Reading package lists... Done
Building dependency tree
Reading state information... Done
All packages are up to date.

@johnsonshi

Just deployed Ubuntu 20.04 LTS Gen 1 and Gen 2 VMs on Azure (single VM deployment that isn't part of a K8S deployment).

Ubuntu 20.04 LTS Gen 1 Latest Image: Canonical:0001-com-ubuntu-server-focal:20_04-lts:latest

johsh@johsh20201118145236:~$ uname -a
Linux johsh20201118145236 5.4.0-1031-azure #32-Ubuntu SMP Tue Oct 6 09:47:33 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
johsh@johsh20201118145236:~$ cat /etc/resolv.conf
# This file is managed by man:systemd-resolved(8). Do not edit.
#
# This is a dynamic resolv.conf file for connecting local clients to the
# internal DNS stub resolver of systemd-resolved. This file lists all
# configured search domains.
#
# Run "resolvectl status" to see details about the uplink DNS servers
# currently in use.
#
# Third party programs must not access this file directly, but only through the
# symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a different way,
# replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

nameserver 127.0.0.53
options edns0
search f5qiwrldssiuxezyceeucje22h.xx.internal.cloudapp.net
johsh@johsh20201118145236:~$ ping www.google.com
PING www.google.com (172.217.14.228) 56(84) bytes of data.
64 bytes from sea30s02-in-f4.1e100.net (172.217.14.228): icmp_seq=1 ttl=116 time=4.85 ms
64 bytes from sea30s02-in-f4.1e100.net (172.217.14.228): icmp_seq=2 ttl=116 time=5.01 ms
64 bytes from sea30s02-in-f4.1e100.net (172.217.14.228): icmp_seq=3 ttl=116 time=4.95 ms
^C
--- www.google.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 4.851/4.936/5.006/0.064 ms

Ubuntu 20.04 LTS Gen 2 Latest Image: Canonical:0001-com-ubuntu-server-focal:20_04-lts-gen2:latest

johsh@johsh20201118145249:~$ uname -a
Linux johsh20201118145249 5.4.0-1031-azure #32-Ubuntu SMP Tue Oct 6 09:47:33 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
johsh@johsh20201118145249:~$ cat /etc/resolv.conf
# This file is managed by man:systemd-resolved(8). Do not edit.
#
# This is a dynamic resolv.conf file for connecting local clients to the
# internal DNS stub resolver of systemd-resolved. This file lists all
# configured search domains.
#
# Run "resolvectl status" to see details about the uplink DNS servers
# currently in use.
#
# Third party programs must not access this file directly, but only through the
# symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a different way,
# replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

nameserver 127.0.0.53
options edns0
search fw2b4inrksxeveljnpbdirdpac.xx.internal.cloudapp.net
johsh@johsh20201118145249:~$ ping www.google.com
PING www.google.com (216.58.193.68) 56(84) bytes of data.
64 bytes from sea15s07-in-f68.1e100.net (216.58.193.68): icmp_seq=1 ttl=116 time=4.66 ms
64 bytes from sea15s07-in-f68.1e100.net (216.58.193.68): icmp_seq=2 ttl=116 time=4.76 ms
64 bytes from sea15s07-in-f68.1e100.net (216.58.193.68): icmp_seq=3 ttl=116 time=4.71 ms
^C
--- www.google.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2004ms
rtt min/avg/max/mdev = 4.663/4.711/4.757/0.038 ms

@kkeshavamurthy
Member

Looking into it a bit further, the DNS entry in resolv.conf is removed after doing a dist-upgrade. Not sure what changed with Ubuntu recently. A reboot after the dist-upgrade re-populates the DNS entry and things seem to work fine.

@CecileRobertMichon
Contributor Author

@kkeshavamurthy are you able to repro the same thing on a simple Azure 20.04 VM not built through image-builder?

@kkeshavamurthy
Member

> @kkeshavamurthy are you able to repro the same thing on a simple Azure 20.04 VM not built through image-builder?

Yup. Same issue on a vanilla Ubuntu 20.04 VM. dist-upgrade messes up resolv.conf.

@johnsonshi

Confirmed @kkeshavamurthy. Just did a repro and performed apt dist-upgrade, and /etc/resolv.conf was indeed wiped on both 20.04 LTS Gen 1 and Gen 2 latest images.

dist-upgrade performs an aggressive upgrade of packages in the system, and it may change dependencies, remove, or add packages along the way (see https://askubuntu.com/questions/770135/apt-full-upgrade-versus-apt-get-dist-upgrade)

Right now we need to narrow down which package is causing resolv.conf to be wiped during the package upgrades. I'm trying to find out the culprit package.

@johnsonshi

@CecileRobertMichon @kkeshavamurthy @anhvoms

A normal apt upgrade in both of the latest images for Gen 1 and Gen 2 Ubuntu 20.04 LTS also wipes the /etc/resolv.conf file.

@johnsonshi

An apt dist-upgrade or even a normal apt upgrade does not overwrite the /etc/resolv.conf file in Ubuntu 18.04.

@johnsonshi

johnsonshi commented Nov 19, 2020

Alright, I've managed to isolate the circumstances under which /etc/resolv.conf gets overwritten.

On Ubuntu, resolv.conf and network name resolution are handled by systemd-resolved, which is a component service of systemd.

On the latest Ubuntu 20.04 images, @kkeshavamurthy reported that performing an apt dist-upgrade caused the /etc/resolv.conf file to be overwritten (causing the search option to disappear from the file, which breaks DNS). I also tried performing a simple apt upgrade, which also caused /etc/resolv.conf to be overwritten.

I inspected which packages are upgradable to narrow down the culprit package. Two upgradable packages that were likely culprits were netplan.io and systemd.

Upgrading only netplan.io does not cause /etc/resolv.conf to be overwritten (see https://paste.ubuntu.com/p/NHQ7Wv37mC/).

Upgrading only systemd does cause /etc/resolv.conf to be overwritten (see https://paste.ubuntu.com/p/36xbMMxYqm/).
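
For anyone repeating this kind of isolation, one way to confirm whether a single-package upgrade touched the file is to snapshot its contents before and after and compare the directive lines. The Python sketch below is only an illustration of that comparison: the helper names (directive_lines, dropped_directives) are hypothetical, and the two sample strings reuse resolv.conf contents quoted earlier in this thread rather than output captured during these particular test runs.

```python
def directive_lines(text):
    """Return the non-blank, non-comment lines of a resolv.conf snapshot."""
    lines = []
    for raw in text.splitlines():
        line = raw.strip()
        if line and not line.startswith(("#", ";")):
            lines.append(line)
    return lines


def dropped_directives(before, after):
    """List directive lines present in `before` but missing from `after`."""
    remaining = set(directive_lines(after))
    return [line for line in directive_lines(before) if line not in remaining]


# Illustrative snapshots (contents quoted earlier in the thread).
BEFORE = """nameserver 127.0.0.53
options edns0 trust-ad
search p1bbnpkytbnexleipe4eijnfec.xx.internal.cloudapp.net
"""

AFTER = """nameserver 127.0.0.53
options edns0 trust-ad
"""

if __name__ == "__main__":
    for line in dropped_directives(BEFORE, AFTER):
        print("dropped:", line)
```

An empty result means the upgrade left the directives alone; a dropped search line matches the symptom described above.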

I deployed an old Ubuntu 18.04 LTS image (with an out-of-date systemd package). I upgraded systemd on that image but it didn't overwrite /etc/resolv.conf.

This issue seems to be limited to the latest Ubuntu 20.04 Gen 1 and Gen 2 images when systemd is upgraded. We'll work with Canonical and internal teams to get this bug fixed upstream.

// @CecileRobertMichon // @anhvoms

@johnsonshi

johnsonshi commented Nov 19, 2020

I've further narrowed down the faulty upgrade. The symptom (/etc/resolv.conf overwritten) appears only on a very specific upgrade path.

On an older Ubuntu 20.04 LTS Gen 1 VM Image:

  • Canonical:0001-com-ubuntu-server-focal:20_04-lts:20.04.202006270
  • Upgrade of systemd: systemd/focal-updates 245.4-4ubuntu3.3 amd64 [upgradable from: 245.4-4ubuntu3.1]
  • /etc/resolv.conf is not overwritten when systemd goes through that upgrade.
  • https://paste.ubuntu.com/p/7frQfgTXkf/

On a recent Ubuntu 20.04 LTS Gen 1 VM Image:

  • Canonical:0001-com-ubuntu-server-focal:20_04-lts:20.04.202010260
  • Upgrade of systemd: systemd/focal-updates 245.4-4ubuntu3.3 amd64 [upgradable from: 245.4-4ubuntu3.2]
  • /etc/resolv.conf is overwritten when systemd goes through that upgrade.
  • https://paste.ubuntu.com/p/NmXb4PKj3f/

It seems that upgrading from systemd 245.4-4ubuntu3.2 to systemd 245.4-4ubuntu3.3 (from the focal-updates repo) causes /etc/resolv.conf to be overwritten. The rewritten file no longer has the search option, and DNS breaks.
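
To check whether a given VM is about to take that upgrade path, the upgradable-package line quoted in the bullets above can be parsed directly. This is a rough Python sketch, not a supported tool: the regular expression is written against the two lines shown in this comment, the function names are hypothetical, and treating only the 3.2 to 3.3 path as affected is an assumption based on the two images tested here.

```python
import re

# Matches lines such as:
#   systemd/focal-updates 245.4-4ubuntu3.3 amd64 [upgradable from: 245.4-4ubuntu3.2]
UPGRADABLE_RE = re.compile(
    r"^(?P<package>[^/\s]+)/(?P<suite>\S+)\s+"
    r"(?P<candidate>\S+)\s+(?P<arch>\S+)\s+"
    r"\[upgradable from: (?P<installed>[^\]]+)\]$"
)

# Assumption: the only affected path is the one observed in this thread.
AFFECTED_PATH = ("245.4-4ubuntu3.2", "245.4-4ubuntu3.3")


def parse_upgradable(line):
    """Return the fields of one upgradable-package line, or None if it does not match."""
    match = UPGRADABLE_RE.match(line.strip())
    return match.groupdict() if match else None


def is_affected_systemd_upgrade(line):
    """True if the line describes the systemd upgrade path reported as faulty."""
    info = parse_upgradable(line)
    if info is None or info["package"] != "systemd":
        return False
    return (info["installed"], info["candidate"]) == AFFECTED_PATH


OLD_IMAGE = "systemd/focal-updates 245.4-4ubuntu3.3 amd64 [upgradable from: 245.4-4ubuntu3.1]"
RECENT_IMAGE = "systemd/focal-updates 245.4-4ubuntu3.3 amd64 [upgradable from: 245.4-4ubuntu3.2]"

if __name__ == "__main__":
    print(is_affected_systemd_upgrade(OLD_IMAGE))     # False
    print(is_affected_systemd_upgrade(RECENT_IMAGE))  # True
```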

// @kkeshavamurthy // @CecileRobertMichon // @anhvoms

@CecileRobertMichon CecileRobertMichon changed the title Re-enable Azure ubuntu 20.04 Ubuntu 20.04 apt update breaks DNS Nov 20, 2020
@johnsonshi

@linuxelf001 Hey Rakesh, has upstream Canonical rolled out a systemd fix to the Ubuntu images? apt update is a very common scenario that users run when bootstrapping and setting up a VM, so this bug may have a very big impact.

@linuxelf001

linuxelf001 commented Nov 28, 2020

Hi @johnsonshi. There is a new image (0.04.202011230) which has systemd (245.4-4ubuntu3.3). For the old image, one workaround at this time is to reboot the VM before upgrading the systemd package. If the VM is rebooted first, the DNS search entry in resolv.conf is not touched.

The issue is reported and discussed here: https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1902960.
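
As a rough illustration of the workaround's ordering, here is a small Python sketch that decides whether a reboot should come before the upgrade. The function name plan_upgrade and its inputs are hypothetical, and limiting the workaround to images that still have 245.4-4ubuntu3.2 installed is an assumption drawn from the versions discussed in this thread.

```python
# Assumption: only images with this systemd version installed need the workaround.
AFFECTED_INSTALLED = "245.4-4ubuntu3.2"


def plan_upgrade(installed_systemd, rebooted_since_first_boot):
    """Return the ordered steps to take before upgrading packages on a VM."""
    if installed_systemd != AFFECTED_INSTALLED:
        # Newer images already ship the updated systemd; nothing special to do.
        return ["upgrade"]
    if rebooted_since_first_boot:
        # Per the workaround, a prior reboot keeps the search entry intact.
        return ["upgrade"]
    return ["reboot", "upgrade"]


if __name__ == "__main__":
    print(plan_upgrade("245.4-4ubuntu3.2", False))  # ['reboot', 'upgrade']
    print(plan_upgrade("245.4-4ubuntu3.3", False))  # ['upgrade']
```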

@kkeshavamurthy
Member

The new image seems to be working fine. I was able to build an Ubuntu 20.04 VHD without hitting the DNS issue. @CecileRobertMichon +1 to turn CI for Ubuntu 20.04 back on.

@CecileRobertMichon
Contributor Author

Awesome, @kkeshavamurthy want to open the PR?
