System crashes on S3 with 6.1.4 kernel, 6.1.1 worked #7980

Closed
marmarek opened this issue Jan 10, 2023 · 7 comments · Fixed by QubesOS/qubes-linux-kernel#710
Labels
  • affects-4.1: This issue affects Qubes OS 4.1.
  • C: kernel
  • C: power management
  • diagnosed: Technical diagnosis has been performed (see issue comments).
  • P: default: Priority: default. Default priority for new issues, to be replaced given sufficient information.
  • pr submitted: A pull request has been submitted for this issue.
  • r4.1-dom0-stable
  • r4.2-host-stable
  • T: bug: Type: bug report. A problem or defect resulting in unintended behavior in something that exists.

Comments

@marmarek
Member


Qubes OS release

R4.2 (R4.1 likely too)

Brief summary

Dom0 crashes on S3 (suspend or resume, unsure which one yet)

https://openqa.qubes-os.org/tests/57938

It happens on both Intel and AMD.

Steps to reproduce

Suspend the system, then wake it up.

Expected behavior

System resumes normally

Actual behavior

[  471.704882] PM: suspend entry (deep)
[  471.712245] Filesystems sync: 0.007 seconds
[  471.713535] Freezing user space processes ... (elapsed 0.001 seconds) done.
[  471.715107] OOM killer disabled.
[  471.715110] Freezing remaining freezable tasks ... (elapsed 0.108 seconds) done.
[  471.823298] printk: Suspending console(s) (use no_console_suspend to debug)
(XEN) Hardware Dom0 crashed: rebooting machine in 5 seconds.
marmarek added the T: bug, C: kernel, and P: default labels on Jan 10, 2023
marmarek added this to the Release 4.1 updates milestone on Jan 10, 2023
andrewdavidwong added the needs diagnosis and C: power management labels on Jan 11, 2023

@GWeck

GWeck commented Jan 11, 2023

I can confirm this for the following configuration:

  • HP Elitebook 840 G4 with
    • Intel Core i5-7200U Processor
    • 32 GB RAM
    • Intel HD Graphics 620
    • SAMSUNG SSD 870
  • Qubes R4.1 with:
    • Kernel latest
    • Xen 4.14-5
    • Kernel 6.1.4-1
      crashes on Suspend

@marmarek
Member Author

Happens with 6.1.3 kernel too.

Crash message extracted from pstore
<6>[  348.284004] PM: suspend entry (deep)
<6>[  348.289532] Filesystems sync: 0.005 seconds
<6>[  348.291545] Freezing user space processes ... (elapsed 0.000 seconds) done.
<6>[  348.292457] OOM killer disabled.
<6>[  348.292462] Freezing remaining freezable tasks ... (elapsed 0.104 seconds) done.
<6>[  348.396612] printk: Suspending console(s) (use no_console_suspend to debug)
<6>[  348.749228] PM: suspend devices took 0.352 seconds
<6>[  348.769713] ACPI: EC: interrupt blocked
<1>[  348.816077] BUG: kernel NULL pointer dereference, address: 000000000000001c
<1>[  348.816080] #PF: supervisor read access in kernel mode
<1>[  348.816081] #PF: error_code(0x0000) - not-present page
<6>[  348.816083] PGD 0 P4D 0 
<4>[  348.816086] Oops: 0000 [#1] PREEMPT SMP NOPTI
<4>[  348.816089] CPU: 0 PID: 6764 Comm: systemd-sleep Not tainted 6.1.3-1.fc32.qubes.x86_64 #1
dmesg-efi-165681053804001:
Oops#1 Part4
<5>[  289.488793] audit: type=1100 audit(1656810479.377:296): pid=6693 uid=0 auid=4294967295 ses=4294967295 msg='op=pubkey_auth grantors=auth-key acct="root" exe="/usr/sbin/sshd" hostname=? addr=127.0.0.1 terminal=? res=success'
<5>[  289.488835] audit: type=2404 audit(1656810479.377:297): pid=6693 uid=0 auid=4294967295 ses=4294967295 msg='op=negotiate kind=auth-key fp=SHA256:08:74:2f:c8:b4:07:d2:9a:6b:2f:c8:7b:b5:e7:fa:47:4c:15:6a:cd:b2:d0:40:b1:00:46:0f:0e:de:e3:f8:7c exe="/usr/sbin/sshd" hostname=? addr=127.0.0.1 terminal=? res=success'
<5>[  289.491602] audit: type=1101 audit(1656810479.380:298): pid=6693 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:accounting grantors=pam_unix acct="root" exe="/usr/sbin/sshd" hostname=127.0.0.1 addr=127.0.0.1 terminal=ssh res=success'
<5>[  289.491865] audit: type=2404 audit(1656810479.380:299): pid=6693 uid=0 auid=4294967295 ses=4294967295 msg='op=destroy kind=session fp=? direction=both spid=6694 suid=74 rport=58070 laddr=127.0.0.1 lport=22  exe="/usr/sbin/sshd" hostname=? addr=127.0.0.1 terminal=? res=success'
<5>[  289.494418] audit: type=1103 audit(1656810479.383:300): pid=6693 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:setcred grantors=pam_env,pam_unix acct="root" exe="/usr/sbin/sshd" hostname=127.0.0.1 addr=127.0.0.1 terminal=ssh res=success'
<5>[  289.494624] audit: type=1006 audit(1656810479.383:301): pid=6693 uid=0 old-auid=4294967295 auid=0 tty=(none) old-ses=4294967295 ses=2 res=1
<5>[  289.494640] audit: type=1300 audit(1656810479.383:301): arch=c000003e syscall=1 success=yes exit=1 a0=3 a1=7ffe2936ccf0 a2=1 a3=7ffe2936ca07 items=0 ppid=2162 pid=6693 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=2 comm="sshd" exe="/usr/sbin/sshd" key=(null)
dmesg-efi-165681053803002:
Panic#2 Part3
<4>[  348.816092] Hardware name: Star Labs StarBook/StarBook, BIOS 8.01 07/03/2022
<4>[  348.816093] RIP: e030:acpi_get_wakeup_address+0xc/0x20
<4>[  348.816100] Code: 44 00 00 48 8b 05 04 a3 82 02 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 48 8b 05 fc 9d 82 02 <8b> 40 1c c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f
<4>[  348.816103] RSP: e02b:ffffc90042537d08 EFLAGS: 00010246
<4>[  348.816105] RAX: 0000000000000000 RBX: 0000000000000003 RCX: 20c49ba5e353f7cf
<4>[  348.816106] RDX: 000000000000cd19 RSI: 000000000002ee9a RDI: 002a051ed42d7694
<4>[  348.816108] RBP: 0000000000000003 R08: ffffc90042537ca0 R09: ffffffff82c5e468
<4>[  348.816110] R10: 0000000000007ff0 R11: 0000000000000000 R12: 0000000000000000
<4>[  348.816111] R13: fffffffffffffff2 R14: ffff88812206e6c0 R15: ffff88812206e6e0
<4>[  348.816121] FS:  00007cb49b01eb80(0000) GS:ffff888189400000(0000) knlGS:0000000000000000
<4>[  348.816123] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[  348.816124] CR2: 000000000000001c CR3: 000000012231a000 CR4: 0000000000050660
<4>[  348.816131] Call Trace:
<4>[  348.816133]  <TASK>
<4>[  348.816134]  acpi_pm_prepare+0x1a/0x50
<4>[  348.816141]  suspend_enter+0x94/0x360
<4>[  348.816146]  suspend_devices_and_enter+0x198/0x2b0
<4>[  348.816150]  enter_state+0x18d/0x1f5
<4>[  348.816155]  pm_suspend.cold+0x20/0x6b
<4>[  348.816159]  state_store+0x27/0x60
<4>[  348.816163]  kernfs_fop_write_iter+0x125/0x1c0
<4>[  348.816169]  new_sync_write+0x105/0x190
<4>[  348.816176]  vfs_write+0x211/0x2a0
<4>[  348.816180]  ksys_write+0x67/0xe0
<4>[  348.816183]  do_syscall_64+0x59/0x90
<4>[  348.816188]  ? do_syscall_64+0x69/0x90
<4>[  348.816192]  ? exc_page_fault+0x76/0x170
<4>[  348.816195]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
dmesg-efi-165681053803001:
Oops#1 Part3
<4>[  348.257406] kauditd_printk_skb: 23 callbacks suppressed
<5>[  348.257408] audit: type=1130 audit(1656810538.144:318): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=qubes-suspend comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
<6>[  348.284004] PM: suspend entry (deep)
<6>[  348.289532] Filesystems sync: 0.005 seconds
<6>[  348.291545] Freezing user space processes ... (elapsed 0.000 seconds) done.
<6>[  348.292457] OOM killer disabled.
<6>[  348.292462] Freezing remaining freezable tasks ... (elapsed 0.104 seconds) done.
<6>[  348.396612] printk: Suspending console(s) (use no_console_suspend to debug)
<6>[  348.749228] PM: suspend devices took 0.352 seconds
<6>[  348.769713] ACPI: EC: interrupt blocked
<1>[  348.816077] BUG: kernel NULL pointer dereference, address: 000000000000001c
<1>[  348.816080] #PF: supervisor read access in kernel mode
<1>[  348.816081] #PF: error_code(0x0000) - not-present page
<6>[  348.816083] PGD 0 P4D 0 
<4>[  348.816086] Oops: 0000 [#1] PREEMPT SMP NOPTI
<4>[  348.816089] CPU: 0 PID: 6764 Comm: systemd-sleep Not tainted 6.1.3-1.fc32.qubes.x86_64 #1
<4>[  348.816092] Hardware name: Star Labs StarBook/StarBook, BIOS 8.01 07/03/2022
<4>[  348.816093] RIP: e030:acpi_get_wakeup_address+0xc/0x20
<4>[  348.816100] Code: 44 00 00 48 8b 05 04 a3 82 02 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 48 8b 05 fc 9d 82 02 <8b> 40 1c c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f
<4>[  348.816103] RSP: e02b:ffffc90042537d08 EFLAGS: 00010246
<4>[  348.816105] RAX: 0000000000000000 RBX: 0000000000000003 RCX: 20c49ba5e353f7cf
<4>[  348.816106] RDX: 000000000000cd19 RSI: 000000000002ee9a RDI: 002a051ed42d7694
dmesg-efi-165681053802002:
Panic#2 Part2
<4>[  348.816200] RIP: 0033:0x7cb49c1412f7
<4>[  348.816203] Code: 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
<4>[  348.816204] RSP: 002b:00007ffc125f63f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
<4>[  348.816206] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007cb49c1412f7
<4>[  348.816208] RDX: 0000000000000004 RSI: 00007ffc125f64e0 RDI: 0000000000000004
<4>[  348.816209] RBP: 00007ffc125f64e0 R08: 00005c83d772bca0 R09: 000000000000000d
<4>[  348.816210] R10: 00005c83d7727eb0 R11: 0000000000000246 R12: 0000000000000004
<4>[  348.816211] R13: 00005c83d77272d0 R14: 0000000000000004 R15: 00007cb49c213700
<4>[  348.816213]  </TASK>
<4>[  348.816214] Modules linked in: loop vfat fat snd_hda_codec_hdmi snd_sof_pci_intel_tgl snd_sof_intel_hda_common soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_soc_hdac_hda snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi soundwire_bus snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_soc_core snd_compress ac97_bus snd_pcm_dmaengine snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi iTCO_wdt intel_pmc_bxt ee1004 iTCO_vendor_support intel_rapl_msr snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device iwlwifi snd_pcm pcspkr joydev processor_thermal_device_pci_legacy processor_thermal_device snd_timer snd cfg80211 processor_thermal_rfim i2c_i801 processor_thermal_mbox i2c_smbus idma64 rfkill processor_thermal_rapl soundcore intel_rapl_common int340x_thermal_zone intel_soc_dts_iosf igen6_edac intel_hid intel_pmc_core intel_scu_pltdrv sparse_keymap fuse xenfs ip_tables dm_thin_pool
dmesg-efi-165681053802001:
Oops#1 Part2
<4>[  348.816108] RBP: 0000000000000003 R08: ffffc90042537ca0 R09: ffffffff82c5e468
<4>[  348.816110] R10: 0000000000007ff0 R11: 0000000000000000 R12: 0000000000000000
<4>[  348.816111] R13: fffffffffffffff2 R14: ffff88812206e6c0 R15: ffff88812206e6e0
<4>[  348.816121] FS:  00007cb49b01eb80(0000) GS:ffff888189400000(0000) knlGS:0000000000000000
<4>[  348.816123] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[  348.816124] CR2: 000000000000001c CR3: 000000012231a000 CR4: 0000000000050660
<4>[  348.816131] Call Trace:
<4>[  348.816133]  <TASK>
<4>[  348.816134]  acpi_pm_prepare+0x1a/0x50
<4>[  348.816141]  suspend_enter+0x94/0x360
<4>[  348.816146]  suspend_devices_and_enter+0x198/0x2b0
<4>[  348.816150]  enter_state+0x18d/0x1f5
<4>[  348.816155]  pm_suspend.cold+0x20/0x6b
<4>[  348.816159]  state_store+0x27/0x60
<4>[  348.816163]  kernfs_fop_write_iter+0x125/0x1c0
<4>[  348.816169]  new_sync_write+0x105/0x190
<4>[  348.816176]  vfs_write+0x211/0x2a0
<4>[  348.816180]  ksys_write+0x67/0xe0
<4>[  348.816183]  do_syscall_64+0x59/0x90
<4>[  348.816188]  ? do_syscall_64+0x69/0x90
<4>[  348.816192]  ? exc_page_fault+0x76/0x170
<4>[  348.816195]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
<4>[  348.816200] RIP: 0033:0x7cb49c1412f7
<4>[  348.816203] Code: 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
<4>[  348.816204] RSP: 002b:00007ffc125f63f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
<4>[  348.816206] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007cb49c1412f7
<4>[  348.816208] RDX: 0000000000000004 RSI: 00007ffc125f64e0 RDI: 0000000000000004
<4>[  348.816209] RBP: 00007ffc125f64e0 R08: 00005c83d772bca0 R09: 000000000000000d
dmesg-efi-165681053801002:
Panic#2 Part1
<4>[  348.816259]  dm_persistent_data dm_bio_prison dm_crypt i915 crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic drm_buddy nvme video wmi drm_display_helper nvme_core xhci_pci xhci_pci_renesas ghash_clmulni_intel hid_multitouch sha512_ssse3 serio_raw nvme_common cec xhci_hcd ttm i2c_hid_acpi i2c_hid pinctrl_tigerlake xen_acpi_processor xen_privcmd xen_pciback xen_blkback xen_gntalloc xen_gntdev xen_evtchn uinput
<4>[  348.816281] CR2: 000000000000001c
<4>[  348.816283] ---[ end trace 0000000000000000 ]---
<4>[  348.867991] RIP: e030:acpi_get_wakeup_address+0xc/0x20
<4>[  348.867996] Code: 44 00 00 48 8b 05 04 a3 82 02 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 48 8b 05 fc 9d 82 02 <8b> 40 1c c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f
<4>[  348.867998] RSP: e02b:ffffc90042537d08 EFLAGS: 00010246
<4>[  348.867999] RAX: 0000000000000000 RBX: 0000000000000003 RCX: 20c49ba5e353f7cf
<4>[  348.868000] RDX: 000000000000cd19 RSI: 000000000002ee9a RDI: 002a051ed42d7694
<4>[  348.868001] RBP: 0000000000000003 R08: ffffc90042537ca0 R09: ffffffff82c5e468
<4>[  348.868001] R10: 0000000000007ff0 R11: 0000000000000000 R12: 0000000000000000
<4>[  348.868002] R13: fffffffffffffff2 R14: ffff88812206e6c0 R15: ffff88812206e6e0
<4>[  348.868008] FS:  00007cb49b01eb80(0000) GS:ffff888189400000(0000) knlGS:0000000000000000
<4>[  348.868009] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[  348.868009] CR2: 000000000000001c CR3: 000000012231a000 CR4: 0000000000050660
<0>[  348.868014] Kernel panic - not syncing: Fatal exception
<0>[  348.868031] Kernel Offset: disabled
dmesg-efi-165681053801001:
Oops#1 Part1
<4>[  348.816210] R10: 00005c83d7727eb0 R11: 0000000000000246 R12: 0000000000000004
<4>[  348.816211] R13: 00005c83d77272d0 R14: 0000000000000004 R15: 00007cb49c213700
<4>[  348.816213]  </TASK>
<4>[  348.816214] Modules linked in: loop vfat fat snd_hda_codec_hdmi snd_sof_pci_intel_tgl snd_sof_intel_hda_common soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_soc_hdac_hda snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi soundwire_bus snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_soc_core snd_compress ac97_bus snd_pcm_dmaengine snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi iTCO_wdt intel_pmc_bxt ee1004 iTCO_vendor_support intel_rapl_msr snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device iwlwifi snd_pcm pcspkr joydev processor_thermal_device_pci_legacy processor_thermal_device snd_timer snd cfg80211 processor_thermal_rfim i2c_i801 processor_thermal_mbox i2c_smbus idma64 rfkill processor_thermal_rapl soundcore intel_rapl_common int340x_thermal_zone intel_soc_dts_iosf igen6_edac intel_hid intel_pmc_core intel_scu_pltdrv sparse_keymap fuse xenfs ip_tables dm_thin_pool
<4>[  348.816259]  dm_persistent_data dm_bio_prison dm_crypt i915 crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic drm_buddy nvme video wmi drm_display_helper nvme_core xhci_pci xhci_pci_renesas ghash_clmulni_intel hid_multitouch sha512_ssse3 serio_raw nvme_common cec xhci_hcd ttm i2c_hid_acpi i2c_hid pinctrl_tigerlake xen_acpi_processor xen_privcmd xen_pciback xen_blkback xen_gntalloc xen_gntdev xen_evtchn uinput
<4>[  348.816281] CR2: 000000000000001c
<4>[  348.816283] ---[ end trace 0000000000000000 ]---
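
A note for anyone reading the dump above. pstore appears to have stored each record in fragments with the highest part number first, so Part4 holds the oldest lines and Part1 the newest. The oops itself can be cross-checked by hand: the byte marked `<8b>` in the `Code:` line starts the faulting instruction, and `8b 40 1c` decodes as a 32-bit load from `RAX + 0x1c`. With RAX at zero, that gives exactly the reported fault address. The sketch below redoes that arithmetic on three lines copied from the log. The helper names are made up for illustration, and the decoding only handles this one instruction form (opcode `8b`, 8-bit displacement, no REX prefix); it is not a general disassembler.

```python
import re

# Three lines copied verbatim from the oops above.
OOPS = """\
<1>[  348.816077] BUG: kernel NULL pointer dereference, address: 000000000000001c
<4>[  348.816100] Code: 44 00 00 48 8b 05 04 a3 82 02 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 48 8b 05 fc 9d 82 02 <8b> 40 1c c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f
<4>[  348.816105] RAX: 0000000000000000 RBX: 0000000000000003 RCX: 20c49ba5e353f7cf
"""


def fault_address(log):
    """Address reported on the 'NULL pointer dereference' line."""
    m = re.search(r"NULL pointer dereference, address: ([0-9a-f]+)", log)
    return int(m.group(1), 16)


def register(log, name):
    """Value of a 64-bit register from the first register dump in the log."""
    m = re.search(rf"\b{name}: ([0-9a-f]{{16}})", log)
    return int(m.group(1), 16)


def faulting_bytes(log, count=3):
    """First `count` bytes of the faulting instruction (marked with <..>)."""
    code = re.search(r"Code: (.*)", log).group(1).split()
    start = next(i for i, b in enumerate(code) if b.startswith("<"))
    return [int(b.strip("<>"), 16) for b in code[start:start + count]]


opcode, modrm, disp8 = faulting_bytes(OOPS)
mod, rm = modrm >> 6, modrm & 7

# 0x8b is a register load from memory; mod == 1 means an 8-bit displacement
# follows, and rm == 0 selects RAX as the base register (no REX prefix here).
assert opcode == 0x8B and mod == 1 and rm == 0

# The displacement is small and positive here, so no sign extension is needed.
effective = register(OOPS, "RAX") + disp8
print(hex(effective), effective == fault_address(OOPS))  # prints: 0x1c True
```

The instruction just before it (`48 8b 05 fc 9d 82 02`) is a RIP-relative load into RAX, so the NULL value looks like it came from a global pointer that was never set up in this configuration. Which pointer that is cannot be told from the log alone.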

@qubesos-bot

Automated announcement from builder-github

The component linux-kernel-latest (including package kernel-latest-6.1.5-1.fc32.qubes) has been pushed to the r4.1 testing repository for dom0.
To test this update, please install it with the following command:

sudo qubes-dom0-update --enablerepo=qubes-dom0-current-testing

Changes included in this update

andrewdavidwong added the diagnosed and pr submitted labels and removed the needs diagnosis label on Jan 14, 2023

@GWeck

GWeck commented Jan 14, 2023

I checked with the 6.1.5 kernel. No more crashes on Suspend and Resume - all o.k.

@qubesos-bot

Automated announcement from builder-github

The component linux-kernel-latest (including package kernel-latest-6.1.7-1.fc32.qubes) has been pushed to the r4.1 stable repository for dom0.
To install this update, please use the standard update command:

sudo qubes-dom0-update

Or update dom0 via Qubes Manager.

Changes included in this update

@qubesos-bot

Automated announcement from builder-github

The component linux-kernel (including package kernel-6.1.26-1.qubes.fc32) has been pushed to the r4.1 testing repository for dom0.
To test this update, please install it with the following command:

sudo qubes-dom0-update --enablerepo=qubes-dom0-current-testing

Changes included in this update

@qubesos-bot

Automated announcement from builder-github

The component linux-kernel (including package kernel-6.1.35-1.qubes.fc32) has been pushed to the r4.1 stable repository for dom0.
To install this update, please use the standard update command:

sudo qubes-dom0-update

Or update dom0 via Qubes Manager.

Changes included in this update

andrewdavidwong added the affects-4.1 label on Aug 8, 2023
4 participants