gui-agent: die when xorg fails to start #176

meithecatte · 2023-02-28T00:02:43Z

OpenQA test summary

Complete test suite and dependencies: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.2&build=2023030619-4.2&flavor=pull-requests

New failures, excluding unstable

Compared to: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.2&build=2023021823-4.2&flavor=update

system_tests_whonix
- whonix_torbrowser: unnamed test (unknown)
- whonix_torbrowser: Failed (test died)
  # Test died: no candidate needle with tag(s) 'anon-whonix-tor-brows...
- whonix_torbrowser: unnamed test (unknown)
system_tests_basic_vm_qrexec_gui
- TC_03_QvmRevertTemplateChanges: test_000_revert_linux (error)
  qubes.exc.QubesMemoryError: Not enough memory to start domain 'test...
- TC_30_Gui_daemon: test_000_clipboard (error)
  qubes.exc.QubesMemoryError: Not enough memory to start domain 'test...
- TC_00_AppVM_debian-11: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_debian-11: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_fedora-37: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_fedora-37: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_whonix-ws-16: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_whonix-ws-16: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
system_tests_network
- VmNetworking_debian-11: test_100_late_xldevd_startup (error)
  raise exceptions.TimeoutError() from exc... TimeoutError
- VmNetworking_debian-11: test_204_fake_ip_proxy (failure)
  self.assertEqual(self.run_cmd(self.proxy, se... AssertionError: 1 != 0
system_tests_pvgrub_salt_storage
- TC_41_HVMGrub_debian-11: test_010_template_based_vm (error)
  qubes.exc.QubesMemoryError: Not enough memory to start domain 'test...
system_tests_splitgpg
- TC_10_Thunderbird_debian-11: test_000_send_receive_default (failure + cleanup)
  dogtail.tree.SearchError: child of [desktop frame | main]: "Thunder...
- TC_10_Thunderbird_fedora-37: test_000_send_receive_default (failure + cleanup)
  dogtail.tree.SearchError: child of [desktop frame | main]: "Thunder...
system_tests_guivm_gui_interactive
- update_guivm: Failed (test died)
  # Test died: command '(set -o pipefail; qubesctl --all --show-outpu...
system_tests_usbproxy
- TC_20_USBProxy_core3_whonix-ws-16: test_030_detach (failure)
  AssertionError: <AppVM at 0x706a3cbdce90 name='test-inst-frontend' ...
system_tests_network_ipv6
- VmIPv6Networking_debian-11: test_203_fake_ip_inter_vm_allow (error)
  raise exceptions.TimeoutError() from exc... TimeoutError
- VmIPv6Networking_debian-11: test_211_custom_ip_proxy (error)
  raise exceptions.TimeoutError() from exc... TimeoutError
system_tests_network_updates
- VmUpdates_debian-11: test_110_update_via_proxy_qubes_vm_update (failure)
  ^^^^^^^^^^^^^... StopIteration
- VmUpdates_debian-11: test_111_update_via_proxy_qubes_vm_update_cli (failure)
  ^^^^^^^^^^^^^... StopIteration
- VmUpdates_debian-11: test_120_updates_available_notification_qubes_vm_update (failure)
  ^^^^^^^^^^^^^... StopIteration
- VmUpdates_debian-11: test_121_updates_available_notification_qubes_vm_update_cli (failure)
  ^^^^^^^^^^^^^... StopIteration
- VmUpdates_fedora-37: test_110_update_via_proxy_qubes_vm_update (failure)
  ^^^^^^^^^^^^^... StopIteration
- VmUpdates_fedora-37: test_111_update_via_proxy_qubes_vm_update_cli (failure)
  ^^^^^^^^^^^^^... StopIteration
- VmUpdates_fedora-37: test_120_updates_available_notification_qubes_vm_update (failure)
  ^^^^^^^^^^^^^... StopIteration
- VmUpdates_fedora-37: test_121_updates_available_notification_qubes_vm_update_cli (failure)
  ^^^^^^^^^^^^^... StopIteration
system_tests_dispvm
- TC_20_DispVM_fedora-37: test_100_open_in_dispvm (failure)
  self.assertEqual(test_txt_content.s... AssertionError: b'' != b'test1'
system_tests_qwt_win7@hw1
- windows_install: Failed (test died)
  # Test died: command './install.sh' failed at /usr/lib/os-autoinst/...
system_tests_basic_vm_qrexec_gui_zfs
- TC_00_Basic: test_202_udev_block_exclude_default (failure)
  AssertionError: '7d60680b-393b-4e2e-858a-8e58f358ffbb' unexpectedly...
- TC_03_QvmRevertTemplateChanges: test_000_revert_linux (failure)
  AssertionError: '583166f7a65890adbad26952ed8782b595cb3b8c' != 'd332...
- TC_05_StandaloneVM_debian-11-pool: test_101_resize_root_img_online (failure)
  AssertionError: libvirt event impl drain timeout
- TC_00_AppVM_debian-11-pool: test_130_qrexec_filemove_disk_full (failure)
  AssertionError: libvirt event impl drain timeout
- TC_00_AppVM_debian-11-pool: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_debian-11-pool: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_fedora-37-pool: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_fedora-37-pool: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_whonix-gw-16-pool: test_105_qrexec_filemove (error)
  qubes.exc.QubesVMError: Cannot connect to qrexec agent for 90 secon...
- TC_00_AppVM_whonix-gw-16-pool: test_300_bug_1028_gui_memory_pinning (failure)
  AssertionError: Dom0 window doesn't match VM window content
- TC_00_AppVM_whonix-ws-16-pool: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_whonix-ws-16-pool: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
system_tests_basic_vm_qrexec_gui_btrfs
- TC_00_AppVM_debian-11-pool: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_debian-11-pool: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_fedora-37-pool: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_fedora-37-pool: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_whonix-ws-16-pool: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_whonix-ws-16-pool: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
system_tests_basic_vm_qrexec_gui_ext4
- TC_00_AppVM_debian-11-pool: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_debian-11-pool: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_fedora-37-pool: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_fedora-37-pool: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_whonix-ws-16-pool: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_whonix-ws-16-pool: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
system_tests_basic_vm_qrexec_gui_xfs
- TC_00_AppVM_debian-11-pool: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_debian-11-pool: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_fedora-37-pool: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_fedora-37-pool: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_whonix-ws-16-pool: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_whonix-ws-16-pool: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
system_tests_basic_vm_qrexec_gui@hw1
- TC_00_AppVM_debian-11: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_debian-11: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_fedora-37: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_fedora-37: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_whonix-ws-16: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_whonix-ws-16: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
system_tests_gui_tools@hw1
- qubesmanager_backuprestore: unnamed test (unknown)
- qubesmanager_backuprestore: Failed (test died)
  # Test died: no candidate needle with tag(s) 'qubes-backup' matched...
system_tests_gui_tools
- qubesmanager_backuprestore: unnamed test (unknown)
- qubesmanager_backuprestore: Failed (test died)
  # Test died: no candidate needle with tag(s) 'restore-success' matc...

Failed tests

73 failures

system_tests_whonix
- whonix_torbrowser: unnamed test (unknown)
- whonix_torbrowser: Failed (test died)
  # Test died: no candidate needle with tag(s) 'anon-whonix-tor-brows...
- whonix_torbrowser: unnamed test (unknown)
system_tests_basic_vm_qrexec_gui
- TC_03_QvmRevertTemplateChanges: test_000_revert_linux (error)
  qubes.exc.QubesMemoryError: Not enough memory to start domain 'test...
- TC_30_Gui_daemon: test_000_clipboard (error)
  qubes.exc.QubesMemoryError: Not enough memory to start domain 'test...
- TC_00_AppVM_debian-11: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_debian-11: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_fedora-37: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_fedora-37: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_whonix-ws-16: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_whonix-ws-16: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
system_tests_network
- VmNetworking_debian-11: test_100_late_xldevd_startup (error)
  raise exceptions.TimeoutError() from exc... TimeoutError
- VmNetworking_debian-11: test_204_fake_ip_proxy (failure)
  self.assertEqual(self.run_cmd(self.proxy, se... AssertionError: 1 != 0
system_tests_pvgrub_salt_storage
- TC_41_HVMGrub_debian-11: test_010_template_based_vm (error)
  qubes.exc.QubesMemoryError: Not enough memory to start domain 'test...
system_tests_splitgpg
- TC_10_Thunderbird_debian-11: test_000_send_receive_default (failure + cleanup)
  dogtail.tree.SearchError: child of [desktop frame | main]: "Thunder...
- TC_10_Thunderbird_fedora-37: test_000_send_receive_default (failure + cleanup)
  dogtail.tree.SearchError: child of [desktop frame | main]: "Thunder...
- TC_10_Thunderbird_whonix-ws-16: test_020_send_receive_inline_with_attachment (failure)
  dogtail.tree.SearchError: descendent of [application | Thunderbird]...
system_tests_guivm_gui_interactive
- update_guivm: Failed (test died)
  # Test died: command '(set -o pipefail; qubesctl --all --show-outpu...
system_tests_usbproxy
- TC_20_USBProxy_core3_whonix-ws-16: test_030_detach (failure)
  AssertionError: <AppVM at 0x706a3cbdce90 name='test-inst-frontend' ...
system_tests_network_ipv6
- VmIPv6Networking_debian-11: test_203_fake_ip_inter_vm_allow (error)
  raise exceptions.TimeoutError() from exc... TimeoutError
- VmIPv6Networking_debian-11: test_211_custom_ip_proxy (error)
  raise exceptions.TimeoutError() from exc... TimeoutError
system_tests_network_updates
- VmUpdates_debian-11: test_110_update_via_proxy_qubes_vm_update (failure)
  ^^^^^^^^^^^^^... StopIteration
- VmUpdates_debian-11: test_111_update_via_proxy_qubes_vm_update_cli (failure)
  ^^^^^^^^^^^^^... StopIteration
- VmUpdates_debian-11: test_120_updates_available_notification_qubes_vm_update (failure)
  ^^^^^^^^^^^^^... StopIteration
- VmUpdates_debian-11: test_121_updates_available_notification_qubes_vm_update_cli (failure)
  ^^^^^^^^^^^^^... StopIteration
- VmUpdates_fedora-37: test_110_update_via_proxy_qubes_vm_update (failure)
  ^^^^^^^^^^^^^... StopIteration
- VmUpdates_fedora-37: test_111_update_via_proxy_qubes_vm_update_cli (failure)
  ^^^^^^^^^^^^^... StopIteration
- VmUpdates_fedora-37: test_120_updates_available_notification_qubes_vm_update (failure)
  ^^^^^^^^^^^^^... StopIteration
- VmUpdates_fedora-37: test_121_updates_available_notification_qubes_vm_update_cli (failure)
  ^^^^^^^^^^^^^... StopIteration
system_tests_dispvm
- TC_20_DispVM_fedora-37: test_100_open_in_dispvm (failure)
  self.assertEqual(test_txt_content.s... AssertionError: b'' != b'test1'
- TC_20_DispVM_whonix-ws-16: test_100_open_in_dispvm (failure)
  AssertionError: libvirt event impl drain timeout
system_tests_qwt_win10@hw1
- windows_install: Failed (test died)
  # Test died: command './install.sh' failed at /usr/lib/os-autoinst/...
system_tests_qwt_win7@hw1
- windows_install: Failed (test died)
  # Test died: command './install.sh' failed at /usr/lib/os-autoinst/...
system_tests_basic_vm_qrexec_gui_zfs
- TC_00_Basic: test_202_udev_block_exclude_default (failure)
  AssertionError: '7d60680b-393b-4e2e-858a-8e58f358ffbb' unexpectedly...
- TC_03_QvmRevertTemplateChanges: test_000_revert_linux (failure)
  AssertionError: '583166f7a65890adbad26952ed8782b595cb3b8c' != 'd332...
- TC_05_StandaloneVM_debian-11-pool: test_101_resize_root_img_online (failure)
  AssertionError: libvirt event impl drain timeout
- TC_00_AppVM_debian-11-pool: test_130_qrexec_filemove_disk_full (failure)
  AssertionError: libvirt event impl drain timeout
- TC_00_AppVM_debian-11-pool: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_debian-11-pool: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_fedora-37-pool: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_fedora-37-pool: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_whonix-gw-16-pool: test_105_qrexec_filemove (error)
  qubes.exc.QubesVMError: Cannot connect to qrexec agent for 90 secon...
- TC_00_AppVM_whonix-gw-16-pool: test_300_bug_1028_gui_memory_pinning (failure)
  AssertionError: Dom0 window doesn't match VM window content
- TC_00_AppVM_whonix-ws-16-pool: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_whonix-ws-16-pool: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
system_tests_basic_vm_qrexec_gui_btrfs
- TC_00_AppVM_debian-11-pool: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_debian-11-pool: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_fedora-37-pool: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_fedora-37-pool: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_whonix-ws-16-pool: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_whonix-ws-16-pool: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
system_tests_basic_vm_qrexec_gui_ext4
- TC_00_AppVM_debian-11-pool: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_debian-11-pool: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_fedora-37-pool: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_fedora-37-pool: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_whonix-ws-16-pool: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_whonix-ws-16-pool: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
system_tests_basic_vm_qrexec_gui_xfs
- TC_00_AppVM_debian-11-pool: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_debian-11-pool: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_fedora-37-pool: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_fedora-37-pool: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_whonix-ws-16-pool: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_whonix-ws-16-pool: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
system_tests_basic_vm_qrexec_gui@hw1
- TC_00_AppVM_debian-11: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_debian-11: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_fedora-37: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_fedora-37: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_whonix-ws-16: test_220_audio_play (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
- TC_00_AppVM_whonix-ws-16: test_223_audio_play_hvm (error)
  subprocess.CalledProcessError: Command '['pkill', 'parecord']' retu...
system_tests_gui_tools@hw1
- qubesmanager_backuprestore: unnamed test (unknown)
- qubesmanager_backuprestore: Failed (test died)
  # Test died: no candidate needle with tag(s) 'qubes-backup' matched...
system_tests_gui_tools
- qubesmanager_backuprestore: unnamed test (unknown)
- qubesmanager_backuprestore: Failed (test died)
  # Test died: no candidate needle with tag(s) 'restore-success' matc...

Fixed failures

Compared to: https://openqa.qubes-os.org/tests/60652#dependencies

7 fixed

system_tests_network
- VmNetworking_debian-11: test_112_reattach_after_provider_shutdown (error + timed out)
  qubes.exc.QubesVMShutdownTimeoutError: Domain shutdown timed out: '...
system_tests_pvgrub_salt_storage
- StorageFile: test_001_non_volatile (error)
  subprocess.CalledProcessError: Command '/usr/lib/qubes/destroy-snap...
system_tests_network_ipv6
- VmIPv6Networking_debian-11: test_020_simple_proxyvm_nm (failure)
  AssertionError: 1 != 0 : nm-applet window not found
system_tests_network_updates
- TC_11_QvmTemplateMgmtVM_whonix-gw-16: test_000_template_list (failure)
  qvm-template: error: No matching templates to list
system_tests_qwt_win10@hw1
- windows_install: wait_serial (wait serial expected)
  # wait_serial expected: qr/Rt7qO-\d+-/...
system_tests_basic_vm_qrexec_gui@hw1
- TC_00_Basic: test_203_udev_block_exclude_varlibqubes (error)
  subprocess.CalledProcessError: Command '/usr/lib/qubes/destroy-snap...
- TC_00_Basic: test_204_udev_block_exclude_custom_file (error)
  subprocess.CalledProcessError: Command '/usr/lib/qubes/destroy-snap...

Unstable tests

system_tests_update
update/Failed (1/5 times with errors)
- job 55329 # Test died: command '(set -o pipefail; qubesctl --show-output stat...
system_tests_update@hw1
update/Failed (1/5 times with errors)
- job 55329 # Test died: command '(set -o pipefail; qubesctl --show-output stat...
system_tests_gui_tools@hw1
qubesmanager_vmsettings/ (1/2 times with errors)
- job 60669 None
qubesmanager_vmsettings/Failed (1/2 times with errors)
- job 60669 # Test died: no candidate needle with tag(s) 'vm-settings-devices-s...
system_tests_gui_tools
qubesmanager_vmsettings/ (1/2 times with errors)
- job 60669 None
qubesmanager_vmsettings/Failed (1/2 times with errors)
- job 60669 # Test died: no candidate needle with tag(s) 'vm-settings-devices-s...

DemiMarie · 2023-03-04T16:03:50Z

gui-agent/vmside.c

+static void handle_sigchld()
+{
+    fprintf(stderr, "Xorg died unexpectedly, exiting!\n");
+    exit(1);
+}
+


You can use pselect() or the self-pipe trick.
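For illustration, the self-pipe trick mentioned here could look roughly like the sketch below. It is not taken from vmside.c: `sigchld_pipe`, `on_sigchld`, `setup_sigchld_pipe` and `wait_for_child_exit` are hypothetical names, and a real agent would add the pipe's read end to its existing select() set rather than wait on it alone.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical names for illustration; not from vmside.c. */
static int sigchld_pipe[2] = { -1, -1 };

/* Async-signal-safe handler: only calls write() and preserves errno. */
static void on_sigchld(int signo)
{
    int saved_errno = errno;
    unsigned char byte = (unsigned char)signo;
    /* If the pipe is full, a notification is already pending, so a failed write is harmless. */
    ssize_t r = write(sigchld_pipe[1], &byte, 1);
    (void)r;
    errno = saved_errno;
}

/* Create a non-blocking, close-on-exec pipe and install the handler. Returns 0 or -1. */
static int setup_sigchld_pipe(void)
{
    struct sigaction sa;

    if (pipe(sigchld_pipe) < 0)
        return -1;
    for (int i = 0; i < 2; i++) {
        int flags = fcntl(sigchld_pipe[i], F_GETFL);
        if (flags < 0 || fcntl(sigchld_pipe[i], F_SETFL, flags | O_NONBLOCK) < 0)
            return -1;
        if (fcntl(sigchld_pipe[i], F_SETFD, FD_CLOEXEC) < 0)
            return -1;
    }
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigchld;
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGCHLD, &sa, NULL);
}

/* Wait up to timeout_ms for a child-exit notification.
 * Returns 1 if a child exited, 0 on timeout, -1 on error. */
static int wait_for_child_exit(int timeout_ms)
{
    for (;;) {
        fd_set rfds;
        struct timeval tv = { timeout_ms / 1000, (timeout_ms % 1000) * 1000 };
        unsigned char buf[16];
        int ret;

        FD_ZERO(&rfds);
        FD_SET(sigchld_pipe[0], &rfds);
        ret = select(sigchld_pipe[0] + 1, &rfds, NULL, NULL, &tv);
        if (ret < 0 && errno == EINTR)
            continue; /* interrupted by the signal itself; the byte is now in the pipe */
        if (ret <= 0)
            return ret;
        while (read(sigchld_pipe[0], buf, sizeof(buf)) > 0)
            ; /* drain all pending notifications */
        return 1;
    }
}
```

The handler does no real work, so everything that is not async-signal-safe (logging, cleanup, exit()) stays in the main loop. The limitation raised later in the thread still applies: the notification is only seen when the process gets back to select().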

DemiMarie · 2023-03-04T16:03:57Z

gui-agent/vmside.c

@@ -2255,6 +2261,7 @@ int main(int argc, char **argv)
    int wait_fds[2];

    parse_args(&g, argc, argv);
+    signal(SIGCHLD, handle_sigchld);


That should be fixed too.

meithecatte · 2023-03-04T21:22:02Z

Hmm, the approach of the signal handler just setting some kind of flag won't work, as in the motivating case, the signal gets delivered while the gui agent is stuck in mkghandles. And to be honest, I don't know how much I'd trust it to not get stuck in Xlib when Xorg dies, between the calls to select at which we could handle this.

I considered rewriting the signal handler for SIGCHLD to use signal-safe functions, but snprintf is not one of them, and using _exit isn't very nice either.
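As a rough sketch of what such a handler might look like when restricted to functions listed in signal-safety(7): write() replaces fprintf(), and _exit() replaces exit(). The names `die_on_sigchld` and `install_die_on_sigchld` are hypothetical, and this is not the code from the PR.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical handler that only uses async-signal-safe calls. */
static void die_on_sigchld(int signo)
{
    static const char msg[] = "Xorg died unexpectedly, exiting!\n";
    ssize_t r;

    (void)signo;
    r = write(STDERR_FILENO, msg, sizeof(msg) - 1);
    (void)r;
    /* _exit() skips atexit() handlers and stdio flushing, which is the
     * downside mentioned in the comment above. */
    _exit(1);
}

/* Install the handler with sigaction(). Returns 0 or -1. */
static int install_die_on_sigchld(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = die_on_sigchld;
    sa.sa_flags = SA_NOCLDSTOP;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGCHLD, &sa, NULL);
}
```

This exits even when the main thread is stuck in a blocking call, at the cost of an unclean shutdown.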

Another option would be to have the SIGCHLD handler attempt notifying us via a self-pipe, but also set an alarm in case that fails? And in that case, I guess we don't have to feel as bad about using _exit...
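A minimal sketch of that combination, assuming a hypothetical `notify_pipe` that the main loop polls and a hypothetical grace period: the handler notifies via the pipe and arms alarm(), so if the main loop never gets around to exiting, the default action of SIGALRM terminates the process.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical names; not from vmside.c. */
static int notify_pipe[2] = { -1, -1 };
static unsigned int grace_seconds = 5;

/* Both write() and alarm() are listed in signal-safety(7). */
static void on_sigchld_with_deadline(int signo)
{
    int saved_errno = errno;
    unsigned char byte = 1;
    ssize_t r;

    (void)signo;
    r = write(notify_pipe[1], &byte, 1); /* preferred path: main loop exits cleanly */
    (void)r;
    alarm(grace_seconds);                /* fallback: default SIGALRM action kills us */
    errno = saved_errno;
}

/* Create the pipe, make its write end non-blocking, install the handler. */
static int install_deadline_handler(unsigned int seconds)
{
    struct sigaction sa;
    int flags;

    if (pipe(notify_pipe) < 0)
        return -1;
    flags = fcntl(notify_pipe[1], F_GETFL);
    if (flags < 0 || fcntl(notify_pipe[1], F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;
    grace_seconds = seconds;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigchld_with_deadline;
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGCHLD, &sa, NULL);
}
```

With this arrangement the forced termination is only a last resort, and the parent sees it as death by SIGALRM rather than a normal exit status.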

And then there's the option of starting up another thread, just in case Xorg dies on us, and handling this there?

I don't think there is an elegant solution here. Unix was a mistake.

DemiMarie · 2023-03-05T02:49:59Z

> Hmm, the approach of the signal handler just setting some kind of flag won't work, as in the motivating case, the signal gets delivered while the gui agent is stuck in mkghandles. And to be honest, I don't know how much I'd trust it to not get stuck in Xlib when Xorg dies, between the calls to select at which we could handle this.

Xorg exiting should cause I/O to fail, so if Xlib hangs that is an Xlib bug. Might be better to port the whole thing to XCB, which should at least be somewhat predictable.

> I considered rewriting the signal handler for SIGCHLD to use signal-safe functions, but snprintf is not one of them, and using _exit isn't very nice either.

It gets worse: due to race conditions, it isn’t safe to exit until all MSG_WINDOW_DUMP messages have been acknowledged. So the event loop needs to keep running a bit longer.

> Another option would be to have the SIGCHLD handler attempt notifying us via a self-pipe, but also set an alarm in case that fails? And in that case, I guess we don't have to feel as bad about using _exit...

Self-pipe is safe, alarm isn’t.

> And then there's the option of starting up another thread, just in case Xorg dies on us, and handling this there?

Does the agent call fork() and then do anything that isn’t async-signal-safe before execve()? If so, that will need to be dealt with first.

> I don't think there is an elegant solution here. Unix was a mistake.

Yes, it was.

meithecatte · 2023-03-05T13:38:05Z

> Self-pipe is safe, alarm isn’t.

alarm(2) is listed in signal-safety(7), though?

> Does the agent call fork() and then do anything that isn’t async-signal-safe before execve()? If so, that will need to be dealt with first.

Could you expand on this? I don't really understand why.

> It gets worse: due to race conditions, it isn’t safe to exit until all MSG_WINDOW_DUMP messages have been acknowledged. So the event loop needs to keep running a bit longer.

What happens when we exit too early? I don't think we can rely on this either way, as this is happening across a security boundary.

> > I don't think there is an elegant solution here. Unix was a mistake.

> Yes, it was.

I'm glad we agree.

DemiMarie · 2023-03-05T14:44:20Z

> > Self-pipe is safe, alarm isn’t.

> alarm(2) is listed in signal-safety(7), though?

See below. The tl;dr is that we really do not want to exit uncleanly.

> > Does the agent call fork() and then do anything that isn’t async-signal-safe before execve()? If so, that will need to be dealt with first.

> Could you expand on this? I don't really understand why.

fork() interacts badly with locks. Any locks that were held in the parent process will remain held in the child process, but the threads that would unlock them do not exist in the child. Since e.g. malloc() uses locks internally, this could lead to a deadlock in the child process. Therefore, POSIX states that if the parent process is multi-threaded, the child process after fork() is only allowed to use async-signal-safe interfaces until execve(). Under glibc, one can generally get away with malloc(), but it’s best not to rely on that.
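To make the rule concrete, here is a minimal sketch of a spawn helper whose child branch only uses calls listed in signal-safety(7) between fork() and exec. The helper name `spawn` is hypothetical and this is not how the agent actually starts Xorg; anything that needs malloc() or stdio is assumed to have been prepared by the caller beforehand.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] (an absolute path) with the given arguments.
 * Returns the child's PID, or -1 if fork() failed. */
static pid_t spawn(char *const argv[])
{
    pid_t pid = fork();

    if (pid == 0) {
        /* Child: no malloc(), no stdio, no locks. execv(), write() and
         * _exit() are all async-signal-safe. */
        static const char msg[] = "exec failed\n";
        ssize_t r;

        execv(argv[0], argv);
        r = write(STDERR_FILENO, msg, sizeof(msg) - 1);
        (void)r;
        _exit(127);
    }
    return pid;
}
```

Note that execv() is used rather than execvp(), since only the former appears on the async-signal-safe list.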

> > It gets worse: due to race conditions, it isn’t safe to exit until all MSG_WINDOW_DUMP messages have been acknowledged. So the event loop needs to keep running a bit longer.

> What happens when we exit too early? I don't think we can rely on this either way, as this is happening across a security boundary.

See this GUI daemon commit, which will (hopefully) be merged eventually. In short, while there is no security problem, it could cause a guest-wide hang or loss of network connectivity. Ideally, the agent would go even further, and wait for all windows to be unmapped on the agent side.

> > > I don't think there is an elegant solution here. Unix was a mistake.

> > Yes, it was.

> I'm glad we agree.

Unix got some things right, but a lot of things wrong.

marmarek · 2023-03-05T15:10:34Z

> Therefore, POSIX states that if the parent process is multi-threaded, the child process after fork() is only allowed to use async-signal-safe interfaces until execve().

Citation needed.

Anyway, that's one of the things such an application must take care of. More complex applications (which this isn't really) have APIs to register callbacks around fork (before, after in parent, after in child, etc.). Anyway, I don't think any of this applies here, because we aren't talking about interacting with things after fork() (but if we were, xenstore may need some special care, as it may use threads).

> See this GUI daemon commit, which will (hopefully) be merged eventually. In short, while there is no security problem, it could cause a guest-wide hang or loss of network connectivity. Ideally, the agent would go even further, and wait for all windows to be unmapped on the agent side.

But you do realize we are talking about handling premature Xorg exit here, right? At this point all grants are unmapped already. So, there is no point in complicating things in this case.

meithecatte · 2023-03-05T15:52:49Z

> But you do realize we are talking about handling premature Xorg exit here, right? At this point all grants are unmapped already. So, there is no point in complicating things in this case.

Say Xorg just randomly segfaults. It's a pile of C code, so it's not that far-fetched. What part of the system unmaps the grants in this case?

DemiMarie · 2023-03-05T16:23:50Z

> > See this GUI daemon commit, which will (hopefully) be merged eventually. In short, while there is no security problem, it could cause a guest-wide hang or loss of network connectivity. Ideally, the agent would go even further, and wait for all windows to be unmapped on the agent side.

> But you do realize we are talking about handling premature Xorg exit here, right? At this point all grants are unmapped already. So, there is no point in complicating things in this case.

Whoops! I forgot that the grants are mapped by Xorg, not the agent process.

marmarek · 2023-03-06T00:50:21Z

> What part of the system unmaps the grants in this case?

Kernel, on FD release.

meithecatte · 2023-03-07T02:41:44Z

I pushed a much larger diff that should handle all this properly.

DemiMarie

The check for the return code of

gui-agent/vmside.c

DemiMarie · 2023-03-07T02:46:24Z

gui-agent/vmside.c

+    if (atomic_load(&terminating)) {
+        exit(0);


Should the exit status really be zero here?

This happens if we get SIGTERM. Not sure if exit(0) is what we should do in this case, but it is what the previous code did. What would you suggest?

Not sure tbh.

Maybe we would want to reset the SIGTERM sigaction and send SIGTERM to ourselves, so that the parent gets told we were killed by SIGTERM? Seems like unnecessary complexity though, if I'm being honest.
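The "reset and re-raise" idea could be sketched as below. `exit_via_sigterm` is a hypothetical helper, not code from the PR, and the sketch assumes it is called from a single-threaded context after cleanup is done (sigprocmask() is unspecified in multi-threaded programs).

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: terminate so that the parent's wait status reports
 * "killed by SIGTERM" instead of a normal exit code. */
static void exit_via_sigterm(void)
{
    struct sigaction sa;
    sigset_t set;

    /* Restore the default disposition, which terminates the process. */
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = SIG_DFL;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGTERM, &sa, NULL);

    /* Make sure SIGTERM is not blocked, e.g. when called from its own handler path. */
    sigemptyset(&set);
    sigaddset(&set, SIGTERM);
    sigprocmask(SIG_UNBLOCK, &set, NULL);

    raise(SIGTERM);
    abort(); /* only reached if the signal could not be delivered */
}
```

A parent that inspects the status with WIFSIGNALED()/WTERMSIG() can then tell a requested termination apart from an exit(0).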

marmarek · 2023-03-08T01:40:11Z

Uhm, I blinked and suddenly gui-agent got another thread. I must admit I don't really like it... I'd much prefer the approach with a flag and checking it where necessary. In the case of waiting for Xorg startup, I guess that would be mostly wait_for_unix_socket() (currently it will exit(1) if accept fails, including the EINTR case).
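A sketch of the flag-based approach for the "waiting for Xorg" case, under stated assumptions: `child_exited`, `install_sigchld_flag` and `accept_unless_child_exited` are hypothetical names, and this is not the actual wait_for_unix_socket(). It uses pselect() so that SIGCHLD can only be delivered while the process is waiting, which closes the window between checking the flag and blocking.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>

/* Hypothetical flag set by the SIGCHLD handler. */
static volatile sig_atomic_t child_exited = 0;

static void note_sigchld(int signo)
{
    (void)signo;
    child_exited = 1;
}

static int install_sigchld_flag(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = note_sigchld;
    sa.sa_flags = SA_NOCLDSTOP;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGCHLD, &sa, NULL);
}

/* Returns a connected fd, -1 on error, or -2 if the child exited while waiting. */
static int accept_unless_child_exited(int listen_fd)
{
    sigset_t block, orig, wait_mask;
    int result;

    /* Block SIGCHLD so it cannot arrive between the flag check and pselect(). */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &block, &orig) < 0)
        return -1;
    wait_mask = orig;
    sigdelset(&wait_mask, SIGCHLD);

    for (;;) {
        fd_set rfds;
        int ready;

        if (child_exited) {
            result = -2;
            break;
        }
        FD_ZERO(&rfds);
        FD_SET(listen_fd, &rfds);
        /* SIGCHLD is unblocked only for the duration of this call. */
        ready = pselect(listen_fd + 1, &rfds, NULL, NULL, NULL, &wait_mask);
        if (ready < 0) {
            if (errno == EINTR)
                continue; /* re-check the flag */
            result = -1;
            break;
        }
        result = accept(listen_fd, NULL, NULL);
        break;
    }
    sigprocmask(SIG_SETMASK, &orig, NULL);
    return result;
}
```

The caller decides what to do with the -2 result, for example logging with fprintf() and calling exit(1) from normal context, which keeps the handler trivial and avoids a second thread.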

marmarek added the openqa-pending label Feb 28, 2023

DemiMarie suggested changes Feb 28, 2023

View reviewed changes

DemiMarie suggested changes Mar 4, 2023

View reviewed changes

gui-agent: die when xorg fails to start

328dada

meithecatte force-pushed the handle-xorg-crash branch from f640072 to 328dada Compare March 7, 2023 02:30

DemiMarie reviewed Mar 7, 2023

View reviewed changes

fixup! gui-agent: die when xorg fails to start

d29eb36

marmarek removed the openqa-pending label Mar 8, 2023

marmarek mentioned this pull request Dec 2, 2024

Improve handling early startup issues #219

Merged

Suggested change:

-static void handle_sigchld()
+static void handle_sigchld(void)
