-
Couldn't load subscription status.
- Fork 293
PR1007 #4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PR1007 #4
Conversation
|
I just rebased this branch to separate the changes into more logical blocks. I might follow this up with a big whitespace change soon. Sorry to Vincent for blowing away his previous comments in the rebase. |
|
Thanks for rebasing into logical commits. Please only do rebases when your branch isn't merged into the xen-org repository, yet. |
|
Mike: could you add "Signed-off-by: ..." lines to the commit comments? (I hope this is easy to do retrospectively) |
|
That's easy to do. But do we need them? The commits do have authors that git knows about. A reviewed-by line may be useful. But I don't know where we should but it. Perhaps in the merge commits? Or in the pull-request discussion? (Or in the individual commits?) |
|
Dave, signed-off-by lines have been added. FYI, doing a 'git commit --signoff' will give you signed-off-by lines automatically. |
…ent network" since this (a) is needed by other stuff; and (b) happens to cause VM.start to fail if any other VIF exists. Signed-off-by: David Scott <dave.scott@eu.citrix.com>
…ement-interface breaks and none of the dom0 interfaces are brought up so ssh doesn't work. After 'xe host-management-disable' /etc/xensource-inventory:MANAGEMENT_INTERFACE='' Signed-off-by: David Scott <dave.scott@eu.citrix.com>
…t 'n' refers to a plug ordering. The previous meaning was 'PCI bus device' number which was only meaningful for PV guests (and of unknown use). The HVM PCI hotplug protocol doesn't seem to support this anyway. If we need it again, we can tweak the syntax. Signed-off-by: David Scott <dave.scott@eu.citrix.com>
…onstant bridge name, so it can be referenced from global firewall rules. Signed-off-by: David Scott <dave.scott@eu.citrix.com>
…use Xc.physinfo and max_cpu_id + 1 Signed-off-by: David Scott <dave.scott@eu.citrix.com>
This fixes the converse of the problem addressed by commit fe19e820 "xenserver: Kill bond master's dhclient when bringing up bond slave". In that commit's log message, I claimed that the converse was not a problem, but I was wrong. I must have screwed up in testing, because it really is a problem. This commit fixes it. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Ian Campbell <Ian.Campbell@citrix.com> CC: Dominic Curran <dominic.curran@citrix.com> Reported-by: Michael Mao <mmao@nicira.com> Bug xapi-project#2668.
This patch defensively guarantees that the first id in xs-network-uuids will belong to the primary network (as opposed to a vlan). Given that the primary network id comes first, it parses xs-network-ids and only copies the primary id to bridge-id when monitor-external-ids is run. Feature xapi-project#3647 Signed-off-by: Ethan Jackson <ethan@nicira.com> Reviewed-by: Ben Pfaff <blp@nicira.com> Signed-off-by: Dominic Curran <dominic.curran@citrix.com>
This commit adds bond_mode configuration to Interface Reconfigure so that it may be changed using standard XenServer commands. Signed-off-by: Ethan Jackson <ethan@nicira.com> Signed-off-by: Dominic Curran <dominic.curran@citrix.com>
This commit makes necessary changes to Interface Reconfigure to allow miimon bond-detect-mode and bond-miimon-interval to be changed using xapi. Signed-off-by: Ethan Jackson <ethan@nicira.com> Signed-off-by: Dominic Curran <dominic.curran@citrix.com>
…nto ovs-2011-02-14
Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Due to some recent changes in the database backend, some fields which used to get computed dynamically on the fly are not cached. Hence when reverting a VM to a snapshot/checkpoint, we should avoid copying some extra fields from the snapshot metadata. Signed-off-by: Zheng Li <zheng.li@eu.citrix.com>
Based on draft by Lars Kurth. Signed-off-by: Matthias Goergens <matthias.goergens@gmail.com>
…into mpg-readme
When saving the database cache, InterfaceReconfigure can crash if unexpected attributes are in an object's other-config column. This commit causes it to skip that attribute and log a warning. Signed-off-by: Ethan Jackson <ethan@nicira.com> CC: Ian Campbell <Ian.Campbell@citrix.com> CC: Dominic Curran <Dominic.Curran@citrix.com> Signed-off-by: Dominic Curran <dominic.curran@citrix.com>
This commit allows xapi to set the fail mode through the vswitch-controller-fail-mode other-config setting in the Pool object. Signed-off-by: Ethan Jackson <ethan@nicira.com> Signed-off-by: Dominic Curran <dominic.curran@citrix.com>
InterfaceReconfigure assumes the pool is in the configuration cache. This is always true except when upgrading from an older version of openvswitch which does record the pool. This can cause upgrades to lock up. Signed-off-by: Ethan Jackson <ethan@nicira.com> Signed-off-by: Dominic Curran <dominic.curran@citrix.com>
Interface reconfigure can crash when setting fail_mode if an expected other_config setting is not set. Signed-off-by: Dominic Curran <dominic.curran@citrix.com>
This patch allows the Pool object's vswitch-controller-fail-mode setting to be overridden on by Network object. Signed-off-by: Ethan Jackson <ethan@nicira.com> Signed-off-by: Dominic Curran <dominic.curran@citrix.com>
Makes required changes to interface reconfigure to allow LACP configuration from xapi. Conforms to XenServer style bonding configuration which is slightly different from OVS. Signed-off-by: Ethan Jackson <ethan@nicira.com> Bug xapi-project#4213. Signed-off-by: Dominic Curran <dominic.curran@citrix.com>
Commit 404c169247b1c3ac2ebad887f0421478a6cef924 breaks compatibility with XenServers older than 5.6 FP1. This commit removes the last vestiges of support for those older XenServer versions. Signed-off-by: Justin Pettit <jpettit@nicira.com> Signed-off-by: Dominic Curran <dominic.curran@citrix.com>
Remove extra and legacy args from the bootloader invocation
Signed-off-by: Si Beaumont <simon.beaumont@citrix.com>
CP-5500: Removed Powershell snapin installer; ship it as a module instea...
Fix the autogenerated OASIS files
Remove almost everything from .gitignore (yay obuild!)
CA-121668: Move plugins to init level 23
Interface tidyup and documentation
Do not use 'finally' from package xapi-stdext-pervasives.
Update to new xentore interface in v1.2.3
Open VHD files RO where possible
xapi-project:master catching up with upstream djs55:master
0.9.2 release
Remove autogenerated setup.ml and introduce ocaml .gitignore
CP-13874: pass in `name_label` and `name_description`
CA-322784 Fix varstored-guard logging is verbose
RFC: move Xenstore_watch into ezxenstore
Using an off-cpu flamegraph I identified that concurrent PAM calls are slow due to a call to `sleep(1)`. `pam_authenticate` calls `crypt_r` which calls `NSSLOW_Init` which on first use will try to initialize the just `dlopen`-ed library. If it encounters a race condition it does a `sleep(1)`. This race condition can be quite reliably reproduced when performing a lot of PAM authentications from multiple threads in parallel. GDB can also be used to confirm this by putting a breakpoint on `sleep`: ``` #0 __sleep (seconds=seconds@entry=1) at ../sysdeps/unix/sysv/linux/sleep.c:42 #1 0x00007ffff1548e22 in freebl_RunLoaderOnce () at lowhash_vector.c:122 #2 0x00007ffff1548f31 in freebl_InitVector () at lowhash_vector.c:131 #3 NSSLOW_Init () at lowhash_vector.c:148 #4 0x00007ffff1b8f09a in __sha512_crypt_r (key=key@entry=0x7fffd8005a60 "pamtest-edvint", salt=0x7ffff31e17b8 "dIJbsXKc0", xapi-project#5 0x00007ffff1b8d070 in __crypt_r (key=key@entry=0x7fffd8005a60 "pamtest-edvint", salt=<optimized out>, xapi-project#6 0x00007ffff1dc9abc in verify_pwd_hash (p=p@entry=0x7fffd8005a60 "pamtest-edvint", hash=<optimized out>, nullok=nullok@entry=0) at passverify.c:111 xapi-project#7 0x00007ffff1dc9139 in _unix_verify_password (pamh=pamh@entry=0x7fffd8002910, name=0x7fffd8002ab0 "pamtest-edvint", p=0x7fffd8005a60 "pamtest-edvint", ctrl=ctrl@entry=8389156) at support.c:777 xapi-project#8 0x00007ffff1dc6556 in pam_sm_authenticate (pamh=0x7fffd8002910, flags=<optimized out>, argc=<optimized out>, argv=<optimized out>) at pam_unix_auth.c:178 xapi-project#9 0x00007ffff7bcef1a in _pam_dispatch_aux (use_cached_chain=<optimized out>, resumed=<optimized out>, h=<optimized out>, flags=1, pamh=0x7fffd8002910) at pam_dispatch.c:110 xapi-project#10 _pam_dispatch (pamh=pamh@entry=0x7fffd8002910, flags=1, choice=choice@entry=1) at pam_dispatch.c:426 xapi-project#11 0x00007ffff7bce7e0 in pam_authenticate (pamh=0x7fffd8002910, flags=flags@entry=1) at pam_auth.c:34 xapi-project#12 0x00000000005ae567 in XA_mh_authorize (username=username@entry=0x7fffd80028d0 "pamtest-edvint", password=password@entry=0x7fffd80028f0 "pamtest-edvint", error=error@entry=0x7ffff31e1be8) at xa_auth.c:83 xapi-project#13 0x00000000005adf20 in stub_XA_mh_authorize (username=<optimized out>, password=<optimized out>) at xa_auth_stubs.c:42 xapi-project#14 0x00000000004a0a6a in camlDune__exe__Bench_pam__pam_authenticate$27_320 () at ocaml/tests/bench/pam/bench_pam.ml:63 xapi-project#15 0x00000000004a1113 in camlEzbechamel_concurrent__worker_loop_359 () at ocaml/tests/bench/lib/concurrent/ezbechamel_concurrent.ml:36 xapi-project#16 0x00000000005935b9 in camlStdlib__Fun__protect_317 () xapi-project#17 0x00000000004a1955 in camlThread__fun_850 () xapi-project#18 0x00000000005d6401 in caml_start_program () xapi-project#19 0x00000000005cd0fd in caml_callback_exn () xapi-project#20 0x00000000005af810 in caml_thread_start () xapi-project#21 0x00007ffff79b7e25 in start_thread (arg=0x7ffff31e2700) at pthread_create.c:308 xapi-project#22 0x00007ffff71dbbad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113 ``` `pam_start` and `pam_end` doesn't help here, because on `pam_end` the library is `dlclose`-ed, so on next `pam_authenticate` it will have to go through the initialization code again. (This initialization code would've belonged into `pam_start`, not `pam_authenticate`, but there are several layers here including a call to `crypt_r`). To avoid this link with `libcrypt` and call `crypt_r` once ourselves (and ensure it loads `libfreeblpriv3` by using the sha512 prefix). That way the library will stay loaded (we'll hold a reference count on it), and the `dlclose` done by PAM won't unload it. Confirmed that there are no `sleep` calls now, and the results are also visible when running the benchmark targeted to the with and without fix code: ``` ╭─────────────────────────────────────────────────┬───────────────────────────┬───────────────────────────┬───────────────────────────╮ │name │ major-allocated │ minor-allocated │ monotonic-clock │ ├─────────────────────────────────────────────────┼───────────────────────────┼───────────────────────────┼───────────────────────────┤ │ concurrent authenticate (sleep fix, actual):8 │ 0.0000 mjw/run│ 50.0000 mnw/run│ 27043467.0000 ns/run│ ╰─────────────────────────────────────────────────┴───────────────────────────┴───────────────────────────┴───────────────────────────╯ ╭────────────────────────────────────────┬───────────────────────────┬───────────────────────────┬───────────────────────────╮ │name │ major-allocated │ minor-allocated │ monotonic-clock │ ├────────────────────────────────────────┼───────────────────────────┼───────────────────────────┼───────────────────────────┤ │ concurrent authenticate (no reuse):8 │ 0.0000 mjw/run│ 50.0000 mnw/run│ 1029831372.0000 ns/run│ ╰────────────────────────────────────────┴───────────────────────────┴───────────────────────────┴───────────────────────────╯ ``` Without this fix using 2 threads to perform PAM authentication would result in a 38x slowdown compared to using no threads at all (which is what XAPI currently does). Signed-off-by: Edwin Török <edwin.torok@cloud.com>
This enables PAM to be used in multithreaded mode (currently XAPI has a global lock around auth). Using an off-cpu flamegraph I identified that concurrent PAM calls are slow due to a call to `sleep(1)`. `pam_authenticate` calls `crypt_r` which calls `NSSLOW_Init` which on first use will try to initialize the just `dlopen`-ed library. If it encounters a race condition it does a `sleep(1)`. This race condition can be quite reliably reproduced when performing a lot of PAM authentications from multiple threads in parallel. GDB can also be used to confirm this by putting a breakpoint on `sleep`: ``` #0 __sleep (seconds=seconds@entry=1) at ../sysdeps/unix/sysv/linux/sleep.c:42 #1 0x00007ffff1548e22 in freebl_RunLoaderOnce () at lowhash_vector.c:122 #2 0x00007ffff1548f31 in freebl_InitVector () at lowhash_vector.c:131 #3 NSSLOW_Init () at lowhash_vector.c:148 #4 0x00007ffff1b8f09a in __sha512_crypt_r (key=key@entry=0x7fffd8005a60 "pamtest-edvint", salt=0x7ffff31e17b8 "dIJbsXKc0", xapi-project#5 0x00007ffff1b8d070 in __crypt_r (key=key@entry=0x7fffd8005a60 "pamtest-edvint", salt=<optimized out>, xapi-project#6 0x00007ffff1dc9abc in verify_pwd_hash (p=p@entry=0x7fffd8005a60 "pamtest-edvint", hash=<optimized out>, nullok=nullok@entry=0) at passverify.c:111 xapi-project#7 0x00007ffff1dc9139 in _unix_verify_password (pamh=pamh@entry=0x7fffd8002910, name=0x7fffd8002ab0 "pamtest-edvint", p=0x7fffd8005a60 "pamtest-edvint", ctrl=ctrl@entry=8389156) at support.c:777 xapi-project#8 0x00007ffff1dc6556 in pam_sm_authenticate (pamh=0x7fffd8002910, flags=<optimized out>, argc=<optimized out>, argv=<optimized out>) at pam_unix_auth.c:178 xapi-project#9 0x00007ffff7bcef1a in _pam_dispatch_aux (use_cached_chain=<optimized out>, resumed=<optimized out>, h=<optimized out>, flags=1, pamh=0x7fffd8002910) at pam_dispatch.c:110 xapi-project#10 _pam_dispatch (pamh=pamh@entry=0x7fffd8002910, flags=1, choice=choice@entry=1) at pam_dispatch.c:426 xapi-project#11 0x00007ffff7bce7e0 in pam_authenticate (pamh=0x7fffd8002910, flags=flags@entry=1) at pam_auth.c:34 xapi-project#12 0x00000000005ae567 in XA_mh_authorize (username=username@entry=0x7fffd80028d0 "pamtest-edvint", password=password@entry=0x7fffd80028f0 "pamtest-edvint", error=error@entry=0x7ffff31e1be8) at xa_auth.c:83 xapi-project#13 0x00000000005adf20 in stub_XA_mh_authorize (username=<optimized out>, password=<optimized out>) at xa_auth_stubs.c:42 ``` `pam_start` and `pam_end` doesn't help here, because on `pam_end` the library is `dlclose`-ed, so on next `pam_authenticate` it will have to go through the initialization code again. (This initialization code would've belonged into `pam_start`, not `pam_authenticate`, but there are several layers here including a call to `crypt_r`). Upstream has fixed this problem >5 years ago by switching to libxcrypt instead. Signed-off-by: Edwin Török <edwin.torok@cloud.com>
This enables PAM to be used in multithreaded mode (currently XAPI has a global lock around auth). Using an off-cpu flamegraph I identified that concurrent PAM calls are slow due to a call to `sleep(1)`. `pam_authenticate` calls `crypt_r` which calls `NSSLOW_Init` which on first use will try to initialize the just `dlopen`-ed library. If it encounters a race condition it does a `sleep(1)`. This race condition can be quite reliably reproduced when performing a lot of PAM authentications from multiple threads in parallel. GDB can also be used to confirm this by putting a breakpoint on `sleep`: ``` #0 __sleep (seconds=seconds@entry=1) at ../sysdeps/unix/sysv/linux/sleep.c:42 #1 0x00007ffff1548e22 in freebl_RunLoaderOnce () at lowhash_vector.c:122 #2 0x00007ffff1548f31 in freebl_InitVector () at lowhash_vector.c:131 #3 NSSLOW_Init () at lowhash_vector.c:148 #4 0x00007ffff1b8f09a in __sha512_crypt_r (key=key@entry=0x7fffd8005a60 "pamtest-edvint", salt=0x7ffff31e17b8 "dIJbsXKc0", xapi-project#5 0x00007ffff1b8d070 in __crypt_r (key=key@entry=0x7fffd8005a60 "pamtest-edvint", salt=<optimized out>, xapi-project#6 0x00007ffff1dc9abc in verify_pwd_hash (p=p@entry=0x7fffd8005a60 "pamtest-edvint", hash=<optimized out>, nullok=nullok@entry=0) at passverify.c:111 xapi-project#7 0x00007ffff1dc9139 in _unix_verify_password (pamh=pamh@entry=0x7fffd8002910, name=0x7fffd8002ab0 "pamtest-edvint", p=0x7fffd8005a60 "pamtest-edvint", ctrl=ctrl@entry=8389156) at support.c:777 xapi-project#8 0x00007ffff1dc6556 in pam_sm_authenticate (pamh=0x7fffd8002910, flags=<optimized out>, argc=<optimized out>, argv=<optimized out>) at pam_unix_auth.c:178 xapi-project#9 0x00007ffff7bcef1a in _pam_dispatch_aux (use_cached_chain=<optimized out>, resumed=<optimized out>, h=<optimized out>, flags=1, pamh=0x7fffd8002910) at pam_dispatch.c:110 xapi-project#10 _pam_dispatch (pamh=pamh@entry=0x7fffd8002910, flags=1, choice=choice@entry=1) at pam_dispatch.c:426 xapi-project#11 0x00007ffff7bce7e0 in pam_authenticate (pamh=0x7fffd8002910, flags=flags@entry=1) at pam_auth.c:34 xapi-project#12 0x00000000005ae567 in XA_mh_authorize (username=username@entry=0x7fffd80028d0 "pamtest-edvint", password=password@entry=0x7fffd80028f0 "pamtest-edvint", error=error@entry=0x7ffff31e1be8) at xa_auth.c:83 xapi-project#13 0x00000000005adf20 in stub_XA_mh_authorize (username=<optimized out>, password=<optimized out>) at xa_auth_stubs.c:42 ``` `pam_start` and `pam_end` doesn't help here, because on `pam_end` the library is `dlclose`-ed, so on next `pam_authenticate` it will have to go through the initialization code again. (This initialization code would've belonged into `pam_start`, not `pam_authenticate`, but there are several layers here including a call to `crypt_r`). Upstream has fixed this problem >5 years ago by switching to libxcrypt instead. Signed-off-by: Edwin Török <edwin.torok@cloud.com>
This enables PAM to be used in multithreaded mode (currently XAPI has a global lock around auth). Using an off-cpu flamegraph I identified that concurrent PAM calls are slow due to a call to `sleep(1)`. `pam_authenticate` calls `crypt_r` which calls `NSSLOW_Init` which on first use will try to initialize the just `dlopen`-ed library. If it encounters a race condition it does a `sleep(1)`. This race condition can be quite reliably reproduced when performing a lot of PAM authentications from multiple threads in parallel. GDB can also be used to confirm this by putting a breakpoint on `sleep`: ``` #0 __sleep (seconds=seconds@entry=1) at ../sysdeps/unix/sysv/linux/sleep.c:42 #1 0x00007ffff1548e22 in freebl_RunLoaderOnce () at lowhash_vector.c:122 #2 0x00007ffff1548f31 in freebl_InitVector () at lowhash_vector.c:131 #3 NSSLOW_Init () at lowhash_vector.c:148 #4 0x00007ffff1b8f09a in __sha512_crypt_r (key=key@entry=0x7fffd8005a60 "pamtest-edvint", salt=0x7ffff31e17b8 "dIJbsXKc0", xapi-project#5 0x00007ffff1b8d070 in __crypt_r (key=key@entry=0x7fffd8005a60 "pamtest-edvint", salt=<optimized out>, xapi-project#6 0x00007ffff1dc9abc in verify_pwd_hash (p=p@entry=0x7fffd8005a60 "pamtest-edvint", hash=<optimized out>, nullok=nullok@entry=0) at passverify.c:111 xapi-project#7 0x00007ffff1dc9139 in _unix_verify_password (pamh=pamh@entry=0x7fffd8002910, name=0x7fffd8002ab0 "pamtest-edvint", p=0x7fffd8005a60 "pamtest-edvint", ctrl=ctrl@entry=8389156) at support.c:777 xapi-project#8 0x00007ffff1dc6556 in pam_sm_authenticate (pamh=0x7fffd8002910, flags=<optimized out>, argc=<optimized out>, argv=<optimized out>) at pam_unix_auth.c:178 xapi-project#9 0x00007ffff7bcef1a in _pam_dispatch_aux (use_cached_chain=<optimized out>, resumed=<optimized out>, h=<optimized out>, flags=1, pamh=0x7fffd8002910) at pam_dispatch.c:110 xapi-project#10 _pam_dispatch (pamh=pamh@entry=0x7fffd8002910, flags=1, choice=choice@entry=1) at pam_dispatch.c:426 xapi-project#11 0x00007ffff7bce7e0 in pam_authenticate (pamh=0x7fffd8002910, flags=flags@entry=1) at pam_auth.c:34 xapi-project#12 0x00000000005ae567 in XA_mh_authorize (username=username@entry=0x7fffd80028d0 "pamtest-edvint", password=password@entry=0x7fffd80028f0 "pamtest-edvint", error=error@entry=0x7ffff31e1be8) at xa_auth.c:83 xapi-project#13 0x00000000005adf20 in stub_XA_mh_authorize (username=<optimized out>, password=<optimized out>) at xa_auth_stubs.c:42 ``` `pam_start` and `pam_end` doesn't help here, because on `pam_end` the library is `dlclose`-ed, so on next `pam_authenticate` it will have to go through the initialization code again. (This initialization code would've belonged into `pam_start`, not `pam_authenticate`, but there are several layers here including a call to `crypt_r`). Upstream has fixed this problem >5 years ago by switching to libxcrypt instead. Signed-off-by: Edwin Török <edwin.torok@cloud.com>
Temporary change to give extra details when int_of_string fails.
Oasisify. Ossify? No, that's not quite right...
|
The Pytype check is complete, you can check the results of the job here. |
pytype_reporter extracted 49 problem reports from pytype output.You can check the results of the job here |
Backport of 3b52b72 This enables PAM to be used in multithreaded mode (currently XAPI has a global lock around auth). Using an off-cpu flamegraph I identified that concurrent PAM calls are slow due to a call to `sleep(1)`. `pam_authenticate` calls `crypt_r` which calls `NSSLOW_Init` which on first use will try to initialize the just `dlopen`-ed library. If it encounters a race condition it does a `sleep(1)`. This race condition can be quite reliably reproduced when performing a lot of PAM authentications from multiple threads in parallel. GDB can also be used to confirm this by putting a breakpoint on `sleep`: ``` #0 __sleep (seconds=seconds@entry=1) at ../sysdeps/unix/sysv/linux/sleep.c:42 xapi-project#1 0x00007ffff1548e22 in freebl_RunLoaderOnce () at lowhash_vector.c:122 xapi-project#2 0x00007ffff1548f31 in freebl_InitVector () at lowhash_vector.c:131 xapi-project#3 NSSLOW_Init () at lowhash_vector.c:148 xapi-project#4 0x00007ffff1b8f09a in __sha512_crypt_r (key=key@entry=0x7fffd8005a60 "pamtest-edvint", salt=0x7ffff31e17b8 "dIJbsXKc0", xapi-project#5 0x00007ffff1b8d070 in __crypt_r (key=key@entry=0x7fffd8005a60 "pamtest-edvint", salt=<optimized out>, xapi-project#6 0x00007ffff1dc9abc in verify_pwd_hash (p=p@entry=0x7fffd8005a60 "pamtest-edvint", hash=<optimized out>, nullok=nullok@entry=0) at passverify.c:111 xapi-project#7 0x00007ffff1dc9139 in _unix_verify_password (pamh=pamh@entry=0x7fffd8002910, name=0x7fffd8002ab0 "pamtest-edvint", p=0x7fffd8005a60 "pamtest-edvint", ctrl=ctrl@entry=8389156) at support.c:777 xapi-project#8 0x00007ffff1dc6556 in pam_sm_authenticate (pamh=0x7fffd8002910, flags=<optimized out>, argc=<optimized out>, argv=<optimized out>) at pam_unix_auth.c:178 xapi-project#9 0x00007ffff7bcef1a in _pam_dispatch_aux (use_cached_chain=<optimized out>, resumed=<optimized out>, h=<optimized out>, flags=1, pamh=0x7fffd8002910) at pam_dispatch.c:110 xapi-project#10 _pam_dispatch (pamh=pamh@entry=0x7fffd8002910, flags=1, choice=choice@entry=1) at pam_dispatch.c:426 xapi-project#11 0x00007ffff7bce7e0 in pam_authenticate (pamh=0x7fffd8002910, flags=flags@entry=1) at pam_auth.c:34 xapi-project#12 0x00000000005ae567 in XA_mh_authorize (username=username@entry=0x7fffd80028d0 "pamtest-edvint", password=password@entry=0x7fffd80028f0 "pamtest-edvint", error=error@entry=0x7ffff31e1be8) at xa_auth.c:83 xapi-project#13 0x00000000005adf20 in stub_XA_mh_authorize (username=<optimized out>, password=<optimized out>) at xa_auth_stubs.c:42 ``` `pam_start` and `pam_end` doesn't help here, because on `pam_end` the library is `dlclose`-ed, so on next `pam_authenticate` it will have to go through the initialization code again. (This initialization code would've belonged into `pam_start`, not `pam_authenticate`, but there are several layers here including a call to `crypt_r`). Upstream has fixed this problem >5 years ago by switching to libxcrypt instead. Signed-off-by: Edwin Török <edwin.torok@cloud.com> Signed-off-by: Christian Lindig <christian.lindig@cloud.com>
Backport of 3b52b72 This enables PAM to be used in multithreaded mode (currently XAPI has a global lock around auth). Using an off-cpu flamegraph I identified that concurrent PAM calls are slow due to a call to `sleep(1)`. `pam_authenticate` calls `crypt_r` which calls `NSSLOW_Init` which on first use will try to initialize the just `dlopen`-ed library. If it encounters a race condition it does a `sleep(1)`. This race condition can be quite reliably reproduced when performing a lot of PAM authentications from multiple threads in parallel. GDB can also be used to confirm this by putting a breakpoint on `sleep`: ``` #0 __sleep (seconds=seconds@entry=1) at ../sysdeps/unix/sysv/linux/sleep.c:42 #1 0x00007ffff1548e22 in freebl_RunLoaderOnce () at lowhash_vector.c:122 #2 0x00007ffff1548f31 in freebl_InitVector () at lowhash_vector.c:131 #3 NSSLOW_Init () at lowhash_vector.c:148 #4 0x00007ffff1b8f09a in __sha512_crypt_r (key=key@entry=0x7fffd8005a60 "pamtest-edvint", salt=0x7ffff31e17b8 "dIJbsXKc0", #5 0x00007ffff1b8d070 in __crypt_r (key=key@entry=0x7fffd8005a60 "pamtest-edvint", salt=<optimized out>, #6 0x00007ffff1dc9abc in verify_pwd_hash (p=p@entry=0x7fffd8005a60 "pamtest-edvint", hash=<optimized out>, nullok=nullok@entry=0) at passverify.c:111 #7 0x00007ffff1dc9139 in _unix_verify_password (pamh=pamh@entry=0x7fffd8002910, name=0x7fffd8002ab0 "pamtest-edvint", p=0x7fffd8005a60 "pamtest-edvint", ctrl=ctrl@entry=8389156) at support.c:777 #8 0x00007ffff1dc6556 in pam_sm_authenticate (pamh=0x7fffd8002910, flags=<optimized out>, argc=<optimized out>, argv=<optimized out>) at pam_unix_auth.c:178 #9 0x00007ffff7bcef1a in _pam_dispatch_aux (use_cached_chain=<optimized out>, resumed=<optimized out>, h=<optimized out>, flags=1, pamh=0x7fffd8002910) at pam_dispatch.c:110 #10 _pam_dispatch (pamh=pamh@entry=0x7fffd8002910, flags=1, choice=choice@entry=1) at pam_dispatch.c:426 #11 0x00007ffff7bce7e0 in pam_authenticate (pamh=0x7fffd8002910, flags=flags@entry=1) at pam_auth.c:34 #12 0x00000000005ae567 in XA_mh_authorize (username=username@entry=0x7fffd80028d0 "pamtest-edvint", password=password@entry=0x7fffd80028f0 "pamtest-edvint", error=error@entry=0x7ffff31e1be8) at xa_auth.c:83 #13 0x00000000005adf20 in stub_XA_mh_authorize (username=<optimized out>, password=<optimized out>) at xa_auth_stubs.c:42 ``` `pam_start` and `pam_end` doesn't help here, because on `pam_end` the library is `dlclose`-ed, so on next `pam_authenticate` it will have to go through the initialization code again. (This initialization code would've belonged into `pam_start`, not `pam_authenticate`, but there are several layers here including a call to `crypt_r`). Upstream has fixed this problem >5 years ago by switching to libxcrypt instead. Signed-off-by: Edwin Török <edwin.torok@cloud.com> Signed-off-by: Christian Lindig <christian.lindig@cloud.com>
Patch for PR1007. Please see commit message for more details.