Skip to content

Conversation

@mcclurmc
Copy link
Contributor

@mcclurmc mcclurmc commented Feb 3, 2011

Patch for PR1007. Please see commit message for more details.

@mcclurmc
Copy link
Contributor Author

mcclurmc commented Feb 3, 2011

I just rebased this branch to separate the changes into more logical blocks. I might follow this up with a big whitespace change soon. Sorry to Vincent for blowing away his previous comments in the rebase.

@matthiasgoergens
Copy link
Contributor

Thanks for rebasing into logical commits.

Please only do rebases when your branch isn't merged into the xen-org repository, yet.

@djs55
Copy link
Contributor

djs55 commented Feb 3, 2011

Mike: could you add "Signed-off-by: ..." lines to the commit comments? (I hope this is easy to do retrospectively)

@matthiasgoergens
Copy link
Contributor

That's easy to do. But do we need them? The commits do have authors that git knows about.

A reviewed-by line may be useful. But I don't know where we should but it. Perhaps in the merge commits? Or in the pull-request discussion? (Or in the individual commits?)

@mcclurmc
Copy link
Contributor Author

mcclurmc commented Feb 4, 2011

Dave, signed-off-by lines have been added. FYI, doing a 'git commit --signoff' will give you signed-off-by lines automatically.

David Scott and others added 25 commits February 9, 2011 14:17
…ent network" since this (a) is needed by other stuff; and (b) happens to cause VM.start to fail if any other VIF exists.

Signed-off-by: David Scott <dave.scott@eu.citrix.com>
…ement-interface breaks and none of the dom0 interfaces are brought up so ssh doesn't work.

After 'xe host-management-disable' /etc/xensource-inventory:MANAGEMENT_INTERFACE=''

Signed-off-by: David Scott <dave.scott@eu.citrix.com>
…t 'n' refers to a plug ordering.

The previous meaning was 'PCI bus device' number which was only meaningful for PV guests (and of unknown use). The HVM PCI hotplug protocol doesn't seem to support this anyway. If we need it again, we can tweak the syntax.

Signed-off-by: David Scott <dave.scott@eu.citrix.com>
…onstant bridge name, so it can be referenced from global firewall rules.

Signed-off-by: David Scott <dave.scott@eu.citrix.com>
…use Xc.physinfo and max_cpu_id + 1

Signed-off-by: David Scott <dave.scott@eu.citrix.com>
This fixes the converse of the problem addressed by commit fe19e820
"xenserver: Kill bond master's dhclient when bringing up bond slave".  In
that commit's log message, I claimed that the converse was not a problem,
but I was wrong.  I must have screwed up in testing, because it really is
a problem.  This commit fixes it.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
CC: Dominic Curran <dominic.curran@citrix.com>
Reported-by: Michael Mao <mmao@nicira.com>
Bug xapi-project#2668.
This patch defensively guarantees that the first id in
xs-network-uuids will belong to the primary network (as opposed to
a vlan).  Given that the primary network id comes first, it parses
xs-network-ids and only copies the primary id to bridge-id when
monitor-external-ids is run.

Feature xapi-project#3647

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Reviewed-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Dominic Curran <dominic.curran@citrix.com>
This commit adds bond_mode configuration to Interface Reconfigure
so that it may be changed using standard XenServer commands.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Dominic Curran <dominic.curran@citrix.com>
This commit makes necessary changes to Interface Reconfigure to
allow miimon bond-detect-mode and bond-miimon-interval to be
changed using xapi.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Dominic Curran <dominic.curran@citrix.com>
Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Due to some recent changes in the database backend, some fields which used to get computed dynamically on the fly are not cached. Hence when reverting a VM to a snapshot/checkpoint, we should avoid copying some extra fields from the snapshot metadata.

Signed-off-by: Zheng Li <zheng.li@eu.citrix.com>
Based on draft by Lars Kurth.

Signed-off-by: Matthias Goergens <matthias.goergens@gmail.com>
When saving the database cache, InterfaceReconfigure can crash if
unexpected attributes are in an object's other-config column.  This
commit causes it to skip that attribute and log a warning.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Dominic Curran <Dominic.Curran@citrix.com>
Signed-off-by: Dominic Curran <dominic.curran@citrix.com>
This commit allows xapi to set the fail mode through the
vswitch-controller-fail-mode other-config setting in the Pool
object.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Dominic Curran <dominic.curran@citrix.com>
InterfaceReconfigure assumes the pool is in the configuration
cache.  This is always true except when upgrading from an older
version of openvswitch which does record the pool.  This can cause
upgrades to lock up.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Dominic Curran <dominic.curran@citrix.com>
Interface reconfigure can crash when setting fail_mode if an
expected other_config setting is not set.

Signed-off-by: Dominic Curran <dominic.curran@citrix.com>
This patch allows the Pool object's vswitch-controller-fail-mode
setting to be overridden on by Network object.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Dominic Curran <dominic.curran@citrix.com>
Makes required changes to interface reconfigure to allow LACP
configuration from xapi.  Conforms to XenServer style bonding
configuration which is slightly different from OVS.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Bug xapi-project#4213.
Signed-off-by: Dominic Curran <dominic.curran@citrix.com>
Commit 404c169247b1c3ac2ebad887f0421478a6cef924 breaks compatibility with
XenServers older than 5.6 FP1.  This commit removes the last vestiges of
support for those older XenServer versions.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Signed-off-by: Dominic Curran <dominic.curran@citrix.com>
koushikcgit pushed a commit to koushikcgit/xen-api that referenced this pull request Aug 7, 2015
Remove extra and legacy args from the bootloader invocation
lippirk referenced this pull request in lippirk/xen-api Jul 20, 2020
Signed-off-by: Si Beaumont <simon.beaumont@citrix.com>
kc284 pushed a commit to kc284/xen-api that referenced this pull request Mar 11, 2021
CP-5500: Removed Powershell snapin installer; ship it as a module instea...
robhoes pushed a commit to robhoes/xen-api that referenced this pull request Jul 6, 2021
Fix the autogenerated OASIS files
robhoes pushed a commit that referenced this pull request Sep 14, 2021
Remove almost everything from .gitignore (yay obuild!)
robhoes pushed a commit that referenced this pull request Sep 14, 2021
robhoes pushed a commit that referenced this pull request Sep 14, 2021
CA-121668: Move plugins to init level 23
robhoes pushed a commit that referenced this pull request Sep 14, 2021
Interface tidyup and documentation
robhoes pushed a commit that referenced this pull request Sep 14, 2021
Do not use 'finally' from package xapi-stdext-pervasives.
psafont referenced this pull request in psafont/xen-api Sep 20, 2021
Update to new xentore interface in v1.2.3
robhoes pushed a commit to robhoes/xen-api that referenced this pull request Sep 21, 2021
robhoes pushed a commit to robhoes/xen-api that referenced this pull request Sep 21, 2021
xapi-project:master catching up with upstream djs55:master
robhoes pushed a commit to robhoes/xen-api that referenced this pull request Sep 21, 2021
psafont referenced this pull request in psafont/xen-api Oct 19, 2021
robhoes pushed a commit to robhoes/xen-api that referenced this pull request Jan 10, 2022
Remove autogenerated setup.ml and introduce ocaml .gitignore
psafont referenced this pull request in psafont/xen-api Jan 10, 2022
Sync opam file with xs-opam
robhoes pushed a commit to robhoes/xen-api that referenced this pull request Jan 10, 2022
CP-13874: pass in `name_label` and `name_description`
edwintorok referenced this pull request in edwintorok/xen-api Jan 10, 2022
CA-322784 Fix varstored-guard logging is verbose
psafont referenced this pull request in psafont/xen-api Mar 17, 2023
RFC: move Xenstore_watch into ezxenstore
edwintorok referenced this pull request in edwintorok/xen-api Oct 10, 2023
Using an off-cpu flamegraph I identified that concurrent PAM calls are slow due to a call to `sleep(1)`.
`pam_authenticate` calls `crypt_r` which calls `NSSLOW_Init` which on first use will try to initialize the just `dlopen`-ed library.
If it encounters a race condition it does a `sleep(1)`. This race condition can be quite reliably reproduced when performing a lot of PAM authentications from multiple threads in parallel.

GDB can also be used to confirm this by putting a breakpoint on `sleep`:
```
 #0  __sleep (seconds=seconds@entry=1) at ../sysdeps/unix/sysv/linux/sleep.c:42
 #1  0x00007ffff1548e22 in freebl_RunLoaderOnce () at lowhash_vector.c:122
 #2  0x00007ffff1548f31 in freebl_InitVector () at lowhash_vector.c:131
 #3  NSSLOW_Init () at lowhash_vector.c:148
 #4  0x00007ffff1b8f09a in __sha512_crypt_r (key=key@entry=0x7fffd8005a60 "pamtest-edvint", salt=0x7ffff31e17b8 "dIJbsXKc0",
 xapi-project#5  0x00007ffff1b8d070 in __crypt_r (key=key@entry=0x7fffd8005a60 "pamtest-edvint", salt=<optimized out>,
 xapi-project#6  0x00007ffff1dc9abc in verify_pwd_hash (p=p@entry=0x7fffd8005a60 "pamtest-edvint", hash=<optimized out>, nullok=nullok@entry=0) at passverify.c:111
 xapi-project#7  0x00007ffff1dc9139 in _unix_verify_password (pamh=pamh@entry=0x7fffd8002910, name=0x7fffd8002ab0 "pamtest-edvint", p=0x7fffd8005a60 "pamtest-edvint", ctrl=ctrl@entry=8389156) at support.c:777
 xapi-project#8  0x00007ffff1dc6556 in pam_sm_authenticate (pamh=0x7fffd8002910, flags=<optimized out>, argc=<optimized out>, argv=<optimized out>) at pam_unix_auth.c:178
 xapi-project#9  0x00007ffff7bcef1a in _pam_dispatch_aux (use_cached_chain=<optimized out>, resumed=<optimized out>, h=<optimized out>, flags=1, pamh=0x7fffd8002910) at pam_dispatch.c:110
 xapi-project#10 _pam_dispatch (pamh=pamh@entry=0x7fffd8002910, flags=1, choice=choice@entry=1) at pam_dispatch.c:426
 xapi-project#11 0x00007ffff7bce7e0 in pam_authenticate (pamh=0x7fffd8002910, flags=flags@entry=1) at pam_auth.c:34
 xapi-project#12 0x00000000005ae567 in XA_mh_authorize (username=username@entry=0x7fffd80028d0 "pamtest-edvint", password=password@entry=0x7fffd80028f0 "pamtest-edvint", error=error@entry=0x7ffff31e1be8) at xa_auth.c:83
 xapi-project#13 0x00000000005adf20 in stub_XA_mh_authorize (username=<optimized out>, password=<optimized out>) at xa_auth_stubs.c:42
 xapi-project#14 0x00000000004a0a6a in camlDune__exe__Bench_pam__pam_authenticate$27_320 () at ocaml/tests/bench/pam/bench_pam.ml:63
 xapi-project#15 0x00000000004a1113 in camlEzbechamel_concurrent__worker_loop_359 () at ocaml/tests/bench/lib/concurrent/ezbechamel_concurrent.ml:36
 xapi-project#16 0x00000000005935b9 in camlStdlib__Fun__protect_317 ()
 xapi-project#17 0x00000000004a1955 in camlThread__fun_850 ()
 xapi-project#18 0x00000000005d6401 in caml_start_program ()
 xapi-project#19 0x00000000005cd0fd in caml_callback_exn ()
 xapi-project#20 0x00000000005af810 in caml_thread_start ()
 xapi-project#21 0x00007ffff79b7e25 in start_thread (arg=0x7ffff31e2700) at pthread_create.c:308
 xapi-project#22 0x00007ffff71dbbad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
```

`pam_start` and `pam_end` doesn't help here, because on `pam_end` the library is `dlclose`-ed, so on next `pam_authenticate` it will have to go through the initialization code again.
(This initialization code would've belonged into `pam_start`, not `pam_authenticate`, but there are several layers here including a call to `crypt_r`).

To avoid this link with `libcrypt` and call `crypt_r` once ourselves (and ensure it loads `libfreeblpriv3` by using the sha512 prefix).
That way the library will stay loaded (we'll hold a reference count on it), and the `dlclose` done by PAM won't unload it.

Confirmed that there are no `sleep` calls now, and the results are also visible when running the benchmark targeted to the with and without fix code:
```
╭─────────────────────────────────────────────────┬───────────────────────────┬───────────────────────────┬───────────────────────────╮
│name                                             │  major-allocated          │  minor-allocated          │  monotonic-clock          │
├─────────────────────────────────────────────────┼───────────────────────────┼───────────────────────────┼───────────────────────────┤
│  concurrent authenticate (sleep fix, actual):8  │             0.0000 mjw/run│            50.0000 mnw/run│       27043467.0000 ns/run│
╰─────────────────────────────────────────────────┴───────────────────────────┴───────────────────────────┴───────────────────────────╯

╭────────────────────────────────────────┬───────────────────────────┬───────────────────────────┬───────────────────────────╮
│name                                    │  major-allocated          │  minor-allocated          │  monotonic-clock          │
├────────────────────────────────────────┼───────────────────────────┼───────────────────────────┼───────────────────────────┤
│  concurrent authenticate (no reuse):8  │             0.0000 mjw/run│            50.0000 mnw/run│     1029831372.0000 ns/run│
╰────────────────────────────────────────┴───────────────────────────┴───────────────────────────┴───────────────────────────╯
```

Without this fix using 2 threads to perform PAM authentication would result in a 38x slowdown compared to using no threads at all (which is what XAPI currently does).

Signed-off-by: Edwin Török <edwin.torok@cloud.com>
edwintorok referenced this pull request in edwintorok/xen-api Oct 11, 2023
This enables PAM to be used in multithreaded mode (currently XAPI has a global lock around auth).

Using an off-cpu flamegraph I identified that concurrent PAM calls are slow due to a call to `sleep(1)`.
`pam_authenticate` calls `crypt_r` which calls `NSSLOW_Init` which on first use will try to initialize the just `dlopen`-ed library.
If it encounters a race condition it does a `sleep(1)`. This race condition can be quite reliably reproduced when performing a lot of PAM authentications from multiple threads in parallel.

GDB can also be used to confirm this by putting a breakpoint on `sleep`:
```
  #0  __sleep (seconds=seconds@entry=1) at ../sysdeps/unix/sysv/linux/sleep.c:42
  #1  0x00007ffff1548e22 in freebl_RunLoaderOnce () at lowhash_vector.c:122
  #2  0x00007ffff1548f31 in freebl_InitVector () at lowhash_vector.c:131
  #3  NSSLOW_Init () at lowhash_vector.c:148
  #4  0x00007ffff1b8f09a in __sha512_crypt_r (key=key@entry=0x7fffd8005a60 "pamtest-edvint", salt=0x7ffff31e17b8 "dIJbsXKc0",
  xapi-project#5  0x00007ffff1b8d070 in __crypt_r (key=key@entry=0x7fffd8005a60 "pamtest-edvint", salt=<optimized out>,
  xapi-project#6  0x00007ffff1dc9abc in verify_pwd_hash (p=p@entry=0x7fffd8005a60 "pamtest-edvint", hash=<optimized out>, nullok=nullok@entry=0) at passverify.c:111
  xapi-project#7  0x00007ffff1dc9139 in _unix_verify_password (pamh=pamh@entry=0x7fffd8002910, name=0x7fffd8002ab0 "pamtest-edvint", p=0x7fffd8005a60 "pamtest-edvint", ctrl=ctrl@entry=8389156) at support.c:777
  xapi-project#8  0x00007ffff1dc6556 in pam_sm_authenticate (pamh=0x7fffd8002910, flags=<optimized out>, argc=<optimized out>, argv=<optimized out>) at pam_unix_auth.c:178
  xapi-project#9  0x00007ffff7bcef1a in _pam_dispatch_aux (use_cached_chain=<optimized out>, resumed=<optimized out>, h=<optimized out>, flags=1, pamh=0x7fffd8002910) at pam_dispatch.c:110
  xapi-project#10 _pam_dispatch (pamh=pamh@entry=0x7fffd8002910, flags=1, choice=choice@entry=1) at pam_dispatch.c:426
  xapi-project#11 0x00007ffff7bce7e0 in pam_authenticate (pamh=0x7fffd8002910, flags=flags@entry=1) at pam_auth.c:34
  xapi-project#12 0x00000000005ae567 in XA_mh_authorize (username=username@entry=0x7fffd80028d0 "pamtest-edvint", password=password@entry=0x7fffd80028f0 "pamtest-edvint", error=error@entry=0x7ffff31e1be8) at xa_auth.c:83
  xapi-project#13 0x00000000005adf20 in stub_XA_mh_authorize (username=<optimized out>, password=<optimized out>) at xa_auth_stubs.c:42
```

`pam_start` and `pam_end` doesn't help here, because on `pam_end` the library is `dlclose`-ed, so on next `pam_authenticate` it will have to go through the initialization code again.
(This initialization code would've belonged into `pam_start`, not `pam_authenticate`, but there are several layers here including a call to `crypt_r`).
Upstream has fixed this problem >5 years ago by switching to libxcrypt instead.

Signed-off-by: Edwin Török <edwin.torok@cloud.com>
edwintorok referenced this pull request in edwintorok/xen-api Oct 11, 2023
This enables PAM to be used in multithreaded mode (currently XAPI has a global lock around auth).

Using an off-cpu flamegraph I identified that concurrent PAM calls are slow due to a call to `sleep(1)`.
`pam_authenticate` calls `crypt_r` which calls `NSSLOW_Init` which on first use will try to initialize the just `dlopen`-ed library.
If it encounters a race condition it does a `sleep(1)`. This race condition can be quite reliably reproduced when performing a lot of PAM authentications from multiple threads in parallel.

GDB can also be used to confirm this by putting a breakpoint on `sleep`:
```
  #0  __sleep (seconds=seconds@entry=1) at ../sysdeps/unix/sysv/linux/sleep.c:42
  #1  0x00007ffff1548e22 in freebl_RunLoaderOnce () at lowhash_vector.c:122
  #2  0x00007ffff1548f31 in freebl_InitVector () at lowhash_vector.c:131
  #3  NSSLOW_Init () at lowhash_vector.c:148
  #4  0x00007ffff1b8f09a in __sha512_crypt_r (key=key@entry=0x7fffd8005a60 "pamtest-edvint", salt=0x7ffff31e17b8 "dIJbsXKc0",
  xapi-project#5  0x00007ffff1b8d070 in __crypt_r (key=key@entry=0x7fffd8005a60 "pamtest-edvint", salt=<optimized out>,
  xapi-project#6  0x00007ffff1dc9abc in verify_pwd_hash (p=p@entry=0x7fffd8005a60 "pamtest-edvint", hash=<optimized out>, nullok=nullok@entry=0) at passverify.c:111
  xapi-project#7  0x00007ffff1dc9139 in _unix_verify_password (pamh=pamh@entry=0x7fffd8002910, name=0x7fffd8002ab0 "pamtest-edvint", p=0x7fffd8005a60 "pamtest-edvint", ctrl=ctrl@entry=8389156) at support.c:777
  xapi-project#8  0x00007ffff1dc6556 in pam_sm_authenticate (pamh=0x7fffd8002910, flags=<optimized out>, argc=<optimized out>, argv=<optimized out>) at pam_unix_auth.c:178
  xapi-project#9  0x00007ffff7bcef1a in _pam_dispatch_aux (use_cached_chain=<optimized out>, resumed=<optimized out>, h=<optimized out>, flags=1, pamh=0x7fffd8002910) at pam_dispatch.c:110
  xapi-project#10 _pam_dispatch (pamh=pamh@entry=0x7fffd8002910, flags=1, choice=choice@entry=1) at pam_dispatch.c:426
  xapi-project#11 0x00007ffff7bce7e0 in pam_authenticate (pamh=0x7fffd8002910, flags=flags@entry=1) at pam_auth.c:34
  xapi-project#12 0x00000000005ae567 in XA_mh_authorize (username=username@entry=0x7fffd80028d0 "pamtest-edvint", password=password@entry=0x7fffd80028f0 "pamtest-edvint", error=error@entry=0x7ffff31e1be8) at xa_auth.c:83
  xapi-project#13 0x00000000005adf20 in stub_XA_mh_authorize (username=<optimized out>, password=<optimized out>) at xa_auth_stubs.c:42
```

`pam_start` and `pam_end` doesn't help here, because on `pam_end` the library is `dlclose`-ed, so on next `pam_authenticate` it will have to go through the initialization code again.
(This initialization code would've belonged into `pam_start`, not `pam_authenticate`, but there are several layers here including a call to `crypt_r`).
Upstream has fixed this problem >5 years ago by switching to libxcrypt instead.

Signed-off-by: Edwin Török <edwin.torok@cloud.com>
edwintorok referenced this pull request in edwintorok/xen-api Oct 12, 2023
This enables PAM to be used in multithreaded mode (currently XAPI has a global lock around auth).

Using an off-cpu flamegraph I identified that concurrent PAM calls are slow due to a call to `sleep(1)`.
`pam_authenticate` calls `crypt_r` which calls `NSSLOW_Init` which on first use will try to initialize the just `dlopen`-ed library.
If it encounters a race condition it does a `sleep(1)`. This race condition can be quite reliably reproduced when performing a lot of PAM authentications from multiple threads in parallel.

GDB can also be used to confirm this by putting a breakpoint on `sleep`:
```
  #0  __sleep (seconds=seconds@entry=1) at ../sysdeps/unix/sysv/linux/sleep.c:42
  #1  0x00007ffff1548e22 in freebl_RunLoaderOnce () at lowhash_vector.c:122
  #2  0x00007ffff1548f31 in freebl_InitVector () at lowhash_vector.c:131
  #3  NSSLOW_Init () at lowhash_vector.c:148
  #4  0x00007ffff1b8f09a in __sha512_crypt_r (key=key@entry=0x7fffd8005a60 "pamtest-edvint", salt=0x7ffff31e17b8 "dIJbsXKc0",
  xapi-project#5  0x00007ffff1b8d070 in __crypt_r (key=key@entry=0x7fffd8005a60 "pamtest-edvint", salt=<optimized out>,
  xapi-project#6  0x00007ffff1dc9abc in verify_pwd_hash (p=p@entry=0x7fffd8005a60 "pamtest-edvint", hash=<optimized out>, nullok=nullok@entry=0) at passverify.c:111
  xapi-project#7  0x00007ffff1dc9139 in _unix_verify_password (pamh=pamh@entry=0x7fffd8002910, name=0x7fffd8002ab0 "pamtest-edvint", p=0x7fffd8005a60 "pamtest-edvint", ctrl=ctrl@entry=8389156) at support.c:777
  xapi-project#8  0x00007ffff1dc6556 in pam_sm_authenticate (pamh=0x7fffd8002910, flags=<optimized out>, argc=<optimized out>, argv=<optimized out>) at pam_unix_auth.c:178
  xapi-project#9  0x00007ffff7bcef1a in _pam_dispatch_aux (use_cached_chain=<optimized out>, resumed=<optimized out>, h=<optimized out>, flags=1, pamh=0x7fffd8002910) at pam_dispatch.c:110
  xapi-project#10 _pam_dispatch (pamh=pamh@entry=0x7fffd8002910, flags=1, choice=choice@entry=1) at pam_dispatch.c:426
  xapi-project#11 0x00007ffff7bce7e0 in pam_authenticate (pamh=0x7fffd8002910, flags=flags@entry=1) at pam_auth.c:34
  xapi-project#12 0x00000000005ae567 in XA_mh_authorize (username=username@entry=0x7fffd80028d0 "pamtest-edvint", password=password@entry=0x7fffd80028f0 "pamtest-edvint", error=error@entry=0x7ffff31e1be8) at xa_auth.c:83
  xapi-project#13 0x00000000005adf20 in stub_XA_mh_authorize (username=<optimized out>, password=<optimized out>) at xa_auth_stubs.c:42
```

`pam_start` and `pam_end` doesn't help here, because on `pam_end` the library is `dlclose`-ed, so on next `pam_authenticate` it will have to go through the initialization code again.
(This initialization code would've belonged into `pam_start`, not `pam_authenticate`, but there are several layers here including a call to `crypt_r`).
Upstream has fixed this problem >5 years ago by switching to libxcrypt instead.

Signed-off-by: Edwin Török <edwin.torok@cloud.com>
edwintorok referenced this pull request in edwintorok/xen-api Jan 22, 2024
Temporary change to give extra details when int_of_string fails.
edwintorok referenced this pull request in edwintorok/xen-api Jan 22, 2024
Oasisify. Ossify? No, that's not quite right...
edwintorok referenced this pull request in edwintorok/xen-api Jan 22, 2024
@github-actions
Copy link

github-actions bot commented Jan 24, 2024

The Pytype check is complete, you can check the results of the job here.

@github-actions
Copy link

github-actions bot commented Jan 24, 2024

pytype_reporter extracted 49 problem reports from pytype output.

You can check the results of the job here

lindig pushed a commit to lindig/xen-api that referenced this pull request Feb 28, 2024
Backport of 3b52b72

This enables PAM to be used in multithreaded mode (currently XAPI has a global lock around auth).

Using an off-cpu flamegraph I identified that concurrent PAM calls are slow due to a call to `sleep(1)`.
`pam_authenticate` calls `crypt_r` which calls `NSSLOW_Init` which on first use will try to initialize the just `dlopen`-ed library.
If it encounters a race condition it does a `sleep(1)`. This race condition can be quite reliably reproduced when performing a lot of PAM authentications from multiple threads in parallel.

GDB can also be used to confirm this by putting a breakpoint on `sleep`:
```
  #0  __sleep (seconds=seconds@entry=1) at ../sysdeps/unix/sysv/linux/sleep.c:42
  xapi-project#1  0x00007ffff1548e22 in freebl_RunLoaderOnce () at lowhash_vector.c:122
  xapi-project#2  0x00007ffff1548f31 in freebl_InitVector () at lowhash_vector.c:131
  xapi-project#3  NSSLOW_Init () at lowhash_vector.c:148
  xapi-project#4  0x00007ffff1b8f09a in __sha512_crypt_r (key=key@entry=0x7fffd8005a60 "pamtest-edvint", salt=0x7ffff31e17b8 "dIJbsXKc0",
  xapi-project#5  0x00007ffff1b8d070 in __crypt_r (key=key@entry=0x7fffd8005a60 "pamtest-edvint", salt=<optimized out>,
  xapi-project#6  0x00007ffff1dc9abc in verify_pwd_hash (p=p@entry=0x7fffd8005a60 "pamtest-edvint", hash=<optimized out>, nullok=nullok@entry=0) at passverify.c:111
  xapi-project#7  0x00007ffff1dc9139 in _unix_verify_password (pamh=pamh@entry=0x7fffd8002910, name=0x7fffd8002ab0 "pamtest-edvint", p=0x7fffd8005a60 "pamtest-edvint", ctrl=ctrl@entry=8389156) at support.c:777
  xapi-project#8  0x00007ffff1dc6556 in pam_sm_authenticate (pamh=0x7fffd8002910, flags=<optimized out>, argc=<optimized out>, argv=<optimized out>) at pam_unix_auth.c:178
  xapi-project#9  0x00007ffff7bcef1a in _pam_dispatch_aux (use_cached_chain=<optimized out>, resumed=<optimized out>, h=<optimized out>, flags=1, pamh=0x7fffd8002910) at pam_dispatch.c:110
  xapi-project#10 _pam_dispatch (pamh=pamh@entry=0x7fffd8002910, flags=1, choice=choice@entry=1) at pam_dispatch.c:426
  xapi-project#11 0x00007ffff7bce7e0 in pam_authenticate (pamh=0x7fffd8002910, flags=flags@entry=1) at pam_auth.c:34
  xapi-project#12 0x00000000005ae567 in XA_mh_authorize (username=username@entry=0x7fffd80028d0 "pamtest-edvint", password=password@entry=0x7fffd80028f0 "pamtest-edvint", error=error@entry=0x7ffff31e1be8) at xa_auth.c:83
  xapi-project#13 0x00000000005adf20 in stub_XA_mh_authorize (username=<optimized out>, password=<optimized out>) at xa_auth_stubs.c:42
```

`pam_start` and `pam_end` doesn't help here, because on `pam_end` the library is `dlclose`-ed, so on next `pam_authenticate` it will have to go through the initialization code again.
(This initialization code would've belonged into `pam_start`, not `pam_authenticate`, but there are several layers here including a call to `crypt_r`).
Upstream has fixed this problem >5 years ago by switching to libxcrypt instead.

Signed-off-by: Edwin Török <edwin.torok@cloud.com>
Signed-off-by: Christian Lindig <christian.lindig@cloud.com>
lindig pushed a commit that referenced this pull request Feb 29, 2024
Backport of 3b52b72

This enables PAM to be used in multithreaded mode (currently XAPI has a global lock around auth).

Using an off-cpu flamegraph I identified that concurrent PAM calls are slow due to a call to `sleep(1)`.
`pam_authenticate` calls `crypt_r` which calls `NSSLOW_Init` which on first use will try to initialize the just `dlopen`-ed library.
If it encounters a race condition it does a `sleep(1)`. This race condition can be quite reliably reproduced when performing a lot of PAM authentications from multiple threads in parallel.

GDB can also be used to confirm this by putting a breakpoint on `sleep`:
```
  #0  __sleep (seconds=seconds@entry=1) at ../sysdeps/unix/sysv/linux/sleep.c:42
  #1  0x00007ffff1548e22 in freebl_RunLoaderOnce () at lowhash_vector.c:122
  #2  0x00007ffff1548f31 in freebl_InitVector () at lowhash_vector.c:131
  #3  NSSLOW_Init () at lowhash_vector.c:148
  #4  0x00007ffff1b8f09a in __sha512_crypt_r (key=key@entry=0x7fffd8005a60 "pamtest-edvint", salt=0x7ffff31e17b8 "dIJbsXKc0",
  #5  0x00007ffff1b8d070 in __crypt_r (key=key@entry=0x7fffd8005a60 "pamtest-edvint", salt=<optimized out>,
  #6  0x00007ffff1dc9abc in verify_pwd_hash (p=p@entry=0x7fffd8005a60 "pamtest-edvint", hash=<optimized out>, nullok=nullok@entry=0) at passverify.c:111
  #7  0x00007ffff1dc9139 in _unix_verify_password (pamh=pamh@entry=0x7fffd8002910, name=0x7fffd8002ab0 "pamtest-edvint", p=0x7fffd8005a60 "pamtest-edvint", ctrl=ctrl@entry=8389156) at support.c:777
  #8  0x00007ffff1dc6556 in pam_sm_authenticate (pamh=0x7fffd8002910, flags=<optimized out>, argc=<optimized out>, argv=<optimized out>) at pam_unix_auth.c:178
  #9  0x00007ffff7bcef1a in _pam_dispatch_aux (use_cached_chain=<optimized out>, resumed=<optimized out>, h=<optimized out>, flags=1, pamh=0x7fffd8002910) at pam_dispatch.c:110
  #10 _pam_dispatch (pamh=pamh@entry=0x7fffd8002910, flags=1, choice=choice@entry=1) at pam_dispatch.c:426
  #11 0x00007ffff7bce7e0 in pam_authenticate (pamh=0x7fffd8002910, flags=flags@entry=1) at pam_auth.c:34
  #12 0x00000000005ae567 in XA_mh_authorize (username=username@entry=0x7fffd80028d0 "pamtest-edvint", password=password@entry=0x7fffd80028f0 "pamtest-edvint", error=error@entry=0x7ffff31e1be8) at xa_auth.c:83
  #13 0x00000000005adf20 in stub_XA_mh_authorize (username=<optimized out>, password=<optimized out>) at xa_auth_stubs.c:42
```

`pam_start` and `pam_end` doesn't help here, because on `pam_end` the library is `dlclose`-ed, so on next `pam_authenticate` it will have to go through the initialization code again.
(This initialization code would've belonged into `pam_start`, not `pam_authenticate`, but there are several layers here including a call to `crypt_r`).
Upstream has fixed this problem >5 years ago by switching to libxcrypt instead.

Signed-off-by: Edwin Török <edwin.torok@cloud.com>
Signed-off-by: Christian Lindig <christian.lindig@cloud.com>
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants