Test unicast DNS-SD as well as multicast, with both Avahi and mDNSResponder, on Ubuntu 20.04 #250

Merged 6 commits on Apr 27, 2022

garethsb

Test unicast DNS-SD as well as multicast, with both Avahi and mDNSResponder, on Ubuntu 20.04

  • Add matrix.dns_sd_mode as 'unicast' or 'multicast', used in job name and results filenames
  • For now, unicast DNS-SD testing is not supported on Windows and macOS
  • Add api.testsuite.nmos.tv and mocks.testsuite.nmos.tv to /etc/hosts (as per https://github.com/AMWA-TV/nmos-testing/blob/master/test_data/BCP00301/README.md#hosts-files)
  • Stomp on /etc/resolv.conf (systemd-resolve --set-dns only added the server rather than replacing the existing ones) to ensure that only the mock DNS server is used, and restart the Avahi or mDNSResponder daemon (reloading and/or invalidating/flushing systemd-resolve and nscd caches between test suite runs was found to be unnecessary)
  • Configure nmos-cpp-node to time out registration requests (after 5s) and retry long-running DNS-SD queries (after 10s), well before the testing tool's DNS_SD_ADVERT_TIMEOUT (30s by default)
  • Run the testing tool with elevated permissions when testing unicast DNS-SD (unfortunately necessitating a quick hack of pip install-ing packages as root)
  • Note the domain names passed to run_nmos_testing.sh mustn't have trailing dots, to avoid "[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: Hostname mismatch, certificate is not valid for 'api.testsuite.nmos.tv.'. (_ssl.c:1131)" from e.g. IS-07-02 test_05
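
Two of the points above (overwriting the resolver configuration rather than appending to it, and passing domain names without trailing dots) could be sketched roughly as follows. This is only an illustration, not the workflow's actual steps: the function names `strip_trailing_dot` and `write_resolv_conf` are hypothetical, and the mock DNS server address used in the usage note is an assumption.

```shell
# Hypothetical helpers for illustration; not taken from the workflow itself.

# Remove a single trailing dot so the name matches the names in the certificate
strip_trailing_dot() {
    printf '%s\n' "${1%.}"
}

# Overwrite (not append to) a resolv.conf-style file so it lists exactly one nameserver
# $1 = nameserver address, $2 = target file (defaults to /etc/resolv.conf)
write_resolv_conf() {
    printf 'nameserver %s\n' "$1" > "${2:-/etc/resolv.conf}"
}
```

For example, `write_resolv_conf 127.0.0.1` (run as root, and assuming the mock DNS server listens on that address) leaves only that server in /etc/resolv.conf, and `strip_trailing_dot api.testsuite.nmos.tv.` yields a name suitable for passing to run_nmos_testing.sh.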

Unrelated changes:

  • Update to conan-cmake 0.18.1 to support Visual Studio 2022 (a.k.a. "17", or "MSVC 19.31.31105.0" as now installed on GitHub windows-latest runner)
  • However, switch to windows-2019 rather than windows-latest (now an alias for windows-2022) because there aren't yet any Conan binary packages for VS 2022 and testssl.sh is consistently failing with "Fatal error: No IPv4/IPv6 address(es) for "nmos-api.local" available" on the newer virtual environment
  • Add host_addresses to config because on Windows I'm seeing "No matching mDNS announcement" failures in IS-04-02 and IS-04-03 and a second "Registered address" in the nodeoutput log for the "vEthernet (nat)" adapter, despite it being disabled in the "windows setup" step...
  • Use latest https://github.com/DannyBen/kojo which means whitespace-only lines due to indented imports are gone

garethsb commented Apr 22, 2022

Commit caabf8e resolves failures for Avahi unicast via the suggestion made by @wsneijers in #128, without any apparent effect on the mDNSResponder results, by partially reverting 6a26bdb. However, we still need to do a bit of digging...

garethsb force-pushed the patch-5 branch 7 times, most recently from 1c8028a to 9f9a163, April 27, 2022 12:17

garethsb commented Apr 27, 2022

@lo-simon This is as far as I'm going to take this for now. I've resquashed all the commits to tell a sensible story, with the main work to enable unicast DNS-SD testing in the first two commits separate from the bug fix for Avahi. I've finally excluded the mDNSResponder unicast job because of the intermittent failures. But I believe the local testing we've done with unicast DNS-SD with mDNSResponder after the bug fix is encouraging... As discussed offline, there is no difference between the build steps for unicast or multicast DNS-SD testing, so it might be better done as two testing steps on each job in the matrix... but keeping the test results separate is useful, so we decided to leave it this way for now at least.

Despite the *** buffer overflow detected *** in mdnsd that the GitHub Actions runs intermittently showed, we can't recommend using Avahi in every circumstance, due to the issue with getaddrinfo when multiple addresses are available for a given host name described in #99 (comment).

FWIW, I did a lot of debugging using tmate:

    - name: Setup tmate session
      uses: mxschmitt/action-tmate@v3

Another useful tip when testing Avahi: if your system is one of those where avahi-daemon keeps coming back from the dead, you have probably already tried:

    systemctl stop avahi-daemon

Maybe even:

    systemctl stop avahi-daemon.socket avahi-daemon.service

But what finally stops it could be:

    systemctl mask avahi-daemon
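
The escalation above could be wrapped up as a small helper, sketched here under a couple of assumptions: the function name `silence_avahi` and the `SYSTEMCTL` override variable are hypothetical, introduced only so that the sequence can be dry-run without touching the system.

```shell
# Hypothetical helper: stop the socket and service, then mask the unit.
# SYSTEMCTL can be overridden (e.g. SYSTEMCTL=echo) to print the commands instead.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

silence_avahi() {
    # Stop the socket as well as the service, so socket activation cannot restart the daemon
    $SYSTEMCTL stop avahi-daemon.socket avahi-daemon.service
    # Masking prevents the unit from being started at all until it is unmasked
    $SYSTEMCTL mask avahi-daemon
}
```

Running `SYSTEMCTL=echo silence_avahi` prints the two commands without executing them; `systemctl unmask avahi-daemon` undoes the mask afterwards.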

And finally, while I remember, I also played with the positive and negative TTL values in the dns_base.zone file in the testing tool, based on discussion around AMWA-TV/nmos-testing#228, but the following, for example, made no discernible difference:

    $TTL 10s
    {{ domain }}.  IN SOA   ns.{{ domain }}. postmaster.{{ domain }}. ( 2007120710 10s 5s 20s 5s )

…ponder, on Ubuntu 20.04

* Add matrix.dns_sd_mode as 'unicast' or 'multicast', used in job name and results filenames
* For now, unicast DNS-SD testing is not supported on Windows and macOS
* Add api.testsuite.nmos.tv and mocks.testsuite.nmos.tv to /etc/hosts (as per https://github.com/AMWA-TV/nmos-testing/blob/master/test_data/BCP00301/README.md#hosts-files)
* Stomp on /etc/resolv.conf (systemd-resolve --set-dns only added the server rather than replacing the existing ones) to ensure that only the mock DNS server is used, and restart the Avahi or mDNSResponder daemon (reloading and/or invalidating/flushing systemd-resolve and nscd caches between test suite runs was found to be unnecessary)
* Configure nmos-cpp-node to time out registration requests (after 5s) and retry long-running DNS-SD queries (after 10s), well before the testing tool's DNS_SD_ADVERT_TIMEOUT (30s by default)
* Run the testing tool with elevated permissions when testing unicast DNS-SD (unfortunately necessitating a quick hack of pip install-ing packages as root, with --upgrade to work around incompatibilities with e.g. cryptography in dist-packages)
* Note the domain names passed to run_nmos_testing.sh mustn't have trailing dots, to avoid "[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: Hostname mismatch, certificate is not valid for 'api.testsuite.nmos.tv.'. (_ssl.c:1131)" from e.g. IS-07-02 test_05

Unrelated changes:
* Update to conan-cmake 0.18.1 to support Visual Studio 2022 (a.k.a. "17", or "MSVC 19.31.31105.0" as now installed on GitHub windows-latest runner)
* However, switch to windows-2019 rather than windows-latest (now an alias for windows-2022) because there aren't any Conan binary packages for VS 2022 and testssl.sh is consistently failing with "Fatal error: No IPv4/IPv6 address(es) for "nmos-api.local" available" on the newer virtual environment
* Add host_addresses to config because on Windows I'm seeing "No matching mDNS announcement" failures in IS-04-02 and IS-04-03 and a second "Registered address" in the nodeoutput log for the "vEthernet (nat)" adapter, despite it being disabled in the "windows setup" step...
* Stop and restart mDNS daemons when testing
* Launch mdnsd with -debug for the duration
* Log mDNS announcements to demonstrate mdnsd messages about "excessive update rate" are based on its interpretation of the spec (targeting 6s interval rather than 1s interval)
… resolver intermittently picking another DNS server
…s in at least IS-04-01 test_15, test_16 and test_21 (see sony#128)
E.g.
* Bad service type in ._nmos-registration._tcp...
* Excessive update rate for nmos-cpp_node_nmos-api-...
* mDNSPosix.c:mDNSPlatformSetAllowSleep(): NOT IMPLEMENTED!
* setsockopt - SO_RECV_ANYIF: Protocol not available
lo-simon merged commit 71f9fe0 into sony:master Apr 27, 2022
garethsb deleted the patch-5 branch April 27, 2022 16:37
garethsb added a commit to garethsb/nmos-cpp that referenced this pull request Apr 28, 2022
* for build and dependencies, e.g. sony#197, sony#198, sony#207, sony#211, sony#215, sony#229, sony#230, sony#235, sony#243
* for SDP parser/generator, e.g. sony#201, sony#205, sony#219, sony#241, sony#242, sony#244
* for RQL, e.g. sony#224
* for CI tests, e.g. sony#218, sony#231, sony#239, sony#250