Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ansible 2.10 + Collections support #715

Merged
merged 140 commits into from
Dec 24, 2020

Conversation

s1113950
Copy link
Collaborator

@s1113950 s1113950 commented May 8, 2020

Fixes #652
Fixes #731

Summary of changes:

  1. Rename some monkeypatches because Ansible 2.10 changed how stuff is called (for this reason I think we should stick with 0.3.0 being backwards-incompatible and do a separate 0.2.10 release for everything currently in master)
  2. Hack the ActionModuleMixin base class because collection support didn't have an ActionBase for some reason. This hopefully can be removed because I initially added it in when I was still on Ansible's devel branch, pre-2.10.
  3. Use Ansible's collections logic to determine where collections are installed, and then tell the master where they are via its sys.path so it can load them later when it looks for where dependencies live
  4. Handle the fake module __synthetic__ thing that Ansible added to collections modules

TODO:

"The full traceback is:",
        "Traceback (most recent call last):",
        "  File \"/mitogen-test/.venv3/lib/python3.8/site-packages/ansible/executor/task_executor.py\", line 158, in run",
        "    res = self._execute()",
        "  File \"/mitogen-test/.venv3/lib/python3.8/site-packages/ansible/executor/task_executor.py\", line 663, in _execute",
        "    result = self._handler.run(task_vars=variables)",
        "  File \"/mitogen-test/.venv3/lib/python3.8/site-packages/ansible/plugins/action/normal.py\", line 46, in run",
        "    result = merge_hash(result, self._execute_module(task_vars=task_vars, wrap_async=wrap_async))",
        "  File \"/mitogen-test/.venv3/lib/python3.8/site-packages/ansible/plugins/action/__init__.py\", line 826, in _execute_module",
        "    self._make_tmp_path()",
        "  File \"/mitogen-test/.venv3/lib/python3.8/site-packages/ansible/plugins/action/__init__.py\", line 388, in _make_tmp_path",
        "    tmpdir = self._remote_expand_user(self.get_shell_option('remote_tmp', default='~/.ansible/tmp'), sudoable=False)",
        "  File \"/mitogen-test/.venv3/lib/python3.8/site-packages/ansible/plugins/action/__init__.py\", line 709, in _remote_expand_user",
        "    data = self._low_level_execute_command(cmd, sudoable=False)",
        "  File \"/mitogen-test/.venv3/lib/python3.8/site-packages/ansible/plugins/action/__init__.py\", line 1121, in _low_level_execute_command",
        "    rc, stdout, stderr = self._connection.exec_command(cmd, in_data=in_data, sudoable=sudoable)",
        "  File \"/private/tmp/mitogen/ansible_mitogen/connection.py\", line 996, in exec_command",
        "    rc, stdout, stderr = self.get_chain().call(",
        "  File \"/private/tmp/mitogen/ansible_mitogen/connection.py\", line 942, in get_chain",
        "    self._connect()",
        "  File \"/private/tmp/mitogen/ansible_mitogen/connection.py\", line 838, in _connect",
        "    inventory_name, stack = self._build_stack()",
        "  File \"/private/tmp/mitogen/ansible_mitogen/connection.py\", line 774, in _build_stack",
        "    inventory_name=self.get_task_var('inventory_hostname'),",
        "  File \"/private/tmp/mitogen/ansible_mitogen/connection.py\", line 649, in get_task_var",
        "    task_vars = self._get_task_vars()",
        "  File \"/private/tmp/mitogen/ansible_mitogen/connection.py\", line 598, in _get_task_vars",
        "    for line in traceback.format_stack():",
        "NameError: name 'traceback' is not defined",

possibly broken from here?: ansible/ansible@8c2754e , self._connection is set to the old Mitogen connection obj, despite strategy being set to linear

  • Peg to ansible 2.10 now that it's released, rather than cloning from github
  • Fix up the tests to only test 2.10 versions of stuff
  • Add support for collections installed to site-packages (what happens with pip install ansible==2.10.0) instead of via ansible-galaxy collection install
  • fix a Debian stretch (oldstable) connection error
  • Add documentation explaining the changes
  • Remove NOTE sections in the code where backwards compat breaks, since it's been decided it's gonna break
  • [ ] Add more test cases around different common collection usages test cases use installed collections in them, so I think we're covered here
  • [ ] Anything else that Ansible 2.10 might have broken fixed everything that's been reported so far, and will fix future things in future releases
  • [ ] get debops working: Ansible 2.10 + Collections support #715 (comment) it might not be 2.10 compatible just yet for our use case -> disabling debops because ran into Fatal error in pki-role: lookup plugin (task_src) not found debops/debops#1521
  • See if collections processing can be done more efficiently

@cognifloyd
Copy link
Contributor

Hmm. Fascinating.
the mixin is using ansible.plugins.action.ActionBase but the error message I saw said it could not find ansible_collections.ansible.builtin.plugins.action. So, we're hitting something where ansible is prepending ansible_collections. and then trying to look it up.

    class ActionModuleMixin(ansible.plugins.action.ActionBase):
AttributeError: module 'ansible_collections.ansible.builtin.plugins.action' has no attribute 'ActionBase'

In ansible/ansible#52194 there was a line that prefixed modules with ansible_collections., but I don't see it in the stable-2.9 branch.

This is a concerning comment "preventing controller-side code injection during collection loading"... I wonder if that's what's breaking this.

        # this loader implements key functionality for Ansible collections
        # * implicit distributed namespace packages for the root Ansible namespace (no pkgutil.extend_path hackery reqd)
        # * implicit package support for Python 2.7 (no need for __init__.py in collections, except to use standard Py2.7 tooling)
        # * preventing controller-side code injection during collection loading
        # * (default loader would execute arbitrary package code from all __init__.py's)

I'll keep looking for how ansible_collections. is getting prepended.

@s1113950
Copy link
Collaborator Author

s1113950 commented May 9, 2020

Here's where it gets prepended:
https://github.com/ansible/ansible/pull/52194/files#diff-279e661b06196c21d04f8226e65ee458R24

They alias to ansible's modules and build ansible_collections.* there

@s1113950
Copy link
Collaborator Author

s1113950 commented May 9, 2020

so https://github.com/ansible/ansible/pull/52194/files#diff-279e661b06196c21d04f8226e65ee458R27 specifically is why we see the error about ansible_collections.ansible.builtin.plugins.action, because that's pointing to ansible.plugins.action, but if you check out the collection_loader it doesn't have an actionbase.
They put the collection loader stuff in a util file and use it as a special importer path rather than a specific loader type: https://github.com/ansible/ansible/pull/52194/files#diff-cb51550dcfdbd27a3f224be2de15e2e8R822

@s1113950
Copy link
Collaborator Author

s1113950 commented May 9, 2020

Ok wat. I'm able to dynamically load things now from sys.modules that are collections, but in this example: https://github.com/alikins/collection_inspect/tree/master/plugins only action and modules appear in sys.modules:

(Epdb) dir(sys.modules["ansible_collections.alikins.collection_inspect.plugins"])
['__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', 'action', 'modules']

so I'm gonna have to go the full file path route now possibly, or I'll try one of the other *method classes used to detect source code! :D challenge_accepted

@cognifloyd
Copy link
Contributor

So, that collection has:

  • action ✔️
  • callbackthis might require whitelisting in ansible.cfg?
  • filter
  • lookup
  • module_utilscomments below
  • modules ✔️ this uses a special AnsibleFlatMapLoader
  • varsI don't expect this one to work until 2.10

Both action and module_utils are special cases in _SYNTHETIC_PACKAGES. Both are maps, but module_utils has graft=True, so maybe that is the difference?
graft is used in AnsibleCollectionLoader.__init__ to "pre-inject grafted package maps" and again in AnsibleCollectionLoader._find_module. In there, it should be doing import_module("ansible.module_utils" + ".whatever").

Ah. I suspect the only way module_utils would get loaded on the controller side is if one of the other plugins in the collection imports something from it. AnsiballZ (lib/ansible/executor/module_common.py) takes care of detecting module_utils to shove down to the store.

Here's a significant change that adjusted how AnsiballZ collects files to be able to load module_utils with relative imports from collections into the AnsiballZ wrapper: ansible/ansible#61196

@cognifloyd
Copy link
Contributor

Oh, and there's a CollectionModuleInfo.get_source() method there too. You had a comment about getting the source, maybe using that will help.

@s1113950
Copy link
Collaborator Author

Thanks for the info 👍

@nitzmahone
Copy link

nitzmahone commented May 14, 2020

FYI: you might not want to go too crazy here just yet- 2.10 changes the internals on this stuff quite a bit for 2.9 backwards compatibility with collectionized content that's no longer in Ansible core. See ansible/ansible#67684 which is just about to land in devel. Off the top of my head, I don't see anything here that 67684 should break, but the child_module calculation may be incomplete, since our loader's iter_modules impl doesn't currently include the redirected stuff, only the stuff on disk. I'm not familiar with how that gets used, so it might or might not matter for what you're doing.

@s1113950
Copy link
Collaborator Author

Thanks for the heads up!

@cognifloyd
Copy link
Contributor

Does that mean that mitogen should only start supporting collections with 2.10?
I suspect that ansible's moving modules/plugins from ansible core to collections in 2.10 might require (many?) other changes in ansible_mitogen as well (and maybe mitogen itself too). The 2.10 move might be a forcing function: without collections support, ansible_mitogen will not be very useful.

Plus, with mitogen's philosophy of supporting very old python, and very old ansible versions, what does this mean for ansible_mitogen supporting ansible 2.8 (w/ tech preview of collections) or 2.9 (w/ support for collections)?

@nitzmahone
Copy link

@cognifloyd Not sure if you're asking me or others on this PR. I've got no opinion- someone on our team pointed this PR out to me since I'm doing most of the work on the core engine's collections support (which is the part you're most likely to be impacted by). I will say that 2.10 will be the first release where everyone gets exposure to collections. Only the bleeding-edge folks and partners developing collections were doing much with them in 2.8/2.9, but the 2.10 community distributions of Ansible will all be collection-ized internally (ansible-base + all the collections required to maintain 2.9 compatibility "installed" to sys.path, likely).

@cognifloyd
Copy link
Contributor

@nitzmahone I was talking more about the path forward for mitogen. Given that ansible mitogen advertises support for ansible 2.3 to 2.8 (see here), do we need to worry about not supporting collections with ansible 2.8 or 2.9? At this point, the answer is probably "no" we don't need to support collections there. We'll need to document that and get the website updated (is that in this repo? if it is, I can submit a PR).

I understand that there are a lot of changes with the 2.10 release. I'd really like to try out mitogen with collections.
So, @s1113950, what if we adjusted the tests temporarily to pull in ansible devel branch to start figuring out what is needed with all the new collections stuff. That way we can release the next version of mitogen with 2.10 support soon after 2.10 is released.

@s1113950
Copy link
Collaborator Author

Website gets updated in this way in the code: #714

Using devel makes sense to me. If a ton of stuff is gonna change it doesn't make much sense to me to spend a ton of time now getting things to work on 2.9 just for it to change a bunch in 2.10. I'll adjust tests on this PR to point to the latest hash of devel and go from there!

I don't know what to do about the backwards compat issue @cognifloyd. We already have quite a few places in the code that do hackish things to support super old Ansible... I wonder if it makes sense to only support ansible 2.10+ with mitogen 0.3.0+ and tell folks to use mitogen < 0.3 if they want <= ansible 2.9, depending on the kinds of changes that 2.10 brings? We could do a 0.2.10 release with everything leading up to 2.10 support (like ansible_python_interpreter detection) and then do a version change to 0.3 for ansible 2.10+ perhaps 🤔 WDYT?

@cognifloyd
Copy link
Contributor

cognifloyd commented May 16, 2020

I guess we need @dw to weigh in and say whether this would work for him and how he's using it in Operon. From everything I've read, backwards compat is super important for him.

Personally, I'd be fine with 2.10+ in 0.3

@s1113950
Copy link
Collaborator Author

Agreed 👍

@s1113950
Copy link
Collaborator Author

@cognifloyd David is cool with bumping the version number for collections support :)

@cognifloyd
Copy link
Contributor

By bumping the version number, you mean bumping the minimum ansible version, right?
Sounds awesome! All in on 2.10!

@extmind
Copy link

extmind commented Nov 21, 2020

@cnkk @extmind fixed the import error, please give it a try smile . The bug is with the setup module in Ansible, but specifically only for Pythons 3.5.1-3.5.3 (I tested a bunch of Python3 versions the first time Mitogen had this error). Ansible changed the signature of the setup module in 2.10, which broke the hack.

Thanks for tracking this down. Small addition: When executing a playbook, ansible reported a slightly different error that I was able to circumvent by adding ansible.legacy.setup to the set of broken setup modules.

@s1113950
Copy link
Collaborator Author

@moreati @dw are you cool with the updates? If so, I'd like to finalize the PR, add some documentation, and tag a release. Big finalization question is based on the changes required, does it make sense to break backwards compat. I've NOTE:-ed every place I could find in the PR of some type of api-like breaking change (ie ansible renaming get to get_with_context).

I vote doing 2 releases: 1 with everything in master except for this PR, and then another with this PR in it. That way if someone wants to use an older version of Ansible then they could use the older Mitogen version, and we wouldn't have to sacrifice speed and code readability by maintaining older Ansible API changes hanging around.

mitogen/master.py Show resolved Hide resolved
tests/requirements.txt Show resolved Hide resolved
@moreati
Copy link
Member

moreati commented Nov 25, 2020

@moreati @dw are you cool with the updates?

Mostly. I've updated my review comments and I think two still need addressing. Additionally I see there are 5 remaining subtasks

I vote doing 2 releases: 1 with everything in master

I see the following blockers to releasing current master

  1. Travis tests are currently red
  2. I don't know the status of Ansible 2.9 support in current master

@s1113950
Copy link
Collaborator Author

@moreati thanks for updating your comments, I've responded to both 👍

Additionally I see there are 5 remaining subtasks

I updated the subtasks, now all that's left is to remove the NOTE comments and add some documentation (along with a changelog entry)

Travis tests are currently red

There's a commit somewhere in this PR that fixes Travis tests iirc. I'll find it and open a new PR with it.

I don't know the status of Ansible 2.9 support in current master

There's been multiple github tickets filed of people trying to fix our README with adding 2.9 in the list of supported Ansible versions, so I think we're g2g on it 👍 everything should be good with that version except for collections, but collections weren't really a first class feature in Ansible 2.9 (#715 (comment)) so I don't think we need to include collection support in 2.9.

.ci/ci_lib.py Show resolved Hide resolved
tar xvf sshpass-1.05.tar.gz && \
cd sshpass-1.05 && \
./configure && \
sudo make install")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be one of the test fixes that is split out from this PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old sshpass version doesn't work anymore on the test setup, so I'm fine splitting it out but then that PR would need to land before this one does. WDYT?

@@ -18,75 +18,64 @@ cache:

install:
- grep -Erl git-lfs\|couchdb /etc/apt | sudo xargs rm -v
- pip install -U pip==20.2.1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pip >= 20.2 is what triggers the UnicodeDecodeError you had to work around in ansible_install.py

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, nice find! I'm in favor of passing the encoding on command install rather than pinning to an older version of pip, so the fix isn't lost in the future. I'm imagining a scenario where a new pip version comes along that we want, and then we try and upgrade to it and run into the potential UnicodeDecodeError again, but forget how to resolve it. For that reason, I'm inclined to keep the encoding as-is in ansible-install.py. WDYT?

.travis.yml Show resolved Hide resolved
@geigi
Copy link

geigi commented Dec 8, 2020

Hi :)

I just encountered an issue trying this branch. Here is my setup:

ansible 2.10.3
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/home/user/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /home/user/venv/ansible/lib/python3.8/site-packages/ansible
  executable location = /home/user/venv/ansible/bin/ansible
  python version = 3.8.6 (default, Sep 25 2020, 09:36:53) [GCC 10.2.0]

I did run my playbook like this:

ANSIBLE_STRATEGY_PLUGINS=/home/user/git/mitogen/ansible_mitogen/plugins/strategy ANSIBLE_STRATEGY=mitogen_linear ansible-playbook -i inventory playbook.yml -l "$HOST$"

This is the error message:

ERROR! [mux  13615] 09:05:39.967788 E mitogen.[ssh.$DELEGATE_HOST$]: ExternalContext.main() crashed
Traceback (most recent call last):
  File "<stdin>", line 3989, in main
  File "<stdin>", line 3726, in run
  File "<stdin>", line 633, in _profile_hook
  File "<stdin>", line 3715, in _dispatch_calls
  File "<stdin>", line 3661, in _dispatch_one
  File "<stdin>", line 3648, in _parse_request
  File "<stdin>", line 668, in import_module
  File "<frozen importlib._bootstrap>", line 2237, in _find_and_load
  File "<frozen importlib._bootstrap>", line 2226, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1191, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1161, in _load_backward_compatible
  File "<stdin>", line 1508, in load_module
  File "master:/home/user/git/mitogen/ansible_mitogen/target.py", line 85, in <module>
    import ansible_mitogen.runner
  File "<frozen importlib._bootstrap>", line 2237, in _find_and_load
  File "<frozen importlib._bootstrap>", line 2226, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1191, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1161, in _load_backward_compatible
  File "<stdin>", line 1508, in load_module
  File "master:/home/user/git/mitogen/ansible_mitogen/runner.py", line 85, in <module>
    import ansible.module_utils.basic
  File "<frozen importlib._bootstrap>", line 2237, in _find_and_load
  File "<frozen importlib._bootstrap>", line 2226, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1191, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1161, in _load_backward_compatible
  File "<stdin>", line 1508, in load_module
  File "master:/home/user/venv/ansible/lib/python3.8/site-packages/ansible/module_utils/basic.py", line 272, in <module>
    sys.exit(1)
SystemExit: 1
fatal: [$HOST$]: UNREACHABLE! => {"changed": false, "msg": "Channel was disconnected while connection attempt was in progress; this may be caused by an abnormal Ansible exit, or due to an unreliable target.", "unreachable": true}

The task where the error occurred is a delegate_to task to host $DELEGATE_HOST$ (Debian 8, Python 2.7.9 and Python 3.4.2). Just before the task where the crash occured, exactly the same task is executed fine on a different host (Debian 7, Python 2.7.3, no python3). Without mitogen the playbook runs just fine.

Thanks for your work so far and let me know if you need more information!

EDIT: setting ansible_python_interpreter: auto_legacy fixes the issue

@s1113950
Copy link
Collaborator Author

@geigi thanks for testing! :)

Your issue happens because Ansible itself threw an system exit, the relevant code is here:
https://github.com/ansible/ansible/blob/v2.10.3/lib/ansible/module_utils/basic.py#L264-L272

Setting ansible_python_interpreter: auto_legacy told Mitogen/Ansible how to detect which python version to run with. What I believe happened was that without setting auto_legacy, a bad Python version was detected and Ansible 💥 which caused Mitogen to 💥 .
Mitogen defaults to the auto method if it can't determine what Python version to use, instead of auto_legacy.

Also, I made a different ticket here because I think if possible, Mitogen should error nicer and say Ansible 💥 instead of throwing a massive traceback.

@geigi
Copy link

geigi commented Dec 11, 2020

Thanks for the explanation 😊
One thing is not clear to me yet:

Running the same playbook without mitogen and without the auto_legacy option worked just fine and no system exit is thrown by ansible. Why does the system exit only occur when using mitogen?

EDIT: I've tested it multiple times to make sure it wasn't a fluke

@s1113950
Copy link
Collaborator Author

From what I can tell, looks like Ansible tries auto_legacy mode first when detecting an interpreter: https://github.com/ansible/ansible/blob/v2.10.0/lib/ansible/executor/interpreter_discovery.py#L115 which would explain why it doesn't system exit but Mitogen does (Mitogen defaults to auto instead of auto_legacy).

Since Mitogen relies on sending the interpreter discovery mode to Ansible and auto_legacy is the legacy mode of detecting an interpreter, Mitogen defaults to the auto way of detecting an interpreter. Ansible throws a sys.exit(1) if the detected python version is too old/unsupported, and that big traceback you found is what Mitogen currently does if Ansible throws a sys.exit(1).

@geigi
Copy link

geigi commented Dec 14, 2020

Good to know that the default for ansible_python_interpreter is different with mutagen :) Thanks!

@s1113950
Copy link
Collaborator Author

s1113950 commented Dec 24, 2020

@moreati I made changes/commented on your latest review. I'm going to make an 0.3.0-rc.0 branch now with this PR in it, as well as a 0.2-release branch and tag it with 0.2.10-rc.0 that will contain everything in master except for this PR in it as discussed previously in the PR thread. Can't make a 0.2.10 tag yet because tests needed fixing on old master with some of what's in this PR. There's still more to do here as well (docs, changelog entries I forgot to make, somehow testing bug here, etc), but I'd love to get something merged before Dec 31st 2020 because ⬇️ ⬇️ ⬇️

PSA: travis.org is being EOL EOY, which will affect Mitogen since our required tests are on travis.org. I don't have access to the settings of Mitogen, so I can't change this. @dw can you point tests to travis.com instead? See migration from travis.org to travis.com

@s1113950 s1113950 merged commit 9463728 into mitogen-hq:master Dec 24, 2020
@s1113950 s1113950 deleted the collectionsSupport branch December 24, 2020 01:22
@s1113950
Copy link
Collaborator Author

s1113950 commented Dec 24, 2020

New releases made:
https://github.com/dw/mitogen/releases/tag/v0.2.10-rc.0
https://github.com/dw/mitogen/releases/tag/v0.3.0-rc.0

TL;DR of this PR thread: Mitogen has 2 releases now due to a bunch of API-breaking changes Ansible 2.10 made, 1 with Ansible < 2.10 and one Ansible 2.10+. Please tag any Github issues related to either release candidate with the new labels affects-0.2 or affects-0.3 🙏

@jfcabral
Copy link

Thank you for you hard work! It’s great to see mitogen on the verge of supporting Ansible v2.10! 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ansible 2.10 support Support for ansible collections import hook