-
Notifications
You must be signed in to change notification settings - Fork 883
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(wsl): Properly assemble multipart data #5538
fix(wsl): Properly assemble multipart data #5538
Conversation
Defaulting to empty dict instead of empty bytes Exercise that with a test case in which Landscape data is empty That was how I figured the issue: - We are not assemblying the binary list correctly when passing multipart user-data. More to come.
To properly assemble a multipart data we need raw bytes If both are bytes just pass them as they are. If either user or agent data is dict we need to assemble back the raw binary data (including the content type stenza) (Maybe we should reload the files in their binary form instead?)
@blackboxsw gentle ping. Could this PR be reviewed? |
If either agent or user/Landscape data is not a #cloud-config, then we cannot apply the merge rules of this datasource, but rather we need to pass them along so cloud-init internals can handle them. With the current implementation we attempt to pass them as a list of bytes. That has lots of limitations and breaks cloud-init internal handlers in many ways. For example a user data YAML file containing a syntax error would trigger a nasty traceback and it would be hard to find in the logs that the error was due a ill-formed YAML (thanks jibel for showing me that). Investigating edge cases like that I also realized we couldn't support jinja templates of any sort with that implementation. So, if one of the merge candidates is not cloud-config, the best way to feed cloud-init with them is by pretending the user wrote an #include file. That should allow all cloud-init features to just work. Previously we didn't check if the YAML we loaded was actually cloud-config. It could be a jinja template of a YAML cloud-config, in which case merging wouldn't make sense. Thus we now explicitely check before loading the YAML as a dict. A jinja template of a cloud-config for example goes raw into cloud-init, which is exactly what we want. UserDataProcessor's will handle the template rendering and post-processing just fine.
All related to the same issue: how we were passing the multipart user data into cloud=init.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @CarlosNihelton . I appreciate that your approach here is trying to move away from load_yaml_or_bin which I think adds to clarity of the logic we are using in WSL datasource. I'd like to provide you with a constructive patch suggestion tomorrow that may keep things simple if we can as the use of ConfigData as written feels like it involves a lot of method calls to interact properly with that layer. I wanted to give you an update is this is now foremost on my plate for review.
cloudinit/sources/DataSourceWSL.py
Outdated
if not os.path.isfile(path.as_posix()): | ||
raise FileNotFoundError(f"Config file {path} not found") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're already checking isfile at all callsites or an IOError handling if FileNotFound which util.load_binary_file raises for you. No need to recheck in init. It just adds duplicated work.
if not os.path.isfile(path.as_posix()): | |
raise FileNotFoundError(f"Config file {path} not found") |
cloudinit/sources/DataSourceWSL.py
Outdated
self._data = dict() | ||
return | ||
|
||
if self._raw_data.startswith("#cloud-config"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should avoid hard-coding that starting string as there may be other header coments grown in the future that indicate a text/cloud-config part.
It's probably best to use the helper from cloudinit.handlers import type_from_starts_with
if self._raw_data.startswith("#cloud-config"): | |
if "text/cloud-config" == type_from_starts_with(self._raw_data): |
cloudinit/sources/DataSourceWSL.py
Outdated
if not os.path.isfile(path.as_posix()): | ||
raise FileNotFoundError(f"Config file {path} not found") | ||
|
||
self._raw_data: str = util.load_binary_file(path).decode("utf-8") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm struggling with us calling this _raw_data when it's already been UTF-8 decoded as that's specifically what cloudinit.user_data::UserDataProcessor.process does for us when cloud-init proper attempts to process user_data provided by all other datasources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed decoding is mostly not necessary. I was puzzled with that because self.userdata_raw
is declared in the superclass as Optional[str]
, but yeah, we can remove the decoding without any issues for the linters (except mypy :) ) and tests in use as well as behavior.
cloudinit/sources/DataSourceWSL.py
Outdated
raise FileNotFoundError(f"Config file {path} not found") | ||
|
||
self._raw_data: str = util.load_binary_file(path).decode("utf-8") | ||
self._raw_data = self._raw_data.lstrip() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we lstriping raw_data? Shouldn't we be passing this around unaltered and undecoded?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lstrip was for later checking if it started with #cloud-config
stenza, which is no longer necessary as we are now comparing the MIME type with the output of cloudinit.handlers.type_from_starts_with()
.
We already check at the call sites except for one where we already expect a FileNotFoundError which would be raised by util.load_binary_file anyway.
Leverage cloudinit.handlers.type_from_starts_with and compare the MIME type instead. Makes it more future proof.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for patience @CarlosNihelton on my review here. I've tried consolidating the suggestions into a single diff which I have pushed.
The changes include:
-refactoring ConfigData to avoid a raw
method and overrides to empty files as dict().
- provide ConfigData.is_cloud_config
- refaction _get_data conditionals and merge logic into it's own helper function
merge_agent_landscape_data
- add unittests for merge_agent_landscape_data
- use util.load_text_file instead of util.load_binary.
The consolidated diff is below:
--- a/cloudinit/sources/DataSourceWSL.py
+++ b/cloudinit/sources/DataSourceWSL.py
@@ -149,26 +149,16 @@ class ConfigData:
retaining its raw representation alongside its file path"""
def __init__(self, path: PurePath):
- self._raw_data: str = util.load_binary_file(path).decode("utf-8")
+ self.raw: str = util.load_text_file(path)
self.path: PurePath = path
- self._data: Optional[dict] = None
- if len(self._raw_data) == 0:
- self._data = dict()
- return
+ self.config_dict: Optional[dict] = None
- if "text/cloud-config" == type_from_starts_with(self._raw_data):
- self._data = util.load_yaml(self._raw_data)
+ if "text/cloud-config" == type_from_starts_with(self.raw):
+ self.config_dict = util.load_yaml(self.raw)
- def is_dict(self) -> bool:
- return self._data is not None
-
- def as_dict(self) -> dict:
- assert self._data is not None
- return self._data
-
- def raw(self) -> str:
- return self._raw_data
+ def is_cloud_config(self) -> bool:
+ return self.config_dict is not None
def load_instance_metadata(
@@ -186,7 +176,7 @@ def load_instance_metadata(
)
try:
- metadata = util.load_yaml(util.load_binary_file(metadata_path))
+ metadata = util.load_yaml(util.load_text_file(metadata_path))
except FileNotFoundError:
LOG.debug(
"No instance metadata found at %s. Using default instance-id.",
@@ -220,6 +210,12 @@ def load_ubuntu_pro_data(
)
landscape_data = None
if os.path.isfile(landscape_path):
+ LOG.debug(
+ "Landscape configuration found: %s. Organization policy "
+ "ignores any local user-data in %s.",
+ landscape_path,
+ cloud_init_data_dir(user_home),
+ )
landscape_data = ConfigData(landscape_path)
agent_path = PurePath(os.path.join(pro_dir, AGENT_DATA_FILE))
@@ -230,6 +226,89 @@ def load_ubuntu_pro_data(
return agent_data, landscape_data
+def merge_agent_landscape_data(
+ agent_data: Optional[ConfigData], user_data: Optional[ConfigData]
+) -> Optional[str]:
+ """Merge agent.yaml and <instance>.user-data provided by WSL and Landscape.
+
+ When merging is not possible, provide #include directive to allow cloud-init to merge
+ separate parts.
+ """
+ # Ignore agent_data if None or empty
+ if agent_data is None or len(agent_data.raw) == 0:
+ if user_data is None or len(user_data.raw) == 0:
+ return None
+ return user_data.raw
+
+ # Ignore user_data if None or empty
+ if user_data is None or len(user_data.raw) == 0:
+ if agent_data is None or len(agent_data.raw) == 0:
+ return None
+ return agent_data.raw
+
+ # If both are found but we cannot reliably model both data files as
+ # cloud-config dicts, then we cannot merge them ourselves, so we should
+ # pass the data as if the user had written an include file
+ # for cloud-init to handle internally. We explicitely prioritize the
+ # agent data, to ensure cloud-init would handle it even in the presence
+ # of syntax errors in user data (agent data is autogenerated).
+ # It's possible that the effects caused by the user data would override
+ # the agent data, but that's the user's ultimately responsibility.
+ # The alternative of writing the user data first would make it possible
+ # for the agent data to be skipped in the presence of syntax errors in
+ # user data.
+
+ if not all([agent_data.is_cloud_config(), user_data.is_cloud_config()]):
+ LOG.debug(
+ "Unable to merge {agent_data.path} and {user_data.path}. "
+ "Providing as separate user-data #include."
+ )
+ return "#include\n%s\n%s\n" % (
+ agent_data.path.as_posix(),
+ user_data.path.as_posix(),
+ )
+
+ # We only care about overriding top-level config keys entirely, so we
+ # can just iterate over the top level keys and write over them if the
+ # agent provides them instead.
+ # That's the reason for not using util.mergemanydict().
+ merged: dict = {}
+ user_tags: str = ""
+ overridden_keys: typing.List[str] = []
+ if isinstance(user_data.config_dict, dict):
+ merged = user_data.config_dict
+ user_tags = (
+ merged.get("landscape", {}).get("client", {}).get("tags", "")
+ )
+ if isinstance(agent_data.config_dict, dict):
+ if user_data:
+ LOG.debug("Merging both user_data and agent.yaml configs.")
+ agent = agent_data.config_dict
+ for key in agent:
+ if key in merged:
+ overridden_keys.append(key)
+ merged[key] = agent[key]
+ if overridden_keys:
+ LOG.debug(
+ (
+ " agent.yaml overrides config keys: "
+ ", ".join(overridden_keys)
+ )
+ )
+ if user_tags and merged.get("landscape", {}).get("client"):
+ LOG.debug(
+ "Landscape client conf updated with user-data"
+ " landscape.client.tags: %s",
+ user_tags,
+ )
+ merged["landscape"]["client"]["tags"] = user_tags
+
+ return (
+ "#cloud-config\n# WSL datasouce Merged agent.yaml and user_data\n%s"
+ % yaml.dump(merged).strip()
+ )
+
+
class DataSourceWSL(sources.DataSource):
dsname = "WSL"
@@ -334,73 +413,7 @@ class DataSourceWSL(sources.DataSource):
if not any([user_data, agent_data]):
return False
- # If only user_data was found
- if agent_data is None:
- assert user_data is not None
- self.userdata_raw = user_data.raw()
- return True
-
- # If only agent data was found
- if user_data is None:
- assert agent_data is not None
- self.userdata_raw = agent_data.raw()
- return True
-
- # If both are found but we cannot reliably model both data files as
- # cloud-config dicts, then we cannot merge them ourselves, so we should
- # pass the data as if the user had written an include file
- # for cloud-init to handle internally. We explicitely prioritize the
- # agent data, to ensure cloud-init would handle it even in the presence
- # of syntax errors in user data (agent data is autogenerated).
- # It's possible that the effects caused by the user data would override
- # the agent data, but that's the user's ultimately responsibility.
- # The alternative of writing the user data first would make it possible
- # for the agent data to be skipped in the presence of syntax errors in
- # user data.
- assert user_data is not None and agent_data is not None
- if not agent_data.is_dict() or not user_data.is_dict():
- self.userdata_raw = "#include\n%s\n%s\n" % (
- agent_data.path.as_posix(),
- user_data.path.as_posix(),
- )
- return True
-
- # We only care about overriding modules entirely, so we can just
- # iterate over the top level keys and write over them if the agent
- # provides them instead.
- # That's the reason for not using util.mergemanydict().
- merged: dict = {}
- user_tags: str = ""
- overridden_keys: typing.List[str] = []
- if user_data:
- merged = user_data.as_dict()
- user_tags = (
- merged.get("landscape", {}).get("client", {}).get("tags", "")
- )
- if agent_data:
- if user_data:
- LOG.debug("Merging both user_data and agent.yaml configs.")
- agent = agent_data.as_dict()
- for key in agent:
- if key in merged:
- overridden_keys.append(key)
- merged[key] = agent[key]
- if overridden_keys:
- LOG.debug(
- (
- " agent.yaml overrides config keys: "
- ", ".join(overridden_keys)
- )
- )
- if user_tags and merged.get("landscape", {}).get("client"):
- LOG.debug(
- "Landscape client conf updated with user-data"
- " landscape.client.tags: %s",
- user_tags,
- )
- merged["landscape"]["client"]["tags"] = user_tags
-
- self.userdata_raw = "#cloud-config\n%s" % yaml.dump(merged)
+ self.userdata_raw = merge_agent_landscape_data(agent_data, user_data)
return True
diff --git a/tests/unittests/sources/test_wsl.py b/tests/unittests/sources/test_wsl.py
index 34789d6e8..809684d6b 100644
--- a/tests/unittests/sources/test_wsl.py
+++ b/tests/unittests/sources/test_wsl.py
@@ -53,6 +53,29 @@ GOOD_MOUNTS = {
SAMPLE_LINUX_DISTRO = ("ubuntu", "24.04", "noble")
SAMPLE_LINUX_DISTRO_NO_VERSION_ID = ("debian", "", "trixie")
+AGENT_SAMPLE = """\
+#cloud-config
+landscape:
+ host:
+ url: landscape.canonical.com:6554
+ client:
+ account_name: agenttest
+ url: https://landscape.canonical.com/message-system
+ ping_url: https://landscape.canonical.com/ping
+ tags: wsl
+ubuntu_pro:
+ token: testtoken
+"""
+
+LANDSCAPE_SAMPLE = """\
+#cloud-config
+landscape:
+ client:
+ account_name: landscapetest
+ tags: tag_aiml,tag_dev
+locale: en_GB.UTF-8
+"""
+
class TestWSLHelperFunctions:
@mock.patch("cloudinit.util.subp.subp")
@@ -251,6 +274,57 @@ def join_payloads_from_content_type(
return content
+class TestMergeAgentLandscapeData:
+ @pytest.mark.parametrize(
+ "agent_yaml,landscape_user_data,expected",
+ (
+ pytest.param(
+ None, None, None, id="none_when_both_agent_and_ud_none"
+ ),
+ pytest.param(
+ None, "", None, id="none_when_agent_none_and_ud_empty"
+ ),
+ pytest.param(
+ "", None, None, id="none_when_agent_empty_and_ud_none"
+ ),
+ pytest.param("", "", None, id="none_when_both_agent_and_ud_empty"),
+ pytest.param(
+ AGENT_SAMPLE, "", AGENT_SAMPLE, id="agent_only_when_ud_empty"
+ ),
+ pytest.param(
+ "",
+ LANDSCAPE_SAMPLE,
+ LANDSCAPE_SAMPLE,
+ id="ud_only_when_agent_empty",
+ ),
+ pytest.param(
+ "#cloud-config\nlandscape:\n client: {account_name: agent}\n",
+ LANDSCAPE_SAMPLE,
+ "#cloud-config\n# WSL datasouce Merged agent.yaml and user_data\n"
+ + "\n".join(LANDSCAPE_SAMPLE.splitlines()[1:]).replace(
+ "landscapetest", "agent"
+ ),
+ id="merge_agent_and_landscape_ud_when_both_present",
+ ),
+ ),
+ )
+ def test_merged_data_excludes_empty_or_none(
+ self, agent_yaml, landscape_user_data, expected, tmpdir
+ ):
+ agent_data = user_data = None
+ if agent_yaml is not None:
+ agent_path = tmpdir.join("agent.yaml")
+ agent_path.write(agent_yaml)
+ agent_data = wsl.ConfigData(agent_path)
+ if landscape_user_data is not None:
+ landscape_ud_path = tmpdir.join("instance_name.user_data")
+ landscape_ud_path.write(landscape_user_data)
+ user_data = wsl.ConfigData(landscape_ud_path)
+ assert expected == wsl.merge_agent_landscape_data(
+ agent_data, user_data
+ )
+
+
class TestWSLDataSource:
@pytest.fixture(autouse=True)
def setup(self, mocker, tmpdir):
@@ -505,14 +579,7 @@ package_update: true"""
ubuntu_pro_tmp = tmpdir.join(".ubuntupro", ".cloud-init")
os.makedirs(ubuntu_pro_tmp, exist_ok=True)
landscape_file = ubuntu_pro_tmp.join("%s.user-data" % INSTANCE_NAME)
- landscape_file.write(
- """#cloud-config
-landscape:
- client:
- account_name: landscapetest
- tags: tag_aiml,tag_dev
-locale: en_GB.UTF-8"""
- )
+ landscape_file.write(LANDSCAPE_SAMPLE)
# Run the datasource
ds = wsl.DataSourceWSL(
cloudinit/sources/DataSourceWSL.py
Outdated
except FileNotFoundError: | ||
LOG.debug("No data found at %s, ignoring.", data_path) | ||
def __init__(self, path: PurePath): | ||
self._raw_data: bytes = util.load_binary_file(path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's just set self.raw here and reference this instance attribute instead of defining a separate raw
method. Less parenthesis to reference the value at call sites
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And I also misspoke earlier, this should not be bytes by a str as you originally declared for userdata_raw. user_data processors can handle decoding content if necessary, but it seems that functionality is now vestigial in most cases.. Long story short , using util.load_text_file should work for reading file paths with user-data.
cloudinit/sources/DataSourceWSL.py
Outdated
if len(self._raw_data) == 0: | ||
self._data = dict() | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels strange to set this _data to an empty dict() just so we can merge that empty dict later when we are sure that both agent_data and landscape_user_data are "config config data parts". Instead let's treat len(agent_data.raw) == 0 as agent_data is None and ignore it altogether without attempting to merge.
cloudinit/sources/DataSourceWSL.py
Outdated
# for the agent data to be skipped in the presence of syntax errors in | ||
# user data. | ||
assert user_data is not None and agent_data is not None | ||
if not agent_data.is_dict() or not user_data.is_dict(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's change this to a simple ConfigData.is_cloud_config() check to better represent what we are looking for here. If not cloud_config, and non-empty, use #include
directive.
cloudinit/sources/DataSourceWSL.py
Outdated
if agent_data is None: | ||
assert user_data is not None | ||
self.userdata_raw = user_data.raw() | ||
return True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With all these additional conditionals it probably warrants a separate function that can be better tested. Let's create merge_agent_landscape_data
to perform these merge operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a minor change in the docstring, otherwise it looks great!
f02ccbf
to
5eda65b
Compare
I've pushed commit 5eda65b for your review. Feel free to alter/adapt or reject those comments or suggestions. I thought it may be easier for you to see the pushed commit rather than have to extract it from github review comments. |
add some focused unittests on merge_agent_landscape_data
5eda65b
to
c2b9a97
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeap, I'm pretty happy with the results. Just have a clarification to propose in the merge_agent_landscape_data()
function docstring.
I'm playing with this patch with different scenarios in WSL. It's much nicer when there are syntax errors in user-data for example. The #include
makes it indeed more resilient. And Jinja just works! Maybe we should consider providing wsl-specific metadata from stuff the datasource computes so it can be used in templates, but that's matter for a future PR, if it shows usefulness :)
and since agent.yaml had a valid ubuntu pro token:
That was correctly applied on the system.
- A scenario where Landscape sends non-cloud-config user-data (a template shell script):
0dfdb55
to
5fed3ee
Compare
5fed3ee
to
c4749b0
Compare
@CarlosNihelton yes please, the more structured content provided in meta-data that is exposed to cloud-init the more we can use jinja to surface values or provide conditionals in user-data based on exposed WSL config info. It definitely would be helpful to include an valuable config settings to cloud-init if they may alter how we want cloud-init to behave based on that extended host config information |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Let's ship it.
Proposed Commit Message
Additional Context
I only figured out this bad behavior when testing with empty Landscape data,
thus the test added for the case of empty Landscape data.
Any non-cloud-config content type would trigger the same erroneous behavior.
jibel showed me an interesting traceback due a syntax error in the user data file,
and that motivated investigating other edge use cases. I found the only way to
properly pass heterogenous pieces of userdata into cloud-init is by modelling
them as an
#include
file.Unfortunately I couldn't figure out a proper way to assert on the rendering of
jinja templates in the unit test file, as it happens outside of the datasource.
The same is valid for the assumption that cloud-init will apply the agent data
in the presence of syntax errors in the user-data, as that decision is also taken
outside of the datasource, as far as I can tell.
This implementation is more robust and allows for other cloud-init features to
just work.
Test Steps
To test that behavior:
Ubuntu-24.04.user-data) in
%USERPROFILE%/.ubuntupro/.cloud-init`cloud-init status --long
should show:"cloud-init.log" will contain a trace similar to:
Closes UDENG-3718
Merge type