find_replace transforms on XPath with predicates does not work #3771

@leboff

Describe the bug

When the find_replace transform is given an XPath that includes a predicate, it fails to locate the intended element in namespaced XML, because the tag referenced inside the predicate is not namespace-qualified.

The existing test in test_transforms.py passes only because its test XML is not namespaced. It fails if rewritten as follows (note the added xmlns on bookstore):

def test_xpath_replace_with_exp_and_index_has_xmlns(task_context):
    zip_content = {
        Path(
            "Foo.xml"
        ): '<bookstore xmlns="foobar"> <book category="cooking"> <title lang="en">Everyday Italian</title> <author>Giada De Laurentiis</author> <year>2005</year> <price>30.00</price> </book> <book category="children"> <title lang="en">Harry Potter</title> <author>J K. Rowling</author> <year>2005</year> <price>29.99</price> </book> <book category="web"> <title lang="en">XQuery Kick Start</title> <author>James McGovern</author> <author>Per Bothner</author> <author>Kurt Cagle</author> <author>James Linn</author> <author>Vaidyanathan Nagarajan</author> <year>2003</year> <price>49.99</price> </book> <book category="web"> <title lang="en">Learning XML</title> <author>Erik T. Ray</author> <year>2003</year> <price>39.95</price> </book> </bookstore>',
    }
    patterns = [
        {"xpath": "/bookstore/book[price>40]/author[2]", "replace": "Rich Author"}
    ]
    builder = create_builder(task_context, zip_content, patterns)

    modified_zip_content = {
        Path(
            "Foo.xml"
        ): '<bookstore xmlns="foobar"> <book category="cooking"> <title lang="en">Everyday Italian</title> <author>Giada De Laurentiis</author> <year>2005</year> <price>30.00</price> </book> <book category="children"> <title lang="en">Harry Potter</title> <author>J K. Rowling</author> <year>2005</year> <price>29.99</price> </book> <book category="web"> <title lang="en">XQuery Kick Start</title> <author>James McGovern</author> <author>Rich Author</author> <author>Kurt Cagle</author> <author>James Linn</author> <author>Vaidyanathan Nagarajan</author> <year>2003</year> <price>49.99</price> </book> <book category="web"> <title lang="en">Learning XML</title> <author>Erik T. Ray</author> <year>2003</year> <price>39.95</price> </book> </bookstore>',
    }
    zip_assert(builder, modified_zip_content)

The transformed xpath ends up looking like /*[local-name()="bookstore"]/*[local-name()="book"][price>40]/*[local-name()="author"][2]

Note that price inside the predicate is not wrapped in a local-name() test, so it never matches the namespaced price element.
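The mismatch can be reproduced in isolation. The following is a minimal sketch using only the standard library, not CumulusCI's own code: xml.etree.ElementTree has no local-name() function, so the {*}tag wildcard (Python 3.8+) stands in for it here, and an equality predicate replaces price>40 because ElementTree supports only a subset of XPath. The trimmed XML and the variable names are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Trimmed-down version of the namespaced bookstore document from the test above.
XML = (
    '<bookstore xmlns="foobar">'
    '<book category="web">'
    "<title>XQuery Kick Start</title>"
    "<author>James McGovern</author>"
    "<author>Per Bothner</author>"
    "<price>49.99</price>"
    "</book>"
    "</bookstore>"
)

root = ET.fromstring(XML)

# Path steps are namespace-agnostic, but the predicate names a bare "price",
# mirroring the transformed XPath in the report.
unqualified = root.find("{*}book[price='49.99']/{*}author[2]")

# Same query with the predicate tag made namespace-agnostic as well.
qualified = root.find("{*}book[{*}price='49.99']/{*}author[2]")

print("bare predicate tag:", unqualified)
print("wildcard predicate tag:", None if qualified is None else qualified.text)
```

The first lookup is expected to find nothing, because the bare price in the predicate is compared against elements whose real tag is {foobar}price; the second should resolve to the second author.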

One possible fix: instead of parsing the XPath to add local-name() references, storing the predicates, and re-adding them, remove the xmlns declaration before evaluating the XPath and add it back at the end. This would simplify the process significantly and make it more robust.
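A rough sketch of that idea, using only the standard library rather than CumulusCI's actual transform code. The function name replace_ignoring_default_ns is hypothetical, the regex handles only a single double-quoted default xmlns declaration, and the path uses an equality predicate because ElementTree does not support comparisons such as price>40.

```python
import re
import xml.etree.ElementTree as ET

XML = (
    '<bookstore xmlns="foobar">'
    '<book category="web">'
    "<title>XQuery Kick Start</title>"
    "<author>James McGovern</author>"
    "<author>Per Bothner</author>"
    "<price>49.99</price>"
    "</book>"
    "</bookstore>"
)


def replace_ignoring_default_ns(xml_text, path, replacement):
    """Hypothetical helper: strip the default xmlns, replace text, restore xmlns."""
    # Assumption: at most one default namespace declaration, double-quoted.
    match = re.search(r'\sxmlns="([^"]*)"', xml_text)
    if match:
        xml_text = xml_text[: match.start()] + xml_text[match.end():]

    # Without the declaration, tags are plain names, so the path and its
    # predicates need no namespace handling at all.
    root = ET.fromstring(xml_text)
    for elem in root.findall(path):
        elem.text = replacement

    # Put the declaration back as an ordinary attribute on the root element.
    if match:
        root.set("xmlns", match.group(1))
    return ET.tostring(root, encoding="unicode")


result = replace_ignoring_default_ns(
    XML, "book[price='49.99']/author[2]", "Rich Author"
)
print(result)
```

A real implementation would also need to decide how to treat prefixed namespaces and declarations on nested elements, which this sketch ignores.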

Reproduction steps

  1. Create a deploy task that uses the find_replace transform:
        options:
          transforms:
            - transform: find_replace
              options:
                patterns:
                  - xpath: /DuplicateRule[masterLabel="SomeRule"]/sortOrder
                    replace: 2
  2. Ensure the XML has an xmlns declaration
  3. The XPath does not resolve correctly

Your CumulusCI and Python versions

CumulusCI version: 3.85.0
Python version: 3.11.7

Operating System

macOS 14.4

Windows environment

No response

CumulusCI installation method

None

Error Gist

No response

Additional information

No response
