Base URL placeholder defaults in multi-website configuration pollute config cache for one another #36707

Open
@danemacmillan

Description

Preconditions and environment

  • Magento 2.4.3-p3
  • PHP 7.4.33
  • Config cache enabled: magento cache:enable config
  • Magento Configured with two Web Sites, with codes base and base2
    • Each Web Site with one Store each, using codes store1 and store2.
      • Each Store with one Store View each, using codes en1 and en2.
        • Summary: base/store1/en1, base2/store2/en2.
  • Nginx with fastcgi_param MAGE_RUN_TYPE website and each host configured with their appropriate website code: fastcgi_param MAGE_RUN_CODE base and fastcgi_param MAGE_RUN_CODE base2.

Steps to reproduce

With two Web Sites configured, set their base URLs to the default placeholders so that no explicit domain has to be configured. This lets the deployment be served from any domain, as long as that domain points to the correct deployment, as noted here:

{{base_url}} which represents a base URL defined by a virtual host setting or by a virtualization environment like Docker. For example, if you set up a virtual host with the hostname magento.example.com, you can install the software with --base-url={{base_url}} and access the Admin with a URL like http://magento.example.com/admin.

This is also noted in the Admin at Stores > Settings > Configuration > General > Web > Base URLs, for anyone who prefers to set the URL values there at each Web Site scope (not default).

To simplify the process, just use the commands to set the URLs.

Web Site code: base
magento config:set web/unsecure/base_url {{base_url}} --scope=website --scope-code=base
magento config:set web/unsecure/base_link_url {{unsecure_base_url}} --scope=website --scope-code=base
magento config:set web/unsecure/base_static_url "" --scope=website --scope-code=base
magento config:set web/unsecure/base_media_url "" --scope=website --scope-code=base

magento config:set web/secure/base_url {{base_url}} --scope=website --scope-code=base
magento config:set web/secure/base_link_url {{unsecure_base_url}} --scope=website --scope-code=base
magento config:set web/secure/base_static_url "" --scope=website --scope-code=base
magento config:set web/secure/base_media_url "" --scope=website --scope-code=base
Web Site code: base2
magento config:set web/unsecure/base_url {{base_url}} --scope=website --scope-code=base2
magento config:set web/unsecure/base_link_url {{unsecure_base_url}} --scope=website --scope-code=base2
magento config:set web/unsecure/base_static_url "" --scope=website --scope-code=base2
magento config:set web/unsecure/base_media_url "" --scope=website --scope-code=base2

magento config:set web/secure/base_url {{base_url}} --scope=website --scope-code=base2
magento config:set web/secure/base_link_url {{unsecure_base_url}} --scope=website --scope-code=base2
magento config:set web/secure/base_static_url "" --scope=website --scope-code=base2
magento config:set web/secure/base_media_url "" --scope=website --scope-code=base2
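
The sixteen commands above differ only in the scheme (unsecure or secure) and the Web Site code, so they can be generated with a loop. The following is a minimal sketch, not part of the original report: the function name print_base_url_commands is hypothetical, and it only prints the commands (a dry run) rather than running them.

```shell
#!/bin/sh
# Hypothetical helper: prints the config:set commands listed above for each
# Web Site code passed as an argument. Nothing is executed (dry run).
print_base_url_commands() {
    for code in "$@"; do
        for scheme in unsecure secure; do
            printf 'magento config:set web/%s/base_url "{{base_url}}" --scope=website --scope-code=%s\n' "$scheme" "$code"
            printf 'magento config:set web/%s/base_link_url "{{unsecure_base_url}}" --scope=website --scope-code=%s\n' "$scheme" "$code"
            printf 'magento config:set web/%s/base_static_url "" --scope=website --scope-code=%s\n' "$scheme" "$code"
            printf 'magento config:set web/%s/base_media_url "" --scope=website --scope-code=%s\n' "$scheme" "$code"
        done
    done
}

# Print the commands for the two Web Site codes used in this report.
print_base_url_commands base base2
```

Assuming magento is on the PATH as in the commands above, the output could be piped to sh to apply the settings.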

Ensure at the very least the config cache is enabled: magento cache:enable config. Everything else can be off.

With Nginx (or Apache) configured with the relevant MAGE_RUN_TYPE and MAGE_RUN_CODE per domain virtual host, navigate to either of the domains first. For example purposes, base will respond to domain https://base.test/en1/ and base2 will respond to domain https://base2.test/en2/.

The first domain loads fine, with all its static and media assets also loading from the same base URL of https://base.test/, being https://base.test/media/ and https://base.test/static/.

Now navigate to the second domain. It loads, but all of its static and media assets are loaded from the first domain instead of https://base2.test/, which causes numerous CSP and CORS errors. In addition, all of the links on the site point to https://base.test/, so the moment anyone interacts with the second site, they are brought to the first one.

Disabling the config cache eliminates this problem.
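
One way to check a page for this symptom from the command line is to list the hosts referenced by its src and href attributes and report any that differ from the host that was requested. This is a rough sketch, assuming absolute URLs in double-quoted attributes; the function name foreign_asset_hosts is hypothetical and not part of Magento.

```shell
#!/bin/sh
# Hypothetical helper: reads HTML on stdin and prints every distinct host
# found in src="..." or href="..." attributes that is not the expected
# host given as $1. Relative URLs are ignored.
foreign_asset_hosts() {
    grep -oE '(src|href)="https?://[^/"]+' \
        | sed -E 's|.*://||' \
        | sort -u \
        | grep -vxF "$1"
}

# Example against the second site from this report (requires the test
# domains to resolve, so it is left commented out):
# curl -s https://base2.test/en2/ | foreign_asset_hosts base2.test
```

With the config cache enabled and the first site visited first, the commented example would be expected to list base.test; with the cache disabled it should print nothing.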

Worthwhile debugging notes

  • Setting the base_static_url and base_media_url configs with placeholder values {{unsecure_base_url}}static/ and {{unsecure_base_url}}media/ does not remedy the issue. The problem is the same. For simplicity, they have been left empty, as they are optional anyway.

  • Explicitly setting all of the URLs to real values instead of using the placeholders DOES NOT cause this problem. Both Web Sites load fine, respecting their specific URLs and not attempting to load from the other. The static and media URLs load from their explicitly configured URLs and do not pollute each other's config cache values. In fact, I only discovered this problem because I was reviewing how to simplify deployments on different domains that share a common DB, without having to explicitly set domains in the DB. That was when I read that even the Base URL can be set to {{base_url}}, which means the domain that gets used is whatever is configured in the server's virtual host (Nginx or Apache). While it works for the Base URL, the static and media URLs pollute each other, based on whichever site is accessed first.

Expected result

After navigating to https://base.test/en1/, navigating to https://base2.test/en2/ should load all of its static and media assets from its own base URL of https://base2.test/.

Actual result

Instead, navigating to https://base2.test/en2/ loads all of its static and media assets from the Base URL of whichever Web Site was accessed first, in this case https://base.test/.

Additional information

Note that a similar problem was documented in #10693, though there was no mention of whether they also set the static and media URLs. They probably didn't. In all likelihood, the deployments that the user was making were single Web Site configurations, which would not have this problem, as there would be no other Web Site Base URLs polluting the Base URLs of the config cache.

Triage and priority

  • Severity: S0 - Affects critical data or functionality and leaves users without workaround.
  • Severity: S1 - Affects critical data or functionality and forces users to employ a workaround.
  • Severity: S2 - Affects non-critical data or functionality and forces users to employ a workaround.
  • Severity: S3 - Affects non-critical data or functionality and does not force users to employ a workaround.
  • Severity: S4 - Affects aesthetics, professional look and feel, “quality” or “usability”.

Edit 1: Research

This method gets called on all configs within all scopes, regardless of the Store View being loaded:

It then eventually calls a method to process all placeholders, which then replaces all the {{base_url}} and similar URL placeholders with the value detected from the request via this method:

$distroBaseUrl = $this->request->getDistroBaseUrl();

What that means is that every Web Site-scoped config is replaced with the currently discovered HTTP_HOST, which is erroneous. The code does not consider which scope a placeholder belongs to; it simply replaces all of them. The current Web Site scope would need to be known before attempting to replace placeholders, and the replacements should only be done on the configs for the current Web Site scope, not on all of them, as is being done now.

As a result, when the config cache is enabled, the moment the first Web Site within a multi-website configuration is hit, the domain for that site is used for every other Web Site scope config and then saved to the cache. When the next site is loaded from a different domain, all of its Link, Static, and Media URLs have already been generated by the placeholder replacement from the first load of the other site.

In summary, the URL placeholder replacement logic runs too early on the configs. It needs to run only once scopes are known, and only for the currently active scope.
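
The behaviour described above can be illustrated with a toy model. This is only a sketch of the reasoning, not Magento code: STORED, CACHE_FILE, resolve_all_scopes and resolve_current_scope are all hypothetical names, and a temporary file stands in for the config cache.

```shell
#!/bin/sh
# Toy model only: none of these names exist in Magento.

# Stored config, one "scope=value" line per Web Site, placeholders unresolved.
STORED='base={{base_url}}
base2={{base_url}}'

# A temporary file stands in for the config cache; empty means "cold".
CACHE_FILE=$(mktemp)
trap 'rm -f "$CACHE_FILE"' EXIT

# Mimics the reported behaviour: on a cold cache, the placeholder of EVERY
# scope is replaced with the host of the current request, then cached.
# $1 = request host, $2 = Web Site scope being read.
resolve_all_scopes() {
    if [ ! -s "$CACHE_FILE" ]; then
        printf '%s\n' "$STORED" | sed "s|{{base_url}}|https://$1/|" > "$CACHE_FILE"
    fi
    sed -n "s|^$2=||p" "$CACHE_FILE"
}

# Scope-aware alternative: select the current scope first, then replace the
# placeholder for that scope only, using the current request host.
resolve_current_scope() {
    printf '%s\n' "$STORED" | sed -n "s|^$2=||p" | sed "s|{{base_url}}|https://$1/|"
}

resolve_all_scopes base.test base        # prints https://base.test/ and warms the cache
resolve_all_scopes base2.test base2      # prints https://base.test/ (polluted)
resolve_current_scope base2.test base2   # prints https://base2.test/
```

In this model the second request reads a value that was resolved during the first request, which matches the symptom in the report. The scope-aware variant avoids it by never storing a value resolved for a scope other than the one being served.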

Also, given how early in the process URL placeholders are being replaced, code like the following will never run, as the replacements will have already occurred. (Keep in mind that this section would not address the other URL placeholders, even if they managed to reach it.)

if (false !== strpos($url, self::BASE_URL_PLACEHOLDER)) {

Edit 2: Research

Based on git logs, much of this functionality was rewritten in a very large commit (a978a4c) by Sergii Kovalenko, which may be the source of the problem.

Labels

  • Area: Framework
  • Component: Url
  • Issue: Confirmed (Gate 3 Passed. Manual verification of the issue completed. Issue is confirmed.)
  • Priority: P3 (May be fixed according to the position in the backlog.)
  • Progress: ready for dev
  • Reported on 2.4.3-p3 (Indicates original Magento version for the Issue report.)
  • Reproduced on 2.4.x (The issue has been reproduced on latest 2.4-develop branch.)