Build isolated jar files out of projects by changing the top namespace #212

Closed
tengstrand opened this issue Apr 12, 2022 · 15 comments
Labels
improvement Not a bug but an improvement to overall user experience

Comments

@tengstrand
Collaborator

tengstrand commented Apr 12, 2022

Is your feature request related to a problem? Please describe.
Polylith is not well suited for building libraries as it is today. The problem is that if we, for example, build two libraries (as jar files) out of two projects, then the user of these libraries can get into trouble if the same brick is used in both projects.

One example is when we build version 1 of libraries a and b, then modify a shared component c and build version 2 of both libraries. If we now include, say, version 1 of a and version 2 of b on our classpath, we will get different versions of c depending on which of the libraries a or b comes first on the classpath.
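
The classpath behaviour described above can be sketched with a small simulation. This is only a model, not how the JVM is implemented: each jar is represented as a map, and the jar names, the namespace `com.mycompany.c.core`, and the function `load-ns-from` are all made up for illustration.

```clojure
;; Each jar is modelled as a map from namespace name to the copy of the
;; brick it bundles. All names here are hypothetical.
(def lib-a-v1
  {:jar "a-1.jar"
   :namespaces {"com.mycompany.c.core" "c as of release 1"}})

(def lib-b-v2
  {:jar "b-2.jar"
   :namespaces {"com.mycompany.c.core" "c as of release 2"}})

(defn load-ns-from
  "Returns the first jar on `classpath` (searched in order) that provides
  `ns-str`, mirroring first-match-wins lookup. Returns nil if none does."
  [classpath ns-str]
  (some (fn [{:keys [jar namespaces]}]
          (when-let [contents (get namespaces ns-str)]
            {:loaded-from jar :contents contents}))
        classpath))

;; The same namespace resolves to different code depending on jar order.
(println (load-ns-from [lib-a-v1 lib-b-v2] "com.mycompany.c.core"))
(println (load-ns-from [lib-b-v2 lib-a-v1] "com.mycompany.c.core"))
```

Swapping the order of the two jars is enough to change which copy of c is used, which is the trouble the issue describes.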

Describe the solution you'd like
To guarantee that version 1 of a library only uses bricks from version 1, we need to change the top namespace of all bricks in the project we build from, e.g. from com.mycompany to com.mycompany.mylib (by copying the code and renaming all namespaces and the references to them), before we build the jar.

This functionality is probably best implemented as a custom command when we have support for that.

@tengstrand tengstrand self-assigned this Apr 19, 2022
@tengstrand tengstrand added the improvement Not a bug but an improvement to overall user experience label Apr 19, 2022
@TimoKramer
Contributor

Is it possible to go forward with this in any way without the custom command being finished?

@tengstrand
Collaborator Author

Yes, it is. It could for example be added as an alias in the root deps.edn file.
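
As a rough sketch of what such an alias might look like in the root deps.edn: the alias name `:build-lib`, the `build` namespace, and the pinned tools.build version are assumptions for illustration, not something the thread specifies. Use whatever version the workspace already pins.

```clojure
;; Fragment of a root deps.edn (illustrative only).
{:aliases
 {:build-lib
  {;; tools.build supplies b/copy-dir, b/write-pom, b/jar, etc.
   :deps       {io.github.clojure/tools.build
                {:git/tag "v0.9.6" :git/sha "8e78bcc"}}
   ;; Functions are looked up in build.clj at the workspace root.
   :ns-default build}}}
```

With an alias like this, a hypothetical function `jar` in build.clj could be invoked as `clojure -T:build-lib jar :project mylib`, where both the function name and the `:project` argument are assumptions.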

@TimoKramer
Contributor

Hi, I would start working on it if you don't mind. So you would imagine to have an alias similar to :build but something like :build-mylib? Where this alias is specific to one library that one wants to build and deploy/release a jar from?

I took a quick look at the build function and would implement it quite similarly, amended roughly like this:

(b/delete {:path class-dir})
(b/write-pom opts)
(b/copy-dir {:src-dirs   src+dirs
             :target-dir class-dir})
;; new step: rename the namespaces in place under class-dir
(rename-ns)
(b/jar opts)
(b/delete {:path class-dir})
(println "Jar is built.")

where b/write-pom and b/copy-dir are steps from Sean's build.clj and the rename-ns step renames the namespaces in place in the target-folder.
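
The rename-ns step is not spelled out in the thread. A minimal sketch of its core, assuming a purely textual rewrite, could look like the following. The function names `rename-top-ns` and `ns->relative-path` are hypothetical, and a real implementation would also have to walk the files under the target folder and move them to their new directories.

```clojure
(require '[clojure.string :as str])

(defn rename-top-ns
  "Replaces the top namespace `from` with `to` wherever it appears in
  `source` as the start of a dotted namespace or a qualified symbol.
  This is a naive textual rewrite: it only sees literal occurrences."
  [source from to]
  (let [pattern (re-pattern
                  (str "(?<![\\w.-])"                       ; not mid-symbol
                       (java.util.regex.Pattern/quote from) ; literal match
                       "(?=[./])"))]                        ; followed by . or /
    (str/replace source pattern (str/re-quote-replacement to))))

(defn ns->relative-path
  "Maps a namespace name to its source file path: dots become
  directories and dashes become underscores."
  [ns-str]
  (str (-> ns-str
           (str/replace "." "/")
           (str/replace "-" "_"))
       ".clj"))

(println
  (rename-top-ns
    "(ns com.mycompany.c.core (:require [com.mycompany.util.interface :as util]))"
    "com.mycompany"
    "com.mycompany.mylib"))
(println (ns->relative-path "com.mycompany.mylib.my-brick.core"))
```

Note that this sketch is not idempotent when the new top namespace starts with the old one: running it twice over the same files would add the extra segment twice.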

What do you think?

@tengstrand
Collaborator Author

Hi @TimoKramer!

Sorry for the late answer.
Yes, this looks fine to me. Maybe :build-lib could be a better name for the alias?

Maybe @seancorfield or @furkan3ayraktar have ideas around this?

@seancorfield
Contributor

I think the idea of publishing library JARs built from a Polylith repo that are not published in lock step with each other goes against the whole idea of using a monorepo in the first place, since you're reintroducing the "version hell" problems that separate libraries cause.

I don't think shading components (renaming nses) is a good idea -- getting it right in all cases is non-trivial and you're likely to end up with your users depending on the "same" code through different ns paths, since if you put component c into both libraries a and b, people will treat it as part of each library in terms of being able to use that code (and trying to "hide" it from documentation isn't going to stop people finding it and calling it).

I would lean toward releasing component c as an independent library and making the build process build, tag, and release all the library components together at the same time and keep them all in sync (and then libraries a and b can explicitly depend on the same version of c, i.e., a 1.1 and b 1.1 both depend on c 1.1 and all three libs get a release every time any of them get a releasable update).
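
The lock-step idea can be sketched as a function that stamps one shared version onto every library and onto every internal dependency between them. The function name `lock-step-release`, the group `com.mycompany`, and the shape of the returned data are assumptions made for this illustration, not an existing Polylith or tools.build feature.

```clojure
(defn lock-step-release
  "Takes a Maven group, a single version, and a map of library name ->
  set of internal libraries it depends on. Returns one description per
  library, all at `version`, with internal deps pinned to that version."
  [group version internal-deps]
  (into {}
        (for [[lib deps] internal-deps]
          [(symbol group (name lib))
           {:mvn/version version
            :deps (into {}
                        (for [d deps]
                          [(symbol group (name d))
                           {:mvn/version version}]))}])))

;; a and b both depend on c; all three are released together as 1.1.
(println
  (lock-step-release "com.mycompany" "1.1"
                     {:a #{:c} :b #{:c} :c #{}}))
```

Because every artifact carries the same version, a user who picks a 1.1 and b 1.1 gets exactly one c, namely c 1.1.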

@TimoKramer
Contributor

That's an interesting approach. Just thinking if I got it all right...

Given we have a database released with different APIs like a library, an http- and a cli-API, the http- and cli-API rely on the library. When we changed e.g. the http-API then we would release the library and the http-base as well as the cli-base in lock-step with the library, correct?

Would that mean we treat the library as a base?

@seancorfield
Contributor

Yes, my approach would be to release 1.100.1 of all artifacts, then 1.100.2, etc (or whatever version scheme you prefer). But cut new releases of all of them whenever you cut a new release of any of them. That way, it's clear for users what versions of everything should be used together -- and the transitive dependencies would all match too.

That's not going to prevent dependency conflicts overall, because a third-party library Q could depend on your A 1.100.1, and then a user might depend on both Q and B 1.100.2, and they'll get C 1.100.1 or 1.100.2 depending on whether they're using lein or clojure and how they have dependencies declared, but that's an "existing problem" that people have been dealing with for years and we have "community knowledge" around it. Shading does help avoid some of that. But if you have related libraries and systems -- which seems to be your scenario and would be the most likely in a monorepo situation, I think -- then you're either going to want users to keep their toolchain versions in near lock-step already or you're going to be super careful about backward compatibility across versions (which Clojurians tend to be anyway).
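
One piece of that "community knowledge" for users of the Clojure CLI is to pin the shared libraries at the top level of deps.edn, since top-level versions take precedence over transitive ones there. A hedged sketch of the Q / A / B / C scenario above, in which every coordinate is hypothetical:

```clojure
;; Consumer's deps.edn (all coordinates are made up for illustration).
{:deps {;; third-party library that pulls in com.mycompany/a 1.100.1
        com.example/q   {:mvn/version "2.0.0"}
        com.mycompany/b {:mvn/version "1.100.2"}
        ;; Explicit top-level pins so that a and c match b's release
        ;; instead of depending on how resolution happens to go.
        com.mycompany/a {:mvn/version "1.100.2"}
        com.mycompany/c {:mvn/version "1.100.2"}}}
```

Leiningen resolves conflicts differently, so the equivalent fix there would look different; this fragment only covers the deps.edn case.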

All this is to say that I think shading is a worse solution -- in general -- than keeping things simple and leveraging existing tooling and knowledge around transitive dependencies.

Regarding bases for libraries -- that's probably a good approach since the library is a sort of external API or service (for users) and that's what bases are for.

One thing that Polylith provides that you couldn't leverage in a "lock-step library" model is swappable implementations: if your A and B both depend on C's interface but the actual shipped JARs were built with A + C1 and B + C2, then you're going to have a problem since you will end up with two C (interface) namespaces on the classpath (for someone using both A and B) but only one can get picked so you'll get implementation C1 or C2 depending on which C is first (on the classpath). So that's a situation where shading might be worth the pain in order to use swappable implementations (but at that point, I'd probably just use separate C1 / C2 interfaces as well to avoid that).

The other place where shading can be worth the pain is if you are providing low-level tooling that must operate in process with arbitrary user code but a) that's not just "libraries" and b) you probably will need to shade all of your dependencies, not just those within your monorepo. An example of b) can be seen here https://github.com/xtdb/xtdb/blob/master/modules/jdbc/src/xtdb/jdbc/mysql.clj where XTDB depends on a shaded version of next.jdbc.

Bottom line: I don't think this is a problem that Polylith should be trying to solve.

  • Third-party dependencies are out of scope for Polylith (such as next.jdbc) but that is the area that tends to benefit most from shading.
  • Shading is very hard to get right since it isn't just renaming namespaces/files, but also transforming all internal references to those namespaces (:require is "easy" but there can also be fully-qualified calls in source code or even dynamic references to namespaces via resolve and require at runtime). That's why vendoring a dependency is typically the approach taken: transforming just the dependency itself, only when a new version is required, and consuming that via its vendored names (the XTDB case above and, I think, the approach Mr Anderson uses for dependencies that end up embedded in tooling).
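
The three kinds of reference mentioned in the second bullet can be shown side by side. In this sketch `clojure.string` stands in for a brick namespace so that the code runs as is, and `call-dynamically` is a hypothetical helper written for the illustration.

```clojure
;; 1. Reference through :require and an alias: the namespace name
;;    appears once, in the require form.
(require '[clojure.string :as str])
(println (str/upper-case "via alias"))

;; 2. Fully-qualified call: the namespace name appears inline in the
;;    source and has to be rewritten at every call site.
(println (clojure.string/upper-case "fully qualified"))

;; 3. Dynamic reference: the namespace name is assembled at runtime,
;;    so the literal text never appears in the source for a textual
;;    rename to find.
(defn call-dynamically
  "Resolves `fn-name` in the namespace named by joining `ns-parts`
  with dots, loading the namespace if needed, and applies it to `args`."
  [ns-parts fn-name & args]
  (let [ns-str (clojure.string/join "." ns-parts)]
    (apply (requiring-resolve (symbol ns-str fn-name)) args)))

(println (call-dynamically ["clojure" "string"] "upper-case" "dynamic"))
```

A rename that only searches for the literal namespace text would handle the first two forms and leave the third one pointing at the old name.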

@seancorfield
Contributor

Is there any additional discussion to be had, or can this issue be closed? (no action for over a year)

@TimoKramer
Contributor

I think you can close it. I didn't go forward with it but have a proper idea of what to do. Thanks again!

@tengstrand
Collaborator Author

Yes, let's close this issue.
Issue #318 will add the section "Poly as a library", which I'm working on right now.

@tengstrand
Collaborator Author

I think I have a potential solution to this problem. My idea is that we include a number in the top namespace, e.g. thingorama-0001. The first time we release a set of libraries from the monorepo, it will be for version 1, where all top namespaces will be thingorama-0001, e.g.:

  • thingorama-0001-bar-1.2.3.jar
  • thingorama-0001-foo-1.2.3.jar

The next time we release the libraries, we increase the version to 2 for all top namespaces:

  • thingorama-0002-bar-1.2.4.jar
  • thingorama-0002-foo-1.2.4.jar

The poly tool could include a command that can set and/or increase this number, which will go through and rename all the top namespaces, and is something we do before a release. A benefit of this approach is that different versions of a library can live side by side, without interfering with each other!

So the difference here from my previous suggestion is that we actually update the source code. That will allow us to run our tests to validate the change.
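
The numbering part of this proposal can be sketched as follows. No such poly command exists in the thread; `bump-top-ns` and `artifact-name` are hypothetical helpers, and the sketch covers only the naming, not the renaming of the source files themselves.

```clojure
(defn bump-top-ns
  "Increments the zero-padded number at the end of a top namespace,
  keeping the padding width, e.g. a name ending in -0001 ends in -0002."
  [top-ns]
  (if-let [[_ prefix digits] (re-matches #"(.+-)(\d+)" top-ns)]
    (str prefix
         (format (str "%0" (count digits) "d")
                 (inc (Long/parseLong digits))))
    (throw (ex-info "Top namespace has no numeric suffix"
                    {:top-ns top-ns}))))

(defn artifact-name
  "Builds a jar file name following the pattern used in the examples
  above: <top-ns>-<lib>-<version>.jar."
  [top-ns lib version]
  (str top-ns "-" lib "-" version ".jar"))

(let [next-ns (bump-top-ns "thingorama-0001")]
  (println next-ns)
  (println (artifact-name next-ns "foo" "1.2.4")))
```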

@tengstrand tengstrand reopened this Nov 26, 2023
@seancorfield
Contributor

I'm not sure I follow what you're suggesting here. If you change the name of a namespace that users might rely on, they can't upgrade without changing their code, which means those changes are "breaking".

(I'm also strongly against commands changing source code by default, as you know)

@seancorfield
Contributor

And, just to reiterate: I think this problem is out of scope for Polylith.

@tengstrand
Collaborator Author

Yes, I agree that this is out of scope for Polylith. And yes, changing all namespaces every time you release a new version isn't optimal, and it also requires all users of the libraries to update their source code. It also allows different versions to run in parallel, which can be both good and bad.

I just wanted to discuss this topic a bit more. I think we leave it as it is, without actions.

@tengstrand
Collaborator Author

I updated the Artifacts page, where I explain all this.
