Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduced a new resolver for the mvn:/// scheme. #1962

Merged
merged 21 commits into from
Jun 10, 2024
Merged

Conversation

jurgenvinju
Copy link
Member

@jurgenvinju jurgenvinju commented Jun 6, 2024

If this mvn://<groupId>!<artifactId>!<versionWithoptionalReleaseTag> scheme works properly, it can transparantly replace all the lib://<artifactId/ usages.

  • this is a logical resolve that produces file:/// URIs. This important for efficiency as the mvn:// paths end up in classloader situations where they are short-circuited into highly optimized URLclassloaders.
  • in fact this is a short-hand notation for file:/// URLs that is based only on the folder structure of the maven local repository: <path>/<to>/<groupId>/<artifactId>/<version>/<artifactId>-<version>.jar and nothing more and nothing less.
  • mvn:/// the authority-less root resolves to the root of the local maven repository for exploration purposes.
  • the actual root of the maven repository is:
    • if -Dmaven.repo.local=<whatever> is set, the root is whatever
    • otherwise if ~/.m2 exists, the root is ~/.m2/repository
    • otherwise if <localRepository>something</localRepository> is set in settings.xml in the maven installation folder, then the root is something
    • otherwise the root defaults to ~/.m2/repository
  • The two ! separators are essential; we first tried with . because of style, but while testing on real local repositories (big ones) we found many projects to have . in their artifact names. That led to ambigous parsing of the authority which is not stable.

For Eclipse projects we were already using plugin:// and bundle:// for dependencies transparantly. The lib:/// scheme did not add anything much there anyway.

A new PR will make the necessary adjustments to the PathConfig and RascalManifest configuration algorithms for the interpreter and the compiler. This one focuses only on adding the resolution capability for the mvn scheme. It is quite possible that after these changes the Required-Libraries configuration option is no longer use{d,ful} and we can get rid of it.

Copy link

codecov bot commented Jun 7, 2024

Codecov Report

Attention: Patch coverage is 55.93220% with 26 lines in your changes missing coverage. Please review.

Project coverage is 49%. Comparing base (34fd02f) to head (9f31950).
Report is 4 commits behind head on main.

Files Patch % Lines
...rascalmpl/uri/file/MavenRepositoryURIResolver.java 53% 17 Missing and 5 partials ⚠️
src/org/rascalmpl/uri/jar/JarURIResolver.java 62% 3 Missing ⚠️
...pl/library/lang/java/m3/internal/JarConverter.java 50% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##              main   #1962   +/-   ##
=======================================
  Coverage       49%     49%           
- Complexity    6244    6261   +17     
=======================================
  Files          664     665    +1     
  Lines        59572   59620   +48     
  Branches      8635    8645   +10     
=======================================
+ Hits         29319   29385   +66     
+ Misses       28054   28035   -19     
- Partials      2199    2200    +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jurgenvinju jurgenvinju marked this pull request as ready for review June 7, 2024 11:13
@jurgenvinju
Copy link
Member Author

Needs a few tests but they are hard to write given that the most used feature is really simple, while the funny exceptions mess with the configuration that the current build depends on.

@jurgenvinju jurgenvinju changed the title moved jarify to JarURIResolver and introduced a new resolver for the mvn scheme. to be tested Introduced a new resolver for the mvn:/// scheme. Jun 7, 2024
@jurgenvinju
Copy link
Member Author

@DavyLandman This is now tested on my big .m2 folder, which includes tycho/eclipse projects and many other weirdly named projects. Too bad we could not use . instead of !.

A next PR will introduce List<String> URIResolverRegistry.listKnownAuthorities(str scheme) so that we can auto-complete project names and mvn projects, and eclipse plugins, alike.

Copy link
Member

@DavyLandman DavyLandman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, it's a bit of a pity we end up calling mvn as a fallback.

The only question I have is about the encoding inside of the authority. why not make it path based?

src/org/rascalmpl/uri/file/MavenRepositoryURIResolver.java Outdated Show resolved Hide resolved
* So the authority encodes the identity of the maven project and the path encodes
* what's inside the respective jar file.
*
* Here version is [0-9]+ major . [0-9]+ minor .[0-9]+ path -[A-Za-z\-]* optionalTag
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: path -> patch

* version reference in the authority pattern, by design. Any automated resolution here would
* make the dependency management downstream less transparant and possibly error-prone.
*
* This resolver also does not implement any download or installation procedure, by design.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the one thing I'm missing from this PR, or is that intended in a later PR?

I guess it's because we're refering to a local maven repo with the mvn scheme, not the global public one (which would import a "where is the repo located" problem?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it's important to separate concerns here:

  1. The mvn:/// names are intended as the unambiguous result of dependency resolution.
  2. If dependency resolution requires download or installation; those two have to be implemented as human interactive processes or bundling has to be done explicitly by a responsible developer. There are legal implications here that should not belong with an automated scheme resolver.
  3. Transparency is of the utmost importance, in particular these new names must make it clear when/if a project is resolved with more than one different version.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally I really want this resolver to be quiet on the printing to stderr front. Now that the resolver is transparent, a quick view of the current path config is fully informative while before the lib:/// resolver had to print a log for transparancy purposes. A download feature would also require ugly visual logs every time it happened.

public class MavenRepositoryURIResolver extends AliasedFileResolver {
private final Pattern authorityRegEx
= Pattern.compile("^([a-zA-Z0-9-_.]+?)[!]([a-zA-Z0-9-_.]+)([!][a-zA-Z0-9\\-_.]+)$");
// groupId ! artifactId ! optionAlComplexVersionString
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would regular / not work as separators? I'm asking since in other places we used ! to go into a jar/zip.

So we would have:

mvn://groupId/artifactId/version!fileInsideTheJar.txt

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/ is not allowed in authorities

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use ! here too to go into the jar. See below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is why the ! is a good character to use. We've used it before to avoid overlap with filenames; and that seemed ok. Every other character that is not a / or a ! or a . is a bad separator because people are allowed to use then in filenames.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

. ended up not to work because people don't follow the maven naming rules.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant, groupid as authority, and the rest for regular paths?

Alternative: we could even use authority to distinguish between local and remote maven repo

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Remote maven repo's are something to think about very carefully, in a legal sense. I don't want to blur that distinction anywhere. So for this scheme that is out of the question.
  • I want the authority to identify a jar, a project scope, just like the other schemes we use for this, project, plugin, bundle, etc. That way the path is always consistently rooted at the base of a file name space. I don't want our Rascal code to have to split loc paths based on some arbitrary knowledge about the specific scheme.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last point makes code that runs directly from target folders or project schemes and from an mvn jar, or from an eclipse plugin, or OSGI bundle, etc. etc. much easier to make correct, because loc.path will produce the same postfix in every situation.

src/org/rascalmpl/uri/jar/JarURIResolver.java Show resolved Hide resolved
@jurgenvinju
Copy link
Member Author

Looks good, it's a bit of a pity we end up calling mvn as a fallback.

The only question I have is about the encoding inside of the authority. why not make it path based?

I started with that, but the puzzle pieces wouldn't fit.

  1. This allows for the empty authority to mean something else without additional string interpretation.
  2. Crossing over from authority into path coincides exactly with crossing over from jar file into the root of the contents of the jar file.
  3. This way there is no ambiguity between the jar file and the root inside of it.
  4. The noted exception is when you want to start at the root of a jar. I use a single optional ‘!‘ at the root of the path to signal a "jarify" to the resolver; and auto complete continues inside.
  5. With everything jammed into path this would make it all harder to read and harder to parse.

Still it would be possible in theory to make it all work as path strings as well. With judicious uses of the ! ; comparable to what we did with the authority. I like this better for the smooth transition between the jarified and non-jarified locations; and I ended up here because I started with . separators which wouldn't fly in the path at all due to find very weird projects on three grand central (typically from Google and eclipse) that don't play nice with their own names.

@jurgenvinju
Copy link
Member Author

jurgenvinju commented Jun 9, 2024

Looks good, it's a bit of a pity we end up calling mvn as a fallback.

I don't like that either. The alternative is to parse the XML itself. I decided that when we have built-in maven support in the core runtime, we can revisit this code and just query the configuration API directly.

However, as this code is run via static initialization (via reg.getInstance()) ; there will still be considerable hoops to jump through before we can call from this resolver into the maven implementation. I'm not so sure that will work. At least let's delay the XML parsing here until we know what's the deal.

@jurgenvinju
Copy link
Member Author

Also this is where I fell to one knee; because even if the .m2 folder exists the config.xml would still take precedence. I choose to break this in favor of efficiency. If you really want to config another .m2 folder you should remove the one from your home directory for this scheme to work. Of course this makes the current code much much faster in case of the default setup.

@jurgenvinju
Copy link
Member Author

@DavyLandman thanks for the Sunday review 🤓👍🤩

@jurgenvinju jurgenvinju merged commit 7a28561 into main Jun 10, 2024
1 check passed
@jurgenvinju jurgenvinju deleted the mvn-scheme branch June 10, 2024 10:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants