Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updated the shade configuration, so it creates jars that can be used for #481

Merged
merged 9 commits into from
Mar 9, 2019
Merged

Updated the shade configuration, so it creates jars that can be used for #481

merged 9 commits into from
Mar 9, 2019

Conversation

hallvard
Copy link
Contributor

@hallvard hallvard commented Mar 7, 2019

OSGi bundles and Eclipse integration.

Thanks for contributing.

Description

Shade configurations were updated/created in each module's pom.xml files.

Testing

I checked that the shaded jars were usable in the Eclipse integration.

excel/pom.xml Outdated Show resolved Hide resolved
@benmccann
Copy link
Collaborator

Listing every transitive dependency seems extremely brittle and annoying to maintain. If the shade plugin can't be more easily configured, perhaps the bundle plugin could be used instead? I think it's more directly meant for building OSGI packages and allows you to include or exclude dependencies by scope

@hallvard
Copy link
Contributor Author

hallvard commented Mar 7, 2019

I completely agree, but this is what I got working in my first attempt. I looked at the shaded jar for jsplot and it included everything, but I may have looked at an old one from before the modules were configured with provided. I'll try a bit more... I know of the bundle plugin, but I think it requires a bit more elaborate configuration, but I'll dig more into it. I really want to robust and easy to maintain setup, that doesn't interfere with the rest of the build setup for the non-OSGi case.

@hallvard
Copy link
Contributor Author

hallvard commented Mar 7, 2019

The core needs a list of dependencies to exclude, since shade doesn't seem to support omitting optional dependencies. It works with provided dependencies, so I can remove the include list for html, json and html. jsplot is different, since it just depends on core the normal way. There's a suggestion here: https://stackoverflow.com/questions/28458058/maven-shade-plugin-exclude-a-dependency-and-all-its-transitive-dependencies, but I haven't tried it, yet.

core/pom.xml Outdated
@@ -69,6 +69,30 @@
<configuration>
<shadedArtifactAttached>true</shadedArtifactAttached>
<shadedClassifierName>shaded</shadedClassifierName>
<artifactSet>
<excludes>
Copy link
Collaborator

@benmccann benmccann Mar 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we make the dependencies on the reader projects provided and optional?

I'd just include a comment explaining why we did it so that it's not accidentally removed later:

<scope>provided</scope> <!-- because the shade plugin doesn't exclude optional dependencies -->

@benmccann
Copy link
Collaborator

Why do you need separate jars for Eclipse? Won't you want to include all the different readers? Why not just make one uber jar that includes everything including core and the readers

@hallvard
Copy link
Contributor Author

hallvard commented Mar 7, 2019

Requiring the user to install all the features (in this case, readers), including the 19Mb xlsx reader, goes against the idea of modular and extensible frameworks, that is key to Eclipse's architecture. In the Eclipse integration of tablesaw the optional readers are provided by optional plug-ins (bundles) that will only be activated if the corresponding file types are used and opened. This is done using Eclipse's extension mechanism. I know classes in general aren't loaded before they're actively used, but jar size is still an issue. Tablesaw could use a similar trick with a registry of readers and a Table.read method that takes the file type as its only argument. This will also make it possible to avoid the "reversed" optional dependency.

@benmccann
Copy link
Collaborator

Wow. I didn't realize poi-ooxml and dependencies are 19Mb. I would have thought it'd be much smaller

I like the idea of using a registry. I think the current API is nicer than having to pass the file extension. But I'm not sure you'd have to change the API to implement a registry.

@hallvard
Copy link
Contributor Author

hallvard commented Mar 7, 2019

The poi stuff is generic for the whole set of office file formats. There are smaller libraries that support just xls, but I'm not sure they support xlsx.

The problem with an API with specific methods for each file format is the dependency from the core to the file format module. There's lots of formats out there, e.g. Smile supports several that could be of interest to tablesaw (without using Smile), and each would need another method in the DataFrameReader and another dependency from core to be "officially" supported. With a file format registry (and some way of discovering the modules providing them) this wouldn't be a problem. Yes, you can implement a registry without changing the API, but that wouldn't give you the advantage of increased modularity.

@benmccann
Copy link
Collaborator

I think we could do both. Have a registry which supports new formats. But have an easy fluent API for the most common readers which we support directly which calls the registry under the hood instead of calling the specific reader instance directly

create proper jars for OSGi integration. Needed to add dependency to
commons-lang3 or else reading csv failed.
core/pom.xml Outdated Show resolved Hide resolved
smile/pom.xml Outdated Show resolved Hide resolved
@hallvard
Copy link
Contributor Author

hallvard commented Mar 7, 2019

Should actually be able to do without the includes section, as for the other optional modules that has a provided dependency on core.

@benmccann
Copy link
Collaborator

That'd be great. I think this would look a lot better if we were able to do without the includes

smile/pom.xml Outdated
@@ -50,25 +50,13 @@
<artifactId>maven-compiler-plugin</artifactId>
</plugin>

<plugin>
<plugin>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added an extra space

<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>3.1.0</version>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move all the common configuration under pluginManagement in the parent pom?

@lwhite1 lwhite1 merged commit f63ec5b into jtablesaw:master Mar 9, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants