Investigate how we can load external assemblies when running Lucene.Net.Benchmarks on the command line #305

Open
NightOwl888 opened this issue Jul 7, 2020 · 6 comments
Labels: benchmarks, investigation, is:enhancement, pri:normal, up-for-grabs

@NightOwl888
Contributor

NightOwl888 commented Jul 7, 2020

The benchmarks project was designed to be able to load user-defined projects to run. In Java, this could be done with a single string identifying the types to load; .NET, however, requires a reference to the actual assembly in order to read the types from it.

We currently have it set up to read all types from all referenced assemblies, but this causes the Lucene.Net.Tests.Benchmark.ByTask.Tasks.Alt::TestWithoutAlt() test to fail, because in Java the types were supposed to be loaded on demand. So, we need to investigate the best way to load types from external assemblies in .NET so that the lucene-cli tool can run benchmarks against them.

The part that has been altered to allow assemblies to be "automatically" discovered is:

// Loads all assemblies in current referenced project (this was not in the original Lucene source)
IEnumerable<string> referencedAssemblies = AssemblyUtils.GetReferencedAssemblies().Select(a => a.GetName().Name);
result.Add(dfltPkg);

if (alts == null)
{
	result.UnionWith(referencedAssemblies);
	return result.ToArray();
}

foreach (string alt in alts.Split(',').TrimEnd())
{
	result.Add(alt);
}
result.UnionWith(referencedAssemblies);

Equivalent in Lucene 4.8.1

The problem is that when running as a separate process (lucene-cli), the end user has no way to reference assemblies, and therefore cannot change what is loaded by the tool.

I am no expert on Java, but from what I gather there is a convention-based and extensible "class path" that can be interacted with by end users regardless of whether it is inside or outside of the .jar package. I think the way it works is that by simply dropping an external .class file (similar to a .NET Type) in the same directory as an internal class, the JVM will load it, but it is also possible to inject a custom "class loader" to load from alternate locations or to add additional class paths on the command line.

It is also possible in Java to either reference a .jar file like a DLL or execute it like an EXE. For example, if a main() method exists in any class, it can be executed directly on the .jar file like:

java -cp lucene-core.jar org.apache.lucene.index.IndexUpgrader [-delete-prior-commits] [-verbose] indexDir

or 

java -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex pathToIndex [-fix] [-verbose] [-segment X] [-segment Y]

Since DLLs cannot be executed directly in .NET in the same way, we have a gap between the two platforms. The lucene-cli tool was created as a wrapper process to execute any one of the main() methods that were included in Lucene to fill this gap. This wrapper executable has revealed yet another gap, since the end user should be able to supply their own classes to benchmark and currently there is no way to do so.

So, the essence of this task is to do the following:

  • Create a similar convention-based and/or command-line based way for end users to be able to run any of the benchmark commands against their own assemblies/code
  • Prefer to utilize the .NET platform's nearest counterpart and convention rather than invent a custom one, where possible
  • Utilize the Lucene.Net.Benchmarks string-based configuration for the user to be able to specify which type to load

IMO, we don't necessarily need as many options as were available in Java; we simply need to provide an option, which we currently lack.
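To make the third point concrete, here is a minimal sketch of resolving a configuration string of the form "Namespace.TypeName, AssemblyName" against a folder of user-supplied DLLs. It is not taken from the Lucene.Net source: `ExternalTypeLoader`, `Parse`, `Load` and the `pluginDir` parameter are hypothetical names, and the parsing deliberately ignores generic type names and assumes the assembly part is a simple name when probing the folder.

```csharp
using System;
using System.IO;
using System.Reflection;

// Hypothetical helper (not part of Lucene.Net): resolves a type from a
// "Namespace.TypeName, AssemblyName" string, probing a plugin folder first.
public static class ExternalTypeLoader
{
    // Splits the string at the first comma. Generic type names, which contain
    // commas inside brackets, are not handled by this simplified parser.
    public static (string TypeName, string AssemblyName) Parse(string qualifiedName)
    {
        int comma = qualifiedName.IndexOf(',');
        if (comma < 0)
            return (qualifiedName.Trim(), null);
        return (qualifiedName.Substring(0, comma).Trim(),
                qualifiedName.Substring(comma + 1).Trim());
    }

    // Returns the type, or null if the type cannot be found in the assembly.
    public static Type Load(string qualifiedName, string pluginDir)
    {
        var (typeName, assemblyName) = Parse(qualifiedName);

        // No assembly given: fall back to the runtime's normal lookup.
        if (assemblyName == null)
            return Type.GetType(typeName, false);

        // Assumption: the user drops "<AssemblyName>.dll" into pluginDir.
        string path = Path.Combine(Path.GetFullPath(pluginDir), assemblyName + ".dll");
        Assembly assembly = File.Exists(path)
            ? Assembly.LoadFrom(path)
            : Assembly.Load(new AssemblyName(assemblyName));

        return assembly.GetType(typeName, false);
    }
}
```

With something like this, a benchmark configuration could name "My.Tasks.CustomTask, MyBenchmarks" (a made-up example) and the tool would look for MyBenchmarks.dll in the chosen folder before falling back to the normal probing paths.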

NightOwl888 added the up-for-grabs, investigation, lucene-cli, is:enhancement, and pri:normal labels Jul 7, 2020
NightOwl888 added this to the 4.8.0 milestone Jul 7, 2020
@jeme
Contributor

jeme commented Oct 6, 2020

Loading types from non-referenced assemblies is fairly simple in .NET. If they reside in a different location (folder-wise), one has to implement some assembly resolution handling, but I have done that many times in the past.

It only really gets complicated if we begin to talk about having the loaded assemblies isolated. This is often done to allow for loading and then unloading them again. However, since this is a command-line tool, that sounds irrelevant.

But I think the issue lacks context; this could be some examples or references to documentation of how the Java version works. Also, how do we envision this should work?

@Shazwazza
Contributor

For netcore this is really easy, and you can unload them. In .NET Framework this is more difficult: you cannot unload them unless you create and destroy custom AppDomains at runtime, which is possible, but you cannot flow data between the domains unless you use string serialization or remoting (all ugly). In netcore you just use AssemblyLoadContext; there are samples here: https://github.com/dotnet/samples/tree/master/core/tutorials/Unloading (https://github.com/dotnet/samples/blob/master/core/tutorials/Unloading/Host/Program.cs)
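For illustration, a minimal sketch of what a collectible AssemblyLoadContext could look like on .NET Core 3.x. It is an assumption-laden example rather than the linked sample's code: `DirectoryLoadContext` and its directory-probing rule are hypothetical, and a real plugin host might prefer `AssemblyDependencyResolver` to honor the plugin's .deps.json.

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Runtime.Loader;

// Hypothetical context that resolves dependencies from a single folder.
// Marked collectible so that Unload() is possible (.NET Core 3.0 or later).
public sealed class DirectoryLoadContext : AssemblyLoadContext
{
    private readonly string _directory;

    public DirectoryLoadContext(string directory) : base(isCollectible: true)
    {
        // LoadFromAssemblyPath requires an absolute path.
        _directory = Path.GetFullPath(directory);
    }

    protected override Assembly Load(AssemblyName assemblyName)
    {
        string candidate = Path.Combine(_directory, assemblyName.Name + ".dll");

        // Returning null lets the runtime fall back to the default context,
        // which is how framework assemblies get resolved.
        return File.Exists(candidate) ? LoadFromAssemblyPath(candidate) : null;
    }
}
```

A caller would create the context for the user's folder, call `LoadFromAssemblyPath` on the DLL of interest, and look up the type with `Assembly.GetType`. One thing to watch for: if the folder also contains a copy of an assembly the host already loaded, this sketch loads a second copy into the custom context, and types from the two copies are not interchangeable.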

But the way benchmarkdotnet works underneath is that, for execution, it dynamically creates a netcore project, compiles it with the references you tell it to, and then runs the benchmarks against the compiled .exe output. So by using benchmarkdotnet you are sort of already loading external assemblies. It's been a while since I looked, but you can control how benchmarkdotnet builds its program.

@NightOwl888
Contributor Author

@jeme

Thanks, you were correct in that the original issue was lacking some context, and I have added more information to better explain the task.

The Lucene.Net.Benchmark project was designed to work either as a library that users can extend or as an executable that can just be run. The issue crops up in the latter case where we need some sort of a "plug in" architecture so the end user can supply their own assembly to run it against. Java has a native feature to do this, but .NET does not.

@Shazwazza

Although I think that we should look into leveraging BenchmarkDotNet for Lucene.Net.Benchmark at some point, the current incarnation is just a line-by-line port from Java. The Lucene.Net.Benchmark project uses a DSL to control the configuration of a benchmark, including strings that are meant for loading external types.

Since the commands are essentially run-once I don't believe there will be any issues with "unloading" to worry about.

@Shazwazza
Contributor

Oh yes, my bad. I was confusing this with the benchmarkdotnet project(s) that we have; this one is different.

> Since the commands are essentially run-once I don't believe there will be any issues with "unloading" to worry about.

If it's for netcore 3 then AssemblyLoadContext is still the way to do it, whether you unload or not. This is a nice post about it: https://codetherapist.com/blog/netcore3-plugin-system/

If it's not netcore 3 then you can use Assembly.Load(name) if you want it loaded correctly (with fusion), but then the assembly needs to be in your probing paths (i.e. /bin). Otherwise you can load with Assembly.LoadFrom(filename) or Assembly.Load(bytes), but if you do that, the assembly will not be loaded in the same context. This is all different depending on the platform you are running on.

In .NET Framework this is all super ugly and you need to know about the 3 load contexts (Default, Load-From, and No Context); see https://docs.microsoft.com/en-us/dotnet/framework/deployment/best-practices-for-assembly-loading. Basically, dealing with anything but the Default context is a pain and you will almost always need an AppDomain.AssemblyResolve event handler, but you might get success with LoadFrom.

netcore has fixed all this nonsense :) so depends on what it needs to run on
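As a rough illustration of the AppDomain.AssemblyResolve approach, the sketch below registers a handler that probes one extra folder when the runtime fails to find an assembly. `ProbingResolver`, `FindCandidate`, `Register` and `probeDir` are hypothetical names, and the lookup assumes the file is named after the assembly's simple name with a .dll extension.

```csharp
using System;
using System.IO;
using System.Reflection;

// Hypothetical helper: extends assembly probing to one extra folder.
public static class ProbingResolver
{
    // Maps a requested display name such as
    // "Foo, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null"
    // to "<probeDir>/Foo.dll", or returns null if no such file exists.
    public static string FindCandidate(string requestedName, string probeDir)
    {
        string simpleName = new AssemblyName(requestedName).Name;
        string candidate = Path.Combine(Path.GetFullPath(probeDir), simpleName + ".dll");
        return File.Exists(candidate) ? candidate : null;
    }

    // The handler only runs after the normal resolution has failed.
    // Returning null tells the runtime that this handler could not help.
    public static void Register(string probeDir)
    {
        AppDomain.CurrentDomain.AssemblyResolve += (sender, args) =>
        {
            string path = FindCandidate(args.Name, probeDir);
            return path != null ? Assembly.LoadFrom(path) : null;
        };
    }
}
```

Calling `ProbingResolver.Register(folder)` once at startup, before any type from the external assembly is requested, would be enough for dependencies of a user's DLL to be found in the same folder.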

@NightOwl888
Contributor Author

Actually, come to think of it, that brings up another potential gap that wasn't previously considered. The lucene-cli tool is targeted at .NET Core 3.1 only. This may be an issue if the end user needs to load .NET Framework assemblies into its context in order to benchmark the types within them.

Potential solutions/workarounds:

  • Don't support .NET Framework in the CLI; require .NET Framework users either to compile their DLL as .NET Standard in order to benchmark it in .NET Core 3.1, or to use the DLL and build their own wrapper CLI for .NET Framework
  • Create a separate version of the tool for .NET Framework (possibly even move the benchmark commands to a separate tool)

I know that in early versions of .NET Core, it was possible to load .NET Framework assemblies under certain conditions/limitations, which could also potentially be explored.

@Shazwazza
Contributor

> Don't support .NET Framework in the CLI,

That would be my vote. I just don't see it being worth spending a whole lot of time on .NET Framework compatibility. If the main project supports it then that's enough IMO.

> I know that in early versions of .NET Core, it was possible to load .NET Framework assemblies under certain conditions/limitations, which could also potentially be explored.

Yep, we were exploiting that in our own builds and it sort of still works in netcore 3; however, official support for it has been entirely dropped in netcore 3. If you drop a DLL into /bin it will 'work', but I think it really depends on what's in the DLL.
