Work in Progress: How to write reusable components that work both locally and on the platform

Claus Stadler edited this page Sep 22, 2017 · 17 revisions

This page is a work in progress. At present, it should be read as a collection of ideas; no official code exists yet.

The goals of this document are to

  • raise awareness of current issues and point out approaches for addressing them.
  • eventually distill concrete guidelines and best practices for designing reusable benchmarking components. This includes providing pointers to concrete classes that solve common problems, e.g. how to run a benchmark against one of the open source systems on a local machine, or which service wrappers already exist.

A brief introduction to why dependency injection + abstraction is key!

Suppose you have written a task generator that requires a SPARQL endpoint during its task preparation phase. You may be tempted to simply write:

class MyTaskGenerator {
  public void init() {
    createContainer("git.project-hobbit.eu:4567/henning.petzka/facetedgoldvirtuoso/image", envVariables);
  }
  ..
}

If you think that was too easy, here is the next task: adapt the above code so that, for testing purposes, you can use an in-memory triple store. If you now think "I can just create a new class", read on to see how to do this properly. The magic phrase is dependency injection. The following examples are based on the popular Java Spring framework.

// Spring lets us mark our platform components
// as components using the @Component annotation. How convenient!
// Technically, the annotation just tells dependency injection frameworks
// that a class is subject to dependency injection
@Component
class MyTaskGenerator {
  @Autowired
  SparqlBasedService sparqlService; 

  public void init() {
  }
  ..
}

As you can see, the TaskGenerator now only declares that it wants some SparqlBasedService, and that's it. All logic in the TaskGenerator that builds on this service will run as usual, as long as sparqlService references a valid Java object.
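
To make the effect concrete without pulling in Spring, here is a minimal plain-Java sketch of the same idea using constructor injection. The single-method shape of SparqlBasedService, the InMemorySparqlService class and the containsTriple method are hypothetical and invented for illustration only; the real interface may look quite different.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical abstraction: the only thing the task generator knows about.
interface SparqlBasedService {
    boolean containsTriple(String triple);
}

// Hypothetical in-memory stand-in for tests: no Docker container, no network.
class InMemorySparqlService implements SparqlBasedService {
    private final Set<String> triples = new HashSet<>();

    void add(String triple) {
        triples.add(triple);
    }

    @Override
    public boolean containsTriple(String triple) {
        return triples.contains(triple);
    }
}

class MyTaskGenerator {
    private final SparqlBasedService sparqlService;

    // The dependency is handed in from outside instead of being created here.
    MyTaskGenerator(SparqlBasedService sparqlService) {
        this.sparqlService = sparqlService;
    }

    boolean isPrepared() {
        return sparqlService.containsTriple("<s> <p> <o>");
    }
}

class InjectionDemo {
    public static void main(String[] args) {
        InMemorySparqlService store = new InMemorySparqlService();
        store.add("<s> <p> <o>");
        // Swapping the store for a platform-backed implementation needs no change here.
        MyTaskGenerator generator = new MyTaskGenerator(store);
        System.out.println(generator.isPrepared());
    }
}
```

The task generator never names a concrete store, so a test can pass in the in-memory variant while the platform passes in whatever it uses.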

Configuration

But how do we get a concrete sparqlService object into our TaskGenerator? The answer: configuration classes. These are simply classes with methods annotated with @Bean. Beans are Java objects that can be injected into components. It is worth noting that these annotations automate certain aspects of the dependency injection process; where annotations are absent, the frameworks also allow wiring things up programmatically.

@Configuration
class MyLocalTestingEnvironment {
    @Bean
    public SparqlBasedSystemService preparationSparqlService() {
        VirtuosoSystemService result = new VirtuosoSystemService(
                Paths.get("/opt/virtuoso/vos/7.2.4.2/bin/virtuoso-t"),
                Paths.get("/opt/virtuoso/vos/7.2.4.2/databases/hobbit_1112_8891/virtuoso.ini"));

        return result;
    }
}

You can probably already see the pattern: you simply create another configuration class for each additional environment.

@Configuration
class HobbitPlatformEnvironment {
    @Bean
    public SparqlBasedService preparationSparqlService() {
        return new SparqlServiceThatGetsStartedViaSendingAMessageOnTheCommandQueueWhichStartsADocker(...);
    }
}

Obviously, if our TaskGenerator explicitly declared a dependency on SparqlServiceThatGets..., we wouldn't have gained much. So the choice of proper abstraction is key as well.

Services

All Hobbit components act as services, so a generic service abstraction is fundamental. Fortunately, Google's Guava, one of the most essential Java libraries, provides a Service abstraction. Guava also provides the ServiceManager, which offers a convenient way to synchronize the startup and shutdown of multiple services. It can also list services by state, which can be used to inspect which services we are still waiting for.

Combining Guava's Service and ServiceManager with a standard Java 8 Supplier already enables us to implement parts of a BenchmarkController in a purely abstract way. The service factories can take care of incrementing task generator ids.

class MyBenchmarkController {
    @Resource(name="dataGeneratorServiceFactory")
    protected Supplier<Service> dataGeneratorServiceFactory;

    @Resource(name="taskGeneratorServiceFactory")
    protected Supplier<Service> taskGeneratorServiceFactory;

    @Resource(name="systemAdapterServiceFactory")
    protected Supplier<Service> systemAdapterServiceFactory;

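    // Concrete instances and their manager, created in init()
    protected Service dataGeneratorService;
    protected Service taskGeneratorService;
    protected Service systemAdapterService;
    protected ServiceManager serviceManager;

    public void init() {
        // Obtain concrete instances so we can easily inspect them in the debugger
        dataGeneratorService = dataGeneratorServiceFactory.get();
        taskGeneratorService = taskGeneratorServiceFactory.get();
        systemAdapterService = systemAdapterServiceFactory.get();

        serviceManager = new ServiceManager(Arrays.asList(
                dataGeneratorService,
                taskGeneratorService,
                systemAdapterService
        ));

        // Start all services asynchronously, wait until they are healthy,
        // and stop them again if any of them fails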
        ServiceManagerUtils.startAsyncAndAwaitHealthyAndStopOnFailure(
                serviceManager,
                60, TimeUnit.SECONDS,
                60, TimeUnit.SECONDS);
    }
}
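
The remark that service factories can take care of incrementing ids can be sketched with the standard library alone. NamedService and NumberedServiceFactory are hypothetical names; in the setting described above, the factory would return a Guava Service rather than this stand-in.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical stand-in for a service; a real one would implement Guava's Service.
class NamedService {
    final String name;

    NamedService(String name) {
        this.name = name;
    }
}

// Factory that numbers the services it creates, so the controller
// does not need to keep track of ids itself.
class NumberedServiceFactory implements Supplier<NamedService> {
    private final String prefix;
    private final AtomicInteger nextId = new AtomicInteger();

    NumberedServiceFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public NamedService get() {
        return new NamedService(prefix + "-" + nextId.getAndIncrement());
    }
}

class FactoryDemo {
    public static void main(String[] args) {
        Supplier<NamedService> factory = new NumberedServiceFactory("taskGenerator");
        System.out.println(factory.get().name); // prints "taskGenerator-0"
        System.out.println(factory.get().name); // prints "taskGenerator-1"
    }
}
```

Because the controller only sees a Supplier, the id scheme can change without touching the controller code.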

Process synchronization using CompletableFutures (aka Promises)

At present, Hobbit uses Semaphores for synchronization, e.g. when waiting for the ready signals of components. Here is an alternative that is cleaner and easier. As with the ServiceManager, when a timeout is reached we can easily inspect the culprit future(s) that did not resolve.

class MyBenchmarkController {
    public void init() {
        ...
        CompletableFuture<ByteBuffer> dataGenerationFuture = ByteChannelUtils.sendMessageAndAwaitResponse(
                commandChannel,
                ByteBuffer.wrap(new byte[]{Commands.DATA_GENERATOR_START_SIGNAL}),
                Collections.singleton(commandPublisher),
                firstByteEquals(Commands.DATA_GENERATION_FINISHED));

        CompletableFuture<ByteBuffer> taskGenerationFuture = ByteChannelUtils.sendMessageAndAwaitResponse(
                commandChannel,
                ByteBuffer.wrap(new byte[]{Commands.TASK_GENERATOR_START_SIGNAL}),
                Collections.singleton(commandPublisher),
                firstByteEquals(Commands.TASK_GENERATION_FINISHED));

        // Wait for all components to send their ready signals
        CompletableFuture<?> preparationPhaseCompletion = CompletableFuture.allOf(
                dataGenerationFuture,
                taskGenerationFuture,
                systemUnderTestReadyFuture);

        try {
            preparationPhaseCompletion.get(60, TimeUnit.SECONDS);
        } catch(Exception e) {
            throw new RuntimeException("Preparation phase did not complete in time", e);
        }

        ...
    }
...
}
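
How the culprit futures could be inspected after a timeout is sketched below, using only java.util.concurrent. The PendingFutures class and the idea of keeping the futures in a map keyed by a component name are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class PendingFutures {
    // Returns the names of all futures that have not completed yet.
    static List<String> pendingNames(Map<String, CompletableFuture<?>> futures) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, CompletableFuture<?>> entry : futures.entrySet()) {
            if (!entry.getValue().isDone()) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    // Waits for all futures; on timeout, reports which ones are still open.
    static void awaitAll(Map<String, CompletableFuture<?>> futures, long timeout, TimeUnit unit) {
        CompletableFuture<Void> all = CompletableFuture.allOf(
                futures.values().toArray(new CompletableFuture<?>[0]));
        try {
            all.get(timeout, unit);
        } catch (TimeoutException e) {
            throw new IllegalStateException("Still waiting for: " + pendingNames(futures), e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A component failed", e.getCause());
        }
    }

    public static void main(String[] args) {
        Map<String, CompletableFuture<?>> futures = new LinkedHashMap<>();
        futures.put("dataGenerator", CompletableFuture.completedFuture("done"));
        futures.put("taskGenerator", new CompletableFuture<String>()); // never completes
        try {
            awaitAll(futures, 100, TimeUnit.MILLISECONDS);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // prints "Still waiting for: [taskGenerator]"
        }
    }
}
```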

RabbitMQ / AMQP and Java 7 NIO ByteChannels

Up to now, the component implementations that I have seen did not use RabbitMQ directly, but rather relied on the abstractions provided by Hobbit's Abstract*Controller classes. In the simplest case, the communication channels currently used by Hobbit can be emulated by a java.nio.channels.WritableByteChannel implementation to which listeners can subscribe. This means that components do not have to declare dependencies on RabbitMQ classes, and no RabbitMQ server is necessary. Should RabbitMQ be a necessity, a component can still declare a dependency on it. How to design the local and platform configurations is still a matter of discussion.
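
A channel of this kind could look roughly like the following sketch. LocalPubSubChannel and its subscribe method are hypothetical names; only WritableByteChannel and ByteBuffer come from the JDK.

```java
import java.nio.ByteBuffer;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.WritableByteChannel;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical in-process channel: every write is forwarded to all subscribers.
class LocalPubSubChannel implements WritableByteChannel {
    private final List<Consumer<ByteBuffer>> subscribers = new CopyOnWriteArrayList<>();
    private volatile boolean open = true;

    void subscribe(Consumer<ByteBuffer> subscriber) {
        subscribers.add(subscriber);
    }

    @Override
    public int write(ByteBuffer src) throws ClosedChannelException {
        if (!open) {
            throw new ClosedChannelException();
        }
        int length = src.remaining();
        byte[] data = new byte[length];
        src.get(data); // consumes the source buffer, as the channel contract expects
        for (Consumer<ByteBuffer> subscriber : subscribers) {
            // Each subscriber gets its own read-only view, so positions do not interfere.
            subscriber.accept(ByteBuffer.wrap(data).asReadOnlyBuffer());
        }
        return length;
    }

    @Override
    public boolean isOpen() {
        return open;
    }

    @Override
    public void close() {
        open = false;
    }
}

class ChannelDemo {
    public static void main(String[] args) throws Exception {
        LocalPubSubChannel commandChannel = new LocalPubSubChannel();
        commandChannel.subscribe(buf -> System.out.println("first byte: " + buf.get(0)));
        commandChannel.write(ByteBuffer.wrap(new byte[]{42})); // prints "first byte: 42"
    }
}
```

A component written against WritableByteChannel can then be handed this class locally and a RabbitMQ-backed implementation on the platform.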

Bootstrapping components written in the new style in the platform environment

  • Let's assume the task is to wire up a MyNewStyleBenchmarkController class with the existing framework.
  • Essentially, my approach is to create "bootstrapping" subclasses of all Abstract*Component classes, e.g. BootstrappingBenchmarkController.
  • This component is launched with the ComponentStarter as usual.
  • The init() method of the BootstrappingBenchmarkController then sets up a dependency injection environment, i.e. it creates beans that wrap the RabbitMQ channels.
  • Afterwards, the init() method reads the class name of 'MyNewStyleBenchmarkController' from the system environment (maybe there is a better way).
  • Finally, the init() method instantiates the class and performs dependency injection (i.e. based on its environment).
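
The last three steps above can be sketched in plain Java with reflection. This is only an illustration of the idea, not the proposed implementation: the Bootstrapper class, the COMPONENT_CLASS environment variable, the injection-by-field-name rule and the Runnable placeholder for a wrapped channel are all assumptions. A real setup would more likely delegate the injection to a framework such as Spring.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

// Hypothetical new-style component; it only declares what it needs.
class MyNewStyleBenchmarkController {
    Runnable commandChannel; // placeholder for a bean wrapping a channel
    boolean started;

    void init() {
        started = commandChannel != null;
    }
}

class Bootstrapper {
    // Instantiates the named class and assigns every field whose name
    // matches a bean name and whose type accepts that bean.
    static Object bootstrap(String className, Map<String, Object> beans)
            throws ReflectiveOperationException {
        Class<?> clazz = Class.forName(className);
        Constructor<?> constructor = clazz.getDeclaredConstructor();
        constructor.setAccessible(true);
        Object instance = constructor.newInstance();
        for (Field field : clazz.getDeclaredFields()) {
            Object bean = beans.get(field.getName());
            if (bean != null && field.getType().isInstance(bean)) {
                field.setAccessible(true);
                field.set(instance, bean);
            }
        }
        return instance;
    }
}

class BootstrapDemo {
    public static void main(String[] args) throws ReflectiveOperationException {
        // Step 1: set up the injection environment (here: a plain map of beans).
        Map<String, Object> beans = new HashMap<>();
        Runnable channel = () -> System.out.println("message sent");
        beans.put("commandChannel", channel);

        // Step 2: read the class name from the environment (variable name is an assumption).
        String className = System.getenv().getOrDefault(
                "COMPONENT_CLASS", MyNewStyleBenchmarkController.class.getName());

        // Step 3: instantiate and inject.
        Object component = Bootstrapper.bootstrap(className, beans);
        System.out.println("Bootstrapped " + component.getClass().getName());
    }
}
```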

Streamlining of protocols

This section discusses possible protocols to recommend for communication between components. Note: I use a new term, Task Executor, here; if we see the System Adapter as the component that interfaces with the platform, then the Task Executor is part of the System Adapter, alongside the system.

Source Component | Target Component | Purpose | Protocol | Justification
Task Generator | Task Executor | Request task execution | RDF URI Resources | Combines TaskID with arbitrary metadata; 2 main attributes should be task category and workload id in order to enable automatic generation of performance charts
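
To illustrate what a task expressed as an RDF URI resource might look like on the wire, here is a small sketch that serializes a task as N-Triples by hand. The http://example.org/benchmark/ namespace, the property names taskCategory and workloadId, and the TaskDescription class are all made up for this example; no vocabulary has been agreed on. The task id is assumed to be safe to embed in an IRI, and a real implementation would more likely use an RDF library.

```java
class TaskDescription {
    // Hypothetical namespace, for illustration only.
    static final String NS = "http://example.org/benchmark/";

    // Escapes the characters that N-Triples does not allow raw in string literals.
    static String escape(String value) {
        return value
                .replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n")
                .replace("\r", "\\r");
    }

    // The task URI carries the id; the two properties carry the metadata
    // needed for grouping results in performance charts.
    static String toNTriples(String taskId, String category, String workloadId) {
        String subject = "<" + NS + "task/" + taskId + ">";
        return subject + " <" + NS + "taskCategory> \"" + escape(category) + "\" .\n"
                + subject + " <" + NS + "workloadId> \"" + escape(workloadId) + "\" .\n";
    }

    public static void main(String[] args) {
        System.out.print(toNTriples("42", "facetCount", "workload-1"));
    }
}
```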