Providing Async method with Java 8 CompletableFuture #208
If you're interested, we got pretty good turnaround with a simple async model; the essential structure is outlined here. Response times really dropped. Unfortunately JAXB (XML) processing remained the real processor hog, especially when we timed both the data comms and the XML. HTTP GETs and PUTs will be a lot less intensive; it's a good fit.
It would be great if Spark offered non-blocking APIs. 😍
👍
+1 for non-blocking
👍
+1
+1
+1
+1
👍
+1
Actually, -1. Let Spark be simple. And blocking. Non-blocking introduces a lot of other complexities that would also have to be handled. I say NO.
@ruurd In the spirit of an open discussion, could you expand on this? What complexities did you have in mind?
Lots of threading, for example? How many requests would you want to handle simultaneously as a process? What do you do if you pass that threshold? What do you do if you have passed it and the number of simultaneous requests then drops below the threshold again? How can you simply and meaningfully configure this kind of stuff? Should the configuration be changeable on the fly? And so on. Besides, if Spark cannot process requests fast enough, it is simple enough to put it behind a load balancer.
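For concreteness, the tuning questions above map onto the knobs of a plain JDK `ThreadPoolExecutor`. This is a generic sketch only — the pool sizes, queue depth, and rejection policy here are illustrative values, not anything Spark actually exposes:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedPool {
    public static void main(String[] args) throws Exception {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 4,                                   // "how many simultaneous requests?"
                30, TimeUnit.SECONDS,                   // idle threads above core die after 30s
                new ArrayBlockingQueue<>(8),            // requests past the threshold wait here
                new ThreadPoolExecutor.CallerRunsPolicy()); // past max + queue: caller runs it,
                                                            // which naturally throttles intake
        for (int i = 0; i < 20; i++) {
            pool.execute(() -> { /* simulate handling one request */ });
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("terminated=" + pool.isTerminated());
    }
}
```

Every one of the questions in the comment corresponds to a constructor argument here, which is exactly the configuration surface a blocking server has to expose.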
Non-blocking != Lots of Threading. Although threads are one way of implementing a non-blocking server, it is not the only way. See https://docs.oracle.com/javase/8/docs/api/java/nio/channels/Selector.html and http://tutorials.jenkov.com/java-nio/selectors.html#why-use-a-selector, for instance. The point here is that you can multiplex many requests on a single thread. Obviously this raises different questions of implementation, but lots of threading doesn't have to be one of them. |
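A minimal, self-contained sketch of the `Selector` approach linked above: one thread, one selector, and however many channels you register with it. The class name and the single accept round-trip are illustrative only:

```java
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class SelectorSketch {
    public static void main(String[] args) throws Exception {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", 0)); // ephemeral port
        server.configureBlocking(false);                    // required before register()
        server.register(selector, SelectionKey.OP_ACCEPT);

        // A client connects; the single selector thread gets woken up.
        SocketChannel client = SocketChannel.open(
                new InetSocketAddress("127.0.0.1", server.socket().getLocalPort()));

        selector.select(1000); // one thread blocks here for ALL registered channels
        for (SelectionKey key : selector.selectedKeys()) {
            if (key.isAcceptable()) {
                SocketChannel accepted = server.accept(); // ready, so this won't block
                System.out.println("accepted=" + (accepted != null));
                if (accepted != null) accepted.close();
            }
        }
        client.close();
        server.close();
        selector.close();
    }
}
```

The point of the comment in code form: the number of threads is fixed at one here no matter how many channels are registered with the selector.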
@ruurd: Actually, the blocking version causes more threading, as the only way to scale a blocking API is to feed it more threads, while the non-blocking variant scales quite nicely with a relatively low number of threads.
Nice trick, zeroing in on the threading stuff :-) The main point is that non-blocking IO is going to make Spark bigger and more difficult to configure. I think of Spark as a small, lightweight, easy-to-get-running, short-time-to-market microservice framework. If your problem does not fit, find another tool.
+1
@ruurd stfu! |
@yeshodhan stfu yourself! |
@ruurd Using nio instead of io will not make things hard for you buddy, so just chill. |
@Code-guru 1) I'm not your buddy, 2) tell @yeshodhan to chill, he started this, and 3) if you really want to use something that embraces async, and experience the related difficulties, use Node. Using an asynchronous IO paradigm will make Spark harder to use, harder to maintain, and harder to debug; it will increase the number of failure modes it has to deal with, and it just plain does not fit with what Spark wants to be: easy, small, lightweight.
For the people who want this, just how large are your applications? Making Spark async is not on the roadmap currently, mostly because of the reasons @ruurd just mentioned. We think that ease of use is the main selling point of Spark, so we're very wary of changing the current paradigm into something more complex.
My service was ~2000 lines of code, servicing about 100,000 HTTP requests a day, usually within a 12-hour window. We ended up using Vert.x, since it supported async, and had the words "Lightweight", "Easy", "Fast" and "Simple" on its homepage.
@krrg Thanks. Did you have performance issues with Spark, or was it a 'better safe than sorry' decision? Did you do a comparison test? |
A Scala version of this framework, Scalatra, added non-blocking IO support, using Servlet 3.0+. It's not in the core, but an add-on module. In my opinion, an async version can be less work to configure, as I don't have to pick ahead of time a number of request threads in the servlet container pool (usually just runs on one thread per CPU core). Some other JVM web frameworks supporting async: Finagle, Netty, JAX-RS, Scalatra, Servlets, Spray, etc... |
@tipsy It was more of a "better safe than sorry" approach. Unfortunately I don't have any performance results.
@LeifW not having to pick the number of request threads introduces unexpected behavior in that case. What if you have to use your server for additional tasks? How are those tasks going to deal with a program that just hogs all CPUs because it feels like it? So instead of configuring Spark you will need to configure something else NOT to hog your CPU. I'm a big believer in convention over configuration, but in this case it will most probably bite you in the proverbial behind the moment your service is used outside of a development environment. Having to configure the number of request threads forces you to plan ahead for the case where that number is insufficient.
@ruurd I must admit I have not had any reason to configure it, but from what I understand, the underlying fork-join api that is backing the async servlet stuff, has some knobs for tuning the threading behavior. As a consumer of the async API I really don't have to do anything too different. Basic async servlet examples make the difference clear and very easy:
The Servlet 3.0 API itself doesn't really impose any specific threading strategies on you. Most of the very simple samples on the internet use a simple thread executor to execute a long-running task off the servlet request-processing thread, which makes it very malleable to thread pool configuration and execution strategies. As a Spark API surface area, I imagine that if I register a handler for an endpoint that returns a …
@luolong OK, the scenario I see before me is that you fork off a long-running process, then rip through the handler in no time flat, and spend the rest of the time waiting for the result of the forked process to return the end result. Where did my gains go? And how long is the requestor waiting for a result?
@ruurd indeed, to fully benefit from non-blocking / async you need non-blocking services as well; otherwise there is no real gain, and more complexity. In the scenario you describe, async request / blocking service, you just move the blocked thread from the IO layer to another thread (usually a worker pool). However, your users could use a non-blocking service, like a Cassandra client. That being said, to me the fundamental problem is that servlet technology is blocking by nature, and the non-blocking programming model provided by the servlet spec is not trivial (frameworks should make it easier). Don't get me wrong, I'm not pleading for supporting async in SparkJava, you are the boss; I'm just shedding some light on the benefits / drawbacks of async.
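A JDK-only sketch of that point: `blockingLookup` below is a hypothetical stand-in for a synchronous backend call (say, a blocking DB driver). Wrapping it in `supplyAsync` frees the caller's thread, but a worker-pool thread now sits blocked instead — the blocking moved, it didn't disappear:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncOverBlocking {
    // Hypothetical stand-in for a blocking service call.
    static String blockingLookup(String key) {
        try {
            Thread.sleep(50); // simulates waiting on a synchronous backend
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return "value-for-" + key;
    }

    public static void main(String[] args) throws Exception {
        // The blocked threads now live in this worker pool instead of the IO layer.
        ExecutorService workers = Executors.newFixedThreadPool(4);
        CompletableFuture<String> f =
                CompletableFuture.supplyAsync(() -> blockingLookup("k1"), workers)
                                 .thenApply(String::toUpperCase); // runs once the worker finishes
        System.out.println(f.get()); // somewhere, something still waits for the result
        workers.shutdown();
    }
}
```

With a genuinely non-blocking client (Cassandra driver, async HTTP client) the `supplyAsync` wrapper and its pool disappear, and that is where the real gain comes from.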
Well, @ruurd, you can certainly do as you like with this framework. It seems that you have thoroughly thought about this issue and decided against it. I might not share your views, but I do respect them. The reason I was interested in async support in Spark was that my use case was an intermediate service set up to translate web requests from an internal service API to an external service that had a very high probability of being slow. In addition, the internal API was heavily asynchronous. Having async support in this situation was highly desirable. Anyway, that project is now long done and forgotten -- I ended up simply implementing bare-bones Servlets and using the async support provided by the Servlet spec instead.
Just to clear things up, @ruurd is not a Spark maintainer. |
@tipsy sorry for the misunderstanding; anyway, I just gave my opinionated view, whoever the boss is :-)
+1 for async APIs. But blocking APIs should remain as well; engineers should choose between them.
+1 |
+1 |
PR submitted in this thread if anyone is interested in reviewing (it's a bit of a spike and not merge-ready yet): #549
+1 |
+1 |
+1 |
-1 of course. |
My five cents: the Spark API is currently very simple and blocking. Since Java 8 introduced an async method style via CompletableFuture, we hope Spark can scale to very-large-throughput applications.
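To illustrate the style this comment asks for — purely hypothetical, since Spark has no `CompletableFuture`-based route API — a handler could return a future and let the framework complete the HTTP response whenever it resolves. The `handle` method and its JSON shape are invented for the example:

```java
import java.util.concurrent.CompletableFuture;

public class AsyncStyle {
    // Hypothetical handler: returns a future instead of blocking for the body.
    static CompletableFuture<String> handle(String userId) {
        return CompletableFuture
                .supplyAsync(() -> "user:" + userId)              // e.g. a non-blocking fetch
                .thenApply(body -> "{\"data\":\"" + body + "\"}"); // compose, don't block
    }

    public static void main(String[] args) {
        // In a real framework, this callback would write the response;
        // here we just print it and wait for the demo to finish.
        handle("42").thenAccept(System.out::println).join();
    }
}
```

The request-processing thread returns immediately after `handle()`; only the composition callbacks run when the data is ready, which is the scaling property the comment is after.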