Multi-process Mode (#215)
* Reorder

* Separate derived results

* Move compute percentiles from latency to result

* Start test server in half the cores by default

* Refactor a bit

* Move multicore code to its own file

* Run single-core if cores=1

* Run loadtest in multicore with --cores

* Do not use deprecated api

* Moved multicore to cluster

* Make cluster work with scheduling policy none

* Aggregate all results from all workers

* Function to combine results

* Reject result when there is an error in the cluster

* New test to combine results

* New test for results

* Combine results in map

* Add test for empty and complex results

* Show how many cores the test run on

* Combine histogram correctly

* Reset all values before combining

* Check elapsed seconds

* Show results from workers

* Share values amongst cores

* Reorder

* Wait for server to start

* Remove traces

* Divide max requests and rps by cores only if present

* Share requests and rps properly among cores

* Share rps and max requests properly between cores

* Show target and effective rps

* Show effective rps last

* Rename

* Store start and end times in ns and ms

* Compute elapsed seconds as derivative of start and end times

* Improve docs for result

* Show cores only if specified

* Document --cores

* v6.3.0

* Clarify --cores in the API
alexfernandez authored Aug 21, 2023
1 parent 93f2d5c commit ee36172
Showing 11 changed files with 389 additions and 137 deletions.
64 changes: 52 additions & 12 deletions README.md
@@ -90,9 +90,13 @@ so that you can abort deployment e.g. if 99% of the requests don't finish in 10
### Usage Don'ts

`loadtest` saturates a single CPU pretty quickly.
Do not use `loadtest` if the Node.js process is above 100% usage in `top`, which happens approx. when your load is above 1000~4000 rps.
Do not use `loadtest` in this mode
if the Node.js process is above 100% usage in `top`, which happens approx. when your load is above 1000~4000 rps.
(You can measure the practical limits of `loadtest` on your specific test machines by running it against a simple
Apache or nginx process and seeing when it reaches 100% CPU.)
[test server](#test-server)
and seeing when it reaches 100% CPU.)
In this case try running in multi-process mode
with the `--cores` parameter; see below.

There are better tools for that use case:

@@ -260,8 +264,9 @@ The following parameters are _not_ compatible with Apache ab.
#### `--rps requestsPerSecond`

Controls the number of requests per second that are sent.
Can be fractional, e.g. `--rps 0.5` sends one request every two seconds.
Not used by default: each request is sent as soon as the previous one is responded.
Cannot be fractional, e.g. `--rps 0.5` is not allowed.
In this mode requests are not sent as soon as the previous one is answered,
but periodically, even if previous requests have not been answered yet.

Note: Concurrency doesn't affect the final number of requests per second,
since rps will be shared by all the clients. E.g.:
@@ -276,6 +281,19 @@ to send all of the rps, adjust it with `-c` if needed.

Note: --rps is not supported for websockets.
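In other words, with `--rps` the send schedule is fixed by the rate alone, independent of response times. A minimal sketch of that periodic dispatch (hypothetical helper names, not `loadtest`'s actual internals):

```javascript
// With a fixed rate, the gap between consecutive requests depends
// only on rps, never on how fast the server answers.
function intervalMs(rps) {
	// time between consecutive requests, in milliseconds
	return 1000 / rps
}

function dispatchTimes(rps, count) {
	// send times (ms from test start) for the first `count` requests
	const interval = intervalMs(rps)
	return Array.from({length: count}, (_, i) => i * interval)
}

console.log(intervalMs(200))        // 5 ms between requests
console.log(dispatchTimes(200, 3))  // [ 0, 5, 10 ]
```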

#### `--cores number`

Start `loadtest` in multi-process mode on a number of cores simultaneously.
Useful when a single CPU is saturated.
Forks the requested number of processes using the
[Node.js cluster module](https://nodejs.org/api/cluster.html).

In this mode the total number of requests and the rps rate are shared among all processes.
The result returned is the aggregation of results from all cores.

Note: this option is not available in the API,
which runs only in the current process.
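The per-worker split follows the `shareOption()` helper this commit adds to `bin/loadtest.js`: each worker gets the rounded share, and the last worker takes the remainder so the shares add up exactly. A standalone copy for illustration:

```javascript
// Split a total (maxRequests or rps) among `cores` workers,
// as in shareOption() from bin/loadtest.js in this commit.
function shareOption(option, workerId, cores) {
	if (!option) return null
	const total = parseInt(option)
	const shared = Math.round(total / cores)
	if (workerId == cores) {
		// last worker gets the remainder
		return total - shared * (cores - 1)
	}
	return shared
}

console.log([1, 2, 3, 4].map(id => shareOption(1000, id, 4)))  // [ 250, 250, 250, 250 ]
console.log([1, 2, 3].map(id => shareOption(10, id, 3)))       // [ 3, 3, 4 ]
```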

#### `--timeout milliseconds`

Timeout for each generated request in milliseconds.
@@ -337,11 +355,11 @@ Sets the certificate for the http client to use. Must be used with `--key`.
Sets the key for the http client to use. Must be used with `--cert`.
### Server
### Test Server
loadtest bundles a test server. To run it:
$ testserver-loadtest [--delay ms] [error 5xx] [percent yy] [port]
$ testserver-loadtest [options] [port]
This command will show the number of requests received per second,
the latency in answering requests and the headers for selected requests.
@@ -354,6 +372,27 @@ The optional delay instructs the server to wait for the given number of milliseconds
before answering each request, to simulate a busy server.
You can also simulate errors on a given percent of requests.
The following optional parameters are available.
#### `--delay ms`
Wait the specified number of milliseconds before answering each request.
#### `--error 5xx`
Return the given error for every request.
#### `--percent yy`
Return an error (default 500) only for the specified % of requests.
#### `--cores number`
Number of cores to use. If not 1, will start in multi-process mode.
Note: since version v6.3.0 the test server uses half the available cores by default;
use `--cores 1` to run it in single-process mode.
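The default core count comes from `getHalfCores()` in the new `lib/cluster.js`: half the available CPUs, rounded, and never less than 1. As a standalone sketch:

```javascript
import {cpus} from 'os'

// Default core count for the test server, following getHalfCores()
// in lib/cluster.js from this commit: half the CPUs, at least 1.
function getHalfCores() {
	return Math.round(cpus().length / 2) || 1
}

console.log(getHalfCores())
```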
### Complete Example
Let us now see how to measure the performance of the test server.
@@ -364,8 +403,9 @@ First we install `loadtest` globally:
Now we start the test server:
$ testserver-loadtest
Listening on port 7357
$ testserver-loadtest --cores 2
Listening on http://localhost:7357/
Listening on http://localhost:7357/
On a different console window we run a load test against it for 20 seconds
with concurrency 10 (only relevant results are shown):
@@ -458,7 +498,7 @@ The result (with the same test server) is impressive:
99% 10 ms
100% 25 ms (longest request)
Now you're talking! The steady rate also goes up to 2 krps:
Now we're talking! The steady rate also goes up to 2 krps:
$ loadtest http://localhost:7357/ -t 20 -c 10 --keepalive --rps 2000
...
@@ -528,7 +568,7 @@ and will not call the callback.
The latency result returned at the end of the load test contains a full set of data, including:
mean latency, number of errors and percentiles.
An example follows:
A simplified example follows:
```javascript
{
@@ -545,8 +585,8 @@ An example follows:
'95': 11,
'99': 15
},
rps: 2824,
totalTimeSeconds: 0.354108,
effectiveRps: 2824,
elapsedSeconds: 0.354108,
meanLatencyMs: 7.72,
maxLatencyMs: 20,
totalErrors: 3,
64 changes: 54 additions & 10 deletions bin/loadtest.js
@@ -3,6 +3,8 @@
import {readFile} from 'fs/promises'
import * as stdio from 'stdio'
import {loadTest} from '../lib/loadtest.js'
import {runTask} from '../lib/cluster.js'
import {Result} from '../lib/result.js'


const options = stdio.getopt({
@@ -32,8 +34,9 @@ const options = stdio.getopt({
key: {args: 1, description: 'The client key to use'},
cert: {args: 1, description: 'The client certificate to use'},
quiet: {description: 'Do not log any messages'},
cores: {args: 1, description: 'Number of cores to use', default: 1},
agent: {description: 'Use a keep-alive http agent (deprecated)'},
debug: {description: 'Show debug messages (deprecated)'}
debug: {description: 'Show debug messages (deprecated)'},
});

async function processAndRun(options) {
@@ -51,21 +54,62 @@ async function processAndRun(options) {
help();
}
options.url = options.args[0];
try {
const result = await loadTest(options)
result.show()
} catch(error) {
console.error(error.message)
help()
options.cores = parseInt(options.cores) || 1
const results = await runTask(options.cores, async workerId => await startTest(options, workerId))
if (!results) {
process.exit(0)
return
}
showResults(results)
}

function showResults(results) {
if (results.length == 1) {
results[0].show()
return
}
const combined = new Result()
for (const result of results) {
combined.combine(result)
}
combined.show()
}

async function startTest(options, workerId) {
if (!workerId) {
// standalone; controlled errors
try {
return await loadTest(options)
} catch(error) {
console.error(error.message)
return help()
}
}
shareWorker(options, workerId)
return await loadTest(options)
}

function shareWorker(options, workerId) {
options.maxRequests = shareOption(options.maxRequests, workerId, options.cores)
options.rps = shareOption(options.rps, workerId, options.cores)
}

function shareOption(option, workerId, cores) {
if (!option) return null
const total = parseInt(option)
const shared = Math.round(total / cores)
if (workerId == cores) {
// last worker gets remainder
return total - shared * (cores - 1)
} else {
return shared
}
}

await processAndRun(options)

/**
* Show online help.
*/
function help() {
options.printHelp();
process.exit(1);
}

67 changes: 39 additions & 28 deletions bin/testserver.js
@@ -3,40 +3,51 @@
import * as stdio from 'stdio'
import {startServer} from '../lib/testserver.js'
import {loadConfig} from '../lib/config.js'
import {getHalfCores, runTask} from '../lib/cluster.js'

const options = readOptions()
start(options)

const options = stdio.getopt({
delay: {key: 'd', args: 1, description: 'Delay the response for the given milliseconds'},
error: {key: 'e', args: 1, description: 'Return an HTTP error code'},
percent: {key: 'p', args: 1, description: 'Return an error (default 500) only for some % of requests'},
});
const configuration = loadConfig()
if (options.args && options.args.length == 1) {
options.port = parseInt(options.args[0], 10);
if (!options.port) {
console.error('Invalid port');
options.printHelp();
process.exit(1);

function readOptions() {
const options = stdio.getopt({
delay: {key: 'd', args: 1, description: 'Delay the response for the given milliseconds'},
error: {key: 'e', args: 1, description: 'Return an HTTP error code'},
percent: {key: 'p', args: 1, description: 'Return an error (default 500) only for some % of requests'},
cores: {key: 'c', args: 1, description: 'Number of cores to use, default is half the total', default: getHalfCores()}
});
const configuration = loadConfig()
if (options.args && options.args.length == 1) {
options.port = parseInt(options.args[0], 10);
if (!options.port) {
console.error('Invalid port');
options.printHelp();
process.exit(1);
}
}
}
if(options.delay) {
if(isNaN(options.delay)) {
console.error('Invalid delay');
options.printHelp();
process.exit(1);
if(options.delay) {
if(isNaN(options.delay)) {
console.error('Invalid delay');
options.printHelp();
process.exit(1);
}
options.delay = parseInt(options.delay, 10);
}
options.delay = parseInt(options.delay, 10);
}

if(!options.delay) {
options.delay = configuration.delay
}
if(!options.error) {
options.error = configuration.error
if(!options.delay) {
options.delay = configuration.delay
}
if(!options.error) {
options.error = configuration.error
}
if(!options.percent) {
options.percent = configuration.percent
}
return options
}
if(!options.percent) {
options.percent = configuration.percent

function start(options) {
runTask(options.cores, async () => await startServer(options))
}

startServer(options);

42 changes: 42 additions & 0 deletions lib/cluster.js
@@ -0,0 +1,42 @@
process.env.NODE_CLUSTER_SCHED_POLICY = 'none'

import {cpus} from 'os'
// dynamic import as workaround: https://github.com/nodejs/node/issues/49240
const cluster = await import('cluster')


export function getHalfCores() {
const totalCores = cpus().length
return Math.round(totalCores / 2) || 1
}

export async function runTask(cores, task) {
if (cores == 1) {
return [await task()]
}
if (cluster.isPrimary) {
return await runWorkers(cores)
} else {
const result = await task(cluster.worker.id)
process.send(result)
}
}

function runWorkers(cores) {
return new Promise((resolve, reject) => {
const results = []
for (let index = 0; index < cores; index++) {
const worker = cluster.fork()
worker.on('message', message => {
results.push(message)
if (results.length === cores) {
return resolve(results)
}
})
worker.on('error', error => {
return reject(error)
})
}
})
}

