Testing file-based functions with line breaks #214
nck-mlcnv added a commit that referenced this issue, Nov 14, 2023:

* SPARQLProtocolWorker is a draft for a better, more reliable worker that is tailored towards SPARQL Protocol. Each worker uses a single HttpClient and handles work completion conditions itself.
* Add workerId and ExecutionStats to SPARQLProtocolWorker. Refactored SPARQLProtocolWorker to record workerId and execution stats for each worker. WorkerId was added to uniquely identify each worker. An ExecutionStats inner class was created to track start time, duration, HTTP status code, content length, number of bindings, and number of solutions for each worker's task.
* Refactor SPARQLProtocolWorker to handle query streams. This commit changes the query building mechanism within SPARQLProtocolWorker.java, shifting from StringBuilder to InputStream, aiming to support processing of large queries, and reduce overhead from using String for queryID. Now it reads queries directly from QueryHandler's data stream, with modifications to a number of HTTP Request methods to accommodate this change. The refactor also includes addition of new method in Query Handler which returns 'QueryHandle' record, a container for index and InputStream for a query.
* Add streaming support for handling large queries. Introduced InputStream support in the QueryList and QuerySource to handle large queries more efficiently. Changes have been made to IndexedQueryReader, QuerySource, QueryHandler, and several other classes to accommodate the new streaming feature. Previously, all queries were loaded into memory which might cause OutOfMemoryError for large queries. It still depends on the SPARQL worker used if queries are streamed to the client.
* Refactored BigByteArrayOutputStream
* Hashing and large response body support for SPARQLProtocolWorker
* remove dangling javadoc comment
* Scaffold ResponseBodyProcessor. This class keeps track of already handled responses to avoid repeated processing. It uses a concurrent hash map to store the responses identified by unique keys. This approach aims to improve the efficiency of handling response bodies in multi-threaded scenarios.
* Use unsynchronized ByteArrayOutputStream for BigByteArrayInput/BigArrayOutputStream and complete rewrite of BigByteArrayInputStream. This should increase the performance of both streams significantly.
* Add Language Processor and SparqlJsonResultCountingParser. Implemented the AbstractLanguageProcessor interface to process InputStreams. A new SAX Parser (SaxSparqlJsonResultCountingParser) was introduced for SPARQL JSON results, returning solutions, bound values, and variables.
* Completed ResponseBodyProcessor and integrated it into SPARQLProtocolWorker
* Worker integration and removal of a lot of code
* small fixes
* changes to the SPARQLProtocolWorker
* delegated executeQuery method
* reuse bbaos if not consumed
* removed assert for non-differing content-length header value and actual content length
* better logging for malformed url
* Add basic logging for Suite class
* remove JUnit 4 and add surefire plugin. The surefire plugin is used for better control over the available system resources for the test, because the BigByteArrayStream tests can take a lot of them.
* update iguana-schema.json
* Update config file validation and change suiteID generation. This also removes some unused redundant code. The suiteID has also been changed to a string type, that consists of an epoch timestamp in seconds and the hashcode of the configuration file.
* Remove CLIProcessManager.java
* Update schema file and re-enable tests. The validation function has also been made public, for better testing.
* Remove test files for IndexQueryReader. See issue #214.
* Add start and end-time for each worker. Adjusted the test as well and integrate it in the StresstestResultProcessor and Storages.
* Remove unused dependencies
* Document possible problem with the SPARQLProtocolWorker and the connected client

Co-authored-by: Alexander Bigerl <alexander@bigerl.eu>
Co-authored-by: Alexander Bigerl <bigerl@mail.upb.de>
Tests for file-based functions that are sensitive to line breaks should not rely on files checked into git, since git may convert line endings on checkout depending on its configuration. Instead, the files should be generated during the test as temporary files that are flagged with `deleteOnExit()`.
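A minimal sketch of that approach, assuming plain JDK APIs only. The class name `LineBreakFixture`, the helper `createTempFile`, and the sample query strings are hypothetical and not taken from the project. The helper writes the content byte-for-byte, so the test decides which line breaks end up on disk:

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Arrays;

// Hypothetical helper class, not part of the project.
class LineBreakFixture {

    /** Writes the content unchanged to a temp file that is deleted when the JVM exits. */
    static File createTempFile(String content) throws IOException {
        File file = File.createTempFile("linebreak-test", ".txt");
        file.deleteOnExit();
        // Write raw bytes so no line-ending translation can happen.
        Files.write(file.toPath(), content.getBytes(StandardCharsets.UTF_8));
        return file;
    }

    public static void main(String[] args) throws IOException {
        // Illustrative sample data: the same two queries with LF and with CRLF.
        String unix = "SELECT * WHERE { ?s ?p ?o }\nASK { ?s ?p ?o }\n";
        String windows = unix.replace("\n", "\r\n");

        for (String content : new String[] {unix, windows}) {
            File file = createTempFile(content);
            byte[] expected = content.getBytes(StandardCharsets.UTF_8);
            byte[] onDisk = Files.readAllBytes(file.toPath());
            // The file on disk must contain exactly the line breaks the test chose.
            if (!Arrays.equals(expected, onDisk)) {
                throw new AssertionError("line breaks were altered in " + file);
            }
        }
        System.out.println("temporary fixtures preserved their line breaks");
    }
}
```

In an actual test, the function under test would read the returned file instead of a fixture from the repository, once per line-break variant.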