Testing file-based functions with line breaks #214

Closed
bigerl opened this issue Jun 28, 2023 · 0 comments · Fixed by #221


bigerl commented Jun 28, 2023

Tests for file-based functions that are sensitive to line breaks should not rely on files checked into git. Instead, the files should be generated during the test as temporary files that are flagged with deleteOnExit().
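A minimal sketch of what such a test helper could look like, assuming plain `java.nio.file` APIs. The class name `LineBreakTestFiles` and its `create` method are hypothetical and not taken from the project. The point is that the test itself writes the exact line separator, so the result cannot be altered by git's line-ending conversion on checkout.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; not part of the project's code base.
final class LineBreakTestFiles {
    private LineBreakTestFiles() {}

    /**
     * Writes the given lines to a new temporary file. Every line, including
     * the last one, is terminated by exactly the given line separator.
     */
    static Path create(String lineSeparator, String... lines) {
        try {
            Path file = Files.createTempFile("line-break-test", ".txt");
            // Ask the JVM to delete the file when it terminates normally.
            file.toFile().deleteOnExit();

            StringBuilder content = new StringBuilder();
            for (String line : lines) {
                content.append(line).append(lineSeparator);
            }
            Files.write(file, content.toString().getBytes(StandardCharsets.UTF_8));
            return file;
        } catch (IOException e) {
            // Rethrown unchecked so that callers in tests stay short.
            throw new UncheckedIOException(e);
        }
    }
}
```

A test could then call `LineBreakTestFiles.create("\r\n", "SELECT * WHERE { ?s ?p ?o }", "ASK { ?s ?p ?o }")` once per line-ending variant and pass the returned path to the function under test.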

@bigerl bigerl mentioned this issue Jun 28, 2023
nck-mlcnv added a commit that referenced this issue Nov 4, 2023
@nck-mlcnv nck-mlcnv linked a pull request Nov 4, 2023 that will close this issue
10 tasks
nck-mlcnv added a commit that referenced this issue Nov 14, 2023
* SPARQLProtocolWorker is a draft for a better, more reliable worker that is tailored towards SPARQL Protocol. Each worker uses a single HttpClient and handles work completion conditions itself.

* Add workerId and ExecutionStats to SPARQLProtocolWorker

Refactored SPARQLProtocolWorker to record workerId and execution stats for each worker. WorkerId was added to uniquely identify each worker. An ExecutionStats inner class was created to track start time, duration, HTTP status code, content length, number of bindings, and number of solutions for each worker's task.

* Refactor SPARQLProtocolWorker to handle query streams.

This commit changes the query building mechanism within SPARQLProtocolWorker.java from StringBuilder to InputStream, to support processing of large queries and to reduce the overhead of using a String for the queryID. Queries are now read directly from the QueryHandler's data stream, and a number of HTTP request methods were modified to accommodate this change. The refactor also adds a new method to QueryHandler that returns a 'QueryHandle' record, a container for the index and the InputStream of a query.

* Add streaming support for handling large queries

Introduced InputStream support in the QueryList and QuerySource to handle large queries more efficiently. Changes have been made to IndexedQueryReader, QuerySource, QueryHandler, and several other classes to accommodate the new streaming feature. Previously, all queries were loaded into memory, which could cause an OutOfMemoryError for large queries. Whether queries are actually streamed to the client still depends on the SPARQL worker used.

* Refactored BigByteArrayOutputStream

* Hashing and large response body support for SPARQLProtocolWorker

* remove dangling javadoc comment

* Scaffold ResponseBodyProcessor. This class keeps track of already handled responses to avoid repeated processing. It uses a concurrent hash map to store the responses identified by unique keys. This approach aims to improve the efficiency of handling response bodies in multi-threaded scenarios.
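
The idea of tracking already handled responses in a concurrent hash map can be sketched as follows. This is an illustration under assumptions, not the actual ResponseBodyProcessor: the class name `ResponseDedupSketch`, the `long` key, and the use of the body length as the "processing result" are all hypothetical stand-ins.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; names and types are assumptions, not the project's API.
final class ResponseDedupSketch {
    // Maps a unique response key (for example a hash) to its processing result.
    private final ConcurrentHashMap<Long, Integer> resultsByKey = new ConcurrentHashMap<>();
    // Counts how often the processing step actually ran.
    private final AtomicInteger processed = new AtomicInteger();

    /**
     * Processes the body only if no result is stored for the key yet.
     * computeIfAbsent runs the mapping function at most once per key,
     * even when several threads submit the same key concurrently.
     */
    int process(long key, byte[] body) {
        return resultsByKey.computeIfAbsent(key, k -> {
            processed.incrementAndGet();
            return body.length; // stand-in for the real response processing
        });
    }

    int processedCount() {
        return processed.get();
    }
}
```

Using `computeIfAbsent` rather than a separate `containsKey` check followed by `put` avoids the race in which two threads both see a missing key and process the same response twice.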

* Use an unsynchronized ByteArrayOutputStream for BigByteArrayInputStream/BigByteArrayOutputStream and completely rewrite BigByteArrayInputStream. This should increase the performance of both streams significantly.

* Add Language Processor and SparqlJsonResultCountingParser

Implemented the AbstractLanguageProcessor interface to process InputStreams. A new SAX Parser (SaxSparqlJsonResultCountingParser) was introduced for SPARQL JSON results, returning solutions, bound values, and variables.

* Completed ResponseBodyProcessor and integrated it into SPARQLProtocolWorker

* Worker integration and removal of a lot of code

* small fixes

* changes to the SPARQLProtocolWorker

* delegated executeQuery method
* reuse bbaos if not consumed
* removed assert for non-differing content-length header value and actual content length
* better logging for malformed url

* Add basic logging for Suite class

* remove JUnit 4 and add surefire plugin

The surefire plugin is used for better control over the available system resources for the test, because the BigByteArrayStream tests can take a lot of them.

* update iguana-schema.json

* Update config file validation and change suiteID generation

This also removes some unused, redundant code. The suiteID is now a string that consists of an epoch timestamp in seconds and the hash code of the configuration file.
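
A rough sketch of how such an ID could be assembled. The commit message does not state the exact format, so the `-` separator, hashing the file content with `String.hashCode()`, and the unsigned rendering of the hash are assumptions, as is the class name `SuiteIdSketch`.

```java
import java.time.Instant;

// Hypothetical sketch; the real suiteID format may differ.
final class SuiteIdSketch {
    private SuiteIdSketch() {}

    /** Combines an epoch timestamp in seconds with a hash code of the configuration content. */
    static String suiteId(long epochSeconds, String configContent) {
        // Rendering the hash as unsigned avoids a second '-' for negative hash codes.
        return epochSeconds + "-" + Integer.toUnsignedString(configContent.hashCode());
    }

    /** Convenience variant that uses the current time. */
    static String suiteIdNow(String configContent) {
        return suiteId(Instant.now().getEpochSecond(), configContent);
    }
}
```

Keeping the timestamp as a parameter in `suiteId` makes the function deterministic and therefore easy to test. Only `suiteIdNow` depends on the clock.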

* Remove CLIProcessManager.java

* Update schema file and re-enable tests

The validation function has also been made public, for better testing.

* Remove test files for IndexedQueryReader

See issue #214.

* Add start and end-time for each worker.

Adjusted the test as well and integrated it into the StresstestResultProcessor and Storages.

* Remove unused dependencies

* Document possible problem with the SPARQLProtocolWorker and the connected client

---------

Co-authored-by: Alexander Bigerl <alexander@bigerl.eu>
Co-authored-by: Alexander Bigerl <bigerl@mail.upb.de>