
Inference Engine async writes #574


Merged
merged 5 commits into from
Oct 3, 2024

Conversation

taenin
Collaborator

@taenin taenin commented Oct 2, 2024

This PR is based on #558

This PR updates how writes are done for inference.

  • If all requests are successful, the final written file will contain all responses in the same order as the provided input, for all engines.
  • During inference, all requests are written to a scratch directory, to a file with the same name as the output file. The order of values written to this file is not guaranteed (this only matters for the InferenceEngines that leverage parallelism).

Non-parallel engines write to disk in-line (blocking). I've benchmarked this for medium-sized files (hundreds of MB): appending a line of text to these files takes on average 1.788e-05 seconds.
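A minimal sketch of that two-phase flow, with hypothetical names (append_result and finalize are not from the PR, and responses are assumed to be single-line):

```python
from pathlib import Path

def append_result(scratch_path: Path, request_id: int, response: str) -> None:
    # During inference, results are appended in completion order,
    # which may differ from input order for parallel engines.
    with open(scratch_path, "a") as f:
        f.write(f"{request_id}\t{response}\n")

def finalize(scratch_path: Path, output_path: Path, num_requests: int) -> None:
    # Once every request has succeeded, rewrite the final file with
    # responses in the same order as the provided input.
    by_id = {}
    with open(scratch_path) as f:
        for line in f:
            request_id, response = line.rstrip("\n").split("\t", 1)
            by_id[int(request_id)] = response
    with open(output_path, "w") as f:
        for i in range(num_requests):
            f.write(by_id[i] + "\n")
```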

@taenin taenin marked this pull request as ready for review October 2, 2024 22:50
original_filepath = Path(output_filepath)
return str(original_filepath.parent / "scratch" / original_filepath.name)

async def _save_conversation_async(
Contributor

Is there a lock on the file writer? If so, at what level is it defined?

Collaborator Author

Since this comment I've removed the _save_conversation_async method, as I found it's actually slower for writes (async file I/O mostly benefits reads).

That said, with my new code we get strong guarantees about no concurrent access from asyncio itself. In asyncio code, a context switch can only happen when you hit an await expression. Since I'm now performing the write synchronously, with no await in between, only one write can ever be in progress at a time.

This is distinct from threads, where context switches are far less predictable.
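A standalone illustration of the asyncio guarantee described above (not code from the PR): two tasks interleave only at await points, so the paired appends below are never split up mid-pair.

```python
import asyncio

results = []

async def worker(name: str) -> None:
    for i in range(3):
        # No await between these two appends, so asyncio cannot
        # switch tasks in the middle: the pair always lands together.
        results.append(f"{name}-start-{i}")
        results.append(f"{name}-end-{i}")
        # A context switch can only happen here, at the await point.
        await asyncio.sleep(0)

async def main() -> None:
    await asyncio.gather(worker("a"), worker("b"))

asyncio.run(main())

# Every "start" entry is immediately followed by its matching "end",
# even though tasks a and b interleave between await points.
assert all(
    results[i + 1] == results[i].replace("start", "end")
    for i in range(0, len(results), 2)
)
```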

@taenin taenin changed the title Taenin/async writes Inference Engine async writes Oct 2, 2024
# Write what we have so far to our scratch directory.
self._save_conversation(
    new_conversation,
    self._get_scratch_filepath(generation_config.output_filepath),
Collaborator

@nikg4 nikg4 Oct 2, 2024

This is saving conversations into the scratch directory, not the originally configured one. Wouldn't that be a problem?

Collaborator Author
This is a fair point. I've updated my approach to save to a scratch directory only for async code; sync paths simply append to the target file. Updated!
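A hedged sketch of that split (write_conversation and is_async_engine are hypothetical names, not from the PR; the scratch-path helper mirrors the one shown earlier in the diff):

```python
from pathlib import Path

def _get_scratch_filepath(output_filepath: str) -> str:
    # Mirror the target filename under a sibling "scratch" directory.
    original = Path(output_filepath)
    return str(original.parent / "scratch" / original.name)

def write_conversation(serialized: str, output_filepath: str, is_async_engine: bool) -> None:
    # Async engines write partial results under scratch/; sync engines
    # append directly to the configured target file.
    target = Path(
        _get_scratch_filepath(output_filepath) if is_async_engine else output_filepath
    )
    target.parent.mkdir(parents=True, exist_ok=True)
    with open(target, "a") as f:
        f.write(serialized + "\n")
```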

@taenin taenin merged commit 1524cfd into main Oct 3, 2024
1 check passed
@taenin taenin deleted the taenin/async_writes branch October 3, 2024 16:14
@taenin taenin mentioned this pull request Oct 3, 2024
3 participants