Performance Improvement ideas / feature requests #49

@UsernamesLame

As promised, here's the thread I'm making for this.

RE: pre-processing:

In pywhispercpp/model.py we have transcribe, which can take a numpy ndarray. What I was thinking is: rather than loading the audio, crushing it to mono, and resampling it to 16 kHz at transcription time, why not pre-process all of that ahead of time and generate binary blob files that contain just the numpy ndarray and can be fed straight in?

It's not a big performance increase, but anything we can do outside of Python land ahead of time will give us a win. And I'm ok chasing micro-optimizations in Python land. I'm useless in C++ land.
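A rough sketch of what that caching step could look like, assuming the audio is already decoded into a numpy array. The names `preprocess`, `cache_audio` and `load_cached` are made up for illustration, and the linear-interpolation resampling is only a stand-in for a proper resampler:

```python
import numpy as np

TARGET_RATE = 16_000  # whisper.cpp works on 16 kHz mono float32 samples


def preprocess(samples, rate):
    """Downmix to mono and resample to 16 kHz (crude linear interpolation)."""
    samples = np.asarray(samples, dtype=np.float32)
    if samples.ndim == 2:  # assumed layout: (n_samples, n_channels)
        samples = samples.mean(axis=1)
    if rate != TARGET_RATE:
        n_out = int(round(len(samples) * TARGET_RATE / rate))
        old_t = np.arange(len(samples)) / rate
        new_t = np.arange(n_out) / TARGET_RATE
        samples = np.interp(new_t, old_t, samples)
    return samples.astype(np.float32)


def cache_audio(samples, rate, path):
    """Write the pre-processed array to a .npy blob (hypothetical helper)."""
    np.save(path, preprocess(samples, rate))


def load_cached(path):
    """Read the blob back as an ndarray, ready to hand to transcribe."""
    return np.load(path)
```

The array returned by `load_cached` would then be what gets passed to transcribe, so the downmix and resample cost is paid once per file instead of on every run.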

Also, let's put all logging behind a flag to disable it. If possible, let's also add a flag to disable whisper.cpp's incessant info logging to stderr. I know it has no impact on the transcription itself, but it should be controllable.
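Something along these lines is what I have in mind, as a sketch only. The logger name "pywhispercpp" is an assumption on my part, and both helper names are hypothetical. Since the native output goes straight to file descriptor 2, the second helper redirects that descriptor rather than `sys.stderr`:

```python
import contextlib
import logging
import os


def set_python_logging(enabled, name="pywhispercpp"):
    """Toggle a Python logger on or off (logger name is an assumption)."""
    logger = logging.getLogger(name)
    logger.disabled = not enabled
    return logger


@contextlib.contextmanager
def suppress_native_stderr(enabled=True):
    """Temporarily point file descriptor 2 at the null device."""
    if not enabled:
        yield
        return
    saved_fd = os.dup(2)  # keep a handle on the real stderr
    devnull_fd = os.open(os.devnull, os.O_WRONLY)
    try:
        os.dup2(devnull_fd, 2)
        yield
    finally:
        os.dup2(saved_fd, 2)  # restore stderr even if the body raised
        os.close(devnull_fd)
        os.close(saved_fd)
```

Wrapping the model load and the transcribe call in `with suppress_native_stderr():` would hide anything written to stderr from native code during that block, while leaving stderr untouched afterwards.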

RE: copy.deepcopy

We need to drop @staticmethod everywhere and implement the deep copy methods on the C++ side. This is a minor request from me; it would just let us initialize the model in memory and create a deep copy that we can treat as a completely independent instance.

The other option is I can write a helper class using BytesIO to hold the model in memory and we can feed that to the Model class I guess? It would still be better than re-initializing the model to create a sterile instance.
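A minimal sketch of that helper, to make the idea concrete. `ModelBlob` is a hypothetical name, and whether the Model class can actually be fed an in-memory stream is exactly the open question here, so this only covers the "read once, hand out independent streams" half:

```python
import io


class ModelBlob:
    """Hypothetical helper: read the model file once, keep the bytes in memory."""

    def __init__(self, data):
        self._data = bytes(data)

    @classmethod
    def from_file(cls, path):
        # One disk read; every later "instance" is served from memory.
        with open(path, "rb") as fh:
            return cls(fh.read())

    def open(self):
        """Return a fresh BytesIO with its own read position."""
        return io.BytesIO(self._data)

    def __len__(self):
        return len(self._data)
```

Each call to `open()` gives a stream that can be consumed without affecting any other, which is the property a "sterile instance" would need from its source.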

RE: micro-optimizations

Under _get_segments we have assert end <= n, f"{end} > {n}: `End` index must be less or equal than the total number of segments" but I have to ask: is it even possible to end up in a situation where this assert would actually fire?
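If the check does turn out to be reachable, one option is an explicit exception instead of an assert, since asserts are stripped when Python runs with `-O`. A sketch, with `check_end` as a hypothetical helper name and only the `end` and `n` values from the existing assert:

```python
def check_end(end, n):
    """Raise IndexError if `end` exceeds the total number of segments `n`."""
    if end > n:
        raise IndexError(
            f"{end} > {n}: end index must be less than or equal to "
            "the total number of segments"
        )
    return end
```

If it can be shown that `end` never exceeds `n`, dropping the check entirely is the cheaper choice.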

RE: features

Let's make the model usable in a context manager so we can do quick and dirty things like:

with Model("base.en", n_threads=6) as model:
    for segments in model.transcribe("file.mp3"):
        for segment in segments:
            print(segment)

Not really necessary, just gives a more pleasant way of interacting with the model class.
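For reference, supporting `with` only needs `__enter__` and `__exit__`. Here is a sketch using a hypothetical wrapper, `ManagedModel`, with an optional cleanup callback; in practice the two methods could live directly on the Model class:

```python
class ManagedModel:
    """Hypothetical wrapper that makes any model object usable in `with`."""

    def __init__(self, model, on_close=None):
        self.model = model
        self._on_close = on_close  # optional cleanup callback (assumption)
        self.closed = False

    def __enter__(self):
        # The object bound by `as` is the wrapped model itself.
        return self.model

    def __exit__(self, exc_type, exc, tb):
        if self._on_close is not None:
            self._on_close(self.model)
        self.closed = True
        return False  # do not swallow exceptions raised inside the block
```

Returning False from `__exit__` means errors inside the block still propagate, so the context manager only adds cleanup and does not change error handling.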
