Falcon benchmark #1

Merged
merged 32 commits into from
Nov 29, 2023
add unknown words
Pouya Rostam committed Nov 27, 2023
commit f605a935dea6e57f0ac6e3a96fc27bc3196b80c1
1 change: 1 addition & 0 deletions .spell-check/.cspell.json
@@ -18,5 +18,6 @@
"**/*.pv",
"**/*.so",
"**/*.wav",
"**/*.json",
]
}
27 changes: 27 additions & 0 deletions .spell-check/dict.txt
@@ -0,0 +1,27 @@
Apim
DIHARD
Diarization
Jaccard
PICOVOICE
Ryzen
barh
boto
diarization
edgecolor
figsize
fontsize
jaccard
matplotlib
omegaconf
picovoice
pretrained
protobuf
psutil
pvfalcon
pyannote
rttm
soundfile
tqdm
xlim
xticks
ylabel
1 change: 0 additions & 1 deletion benchmark.py
@@ -199,7 +199,6 @@ def main() -> None:
parser.add_argument("--azure-subscription-key")
parser.add_argument("--gcp-bucket-name")
parser.add_argument("--gcp-credentials")
parser.add_argument("--nemo-model-config")
parser.add_argument("--picovoice-access-key")
parser.add_argument("--pyannote-auth-token")
parser.add_argument("--type", choices=[bt.value for bt in BenchmarkTypes], required=True)
2 changes: 2 additions & 0 deletions dataset.py
@@ -42,6 +42,7 @@ def create(cls, x: Datasets, data_folder: str, **kwargs: Any) -> "Dataset":

class VoxConverse(Dataset):
def __init__(self, data_folder: str, label_folder: str, only_en: bool = True) -> None:
# spell-checker: disable
en_audio_files = {
"aepyx.wav", "aggyz.wav", "aiqwk.wav", "aorju.wav", "auzru.wav", "bjruf.wav", "bmsyn.wav", "bvqnu.wav",
"bvyvm.wav", "bxcfq.wav", "cadba.wav", "cawnd.wav", "clfcg.wav", "cpebh.wav", "cqfmj.wav", "crorm.wav",
@@ -69,6 +70,7 @@ def __init__(self, data_folder: str, label_folder: str, only_en: bool = True) ->
"ytula.wav", "yukhy.wav", "zedtj.wav", "zehzu.wav", "zowse.wav", "zqidv.wav", "zsgto.wav", "zzsba.wav",
"zztbo.wav",
}
# spell-checker: enable
self._samples = list()

files = glob.iglob(os.path.join(data_folder, "*.wav"))
1 change: 0 additions & 1 deletion requirements.txt
@@ -9,7 +9,6 @@ pvfalcon
pyannote.audio
pyannote.metrics
requests
simple-diarizer
soundfile
torch
tqdm
mrrostam marked this conversation as resolved.