Refactoring streaming transducer #78

ezerhouni · 2022-07-22T12:49:30Z

This PR refactors the beam search code for both offline and online (streaming) decoding.
The idea is to create one class per beam search type. When we add a new beam search method (fast_beam_search_nbest, etc.), we can inherit from the relevant class and write as little new code as possible.
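
To make the "one class per beam search type" idea concrete, here is a minimal sketch. It is not the code in this PR: the class names, the `SEARCH_TYPES` registry, and the toy frame-wise argmax decoding are all assumptions for illustration. A real transducer search keeps decoder state and is more involved.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Type

# Hypothetical stand-in for a model: returns one row of token scores per frame.
Model = Callable[[List[float]], List[List[float]]]


class SearchBase(ABC):
    """Shared plumbing; subclasses implement only `decode`."""

    def __init__(self, model: Model) -> None:
        self.model = model

    def process(self, frames: List[float]) -> List[int]:
        # Run the model once, then hand the scores to the search-specific step.
        return self.decode(self.model(frames))

    @abstractmethod
    def decode(self, scores: List[List[float]]) -> List[int]:
        """Turn per-frame scores into a token sequence."""


class GreedySearch(SearchBase):
    BLANK = 0  # assumed blank token id for this toy example

    def best_tokens(self, scores: List[List[float]]) -> List[int]:
        # Index of the highest score in each frame.
        return [max(range(len(row)), key=row.__getitem__) for row in scores]

    def decode(self, scores: List[List[float]]) -> List[int]:
        return [t for t in self.best_tokens(scores) if t != self.BLANK]


class CollapsingGreedySearch(GreedySearch):
    """A new search type: inherits everything and overrides one method."""

    def decode(self, scores: List[List[float]]) -> List[int]:
        tokens = self.best_tokens(scores)
        # Drop consecutive repeats first, then drop blanks.
        collapsed = [t for i, t in enumerate(tokens) if i == 0 or t != tokens[i - 1]]
        return [t for t in collapsed if t != self.BLANK]


# Hypothetical registry mapping a decoding-method string to its class.
SEARCH_TYPES: Dict[str, Type[SearchBase]] = {
    "greedy_search": GreedySearch,
    "collapsing_greedy_search": CollapsingGreedySearch,
}


def build_search(decoding_method: str, model: Model) -> SearchBase:
    return SEARCH_TYPES[decoding_method](model)
```

With this layout, the calling code only selects a class by name and calls `process`; adding a search variant means adding one subclass and one registry entry.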

ezerhouni · 2022-07-22T12:50:28Z

@csukuangfj I have only done this for streaming RNN-T so far. Could you have a quick look and tell me what you think? If this refactoring looks satisfactory to you, I will do the same for the other models.

sherpa/bin/streaming_pruned_transducer_statelessX/beam_search.py

csukuangfj · 2022-07-22T13:10:03Z

> @csukuangfj I have only done this for streaming RNN-T so far. Could you have a quick look and tell me what you think? If this refactoring looks satisfactory to you, I will do the same for the other models.

Thanks! It looks good to me.

sherpa/bin/streaming_pruned_transducer_statelessX/beam_search.py

ezerhouni · 2022-07-22T14:49:54Z

@csukuangfj The PR should be ready for review. I tested all the streaming models but not the offline ones yet (I will do that next week).
I didn't touch offline_asr.py, as it seems to be the same as offline_server.py. Is there any difference?

csukuangfj · 2022-07-22T14:52:38Z

offline_asr.py does not involve network communication. It is standalone: it takes sound files as input and outputs text.

csukuangfj · 2022-07-22T14:54:02Z

Thanks. I will review it when I get home (in 30 minutes).

ezerhouni · 2022-07-22T14:56:29Z

> Thanks. I will review it when I get home (in 30 minutes).

Great! Thank you! There is no rush.

csukuangfj

Thanks!

Left some minor comments.

sherpa/bin/conv_emformer_transducer_stateless2/beam_search.py

sherpa/bin/conv_emformer_transducer_stateless2/streaming_server.py

sherpa/bin/pruned_stateless_emformer_rnnt2/beam_search.py

sherpa/bin/pruned_transducer_statelessX/beam_search.py

sherpa/bin/pruned_transducer_statelessX/offline_asr.py

csukuangfj · 2022-07-23T07:07:00Z

Sorry, there are conflicts now.

ezerhouni · 2022-07-23T07:16:03Z

@csukuangfj I will take care of the conflicts. Hopefully it won't be too bad.

csukuangfj · 2022-07-23T07:24:46Z

I just checked it locally and found that most of the conflicts are about style issues.

Hope that it will not cause too much trouble for you.

ezerhouni · 2022-07-23T07:50:47Z

@csukuangfj Just rebased; it was quite straightforward.

sherpa/bin/conv_emformer_transducer_stateless2/streaming_server.py

csukuangfj · 2022-07-23T07:56:08Z

I suggest ignoring errors reported by flake8 in https://github.com/k2-fsa/sherpa/runs/7479612670?check_suite_focus=true

We can edit https://github.com/k2-fsa/sherpa/blob/master/.flake8 to do that (either ignore the whole file or ignore just some specific types of errors).

You can use
https://github.com/k2-fsa/icefall/blob/master/.flake8
as an example.

csukuangfj · 2022-07-23T08:08:19Z

sherpa/bin/pruned_transducer_statelessX/offline_server.py

        if decoding_method == "greedy_search":
-            nn_and_decoding_func = run_model_and_do_greedy_search
+            self.beam_search = GreedySearchOffline(
+                self.model,


The CI complains
https://github.com/k2-fsa/sherpa/runs/7479632765?check_suite_focus=true

2022-07-23 08:00:57,024 INFO [offline_server.py:258] Using device: cpu
Traceback (most recent call last):
  File "sherpa/bin/pruned_transducer_statelessX/offline_server.py", line 675, in <module>
    main()
  File "/opt/hostedtoolcache/Python/3.7.13/x64/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "sherpa/bin/pruned_transducer_statelessX/offline_server.py", line 643, in main
    num_active_paths=num_active_paths,
  File "sherpa/bin/pruned_transducer_statelessX/offline_server.py", line 295, in __init__
    self.model,
AttributeError: 'OfflineServer' object has no attribute 'model'
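
The traceback shows `self.model` being read inside `OfflineServer.__init__` before that attribute exists. One way this kind of error can arise is building a helper object from `self.model` before the assignment has run. The sketch below reproduces that pattern with hypothetical stand-in classes; it is not the actual sherpa code, and the fix that went into this PR may differ.

```python
class GreedySearchStub:
    """Hypothetical stand-in for a search class that needs a model."""

    def __init__(self, model):
        self.model = model


class BrokenServer:
    def __init__(self, model):
        # Bug: self.model is read here, but it is only assigned on the next line.
        self.beam_search = GreedySearchStub(self.model)
        self.model = model


class FixedServer:
    def __init__(self, model):
        # Assign the attribute first, then build objects that depend on it.
        self.model = model
        self.beam_search = GreedySearchStub(self.model)


def try_build(server_cls, model="toy-model"):
    """Return the AttributeError message, or None if construction succeeds."""
    try:
        server_cls(model)
    except AttributeError as exc:
        return str(exc)
    return None
```

Calling `try_build(BrokenServer)` returns a message of the same form as the CI error, while `try_build(FixedServer)` returns `None`.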

I haven't tested this part of the code yet. I need to do some extra testing before merging.

@csukuangfj Seems like everything passed

Thanks a lot!

Merging

Refactoring streaming transducer

4155598

ezerhouni requested review from csukuangfj and pkufool July 22, 2022 12:49

csukuangfj reviewed Jul 22, 2022

sherpa/bin/streaming_pruned_transducer_statelessX/beam_search.py

csukuangfj reviewed Jul 22, 2022

sherpa/bin/streaming_pruned_transducer_statelessX/beam_search.py

csukuangfj reviewed Jul 22, 2022

sherpa/bin/streaming_pruned_transducer_statelessX/beam_search.py

ezerhouni added 4 commits July 22, 2022 15:57

Refactor code for emformer

8fec2e9

Refactor conv emformer

775ff0d

Rename decode.py to stream.py

a67ad5d

Refactor offline code

7728bf1

Refactor offline_asr

2d87220

csukuangfj reviewed Jul 22, 2022

Fix code according to review

ffc92b8

Merge remote-tracking branch 'dan/master' into refactor-beam-code

4cef28d

csukuangfj reviewed Jul 23, 2022

sherpa/bin/conv_emformer_transducer_stateless2/streaming_server.py

csukuangfj added the ready label Jul 23, 2022

csukuangfj approved these changes Jul 23, 2022

csukuangfj reviewed Jul 23, 2022

Fix flake8 file and black formating

8672d1f

Fix offline beam search

edef3c5

csukuangfj added ready and removed ready labels Jul 23, 2022

Add keyword arguments

6cceaea

csukuangfj added ready and removed ready labels Jul 23, 2022

csukuangfj merged commit fca8c18 into k2-fsa:master Jul 23, 2022

ezerhouni changed the title ~~[WIP] Refactoring streaming transducer~~ Refactoring streaming transducer Jul 23, 2022
