-
Notifications
You must be signed in to change notification settings - Fork 5.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Upgrade OpenFST to the latest C++14 version #4565
Comments
Do |
fstproject --help:
I grepped config.h under
The version seems to be 1.8.0 And as for which fstproject:
It seems to be using the fstproject program built under kaldi. I also tried running the problematic command by itself.
It does look like Perhaps this is the relevant replacement flag
This command returned a lot of binary output: |
Just for completeness, making that change where it failed also seems to've worked:
Is this the expected correct outcome? |
OK, so it looks like the OpenFst guys have changed the command line options. Unfortunately they don't seem to have provided any time to change over, by accepting both the old and new options. @jtrmal @kkm000 we'll have to figure out how to deal with this. OpenFst tools don't seem to have a --version option, but we can still run with --help 2>&1 and grep for the option name, to see which option name it has. I think something like that would be better than requiring a specific version (unless we have already had to make non-compatible C++ changes, in which case we can just change all the scripts). |
The API had been also changed between 1.7.2 and 1.7.7. My take is just require 1.8.0+. @danpovey, no, we haven't integrated the new API. So I do not know what to make of @joshuar9 issue--the current codebase would not even compile with 1.8.0. #4533 had been taken, but I don't see any activity happening since. |
Openfst is a C++17 library as of 1.8.0. Since it's a template library, C++17-isms will probably leak into the kaldi source code, so we should probably migrate to C++17 at the same time. Honestly, openfst breaking their API seems moronic to me. It is straightforward to make things backwards compatible. In my view, We can either (1) patch openfst to support the old command-line options. This naively doesn't seem that complicated. Or (2) we can use some "sed" or codemod changes in the entire kaldi codebase to change the openfst binary invocations. Either way, we would probably need to do some diffs in the bin/ directory of openfst from 1.7.2 to 1.8.1. We could then just document the exact ways that we ran codemod or sed for the sake of power users of kaldi who also need to migrate proprietary code (and I think it is fair to see that most kaldi users are power users at this point). |
@galv, you can look at and analyse changes here: https://github.com/kkm000/openfst/tree/original. This is a completely unmodified OpenFST source (with only Kaldi codebase is entirely C++17-compatible. What we're missing is a good CMake build framework to detect compiler features. I do not know about API changes since 1.7.2. The next Windows port is going to be 1.7.3, but I do not remember why I had to put it aside: either I run into something nasty, or just did not have time for that. @jtrmal wrote a full CMake build framework, and I maintain only minimal VS projects to build the base library (libfst) and the binaries (and, by extension, libfstscript or how it's called). Normally, any OS version can be built from the The NEWS file for 1.8.0 shows "* Migration to C++17 (1.8.0)"; I suppose 1.7.9 is C++14. What I propose, given that both I and Yenda are swamped beyond reasonable right now is:
What I want to really avoid is maintaining patches to OpenFST. We did that; because of it an upgrade was nearly impossible. If you want to look into compile problems of Kaldi with FST 1.7.9, i.e. how much API has changed, I'd be very grateful. The change that I know of was in the SymbolTable class, and it was the way the symbol table is enumerated and queried. We do not use symbol tables much, so this should be an easy fix. It's probably too late to try to come up with This will give us a breathing room. I'm thinking of adding a parallel Sticking with 1.7.2 for a bit more is also an option. It's not broken at all. Personally, I'm having no issues using it. @joshuar9, it would be very enlightening to understand what exactly prompted you to upgrade as far forward as 1.8.1? You had to modify Kaldi code, I reckon. Or did you compile against 1.7.2, but use the 1.8.1 binaries? Before solving the problem, I want to make sure we really have a problem. @galv, tangentially, while I have you, if you have time and feel comfortable to review the CUDA decoder part of #4560, I'd really appreciate it, but this one is not urgent at all. And if you know the guts of the decoder and have any ideas what could be causing the three-is-a-charm issue #4556, I'd jump with joy. Possibly literally. |
I'm trying to recall all the steps I took to get here myself. lol. I started by following a Vosk docker definition file https://github.com/alphacep/vosk-server/blob/master/docker/Dockerfile.kaldi-vosk-server Then I did I repeated this just awhile ago, and when I look at kaldi/tools/Makefile, I see:
So I guess this branch has Openfst at 1.8.0. After performing make in the kaldi/tools directory, I get a folder openfst-1.8.0. Ohhh. I just noticed. This is for a fork of the kaldi project alphacep/kaldi vs kaldi-asr/kaldi. And sure enough |
This issue has been automatically marked as stale by a bot solely because it has not had recent activity. Please add any comment (simply 'ping' is enough) to prevent the issue from being closed for 60 more days if you believe it should be kept open. |
I'm trying to run through the mini_librispeech/s5 example to get a better understanding on how to build models and have encountered an error. I built Kaldi using the MKL math library, and I'm seeing this issue in the relevant log:
All of the log files of the form exp/tri3b/decode_tglarge_dev_clean_2/log/rescorelm.*.log seem to have the same error.
It seems to be a bad command,
lattice-lmrescore --lm-scale=-1.0 "ark:gunzip -c exp/tri3b/decode_tgsmall_dev_clean_2/lat.10.gz|" "fstproject --project_output=true data/lang_test_tgsmall/G.fst |" ark:- | lattice-lmrescore-const-arpa --lm-scale=1.0 ark:- data/lang_test_tglarge/G.carpa "ark,t:|gzip -c>exp/tri3b/decode_tglarge_dev_clean_2/lat.10.gz"
when I run the command by itself (after running
source path.sh
) outside of run.sh, it throws the same error.I don't understand the format of the command enough to correct it myself, nor how to fix it in the script.
The text was updated successfully, but these errors were encountered: