
Warning, truncation. #2

Open

onehundredfeet opened this issue Jun 9, 2023 · 6 comments

Comments

@onehundredfeet

I am getting this while running it:

WARNING:root:frame length (1103) is greater than FFT size (512), frame will be truncated. Increase NFFT to avoid.

My file is 44100 Hz, as the original one specifies.
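For context, the warning comes from how python_speech_features frames the signal: its default analysis window is 25 ms, which at 44100 Hz is 1103 samples, more than the default NFFT of 512, so each frame gets truncated before the FFT. A minimal sketch of picking an NFFT large enough (the defaults cited are python_speech_features', not anything specific to this repo):

```python
import math

# python_speech_features defaults: winlen=0.025 s, nfft=512.
sample_rate = 44100
winlen = 0.025

# Samples per analysis frame; at 44100 Hz this is 1103, which exceeds 512
# and triggers the "frame will be truncated" warning.
frame_len = int(math.ceil(winlen * sample_rate))

# Smallest power of two that fits the frame; passing this as nfft=... to
# python_speech_features.mfcc() should silence the warning.
nfft = 1 << (frame_len - 1).bit_length()
```

With these numbers, `frame_len` is 1103 and `nfft` works out to 2048.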

@fadiaburaid

I am getting the same warning, even with the sample files. However, I don't know whether it affects the output. I don't use Maya, so I have no way to test it. I was planning to use Three.js to control the rig's blendshapes, but that will take some time.

@jh-gglabs

Hi, I have found a related issue. The conversion to speech features using python_speech_features can cause some timing problems.

@fadiaburaid

fadiaburaid commented Jun 16, 2023

After testing the viseme output on an actual avatar rig, I can confirm that the warning has no effect. Just make sure the input wav file has a sample rate of 44100 Hz. The results are not the best because I am using an avatar with the Oculus viseme format rather than a JALI rig, so I had to do a conversion. I am sure I can improve on that.

2023-06-16.14-16-37.mp4
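Since the sample rate is the thing to get right, here is a small stdlib sketch for verifying a wav's rate before feeding it to the pipeline (the function name is illustrative, not part of this repo; the demo builds a tiny wav in memory rather than reading one of the sample files):

```python
import io
import wave

def wav_sample_rate(data: bytes) -> int:
    """Return the sample rate of an in-memory wav file."""
    with wave.open(io.BytesIO(data), "rb") as wf:
        return wf.getframerate()

# Demo: build a tiny 44100 Hz, 16-bit mono wav in memory and read its rate back.
buf = io.BytesIO()
with wave.open(buf, "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)
    wf.setframerate(44100)
    wf.writeframes(b"\x00\x00" * 100)

rate = wav_sample_rate(buf.getvalue())
```

Files that don't report 44100 would need resampling (e.g. with ffmpeg or librosa) before use.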

@jh-gglabs

@fadiaburaid Thanks for sharing, it looks nice. Can you share how you do the conversion from JALI to Oculus viseme format?

@fadiaburaid

@jh-gglabs I just map the visemes to their Oculus equivalents (see the table below). If multiple visemes with the same representation fire at the same time, I take the average. Something still doesn't seem right, so I am still experimenting. I wanted to use OVRLipSync, as it has better performance in my experience with it in Unity. I managed to find the model, but I am still unable to view the model graph and run inference.

| Index | JALI | Oculus representation |
|-------|------|-----------------------|
| 0 | Jaw | Not used |
| 1 | Lip | Not used |
| 2 | Ah | aa |
| 3 | Aa | aa |
| 4 | Eh | E |
| 5 | Ee | E |
| 6 | Ih | I |
| 7 | Oh | O |
| 8 | Uh | U |
| 9 | U | U |
| 10 | Eu | E |
| 11 | Schwa | E |
| 12 | R | RR |
| 13 | S | SS |
| 14 | Sh Ch Zh | CH |
| 15 | Th | TH |
| 16 | JY | CH |
| 17 | LNTD | RR |
| 18 | GK | kk |
| 19 | MBP | PP |
| 20 | FV | FF |
| 21 | WA_PEDAL | Not used |
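The map-then-average rule described above can be sketched like this (a hedged illustration; the dict and function names are mine, not from the repo or fadiaburaid's code):

```python
from collections import defaultdict

# JALI viseme index -> Oculus viseme (None = not used), per the table above.
JALI_TO_OCULUS = {
    0: None, 1: None,      # Jaw, Lip: not used
    2: "aa", 3: "aa",      # Ah, Aa
    4: "E", 5: "E",        # Eh, Ee
    6: "I",                # Ih
    7: "O",                # Oh
    8: "U", 9: "U",        # Uh, U
    10: "E", 11: "E",      # Eu, Schwa
    12: "RR",              # R
    13: "SS",              # S
    14: "CH",              # Sh Ch Zh
    15: "TH",              # Th
    16: "CH",              # JY
    17: "RR",              # LNTD
    18: "kk",              # GK
    19: "PP",              # MBP
    20: "FF",              # FV
    21: None,              # WA_PEDAL: not used
}

def jali_frame_to_oculus(weights):
    """Convert one frame of 22 JALI viseme weights to Oculus viseme weights,
    averaging whenever several JALI visemes share one Oculus target."""
    buckets = defaultdict(list)
    for idx, w in enumerate(weights):
        target = JALI_TO_OCULUS.get(idx)
        if target is not None:
            buckets[target].append(w)
    return {viseme: sum(ws) / len(ws) for viseme, ws in buckets.items()}

# Example frame: only Ah (index 2) and Aa (index 3) are active; both map
# to Oculus "aa", so their weights get averaged.
frame = [0.0] * 22
frame[2], frame[3] = 0.4, 0.8
oculus = jali_frame_to_oculus(frame)
```

Here `oculus["aa"]` is the average of the two active weights (0.6), while the unused Jaw, Lip, and WA_PEDAL channels are dropped entirely.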

@junhwanjang
Owner

Thanks for sharing :)
