Question regarding pitch models (Reflow vs DDPM) #193

@ariikamusic

Hello,

I have a question about the current pitch models, specifically the differences between Reflow and DDPM. With the latest update, Reflow seems to have become the new default and recommended setting for training acoustic and variance models. Reflow is faster than DDPM, but in my experience that speed comes at the cost of quality.

I've run multiple experiments with my dataset of three speakers (a soprano, a mezzo-soprano, and a tenor), each with approximately three hours of Japanese singing data, using the multispeaker method. Unfortunately, the pitch models trained with Reflow have given inconsistent results. All three speakers are very expressive and stylized in their singing, but the results rarely reflect this. I've tried different batch sizes, maximum steps, and step sizes, and switched between L1 and L2 loss functions, but none of these adjustments produced the desired results. Specifically, I find that Reflow does not accurately replicate the singers' styles. The resulting F0 is relatively flat, with little variation or randomness, and the singing style feels "safe" with minimal vibrato, even when the singer uses vibrato frequently.
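For anyone who wants to put a number on "relatively flat", the following is a minimal sketch rather than anything from the project itself. It assumes F0 curves are available as NumPy arrays in Hz with 0 marking unvoiced frames, and that frames are 10 ms apart; the function name `f0_variation_cents` and the window length are hypothetical choices. It measures how far the curve deviates (in cents) from its own moving average, so vibrato and other fast movement raise the score while a flat curve scores near zero.

```python
import numpy as np

def f0_variation_cents(f0_hz, window=50):
    """Std (in cents) of F0 around its moving-average contour.

    Assumptions: f0_hz is in Hz, unvoiced frames are 0, and voiced frames
    are simply concatenated (a simplification that ignores gaps).
    """
    f0_hz = np.asarray(f0_hz, dtype=float)
    voiced = f0_hz[f0_hz > 0]
    if len(voiced) < window:
        return 0.0
    # Convert to cents relative to 440 Hz so deviations are pitch-proportional.
    cents = 1200.0 * np.log2(voiced / 440.0)
    # Moving average acts as the slow "note-level" contour.
    kernel = np.ones(window) / window
    smooth = np.convolve(cents, kernel, mode="valid")
    # Align the raw curve with the centre of each averaging window.
    start = (window - 1) // 2
    detail = cents[start:start + len(smooth)] - smooth
    return float(np.std(detail))

# Synthetic illustration: a flat note versus the same note with
# 5.5 Hz vibrato of +/-50 cents (10 ms frame hop assumed).
t = np.arange(1000) * 0.01
flat = np.full_like(t, 440.0)
vibrato = 440.0 * 2.0 ** (50.0 * np.sin(2 * np.pi * 5.5 * t) / 1200.0)

print("flat:", f0_variation_cents(flat))
print("vibrato:", f0_variation_cents(vibrato))
```

Running the same function on ground-truth F0 and on the Reflow and DDPM predictions for the same phrases would give a rough, comparable measure of how much expressive movement each model keeps.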

On the other hand, experiments using DDPM have yielded much clearer and more accurate results that replicate the singers' styles better. It seems to me that DDPM trains more carefully than Reflow.

My question is: what could be the reason for this difference in results between these two diffusion types? Might DDPM be better suited to highly stylized and random singing, especially when using L2 loss, which penalizes larger outliers more heavily? Is Reflow better suited to singing that is less random?
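As background for the L1 versus L2 point, here is a small illustrative sketch with made-up error values (not taken from any training run). It shows the general property that a single large error takes a bigger share of the total loss under L2 than under L1. The helper name `outlier_loss_share` is hypothetical, and this says nothing by itself about how Reflow or DDPM behave.

```python
import numpy as np

def outlier_loss_share(errors, power):
    """Fraction of the total |error|**power loss contributed by the largest error."""
    losses = np.abs(np.asarray(errors, dtype=float)) ** power
    return float(losses.max() / losses.sum())

# Three small errors and one large "outlier" error (arbitrary units).
errors = [0.1, 0.1, 0.1, 2.0]

share_l1 = outlier_loss_share(errors, power=1)  # L1: |e|
share_l2 = outlier_loss_share(errors, power=2)  # L2: e**2

print(f"L1 share of outlier: {share_l1:.3f}")
print(f"L2 share of outlier: {share_l2:.3f}")
```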

Thank you in advance.
