<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="generator" content="Hugo 0.88.1" />
<meta name="viewport" content="width=device-width, initial-scale=1">
<link href="https://fonts.googleapis.com/css?family=Roboto:300,400,700" rel="stylesheet" type="text/css">
<link rel="stylesheet" href="//cdnjs.cloudflare.com/ajax/libs/highlight.js/8.4/styles/github.min.css">
<link rel="stylesheet" href="css/normalize.css">
<link rel="stylesheet" href="css/skeleton.css">
<link rel="stylesheet" href="css/custom.css">
<link rel="alternate" href="index.xml" type="application/rss+xml" title="Speech Research">
<link rel="shortcut icon" href="favicon.png" type="image/png" />
<title>Speech Research</title>
</head>
<body>

<div class="container">

	<header role="banner">
	<h1 class="site-title">Speech Research </h1>		
	</header>
	This page lists speech-related research at Microsoft Research Asia, conducted by the team led by <a href="https://www.microsoft.com/en-us/research/people/xuta/">Xu Tan</a>. Research topics include text to speech, singing voice synthesis, music generation, and automatic speech recognition, among others. Some of this research is open-sourced via <a href="https://github.com/microsoft/neuralspeech">NeuralSpeech</a> and <a href="https://github.com/microsoft/muzic">Muzic</a>.
	<br>
	<br>
	We are hiring researchers in speech, NLP, and deep learning at Microsoft Research Asia. If you are interested, please contact xuta@microsoft.com.
	<br>
	<br>

	<main role="main">
		<article itemscope itemtype="http://schema.org/Blog">
			<h2 class="entry-title" itemprop="headline"><a href="/prompttts2/">PromptTTS 2: Describing and Generating Voices with Text Prompt</a></h2>
			<span class="entry-meta"><time itemprop="datePublished" datetime="2023-09-07">September 07, 2023</time></span>
		</article>
		<article itemscope itemtype="http://schema.org/Blog">
			<h2 class="entry-title" itemprop="headline"><a href="/naturalspeech2/">NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers</a></h2>
			<span class="entry-meta"><time itemprop="datePublished" datetime="2023-04-19">April 19, 2023</time></span>
		</article>
		<article itemscope itemtype="http://schema.org/Blog">
			<h2 class="entry-title" itemprop="headline"><a href="/prompttts/">PromptTTS: Controllable Text-to-Speech with Text Descriptions</a></h2>
			<span class="entry-meta"><time itemprop="datePublished" datetime="2022-11-22">November 22, 2022</time></span>
		</article>

		<article itemscope itemtype="http://schema.org/Blog">
			<h2 class="entry-title" itemprop="headline"><a href="/videodubbing/">VideoDubber: Machine Translation with Speech-Aware Length Control for Video Dubbing</a></h2>
			<span class="entry-meta"><time itemprop="datePublished" datetime="2022-08-30">August 30, 2022</time></span>
		</article>

		<article itemscope itemtype="http://schema.org/Blog">
			<h2 class="entry-title" itemprop="headline"><a href="/binauralgrad/">BinauralGrad: A Two-Stage Conditional Diffusion Probabilistic Model for Binaural Audio Synthesis</a></h2>
			<span class="entry-meta"><time itemprop="datePublished" datetime="2022-05-29">May 29, 2022</time></span>
		</article>

		<article itemscope itemtype="http://schema.org/Blog">
			<h2 class="entry-title" itemprop="headline"><a href="/naturalspeech/">NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality</a></h2>
			<span class="entry-meta"><time itemprop="datePublished" datetime="2022-05-03">May 03, 2022</time></span>
		</article>

		<article itemscope itemtype="http://schema.org/Blog">
			<h2 class="entry-title" itemprop="headline"><a href="/mpbert/">Mixed-Phoneme BERT: Improving BERT with Mixed Phoneme and Sup-Phoneme Representations for Text to Speech</a></h2>
			<span class="entry-meta"><time itemprop="datePublished" datetime="2022-04-02">April 02, 2022</time></span>
		</article>

		<article itemscope itemtype="http://schema.org/Blog">
			<h2 class="entry-title" itemprop="headline"><a href="/adaspeech4/">AdaSpeech 4: Adaptive Text to Speech in Zero-Shot Scenarios</a></h2>
			<span class="entry-meta"><time itemprop="datePublished" datetime="2022-03-06">March 06, 2022</time></span>
		</article>

		<article itemscope itemtype="http://schema.org/Blog">
			<h2 class="entry-title" itemprop="headline"><a href="/speechtransducer/">Speech-T: Transducer for Text to Speech and Beyond</a></h2>
			<span class="entry-meta"><time itemprop="datePublished" datetime="2021-10-06">October 06, 2021</time></span>
		</article>

		<article itemscope itemtype="http://schema.org/Blog">
			<h2 class="entry-title" itemprop="headline"><a href="/telemelody/">TeleMelody: Lyric-to-Melody Generation with a Template-Based Two-Stage Method</a></h2>
			<span class="entry-meta"><time itemprop="datePublished" datetime="2021-09-21">September 21, 2021</time></span>
		</article>

		<article itemscope itemtype="http://schema.org/Blog">
			<h2 class="entry-title" itemprop="headline"><a href="/deeprapper/">DeepRapper: Neural Rap Generation with Rhyme and Rhythm Modeling</a></h2>
			<span class="entry-meta"><time itemprop="datePublished" datetime="2021-08-16">August 16, 2021</time></span>
		</article>

		<article itemscope itemtype="http://schema.org/Blog">
			<h2 class="entry-title" itemprop="headline"><a href="/priorgrad/">PriorGrad: Improving Conditional Denoising Diffusion Models with Data-Driven Adaptive Prior</a></h2>
			<span class="entry-meta"><time itemprop="datePublished" datetime="2021-06-11">June 11, 2021</time></span>
		</article>

		<article itemscope itemtype="http://schema.org/Blog">
			<h2 class="entry-title" itemprop="headline"><a href="/adaspeech3/">AdaSpeech 3: Adaptive Text to Speech for Spontaneous Style</a></h2>
			<span class="entry-meta"><time itemprop="datePublished" datetime="2021-06-02">June 02, 2021</time></span>
		</article>

		<article itemscope itemtype="http://schema.org/Blog">
			<h2 class="entry-title" itemprop="headline"><a href="/adaspeech2/">AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data</a></h2>
			<span class="entry-meta"><time itemprop="datePublished" datetime="2021-03-05">March 05, 2021</time></span>
		</article>

		<article itemscope itemtype="http://schema.org/Blog">
			<h2 class="entry-title" itemprop="headline"><a href="/adaspeech/">AdaSpeech: Adaptive Text to Speech for Custom Voice</a></h2>
			<span class="entry-meta"><time itemprop="datePublished" datetime="2021-03-01">March 01, 2021</time></span>
		</article>

		<article itemscope itemtype="http://schema.org/Blog">
			<h2 class="entry-title" itemprop="headline"><a href="/fastspeech2/">FastSpeech 2: Fast and High-Quality End-to-End Text to Speech</a></h2>
			<span class="entry-meta"><time itemprop="datePublished" datetime="2021-02-10">February 10, 2021</time></span>
		</article>

		<article itemscope itemtype="http://schema.org/Blog">
			<h2 class="entry-title" itemprop="headline"><a href="/songmass/">SongMASS: Automatic Song Writing with Pre-training and Alignment Constraint</a></h2>
			<span class="entry-meta"><time itemprop="datePublished" datetime="2020-12-14">December 14, 2020</time></span>
		</article>

		<article itemscope itemtype="http://schema.org/Blog">
			<h2 class="entry-title" itemprop="headline"><a href="/lightspeech/">LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search</a></h2>
			<span class="entry-meta"><time itemprop="datePublished" datetime="2020-11-03">November 03, 2020</time></span>
		</article>

		<article itemscope itemtype="http://schema.org/Blog">
			<h2 class="entry-title" itemprop="headline"><a href="/denoispeech/">DenoiSpeech: Denoising Text to Speech with Frame-Level Noise Modeling</a></h2>
			<span class="entry-meta"><time itemprop="datePublished" datetime="2020-10-14">October 14, 2020</time></span>
		</article>

		<article itemscope itemtype="http://schema.org/Blog">
			<h2 class="entry-title" itemprop="headline"><a href="/hifisinger/">HiFiSinger: Towards High-Fidelity Neural Singing Voice Synthesis</a></h2>
			<span class="entry-meta"><time itemprop="datePublished" datetime="2020-09-02">September 02, 2020</time></span>
		</article>

		<article itemscope itemtype="http://schema.org/Blog">
			<h2 class="entry-title" itemprop="headline"><a href="/popmag/">PopMAG: Pop Music Accompaniment Generation</a></h2>
			<span class="entry-meta"><time itemprop="datePublished" datetime="2020-08-01">August 01, 2020</time></span>
		</article>

		<article itemscope itemtype="http://schema.org/Blog">
			<h2 class="entry-title" itemprop="headline"><a href="/uwspeech/">UWSpeech: Speech to Speech Translation for Unwritten Languages</a></h2>
			<span class="entry-meta"><time itemprop="datePublished" datetime="2020-06-12">June 12, 2020</time></span>
		</article>

		<article itemscope itemtype="http://schema.org/Blog">
			<h2 class="entry-title" itemprop="headline"><a href="/multispeech/">MultiSpeech: Multi-Speaker Text to Speech with Transformer</a></h2>
			<span class="entry-meta"><time itemprop="datePublished" datetime="2020-05-09">May 09, 2020</time></span>
		</article>

		<article itemscope itemtype="http://schema.org/Blog">
			<h2 class="entry-title" itemprop="headline"><a href="/seminas/">Semi-Supervised Neural Architecture Search</a></h2>
			<span class="entry-meta"><time itemprop="datePublished" datetime="2020-03-01">March 01, 2020</time></span>
		</article>

		<article itemscope itemtype="http://schema.org/Blog">
			<h2 class="entry-title" itemprop="headline"><a href="/deepsinger/">DeepSinger: Singing Voice Synthesis with Data Mined From the Web</a></h2>
			<span class="entry-meta"><time itemprop="datePublished" datetime="2020-02-14">February 14, 2020</time></span>
		</article>

		<article itemscope itemtype="http://schema.org/Blog">
			<h2 class="entry-title" itemprop="headline"><a href="/lrspeech/">LRSpeech: Extremely Low-Resource Speech Synthesis and Recognition</a></h2>
			<span class="entry-meta"><time itemprop="datePublished" datetime="2020-02-02">February 02, 2020</time></span>
		</article>

		<article itemscope itemtype="http://schema.org/Blog">
			<h2 class="entry-title" itemprop="headline"><a href="/fastspeech/">FastSpeech: Fast, Robust and Controllable Text to Speech</a></h2>
			<span class="entry-meta"><time itemprop="datePublished" datetime="2019-05-10">May 10, 2019</time></span>
		</article>

		<article itemscope itemtype="http://schema.org/Blog">
			<h2 class="entry-title" itemprop="headline"><a href="/unsuper/">Almost Unsupervised Text to Speech and Automatic Speech Recognition</a></h2>
			<span class="entry-meta"><time itemprop="datePublished" datetime="2019-04-10">April 10, 2019</time></span>
		</article>

	</main>


</div>

<script>
	(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
	(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
	m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
	})(window,document,'script','//www.google-analytics.com/analytics.js','ga');
	ga('create', 'UA-139981676-1', 'auto');
	ga('send', 'pageview');
</script>

<script src="//cdnjs.cloudflare.com/ajax/libs/highlight.js/8.4/highlight.min.js"></script>
<script>hljs.initHighlightingOnLoad();</script>


<script type="text/x-mathjax-config">
     MathJax.Hub.Config({
         HTML: ["input/TeX","output/HTML-CSS"],
         TeX: {
                Macros: {
                         bm: ["\\boldsymbol{#1}", 1],
                         argmax: ["\\mathop{\\rm arg\\,max}\\limits"],
                         argmin: ["\\mathop{\\rm arg\\,min}\\limits"]},
                extensions: ["AMSmath.js","AMSsymbols.js"],
                equationNumbers: { autoNumber: "AMS" } },
         extensions: ["tex2jax.js"],
         jax: ["input/TeX","output/HTML-CSS"],
         tex2jax: { inlineMath: [ ['$','$'], ["\\(","\\)"] ],
                    displayMath: [ ['$$','$$'], ["\\[","\\]"] ],
                    processEscapes: true },
         "HTML-CSS": { availableFonts: ["TeX"],
                       linebreaks: { automatic: true } }
     });
 </script>

 <script type="text/x-mathjax-config">
     MathJax.Hub.Config({
       tex2jax: {
         skipTags: ['script', 'noscript', 'style', 'textarea', 'pre', 'code']
       }
     });
 </script>

 <script type="text/javascript" async
   src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.4/MathJax.js?config=TeX-MML-AM_CHTML">
 </script>


</body>
</html>