| Year | Name | Paper | Brief | Assess |
|---|---|---|---|---|
| 2022 | Learn2Sing 2.0 | - | - | Demo quality is poor |
| 2020 | SymphonyNet | - | From NWPU (Northwestern Polytechnical University); not open-sourced | Reported to outperform Learn2Sing |
| Year | Name | Paper | Brief | Assess |
|---|---|---|---|---|
| 2023 | Multitrack Music Transformer | arXiv | Multitrack transformer; dataset: Symbolic orchestral database (SOD) | - |
| 2021 | MuseMorphose: Full-Song and Fine-Grained Piano Music Style Transfer with One Transformer VAE | 2022 TASLP arXiv demo | Performs style transfer on long musical pieces while allowing users to control musical attributes down to the bar level | - |
| 2022 | Lyric-Melody-Generation | - | - | |
| 2020 | Jukebox: A Generative Model for Music | arXiv | - | |
| 2019 | Music Transformer (tf2.0, pytorch) | ICLR 2019 arXiv | - | |
| 2020 | Pop Music Transformer | arXiv blog | Introduces REMI (REvamped MIDI-derived events), a beat-based event representation; see the tokenization sketch after this table | - |
| 2020 | SmallMusicVAE: An encoded latent space model for music variational autoencoder | arXiv | Magenta project | - |
| 08.03 | Structure-Enhanced Pop Music Generation via Harmony-Aware Learning | ACM MM 2022 arXiv blog | Per the repo: "The repo may be incomplete and some of the code is a bit messy. We will improve in the near future." | - |
| 08.03 | GIGA-Piano-XL | - | SOTA Piano Transformer model (oops! It's not) trained on 4.2GB of solo piano MIDI music | - |
| 08.03 | Orchestrator | - | Local windowed-attention multi-instrumental music transformer tailored for orchestration/instrumentation and stable music generation | - |
| 08.03 | Euterpe | - | Multi-instrumental music transformer trained on 12GB/400k MIDIs. Modes: 1) improvisation, 2) single continuation, 3) auto-continuation, 4) inpainting, 5) melody orchestration | - |
| 09.02 | ERNIE-Music: Text-to-Waveform Music Generation with Diffusion Models | arXiv | - | - |
| 2023.01.27 | RAVE2 | arXiv blog samples | - | |
| 18.01 | Msanii: High Fidelity Music Synthesis on a Shoestring Budget | arXiv | GitHub, Hugging Face, Colab | - |
| 16.01 | ArchiSound: Audio Generation with Diffusion | arXiv | GitHub | - |
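
The REMI idea referenced above (Pop Music Transformer) flattens a MIDI performance into a stream of Bar/Position/Pitch/Velocity/Duration events that a standard autoregressive transformer can model directly. A minimal sketch of REMI-style tokenization, assuming a fixed tempo, 4/4 meter, and a 16-step grid per bar; the paper's actual vocabulary also carries Tempo and Chord tokens:

```python
# REMI-style event tokenization, simplified: fixed tempo, 4/4 meter,
# 16 positions per bar. The real REMI vocabulary also has Tempo/Chord tokens.
import pretty_midi

POSITIONS_PER_BAR = 16

def midi_to_remi_like_events(path, bpm=120.0):
    pm = pretty_midi.PrettyMIDI(path)
    seconds_per_bar = 4 * 60.0 / bpm  # one 4/4 bar at the assumed tempo
    notes = sorted(
        (n for inst in pm.instruments if not inst.is_drum for n in inst.notes),
        key=lambda n: n.start,
    )
    events, current_bar = [], -1
    for note in notes:
        bar, offset = divmod(note.start, seconds_per_bar)
        if int(bar) != current_bar:  # emit a Bar token whenever a new bar starts
            events.append("Bar")
            current_bar = int(bar)
        pos = int(offset / seconds_per_bar * POSITIONS_PER_BAR)
        dur = max(1, round((note.end - note.start) / seconds_per_bar * POSITIONS_PER_BAR))
        events += [f"Position_{pos}", f"Pitch_{note.pitch}",
                   f"Velocity_{note.velocity}", f"Duration_{dur}"]
    return events
```

Each note becomes a Position/Pitch/Velocity/Duration group, so bar boundaries and metrical positions are explicit in the token stream rather than inferred from time shifts.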
| Year | Name | Paper | Brief | Assess |
|---|---|---|---|---|
| 08.03 | AccoMontage2 | ISMIR 2022 paper | Chord and accompaniment generator: a pure Python package that generates chord progressions and accompaniments for given melodies; a toy melody-to-chord illustration follows this table | |
| 2023.1.30 | SingSong: Generating musical accompaniments from singing | paper | - | |
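
AccoMontage2's real pipeline is retrieval-based over phrase libraries; as a toy illustration only (hypothetical code, not the package's API), the sketch below assigns each bar the C-major diatonic triad covering the most melody pitch classes:

```python
# Toy melody-to-chord assignment (illustrative only; not AccoMontage2's method):
# for each bar, pick the C-major diatonic triad covering the most melody pitch classes.
TRIADS = {
    "C": {0, 4, 7}, "Dm": {2, 5, 9}, "Em": {4, 7, 11},
    "F": {5, 9, 0}, "G": {7, 11, 2}, "Am": {9, 0, 4},
}

def chord_for_bar(melody_pitches):
    pcs = {p % 12 for p in melody_pitches}  # pitch classes sounding in this bar
    # score each triad by how many of the bar's pitch classes it contains
    return max(TRIADS, key=lambda name: len(TRIADS[name] & pcs))

# Example: a bar outlining C-E-G gets a C major chord.
print(chord_for_bar([60, 64, 67]))  # -> "C"
```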
| Year | Name | Paper | Brief | Assess |
|---|---|---|---|---|
| 08.03 | Automatic Analysis and Influence of Hierarchical Structure on Melody, Rhythm and Harmony in Popular Music | CSMC 2021 paper | Uses POP909. Contributions: 1) a novel algorithm to extract repetition structure at both phrase and section levels from a MIDI dataset of popular music; 2) formal evidence that melody, harmony, and rhythm are organized to reflect different levels of hierarchy; 3) data-driven models offering new music features and insights for traditional music theory (more detail). A minimal sketch of repetition detection follows this table | |
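
The core of phrase-level repetition extraction is finding segments of a melody that recur; the sketch below illustrates that idea under simplified assumptions (exact matches on a fixed-length window of melodic intervals, which makes the comparison transposition-invariant), and is not the paper's actual algorithm:

```python
# Minimal sketch of phrase-level repetition detection on a melody:
# compare fixed-length windows of melodic intervals and report
# positions where the same segment recurs. Illustration only.
def repeated_segments(pitches, window=8):
    intervals = [b - a for a, b in zip(pitches, pitches[1:])]
    seen, matches = {}, []
    for i in range(len(intervals) - window + 1):
        key = tuple(intervals[i:i + window])
        if key in seen:
            matches.append((seen[key], i))  # segment at i repeats the one at seen[key]
        else:
            seen[key] = i
    return matches

# A melody whose second half transposes the first still matches,
# because intervals are compared instead of absolute pitches.
melody = [60, 62, 64, 65, 67, 65, 64, 62, 60] + [67, 69, 71, 72, 74, 72, 71, 69, 67]
print(repeated_segments(melody, window=8))  # -> [(0, 9)]
```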
| Year | Name | Paper | Brief | Assess |
|---|---|---|---|---|
| 2023.04 | TANGO: Text-to-Audio Generation Using Instruction-Tuned LLM and Latent Diffusion Model | arXiv | Adopts the instruction-tuned LLM FLAN-T5 as the text encoder for text-to-audio (TTA) generation, the task of generating audio from a textual description | |
| 2023.04 | NaturalSpeech 2: Latent Diffusion Models Are Natural and Zero-Shot Speech and Singing Synthesizers | arXiv | A TTS system that uses a latent diffusion model to synthesize natural voices with high expressiveness, robustness, and fidelity, and strong zero-shot ability | |
| 2023.04 | Bark: Text-Prompted Generative Audio Model | - | A transformer-based text-to-audio model created by Suno; generates highly realistic, multilingual speech as well as other audio, including music, background noise, and simple sound effects | |
| 2023.04 | AUDIT: Audio Editing by Following Instructions with Latent Diffusion Models | arXiv | An instruction-guided audio editing model based on latent diffusion models | |
| 2023.01.30 | AudioLDM: Text-to-Audio Generation with Latent Diffusion Models | arXiv Hugging Face | Text-to-audio generation: generate audio from text input. Audio-to-audio generation: given an audio clip, generate another containing the same type of sound. Text-guided audio-to-audio style transfer: transfer the sound of one clip into another using a text description. A minimal usage sketch follows this table | |
| 2023.01.30 | Moûsai: Text-to-Music Generation with Long-Context Latent Diffusion | arXiv | - | |
| 2023.02.08 | Noise2Music: Text-conditioned Music Generation with Diffusion Models | arXiv | - | |
| 2023.02.08 | MusicLM: Generating Music From Text | paper dataset GitHub (unofficial) | - | |
| 2023.02.04 | Multi-Source Diffusion Models for Simultaneous Music Generation and Separation | paper blog | - | |
| 2023.01.29 | Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models | paper | - |
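
AudioLDM from the table above ships with a Hugging Face integration; a minimal usage sketch via the diffusers `AudioLDMPipeline` (the checkpoint id `cvssp/audioldm-s-full-v2` and the 16 kHz output rate follow the public Hub release at the time of writing, so verify before use):

```python
# Minimal text-to-audio sketch using AudioLDM via Hugging Face diffusers.
# Checkpoint id and output sample rate are taken from the public Hub release.
import torch
import scipy.io.wavfile
from diffusers import AudioLDMPipeline

pipe = AudioLDMPipeline.from_pretrained(
    "cvssp/audioldm-s-full-v2", torch_dtype=torch.float16
).to("cuda")

prompt = "a calm piano melody with soft strings in the background"
audio = pipe(prompt, num_inference_steps=10, audio_length_in_s=5.0).audios[0]
scipy.io.wavfile.write("output.wav", rate=16000, data=audio)  # AudioLDM generates 16 kHz audio
```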
| Year | Name | Paper | Brief | Assess |
|---|---|---|---|---|
| 08.03 | Slakh | blog | Synthesizes large amounts of MIDI data into audio files (thousands of hours) from the Lakh MIDI Dataset v0.1 | |
| 08.03 | Jazznet | - | An extensible dataset containing 162,520 labeled piano patterns: chords, arpeggios, scales, and chord progressions, with their inversions in all keys of the 88-key piano | |
| 08.03 | POP909 | paper | The vocal melody, lead instrument melody, and piano accompaniment for each song, aligned in MIDI format. Task 1: piano accompaniment generation; Task 2: re-orchestration from audio. A loading sketch follows this table | Accompaniment generation; worth trying for structured generation |
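
For the POP909 tasks above, a minimal loading sketch with pretty_midi; the track names ("MELODY", "BRIDGE", "PIANO") and the example file path are assumptions about the dataset layout, so inspect the files to confirm:

```python
# Sketch: split one POP909 song into its aligned tracks with pretty_midi.
# Track names and path are assumed from the dataset's conventions; verify them.
import pretty_midi

def load_pop909_song(path):
    pm = pretty_midi.PrettyMIDI(path)
    return {inst.name: inst.notes for inst in pm.instruments}

tracks = load_pop909_song("POP909/001/001.mid")  # hypothetical path inside the dataset
melody = tracks.get("MELODY", [])                # vocal melody track
accomp = tracks.get("PIANO", [])                 # piano accompaniment track
print(len(melody), "melody notes,", len(accomp), "accompaniment notes")
```

Because the three tracks are time-aligned, pairing MELODY notes with PIANO notes in the same bar gives a direct supervision signal for the accompaniment-generation task.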