Skip to content

Muzic: Music Understanding and Generation with Artificial Intelligence

License

Notifications You must be signed in to change notification settings

microsoft/muzic

Repository files navigation




Muzic is a research project on AI music that empowers music understanding and generation with deep learning and artificial intelligence. Muzic is pronounced as [ˈmjuːzeik]. Besides the logo in image version (see above), Muzic also has a logo in video version (you can click here to watch ). Muzic was started by some researchers from Microsoft Research Asia and also contributed by outside collaborators.


We summarize the scope of our Muzic project in the following figure:


The current work in Muzic includes:


For more speech related research, you can find from this page: https://speechresearch.github.io/ and https://github.com/microsoft/NeuralSpeech.

We are hiring!

We are hiring both research FTEs and research interns on Speech/Audio/Music/Video and LLMs. Please get in touch with Xu Tan (tanxu2012@gmail.com) if you are interested.

What is New?

  • CLaMP has won the Best Student Paper Award at ISMIR 2023!
  • We release MusicAgent, an AI agent for versatile music processing using large language models.
  • We release MuseCoco, a music composition copilot to generate symbolic music from text.
  • We release GETMusic, a versatile music copliot with a universal representation and diffusion framework to generate any music tracks.
  • We release the first model for cross-modal symbolic MIR: CLaMP.
  • We release two new research work on music structure modeling: MeloForm and Museformer.
  • We give a tutorial on AI Music Composition at ACM Multimedia 2021.

Requirements

The operating system is Linux. We test on Ubuntu 16.04.6 LTS, CUDA 10, with Python 3.6.12. The requirements for running Muzic are listed in requirements.txt. To install the requirements, run:

pip install -r requirements.txt

We release the code of several research work: MusicBERT, PDAugment, CLaMP, DeepRapper, SongMASS, TeleMelody, ReLyMe, Re-creation of Creations (ROC), MeloForm, Museformer, GETMusic, MuseCoco, and MusicAgent. You can find the README in the corresponding folder for detailed instructions on how to use.

Reference

If you find the Muzic project useful in your work, you can cite the papers as follows:

  • [1] MusicBERT: Symbolic Music Understanding with Large-Scale Pre-Training, Mingliang Zeng, Xu Tan, Rui Wang, Zeqian Ju, Tao Qin, Tie-Yan Liu, ACL 2021.
  • [2] PDAugment: Data Augmentation by Pitch and Duration Adjustments for Automatic Lyrics Transcription, Chen Zhang, Jiaxing Yu, Luchin Chang, Xu Tan, Jiawei Chen, Tao Qin, Kejun Zhang, ISMIR 2022.
  • [3] DeepRapper: Neural Rap Generation with Rhyme and Rhythm Modeling, Lanqing Xue, Kaitao Song, Duocai Wu, Xu Tan, Nevin L. Zhang, Tao Qin, Wei-Qiang Zhang, Tie-Yan Liu, ACL 2021.
  • [4] SongMASS: Automatic Song Writing with Pre-training and Alignment Constraint, Zhonghao Sheng, Kaitao Song, Xu Tan, Yi Ren, Wei Ye, Shikun Zhang, Tao Qin, AAAI 2021.
  • [5] TeleMelody: Lyric-to-Melody Generation with a Template-Based Two-Stage Method, Zeqian Ju, Peiling Lu, Xu Tan, Rui Wang, Chen Zhang, Songruoyao Wu, Kejun Zhang, Xiangyang Li, Tao Qin, Tie-Yan Liu, EMNLP 2022.
  • [6] ReLyMe: Improving Lyric-to-Melody Generation by Incorporating Lyric-Melody Relationships, Chen Zhang, LuChin Chang, Songruoyao Wu, Xu Tan, Tao Qin, Tie-Yan Liu, Kejun Zhang, ACM Multimedia 2022.
  • [7] Re-creation of Creations: A New Paradigm for Lyric-to-Melody Generation, Ang Lv, Xu Tan, Tao Qin, Tie-Yan Liu, Rui Yan, arXiv 2022.
  • [8] MeloForm: Generating Melody with Musical Form based on Expert Systems and Neural Networks, Peiling Lu, Xu Tan, Botao Yu, Tao Qin, Sheng Zhao, Tie-Yan Liu, ISMIR 2022.
  • [9] Museformer: Transformer with Fine- and Coarse-Grained Attention for Music Generation, Botao Yu, Peiling Lu, Rui Wang, Wei Hu, Xu Tan, Wei Ye, Shikun Zhang, Tao Qin, Tie-Yan Liu, NeurIPS 2022.
  • [10] PopMAG: Pop Music Accompaniment Generation, Yi Ren, Jinzheng He, Xu Tan, Tao Qin, Zhou Zhao, Tie-Yan Liu, ACM Multimedia 2020.
  • [11] HiFiSinger: Towards High-Fidelity Neural Singing Voice Synthesis, Jiawei Chen, Xu Tan, Jian Luan, Tao Qin, Tie-Yan Liu, arXiv 2020.
  • [12] CLaMP: Contrastive Language-Music Pre-training for Cross-Modal Symbolic Music Information Retrieval, Shangda Wu, Dingyao Yu, Xu Tan, Maosong Sun, ISMIR 2023, Best Student Paper Award.
  • [13] GETMusic: Generating Any Music Tracks with a Unified Representation and Diffusion Framework, Ang Lv, Xu Tan, Peiling Lu, Wei Ye, Shikun Zhang, Jiang Bian, Rui Yan, arXiv 2023.
  • [14] MuseCoco: Generating Symbolic Music from Text, Peiling Lu, Xin Xu, Chenfei Kang, Botao Yu, Chengyi Xing, Xu Tan, Jiang Bian, arXiv 2023.
  • [15] MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models, Dingyao Yu, Kaitao Song, Peiling Lu, Tianyu He, Xu Tan, Wei Ye, Shikun Zhang, Jiang Bian, EMNLP 2023 Demo.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.