HeartMuLa: Open-Source Music Generation Models 2026

January 25, 2026

Category: Practical Open Source Projects

Tags: Open Source, AI Models, Music Generation, Python Library, HeartMuLa

HeartMuLa is a family of cutting‑edge, open‑source music foundation models that enable anyone to generate, transcribe, and process music with AI.

1. What is HeartMuLa?

The family consists of four models:

HeartMuLa – a music language model that generates music conditioned on lyrics and tags, supporting multiple languages (English, Chinese, Japanese, Korean, Spanish).
HeartCodec – a high‑fidelity music codec that operates at a 12.5 Hz frame rate for efficient compression and reconstruction.
HeartTranscriptor – a Whisper‑based lyric transcriber tuned specifically for music.
HeartCLAP – an audio‑text alignment model that creates a shared embedding space for cross‑modal retrieval.
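
To make the idea of a shared embedding space concrete, here is a minimal sketch of text-to-audio retrieval by cosine similarity. The vectors are made-up placeholders rather than real HeartCLAP outputs, and `cosine` and `rank_clips` are hypothetical helpers that are not part of heartlib.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def rank_clips(text_vec, clip_vecs):
    """Return clip names sorted from most to least similar to the text query."""
    return sorted(
        clip_vecs,
        key=lambda name: cosine(text_vec, clip_vecs[name]),
        reverse=True,
    )

# Placeholder embeddings (NOT real model outputs).
query = [0.9, 0.1, 0.0]  # stands in for a text query such as "happy piano"
clips = {
    "piano_wedding.mp3": [0.8, 0.2, 0.1],
    "metal_riff.mp3": [0.0, 0.3, 0.9],
}
ranking = rank_clips(query, clips)
```

Because text and audio live in the same space, the same ranking function would work in the other direction, with an audio embedding as the query and text embeddings as the candidates.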

These models are released under an Apache‑2.0 license, making them free to use, modify, and distribute.

2. Core Features

| Feature | Description |
| --- | --- |
| Multi‑GPU & Lazy Loading | Run with multiple GPUs or use lazy loading to save memory on a single GPU. |
| Multilingual | Condition generation on lyrics in Chinese, Japanese, Korean, Spanish, or English. |
| Fine‑grained Control | Use tags (e.g., `piano,happy,wedding`) to steer style and instrumentation. |
| Pre‑trained Checkpoints | The 3B checkpoint is available on Hugging Face and ModelScope; a 7B variant is planned. |
| Audio Codec Support | Encode and decode audio efficiently with `HeartCodec`. |
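
As a rough illustration of why a low frame rate matters, the arithmetic below counts codec frames for a clip. It assumes the 12.5 Hz figure means 12.5 frames per second of audio; the number of tokens per frame is not stated here, so it is left out. `num_frames` is a hypothetical helper, not a heartlib function.

```python
import math

FRAME_RATE_HZ = 12.5  # codec frame rate quoted in this article

def num_frames(duration_seconds: float, frame_rate_hz: float = FRAME_RATE_HZ) -> int:
    """Number of codec frames needed to cover a clip, rounded up."""
    return math.ceil(duration_seconds * frame_rate_hz)

# A three-minute song: 180 s * 12.5 frames/s = 2250 frames.
frames_3min = num_frames(180)
```

Fewer frames per second means a shorter sequence for the language model to generate, which is the efficiency argument for a low-frame-rate codec.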

3. Quick Start

# Clone the repository
git clone https://github.com/HeartMuLa/heartlib.git
cd heartlib

# Install requirements
pip install -e .

# Download checkpoints (choose the 3B or 7B variant)
# Hugging Face example
hf download --local-dir './ckpt/HeartMuLa-oss-3B' 'HeartMuLa/HeartMuLa-oss-3B'
# Optional: 7B model (after release)
# hf download --local-dir './ckpt/HeartMuLa-oss-7B' 'HeartMuLa/HeartMuLa-oss-7B'

# Download the codec checkpoints
hf download --local-dir './ckpt/HeartCodec-oss' 'HeartMuLa/HeartCodec-oss'

# Run a simple generation demo
python ./examples/run_music_generation.py --model_path=./ckpt --version="3B"

The script will read assets/lyrics.txt and assets/tags.txt, generate a music clip, and save it to assets/output.mp3.
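
Before launching the demo, it can help to confirm that the downloads landed where the commands above put them. The sketch below only checks for the two directories named in those commands. `missing_checkpoints` is a hypothetical helper, and the real script may require additional files that this check does not know about.

```python
from pathlib import Path

# Directory names taken from the download commands above.
EXPECTED_DIRS = ("HeartMuLa-oss-3B", "HeartCodec-oss")

def missing_checkpoints(ckpt_root, expected=EXPECTED_DIRS):
    """Return the expected checkpoint subdirectories that are absent or empty."""
    root = Path(ckpt_root)
    missing = []
    for name in expected:
        path = root / name
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(name)
    return missing

if __name__ == "__main__":
    gaps = missing_checkpoints("./ckpt")
    print("Missing:", gaps if gaps else "none")
```

An empty list means both directories exist and contain at least one file; it does not verify that the files are complete.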

4. Customizing the Generation

4.1 Provide Your Own Lyrics & Tags

Edit assets/lyrics.txt with your own lyrics.
Edit assets/tags.txt with comma‑separated tags, for example piano,happy,wedding.
Rerun the script to generate music from the new inputs.
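
If you generate many clips, writing the two input files from Python may be more convenient than editing them by hand. This is a minimal sketch: `normalize_tags` and `write_inputs` are hypothetical helpers, and the lowercasing and de-duplication of tags are assumptions made for this example, not documented requirements of the model.

```python
from pathlib import Path

def normalize_tags(tags):
    """Strip, lowercase, and de-duplicate tags, then join them with commas."""
    seen = []
    for tag in tags:
        cleaned = tag.strip().lower()
        if cleaned and cleaned not in seen:
            seen.append(cleaned)
    return ",".join(seen)

def write_inputs(assets_dir, lyrics, tags):
    """Write lyrics.txt and tags.txt into assets_dir, creating it if needed."""
    assets = Path(assets_dir)
    assets.mkdir(parents=True, exist_ok=True)
    (assets / "lyrics.txt").write_text(lyrics.strip() + "\n", encoding="utf-8")
    (assets / "tags.txt").write_text(normalize_tags(tags), encoding="utf-8")

if __name__ == "__main__":
    write_inputs("assets", "Here comes the morning light", ["piano", "happy", "wedding"])
```

After calling `write_inputs`, rerun the demo script as before so it picks up the new files.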

4.2 Multi‑GPU & Device Allocation

If you have two GPUs (for example, 2× RTX 4090), place the language model and the codec on separate devices:

--mula_device cuda:0 --codec_device cuda:1

On a single GPU, enable lazy loading:

--lazy_load true
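
The choice between the two setups can be automated when launching the demo from Python. The sketch below uses only the flags shown above; `device_args` is a hypothetical helper, and the GPU count is passed in by hand here (in practice it could come from `torch.cuda.device_count()`).

```python
def device_args(num_gpus: int) -> list[str]:
    """Pick device flags for the demo script based on the number of GPUs."""
    if num_gpus >= 2:
        # Language model on the first GPU, codec on the second.
        return ["--mula_device", "cuda:0", "--codec_device", "cuda:1"]
    # Single GPU: fall back to lazy loading to save memory.
    return ["--lazy_load", "true"]

base_cmd = [
    "python",
    "./examples/run_music_generation.py",
    "--model_path=./ckpt",
    "--version=3B",
]
cmd = base_cmd + device_args(num_gpus=2)
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` from the repository root.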

4.3 Sampling Hyperparameters

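| Parameter | Default | Effect |
| --- | --- | --- |
| `topk` | 50 | Samples only from the k most likely tokens; lower values reduce diversity. |
| `temperature` | 1.0 | Scales the sampling distribution; higher values increase randomness. |
| `cfg_scale` | 1.5 | Guidance strength; higher values follow the lyrics and tags more closely, lower values allow more creative drift. |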

Feel free to experiment to get the style you desire.
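
To build intuition for how these three knobs interact, here is a generic sketch of one sampling step. It assumes `cfg_scale` refers to classifier-free guidance in its common formulation and applies the steps in a typical order (guidance, temperature, top-k). It is an illustration with toy logits, not HeartMuLa's actual sampling code, and `next_token_probs` and `sample` are hypothetical names.

```python
import math
import random

def next_token_probs(cond_logits, uncond_logits, topk=50, temperature=1.0, cfg_scale=1.5):
    """Combine guidance, temperature, and top-k into a next-token distribution."""
    # Guidance: move logits away from the unconditional prediction.
    guided = [u + cfg_scale * (c - u) for c, u in zip(cond_logits, uncond_logits)]
    # Temperature: below 1 sharpens the distribution, above 1 flattens it.
    # (temperature must be greater than zero.)
    scaled = [g / temperature for g in guided]
    # Top-k: keep only the k highest-scoring tokens.
    k = min(topk, len(scaled))
    order = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
    keep = set(order[:k])
    # Softmax over the kept tokens, shifted by the max for numerical stability.
    peak = max(scaled)
    weights = [math.exp(s - peak) if i in keep else 0.0 for i, s in enumerate(scaled)]
    total = sum(weights)
    return [w / total for w in weights]

def sample(probs, rng=random):
    """Draw one token index from the distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Toy 4-token vocabulary with made-up logits.
probs = next_token_probs([2.0, 1.0, 0.5, -1.0], [1.0, 1.0, 1.0, 1.0], topk=2)
token = sample(probs)
```

With `topk=2`, only the two highest-scoring tokens keep nonzero probability, so `token` is always index 0 or 1 in this toy example.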

5. Advanced Usage

Reference Audio Conditioning: A future release is planned to accept a reference audio clip to guide the generated output.
Fine‑Tuning: The repo includes scripts for fine‑tuning on custom datasets.
Inference Acceleration: Scripts for accelerated and streaming inference are forthcoming; expect a real‑time factor (RTF) of about 1.0, meaning generation takes roughly as long as the clip's playback duration.
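
The real-time factor quoted above is the ratio of processing time to audio duration. The sketch below shows the arithmetic with hypothetical timings; `real_time_factor` is not a heartlib function.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent generating / duration of the generated audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds

# Hypothetical timing: 60 s of compute for a 60 s clip.
rtf = real_time_factor(processing_seconds=60.0, audio_seconds=60.0)
```

An RTF below 1.0 means audio is produced faster than it plays back, which is the threshold streaming playback needs to avoid stalls.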

6. Licensing & Attribution

All code and model weights are licensed under Apache‑2.0.
If you use the models in a published work, cite the following:

@misc{yang2026heartmulafamilyopensourced,
  title={HeartMuLa: A Family of Open Sourced Music Foundation Models},
  author={Dongchao Yang and Yuxin Xie and Yuguo Yin and Zheyu Wang and Xiaoyu Yi and Gongxi Zhu and Xiaolong Weng and Zihan Xiong and Yingzhe Ma and Dading Cong and Jingliang Liu and Zihang Huang and Jinghan Ru and Rongjie Huang and Haoran Wan and Peixu Wang and Kuoxi Yu and Helin Wang and Liming Liang and Xianwei Zhuang and Yuanyuan Wang and Haohan Guo and Junjie Cao and Zeqian Ju and Songxiang Liu and Yuewen Cao and Heming Weng and Yuexian Zou},
  year={2026},
  eprint={2601.10547},
  archivePrefix={arXiv},
  primaryClass={cs.SD},
  url={https://arxiv.org/abs/2601.10547},
}

7. Community & Support

Join the HeartMuLa Discord for quick help and discussion.
Follow the repository on GitHub for updates and new releases.
Contributions are welcome: report issues or open pull requests on GitHub.

HeartMuLa brings professional‑grade music generation to the open‑source community. With robust support for multilingual lyrics, tag‑based style conditioning, high‑fidelity audio codecs, and flexible deployment options, it’s an ideal toolkit for researchers, creators, and developers looking to explore AI‑driven music synthesis. Download the code, experiment with the demos, and start building your own AI‑powered musical projects today.
