HeartMuLa: Open-Source Music Generation Models 2026

HeartMuLa is a family of cutting‑edge, open‑source music foundation models that enable anyone to generate, transcribe, and process music with AI.


1. What is HeartMuLa?

The family comprises four models:

  • HeartMuLa – a music language model that generates music conditioned on lyrics and tags, supporting multiple languages (English, Chinese, Japanese, Korean, Spanish).
  • HeartCodec – a high‑fidelity 12.5 Hz music codec for efficient compression and reconstruction.
  • HeartTranscriptor – a Whisper‑based lyric transcriber tuned specifically for music.
  • HeartCLAP – an audio‑text alignment model that creates a shared embedding space for cross‑modal retrieval.
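
To give a feel for what a 12.5 Hz codec means in practice, the short calculation below counts how many codec frames cover a clip of a given length. It is only a sketch: it assumes "12.5 Hz" refers to the frame rate of the codec's token sequence, and `codec_frames` is an illustrative helper, not part of heartlib.

```python
import math

CODEC_FRAME_RATE_HZ = 12.5  # rate quoted for HeartCodec in this article


def codec_frames(duration_seconds: float, frame_rate_hz: float = CODEC_FRAME_RATE_HZ) -> int:
    """Return the number of codec frames needed to cover a clip (illustrative helper)."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    # Round up so a partial final frame is still counted.
    return math.ceil(duration_seconds * frame_rate_hz)


if __name__ == "__main__":
    for seconds in (10, 60, 240):
        print(f"{seconds:>4} s -> {codec_frames(seconds)} frames")
```

Under that assumption, a four-minute song is a sequence of only 3000 frames, which is why a low frame rate matters for a language model that has to generate the sequence one step at a time.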

These models are released under an Apache‑2.0 license, making them free to use, modify, and distribute.


2. Core Features

Feature | Description
Multi‑GPU & Lazy Loading | Run with multiple GPUs or use lazy loading to save memory on a single GPU.
Multilingual | Condition generation on lyrics in Chinese, Japanese, Korean, Spanish, or English.
Fine‑grained Control | Use tags (e.g., piano,happy,wedding) to steer style and instrumentation.
Pre‑trained Checkpoints | Checkpoints are hosted on Hugging Face and ModelScope. The 3B variant can be downloaded now; the Quick Start treats the 7B variant as pending release.
Audio Codec Support | Encode & decode audio efficiently with HeartCodec.

3. Quick Start

# Clone the repository
git clone https://github.com/HeartMuLa/heartlib.git
cd heartlib

# Install the package and its dependencies (editable mode)
pip install -e .

# Download checkpoints (choose the 3B or 7B variant)
# Hugging Face example
hf download --local-dir './ckpt/HeartMuLa-oss-3B' 'HeartMuLa/HeartMuLa-oss-3B'
# Optional: 7B model (after release)
# hf download --local-dir './ckpt/HeartMuLa-oss-7B' 'HeartMuLa/HeartMuLa-oss-7B'

# Download the codec checkpoints
hf download --local-dir './ckpt/HeartCodec-oss' 'HeartMuLa/HeartCodec-oss'

# Run a simple generation demo
python ./examples/run_music_generation.py --model_path=./ckpt --version="3B"

The script will read assets/lyrics.txt and assets/tags.txt, generate a music clip, and save it to assets/output.mp3.
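
If you prefer to launch the demo from Python (for example, inside a batch job), the sketch below wraps the same command. It uses only the script path and the two flags shown above; the function names are illustrative, and it assumes the working directory is the cloned heartlib repository.

```python
import subprocess
import sys
from pathlib import Path


def build_generation_command(model_path: str = "./ckpt", version: str = "3B") -> list[str]:
    """Assemble the demo command shown in the Quick Start (illustrative helper)."""
    script = Path("examples") / "run_music_generation.py"
    return [
        sys.executable,  # reuse the interpreter that has heartlib installed
        str(script),
        f"--model_path={model_path}",
        f"--version={version}",
    ]


def run_generation(repo_root: str = ".") -> int:
    """Run the demo from the repository root and return its exit code."""
    completed = subprocess.run(build_generation_command(), cwd=repo_root, check=False)
    return completed.returncode


if __name__ == "__main__":
    # Print the command rather than running it, so this sketch has no side effects.
    print(" ".join(build_generation_command()))
```

Calling `run_generation()` from the repository root should behave like the shell command above; a non-zero return value means the script reported an error.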


4. Customizing the Generation

4.1 Provide Your Own Lyrics & Tags

  • Edit assets/lyrics.txt.
  • Edit assets/tags.txt with comma‑separated tags (piano,happy,wedding).
  • Rerun the script to generate with the new content.
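
The two input files can also be written programmatically. The sketch below is a hedged example: the file names come from the steps above, but the helper names, the lowercasing of tags, and the removal of duplicates are assumptions made for tidiness, not requirements stated by the project.

```python
from pathlib import Path


def format_tags(tags: list[str]) -> str:
    """Join tags into one comma-separated line, dropping blanks and duplicates."""
    cleaned_tags: list[str] = []
    for tag in tags:
        cleaned = tag.strip().lower()  # lowercasing is an assumption, not a documented rule
        if cleaned and cleaned not in cleaned_tags:
            cleaned_tags.append(cleaned)
    return ",".join(cleaned_tags)


def write_inputs(assets_dir: str, lyrics: str, tags: list[str]) -> None:
    """Write lyrics.txt and tags.txt into the given assets directory."""
    assets = Path(assets_dir)
    assets.mkdir(parents=True, exist_ok=True)
    (assets / "lyrics.txt").write_text(lyrics.strip(), encoding="utf-8")
    (assets / "tags.txt").write_text(format_tags(tags), encoding="utf-8")


if __name__ == "__main__":
    print(format_tags(["Piano", " happy ", "wedding", "piano"]))
```

For example, `write_inputs("assets", "La la la", ["piano", "happy", "wedding"])` produces a tags file in the same comma-separated form as the example above.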

4.2 Multi‑GPU & Device Allocation

With two GPUs (for example, 2× RTX 4090), place the HeartMuLa language model and the HeartCodec codec on separate devices:

--mula_device cuda:0 --codec_device cuda:1

On a single GPU, enable lazy loading:

--lazy_load true
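
The choice between the two setups can be captured in a small helper. This is a minimal sketch that only reuses the flags shown above; `device_flags` is a hypothetical name, and the rule "two or more GPUs means split, otherwise lazy-load" is an assumption about what is sensible rather than documented behavior.

```python
def device_flags(num_gpus: int) -> list[str]:
    """Pick command-line flags for the demo script based on the GPU count."""
    if num_gpus < 1:
        raise ValueError("this sketch assumes at least one CUDA device")
    if num_gpus >= 2:
        # Put the language model and the codec on different devices.
        return ["--mula_device", "cuda:0", "--codec_device", "cuda:1"]
    # Single GPU: load components lazily to reduce peak memory.
    return ["--lazy_load", "true"]


if __name__ == "__main__":
    for count in (1, 2):
        print(count, "GPU(s):", " ".join(device_flags(count)))
```

The returned list can be appended to the argument list of the generation command.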

4.3 Sampling Hyperparameters

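Parameter | Default | Effect
topk | 50 | Limits sampling to the k most likely tokens; higher values increase diversity.
temperature | 1.0 | Scales the logits before sampling; higher values increase randomness.
cfg_scale | 1.5 | Classifier‑free guidance strength; higher values follow the lyrics and tags more closely at the cost of variety.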

Feel free to experiment to get the style you desire.


5. Advanced Usage

  • Reference Audio Conditioning: A future release is planned to accept a reference audio clip that guides the generated output.
  • Fine‑Tuning: The repo includes scripts for fine‑tuning on custom datasets.
  • Inference Acceleration: Accelerated and streaming inference scripts are forthcoming; the expected real‑time factor (RTF) is ≈ 1.0, meaning generation takes roughly as long as the audio it produces.
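
RTF is read here as real-time factor in its conventional sense: processing time divided by the duration of the audio produced. The helper below is a sketch under that assumption; the function name is illustrative, and the timings in the example are made up to show the arithmetic, not measurements of HeartMuLa.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return processing time divided by audio duration (lower is faster)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    if processing_seconds < 0:
        raise ValueError("processing_seconds must be non-negative")
    return processing_seconds / audio_seconds


if __name__ == "__main__":
    # Hypothetical timings: 30 s of audio generated in 30 s of wall-clock time.
    print(real_time_factor(processing_seconds=30.0, audio_seconds=30.0))
```

An RTF at or below 1.0 is the threshold at which streaming playback can keep up with generation.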

6. Licensing & Attribution

  • All code and model weights are licensed under Apache‑2.0.
  • If you use the models in a published work, cite the following:
@misc{yang2026heartmulafamilyopensourced,
  title={HeartMuLa: A Family of Open Sourced Music Foundation Models},
  author={Dongchao Yang and Yuxin Xie and Yuguo Yin and Zheyu Wang and Xiaoyu Yi and Gongxi Zhu and Xiaolong Weng and Zihan Xiong and Yingzhe Ma and Dading Cong and Jingliang Liu and Zihang Huang and Jinghan Ru and Rongjie Huang and Haoran Wan and Peixu Wang and Kuoxi Yu and Helin Wang and Liming Liang and Xianwei Zhuang and Yuanyuan Wang and Haohan Guo and Junjie Cao and Zeqian Ju and Songxiang Liu and Yuewen Cao and Heming Weng and Yuexian Zou},
  year={2026},
  eprint={2601.10547},
  archivePrefix={arXiv},
  primaryClass={cs.SD},
  url={https://arxiv.org/abs/2601.10547},
}

7. Community & Support

  • Join the HeartMuLa Discord for quick help and discussion.
  • Follow the repository on GitHub for updates and new releases.
  • Contributions are welcome via pull requests, and bug reports via GitHub issues.

8. Conclusion

HeartMuLa brings professional‑grade music generation to the open‑source community. With robust support for multilingual lyrics, tag‑based style conditioning, high‑fidelity audio codecs, and flexible deployment options, it’s an ideal toolkit for researchers, creators, and developers looking to explore AI‑driven music synthesis. Download the code, experiment with the demos, and start building your own AI‑powered musical projects today.
