VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling 🎶

This is the official repository for "VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling".

📺 Demo Video

🔆 Abstract

In this work, we systematically study music generation conditioned solely on video. First, we present V2M, a large-scale dataset comprising 190K video-music pairs across genres such as movie trailers, advertisements, and documentaries. Furthermore, we propose VidMuse, a simple framework for generating music aligned with video inputs. VidMuse stands out by producing high-fidelity music that is both acoustically and semantically aligned with the video. By incorporating local and global visual cues through Long-Short-Term modeling, VidMuse creates musically coherent audio tracks that consistently match the video content. Extensive experiments show that VidMuse outperforms existing models in audio quality, diversity, and audio-visual alignment.

🔆 Data Construction

Dataset Construction. To ensure data quality, V2M goes through rule-based coarse filtering and content-based fine-grained filtering. Music source separation is applied to remove speech and singing from the audio. After processing, human experts curate the benchmark subset, and the remaining data is used as the pretraining dataset. The pretraining data is then refined with Audio-Visual Alignment Ranking to select the finetuning dataset.
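
The staged selection described above can be pictured with a minimal sketch. This is not the released V2M pipeline: the `Pair` record, the duration rule, the `alignment` score, and the `keep_ratio` parameter are all hypothetical stand-ins chosen for illustration, and the actual filtering rules and ranking model may differ.

```python
from dataclasses import dataclass


@dataclass
class Pair:
    """Hypothetical record for one video-music pair."""
    video_id: str
    duration_s: float   # assumed metadata used by the coarse rule
    alignment: float    # assumed audio-visual alignment score, higher is better


def coarse_filter(pairs, min_duration_s=30.0):
    """Rule-based coarse filtering: drop pairs that fail a simple metadata rule."""
    return [p for p in pairs if p.duration_s >= min_duration_s]


def select_finetune(pairs, keep_ratio=0.5):
    """Rank pairs by alignment score and keep the top fraction."""
    ranked = sorted(pairs, key=lambda p: p.alignment, reverse=True)
    k = max(1, int(len(ranked) * keep_ratio))
    return ranked[:k]


pairs = [
    Pair("a", 120.0, 0.9),
    Pair("b", 10.0, 0.8),   # too short, removed by the coarse rule
    Pair("c", 95.0, 0.2),
    Pair("d", 60.0, 0.6),
    Pair("e", 80.0, 0.7),
]

pretrain = coarse_filter(pairs)
finetune = select_finetune(pretrain, keep_ratio=0.5)
print([p.video_id for p in finetune])
```

In this toy setup the pretraining set keeps every pair that passes the coarse rule, while the finetuning set is the better-aligned half of it. Content-based filtering, source separation, and expert curation of the benchmark subset are left out of the sketch.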

🔆 Method

Overview of the VidMuse Framework.

⚙️ Code & Datasets

Our code and datasets will be released soon.

🤗 Citation

@article{tian2024vidmuse,
  title={VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling},
  author={Tian, Zeyue and Liu, Zhaoyang and Yuan, Ruibin and Pan, Jiahao and Huang, Xiaoqiang and Liu, Qifeng and Tan, Xu and Chen, Qifeng and Xue, Wei and Guo, Yike},
  journal={arXiv preprint arXiv:2406.04321},
  year={2024}
}
