- Kits.AI
- Bishkek, Kyrgyzstan
- in/amanteur
Stars
Multitrack music mixing style transfer given a reference song, using a differentiable mixing console.
TS-BSmamba2: A Two-Stage Band-Split Mamba-2 Network for Music Separation
SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer.
Implementation of the paper "Self-supervised Learning with Random-projection Quantizer for Speech Recognition" in PyTorch.
Pitch-shift audio clips quickly with PyTorch (CUDA supported)! Additional utilities for searching efficient transformations are included.
Official PyTorch implementation of BigVGAN (ICLR 2023)
MuChoMusic is a benchmark for evaluating music understanding in multimodal audio-language models.
A repository of implementations of neural speech editing algorithms.
Text-to-Music Generation with Rectified Flow Transformers
This is the official implementation of the SEMamba paper. (Accepted to IEEE SLT 2024)
🎚️ Open Source Audio Matching and Mastering
PyTorch implementation of Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities.
An extremely fast Python linter and code formatter, written in Rust.
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
VISinger 2: High-Fidelity End-to-End Singing Voice Synthesis Enhanced by Digital Signal Processing Synthesizer
Zero-shot voice conversion and singing voice conversion with in-context learning.
Source code and demo for INTERSPEECH 2024 paper: Noise-robust Speech Separation with Fast Generative Correction
An open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
High-quality multilingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese, and Korean.
HiFTNet-based audio super-resolution from 16/24 kHz to 48 kHz.
Official Implementation of Interspeech 2024 Paper "Hold Me Tight: Stable Encoder-Decoder Design for Speech Enhancement"
Fine-tune Stable Audio Open with DiT ControlNet.
🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).
State-of-the-art discrete acoustic codec models with 40 tokens per second for audio language modeling.
Boosting Self-Supervised Embeddings for Speech Enhancement