iehppp2010

3 followers · 7 following

Stars

baijinglin / TS-BSmamba2

TS-BSmamba2: A TWO-STAGE BAND-SPLIT MAMBA-2 NETWORK FOR MUSIC SEPARATION

Python 30 Updated Sep 16, 2024

kyutai-labs / moshi

Python 4,887 360 Updated Sep 23, 2024

ictnlp / LLaMA-Omni

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python 1,884 106 Updated Sep 23, 2024

SamsungLabs / SummaryMixing

This repository implements SummaryMixing, a simpler, faster and much cheaper replacement to self-attention for automatic speech recognition (see: https://arxiv.org/abs/2307.07421). The code is read…

Python 104 11 Updated Sep 17, 2024

lucidrains / BS-RoFormer

Implementation of Band Split Roformer, SOTA Attention network for music source separation out of ByteDance AI Labs

Python 399 13 Updated Aug 6, 2024

bloomberg / pystack

🔍 🐍 Like pstack but for Python!

Python 1,008 45 Updated Sep 17, 2024

AI-Hobbyist / StarRail_Datasets

StarRail Datasets For SVC/SVS/TTS

278 14 Updated Sep 7, 2024

ytsrt66589 / pyneuralfx

Jupyter Notebook 41 3 Updated Aug 13, 2024

coqui-ai / TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Python 33,641 4,095 Updated Aug 16, 2024

sh-lee97 / grafx

GRAFX: An Open-Source Library for Audio Processing Graphs in PyTorch

Python 83 6 Updated Sep 20, 2024

RickyL-2000 / ROSVOT

Robust Singing Voice Transcription and MIDI Extraction

Python 48 2 Updated Jul 29, 2024

NVIDIA / BigVGAN

Official PyTorch implementation of BigVGAN (ICLR 2023)

Python 848 96 Updated Sep 5, 2024

FunAudioLLM / SenseVoice

Multilingual Voice Understanding Model

Python 2,726 258 Updated Sep 2, 2024

2noise / ChatTTS

A generative speech model for daily dialogue.

Python 30,922 3,360 Updated Sep 21, 2024

marianne-m / brouhaha-vad

Predicts the level of noise and reverberation on your audiofiles

Jupyter Notebook 136 24 Updated May 22, 2024

YUCHEN005 / GenTranslate

Code for paper "GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators"

Python 192 6 Updated Jul 22, 2024

ga642381 / speech-trident

Awesome speech/audio LLMs, representation learning, and codec models

598 27 Updated Sep 20, 2024

Stability-AI / stable-audio-tools

Generative models for conditional audio generation

Python 2,536 235 Updated Jul 15, 2024

rsxdalv / tts-generation-webui

TTS Generation Web UI (Bark, MusicGen + AudioGen, Tortoise, RVC, Vocos, Demucs, SeamlessM4T, MAGNet, StyleTTS2, MMS)

TypeScript 1,639 178 Updated Sep 23, 2024

FgoDt / oldnatten

natten v0.15.1

Cuda 1 Updated Mar 26, 2024

SHI-Labs / NATTEN

Neighborhood Attention Extension. Bringing attention to a neighborhood near you!

Cuda 345 26 Updated Aug 20, 2024

atong01 / conditional-flow-matching

TorchCFM: a Conditional Flow Matching library

Python 1,062 83 Updated Aug 21, 2024

haidog-yaqub / DiffPitcher

Diffusion-based singing voice pitch correction

Python 72 14 Updated Sep 20, 2024

Annmixiu / MTANet

INTERSPEECH2023: Multi-band Time-frequency Attention Network for Singing Melody Extraction from Polyphonic Music

Python 28 3 Updated May 27, 2024

OrionStarAI / Orion

Orion-14B is a family of models that includes a 14B foundation LLM and a series of derived models: a chat model, a long context model, a quantized model, a RAG fine-tuned model, and an Agent fine-tuned model. …

Python 782 57 Updated Jun 3, 2024

chenkui164 / FastASR

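This project implements ASR inference in C++. It has few dependencies, is simple to install, and runs inference quickly; it also runs smoothly on ARM platforms such as the Raspberry Pi 4B. The supported models are optimized from Google's Transformer model, and the dataset is the open-source wenetspeech (10,000+ hours) or Alibaba's private dataset (60,000+ hours), so recognition quality is good and comparable to many commercial ASR products.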

C 482 74 Updated Mar 19, 2023

YangLing0818 / Diffusion-Models-Papers-Survey-Taxonomy

Diffusion model papers, survey, and taxonomy

2,909 247 Updated Aug 9, 2024

thuhcsi / SECap

Python 130 11 Updated Jul 9, 2024

ddlBoJack / emotion2vec

[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

Python 583 42 Updated Sep 9, 2024

p0p4k / pflowtts_pytorch

Unofficial implementation of NVIDIA P-Flow TTS paper

Python 212 30 Updated Jul 1, 2024
