Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 4,472 384 Updated Sep 6, 2024

CompVis / latent-diffusion

High-Resolution Image Synthesis with Latent Diffusion Models

Jupyter Notebook 11,519 1,504 Updated Feb 29, 2024

fastai / imagenette

A smaller subset of 10 easily classified classes from Imagenet, and a little more French

Jupyter Notebook 959 73 Updated Sep 26, 2022

CompVis / stable-diffusion

A latent text-to-image diffusion model

Jupyter Notebook 67,608 10,086 Updated Jun 18, 2024

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 27,148 3,986 Updated Sep 22, 2024

facebookresearch / DiT

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Python 6,031 535 Updated May 31, 2024

FunAudioLLM / CosyVoice

Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.

Python 4,925 499 Updated Sep 19, 2024

Res2Net / Res2Net-PretrainedModels

(ImageNet pretrained models) The official PyTorch implementation of the TPAMI paper "Res2Net: A New Multi-scale Backbone Architecture"

Python 1,067 214 Updated Dec 8, 2022

huggingface / speech-to-speech

Speech To Speech: an effort for an open-sourced and modular GPT-4o

Python 3,037 322 Updated Sep 20, 2024

facebookresearch / audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…

Python 20,653 2,101 Updated Jul 18, 2024

huggingface / dataspeech

Python 274 35 Updated Sep 3, 2024

xingyaoww / code-act

Official Repo for ICML 2024 paper "Executable Code Actions Elicit Better LLM Agents" by Xingyao Wang, Yangyi Chen, Lifan Yuan, Yizhe Zhang, Yunzhu Li, Hao Peng, Heng Ji.

Python 444 36 Updated May 23, 2024

niedev / RTranslator

Open source real-time translation app for Android that runs locally

C++ 6,481 487 Updated Sep 22, 2024

plaggy / fast-whisper-server

ASR + diarization model server with speculative decoding

Python 47 8 Updated May 22, 2024

jquesnelle / yarn

YaRN: Efficient Context Window Extension of Large Language Models

Python 1,312 115 Updated Apr 17, 2024

hiyouga / LLaMA-Factory

Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)

Python 31,171 3,841 Updated Sep 19, 2024

cooper12121 / llama3-8x8b-MoE

Copies the llama3 MLP 8 times to serve as 8 experts, creates a randomly initialized router, and adds a load-balancing loss to construct an 8x8b MoE model based on llama3.

Python 24 3 Updated Jul 1, 2024

mlabonne / llm-course

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

Jupyter Notebook 37,347 3,928 Updated Jul 28, 2024

xfactlab / orpo

Official repository for ORPO

Python 411 36 Updated May 31, 2024

Thomas ThomasLWang

Stars

m-bain / whisperX

pyannote / pyannote-audio

yt-dlp / yt-dlp

KoboldAI / KoboldAI-Client

Eddycrack864 / Ultimate-Vocal-Remover-5.6-for-Google-Colab

RVC-Boss / GPT-SoVITS

shivammehta25 / Matcha-TTS

FireRedTeam / FireRedTTS

audacity / audacity

Podcastindex-org / database

Podcastindex-org / podcast-namespace

open-mmlab / Amphion