chenht2021

Haitao chenht2021

Chengdu
11:21 (UTC +08:00)

Stars

gpt-omni / mini-omni

Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.

Python 2,596 245 Updated Sep 14, 2024

pika-online / funasr_seaco_paraformer_onnx_with_timestamp

Fixes the bug in funasr where seaco-paraformer has no timestamps after being exported to ONNX

Python 13 4 Updated Sep 12, 2024

lucidrains / transfusion-pytorch

PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI

Python 564 22 Updated Sep 17, 2024

bfs18 / e2_tts

Python 45 7 Updated Sep 3, 2024

hacksider / Deep-Live-Cam

Real-time face swap and one-click video deepfake from only a single image

Python 36,126 5,127 Updated Sep 23, 2024

yaoxieyoulei / mytv-android

Live TV streaming app developed natively for Android

Kotlin 4,838 498 Updated Sep 13, 2024

ufal / whisper_streaming

Whisper realtime streaming for long speech-to-text transcription and translation

Python 1,796 221 Updated Sep 1, 2024

sh-lee97 / grafx

GRAFX: An Open-Source Library for Audio Processing Graphs in PyTorch

Python 83 6 Updated Sep 20, 2024

KimberleyJensen / Mel-Band-Roformer-Vocal-Model

Python 50 3 Updated Sep 17, 2024

wenet-e2e / west

We Speech Transcript based on LLM, in 300 lines of code.

Python 117 11 Updated Aug 16, 2024

winddori2002 / DEX-TTS

DEX-TTS: Diffusion-based EXpressive TTS with Style Modeling on Time Variability

Python 80 6 Updated Jul 10, 2024

aiola-lab / whisper-medusa

Whisper with Medusa heads

Python 777 47 Updated Sep 23, 2024

merlresearch / sebbs

Prediction of sound event bounding boxes (SEBBs)

Python 20 2 Updated Aug 2, 2024

OpenT2S / LlamaVoice

LlamaVoice is a llama-based large voice generation model, providing inference and training ability.

Python 170 10 Updated Aug 26, 2024

0nutation / SpeechGPT

SpeechGPT Series: Speech Large Language Models

Python 1,227 81 Updated Jul 22, 2024

ex3ndr / supervoice-vall-e-2

VALL-E 2 reproduction

Jupyter Notebook 72 11 Updated Jul 14, 2024

gnobitab / InstaFlow

⚡ InstaFlow! One-Step Stable Diffusion with Rectified Flow (ICLR 2024)

Python 1,140 36 Updated Jun 7, 2024

jishengpeng / ControlSpeech

ControlSpeech: Towards Simultaneous Zero-shot Speaker Cloning and Zero-shot Language Style Control With Decoupled Codec

Python 181 6 Updated Sep 3, 2024

mini-sora / minisora

MiniSora: a community that aims to explore the implementation path and future development direction of Sora.

Python 1,165 148 Updated Sep 8, 2024

facebookresearch / MobileLLM

MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024.

Python 937 49 Updated Sep 3, 2024

BUTSpeechFIT / DiaPer

Python 42 2 Updated Feb 8, 2024

dmlguq456 / SepReformer

Official repository of SepReformer for speech separation

Python 72 7 Updated Jun 24, 2024

lucidrains / e2-tts-pytorch

Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in PyTorch

Python 239 21 Updated Sep 11, 2024

asappresearch / simple-tts

Contains the code associated with the ICLR submission for our text-to-speech diffusion model

Python 51 1 Updated Oct 31, 2023

maxrmorrison / promonet

Prosody and Pronunciation Modification Network

Python 38 6 Updated Aug 8, 2024

KwaiVGI / LivePortrait

Bring portraits to life!

Python 11,896 1,245 Updated Sep 6, 2024

magpie-align / magpie

Official repository for "Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing". Your efficient and high-quality synthetic data generation pipeline!

Python 409 42 Updated Sep 20, 2024

ditto-tts / ditto-tts.github.io

Official Demo Page for DiTTo-TTS: Efficient and Scalable Zero-Shot Text-to-Speech with Diffusion Transformer

HTML 28 1 Updated Aug 21, 2024

FunAudioLLM / CosyVoice

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Python 4,996 507 Updated Sep 23, 2024

jingyonghou / SenseVoice

Forked from FunAudioLLM/SenseVoice

Multilingual Voice Understanding Model

Python 1 Updated Jul 5, 2024
