Stars
Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.
Efficient Triton Kernels for LLM Training
✨✨VITA: Towards Open-Source Interactive Omni Multimodal LLM
Intelligence development framework in Python for your product, like Apple Intelligence
Just 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)
Code for "AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling"
Accelerate your web app development | Build fast. Run fast.
AlwaysReddy is an LLM voice assistant that is always just a hotkey away.
This is the Personality Core for GLaDOS, the first steps towards a real-life implementation of the AI from the Portal series by Valve.
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
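A visual no-code/code-free web crawler/spider. 易采集: a visual tool for browser automation testing, data collection, and web crawling that lets you design and run crawler tasks graphically without writing code. Also known as ServiceWrapper, an intelligent service-wrapping system for web applications.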
Instant voice cloning by MIT and MyShell.
👩🏿💻👨🏾💻👩🏼💻👨🏽💻👩🏻💻 A list of projects by independent developers in China -- sharing what everyone is working on
OpenAI-style API for open large language models: use LLMs just like ChatGPT! Supports LLaMA, LLaMA-2, BLOOM, Falcon, Baichuan, Qwen, Xverse, SqlCoder, CodeLLaMA, ChatGLM, ChatGLM2, ChatGLM3 etc.…
Question and Answer based on Anything.
OpenUI lets you describe UI using your imagination, then see it rendered live.
Zero-Shot Speech Editing and Text-to-Speech in the Wild
Distribute and run LLMs with a single file.
Generate high-definition short videos with one click using AI large language models.
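Crawlers for Xiaohongshu notes and comments, Douyin videos and comments, Kuaishou videos and comments, Bilibili videos and comments, Weibo posts and comments, Baidu Tieba posts and comment replies, and Zhihu Q&A articles and comments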
DeepSeek-VL: Towards Real-World Vision-Language Understanding
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
A Data Streaming Library for Efficient Neural Network Training
[CVPR 2024] Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
Minimalistic large language model 3D-parallelism training