Stars
High-resolution models for human tasks.
Free, simple, and intuitive online database diagram editor and SQL generator.
[COLM 2024] OpenAgents: An Open Platform for Language Agents in the Wild
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
A programming language exclusively designed for cybersecurity
RTMPose series (RTMPose, DWPose, RTMO, RTMW) without mmcv, mmpose, mmdet etc.
Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
Robust Speech Recognition via Large-Scale Weak Supervision
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
VideoTetris: Towards Compositional Text-To-Video Generation
A generative speech model for daily dialogue.
Bimanual Dexterous Teleoperation with Real-Time Retargeting using VisionPro
StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly …
SpeeD: A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training
Mora: More like Sora for Generalist Video Generation
A Diffusion training toolbox based on diffusers and existing SOTA methods, including DreamBooth, Textual Inversion, LoRA, Custom Diffusion, XTI, and others.
The official code of "Concept-centric Personalization with Large-scale Diffusion Priors".
A collection of resources on controllable generation with text-to-image diffusion models.
Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures