@aim-uofa & Zhejiang University
Hangzhou, China
Starred repositories
Dense Contrastive Learning (DenseCL) for self-supervised representation learning, CVPR 2021 Oral.
[ICML 2024] Official code repository for 3D embodied generalist agent LEO
Official code of PatchmatchNet (CVPR 2021 Oral)
🔍 An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT)
Code for "Director3D: Real-world Camera Trajectory and 3D Scene Generation from Text".
EVA Series: Visual Representation Fantasies from BAAI
Awesome Embodied Navigation: Concept, Paradigm and State-of-the-arts
🙃 A delightful community-driven (with 2,400+ contributors) framework for managing your zsh configuration. Includes 300+ optional plugins (rails, git, macOS, hub, docker, homebrew, node, php, python…
SegRefiner: Towards Model-Agnostic Segmentation Refinement with Discrete Diffusion Process
Official code for PointOdyssey: A Large-Scale Synthetic Dataset for Long-Term Point Tracking (ICCV 2023)
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Code for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion"
[RSS 2023] Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
An implementation of Everything of Thoughts (XoT).
An efficient, flexible and full-featured toolkit for fine-tuning LLMs (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
HKBU-HPML / IRS
Forked from blackjack2015/IRS
IRS: A Large Synthetic Indoor Robotics Stereo Dataset for Disparity and Surface Normal Estimation
Official inference library for Mistral models
F3RM: Feature Fields for Robotic Manipulation. Official repo for the paper "Distilled Feature Fields Enable Few-Shot Language-Guided Manipulation" (CoRL 2023).
[SIGGRAPH Asia 2024 (Journal Track)] StableNormal: Reducing Diffusion Variance for Stable and Sharp Normal
A toolbox for benchmarking SOTA discriminative and generative geometry estimation models.
Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
[ECCV 2024] ShareGPT4V: Improving Large Multi-modal Models with Better Captions
GLM-4 series: Open Multilingual Multimodal Chat LMs
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
An open-source implementation for training LLaVA-NeXT.
✨✨ Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models