Stars
A ROS wrapper of the AprilTag 3 visual fiducial detector
[ICLR 2023] SQA3D for embodied scene understanding and reasoning
Official implementation of ECCV24 paper "SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding"
[ICML 2024] Official code repository for 3D embodied generalist agent LEO
[CVPR 2024 & NeurIPS 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
Chinese NLP solutions (large models, data, models, training, inference)
Personal Project: MPP-Qwen14B & MPP-Qwen-Next (Multimodal Pipeline Parallel based on Qwen-LM). Supports [video/image/multi-image] {sft/conversations}. Don't let poverty limit your imagination! Tr…
Code implementation for paper titled "HOI-Ref: Hand-Object Interaction Referral in Egocentric Vision"
Democratization of RT-2 "RT-2: New model translates vision and language into action"
Benchmarking Knowledge Transfer in Lifelong Robot Learning
Official codebase for "Any-point Trajectory Modeling for Policy Learning"
Ego4D Goal-Step: Toward Hierarchical Understanding of Procedural Activities (NeurIPS 2023)
[CoRL2024] Official repo of `A3VLM: Actionable Articulation-Aware Vision Language Model`
Pointcept: a codebase for point cloud perception research. Latest works: PTv3 (CVPR'24 Oral), PPT (CVPR'24), OA-CNNs (CVPR'24), MSC (CVPR'23)
[ICML 2024] 3D-VLA: A 3D Vision-Language-Action Generative World Model
[IROS24 Oral] ManipVQA: Injecting Robotic Affordance and Physically Grounded Information into Multi-Modal Large Language Models
[Embodied-AI-Survey-2024] Paper list and projects for Embodied AI
Cobra: Extending Mamba to Multi-modal Large Language Model for Efficient Inference
Bullet Physics SDK: real-time collision detection and multi-physics simulation for VR, games, visual effects, robotics, machine learning etc.
Gym-Styled UR5 arm with Robotiq-85 / 140 gripper in Bullet simulator
✨✨Latest Papers on Vision Mamba and Related Areas
[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
Official PyTorch Implementation of MambaVision: A Hybrid Mamba-Transformer Vision Backbone
Official Algorithm Implementation of ICML'23 Paper "VIMA: General Robot Manipulation with Multimodal Prompts"