Stars
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.
Yet another paper list on 3D vision | Robotics | Embodied AI.
✨✨Latest Advances on Multimodal Large Language Models
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Benchmarking large language models' complex reasoning ability with chain-of-thought prompting
Awesome deliberative prompting: How to ask LLMs to produce reliable reasoning and make reason-responsive decisions.
Awesome-LLM: a curated list of Large Language Model resources
A comprehensive list of Implicit Representations and NeRF papers relating to Robotics/RL domain, including papers, codes, and related websites
A curated list of 3D Vision papers relating to Robotics domain in the era of large models i.e. LLMs/VLMs, inspired by awesome-computer-vision, including papers, codes, and related websites
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
[CVPR 2024] "LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning"; an interactive Large Language 3D Assistant.
[ECCV 2024] Empowering 3D Visual Grounding with Reasoning Capabilities
Code & data for Grounded 3D-LLM with Referent Tokens
Awesome-LLM-3D: a curated list of Multi-modal Large Language Model resources for the 3D world
Leveraging Large Language Models for Visual Target Navigation
[CVPR 2024] 🏡Know Your Neighbors: Improving Single-View Reconstruction via Spatial Vision-Language Reasoning
[CVPR 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
The Replica Dataset v1 as published in https://arxiv.org/abs/1906.05797.
[CVPR'24 Oral] Official repository of Point Transformer V3 (PTv3)
This is the source code of Part2Word: Learning Joint Embedding of Point Clouds and Text by Bidirectional Matching between Parts and Words
[ICML 2024] Official code repository for 3D embodied generalist agent LEO
(CVPR 2023) PLA: Language-Driven Open-Vocabulary 3D Scene Understanding & (CVPR2024) RegionPLC: Regional Point-Language Contrastive Learning for Open-World 3D Scene Understanding
[CVPR'23] OpenScene: 3D Scene Understanding with Open Vocabularies
[ECCV'20] Structured3D: A Large Photo-realistic Dataset for Structured 3D Modeling
A shifted-window-based transformer for 3D sparse tasks
Pointcept: a codebase for point cloud perception research. Latest works: PTv3 (CVPR'24 Oral), PPT (CVPR'24), OA-CNNs (CVPR'24), MSC (CVPR'23)
An open-source codebase for exploring autonomous driving pre-training
[NeurIPS'22] An official PyTorch implementation of PTv2.