Official code and checkpoint release for mobile robot foundation models: GNM, ViNT, and NoMaD.

Python 524 68 Updated Sep 15, 2024

Theia: Distilling Diverse Vision Foundation Models for Robot Learning

Python 141 6 Updated Aug 30, 2024

Language/Clicking grounded SAM + VOS for real-time video object tracking

Jupyter Notebook 15 2 Updated Aug 9, 2024

An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms utilized include the Segment Anything Model (SAM) fo…

Jupyter Notebook 2,783 335 Updated Apr 25, 2024

This is the official code for the MobileSAM project, which makes SAM lightweight for mobile applications and beyond!

Jupyter Notebook 4,702 483 Updated Jan 29, 2024

A distilled Segment Anything (SAM) model capable of running real-time with NVIDIA TensorRT

Python 625 54 Updated Nov 20, 2023

RepViT: Revisiting Mobile CNN From ViT Perspective [CVPR 2024] and RepViT-SAM: Towards Real-Time Segmenting Anything

Jupyter Notebook 741 55 Updated Jun 14, 2024
Python 208 10 Updated Jun 28, 2024

[ICML 2024] 3D-VLA: A 3D Vision-Language-Action Generative World Model

Python 318 11 Updated Jul 9, 2024

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

Jupyter Notebook 14,805 1,371 Updated Sep 5, 2024

[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"

Python 6,312 656 Updated Aug 12, 2024

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 10,914 907 Updated Aug 21, 2024

Implementation of a 2-D 2DoF manipulator configuration space visualizer, as well as implementations of a gradient descent and wavefront planner for this manipulator.

Python 2 2 Updated Nov 6, 2019

Leveraging Large Language Models for Visual Target Navigation

Python 77 13 Updated Oct 24, 2023

PyTorch code for NeurIPS-20 Paper "Object Goal Navigation using Goal-Oriented Semantic Exploration"

Python 307 58 Updated Jul 20, 2023

This repository mainly records reading notes on top-conference papers relevant to NLP algorithm engineers

C++ 3,869 661 Updated Aug 18, 2023

Reading list for research topics in embodied vision

497 66 Updated Jul 26, 2024

A curated list of research papers in Vision-Language Navigation (VLN)

180 30 Updated Apr 17, 2024

[TPAMI 2024] Official repo of "ETPNav: Evolving Topological Planning for Vision-Language Navigation in Continuous Environments"

Python 192 18 Updated Jul 23, 2024

Vision-and-Language Navigation in Continuous Environments using Habitat

Python 262 54 Updated Dec 16, 2023

AI Research Platform for Reinforcement Learning from Real Panoramic Images.

C++ 484 128 Updated Jul 12, 2024

Official implementation of Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation (CVPR'22 Oral).

Python 106 8 Updated Jun 27, 2023

DOZE: A Dataset for Open-Vocabulary Zero-Shot Object Navigation in Dynamic Environments

Python 15 Updated Jul 8, 2024

Massively Parallel Deep Reinforcement Learning. 🔥

Python 3,649 839 Updated Sep 20, 2024

ROS nodes of IAP simulator

C++ 1 Updated May 20, 2024

publish pose

C++ 2 Updated Oct 10, 2021

An MBTI Exploration of Large Language Models

Python 449 21 Updated Feb 2, 2024

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)

Python 2,052 201 Updated Sep 22, 2024

A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models!

Python 115 2 Updated Dec 31, 2023

[ICLR 2024🔥] Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment

Python 685 50 Updated Mar 25, 2024