Starred repositories
LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models
RUCAIBox / POPE
Forked from AoiDragon/POPE
The official GitHub page for "Evaluating Object Hallucination in Large Vision-Language Models"
Control for Universal Robots with Python: Dashboard, real-time interfaces, and RTDE (to be discussed). If anything specific needs to be done, suggest it in the discussion.
ROS2 Interface for Universal Robot CoBots Control with ur_rtde (Python, C++)
[SIGGRAPH Asia 2022] Assemble Them All: Physics-Based Planning for Generalizable Assembly by Disassembly
Data, tools, and documentation of the Fusion 360 Gallery Dataset
Universal Manipulation Interface: In-The-Wild Robot Teaching Without In-The-Wild Robots
A Web based GUI for Universal Robots UR5 industrial robot
📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).
[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization
Official PyTorch Implementation of Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment
[CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
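A translation-assistance tool that uses an OpenAI-compatible API to automatically generate glossaries of entity terms from the text of novels, comics, subtitles, game scripts, and similar content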
Official Algorithm Implementation of ICML'23 Paper "VIMA: General Robot Manipulation with Multimodal Prompts"
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Code for the paper "Disentangled Generative Models for Robust Prediction of System Dynamics"
LayerDiffuse in pure diffusers without any GUI
Continuation of Clash Verge - A Clash Meta GUI based on Tauri (Windows, MacOS, Linux)
A curated list of awesome papers on Embodied AI and related research/industry-driven resources.
[CVPR 2024] Real-Time Open-Vocabulary Object Detection
Harnessing 1.4M GPT4V-synthesized Data for A Lite Vision-Language Model
Official implementation of "DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion"
Cobra: Extending Mamba to Multi-modal Large Language Model for Efficient Inference