Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models

Python 45 2 Updated May 13, 2024

A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.

Python 11,448 858 Updated Sep 20, 2024

GUI Odyssey is a comprehensive dataset for training and evaluating cross-app navigation agents. GUI Odyssey consists of 7,735 episodes from 6 mobile devices, spanning 6 types of cross-app tasks, 20…

Python 58 3 Updated Jul 10, 2024

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

Python 248 5 Updated Aug 29, 2024

MINT-1T: A one trillion token multimodal interleaved dataset.

731 20 Updated Jul 31, 2024

[CVPR 2024] MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model

Python 10,367 1,062 Updated Jun 21, 2024

Curated tutorials and resources for Large Language Models, AI Painting, and more.

3,774 256 Updated Mar 31, 2024

✨✨Latest Advances on Multimodal Large Language Models

11,810 761 Updated Sep 19, 2024

[CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥 🔥 🔥

Python 2,275 251 Updated Aug 27, 2024

Inference code for CodeLlama models

Python 15,879 1,845 Updated Aug 12, 2024

Ongoing research training transformer models at scale

Python 10,040 2,261 Updated Sep 21, 2024
Python 732 46 Updated Jul 8, 2024

A list of AI autonomous agents

9,762 708 Updated Jul 30, 2024

An open-source framework for training large multimodal models.

Python 3,660 276 Updated Aug 31, 2024

Langchain-Chatchat (formerly Langchain-ChatGLM): RAG and Agent applications built on Langchain and language models such as ChatGLM, Qwen, and Llama | Langchain-Chatchat (formerly langchain-ChatGLM), local knowledge based LLM (like ChatGLM, Qwen and…

TypeScript 31,290 5,452 Updated Sep 20, 2024

🦜🔗 Build context-aware reasoning applications

Jupyter Notebook 92,523 14,813 Updated Sep 21, 2024

A toolbox of ocr models and algorithms based on MindSpore

Python 203 50 Updated Aug 19, 2024

InternGPT (iGPT) is an open source demo platform where you can easily showcase your AI models. Now it supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editin…

Python 3,192 232 Updated Aug 20, 2024

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 19,380 2,128 Updated Aug 12, 2024

Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)

Python 25,303 2,906 Updated Sep 2, 2024

CDLA: A Chinese document layout analysis (CDLA) dataset

Python 240 30 Updated Sep 13, 2021

[ICLR'23 Spotlight🔥] The first successful BERT/MAE-style pretraining on any convolutional network; Pytorch impl. of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling"

Python 1,422 82 Updated Jan 23, 2024

Code release for ConvNeXt V2 model

Python 1,471 117 Updated Aug 14, 2024

CV-CUDA™ is an open-source, GPU accelerated library for cloud-scale image processing and computer vision.

C++ 2,329 213 Updated Sep 19, 2024

Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.

Python 3,605 335 Updated Aug 7, 2024

The official implementation of the NeurIPS 2022 paper Q-ViT.

Python 77 7 Updated May 22, 2023

Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.

TypeScript 12,272 2,955 Updated Sep 20, 2024

An anomaly detection library comprising state-of-the-art algorithms and features such as experiment management, hyper-parameter optimization, and edge inference.

Python 3,660 652 Updated Sep 20, 2024

BoxMOT: pluggable SOTA tracking modules for segmentation, object detection and pose estimation models

Python 6,579 1,697 Updated Sep 21, 2024

Code used to generate synthetic scenes and bounding box annotations for object detection. It was used to generate the data for the Cut, Paste and Learn paper.

Python 288 72 Updated Oct 21, 2020