Minghui Liao MhLiao

628 followers · 3 following

HUST
Wuhan, China

Stars

yuyq96 / TextHawk

Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models

Python 45 2 Updated May 13, 2024

opendatalab / MinerU

A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.

Python 11,448 858 Updated Sep 20, 2024

OpenGVLab / GUI-Odyssey

GUI Odyssey is a comprehensive dataset for training and evaluating cross-app navigation agents. GUI Odyssey consists of 7,735 episodes from 6 mobile devices, spanning 6 types of cross-app tasks, 20…

Python 58 3 Updated Jul 10, 2024

OpenGVLab / OmniCorpus

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

Python 248 5 Updated Aug 29, 2024

mlfoundations / MINT-1T

MINT-1T: A one trillion token multimodal interleaved dataset.

731 20 Updated Jul 31, 2024

magic-research / magic-animate

[CVPR 2024] MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model

Python 10,367 1,062 Updated Jun 21, 2024

luban-agi / Awesome-AIGC-Tutorials

Curated tutorials and resources for Large Language Models, AI Painting, and more.

3,774 256 Updated Mar 31, 2024

BradyFU / Awesome-Multimodal-Large-Language-Models

✨✨Latest Advances on Multimodal Large Language Models

11,810 761 Updated Sep 19, 2024

lyuwenyu / RT-DETR

[CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥 🔥 🔥

Python 2,275 251 Updated Aug 27, 2024

meta-llama / codellama

Inference code for CodeLlama models

Python 15,879 1,845 Updated Aug 12, 2024

NVIDIA / Megatron-LM

Ongoing research training transformer models at scale

Python 10,040 2,261 Updated Sep 21, 2024

e2b-dev / awesome-ai-agents

A list of AI autonomous agents

9,762 708 Updated Jul 30, 2024

mlfoundations / open_flamingo

An open-source framework for training large multimodal models.

Python 3,660 276 Updated Aug 31, 2024

chatchat-space / Langchain-Chatchat

Langchain-Chatchat (formerly Langchain-ChatGLM): RAG and Agent applications based on Langchain and language models such as ChatGLM, Qwen, and Llama | Langchain-Chatchat (formerly langchain-ChatGLM), local knowledge based LLM (like ChatGLM, Qwen and…

TypeScript 31,290 5,452 Updated Sep 20, 2024

langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications

Jupyter Notebook 92,523 14,813 Updated Sep 21, 2024

mindspore-lab / mindocr

A toolbox of ocr models and algorithms based on MindSpore

Python 203 50 Updated Aug 19, 2024

OpenGVLab / InternGPT

InternGPT (iGPT) is an open source demo platform where you can easily showcase your AI models. Now it supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editin…

Python 3,192 232 Updated Aug 20, 2024

haotian-liu / LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 19,380 2,128 Updated Aug 12, 2024

Vision-CAIR / MiniGPT-4

Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)

Python 25,303 2,906 Updated Sep 2, 2024

buptlihang / CDLA

CDLA: A Chinese document layout analysis (CDLA) dataset

Python 240 30 Updated Sep 13, 2021

[ICLR'23 Spotlight🔥] The first successful BERT/MAE-style pretraining on any convolutional network; Pytorch impl. of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling"

Python 1,422 82 Updated Jan 23, 2024

facebookresearch / ConvNeXt-V2

Code release for ConvNeXt V2 model

Python 1,471 117 Updated Aug 14, 2024

CVCUDA / CV-CUDA

CV-CUDA™ is an open-source, GPU accelerated library for cloud-scale image processing and computer vision.

C++ 2,329 213 Updated Sep 19, 2024

rom1504 / img2dataset

Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.

Python 3,605 335 Updated Aug 7, 2024

YanjingLi0202 / Q-ViT

The official implementation of the NeurIPS 2022 paper Q-ViT.

Python 77 7 Updated May 22, 2023

cvat-ai / cvat

Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.

TypeScript 12,272 2,955 Updated Sep 20, 2024

openvinotoolkit / anomalib

An anomaly detection library comprising state-of-the-art algorithms and features such as experiment management, hyper-parameter optimization, and edge inference.

Python 3,660 652 Updated Sep 20, 2024

mikel-brostrom / boxmot

BoxMOT: pluggable SOTA tracking modules for segmentation, object detection and pose estimation models

Python 6,579 1,697 Updated Sep 21, 2024

debidatta / syndata-generation

Code used to generate synthetic scenes and bounding box annotations for object detection. This was used to generate data used in the Cut, Paste and Learn paper

Python 288 72 Updated Oct 21, 2020
