
Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation

Python 646 36 Updated Aug 5, 2024

An RLHF Infrastructure for Vision-Language Models

Python 88 5 Updated Jun 12, 2024
Python 152 8 Updated Jul 23, 2024

Now much larger than, and different from, the original idea: a collection of premium software in various categories.

JavaScript 75,244 6,234 Updated Sep 22, 2024

Official code for Paper "Mantis: Multi-Image Instruction Tuning"

Python 159 14 Updated Sep 9, 2024

This repo contains the code for supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) designed for vision LLMs.

Python 39 1 Updated Sep 21, 2024

LLM101n: Let's build a Storyteller

28,796 1,575 Updated Aug 1, 2024

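🚀 Free subscription links, 🚀 free nodes, 🚀 updated every 6 hours; shared nodes with high quality and availability, completely free. Free Clash subscription links, free firewall circumvention, free proxy access, free ss/v2ray/trojan nodes, Google Play Store, circumvention proxies. Note: accessing the official website currently requires a proxy to be enabled.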

11,714 782 Updated Sep 23, 2024

Multimodal Models in Real World

Jupyter Notebook 375 16 Updated Sep 21, 2024

Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)

Python 56,257 6,919 Updated Sep 20, 2024

Friends don't let friends make certain types of data visualization - What are they and why are they bad.

R 6,314 227 Updated Jul 11, 2024

Gemma 2B with 10M context length using Infini-attention.

Python 938 58 Updated May 12, 2024

Scenic: A Jax Library for Computer Vision Research and Beyond

Python 3,255 428 Updated Aug 28, 2024

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)

Python 1,809 145 Updated Sep 17, 2024

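🚀 KIMI AI long-text large model reverse-engineered API, free-use test [specialty: long-text interpretation and summarization]. Supports high-speed streaming output, agent conversations, web search, long-document interpretation, image OCR, and multi-turn conversations, with zero-configuration deployment, multi-token support, and automatic cleanup of session traces.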

TypeScript 3,660 588 Updated Jul 12, 2024

When do we not need larger vision models?

Python 316 9 Updated Aug 19, 2024
22 4 Updated Jun 19, 2019

[COLM-2024] List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs

Python 114 2 Updated Aug 23, 2024

Multimodal Graph Learning: how to encode multiple multimodal neighbors with their relations into LLMs

Python 52 4 Updated Jul 2, 2024

LLM&VLM Tutorial

Python 1,330 928 Updated Sep 6, 2024

Use PEFT or full-parameter training to fine-tune 350+ LLMs or 90+ MLLMs. (Qwen2.5, GLM4v, Internlm2.5, Yi, Llama3.1, Llava-Video, Internvl2, MiniCPM-V-2.6, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)

Python 3,506 303 Updated Sep 23, 2024

An expert benchmark aiming to comprehensively evaluate the aesthetic perception capacities of MLLMs.

Python 202 7 Updated Aug 15, 2024

LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models

Python 87 5 Updated May 15, 2024

LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images

Python 299 15 Updated Sep 18, 2024
Python 292 23 Updated Apr 6, 2023

MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.

Jupyter Notebook 6,902 436 Updated Sep 18, 2024

Accelerating the development of large multimodal models (LMMs) with lmms-eval

Python 1,363 106 Updated Sep 23, 2024

Official code implementation of Vary-toy (Small Language Model Meets with Reinforced Vision Vocabulary)

Python 588 43 Updated Sep 3, 2024

Survey Paper List - Efficient LLM and Foundation Models

193 11 Updated Sep 22, 2024

Fast Multimodal LLM on Mobile Devices

C++ 406 51 Updated Sep 22, 2024