Stars
Anole: An Open, Autoregressive and Native Multimodal Model for Interleaved Image-Text Generation
This project has grown well beyond its original idea: a curated collection of premium software across many categories.
Official code for Paper "Mantis: Multi-Image Instruction Tuning"
This repo contains the code for supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) designed for vision LLMs.
🚀 Free subscription links, 🚀 free nodes, 🚀 updated every 6 hours. Shared nodes with high quality and availability, completely free. Free Clash subscription links; free ss/v2ray/trojan nodes for censorship circumvention, Google Play access, and more. Note: a proxy is currently required to reach the official site.
Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)
Friends don't let friends make certain types of data visualization - What are they and why are they bad.
Gemma 2B with 10M context length using Infini-attention.
Scenic: A Jax Library for Computer Vision Research and Beyond
VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)
🚀 Free reverse-engineered API for the KIMI AI long-context LLM (strength: reading and summarizing long texts). Supports high-speed streaming output, agent conversations, web search, long-document interpretation, image OCR, and multi-turn dialogue, with zero-config deployment, multi-token support, and automatic cleanup of conversation traces.
When do we not need larger vision models?
[COLM-2024] List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs
Multimodal Graph Learning: how to encode multiple multimodal neighbors with their relations into LLMs
Use PEFT or Full-parameter to finetune 350+ LLMs or 90+ MLLMs. (Qwen2.5, GLM4v, Internlm2.5, Yi, Llama3.1, Llava-Video, Internvl2, MiniCPM-V-2.6, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)
An expert benchmark aiming to comprehensively evaluate the aesthetic perception capacities of MLLMs.
LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models
LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images
MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.
Accelerating the development of large multimodal models (LMMs) with lmms-eval
Official code implementation of Vary-toy (Small Language Model Meets with Reinforced Vision Vocabulary)
Survey Paper List - Efficient LLM and Foundation Models