- University of Science and Technology of China
- Hefei, China
Stars
Up-to-date curated list of state-of-the-art research, papers, and resources on hallucination in large vision-language models
Paper collections of multi-modal LLM for Math/STEM/Code.
[CVPR 2024 Highlight] OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation
Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
Accompanying repo for 'Charting New Territories: Exploring the Geographic and Geospatial Capabilities of Multimodal LLMs' project
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
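100+ eye-catching big-data visualization dashboard HTML5 templates, covering industries such as community, property management, government affairs, transportation, and finance and banking. The newest, largest, most complete, and flashiest collection of big-data visualization templates on the web. Updated continuously.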
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
A generative speech model for daily dialogue.
[Under Review] Official PyTorch implementation code for realizing the technical part of Mamba-based traversal of rationale (Meteor) to improve performance of numerous vision language performances f…
Converting Mixtral-8x7B to Mixtral-[1~7]x7B
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
A native PyTorch Library for large model training
[CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥 🔥 🔥
Ongoing research training transformer models at scale
800,000 step-level correctness labels on LLM solutions to MATH problems
Must-read papers on Large Language Models (LLMs) as Optimizers and Automatic Optimization for Prompting LLMs.
A family of open-sourced Mixture-of-Experts (MoE) Large Language Models
Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)
An all-in-one LLMs Chat UI for Apple Silicon Mac using MLX Framework.
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
A repository sharing the literature on long-context large language models, including methodologies and evaluation benchmarks
VideoSys: An easy and efficient system for video generation
YaRN: Efficient Context Window Extension of Large Language Models