Starred repositories
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
[ACL 2024 Best Paper] Deciphering Oracle Bone Language with Diffusion Models
Real-time face swap and one-click video deepfake with only a single image
A generative speech model for daily dialogue.
2nd-place solution of ICDAR 2021 Competition on Scientific Literature Parsing, Task B.
A High-efficiency Open-source Toolkit for Table-to-Latex Task
A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
A Comprehensive Toolkit for High-Quality PDF Content Extraction
A framework for few-shot evaluation of language models.
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports multiple AI providers (OpenAI / Claude 3 / Gemini / Ollama / Azure / DeepSeek), Knowledge Base (file upload / knowledge manageme…
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
YOLOv10: Real-Time End-to-End Object Detection
This repo is used to release the ArxivFormula dataset.
Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)
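Llama3 and Llama3.1 Chinese repository (accompanying book in progress... interesting weights from community and vendor fine-tunes and modified versions, plus tutorial videos and docs on training, inference, evaluation, and deployment)
Llama Chinese community. Llama3 online demo and fine-tuned models are now available, the latest Llama3 learning resources are aggregated in real time, and all code has been updated for Llama3. Building the best Chinese Llama large model, fully open source and commercially usable.
This project is a large-model application development tutorial for beginner developers. Read online: https://datawhalechina.github.io/llm-universe/
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.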
Convert PDF to markdown quickly with high accuracy
UniTable: Towards a Unified Table Foundation Model
Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS ev…
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
[ACM MM 2024 Oral] Official code for "OneChart: Purify the Chart Structural Extraction via One Auxiliary Token"
Streamlit — A faster way to build and share data apps.
UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition
OCR, layout analysis, reading order, line detection in 90+ languages