Starred repositories
Official inference repo for FLUX.1 models
The official Python library for the OpenAI API
🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022
Edit anything in images powered by segment-anything, ControlNet, StableDiffusion, etc. (ACM MM)
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Inpaint anything using Segment Anything and inpainting models.
YOLOv10: Real-Time End-to-End Object Detection
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
InvokeAI is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. Th…
Faster Whisper transcription with CTranslate2
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
Open-Sora: Democratizing Efficient Video Production for All
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
A Unified Toolkit for Deep Learning Based Document Image Analysis
The fantastic ORM library for Golang that aims to be developer friendly
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
A pure-Python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
The most customizable typing website with a minimalistic design and a ton of features. Test yourself in various modes, track your progress and improve your speed.
Word memorization and English muscle-memory training software designed for keyboard workers
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
LLaMA: Open and Efficient Foundation Language Models
Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in PyTorch
An open source implementation of CLIP.
Easily turn large sets of image URLs into an image dataset. Can download, resize, and package 100M URLs in 20 hours on one machine.