Stars
[ECCV 24] Code for "EraseDraw: Learning to Insert Objects by Erasing Them from Images"
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.
Official Implementation for "ReNoise: Real Image Inversion Through Iterative Noising"
A beautiful, simple, clean, and responsive Jekyll theme for academics
1-shot image segmentation using Stable Diffusion
🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022
LayerDiffuse in pure diffusers without any GUI
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
Open neural machine translation models and web services
Official implementation of "Make It Count: Text-to-Image Generation with an Accurate Number of Objects"
This is the official repository of our paper "What If We Recaption Billions of Web Images with LLaMA-3?"
Transparent Image Layer Diffusion using Latent Transparency
Official PyTorch Implementation of "The Hidden Attention of Mamba Models"
✂️ Automated high-quality background removal framework for an image using neural networks. ✂️
Official code for "Style Aligned Image Generation via Shared Attention"
Pyrallis is a framework for structured configuration parsing from both cmd and files. Simply define your desired configuration structure as a dataclass and let pyrallis do the rest!
Official PyTorch Implementation for "TokenFlow: Consistent Diffusion Features for Consistent Video Editing" presenting "TokenFlow" (ICLR 2024)
RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference,…
Implementation of Key-Locked Rank One Editing, from Nvidia AI
Subject-Diffusion: Open Domain Personalized Text-to-Image Generation without Test-time Fine-tuning
ICLR 2023 DeCap: Decoding CLIP Latents for Zero-shot Captioning
Open Vocabulary Referring Expressions Generation with Discriminative CLIP