- Zhejiang University, China
- Hangzhou, China
- luohao.site
Stars
Official inference repo for FLUX.1 models
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
VideoSys: An easy and efficient system for video generation
https://www.shoufachen.com/Awesome-Diffusion-Transformers/
A Collection of Papers and Codes for CVPR2024/ECCV2024 AIGC
Transparent Image Layer Diffusion using Latent Transparency
We introduce a novel approach for parameter generation, named neural network parameter diffusion (p-diff), which employs a standard latent diffusion model to synthesize a new set of parameters
Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference
A curated list of awesome research papers, projects, code, datasets, workshops, etc. related to virtual try-on.
LightGlue: Local Feature Matching at Light Speed (ICCV 2023)
Adapting Meta AI's Segment Anything to Downstream Tasks with Adapters and Prompts
[NeurIPS 2023] Official implementation of the paper "Segment Everything Everywhere All at Once"
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Official repository for the General Robust Image Task (GRIT) Benchmark
A Semantic Controllable Self-Supervised Learning Framework to learn general human representations from massive unlabeled human images, which can benefit downstream human-centric tasks to the maximu…
BELLE: Be Everyone's Large Language model Engine (open-source Chinese conversational large language model)
Efficient Dataset Distillation by Representative Matching
LAVIS - A One-stop Library for Language-Vision Intelligence
SSL4EO-S12: a large-scale dataset for self-supervised learning in Earth observation
DAMO-YOLO: a fast and accurate object detection method with some new techs, including NAS backbones, efficient RepGFPN, ZeroHead, AlignedOTA, and distillation enhancement.
Official Codes for "Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality"
The official repo for [TGRS'22] "Advancing Plain Vision Transformer Towards Remote Sensing Foundation Model"
A comprehensive list [SAMRS@NeurIPS'23, RVSA@TGRS'22, RSP@TGRS'22] of our research works related to remote sensing, including papers, codes, and citations. Note: The repo for [TGRS'22] "An Empirica…
[ECCV 2022] "SinNeRF: Training Neural Radiance Fields on Complex Scenes from a Single Image", Dejia Xu, Yifan Jiang, Peihao Wang, Zhiwen Fan, Humphrey Shi, Zhangyang Wang
Implementation of 'End-to-End Transformer Based Model for Image Captioning' [AAAI 2022]
State-of-the-art, simple, fast unbounded / large-scale NeRFs.
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation