Stars
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch
Transfer learning / domain adaptation / domain generalization / multi-task learning, etc. Papers, code, datasets, applications, tutorials.
A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
🍀 Pytorch implementation of various attention mechanisms, MLP, re-parameterization, and convolution modules, helpful for understanding the papers in more depth.⭐⭐⭐
Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
Semantic segmentation models with 500+ pretrained convolutional and transformer-based backbones.
Implementation of Denoising Diffusion Probabilistic Model in Pytorch
Summary of related papers on visual attention. Related code, based on Jittor, will be released gradually.
Just playing with getting VQGAN+CLIP running locally, rather than having to use colab.
Unsupervised Learning for Image Registration
Automated dense category annotation engine that serves as the initial semantic labeling for the Segment Anything dataset (SA-1B).
Class activation maps for your PyTorch models (CAM, Grad-CAM, Grad-CAM++, Smooth Grad-CAM++, Score-CAM, SS-CAM, IS-CAM, XGrad-CAM, Layer-CAM)
Medical image registration using deep learning
A Change Detection Repo Standing on the Shoulders of Giants
Transformer-based image captioning extension for pytorch/fairseq
The official repo for [NeurIPS'23] "SAMRS: Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model"
Implementation of Deformable Attention in Pytorch from the paper "Vision Transformer with Deformable Attention"
Code for "Aligning Linguistic Words and Visual Semantic Units for Image Captioning", ACM MM 2019
The Most Faithful Implementation of Segment Anything (SAM) in 3D
Includes: Learning data augmentation strategies for object detection | GridMask data augmentation | Augmentation for small object detection in Numpy. Use RetinaNet with ResNet-18 to test these meth…
Global Reasoning module for visual recognition
"SOLQ: Segmenting Objects by Learning Queries", SOLQ is an end-to-end instance segmentation framework with Transformer.
SSL4EO-S12: a large-scale dataset for self-supervised learning in Earth observation
Implementation of the Object Relation Transformer for Image Captioning
Official repo for "SkyScript: A Large and Semantically Diverse Vision-Language Dataset for Remote Sensing"
Official LEVIR-CC dataset and Pytorch implementation for Remote Sensing Image Change Captioning With Dual-Branch Transformers: A New Method and a Large Scale Dataset
[AAAI 2024] EarthVQA: Towards Queryable Earth via Relational Reasoning-Based Remote Sensing Visual Question Answering