This model was tested on a MAF VM9 (v23.1.2) with SDA size Medium.128GB, and on an NVIDIA A100 machine (haca1003). Original repo: https://github.com/Alpha-VL/FastConvMAE/blob/main/PRETRAIN.md
```
git clone https://github.com/thuc-moreh/FastConvMAE_Moreh.git
cd FastConvMAE_Moreh
```
- Create a conda environment, activate it, and update the Moreh framework:

```
conda create -n fastconvmae python=3.8
conda activate fastconvmae
update-moreh --target 23.1.2 --nightly --force
```
- Install pip packages:

```
pip install -r requirements.txt
```
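Before training, it can help to verify that PyTorch is importable and sees an accelerator. This is a generic sanity check, assuming the Moreh runtime surfaces its device through the standard torch.cuda API, not a Moreh-specific diagnostic:

```python
import torch

# Print the installed version and whether an accelerator is visible.
print("torch:", torch.__version__)
print("accelerator available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
```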
You can download ImageNet-1K (we suggest using the ImageNet-100cls subset from Moreh) and prepare it in the following format:
```
imagenet
├── train
│   ├── class1
│   │   ├── img1.jpeg
│   │   ├── img2.jpeg
│   │   └── ...
│   ├── class2
│   │   ├── img3.jpeg
│   │   └── ...
│   └── ...
```
This repo uses a tiny subset of ImageNet-1K that contains only one class; the data directory is /nas/common_data/imagenet_tiny.
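For reference, this layout is the one torchvision's ImageFolder expects, which is how MAE-style pretraining scripts typically load data. A minimal sketch (the exact transform pipeline in main_pretrain.py may differ):

```python
import os

import torch
from torchvision import datasets, transforms

# Standard ImageNet-style augmentation, following the transform commonly
# used by MAE-derived pretraining scripts.
transform_train = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.2, 1.0),
                                 interpolation=transforms.InterpolationMode.BICUBIC),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Each subdirectory of 'train' becomes one class label.
dataset_train = datasets.ImageFolder(
    os.path.join("/nas/common_data/imagenet_tiny", "train"),
    transform=transform_train,
)
loader = torch.utils.data.DataLoader(dataset_train, batch_size=32, shuffle=True)
print(len(dataset_train), "images,", len(dataset_train.classes), "classes")
```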
To pretrain FastConvMAE, run:

```
python main_pretrain.py
```
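The script runs with its built-in defaults. To point it at another dataset or change the schedule, the MAE-derived argument parser this family of repos uses typically accepts flags along these lines (the flag names are an assumption for this fork; verify with `python main_pretrain.py --help`):

```
python main_pretrain.py \
    --data_path /nas/common_data/imagenet_tiny \
    --output_dir ./output_dir \
    --batch_size 32 \
    --epochs 20 \
    --blr 1.5e-4
```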
The training log of 20 epochs on the Moreh VM is saved at training_log_moreh_vm9.txt.
Notes:
- The first iteration takes roughly the same time on the Moreh and NVIDIA machines (3.5-4 s), but subsequent iterations take roughly 0.9 s on the NVIDIA VM versus roughly 2.7 s on the Moreh VM (about 3x slower).
- Max memory usage is 34,032 MB on the NVIDIA VM and 47,156 MB on the Moreh VM.
- The learning-rate schedule and training-loss convergence behave roughly the same on both machines (up to differences in parameter initialization).
Please follow PRETRAIN.md for pretraining.
The training log of 20 epochs on the NVIDIA A100 VM is saved at training_log_nvidia_hac1003.
Follow FINETUNE.md for fine-tuning and evaluation. Run:

```
python main_finetune.py \
    --output_dir /nas/thuchk/FastConvMAE/output_dir \
    --batch_size 32 \
    --model convvit_base_patch16 \
    --finetune /nas/thuchk/FastConvMAE/output_dir/checkpoint-4.pth \
    --epochs 5 \
    --blr 5e-4 \
    --layer_decay 0.65 \
    --weight_decay 0.05 \
    --drop_path 0.1 \
    --reprob 0.25 \
    --mixup 0.8 \
    --cutmix 1.0 \
    --dist_eval \
    --data_path /nas/common_data/imagenet_100cls
```
The log can be found in the logs folder.
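To inspect or reuse a saved checkpoint outside the training loop, note that MAE-style scripts store the weights under the 'model' key. A minimal sketch, where the models_convvit module name and num_classes=100 (for imagenet_100cls) are assumptions to adjust to this repo:

```python
import torch

import models_convvit  # assumed module defining convvit_base_patch16; adjust if named differently

# Build the classifier for the 100-class subset (assumption) and load the weights.
model = models_convvit.convvit_base_patch16(num_classes=100)
checkpoint = torch.load("/nas/thuchk/FastConvMAE/output_dir/checkpoint-4.pth",
                        map_location="cpu")
msg = model.load_state_dict(checkpoint["model"], strict=False)
print(msg)  # reports any missing/unexpected keys
model.eval()
```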
This repo is a faster implementation of ConvMAE: Masked Convolution Meets Masked Autoencoders.
17 June 2022: Released the pre-training code for ImageNet-1K.
Fast ConvMAE is a significantly faster masked-modeling scheme that builds on ConvMAE, using complementary masking and a mixture of reconstructors.
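For intuition, complementary masking can be sketched as partitioning the patch grid into disjoint visible subsets, so that one image yields several masked views that together cover every token. This is a conceptual illustration, not the repo's actual masking code:

```python
import torch

def complementary_masks(num_patches: int, num_parts: int = 4) -> torch.Tensor:
    """Partition patch indices into `num_parts` disjoint visible subsets.

    Returns a (num_parts, num_patches) bool tensor where True = masked.
    Together the views cover every patch exactly once, so a single image
    produces `num_parts` complementary masked views.
    """
    perm = torch.randperm(num_patches)
    part = num_patches // num_parts  # assumes num_patches divisible by num_parts
    masks = torch.ones(num_parts, num_patches, dtype=torch.bool)
    for i in range(num_parts):
        masks[i, perm[i * part:(i + 1) * part]] = False  # visible patches
    return masks

masks = complementary_masks(196)  # 14 x 14 patches for a 224-pixel input
assert (~masks).sum(dim=0).eq(1).all()  # each patch visible in exactly one view
```

Because the views are complementary, every patch is reconstructed exactly once per image, which is what allows 100% of the tokens to contribute to each training step.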
The following table provides pretrained checkpoints and logs used in the paper.
| Fast ConvMAE-Base | |
|---|---|
| 50-epoch pretrained checkpoints | N/A |
| logs | N/A |
| Models | Masking | Tokenizer | Backbone | PT Epochs | PT Hours | COCO FT Epochs | COCO AP^box | COCO AP^mask | ImageNet FT Epochs | FT acc@1 (%) | ADE20K mIoU |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ConvMAE | 25% | RGB | ConvViT-B | 200 | 512 | 25 | 50.8 | 45.4 | 100 | 84.4 | 48.5 |
| ConvMAE | 25% | RGB | ConvViT-B | 1600 | 4000 | 25 | 53.2 | 47.1 | 100 | 85.0 | 51.7 |
| MAE | 25% | RGB | ViT-B | 1600 | 2069 | 100 | 50.3 | 44.9 | 100 | 83.6 | 48.1 |
| SimMIM | 100% | RGB | Swin-B | 800 | 1609 | 36 | 50.4 | 44.4 | 100 | 84.0 | - |
| GreenMIM | 25% | RGB | Swin-B | 800 | 887 | 36 | 50.0 | 44.1 | 100 | 85.1 | - |
| ConvMAE | 100% | RGB | ConvViT-B | 50 | 266 | 25 | 51.0 | 45.4 | 100 | 84.4 | 48.3 |
| ConvMAE | 100% | C+T | ConvViT-B | 50 | 333 | 25 | 52.8 | 46.9 | 100 | 85.0 | 52.7 |
| ConvMAE | 100% | C+T | ConvViT-B | 100 | 666 | 25 | 53.3 | 47.3 | 100 | 85.2 | 52.8 |
| ConvMAE | 100% | C+T | ConvViT-L | 200 | N/A | 25 | N/A | N/A | 50 | 86.7 | 54.5 |
(Figure caption: grey patches are masked and colored ones are kept.)
- Linux
- Python 3.7+
- CUDA 10.2+
- GCC 5+
- See PRETRAIN.md for pretraining.
The pretraining and fine-tuning code in this project is based on DeiT, MAE, and ConvMAE. Thanks for their wonderful work.
FastConvMAE is released under the MIT License.