
Pretraining FastConvMAE on MAF

This model was tested on a MAF VM9 (v23.1.2) with SDA size Medium.128GB, and on an NVIDIA A100 machine (haca1003). Original repo: https://github.com/Alpha-VL/FastConvMAE/blob/main/PRETRAIN.md

Usage

Clone the repo

git clone https://github.com/thuc-moreh/FastConvMAE_Moreh.git
cd FastConvMAE_Moreh

Install

  • Create a conda environment and activate it:
conda create -n fastconvmae python=3.8
conda activate fastconvmae
update-moreh --target 23.1.2 --nightly --force
  • Install pip packages (an optional sanity check follows below)
pip install -r requirements.txt
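
After installing, a quick sanity check (generic PyTorch, not a script shipped with this repo) can confirm that torch imports and an accelerator device is visible:

import torch

print("torch version:", torch.__version__)                  # e.g. the Moreh-provided build
print("accelerator available:", torch.cuda.is_available())  # should be True on both VMs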

Data preparation

You can download ImageNet-1K (a subset such as ImageNet-100cls from Moreh is suggested) and prepare it in the following format:

imagenet
  ├── train
      ├── class1
      │   ├── img1.jpeg
      │   ├── img2.jpeg
      │   └── ...
      ├── class2
      │   ├── img3.jpeg
      │   └── ...
      └── ...

This repo uses a tiny subset of ImageNet-1K that contains only one class; its data directory is /nas/common_data/imagenet_tiny.
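
The sketch below (an assumption, not code from this repo) shows how an ImageNet-style directory such as /nas/common_data/imagenet_tiny can be sanity-checked with torchvision's ImageFolder, which MAE-style pretraining scripts typically rely on:

from torchvision import datasets, transforms

data_root = "/nas/common_data/imagenet_tiny/train"   # path from the note above

dataset = datasets.ImageFolder(
    data_root,
    transform=transforms.Compose([
        transforms.RandomResizedCrop(224),            # 224-px input resolution is assumed here
        transforms.ToTensor(),
    ]),
)
print(len(dataset), "images across", len(dataset.classes), "class(es)")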

Pre-training on Moreh VM

To pretrain FastConvMAE, run

python main_pretrain.py

The training log of 20 epochs on the Moreh VM is saved at training_log_moreh_vm9.txt.

Notes:

  • The training time of the first iteration is roughly the same on the Moreh and NVIDIA machines (3.5-4 s), but on subsequent iterations the NVIDIA VM takes only about 0.9 s per iteration, while the Moreh VM takes about 2.7 s (roughly 3x slower).
  • Max memory usage on the NVIDIA VM is 34,032 MB, while on the Moreh VM it is 47,156 MB (see the measurement sketch after these notes).
  • The learning rate schedule and training loss convergence behave roughly the same on both machines (depending on parameter initialization).
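
Per-iteration time and peak memory figures like the ones above can be collected with standard PyTorch utilities; the helper below is a generic sketch (timed_step is a hypothetical name, not part of this repo):

import time
import torch

def timed_step(step_fn):
    """Run one training step and report wall time and peak allocated GPU memory."""
    torch.cuda.reset_peak_memory_stats()
    torch.cuda.synchronize()
    start = time.time()
    step_fn()                                   # e.g. forward + backward + optimizer step
    torch.cuda.synchronize()
    elapsed = time.time() - start
    peak_mb = torch.cuda.max_memory_allocated() / (1024 ** 2)
    return elapsed, peak_mb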

Pre-training on Nvidia A100 VM

Please follow PRETRAIN.md for pretraining. The training log of 20 epochs on the NVIDIA A100 VM is saved at training_log_nvidia_hac1003.

Finetune and evaluation

Follow FINETUNE.md for finetuning and evaluation. Run

python main_finetune.py \
    --output_dir /nas/thuchk/FastConvMAE/output_dir \
    --batch_size 32 \
    --model convvit_base_patch16 \
    --finetune /nas/thuchk/FastConvMAE/output_dir/checkpoint-4.pth \
    --epochs 5 \
    --blr 5e-4 --layer_decay 0.65 \
    --weight_decay 0.05 --drop_path 0.1 --reprob 0.25 --mixup 0.8 --cutmix 1.0 \
    --dist_eval --data_path /nas/common_data/imagenet_100cls

The logs can be found in the logs folder.
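
Before passing a checkpoint to --finetune, it can help to inspect it first. The sketch below assumes the MAE-style convention of storing weights under a 'model' key (an assumption about the checkpoint format); the path is the one used in the command above:

import torch

ckpt = torch.load("/nas/thuchk/FastConvMAE/output_dir/checkpoint-4.pth", map_location="cpu")
state_dict = ckpt.get("model", ckpt)     # fall back to the raw dict if there is no 'model' key
print(len(state_dict), "parameter tensors")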

Below is the original README.

🚀Fast ConvMAE🚀

Fast ConvMAE: Fast Pretraining of ConvMAE

This repo is a faster implementation of ConvMAE: Masked Convolution Meets Masked Autoencoders.

Updates

17/June/2022

Released the pre-training codes for ImageNet-1K.

Introduction

The Fast ConvMAE framework is a significantly faster masked-modeling scheme built on ConvMAE, using complementary masking and a mixture of reconstructors.
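
To illustrate the complementary-masking idea, the sketch below (illustrative only, not the repo's actual implementation) splits the patch indices into four disjoint 25% subsets whose union covers every patch, so each image is reconstructed from four complementary visible sets in one pass:

import torch

def complementary_masks(num_patches: int, num_splits: int = 4) -> torch.Tensor:
    perm = torch.randperm(num_patches)                             # random patch order
    chunk = num_patches // num_splits
    masks = torch.ones(num_splits, num_patches, dtype=torch.bool)  # True = masked
    for i in range(num_splits):
        visible = perm[i * chunk:(i + 1) * chunk]
        masks[i, visible] = False                                  # keep 25% of patches per split
    return masks

masks = complementary_masks(196)     # 14 x 14 patches for a 224-px image with patch size 16
assert not masks.all(dim=0).any()    # every patch is visible in exactly one split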


Pretrain on ImageNet-1K

The following table provides pretrained checkpoints and logs used in the paper.

Model             | 50-epoch pretrained checkpoint | Logs
Fast ConvMAE-Base | N/A                            | N/A

Main Results on COCO & ImageNet-1K

Models   | Masking | Tokenizer | Backbone  | PT Epochs | PT Hours | COCO FT Epochs | $AP^{Box}$ | $AP^{Mask}$ | ImageNet Finetune Epochs | Finetune acc@1 (%) | ADE 20K mIoU
ConvMAE  | 25%     | RGB       | ConvViT-B | 200       | 512      | 25             | 50.8       | 45.4        | 100                      | 84.4               | 48.5
ConvMAE  | 25%     | RGB       | ConvViT-B | 1600      | 4000     | 25             | 53.2       | 47.1        | 100                      | 85.0               | 51.7
MAE      | 25%     | RGB       | ViT-B     | 1600      | 2069     | 100            | 50.3       | 44.9        | 100                      | 83.6               | 48.1
SimMIM   | 100%    | RGB       | Swin-B    | 800       | 1609     | 36             | 50.4       | 44.4        | 100                      | 84.0               | -
GreenMIM | 25%     | RGB       | Swin-B    | 800       | 887      | 36             | 50.0       | 44.1        | 100                      | 85.1               | -
ConvMAE  | 100%    | RGB       | ConvViT-B | 50        | 266      | 25             | 51.0       | 45.4        | 100                      | 84.4               | 48.3
ConvMAE  | 100%    | C+T       | ConvViT-B | 50        | 333      | 25             | 52.8       | 46.9        | 100                      | 85.0               | 52.7
ConvMAE  | 100%    | C+T       | ConvViT-B | 100       | 666      | 25             | 53.3       | 47.3        | 100                      | 85.2               | 52.8
ConvMAE  | 100%    | C+T       | ConvViT-L | 200       | N/A      | 25             | N/A        | N/A         | 50                       | 86.7               | 54.5

Visualizations

NOTE: Grey patches are masked and colored ones are kept.


Getting Started

Prerequisites

  • Linux
  • Python 3.7+
  • CUDA 10.2+
  • GCC 5+

Training and evaluation

See PRETRAIN.md for pretraining and FINETUNE.md for finetuning and evaluation.

Acknowledgement

The pretraining and finetuning of our project are based on DeiT, MAE, and ConvMAE. Thanks for their wonderful work.

License

FastConvMAE is released under the MIT License.
