  • University of Chinese Academy of Sciences

Collection of AWESOME vision-language models for vision tasks

2,275 203 Updated Aug 29, 2024

Multimodal (MM) + Chat collection

Python 192 15 Updated Sep 25, 2024

iBOT 🤖: Image BERT Pre-Training with Online Tokenizer (ICLR 2022)

Jupyter Notebook 670 77 Updated Apr 14, 2022

OLMoE: Open Mixture-of-Experts Language Models

Jupyter Notebook 395 30 Updated Sep 17, 2024

Code for our paper "Eventful Transformers: Leveraging Temporal Redundancy in Vision Transformers"

Python 34 2 Updated Oct 4, 2023

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 11,134 947 Updated Sep 30, 2024

[ECCV 2022] Code for paper "DaViT: Dual Attention Vision Transformer"

Python 323 33 Updated Feb 13, 2024

PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538

Python 953 98 Updated Apr 19, 2024

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python 1,702 112 Updated Sep 19, 2024

4M: Massively Multimodal Masked Modeling

Python 1,566 90 Updated Jul 17, 2024

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch

Python 19,814 2,978 Updated Aug 28, 2024
Python 281 7 Updated Jan 27, 2024

PyTorch code and models for the DINOv2 self-supervised learning method.

Jupyter Notebook 8,887 781 Updated Aug 7, 2024

ICLR2024 Spotlight: curation/training code, metadata, distribution and pre-trained models for MetaCLIP; CVPR 2024: MoDE: CLIP Data Experts via Clustering

Python 1,180 50 Updated Sep 25, 2024

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…

Python 31,655 4,711 Updated Sep 30, 2024

PyTorch implementation of MAE https://arxiv.org/abs/2111.06377

Python 7,201 1,203 Updated Jul 23, 2024

PyTorch implementation of multi-task learning architectures, incl. MTI-Net (ECCV2020).

Python 761 113 Updated Jan 13, 2022

Implementation of Soft MoE, proposed by Brain's Vision team, in PyTorch

Python 236 8 Updated Apr 24, 2024

llama3 implementation one matrix multiplication at a time

Jupyter Notebook 13,207 1,058 Updated May 23, 2024

A PyTorch implementation of Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models

Python 614 47 Updated Sep 13, 2023
Jupyter Notebook 565 51 Updated Sep 17, 2024

A fast MoE impl for PyTorch

Python 1,535 186 Updated Jul 5, 2024

Official repository for "AM-RADIO: Reduce All Domains Into One"

Python 629 23 Updated Sep 24, 2024
Python 126 9 Updated May 6, 2024

🌍 Discover our global repository of countries, states, and cities! 🏙️ Get comprehensive data in JSON, SQL, PSQL, XML, YAML, and CSV formats. Access ISO2, ISO3 codes, country code, capital, native l…

PHP 7,270 2,534 Updated Sep 11, 2024

[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization

Python 543 57 Updated Jun 7, 2024

[CVPR2024] The code for "Osprey: Pixel Understanding with Visual Instruction Tuning"

Python 753 43 Updated Aug 5, 2024

The official Meta Llama 3 GitHub site

Python 26,400 2,984 Updated Aug 12, 2024

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance

Python 5,635 439 Updated Sep 19, 2024

(ECCV 2024) Code for V-IRL: Grounding Virtual Intelligence in Real Life

Python 308 11 Updated Jul 10, 2024