Stars
8 stars written in Jupyter Notebook
Google Research
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
Thorough deep learning paper reviews and code practice
TorchXRayVision: A library of chest X-ray datasets and models. Classifiers, segmentation, and autoencoders.
[NeurIPS 2021] You Only Look at One Sequence
VQVAEs, GumbelSoftmaxes and friends
DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models (ICCV 2023)
Baseline model for multimodal classification based on images and text. The text representation is obtained from a pretrained BERT base model, and the image representation from a pretrained VGG16 model.