Add SiglipForImageClassification and CLIPForImageClassification #28952

NielsRogge · 2024-02-10T11:12:42Z

What does this PR do?

This PR adds 2 new classes to the library, SiglipForImageClassification and CLIPForImageClassification.

This makes it easier to fine-tune SigLIP/CLIP for image classification. Previously, people had to manually write a class that puts a classification head on top of SiglipVisionModel/CLIPVisionModel. Given that SigLIP and CLIP are among the best vision encoders out there, it makes sense to add them.

HuggingFaceDocBuilderDev · 2024-02-10T11:31:31Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

amyeroberts

Looks great - thanks for adding!

NielsRogge · 2024-02-13T16:56:05Z

@amyeroberts I assume the "build PR documentation" check is failing but unrelated to this PR, so I can merge?

amyeroberts · 2024-02-13T18:25:19Z

@NielsRogge Yes, if you look on main you can see that the build is failing for lots of (unrelated) PRs at the moment. I'm happy for you to merge as all the other CIs are passing.

zhjohnchan · 2024-02-14T20:03:31Z

Hi @NielsRogge and @amyeroberts,

Thanks for your contribution to the community @NielsRogge! I am wondering if "do_rescale": true is necessary for SiglipImageProcessor? Let me know if I missed anything!

Also, did you encounter the same problem as in #28968?

Thanks!

Best,

amyeroberts · 2024-02-14T20:10:20Z

@zhjohnchan do_rescale=True is necessary if the input images are in the range [0, 255]. If they've already been rescaled to [0, 1] you can set it to False.
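
The rule above can be illustrated with a small sketch in plain Python. `maybe_rescale` is a hypothetical helper written for this example, not the library's code, and the 1/255 factor is assumed as the constant that maps [0, 255] to [0, 1].

```python
def maybe_rescale(pixels, do_rescale=True, rescale_factor=1 / 255):
    """Hypothetical helper: multiply pixel values by rescale_factor if do_rescale is set."""
    if not do_rescale:
        # Input is assumed to be in [0, 1] already, so leave it untouched.
        return list(pixels)
    return [value * rescale_factor for value in pixels]


# Raw 8-bit pixel values in [0, 255] need rescaling.
raw_pixels = [0, 128, 255]
scaled = maybe_rescale(raw_pixels)

# Values that are already in [0, 1] should be passed with do_rescale=False,
# otherwise they would be shrunk a second time.
already_scaled = maybe_rescale(scaled, do_rescale=False)
```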

zhjohnchan · 2024-02-14T21:38:12Z

@amyeroberts Thank you so much! Just realized it should be rescaled for all Transformers' preprocessors.

NielsRogge · 2024-02-15T07:53:32Z

@zhjohnchan I haven't encountered the same problem. I made a fine-tuning notebook for SigLIP here: https://github.com/NielsRogge/Transformers-Tutorials/blob/master/SigLIP/Fine_tuning_SigLIP_and_friends_for_multi_label_image_classification.ipynb

zhjohnchan · 2024-02-15T08:28:05Z

@NielsRogge Got it! Thank you so much!

Andron00e · 2024-02-23T08:50:04Z

Hi, I am the one who proposed adding CLIPForImageClassification. Now I am back to contribute to it.
I see a "Build documentation" failure. How can it be fixed?

NielsRogge · 2024-02-23T17:31:29Z

Hi @Andron00e, thanks for suggesting it. It's now available here: https://huggingface.co/docs/transformers/en/model_doc/siglip#transformers.SiglipForImageClassification

Andron00e · 2024-02-23T18:07:53Z

Thank you! I'll try to fix minor bugs with CLIPForImageClassification and place its code into SigLIP's directory.

NielsRogge added 2 commits February 10, 2024 11:59

First draft

41fab94

Add CLIPForImageClassification

12d4101

Remove scripts

3a0663a

NielsRogge requested a review from amyeroberts February 12, 2024 09:02

amyeroberts approved these changes Feb 12, 2024

Fix doctests

5b1ef31

NielsRogge merged commit 63ffd56 into huggingface:main Feb 14, 2024
20 of 21 checks passed

sbucaille pushed a commit to sbucaille/transformers that referenced this pull request Feb 14, 2024

Add SiglipForImageClassification and CLIPForImageClassification (huggingface#28952)

aa4944c

* First draft * Add CLIPForImageClassification * Remove scripts * Fix doctests

NielsRogge mentioned this pull request Feb 19, 2024

CLIPForImageClassification #27802

Closed

2 tasks

Xmaster6y mentioned this pull request Apr 21, 2024

Correct CLIPForImageClassification #30373

Closed

5 tasks

itazap pushed a commit that referenced this pull request May 14, 2024

Add SiglipForImageClassification and CLIPForImageClassification (#28952)

8fbcf3b

* First draft * Add CLIPForImageClassification * Remove scripts * Fix doctests
