Stars
[IJCAI'24] Beyond Alignment: Blind Video Face Restoration via Parsing-Guided Temporal-Coherent Transformer
[CVPR 2024 Highlight] Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models
AKShare is an elegant and simple financial data interface library for Python, built for human beings! An open-source financial data interface library.
A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
A generative speech model for daily dialogue.
Official implementation of Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image
🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022
A simple script for removing video watermarks using Lama Cleaner. Only tested in an NVIDIA Windows environment.
Code and Model for VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset
Image inpainting tool powered by SOTA AI models. Remove any unwanted objects, defects, or people from your pictures, or erase and replace (powered by Stable Diffusion) anything in your pictures.
ImageBind: One Embedding Space to Bind Them All
Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
An Open-source Toolkit for LLM Development
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
Retrieval and Retrieval-augmented LLMs
Use PEFT or full-parameter training to fine-tune 350+ LLMs or 90+ MLLMs. (Qwen2.5, GLM4v, Internlm2.5, Yi, Llama3.1, Llava-Video, Internvl2, MiniCPM-V-2.6, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)
Wrapper to use DynamiCrafter models in ComfyUI
[ECCV 2024, Oral] DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
Collection of AWESOME vision-language models for vision tasks
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models
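"A User's Guide to Open-Source Large Models": a tutorial for quickly deploying open-source large models in a Linux environment, tailored to users in China.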
Code for Machine Learning for Algorithmic Trading, 2nd edition.