gtarcoder

Starred repositories

Python 139 10 Updated Sep 23, 2024

An unnecessarily tiny implementation of GPT-2 in NumPy.

Python 3,190 409 Updated Apr 24, 2023

A curated list of awesome C/C++ performance optimization resources: talks, articles, books, libraries, tools, sites, blogs. Inspired by awesome.

CSS 2,360 254 Updated Sep 22, 2022

A guidance language for controlling large language models.

Jupyter Notebook 18,738 1,032 Updated Sep 18, 2024

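【A commonly used C++ DAG framework】 A general-purpose, cross-platform parallel computing framework based on flow graphs, with no third-party dependencies, listed in awesome-cpp. Stars, forks, and discussion are welcome.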

C++ 1,719 318 Updated Sep 17, 2024

Use PEFT or Full-parameter to finetune 350+ LLMs or 90+ MLLMs. (Qwen2.5, GLM4v, Internlm2.5, Yi, Llama3.1, Llava-Video, Internvl2, MiniCPM-V-2.6, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)

Python 3,513 306 Updated Sep 23, 2024

🦜🔗 Build context-aware reasoning applications

Jupyter Notebook 92,622 14,825 Updated Sep 23, 2024

Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)

Python 31,245 3,852 Updated Sep 19, 2024

Redis Python client

Python 12,555 2,506 Updated Sep 19, 2024

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 8,247 913 Updated Sep 18, 2024

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)

Python 2,064 202 Updated Sep 23, 2024

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 4,240 382 Updated Sep 23, 2024

A distributed Kafka Consumer in Python using Ray

Python 23 3 Updated Feb 7, 2023

This project aims to share the technical principles behind large models, along with hands-on experience.

HTML 9,281 909 Updated Sep 22, 2024

Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable.

Jupyter Notebook 1,524 93 Updated Feb 16, 2024

Samples for CUDA developers that demonstrate features in the CUDA Toolkit

C 6,119 1,774 Updated Jul 26, 2024

Hackable and optimized Transformers building blocks, supporting a composable construction.

Python 8,381 594 Updated Sep 20, 2024

💥 Fast State-of-the-Art Tokenizers optimized for Research and Production

Rust 8,903 770 Updated Sep 17, 2024

C++ model training and inference framework

C++ 223 36 Updated Dec 25, 2019

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 27,229 3,993 Updated Sep 23, 2024

A universal local knowledge base solution based on a vector database and GPT3.5

Python 3,619 318 Updated May 12, 2023

This is a text generation method that returns a generator, streaming out each token in real time during inference, based on Huggingface/Transformers.

Python 96 14 Updated Mar 11, 2024

Large Language Model Text Generation Inference

Python 8,810 1,028 Updated Sep 23, 2024

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Python 132,429 26,369 Updated Sep 23, 2024

Demo of live progress bar developed in SpringBoot and React.js using server-sent events.

Java 14 8 Updated Dec 5, 2021

An ASGI web server, for Python. 🦄

Python 8,385 724 Updated Sep 21, 2024

Basaran is an open-source alternative to the OpenAI text completion API. It provides a compatible streaming API for your Hugging Face Transformers-based text generation models.

Python 1,290 80 Updated Jan 24, 2024

Fast Inference Solutions for BLOOM

Python 557 112 Updated Aug 10, 2024

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.

Python 1,847 174 Updated Sep 11, 2024