Stars
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
Nexa SDK is a comprehensive toolkit for supporting ONNX and GGML models. It supports text generation, image generation, vision-language models (VLM), automatic speech recognition (ASR), and text-to-spee…
Code repository for the public reproduction of the language modelling experiments on "MatFormer: Nested Transformer for Elastic Inference"
Fast and memory-efficient exact attention
A simple yet powerful tool to turn traditional container/OS images into unprivileged sandboxes.
An official PyTorch implementation of "GNNCert: Deterministic Certification of Graph Neural Networks against Adversarial Perturbations" (ICLR 2024).
CS 61C at UC Berkeley with Stephan Kaminsky, Sean Farhat, Jenny Song - Summer 2020
shenzheyu / Megatron-LM
Forked from NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilizatio…
Code and Data for "Long-context LLMs Struggle with Long In-context Learning"
[ACL'24 Outstanding] Data and code for L-Eval, a comprehensive long context language models evaluation benchmark
A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
A Comprehensive Toolkit for High-Quality PDF Content Extraction
LlamaIndex is a data framework for your LLM applications
Automated Identification of Redundant Layer Blocks for Pruning in Large Language Models