Starred repositories
On-device AI across mobile, embedded and edge for PyTorch
Collection of benchmarks to measure basic GPU capabilities
Debugging torch distributed programs
Tutel MoE: An Optimized Mixture-of-Experts Implementation
Examples demonstrating available options to program multiple GPUs in a single node or a cluster
The official PyTorch implementation of Google's Gemma models
Dynamic Memory Management for Serving LLMs without PagedAttention
The Hardware Sampling (hws) library can be used to track hardware performance like clock frequency, memory usage, temperatures, or power draw.
[Information Fusion 2024] A Survey on Occupancy Perception for Autonomous Driving: The Information Fusion Perspective
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
SparseDrive: End-to-End Autonomous Driving via Sparse Scene Representation
📰 Must-read papers and blogs on Speculative Decoding ⚡️
Use PyTorch Models with CasADi for data-driven optimization or learning-based optimal control. Supports Acados.
This was originally a collection of papers on neural network accelerators. Now it's more like my selection of research on deep learning and computer architecture.
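Provides a practical interaction interface for LLMs such as GPT/GLM, specially optimized for reading, polishing, and writing papers. Modular design with support for custom shortcut buttons & function plugins, project analysis & self-explanation for Python, C++, and other projects, PDF/LaTeX paper translation & summarization, parallel querying of multiple LLM models, and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, m…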
Odysseus: Playground of LLM Sequence Parallelism
A nanoGPT pipeline packed in a spreadsheet
A fast communication-overlapping library for tensor parallelism on GPUs.