Distributed training (multi-node) of a Transformer model

Python 37 15 Updated Apr 10, 2024

Attention is all you need implementation

Jupyter Notebook 557 237 Updated Jun 8, 2024

Material for gpu-mode lectures

Jupyter Notebook 2,528 254 Updated Sep 23, 2024

PyTorch native quantization and sparsity for training and inference

Python 801 98 Updated Sep 24, 2024

How to optimize some algorithms in CUDA.

Cuda 1,460 119 Updated Sep 24, 2024

LLM101n: Let's build a Storyteller

28,872 1,579 Updated Aug 1, 2024

📰 Must-read papers and blogs on Speculative Decoding ⚡️

367 14 Updated Sep 20, 2024

The official GitHub page for the survey paper "A Survey of Large Language Models".

Python 10,059 791 Updated Aug 20, 2024

C++ DataFrame for statistical, financial, and ML analysis, written in modern C++ using native types and contiguous memory storage

C++ 2,438 308 Updated Sep 24, 2024

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Jupyter Notebook 2,218 151 Updated Jun 25, 2024

Research and Materials on Hardware implementation of Transformer Model

Jupyter Notebook 199 28 Updated Aug 8, 2024

Paragraph-by-paragraph close readings of classic and new deep learning papers

26,386 2,402 Updated Aug 8, 2024

Python 1,390 168 Updated Nov 9, 2023

Compress your input to ChatGPT or other LLMs, to let them process 2x more content and save 40% memory and GPU time.

Python 1 Updated Nov 8, 2023

My C++ deep learning framework & other machine learning algorithms

C++ 78 26 Updated Jun 26, 2023

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡

Python 2,113 208 Updated Sep 17, 2024

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…

TypeScript 46,362 6,546 Updated Sep 24, 2024

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 34,852 4,054 Updated Sep 24, 2024

Making large AI models cheaper, faster and more accessible

Python 38,639 4,332 Updated Sep 24, 2024

Understanding Deep Learning - Simon J.D. Prince

Jupyter Notebook 6,114 1,283 Updated Sep 16, 2024

A good project for campus recruiting (fall and spring hiring) and internships. Implement a high-performance deep learning inference library from scratch, step by step, with support for inference of models such as llama2, Unet, Yolov5, and Resnet.

C++ 2,408 268 Updated Sep 24, 2024

Examples for using ONNX Runtime for machine learning inferencing.

C++ 1,152 327 Updated Sep 24, 2024

This is a list of interesting papers and projects about TinyML.

725 126 Updated Sep 20, 2024

Formula derivations and numpy implementations of machine learning algorithms

Jupyter Notebook 2,005 479 Updated May 2, 2023

My personal notes

1,578 417 Updated Feb 1, 2023

High-performance, scalable time-series database designed for Industrial IoT (IIoT) scenarios

C 23,242 4,836 Updated Sep 24, 2024

pybind11 documentation in Chinese (personal translation)

247 58 Updated Jul 24, 2023

Development repository for the Triton language and compiler

C++ 12,832 1,550 Updated Sep 24, 2024