Stars
Event-driven network library for multi-threaded Linux servers in C++11
Development repository for the Triton language and compiler
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.
Transformer-related optimizations, including BERT and GPT
[ARCHIVED] The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl
ELF: a platform for game research with AlphaGoZero/AlphaZero reimplementation
LightSeq: A High Performance Library for Sequence Processing and Generation
A retargetable MLIR-based machine learning compiler and runtime toolkit.
A fast and user-friendly runtime for transformer inference (BERT, ALBERT, GPT-2, decoders, etc.) on CPU and GPU.
The Tensor Algebra Compiler (taco) computes sparse tensor expressions on CPUs and GPUs
A flexible and efficient deep neural network (DNN) compiler that generates high-performance executables from a DNN model description.
A polyhedral compiler for expressing fast and portable data parallel algorithms
BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads.
C/C++ frontend for MLIR. Also features polyhedral optimizations, parallel optimizations, and more!
A library of GPU kernels for sparse matrix operations.
A demo of how to write a high-performance convolution that runs on Apple silicon