Williams-Hao


Phi2-Chinese-0.2B: Train your own small Chinese Phi2 chat model from scratch, with support for plugging into langchain to load a local knowledge base for retrieval-augmented generation (RAG).

Jupyter Notebook 467 50 Updated Jul 11, 2024

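A small 0.2B Chinese chat model (ChatLM-Chinese-0.2B). Open-sources all code for the full pipeline: dataset sources, data cleaning, tokenizer training, model pretraining, SFT instruction fine-tuning, and RLHF optimization. Supports SFT fine-tuning on downstream tasks and includes a fine-tuning example for triple information extraction.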

Python 1,140 137 Updated Apr 20, 2024

Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 27,123 3,024 Updated Sep 20, 2024

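A repository for pretraining from scratch and then SFT-tuning a small-parameter Chinese LLaMa2; it runs on a single 24G GPU and yields a chat-llama2 with basic Chinese question-answering ability.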

Python 2,452 302 Updated May 21, 2024

Official implementation of MotionClone: Training-Free Motion Cloning for Controllable Video Generation

Python 364 27 Updated Aug 7, 2024

LLM101n: Let's build a Storyteller

28,527 1,558 Updated Aug 1, 2024

ERNIE Pytorch Version

Python 909 118 Updated Jul 26, 2023

Chinese text classification using Bert and ERNIE

Python 3,970 895 Updated Jun 28, 2024

Source code of K-BERT (AAAI2020)

Python 951 213 Updated Jan 27, 2023

Chinese LLaMA-2 & Alpaca-2 LLMs (phase 2 of the project), with 64K long-context models

Python 7,053 577 Updated Apr 30, 2024

A comprehensive library for implementing LLMs, including a unified training pipeline and comprehensive model evaluation.

Python 579 75 Updated Aug 29, 2024

YuLan: An Open-Source Large Language Model

Python 541 48 Updated Jul 2, 2024

Inference code for LLaMA models

Python 101 24 Updated Aug 13, 2023

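ChatGPT Chinese resource library: created to help Chinese-speaking developers and learners better understand and use OpenAI's ChatGPT technology. The repository is continuously updated with ChatGPT tutorials, tool introductions, and Chinese-language materials, including but not limited to tool usage tutorials, resources, papers, application examples, and the ChatGPT community.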

374 43 Updated Nov 8, 2023

🦜🔗 Build context-aware reasoning applications

Jupyter Notebook 92,507 14,807 Updated Sep 20, 2024

This project aims to share the technical principles behind large models, along with hands-on experience.

HTML 9,240 905 Updated Sep 11, 2024

Building a quick conversation-based search demo with Lepton AI.

TypeScript 7,724 980 Updated Sep 18, 2024

CIKM2023 Best Demo Paper Award. HugNLP is a unified and comprehensive NLP library based on HuggingFace Transformer. Please hugging for NLP now!😊

Python 375 45 Updated Oct 31, 2023

Large Scale Chinese Corpus for NLP

9,410 1,540 Updated May 23, 2024

Python 84 58 Updated Dec 17, 2020

Toutiao (今日头条) Chinese news text classification dataset (multi-level)

Python 388 122 Updated May 6, 2021

MiniRBT (a series of small Chinese pretrained models)

Python 244 16 Updated Apr 5, 2023

A simple, easy-to-use version of TinyBert: a pretrained language model built by knowledge distillation from Bert

Python 250 49 Updated Oct 24, 2020

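Collects, organizes, and publishes Chinese natural language processing corpora and datasets, to advance Chinese NLP together with like-minded contributors.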

Jupyter Notebook 5,813 1,395 Updated Jan 29, 2019

Turn Chinese natural language into structured data (Chinese natural language understanding)

Python 1,507 422 Updated Jul 30, 2024

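Chinese and English sensitive words, language detection, location and carrier lookup for Chinese and international mobile/landline numbers, gender inference from names, mobile number extraction, ID card number extraction, email extraction, Chinese and Japanese personal name databases, Chinese abbreviation database, character decomposition dictionary, word sentiment values, stop words, reactionary word list, violence and terrorism word list, traditional/simplified Chinese conversion, English imitating Chinese pronunciation, Wang Feng lyrics generator, job title lexicon, synonym lexicon, antonym lexicon, negation word lexicon, car brand lexicon, car parts lexicon, segmentation of continuous English text, various Chinese word vectors, comprehensive list of company names, classical poetry database, IT lexicon, finance lexicon, idiom lexicon, place name lexicon, …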

Python 67,787 14,413 Updated May 10, 2024

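Fengshenbang-LM (封神榜大模型) is an open-source large-model ecosystem led by the Cognitive Computing and Natural Language Research Center at the IDEA Research Institute, intended to serve as infrastructure for Chinese AIGC and cognitive intelligence.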

Python 4,000 374 Updated Aug 13, 2024

Hot corners for Windows 10 & 11

Pascal 608 42 Updated Jul 27, 2024

A curated list of awesome Torch tutorials, projects and communities

627 142 Updated Mar 22, 2018

PyTorch Tutorial for Deep Learning Researchers

Python 29,858 8,087 Updated Aug 15, 2023