Stars
Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step
Machine Learning Interviews from FAANG, Snapchat, LinkedIn. I have offers from Snapchat, Coupang, Stitchfix etc. Blog: mlengineer.io.
Retrieval and Retrieval-augmented LLMs
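🤖 Free Search with AI, 💡 Open-Source Perplexity, 📍 Supports Ollama/SearXNG and Docker deployment. Lets large AI models and search engines answer your questions, with support for local models (Ollama), the SearXNG metasearch engine, and one-click Docker deployment.
Introductory tutorial on recommender systems. Read online: https://datawhalechina.github.io/fun-rec/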
An AI-powered search engine with a generative UI
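Chinese named entity recognition (NER). Includes the latest Chinese NER papers, related tools, and datasets, as well as Chinese pretrained models, word embeddings, and NER surveys.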
Video+code lecture on building nanoGPT from scratch
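Hands-on TensorFlow exercises covering reinforcement learning, recommender systems, NLP, and more
A collection of industry practice articles on search, recommendation, advertising, user growth, and related topics (sources: Zhihu, Datafuntalk, technical WeChat official accounts)
Computational advertising, recommender systems, machine learning, and click-through rate (CTR) / conversion rate (CVR) prediction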
BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)
Multilingual/multidomain question generation datasets, models, and a Python library for question generation.
Generate question/answer training pairs out of raw text.
Collection of data science projects in Python
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
Llama 3 implementation, one matrix multiplication at a time
Curated list of data science interview questions and answers
Data science interview questions and answers
Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines
An LLM-powered advanced RAG pipeline built from scratch
A beautiful resume/cover letter LaTeX template pair that is extraordinarily easy to use.
A framework for large scale recommendation algorithms.
Start building LLM-empowered multi-agent applications in an easier way.
CTR prediction models based on Spark (LR, GBDT, DNN)