lm-human-preference-details

This repo aims to make a blog post similar to The 37 Implementation Details of Proximal Policy Optimization but for RLHF techniques used in https://github.com/openai/lm-human-preferences.

The goal of the repo is 1) to provide a simple-to-read and minimal reference implementation of RLHF and 2) to create rigorous benchmarks and to match the learning curves of openai/lm-human-preferences.

This repo is just for educational / learning purposes. For more advanced users, https://github.com/lvwerra/trl would be a great choice.

Get started

poetry install
poetry shell
accelerate launch \
    --num_processes 8 \
    lm_human_preference_details/train_both_accelerate.py \
    --reward.track --policy.track
accelerate launch \
    --num_processes 8 \
    lm_human_preference_details/train_both_accelerate.py \
    --reward.track \
    --reward.label_dataset=descriptiveness/offline_5k.json \
    --policy.track

You can also run stuff individually. For example, to train the reward model, run

accelerate launch \
    --num_processes 8 \
    lm_human_preference_details/train_reward_accelerate.py \
    --track

to train the policy model, run

accelerate launch \
    --num_processes 8 \
    lm_human_preference_details/train_policy_accelerate.py \
    --track

⚠️ NOTE: You can install the latest torch or jax with the following command:

poetry run pip install torch==2.0.1
poetry run pip install "jax[cuda11_cudnn82]==0.4.8" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
poetry run pip install git+https://github.com/huggingface/huggingface_hub@main

Current status

Currently, for reproduction, I used the same dataset, data processing pipeline, hyperparameters, and initial model architecture and weights (gpt2 124M pretrained model). The following chart shows our reproduction can match the learning curves of openai/lm-human-preferences.

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
lm_human_preference_details		lm_human_preference_details
static		static
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
deepspeed.yaml		deepspeed.yaml
poetry.lock		poetry.lock
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

lm-human-preference-details

Get started

Current status

About

Releases

Packages

Languages

License

streettraffic/lm-human-preference-details-temp

Folders and files

Latest commit

History

Repository files navigation

lm-human-preference-details

Get started

Current status

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages