Contextual Bandit with Active Learning

This is an implementation of the preference-based active learning algorithm for contextual bandits described in Contextual Bandits and Imitation Learning via Preference-Based Active Queries. The paper considers contextual bandit and imitation learning problems in which the learner cannot directly observe the reward of the executed action. Under the assumption that the learner has access to a function class that can represent the expert's preference model under an appropriate link function, the paper proposes an algorithm that uses an online regression oracle over this function class to choose actions and to decide when to query the expert.
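To make the idea concrete, here is a minimal sketch (not the repository's implementation) of a preference-based active-query loop: a linear "online regression oracle" with a logistic link predicts pairwise preferences, the learner plays the action it currently ranks highest, and it queries the expert only when the preference gap between the top two actions is small. All names, the margin rule, and the hyperparameters below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_actions, T = 4, 3, 200
w_star = rng.normal(size=d)   # hidden expert utility (simulated; assumption)
w_hat = np.zeros(d)           # oracle's running estimate of the preference model
lr, margin = 0.5, 0.05        # hypothetical hyperparameters
queries = 0

for t in range(T):
    X = rng.normal(size=(n_actions, d))   # context: one feature row per action
    scores = X @ w_hat
    a, b = np.argsort(scores)[-2:][::-1]  # current top two candidate actions
    gap = scores[a] - scores[b]
    if gap < margin:                      # uncertain: query the expert's preference
        queries += 1
        pref = 1.0 if (X[a] - X[b]) @ w_star > 0 else 0.0   # expert prefers a over b?
        z = X[a] - X[b]
        p = 1.0 / (1.0 + np.exp(-(z @ w_hat)))  # logistic link (assumption)
        w_hat += lr * (pref - p) * z            # online logistic-regression update
    # otherwise: exploit the current best action without spending a query
```

The point of the sketch is the query rule: queries are only spent where the oracle is uncertain, so the query count grows much more slowly than the number of rounds.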

Installation

git clone https://github.com/Cornell-RL/active_CB.git
cd active_CB
pip install numpy pandas torch ucimlrepo

Usage

python algo.py

This runs the preference learning algorithm on the Iris dataset. To run the reward learning algorithm or use a different dataset, pass the options below:

usage: algo.py [-h] [--dataset DATASET] [--query QUERY] [--model MODEL]

optional arguments:
  -h, --help         show this help message and exit
  --dataset DATASET  Name of the dataset (iris/car/knowledge)
  --query QUERY      Query type (active/passive)
  --model MODEL      Model type (reward/preference)
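The options above map onto a standard argparse setup. The sketch below is a hedged reconstruction, not the repository's actual parser; the defaults for `--dataset` and `--model` follow the documented default run (Iris, preference), while the `--query` default of `active` is an assumption.

```python
import argparse

def build_parser():
    parser = argparse.ArgumentParser(prog="algo.py")
    parser.add_argument("--dataset", default="iris",
                        choices=["iris", "car", "knowledge"],
                        help="Name of the dataset (iris/car/knowledge)")
    parser.add_argument("--query", default="active",
                        choices=["active", "passive"],
                        help="Query type (active/passive)")
    parser.add_argument("--model", default="preference",
                        choices=["reward", "preference"],
                        help="Model type (reward/preference)")
    return parser

# Example: run reward learning on the Car Evaluation dataset
args = build_parser().parse_args(["--dataset", "car", "--model", "reward"])
```

Using `choices` makes the parser reject unsupported dataset or model names with a clear error message instead of failing later in the training loop.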

Running 1000 training iterations on the Iris dataset takes roughly three hours including evaluation. Multi-class classification datasets with a large number of classes are expected to need more episodes to converge and therefore a longer runtime.

Results

Here are the results on the Iris, Car Evaluation, and User Knowledge Modeling datasets. The hyperparameters required by the algorithm are set in the training loop based on the dataset.
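A dispatch on dataset name like the following could express that selection; the names and values here are purely hypothetical placeholders for illustration, and the real settings live inside the training loop in algo.py.

```python
# Hypothetical per-dataset settings; the actual values are set in algo.py.
HYPERPARAMS = {
    "iris":      {"episodes": 1000, "query_margin": 0.05},
    "car":       {"episodes": 3000, "query_margin": 0.05},
    "knowledge": {"episodes": 2000, "query_margin": 0.05},
}

def get_hyperparams(dataset: str) -> dict:
    """Look up settings for a dataset, falling back to the Iris configuration."""
    return HYPERPARAMS.get(dataset, HYPERPARAMS["iris"])
```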

Iris

Car

User Knowledge

Citation

@misc{sekhari2023contextual,
      title={Contextual Bandits and Imitation Learning via Preference-Based Active Queries}, 
      author={Ayush Sekhari and Karthik Sridharan and Wen Sun and Runzhe Wu},
      year={2023},
      eprint={2307.12926},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
