sim

Make sure the actual video files are stored in video_server/video[1-6], then run

python get_video_sizes.py

Put training data in sim/cooked_traces and testing data in sim/cooked_test_traces (you need to create these folders). Each line of a simulation trace has the format [time_stamp (sec), throughput (Mbit/sec)]. The sample training/testing data we used can be downloaded separately as train_sim_traces and test_sim_traces from https://www.dropbox.com/sh/ss0zs1lc4cklu3u/AAB-8WC3cHD4PTtYT0E4M19Ja?dl=0. More details on data preparation can be found in traces/.
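For reference, a trace file is just two whitespace-separated columns per line. A minimal loader sketch under that assumption (the helper name and example path below are ours, not part of the repo):

```python
# Sketch: read one cooked trace file with two whitespace-separated columns:
# time_stamp (sec) and throughput (Mbit/sec). The file name is only an example;
# any file placed in sim/cooked_traces/ is assumed to follow the same layout.
def load_trace(path):
    time_stamps, throughputs = [], []
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) < 2:
                continue  # skip empty or malformed lines
            time_stamps.append(float(parts[0]))   # seconds
            throughputs.append(float(parts[1]))   # Mbit/sec
    return time_stamps, throughputs

# Example usage (hypothetical file name):
# times, bw = load_trace('sim/cooked_traces/example_trace')
```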

To train a model, run

python multi_agent.py

As reported in the A3C paper (http://proceedings.mlr.press/v48/mniha16.pdf) and a faithful implementation (https://openreview.net/pdf?id=Hk3mPK5gg), we also found the exploration factor in the actor network to be crucial for good performance. A general strategy for training our system is to first set ENTROPY_WEIGHT in a3c.py to a large value (on the order of 1 to 5), then gradually reduce it to 0.1 (after at least 100,000 iterations).
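ENTROPY_WEIGHT is a constant in a3c.py, so in practice this decay is usually done by editing the constant and resuming training. The intended schedule can be sketched as a simple function; the function name, the linear decay shape, and the starting value of 5.0 below are illustrative assumptions, not code from the repo:

```python
# Illustrative entropy-weight schedule: start high to encourage exploration,
# then decay linearly to 0.1 over roughly the first 100,000 iterations.
def entropy_weight(iteration, start=5.0, end=0.1, decay_iters=100000):
    if iteration >= decay_iters:
        return end
    frac = iteration / float(decay_iters)
    return start + frac * (end - start)
```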

The training process can be monitored in sim/results/log_test (validation) and sim/results/log_central (training). TensorBoard (https://www.tensorflow.org/get_started/summaries_and_tensorboard) can also be used to visualize the training process; invoke it by running

python -m tensorflow.tensorboard --logdir=./results/

where the plots can be viewed at localhost:6006 in a browser.

The trained model will be saved in sim/results/. We provide a sample pretrained model that uses linear QoE as the reward signal; it can be loaded by setting NN_MODEL = './results/pretrain_linear_reward.ckpt' in multi_agent.py.
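For reference, this setting drives the standard TensorFlow 1.x checkpoint-restore pattern. A minimal sketch of that pattern is shown below; the network construction is elided, and this is an illustration rather than the exact code in multi_agent.py:

```python
import tensorflow as tf

NN_MODEL = './results/pretrain_linear_reward.ckpt'

with tf.Session() as sess:
    # ... build the actor/critic networks here (as multi_agent.py does) ...
    sess.run(tf.global_variables_initializer())
    saver = tf.train.Saver()  # saves/restores all variables in the graph
    if NN_MODEL is not None:
        saver.restore(sess, NN_MODEL)  # replace the random init with the pretrained weights
```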