
Commit

update
EricGuo5513 committed Dec 16, 2023
1 parent 108003b commit a1eca77
Showing 13 changed files with 274 additions and 313 deletions.
151 changes: 147 additions & 4 deletions README.md
@@ -12,7 +12,7 @@

<details>

### Conda Environment
### 1. Conda Environment
```
conda env create -f environment.yml
conda activate momask
@@ -21,7 +21,7 @@
pip install git+https://github.com/openai/CLIP.git
```
Our code is tested on Python 3.7.13 and PyTorch 1.7.1.


### Models and Dependencies
### 2. Models and Dependencies

#### Download Pre-trained Models
```
@@ -38,9 +38,152 @@
bash prepare/download_glove.sh
```
#### (Optional) Download Manually
Visit [[Google Drive]](https://drive.google.com/drive/folders/1b3GnAbERH8jAoO5mdWgZhyxHB73n23sK?usp=drive_link) to download the models and evaluators manually.

### Get Data
### 3. Get Data

You have two options here:
* **Skip getting data**, if you just want to generate motions using your *own* descriptions.
* **Get full data**, if you want to *re-train* and *evaluate* the model.

**(a). Full data (text + motion)**

**HumanML3D** - Follow the instructions in [HumanML3D](https://github.com/EricGuo5513/HumanML3D.git), then copy the resulting dataset into our repository:
```
cp -r ../HumanML3D/HumanML3D ./dataset/HumanML3D
```
**KIT** - Download from [HumanML3D](https://github.com/EricGuo5513/HumanML3D.git), then place the result in `./dataset/KIT-ML`.
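
As a quick sanity check that the HumanML3D copy landed in the right place, a small script like the one below can be used. This is only a sketch: the expected file and folder names follow the HumanML3D preprocessing output and may differ if that pipeline changes.
```
import os

# Expected layout of the processed HumanML3D data (assumption based on the
# HumanML3D preprocessing pipeline; adjust if your copy differs).
root = "./dataset/HumanML3D"
expected = ["new_joint_vecs", "new_joints", "texts",
            "Mean.npy", "Std.npy", "train.txt", "val.txt", "test.txt"]

for name in expected:
    path = os.path.join(root, name)
    status = "OK     " if os.path.exists(path) else "MISSING"
    print(f"{status} {path}")
```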


</details>

## :rocket: Demo
<details>

### (a) Generate from a single prompt
```
python gen_t2m.py --gpu_id 1 --ext exp1 --text_prompt "A person is sitting on a chair"
```
### (b) Generate from a prompt file
An example prompt file is given in `./assets/text_prompt.txt`. Each line should follow the format `<text description>#<motion length>`. The motion length is the number of poses; it must be an integer and will be rounded to a multiple of 4. In our work, motions run at 20 fps, so `#132`, for example, requests 132 poses, i.e. 6.6 seconds of motion.
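
For example, a prompt file with two prompts could look like this (both lines are taken from `./assets/text_prompt.txt`):
```
a man bends down and picks something up with his left hand.#84
a person jumps up and then lands.#52
```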

If you write `<text description>#NA`, our model will decide the motion length on its own. Note that once **one** prompt uses NA, all the other prompts will be treated as **NA** automatically.

```
python gen_t2m.py --gpu_id 1 --ext exp2 --text_path ./assets/text_prompt.txt
```


A few more parameters you may be interested in:
* `--repeat_times`: number of replications for generation.
* `--motion_length`: specify the number of poses to generate; only applicable in (a).

The output files are stored under the folder `./generation/<ext>/`. They are:
* `numpy files`: generated motions with shape (nframe, 22, 3), under the subfolder `./joints`.
* `video files`: stick-figure animations in mp4 format, under the subfolder `./animation`.
* `bvh files`: BVH files of the generated motions, under the subfolder `./animation`.

We also apply naive foot IK (inverse kinematics) to the generated motions; see the files with the suffix `_ik`. It sometimes works well, but it can also fail.
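
The joint files under `./joints` are plain NumPy arrays and can be inspected directly. Below is a minimal sketch, assuming a run with `--ext exp1`; the exact file name is hypothetical and will differ for your run.
```
import numpy as np

# Hypothetical path: substitute one of the .npy files produced by your run.
joints = np.load("./generation/exp1/joints/sample0_repeat0.npy")

print(joints.shape)         # (nframe, 22, 3): frames x 22 joints x xyz
root_xyz = joints[:, 0, :]  # pelvis/root trajectory (joint 0 in the HumanML3D skeleton)
print(root_xyz[:5])
```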

</details>

## :dancers: Visualization
<details>

All the animations are manually rendered in Blender. We use characters from [Mixamo](https://www.mixamo.com/#/). You need to download the characters in T-pose with a skeleton.

### Retargeting
For retargeting, we found that Rokoko usually produces large errors on the feet. In contrast, [keemap.rig.transfer](https://github.com/nkeeline/Keemap-Blender-Rig-ReTargeting-Addon/releases) retargets more precisely. You can watch the [tutorial](https://www.youtube.com/watch?v=EG-VCMkVpxg).

Follow these steps:
* Download keemap.rig.transfer from GitHub and install it in Blender.
* Import both the motion file (.bvh) and the character file (.fbx) into Blender.
* `Shift + Select` both the source and target skeletons. (They do not need to be in Rest Position.)
* Switch to `Pose Mode`, then unfold the `KeeMapRig` tool at the top-right corner of the view window.
* Load and read the bone mapping file `./assets/mapping.json` (or `mapping6.json` if it doesn't work). We made this file manually; it works for most Mixamo characters. You can also make your own.
* Adjust the `Number of Samples`, `Source Rig`, and `Destination Rig Name`.
* Click `Transfer Animation from Source Destination` and wait a few seconds.

We have not tried other retargeting tools. Feel free to comment if you find others more useful.

### Scene

We use this [scene](https://drive.google.com/file/d/1lg62nugD7RTAIz0Q_YP2iZsxpUzzOkT1/view?usp=sharing) for animation.


</details>

## :clapper: Temporal Inpainting
<details>

To be continued.
</details>

## :space_invader: Train Your Own Models
<details>


**Note**: You have to train RVQ **BEFORE** training masked/residual transformers. The latter two can be trained simultaneously.

### Train RVQ
```
python train_vq.py --name rvq_name --gpu_id 1 --dataset_name t2m --batch_size 512 --num_quantizers 6 --max_epoch 500 --quantize_drop_prob 0.2
```

### Train Masked Transformer
```
python train_t2m_transformer.py --name mtrans_name --gpu_id 2 --dataset_name t2m --batch_size 64 --vq_name rvq_name
```

### Train Residual Transformer
```
python train_res_transformer.py --name rtrans_name --gpu_id 2 --dataset_name t2m --batch_size 64 --vq_name rvq_name --cond_drop_prob 0.2 --share_weight
```

* `--dataset_name`: motion dataset, `t2m` for HumanML3D and `kit` for KIT-ML.
* `--name`: name of your model. This creates the model space `./checkpoints/<dataset_name>/<name>`.
* `--gpu_id`: GPU id.
* `--batch_size`: we use `512` for RVQ training. For the masked/residual transformers, we use `64` on HumanML3D and `16` on KIT-ML.
* `--num_quantizers`: number of quantization layers, `6` is used in our case.
* `--quantize_drop_prob`: quantization dropout ratio, `0.2` is used.
* `--vq_name`: when training the masked/residual transformer, specify the name of the RVQ model used for tokenization.
* `--cond_drop_prob`: condition drop ratio, for classifier-free guidance. `0.2` is used.
* `--share_weight`: whether to share the projection/embedding weights in residual transformer.

All the pre-trained models and intermediate results will be saved under `./checkpoints/<dataset_name>/<name>`.
</details>

## :book: Evaluation
<details>

### Evaluate RVQ Reconstruction:
HumanML3D:
```
python eval_t2m_vq.py --gpu_id 0 --name rvq_nq6_dc512_nc512_noshare_qdp0.2 --dataset_name t2m --ext rvq_nq6
```
KIT-ML:
```
python eval_t2m_vq.py --gpu_id 0 --name rvq_nq6_dc512_nc512_noshare_qdp0.2_k --dataset_name kit --ext rvq_nq6
```

### Evaluate Text2motion Generation:
HumanML3D:
```
python eval_t2m_trans_res.py --res_name tres_nlayer8_ld384_ff1024_rvq6ns_cdp0.2_sw --dataset_name t2m --name t2m_nlayer8_nhead6_ld384_ff1024_cdp0.1_rvq6ns --gpu_id 1 --cond_scale 4 --time_steps 10 --ext evaluation
```
KIT-ML:
```
python eval_t2m_trans_res.py --res_name tres_nlayer8_ld384_ff1024_rvq6ns_cdp0.2_sw_k --dataset_name kit --name t2m_nlayer8_nhead6_ld384_ff1024_cdp0.1_rvq6ns_k --gpu_id 0 --cond_scale 2 --time_steps 10 --ext evaluation
```

* `--res_name`: model name of `residual transformer`.
* `--name`: model name of `masked transformer`.
* `--cond_scale`: scale of classifier-free guidance.
* `--time_steps`: number of iterations for inference.
* `--ext`: filename for saving evaluation results.

The final evaluation results will be saved in `./checkpoints/<dataset_name>/<name>/eval/<ext>.log`

</details>

### To be continued.
## Acknowledgements
1 change: 1 addition & 0 deletions assets/mapping.json


1 change: 1 addition & 0 deletions assets/mapping6.json


12 changes: 12 additions & 0 deletions assets/text_prompt.txt
@@ -0,0 +1,12 @@
the person holds his left foot with his left hand, puts his right foot up and left hand up too.#132
a man bends down and picks something up with his left hand.#84
A man stands for few seconds and picks up his arms and shakes them.#176
A person walks with a limp, their left leg get injured.#192
a person jumps up and then lands.#52
a person performs a standing back kick.#52
A person pokes their right hand along the ground, like they might be planting seeds.#60
the person steps forward and uses the left leg to kick something forward.#92
the man walked forward, spun right on one foot and walked back to his original position.#92
the person was pushed but did not fall.#124
this person stumbles left and right while moving forward.#132
a person reaching down and picking something up.#148
Empty file added dataset/__init__.py
Empty file.
2 changes: 1 addition & 1 deletion environment.yml
@@ -46,7 +46,6 @@ dependencies:
- fontconfig=2.13.1=h6c09931_0
- freetype=2.11.0=h70c0345_0
- frozenlist=1.3.3=py37h5eee18b_0
- gdown=4.5.1=pyhd8ed1ab_0
- giflib=5.2.1=h7b6447c_0
- glib=2.69.1=h4ff587b_1
- gst-plugins-base=1.14.0=h8213a91_2
@@ -188,6 +187,7 @@ dependencies:
- cachetools==5.3.1
- einops==0.6.1
- ftfy==6.1.1
- gdown==4.7.1
- google-auth==2.22.0
- google-auth-oauthlib==0.4.6
- grpcio==1.57.0
2 changes: 1 addition & 1 deletion eval_t2m_vq.py
@@ -3,7 +3,7 @@
from os.path import join as pjoin

import torch
from models.vq.model import RVQVAE, HVQVAE
from models.vq.model import RVQVAE
from options.vq_option import arg_parse
from motion_loaders.dataset_motion_loader import get_dataset_motion_loader
import utils.eval_t2m as eval_t2m
