(enh) Dockerfile.j2: improve env vars for bash and activate in .bashrc (
tobitege committed Sep 17, 2024
1 parent 29b0e62 commit 52c5abc
Showing 13 changed files with 75 additions and 55 deletions.
@@ -59,10 +59,6 @@ Congratulations!

## Technical explanation

The relevant code is defined in [ssh_box.py](https://github.com/All-Hands-AI/OpenHands/blob/main/openhands/runtime/docker/ssh_box.py) and [image_agnostic_util.py](https://github.com/All-Hands-AI/OpenHands/blob/main/openhands/runtime/docker/image_agnostic_util.py).

In particular, ssh_box.py checks the config object for ```config.sandbox.base_container_image``` and then tries to retrieve the image using [get_od_sandbox_image](https://github.com/All-Hands-AI/OpenHands/blob/main/openhands/runtime/docker/image_agnostic_util.py#L72), which is defined in image_agnostic_util.py.

When a custom image is used for the first time, it will not be found and will therefore be built (on later runs, the built image will be found and returned).

The custom image is built with [_build_sandbox_image()](https://github.com/All-Hands-AI/OpenHands/blob/main/openhands/runtime/docker/image_agnostic_util.py#L29), which creates a Dockerfile using your custom image as the base and then configures the environment for OpenHands, like this:
26 changes: 13 additions & 13 deletions docs/modules/usage/architecture/runtime.md
@@ -47,8 +47,8 @@ graph TD
```

1. User Input: The user provides a custom base Docker image
-2. Image Building: OpenHands builds a new Docker image (the "OD runtime image") based on the user-provided image. This new image includes OpenHands-specific code, primarily the "runtime client"
-3. Container Launch: When OpenHands starts, it launches a Docker container using the OD runtime image
+2. Image Building: OpenHands builds a new Docker image (the "OH runtime image") based on the user-provided image. This new image includes OpenHands-specific code, primarily the "runtime client"
+3. Container Launch: When OpenHands starts, it launches a Docker container using the OH runtime image
4. Client Initialization: The runtime client initializes inside the container, setting up necessary components like a bash shell and loading any specified plugins
5. Communication: The OpenHands backend (`runtime.py`) communicates with the runtime client over RESTful API, sending actions and receiving observations
6. Action Execution: The runtime client receives actions from the backend, executes them in the sandboxed environment, and sends back observations
@@ -62,7 +62,7 @@ The role of the client:
- It formats and returns observations to the backend, ensuring a consistent interface for processing results


-## How OpenHands builds and maintains OD Runtime images
+## How OpenHands builds and maintains OH Runtime images

OpenHands' approach to building and managing runtime images ensures efficiency, consistency, and flexibility in creating and maintaining Docker images for both production and development environments.

@@ -80,9 +80,9 @@ OpenHands uses a dual-tagging system for its runtime images to balance reproduci
- This ensures reproducibility; the same hash always means the same image contents

2. Generic tag: `{target_image_repo}:{target_image_tag}`.
-Example: `runtime:od_v0.8.3_ubuntu_tag_22.04`
+Example: `runtime:oh_v0.9.3_ubuntu_tag_22.04`

-- This tag follows the format: `runtime:od_v{OD_VERSION}_{BASE_IMAGE_NAME}_tag_{BASE_IMAGE_TAG}`
+- This tag follows the format: `runtime:oh_v{OH_VERSION}_{BASE_IMAGE_NAME}_tag_{BASE_IMAGE_TAG}`
- It represents the latest build for a particular base image and OpenHands version combination
- This tag is updated whenever a new image is built from the same base image, even if the source code changes
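
As a rough illustration of the generic tag format above, the following Python sketch derives the tag from a base image name. The function name `generic_runtime_tag` and the default repository `runtime` are assumptions made for this example, not OpenHands APIs, and image names containing a registry host or slashes are not handled.

```python
def generic_runtime_tag(base_image: str, oh_version: str, repo: str = "runtime") -> str:
    """Hypothetical helper: map 'ubuntu:22.04' to the documented generic tag format."""
    # Split 'name:tag'; fall back to 'latest' when no tag is given (an assumption).
    name, _, tag = base_image.partition(":")
    tag = tag or "latest"
    # Documented format: runtime:oh_v{OH_VERSION}_{BASE_IMAGE_NAME}_tag_{BASE_IMAGE_TAG}
    return f"{repo}:oh_v{oh_version}_{name}_tag_{tag}"


print(generic_runtime_tag("ubuntu:22.04", "0.9.3"))
# prints: runtime:oh_v0.9.3_ubuntu_tag_22.04
```

Because the tag depends only on the base image and the OpenHands version, two builds with different source code map to the same generic tag, which is why the hash-based tag is needed for reproducibility.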

@@ -94,11 +94,11 @@ The hash-based tag ensures reproducibility, while the generic tag provides a sta
- Hash-based tag: `{target_image_repo}:{target_image_hash_tag}`.
Example: `runtime:abc123def456`
- Generic tag: `{target_image_repo}:{target_image_tag}`.
-Example: `runtime:od_v0.8.3_ubuntu_tag_22.04`
+Example: `runtime:oh_v0.9.3_ubuntu_tag_22.04`

2. Build Process:
-- a. Convert the base image name to an OD runtime image name
-Example: `ubuntu:22.04` -> `runtime:od_v0.8.3_ubuntu_tag_22.04`
+- a. Convert the base image name to an OH runtime image name
+Example: `ubuntu:22.04` -> `runtime:oh_v0.9.3_ubuntu_tag_22.04`
- b. Generate a build context (Dockerfile and OpenHands source code) and calculate its hash
- c. Check for an existing image with the calculated hash
- d. If not found, check for a recent compatible image to use as a base
@@ -108,7 +108,7 @@ The hash-based tag ensures reproducibility, while the generic tag provides a sta
3. Image Reuse and Rebuilding Logic:
The system follows these steps to determine whether to build a new image or use an existing one from a user-provided (base) image (e.g., `ubuntu:22.04`):
- a. If an image exists with the same hash (e.g., `runtime:abc123def456`), it will be reused as is
-- b. If the exact hash is not found, the system will try to rebuild using the latest generic image (e.g., `runtime:od_v0.8.3_ubuntu_tag_22.04`) as a base. This saves time by leveraging existing dependencies
+- b. If the exact hash is not found, the system will try to rebuild using the latest generic image (e.g., `runtime:oh_v0.9.3_ubuntu_tag_22.04`) as a base. This saves time by leveraging existing dependencies
- c. If neither the hash-tagged nor the generic-tagged image is found, the system will build the image completely from scratch

4. Caching and Efficiency:
@@ -121,10 +121,10 @@ Here's a flowchart illustrating the build process:
```mermaid
flowchart TD
A[Start] --> B{Convert base image name}
-B --> |ubuntu:22.04 -> runtime:od_v0.8.3_ubuntu_tag_22.04| C[Generate build context and hash]
+B --> |ubuntu:22.04 -> runtime:oh_v0.9.3_ubuntu_tag_22.04| C[Generate build context and hash]
C --> D{Check for existing image with hash}
D -->|Found runtime:abc123def456| E[Use existing image]
-D -->|Not found| F{Check for runtime:od_v0.8.3_ubuntu_tag_22.04}
+D -->|Not found| F{Check for runtime:oh_v0.9.3_ubuntu_tag_22.04}
F -->|Found| G[Rebuild based on recent image]
F -->|Not found| H[Build from scratch]
G --> I[Tag with hash and generic tags]
@@ -137,13 +137,13 @@ This approach ensures that:

1. Identical source code and Dockerfile always produce the same image (via hash-based tags)
2. The system can quickly rebuild images when minor changes occur (by leveraging recent compatible images)
-3. The generic tag (e.g., `runtime:od_v0.8.3_ubuntu_tag_22.04`) always points to the latest build for a particular base image and OpenHands version combination
+3. The generic tag (e.g., `runtime:oh_v0.9.3_ubuntu_tag_22.04`) always points to the latest build for a particular base image and OpenHands version combination
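
The reuse-or-rebuild decision described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: `hash_build_context` and `choose_build_strategy` are hypothetical names, the SHA-256 scheme is a stand-in (the docs do not say how the hash is computed), and image lookup is abstracted as a callable instead of a real Docker query.

```python
import hashlib
from typing import Callable, Mapping


def hash_build_context(files: Mapping[str, bytes]) -> str:
    """Hash file names and contents in a stable order (illustrative scheme only)."""
    digest = hashlib.sha256()
    for path in sorted(files):
        digest.update(path.encode("utf-8"))
        digest.update(b"\0")  # separator between a file name and its contents
        digest.update(files[path])
    return digest.hexdigest()[:16]


def choose_build_strategy(
    hash_name: str, generic_name: str, image_exists: Callable[[str], bool]
) -> str:
    """Mirror the three cases: reuse, rebuild on the generic image, or build from scratch."""
    if image_exists(hash_name):
        return "reuse"
    if image_exists(generic_name):
        return "rebuild_from_generic"
    return "build_from_scratch"


# Example: only the generic image is present locally.
context = {"Dockerfile": b"FROM ubuntu:22.04\n", "code/main.py": b"print('hi')\n"}
hash_name = f"runtime:{hash_build_context(context)}"
local_images = {"runtime:oh_v0.9.3_ubuntu_tag_22.04"}
print(choose_build_strategy(hash_name, "runtime:oh_v0.9.3_ubuntu_tag_22.04", local_images.__contains__))
# prints: rebuild_from_generic
```

Sorting the paths before hashing keeps the digest independent of dictionary order, so an unchanged build context always yields the same hash-based name.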

## Runtime Plugin System

The OpenHands Runtime supports a plugin system that allows for extending functionality and customizing the runtime environment. Plugins are initialized when the runtime client starts up.

-Check [an example of Jupyter plugin here](https://github.com/All-Hands-AI/OpenHands/blob/9c44d94cef32e6426ebd8deeeb52963153b2348a/openhands/runtime/plugins/jupyter/__init__.py#L30-L63) if you want to implement your own plugin.
+Check [an example of Jupyter plugin here](https://github.com/All-Hands-AI/OpenHands/blob/ecf4aed28b0cf7c18d4d8ff554883ba182fc6bdd/openhands/runtime/plugins/jupyter/__init__.py#L21-L55) if you want to implement your own plugin.

*More details about the Plugin system are still under construction - contributions are welcomed!*

2 changes: 1 addition & 1 deletion evaluation/logic_reasoning/run_infer.py
@@ -52,7 +52,7 @@ def get_config(
base_container_image='xingyaoww/od-eval-logic-reasoning:v1.0',
enable_auto_lint=True,
use_host_network=False,
-runtime_extra_deps='$OD_INTERPRETER_PATH -m pip install scitools-pyke',
+runtime_extra_deps='$OH_INTERPRETER_PATH -m pip install scitools-pyke',
),
# do not mount workspace
workspace_base=None,
2 changes: 1 addition & 1 deletion evaluation/mint/run_infer.py
@@ -105,7 +105,7 @@ def get_config(
base_container_image='xingyaoww/od-eval-mint:v1.0',
enable_auto_lint=True,
use_host_network=False,
-runtime_extra_deps=f'$OD_INTERPRETER_PATH -m pip install {" ".join(MINT_DEPENDENCIES)}',
+runtime_extra_deps=f'$OH_INTERPRETER_PATH -m pip install {" ".join(MINT_DEPENDENCIES)}',
),
# do not mount workspace
workspace_base=None,
@@ -4,14 +4,14 @@
import pandas as pd

parser = argparse.ArgumentParser()
-parser.add_argument('od_output_file', type=str)
+parser.add_argument('oh_output_file', type=str)
args = parser.parse_args()
-output_filepath = args.od_output_file.replace('.jsonl', '.swebench.jsonl')
-print(f'Converting {args.od_output_file} to {output_filepath}')
+output_filepath = args.oh_output_file.replace('.jsonl', '.swebench.jsonl')
+print(f'Converting {args.oh_output_file} to {output_filepath}')

-od_format = pd.read_json(args.od_output_file, orient='records', lines=True)
-# model name is the folder name of od_output_file
-model_name = os.path.basename(os.path.dirname(args.od_output_file))
+oh_format = pd.read_json(args.oh_output_file, orient='records', lines=True)
+# model name is the folder name of oh_output_file
+model_name = os.path.basename(os.path.dirname(args.oh_output_file))


def process_git_patch(patch):
@@ -59,5 +59,5 @@ def convert_row_to_swebench_format(row):
}


-swebench_format = od_format.apply(convert_row_to_swebench_format, axis=1)
+swebench_format = oh_format.apply(convert_row_to_swebench_format, axis=1)
swebench_format.to_json(output_filepath, lines=True, orient='records')
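
For orientation, `eval_infer.sh` describes the SWE-bench format as a JSONL file in which every line has three fields: `model_name_or_path`, `instance_id`, and `model_patch`. A minimal Python sketch of that check, together with the output-path convention used by the script above, might look like this. The helper names are hypothetical, and the check only tests that the three keys are present.

```python
import json

SWEBENCH_FIELDS = {"model_name_or_path", "instance_id", "model_patch"}


def is_swebench_format(jsonl_text: str) -> bool:
    """Return True if every non-empty line is a JSON object with the three fields."""
    records = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
    return bool(records) and all(
        isinstance(rec, dict) and SWEBENCH_FIELDS.issubset(rec) for rec in records
    )


def swebench_output_path(oh_output_file: str) -> str:
    """Same naming rule as the conversion script: '.jsonl' becomes '.swebench.jsonl'."""
    return oh_output_file.replace(".jsonl", ".swebench.jsonl")


sample = '{"model_name_or_path": "m", "instance_id": "i-1", "model_patch": "diff"}\n'
print(is_swebench_format(sample))
# prints: True
print(swebench_output_path("outputs/run/output.jsonl"))
# prints: outputs/run/output.swebench.jsonl
```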
8 changes: 4 additions & 4 deletions evaluation/swe_bench/scripts/eval_infer.sh
@@ -28,9 +28,9 @@ FILE_NAME=$(basename $PROCESS_FILEPATH)
echo "Evaluating $FILE_NAME @ $FILE_DIR"

# ================================================
-# detect whether PROCESS_FILEPATH is in OD format or in SWE-bench format
+# detect whether PROCESS_FILEPATH is in OH format or in SWE-bench format
echo "=============================================================="
-echo "Detecting whether PROCESS_FILEPATH is in OD format or in SWE-bench format"
+echo "Detecting whether PROCESS_FILEPATH is in OH format or in SWE-bench format"
echo "=============================================================="
# SWE-bench format is a JSONL where every line has three fields: model_name_or_path, instance_id, and model_patch
function is_swebench_format() {
@@ -56,9 +56,9 @@ if [ $IS_SWEBENCH_FORMAT -eq 0 ]; then
else
echo "The file IS NOT in SWE-bench format."

-# ==== Convert OD format to SWE-bench format ====
+# ==== Convert OH format to SWE-bench format ====
echo "Merged output file with fine-grained report will be saved to $FILE_DIR"
-poetry run python3 evaluation/swe_bench/scripts/eval/convert_od_output_to_swe_json.py $PROCESS_FILEPATH
+poetry run python3 evaluation/swe_bench/scripts/eval/convert_oh_output_to_swe_json.py $PROCESS_FILEPATH
# replace .jsonl with .swebench.jsonl in filename
SWEBENCH_FORMAT_JSONL=${PROCESS_FILEPATH/.jsonl/.swebench.jsonl}
echo "SWEBENCH_FORMAT_JSONL: $SWEBENCH_FORMAT_JSONL"
8 changes: 4 additions & 4 deletions evaluation/swe_bench/scripts/setup/compare_patch_filename.py
@@ -19,10 +19,10 @@ def extract_modified_files(patch):
return modified_files


-def process_report(od_output_file):
+def process_report(oh_output_file):
succ = 0
fail = 0
-for line in open(od_output_file):
+for line in open(oh_output_file):
line = json.loads(line)
instance_id = line['instance_id']
gold_patch = line['swe_instance']['patch']
@@ -48,7 +48,7 @@ def process_report(od_output_file):

if __name__ == '__main__':
parser = argparse.ArgumentParser()
-parser.add_argument('--od_output_file', help='Path to the OD output file')
+parser.add_argument('--oh_output_file', help='Path to the OH output file')
args = parser.parse_args()

-process_report(args.od_output_file)
+process_report(args.oh_output_file)
10 changes: 5 additions & 5 deletions evaluation/swe_bench/scripts/setup/prepare_swe_utils.sh
@@ -6,15 +6,15 @@ mkdir -p $EVAL_WORKSPACE

# 1. Prepare REPO
echo "==== Prepare SWE-bench repo ===="
-OD_SWE_BENCH_REPO_PATH="https://github.com/All-Hands-AI/OD-SWE-bench.git"
-OD_SWE_BENCH_REPO_BRANCH="eval"
-git clone -b $OD_SWE_BENCH_REPO_BRANCH $OD_SWE_BENCH_REPO_PATH $EVAL_WORKSPACE/OD-SWE-bench
+OH_SWE_BENCH_REPO_PATH="https://github.com/All-Hands-AI/SWE-bench.git"
+OH_SWE_BENCH_REPO_BRANCH="eval"
+git clone -b $OH_SWE_BENCH_REPO_BRANCH $OH_SWE_BENCH_REPO_PATH $EVAL_WORKSPACE/OH-SWE-bench

# 2. Prepare DATA
echo "==== Prepare SWE-bench data ===="
EVAL_IMAGE=ghcr.io/all-hands-ai/eval-swe-bench:builder_with_conda
EVAL_WORKSPACE=$(realpath $EVAL_WORKSPACE)
-chmod +x $EVAL_WORKSPACE/OD-SWE-bench/swebench/harness/prepare_data.sh
+chmod +x $EVAL_WORKSPACE/OH-SWE-bench/swebench/harness/prepare_data.sh
if [ -d $EVAL_WORKSPACE/eval_data ]; then
rm -r $EVAL_WORKSPACE/eval_data
fi
@@ -24,4 +24,4 @@ docker run \
-u $(id -u):$(id -g) \
-e HF_DATASETS_CACHE="/tmp" \
--rm -it $EVAL_IMAGE \
-bash -c "cd OD-SWE-bench/swebench/harness && /swe_util/miniforge3/bin/conda run -n swe-bench-eval ./prepare_data.sh && mv eval_data /workspace/"
+bash -c "cd OH-SWE-bench/swebench/harness && /swe_util/miniforge3/bin/conda run -n swe-bench-eval ./prepare_data.sh && mv eval_data /workspace/"
2 changes: 1 addition & 1 deletion evaluation/swe_bench/scripts/setup/swe_entry.sh
@@ -60,7 +60,7 @@ conda activate swe-bench-eval

mkdir -p $SWE_TASK_DIR/reset_testbed_temp
mkdir -p $SWE_TASK_DIR/reset_testbed_log_dir
-SWE_BENCH_DIR=/swe_util/OD-SWE-bench
+SWE_BENCH_DIR=/swe_util/OH-SWE-bench
output=$(
export PYTHONPATH=$SWE_BENCH_DIR && \
cd $SWE_BENCH_DIR && \
4 changes: 2 additions & 2 deletions openhands/core/config.py
@@ -194,8 +194,8 @@ class SandboxConfig:
runtime_extra_deps: The extra dependencies to install in the runtime image (typically used for evaluation).
This will be rendered into the end of the Dockerfile that builds the runtime image.
It can contain any valid shell commands (e.g., pip install numpy).
-The path to the interpreter is available as $OD_INTERPRETER_PATH,
-which can be used to install dependencies for the OD-specific Python interpreter.
+The path to the interpreter is available as $OH_INTERPRETER_PATH,
+which can be used to install dependencies for the OH-specific Python interpreter.
runtime_startup_env_vars: The environment variables to set at the launch of the runtime.
This is a dictionary of key-value pairs.
This is useful for setting environment variables that are needed by the runtime.
8 changes: 4 additions & 4 deletions openhands/runtime/remote/runtime.py
@@ -44,7 +44,7 @@


class RemoteRuntime(Runtime):
-"""This runtime will connect to a remote od-runtime-client."""
+"""This runtime will connect to a remote oh-runtime-client."""

port: int = 60000 # default port for the remote runtime client

@@ -93,16 +93,16 @@ def __init__(
'Setting runtime_container_image is not supported in the remote runtime.'
)
self.container_image: str = self.config.sandbox.base_container_image
-self.container_name = 'od-remote-runtime-' + self.instance_id
+self.container_name = 'oh-remote-runtime-' + self.instance_id
logger.debug(f'RemoteRuntime `{sid}` config:\n{self.config}')
response = send_request(self.session, 'GET', f'{self.api_url}/registry_prefix')
response_json = response.json()
registry_prefix = response_json['registry_prefix']
-os.environ['OD_RUNTIME_RUNTIME_IMAGE_REPO'] = (
+os.environ['OH_RUNTIME_RUNTIME_IMAGE_REPO'] = (
registry_prefix.rstrip('/') + '/runtime'
)
logger.info(
-f'Runtime image repo: {os.environ["OD_RUNTIME_RUNTIME_IMAGE_REPO"]}'
+f'Runtime image repo: {os.environ["OH_RUNTIME_RUNTIME_IMAGE_REPO"]}'
)

if self.config.sandbox.runtime_extra_deps:
4 changes: 2 additions & 2 deletions openhands/runtime/utils/runtime_build.py
@@ -16,7 +16,7 @@


def get_runtime_image_repo():
-return os.getenv('OD_RUNTIME_RUNTIME_IMAGE_REPO', 'ghcr.io/all-hands-ai/runtime')
+return os.getenv('OH_RUNTIME_RUNTIME_IMAGE_REPO', 'ghcr.io/all-hands-ai/runtime')


def _get_package_version():
@@ -365,7 +365,7 @@ def _build_sandbox_image(
on the contents of the docker build folder (source code and Dockerfile)
e.g. 1234567890abcdef
-target_image_tag (str): the tag for the target image that's generic and based on the base image name
-e.g. od_v0.8.3_image_ubuntu_tag_22.04
+e.g. oh_v0.9.3_image_ubuntu_tag_22.04
"""
target_image_hash_name = f'{target_image_repo}:{target_image_hash_tag}'
target_image_generic_name = f'{target_image_repo}:{target_image_tag}'
38 changes: 31 additions & 7 deletions openhands/runtime/utils/runtime_templates/Dockerfile.j2
@@ -23,6 +23,7 @@ RUN mkdir -p /openhands && \
mkdir -p /openhands/logs && \
mkdir -p /openhands/poetry

+# Directory containing subdirectories for virtual environment.
ENV POETRY_VIRTUALENVS_PATH=/openhands/poetry

RUN if [ ! -d /openhands/miniforge3 ]; then \
@@ -46,24 +47,47 @@ RUN /openhands/miniforge3/bin/mamba install conda-forge::poetry python=3.11 -y
RUN if [ -d /openhands/code ]; then rm -rf /openhands/code; fi
COPY ./code /openhands/code

-# Install/Update Dependencies
-# 1. Install pyproject.toml via poetry
-# 2. Install playwright and chromium
-# 3. Clear poetry, apt, mamba caches
-RUN cd /openhands/code && \
+# Below RUN command sets up the Python environment using Poetry,
+# installs project dependencies, and configures the container
+# for OpenHands development.
+# It creates and activates a virtual environment, installs necessary
+# tools like Playwright, sets up environment variables, and configures
+# the bash environment to ensure the correct Python interpreter and
+# virtual environment are used by default.
+WORKDIR /openhands/code
+RUN \
+# Configure Poetry and create virtual environment
/openhands/miniforge3/bin/mamba run -n base poetry config virtualenvs.path /openhands/poetry && \
/openhands/miniforge3/bin/mamba run -n base poetry env use python3.11 && \
+# Install project dependencies
/openhands/miniforge3/bin/mamba run -n base poetry install --only main,runtime --no-interaction --no-root && \
+# Update and install additional tools
apt-get update && \
/openhands/miniforge3/bin/mamba run -n base poetry run pip install playwright && \
/openhands/miniforge3/bin/mamba run -n base poetry run playwright install --with-deps chromium && \
-export OD_INTERPRETER_PATH=$(/openhands/miniforge3/bin/mamba run -n base poetry run python -c "import sys; print(sys.executable)") && \
+# Set environment variables
+export OH_INTERPRETER_PATH=$(/openhands/miniforge3/bin/mamba run -n base poetry run python -c "import sys; print(sys.executable)") && \
+export OH_VENV_PATH=$(/openhands/miniforge3/bin/mamba run -n base poetry env info --path) && \
+# Install extra dependencies if specified
{{ extra_deps }} {% if extra_deps %} && {% endif %} \
+# Clear caches
/openhands/miniforge3/bin/mamba run -n base poetry cache clear --all . && \
+# Set permissions
{% if not skip_init %}chmod -R g+rws /openhands/poetry && {% endif %} \
mkdir -p /openhands/workspace && chmod -R g+rws,o+rw /openhands/workspace && \
+# Clean up
apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* && \
/openhands/miniforge3/bin/mamba clean --all

+{% if not skip_init %}
+RUN \
+# Add the Poetry virtual environment to the bashrc
+echo "export OH_INTERPRETER_PATH=\"$OH_INTERPRETER_PATH\"" >> /etc/bash.bashrc && \
+echo "export OH_VENV_PATH=\"$OH_VENV_PATH\"" >> /etc/bash.bashrc && \
+# Activate the Poetry virtual environment
+echo 'source "$OH_VENV_PATH/bin/activate"' >> /etc/bash.bashrc && \
+# Use the Poetry virtual environment's Python interpreter
+echo 'alias python="$OH_INTERPRETER_PATH"' >> /etc/bash.bashrc
+{% endif %}
# ================================================================
# END: Copy Project and Install/Update Dependencies
# ================================================================
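
The `{{ extra_deps }} {% if extra_deps %} && {% endif %}` line in the template keeps the `&&` chain valid whether or not `runtime_extra_deps` is set. The Python sketch below mimics that conditional with plain string handling. It is an illustration only, not the Jinja2 rendering that OpenHands performs, and `render_extra_deps_step` is a hypothetical name.

```python
def render_extra_deps_step(extra_deps: str) -> str:
    """Return the shell fragment contributed by the optional extra-deps step."""
    # An empty value contributes nothing; a non-empty value gets a trailing '&&'
    # so that the next command in the RUN chain still follows it.
    return f"{extra_deps} && " if extra_deps else ""


next_command = "poetry cache clear --all ."
for deps in ("", "$OH_INTERPRETER_PATH -m pip install scitools-pyke"):
    print(render_extra_deps_step(deps) + next_command)
```

With an empty value only the cache-clearing command is printed; with the `scitools-pyke` example from `evaluation/logic_reasoning/run_infer.py`, the install command is chained in front of it.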
