(enh) Dockerfile.j2: improve env vars for bash and activate in .bashrc (All-Hands-AI#3871)
SmartManoj · Sep 17, 2024 · 52c5abc
1 parent 29b0e62
commit 52c5abc

Showing 13 changed files with 75 additions and 55 deletions.
diff --git a/docs/i18n/fr/docusaurus-plugin-content-docs/current/usage/custom_sandbox_guide.md b/docs/i18n/fr/docusaurus-plugin-content-docs/current/usage/custom_sandbox_guide.md
@@ -59,10 +59,6 @@ Félicitations !
 
 ## Explication technique
 
-Le code pertinent est défini dans [ssh_box.py](https://github.com/All-Hands-AI/OpenHands/blob/main/openhands/runtime/docker/ssh_box.py) et [image_agnostic_util.py](https://github.com/All-Hands-AI/OpenHands/blob/main/openhands/runtime/docker/image_agnostic_util.py).
-
-En particulier, ssh_box.py vérifie l'objet config pour ```config.sandbox.base_container_image``` et ensuite tente de récupérer l'image à l'aide de [get_od_sandbox_image](https://github.com/All-Hands-AI/OpenHands/blob/main/openhands/runtime/docker/image_agnostic_util.py#L72), qui est défini dans image_agnostic_util.py.
-
 Lorsqu'une image personnalisée est utilisée pour la première fois, elle ne sera pas trouvée et donc elle sera construite (à l'exécution ultérieure, l'image construite sera trouvée et renvoyée).
 
 L'image personnalisée est construite avec [_build_sandbox_image()](https://github.com/All-Hands-AI/OpenHands/blob/main/openhands/runtime/docker/image_agnostic_util.py#L29), qui crée un fichier docker en utilisant votre image personnalisée comme base et configure ensuite l'environnement pour OpenHands, comme ceci:

diff --git a/docs/modules/usage/architecture/runtime.md b/docs/modules/usage/architecture/runtime.md
@@ -47,8 +47,8 @@ graph TD
 ```
 
 1. User Input: The user provides a custom base Docker image
-2. Image Building: OpenHands builds a new Docker image (the "OD runtime image") based on the user-provided image. This new image includes OpenHands-specific code, primarily the "runtime client"
-3. Container Launch: When OpenHands starts, it launches a Docker container using the OD runtime image
+2. Image Building: OpenHands builds a new Docker image (the "OH runtime image") based on the user-provided image. This new image includes OpenHands-specific code, primarily the "runtime client"
+3. Container Launch: When OpenHands starts, it launches a Docker container using the OH runtime image
 4. Client Initialization: The runtime client initializes inside the container, setting up necessary components like a bash shell and loading any specified plugins
 5. Communication: The OpenHands backend (`runtime.py`) communicates with the runtime client over RESTful API, sending actions and receiving observations
 6. Action Execution: The runtime client receives actions from the backend, executes them in the sandboxed environment, and sends back observations
@@ -62,7 +62,7 @@ The role of the client:
 - It formats and returns observations to the backend, ensuring a consistent interface for processing results
 
 
-## How OpenHands builds and maintains OD Runtime images
+## How OpenHands builds and maintains OH Runtime images
 
 OpenHands' approach to building and managing runtime images ensures efficiency, consistency, and flexibility in creating and maintaining Docker images for both production and development environments.
 
@@ -80,9 +80,9 @@ OpenHands uses a dual-tagging system for its runtime images to balance reproduci
    - This ensures reproducibility; the same hash always means the same image contents
 
 2. Generic tag: `{target_image_repo}:{target_image_tag}`.
-   Example: `runtime:od_v0.8.3_ubuntu_tag_22.04`
+   Example: `runtime:oh_v0.9.3_ubuntu_tag_22.04`
 
-   - This tag follows the format: `runtime:od_v{OD_VERSION}_{BASE_IMAGE_NAME}_tag_{BASE_IMAGE_TAG}`
+   - This tag follows the format: `runtime:oh_v{OH_VERSION}_{BASE_IMAGE_NAME}_tag_{BASE_IMAGE_TAG}`
    - It represents the latest build for a particular base image and OpenHands version combination
    - This tag is updated whenever a new image is built from the same base image, even if the source code changes
 
@@ -94,11 +94,11 @@ The hash-based tag ensures reproducibility, while the generic tag provides a sta
    - Hash-based tag: `{target_image_repo}:{target_image_hash_tag}`.
      Example: `runtime:abc123def456`
    - Generic tag: `{target_image_repo}:{target_image_tag}`.
-     Example: `runtime:od_v0.8.3_ubuntu_tag_22.04`
+     Example: `runtime:oh_v0.9.3_ubuntu_tag_22.04`
 
 2. Build Process:
-   - a. Convert the base image name to an OD runtime image name
-      Example: `ubuntu:22.04` -> `runtime:od_v0.8.3_ubuntu_tag_22.04`
+   - a. Convert the base image name to an OH runtime image name
+      Example: `ubuntu:22.04` -> `runtime:oh_v0.9.3_ubuntu_tag_22.04`
    - b. Generate a build context (Dockerfile and OpenHands source code) and calculate its hash
    - c. Check for an existing image with the calculated hash
    - d. If not found, check for a recent compatible image to use as a base
@@ -108,7 +108,7 @@ The hash-based tag ensures reproducibility, while the generic tag provides a sta
 3. Image Reuse and Rebuilding Logic:
    The system follows these steps to determine whether to build a new image or use an existing one from a user-provided (base) image (e.g., `ubuntu:22.04`):
    - a. If an image exists with the same hash (e.g., `runtime:abc123def456`), it will be reused as is
-   - b. If the exact hash is not found, the system will try to rebuild using the latest generic image (e.g., `runtime:od_v0.8.3_ubuntu_tag_22.04`) as a base. This saves time by leveraging existing dependencies
+   - b. If the exact hash is not found, the system will try to rebuild using the latest generic image (e.g., `runtime:oh_v0.9.3_ubuntu_tag_22.04`) as a base. This saves time by leveraging existing dependencies
    - c. If neither the hash-tagged nor the generic-tagged image is found, the system will build the image completely from scratch
 
 4. Caching and Efficiency:
@@ -121,10 +121,10 @@ Here's a flowchart illustrating the build process:
 ```mermaid
 flowchart TD
     A[Start] --> B{Convert base image name}
-    B --> |ubuntu:22.04 -> runtime:od_v0.8.3_ubuntu_tag_22.04| C[Generate build context and hash]
+    B --> |ubuntu:22.04 -> runtime:oh_v0.9.3_ubuntu_tag_22.04| C[Generate build context and hash]
     C --> D{Check for existing image with hash}
     D -->|Found runtime:abc123def456| E[Use existing image]
-    D -->|Not found| F{Check for runtime:od_v0.8.3_ubuntu_tag_22.04}
+    D -->|Not found| F{Check for runtime:oh_v0.9.3_ubuntu_tag_22.04}
     F -->|Found| G[Rebuild based on recent image]
     F -->|Not found| H[Build from scratch]
     G --> I[Tag with hash and generic tags]
@@ -137,13 +137,13 @@ This approach ensures that:
 
 1. Identical source code and Dockerfile always produce the same image (via hash-based tags)
 2. The system can quickly rebuild images when minor changes occur (by leveraging recent compatible images)
-3. The generic tag (e.g., `runtime:od_v0.8.3_ubuntu_tag_22.04`) always points to the latest build for a particular base image and OpenHands version combination
+3. The generic tag (e.g., `runtime:oh_v0.9.3_ubuntu_tag_22.04`) always points to the latest build for a particular base image and OpenHands version combination
 
 ## Runtime Plugin System
 
 The OpenHands Runtime supports a plugin system that allows for extending functionality and customizing the runtime environment. Plugins are initialized when the runtime client starts up.
 
-Check [an example of Jupyter plugin here](https://github.com/All-Hands-AI/OpenHands/blob/9c44d94cef32e6426ebd8deeeb52963153b2348a/openhands/runtime/plugins/jupyter/__init__.py#L30-L63) if you want to implement your own plugin.
+Check [an example of Jupyter plugin here](https://github.com/All-Hands-AI/OpenHands/blob/ecf4aed28b0cf7c18d4d8ff554883ba182fc6bdd/openhands/runtime/plugins/jupyter/__init__.py#L21-L55) if you want to implement your own plugin.
 
 *More details about the Plugin system are still under construction - contributions are welcomed!*
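
As a rough illustration of the tagging and reuse rules documented in `runtime.md` above, the following Python sketch derives the generic tag from a base image and picks a build strategy. The names `generic_runtime_tag` and `choose_build_strategy` are hypothetical and are not taken from the OpenHands source. The sketch assumes a simple `name:tag` base image and treats an untagged image as `latest`.

```python
def generic_runtime_tag(base_image: str, oh_version: str) -> str:
    """Return oh_v{OH_VERSION}_{BASE_IMAGE_NAME}_tag_{BASE_IMAGE_TAG}.

    Hypothetical helper: only simple 'name:tag' inputs are handled.
    """
    name, _, tag = base_image.partition(':')
    # Assumption: an untagged base image is treated as 'latest'.
    tag = tag or 'latest'
    return f'oh_v{oh_version}_{name}_tag_{tag}'


def choose_build_strategy(existing_tags: set, hash_tag: str, generic_tag: str) -> str:
    """Mirror the documented order: exact hash, then generic tag, then scratch."""
    if hash_tag in existing_tags:
        return 'reuse'  # same hash means same image contents
    if generic_tag in existing_tags:
        return 'rebuild_from_generic'  # reuse already-installed dependencies
    return 'build_from_scratch'


if __name__ == '__main__':
    generic = generic_runtime_tag('ubuntu:22.04', '0.9.3')
    print(generic)
    print(choose_build_strategy({generic}, 'abc123def456', generic))
```

The order of the checks matters: the hash-tagged image is preferred because it guarantees identical contents, while the generic tag is only a starting point for a faster rebuild.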
 

diff --git a/evaluation/logic_reasoning/run_infer.py b/evaluation/logic_reasoning/run_infer.py
@@ -52,7 +52,7 @@ def get_config(
             base_container_image='xingyaoww/od-eval-logic-reasoning:v1.0',
             enable_auto_lint=True,
             use_host_network=False,
-            runtime_extra_deps='$OD_INTERPRETER_PATH -m pip install scitools-pyke',
+            runtime_extra_deps='$OH_INTERPRETER_PATH -m pip install scitools-pyke',
         ),
         # do not mount workspace
         workspace_base=None,

diff --git a/evaluation/mint/run_infer.py b/evaluation/mint/run_infer.py
@@ -105,7 +105,7 @@ def get_config(
             base_container_image='xingyaoww/od-eval-mint:v1.0',
             enable_auto_lint=True,
             use_host_network=False,
-            runtime_extra_deps=f'$OD_INTERPRETER_PATH -m pip install {" ".join(MINT_DEPENDENCIES)}',
+            runtime_extra_deps=f'$OH_INTERPRETER_PATH -m pip install {" ".join(MINT_DEPENDENCIES)}',
         ),
         # do not mount workspace
         workspace_base=None,

diff --git a/...pts/eval/convert_od_output_to_swe_json.py → ...pts/eval/convert_oh_output_to_swe_json.py b/...pts/eval/convert_od_output_to_swe_json.py → ...pts/eval/convert_oh_output_to_swe_json.py
@@ -4,14 +4,14 @@
 import pandas as pd
 
 parser = argparse.ArgumentParser()
-parser.add_argument('od_output_file', type=str)
+parser.add_argument('oh_output_file', type=str)
 args = parser.parse_args()
-output_filepath = args.od_output_file.replace('.jsonl', '.swebench.jsonl')
-print(f'Converting {args.od_output_file} to {output_filepath}')
+output_filepath = args.oh_output_file.replace('.jsonl', '.swebench.jsonl')
+print(f'Converting {args.oh_output_file} to {output_filepath}')
 
-od_format = pd.read_json(args.od_output_file, orient='records', lines=True)
-# model name is the folder name of od_output_file
-model_name = os.path.basename(os.path.dirname(args.od_output_file))
+oh_format = pd.read_json(args.oh_output_file, orient='records', lines=True)
+# model name is the folder name of oh_output_file
+model_name = os.path.basename(os.path.dirname(args.oh_output_file))
 
 
 def process_git_patch(patch):
@@ -59,5 +59,5 @@ def convert_row_to_swebench_format(row):
     }
 
 
-swebench_format = od_format.apply(convert_row_to_swebench_format, axis=1)
+swebench_format = oh_format.apply(convert_row_to_swebench_format, axis=1)
 swebench_format.to_json(output_filepath, lines=True, orient='records')
diff --git a/evaluation/swe_bench/scripts/eval_infer.sh b/evaluation/swe_bench/scripts/eval_infer.sh
@@ -28,9 +28,9 @@ FILE_NAME=$(basename $PROCESS_FILEPATH)
 echo "Evaluating $FILE_NAME @ $FILE_DIR"
 
 # ================================================
-# detect whether PROCESS_FILEPATH is in OD format or in SWE-bench format
+# detect whether PROCESS_FILEPATH is in OH format or in SWE-bench format
 echo "=============================================================="
-echo "Detecting whether PROCESS_FILEPATH is in OD format or in SWE-bench format"
+echo "Detecting whether PROCESS_FILEPATH is in OH format or in SWE-bench format"
 echo "=============================================================="
 # SWE-bench format is a JSONL where every line has three fields: model_name_or_path, instance_id, and model_patch
 function is_swebench_format() {
@@ -56,9 +56,9 @@ if [ $IS_SWEBENCH_FORMAT -eq 0 ]; then
 else
     echo "The file IS NOT in SWE-bench format."
 
-    # ==== Convert OD format to SWE-bench format ====
+    # ==== Convert OH format to SWE-bench format ====
     echo "Merged output file with fine-grained report will be saved to $FILE_DIR"
-    poetry run python3 evaluation/swe_bench/scripts/eval/convert_od_output_to_swe_json.py $PROCESS_FILEPATH
+    poetry run python3 evaluation/swe_bench/scripts/eval/convert_oh_output_to_swe_json.py $PROCESS_FILEPATH
     # replace .jsonl with .swebench.jsonl in filename
     SWEBENCH_FORMAT_JSONL=${PROCESS_FILEPATH/.jsonl/.swebench.jsonl}
     echo "SWEBENCH_FORMAT_JSONL: $SWEBENCH_FORMAT_JSONL"

diff --git a/evaluation/swe_bench/scripts/setup/compare_patch_filename.py b/evaluation/swe_bench/scripts/setup/compare_patch_filename.py
@@ -19,10 +19,10 @@ def extract_modified_files(patch):
     return modified_files
 
 
-def process_report(od_output_file):
+def process_report(oh_output_file):
     succ = 0
     fail = 0
-    for line in open(od_output_file):
+    for line in open(oh_output_file):
         line = json.loads(line)
         instance_id = line['instance_id']
         gold_patch = line['swe_instance']['patch']
@@ -48,7 +48,7 @@ def process_report(od_output_file):
 
 if __name__ == '__main__':
     parser = argparse.ArgumentParser()
-    parser.add_argument('--od_output_file', help='Path to the OD output file')
+    parser.add_argument('--oh_output_file', help='Path to the OH output file')
     args = parser.parse_args()
 
-    process_report(args.od_output_file)
+    process_report(args.oh_output_file)
diff --git a/evaluation/swe_bench/scripts/setup/prepare_swe_utils.sh b/evaluation/swe_bench/scripts/setup/prepare_swe_utils.sh
@@ -6,15 +6,15 @@ mkdir -p $EVAL_WORKSPACE
 
 # 1. Prepare REPO
 echo "==== Prepare SWE-bench repo ===="
-OD_SWE_BENCH_REPO_PATH="https://github.com/All-Hands-AI/OD-SWE-bench.git"
-OD_SWE_BENCH_REPO_BRANCH="eval"
-git clone -b $OD_SWE_BENCH_REPO_BRANCH $OD_SWE_BENCH_REPO_PATH $EVAL_WORKSPACE/OD-SWE-bench
+OH_SWE_BENCH_REPO_PATH="https://github.com/All-Hands-AI/SWE-bench.git"
+OH_SWE_BENCH_REPO_BRANCH="eval"
+git clone -b $OH_SWE_BENCH_REPO_BRANCH $OH_SWE_BENCH_REPO_PATH $EVAL_WORKSPACE/OH-SWE-bench
 
 # 2. Prepare DATA
 echo "==== Prepare SWE-bench data ===="
 EVAL_IMAGE=ghcr.io/all-hands-ai/eval-swe-bench:builder_with_conda
 EVAL_WORKSPACE=$(realpath $EVAL_WORKSPACE)
-chmod +x $EVAL_WORKSPACE/OD-SWE-bench/swebench/harness/prepare_data.sh
+chmod +x $EVAL_WORKSPACE/OH-SWE-bench/swebench/harness/prepare_data.sh
 if [ -d $EVAL_WORKSPACE/eval_data ]; then
     rm -r $EVAL_WORKSPACE/eval_data
 fi
@@ -24,4 +24,4 @@ docker run \
     -u $(id -u):$(id -g) \
     -e HF_DATASETS_CACHE="/tmp" \
     --rm -it $EVAL_IMAGE \
-    bash -c "cd OD-SWE-bench/swebench/harness && /swe_util/miniforge3/bin/conda run -n swe-bench-eval ./prepare_data.sh && mv eval_data /workspace/"
+    bash -c "cd OH-SWE-bench/swebench/harness && /swe_util/miniforge3/bin/conda run -n swe-bench-eval ./prepare_data.sh && mv eval_data /workspace/"
diff --git a/evaluation/swe_bench/scripts/setup/swe_entry.sh b/evaluation/swe_bench/scripts/setup/swe_entry.sh
@@ -60,7 +60,7 @@ conda activate swe-bench-eval
 
 mkdir -p $SWE_TASK_DIR/reset_testbed_temp
 mkdir -p $SWE_TASK_DIR/reset_testbed_log_dir
-SWE_BENCH_DIR=/swe_util/OD-SWE-bench
+SWE_BENCH_DIR=/swe_util/OH-SWE-bench
 output=$(
     export PYTHONPATH=$SWE_BENCH_DIR && \
     cd $SWE_BENCH_DIR && \

diff --git a/openhands/core/config.py b/openhands/core/config.py
@@ -194,8 +194,8 @@ class SandboxConfig:
         runtime_extra_deps: The extra dependencies to install in the runtime image (typically used for evaluation).
             This will be rendered into the end of the Dockerfile that builds the runtime image.
             It can contain any valid shell commands (e.g., pip install numpy).
-            The path to the interpreter is available as $OD_INTERPRETER_PATH,
-            which can be used to install dependencies for the OD-specific Python interpreter.
+            The path to the interpreter is available as $OH_INTERPRETER_PATH,
+            which can be used to install dependencies for the OH-specific Python interpreter.
         runtime_startup_env_vars: The environment variables to set at the launch of the runtime.
             This is a dictionary of key-value pairs.
             This is useful for setting environment variables that are needed by the runtime.

diff --git a/openhands/runtime/remote/runtime.py b/openhands/runtime/remote/runtime.py
@@ -44,7 +44,7 @@
 
 
 class RemoteRuntime(Runtime):
-    """This runtime will connect to a remote od-runtime-client."""
+    """This runtime will connect to a remote oh-runtime-client."""
 
     port: int = 60000  # default port for the remote runtime client
 
@@ -93,16 +93,16 @@ def __init__(
                 'Setting runtime_container_image is not supported in the remote runtime.'
             )
         self.container_image: str = self.config.sandbox.base_container_image
-        self.container_name = 'od-remote-runtime-' + self.instance_id
+        self.container_name = 'oh-remote-runtime-' + self.instance_id
         logger.debug(f'RemoteRuntime `{sid}` config:\n{self.config}')
         response = send_request(self.session, 'GET', f'{self.api_url}/registry_prefix')
         response_json = response.json()
         registry_prefix = response_json['registry_prefix']
-        os.environ['OD_RUNTIME_RUNTIME_IMAGE_REPO'] = (
+        os.environ['OH_RUNTIME_RUNTIME_IMAGE_REPO'] = (
             registry_prefix.rstrip('/') + '/runtime'
         )
         logger.info(
-            f'Runtime image repo: {os.environ["OD_RUNTIME_RUNTIME_IMAGE_REPO"]}'
+            f'Runtime image repo: {os.environ["OH_RUNTIME_RUNTIME_IMAGE_REPO"]}'
         )
 
         if self.config.sandbox.runtime_extra_deps:

diff --git a/openhands/runtime/utils/runtime_build.py b/openhands/runtime/utils/runtime_build.py
@@ -16,7 +16,7 @@
 
 
 def get_runtime_image_repo():
-    return os.getenv('OD_RUNTIME_RUNTIME_IMAGE_REPO', 'ghcr.io/all-hands-ai/runtime')
+    return os.getenv('OH_RUNTIME_RUNTIME_IMAGE_REPO', 'ghcr.io/all-hands-ai/runtime')
 
 
 def _get_package_version():
@@ -365,7 +365,7 @@ def _build_sandbox_image(
         on the contents of the docker build folder (source code and Dockerfile)
         e.g. 1234567890abcdef
     -target_image_tag (str): the tag for the target image that's generic and based on the base image name
-        e.g. od_v0.8.3_image_ubuntu_tag_22.04
+        e.g. oh_v0.9.3_image_ubuntu_tag_22.04
     """
     target_image_hash_name = f'{target_image_repo}:{target_image_hash_tag}'
     target_image_generic_name = f'{target_image_repo}:{target_image_tag}'
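
One practical consequence of the rename in `runtime.py` and `runtime_build.py` is that only `OH_RUNTIME_RUNTIME_IMAGE_REPO` is read after this commit, so a value exported under the old `OD_` name no longer overrides the default. The sketch below reproduces the lookup and the name composition from the hunks above. The `env` parameter and the helper names `repo_from_registry_prefix` and `target_image_names` are added here for illustration and are not part of the actual code.

```python
import os

DEFAULT_REPO = 'ghcr.io/all-hands-ai/runtime'


def get_runtime_image_repo(env=None) -> str:
    """Read the repo from the renamed variable, falling back to the default.

    The `env` parameter is an illustrative addition so the lookup can be
    exercised without touching the real process environment.
    """
    env = os.environ if env is None else env
    return env.get('OH_RUNTIME_RUNTIME_IMAGE_REPO', DEFAULT_REPO)


def repo_from_registry_prefix(registry_prefix: str) -> str:
    """Same expression RemoteRuntime uses before exporting the variable."""
    return registry_prefix.rstrip('/') + '/runtime'


def target_image_names(repo: str, hash_tag: str, generic_tag: str) -> tuple:
    """Compose the hash-based and generic image names for one repo."""
    return f'{repo}:{hash_tag}', f'{repo}:{generic_tag}'


if __name__ == '__main__':
    # A value set under the legacy name is ignored after the rename.
    legacy_only = {'OD_RUNTIME_RUNTIME_IMAGE_REPO': 'example.com/old/runtime'}
    print(get_runtime_image_repo(legacy_only))

    # Hypothetical registry prefix, used only for this demonstration.
    repo = repo_from_registry_prefix('registry.example.com/team/')
    print(target_image_names(repo, '1234567890abcdef', 'oh_v0.9.3_image_ubuntu_tag_22.04'))
```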

diff --git a/openhands/runtime/utils/runtime_templates/Dockerfile.j2 b/openhands/runtime/utils/runtime_templates/Dockerfile.j2
@@ -23,6 +23,7 @@ RUN mkdir -p /openhands && \
     mkdir -p /openhands/logs && \
     mkdir -p /openhands/poetry
 
+# Directory containing subdirectories for virtual environment.
 ENV POETRY_VIRTUALENVS_PATH=/openhands/poetry
 
 RUN if [ ! -d /openhands/miniforge3 ]; then \
@@ -46,24 +47,47 @@ RUN /openhands/miniforge3/bin/mamba install conda-forge::poetry python=3.11 -y
 RUN if [ -d /openhands/code ]; then rm -rf /openhands/code; fi
 COPY ./code /openhands/code
 
-# Install/Update Dependencies
-# 1. Install pyproject.toml via poetry
-# 2. Install playwright and chromium
-# 3. Clear poetry, apt, mamba caches
-RUN cd /openhands/code && \
+# Below RUN command sets up the Python environment using Poetry,
+# installs project dependencies, and configures the container
+# for OpenHands development.
+# It creates and activates a virtual environment, installs necessary
+# tools like Playwright, sets up environment variables, and configures
+# the bash environment to ensure the correct Python interpreter and
+# virtual environment are used by default.
+WORKDIR /openhands/code
+RUN \
+    # Configure Poetry and create virtual environment
+    /openhands/miniforge3/bin/mamba run -n base poetry config virtualenvs.path /openhands/poetry && \
     /openhands/miniforge3/bin/mamba run -n base poetry env use python3.11 && \
+    # Install project dependencies
     /openhands/miniforge3/bin/mamba run -n base poetry install --only main,runtime --no-interaction --no-root && \
+    # Update and install additional tools
     apt-get update && \
     /openhands/miniforge3/bin/mamba run -n base poetry run pip install playwright && \
     /openhands/miniforge3/bin/mamba run -n base poetry run playwright install --with-deps chromium && \
-    export OD_INTERPRETER_PATH=$(/openhands/miniforge3/bin/mamba run -n base poetry run python -c "import sys; print(sys.executable)") && \
+    # Set environment variables
+    export OH_INTERPRETER_PATH=$(/openhands/miniforge3/bin/mamba run -n base poetry run python -c "import sys; print(sys.executable)") && \
+    export OH_VENV_PATH=$(/openhands/miniforge3/bin/mamba run -n base poetry env info --path) && \
+    # Install extra dependencies if specified
     {{ extra_deps }} {% if extra_deps %} && {% endif %} \
+    # Clear caches
     /openhands/miniforge3/bin/mamba run -n base poetry cache clear --all . && \
+    # Set permissions
     {% if not skip_init %}chmod -R g+rws /openhands/poetry && {% endif %} \
     mkdir -p /openhands/workspace && chmod -R g+rws,o+rw /openhands/workspace && \
+    # Clean up
     apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* && \
     /openhands/miniforge3/bin/mamba clean --all
-
+{% if not skip_init %}
+RUN \
+    # Add the Poetry virtual environment to the bashrc
+    echo "export OH_INTERPRETER_PATH=\"$OH_INTERPRETER_PATH\"" >> /etc/bash.bashrc && \
+    echo "export OH_VENV_PATH=\"$OH_VENV_PATH\"" >> /etc/bash.bashrc && \
+    # Activate the Poetry virtual environment
+    echo 'source "$OH_VENV_PATH/bin/activate"' >> /etc/bash.bashrc && \
+    # Use the Poetry virtual environment's Python interpreter
+    echo 'alias python="$OH_INTERPRETER_PATH"' >> /etc/bash.bashrc
+{% endif %}
 # ================================================================
 # END: Copy Project and Install/Update Dependencies
 # ================================================================
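
A point worth checking in the `Dockerfile.j2` hunk above: each `RUN` instruction executes in its own shell, so variables set with `export` in the first `RUN` are not visible in the second one unless they are also declared with `ENV` or `ARG` elsewhere in the template, which this diff does not show. If they are not, the double-quoted `$OH_INTERPRETER_PATH` and `$OH_VENV_PATH` in the `echo` lines would expand to empty strings at build time. The Python sketch below imitates this with two separate `/bin/sh` processes. It is an analogy rather than a Docker build, and `run_layer` and the example path are hypothetical.

```python
import os
import subprocess

# Minimal clean environment so no pre-existing OH_* variable leaks in.
CLEAN_ENV = {'PATH': os.environ.get('PATH', '/usr/bin:/bin')}


def run_layer(script: str) -> str:
    """Run a script in a fresh shell, the way each RUN instruction does."""
    result = subprocess.run(
        ['/bin/sh', '-c', script],
        capture_output=True, text=True, check=True, env=CLEAN_ENV,
    )
    return result.stdout.strip()


# "First RUN": the export is visible inside this shell only.
first = run_layer('export OH_VENV_PATH=/openhands/poetry/demo-env; echo "$OH_VENV_PATH"')

# "Second RUN": a new shell, so the variable is unset when the line is built.
second = run_layer('echo "export OH_VENV_PATH=\\"$OH_VENV_PATH\\""')

# Same shell: export and echo together, so the value survives.
same_layer = run_layer(
    'export OH_VENV_PATH=/openhands/poetry/demo-env; '
    'echo "export OH_VENV_PATH=\\"$OH_VENV_PATH\\""'
)

if __name__ == '__main__':
    print(first)
    print(second)
    print(same_layer)
```

If the variables do turn out to be empty in the built image, one option would be to append to `/etc/bash.bashrc` from within the same `RUN` that computes them.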