
Deploying LLM Studio with the Docker image on RunPod fails because the web server starts on localhost instead of 0.0.0.0 - should be configurable #557

Closed
fbellame opened this issue Dec 29, 2023 · 3 comments
Labels: type/bug (Bug in code)

fbellame commented Dec 29, 2023

🐛 Bug

RunPod is a popular, inexpensive cloud GPU provider (see their current documentation). I want to build a tutorial on how to easily fine-tune a small LLM like Mistral 7B without owning a GPU.
I love LLM Studio because it makes this pretty easy.
I own a pretty good GPU myself, but most folks (developers not in data science) don't.
So I decided to try deploying LLM Studio on RunPod. It didn't work, so I reached out to RunPod support, who told me that the server must be started on 0.0.0.0, not localhost.

I went through the LLM Studio documentation and, briefly, the open-source code, but didn't manage to find a way to configure that.

Here is the scenario:

Deploying LLM Studio with the Docker image on RunPod fails because the web server starts on localhost instead of 0.0.0.0 - should be configurable.

To Reproduce

Deploy a RunPod container with the Docker image.

Put your RunPod API key in RUNPOD_KEY (requires a RunPod account):

curl --request POST \
  --header 'content-type: application/json' \
  --url "https://api.runpod.io/graphql?api_key=${RUNPOD_KEY}" \
  --data '{"query": "mutation { podFindAndDeployOnDemand( input: { cloudType: ALL, gpuCount: 1, volumeInGb: 50, containerDiskInGb: 40, gpuTypeId: \"NVIDIA GeForce RTX 3080\", name: \"h2o-llmstudio\", imageName: \"gcr.io/vorvan/h2oai/h2o-llmstudio:nightly\", dockerArgs: \"\", ports: \"10101/http\", volumeMountPath: \"/data\" } ) { id imageName env machineId machine { podHostId } } }"}'

Deployment looks successful, but when trying to access the app with the URL:

https://[pod-id]-10101.proxy.runpod.net (replace [pod-id] with the id of the pod you just deployed)

the browser shows this error:

Disconnected. Reconnecting in 16s
Make sure your wave server is running and the environment network policies allow websocket connections

LLM Studio version

Any recent version; I use the nightly Docker image build: gcr.io/vorvan/h2oai/h2o-llmstudio:nightly

fbellame added the type/bug label on Dec 29, 2023
pascal-pfeiffer (Collaborator) commented

Thank you for reporting this.
Without having tested it yet, I'll quickly mention the env vars H2O_WAVE_ADDRESS (and H2O_WAVE_LISTEN); setting these inside the Docker container may already unblock you.

https://wave.h2o.ai/docs/configuration#h2o_wave_address
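
As a minimal sketch, setting these when running the same image locally could look like the following (assumptions: the port 10101 and image tag are taken from the reproduction above; per the Wave docs linked here, H2O_WAVE_LISTEN is the bind address of the Wave server and H2O_WAVE_ADDRESS is the address apps use to reach it):

# Bind the Wave server on all interfaces instead of localhost,
# and keep the in-container LLM Studio app pointing at the local waved.
docker run \
  -p 10101:10101 \
  -e H2O_WAVE_LISTEN="0.0.0.0:10101" \
  -e H2O_WAVE_ADDRESS="http://127.0.0.1:10101" \
  gcr.io/vorvan/h2oai/h2o-llmstudio:nightly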

fbellame (Author) commented Dec 29, 2023

Thanks a lot, it helped a bit, but now I have a new error message in the container log:

{"err":"websocket: request origin not allowed by Upgrader.CheckOrigin","t":"socket_upgrade"}

Looks like another config is required to allow the websocket to work!

pascal-pfeiffer (Collaborator) commented

H2O Wave recently added a feature that allows configuration of the websocket origins. h2oai/wave#2279

Please check the latest H2O LLM Studio version, which includes this H2O Wave feature. The new env variable for the setting is H2O_WAVE_ALLOWED_ORIGINS.
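
A sketch of combining both settings when running the image, assuming the allowed origin must match the pod's proxy URL from the reproduction above (the pod id below is a hypothetical placeholder):

# Hypothetical pod id; replace with the id of your deployed pod.
POD_ID=abc123xyz
docker run \
  -p 10101:10101 \
  -e H2O_WAVE_LISTEN="0.0.0.0:10101" \
  -e H2O_WAVE_ALLOWED_ORIGINS="https://${POD_ID}-10101.proxy.runpod.net" \
  gcr.io/vorvan/h2oai/h2o-llmstudio:nightly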

Please reopen if that didn't resolve your issue.
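
For completeness, the env var could also be passed through the original RunPod deployment call. This is a sketch under the assumption that the podFindAndDeployOnDemand input accepts an env list of { key, value } pairs (check the RunPod API docs for the exact schema); H2O_WAVE_ALLOWED_ORIGINS could be added the same way once the pod's proxy URL is known:

curl --request POST \
  --header 'content-type: application/json' \
  --url "https://api.runpod.io/graphql?api_key=${RUNPOD_KEY}" \
  --data '{"query": "mutation { podFindAndDeployOnDemand( input: { cloudType: ALL, gpuCount: 1, volumeInGb: 50, containerDiskInGb: 40, gpuTypeId: \"NVIDIA GeForce RTX 3080\", name: \"h2o-llmstudio\", imageName: \"gcr.io/vorvan/h2oai/h2o-llmstudio:nightly\", dockerArgs: \"\", ports: \"10101/http\", volumeMountPath: \"/data\", env: [ { key: \"H2O_WAVE_LISTEN\", value: \"0.0.0.0:10101\" } ] } ) { id machine { podHostId } } }"}'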
