This README describes how to set up a Dockerized, CUDA-enabled environment running several services: llama-cpp-python, Stable Diffusion, MariaDB, MongoDB, Redis, and Grafana.
## prerequisites
- Docker Engine
- NVIDIA drivers
- NVIDIA Container Toolkit
- NVIDIA CUDA Toolkit
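The prerequisites above can be sanity-checked from a shell. The command names below (`docker`, `nvidia-smi`, `nvcc`) are the standard CLIs each package installs; adjust if your distribution differs:

```shell
# Report whether each prerequisite's CLI is on PATH.
for cmd in docker nvidia-smi nvcc; do
  if command -v "$cmd" >/dev/null 2>&1; then
    echo "$cmd: found"
  else
    echo "$cmd: missing"
  fi
done
```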
- Set your environment variables in .env (keep changes to this file untracked; it is not production-ready).
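As an illustration, a .env might look like the fragment below. Every value is a placeholder, not a real credential or address, and the variable names beyond the two OpenAI ones are assumptions:

```shell
# Hypothetical .env contents; replace with your own values and never commit real secrets.
OPENAI_BASE_URL=http://localhost:8000/v1
OPENAI_API_KEY=sk-local-placeholder
MARIADB_ROOT_PASSWORD=changeme
```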
## setup

```bash
cd llama-docker
docker build -t base_image -f docker/Dockerfile.base .  # build the base image
docker build -t cuda_image -f docker/Dockerfile.cuda .  # build the CUDA image
docker compose up --build -d  # build and start the containers, detached
```
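Before bringing the stack up, it can be worth confirming that containers can actually see the GPU. A common smoke test is to run `nvidia-smi` in a throwaway CUDA container (the image tag below is an assumption; pick one matching your installed CUDA version):

```shell
# Run nvidia-smi inside a disposable CUDA container; prints a hint instead of
# failing hard if Docker or the GPU is unavailable.
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi \
  || echo "GPU not visible to Docker; check the NVIDIA Container Toolkit install"
```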
## useful commands
```bash
docker compose up -d                      # start the containers
docker compose stop                       # stop the containers
docker compose up --build -d              # rebuild and restart the containers
docker ps                                 # list running containers
docker logs {container id}                # show container logs
docker exec -it {container id} /bin/bash  # open a shell inside a container
```
## configuration
- Multi-model support: models are defined in the configuration file llama_config.json.
- OPENAI_BASE_URL and OPENAI_API_KEY are set in .env.
- The endpoint is OpenAI-compatible, so the local server and the OpenAI API are interchangeable.
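Because the endpoint is OpenAI-compatible, switching backends is just a matter of pointing OPENAI_BASE_URL elsewhere. A quick check is to list the available models; the defaults below are assumptions for a local llama-cpp-python server, not values taken from this repo:

```shell
# List models from whichever backend OPENAI_BASE_URL points at.
# Defaults are placeholders for a local server; .env values override them.
: "${OPENAI_BASE_URL:=http://localhost:8000/v1}"
: "${OPENAI_API_KEY:=sk-local-placeholder}"
echo "querying $OPENAI_BASE_URL/models"
curl -s -m 5 "$OPENAI_BASE_URL/models" \
  -H "Authorization: Bearer $OPENAI_API_KEY" || echo "backend unreachable"
```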
## services
- MariaDB: {ip address}:6000
- MongoDB: {ip address}:6001
- Redis: {ip address}:6002
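A minimal reachability check for these ports can be sketched as below. The host address is a placeholder for your Docker host, and bash's /dev/tcp redirection is assumed to be available:

```shell
# Probe each service port and report open/closed.
HOST=127.0.0.1  # placeholder: replace with your Docker host's address
for port in 6000 6001 6002; do
  if (exec 3<>"/dev/tcp/$HOST/$port") 2>/dev/null; then
    echo "port $port: open"
  else
    echo "port $port: closed"
  fi
done
```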