Add GPU support and improve Docker deployment
- Add GPU deployment support with NVIDIA runtime
  - Update Dockerfile.allinone with GPU environment variables
  - Add comprehensive GPU_DEPLOYMENT.md guide
  - Make port 11434 (Ollama) optional for security
  - Update DEPLOYMENT.md with CPU and GPU deployment options
  - Simplify default docker run commands
  - Update healthcheck to only check web application
- Add memory requirements documentation
  - Create MEMORY_REQUIREMENTS.md with model comparison
  - Add build-8b.sh script for lower memory usage
  - Document OOM troubleshooting steps
- Improve Docker build process
  - Add BUILD_TROUBLESHOOTING.md for common issues
  - Add DISTRIBUTION.md for image distribution methods
  - Update .gitignore to exclude large binary files
  - Improve docker-entrypoint.sh with better diagnostics
  - Update .dockerignore to include ollama-linux-amd64.tgz
  - Add backup file exclusions to .gitignore
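The commit introduces parallel CPU and GPU run modes; the choice between them can be automated by probing the Docker daemon for the NVIDIA runtime. A minimal sketch — the probe itself is an assumption and not part of this commit, though `docker info` and `--gpus all` are standard Docker / NVIDIA Container Toolkit features:

```shell
# Enable --gpus all only when the Docker daemon reports an nvidia runtime;
# otherwise fall back to the CPU-mode flags documented in DEPLOYMENT.md.
# (Illustrative check, not part of this commit.)
if docker info 2>/dev/null | grep -q 'nvidia'; then
    GPU_FLAGS="--gpus all"
else
    GPU_FLAGS=""
fi
echo "Selected flags: ${GPU_FLAGS:-none (CPU mode)}"
```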
@@ -15,12 +15,18 @@ echo "=========================================="
 echo ""
 echo "This will create a Docker image containing:"
 echo "  - Python application"
-echo "  - Ollama service"
+echo "  - Ollama service (v0.13.1)"
 echo "  - qwen3:14b model"
 echo "  - qwen3-embedding:4b model"
 echo ""
+echo "Target platform: linux/amd64 (x86_64)"
+echo ""
 echo "WARNING: The final image will be 10-20GB in size!"
 echo ""
+echo "NOTE: If you're building on Apple Silicon (M1/M2/M3),"
+echo "      Docker will use emulation which may be slower."
+echo "      The image will still work on x86_64 servers."
+echo ""
 
 # Check if ollama-models directory exists
 if [ ! -d "ollama-models" ]; then
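The Apple Silicon note added above could also be raised automatically by inspecting the host architecture. A sketch, assuming `uname -m` reports `x86_64` on matching hosts — this check is illustrative and not part of the commit:

```shell
# Warn when the build host is not x86_64: `--platform linux/amd64` then
# runs the build under QEMU emulation, which is noticeably slower.
HOST_ARCH=$(uname -m)
if [ "$HOST_ARCH" != "x86_64" ]; then
    echo "NOTE: host is $HOST_ARCH; the amd64 build will use emulation."
fi
echo "Host architecture: $HOST_ARCH"
```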
@@ -33,6 +39,19 @@ fi
 echo "✓ Found ollama-models directory"
 echo ""
 
+# Check if Ollama binary exists
+if [ ! -f "ollama-linux-amd64.tgz" ]; then
+    echo "ERROR: ollama-linux-amd64.tgz not found!"
+    echo ""
+    echo "Please download it first:"
+    echo "  curl -L -o ollama-linux-amd64.tgz https://github.com/ollama/ollama/releases/download/v0.13.1/ollama-linux-amd64.tgz"
+    echo ""
+    exit 1
+fi
+
+echo "✓ Found ollama-linux-amd64.tgz"
+echo ""
+
 # Check disk space
 AVAILABLE_SPACE=$(df -h . | awk 'NR==2 {print $4}')
 echo "Available disk space: $AVAILABLE_SPACE"
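The disk-space check above only prints the human-readable figure from `df -h`, which cannot be compared numerically. A sketch of a stricter variant using kilobyte output — the 40 GB threshold is a guess derived from the script's own 10-20GB image warning, not a value from the commit:

```shell
# df -k reports free space in 1K blocks, which can be compared as integers,
# unlike the "45G"-style strings that df -h produces.
AVAILABLE_KB=$(df -k . | awk 'NR==2 {print $4}')
REQUIRED_KB=$((40 * 1024 * 1024))   # assumed 40 GB minimum, not from the commit
if [ "$AVAILABLE_KB" -lt "$REQUIRED_KB" ]; then
    echo "WARNING: only ${AVAILABLE_KB} KB free; the build may run out of space."
fi
```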
@@ -50,7 +69,12 @@ echo ""
 echo "=========================================="
 echo "Building Docker image..."
 echo "=========================================="
-docker build -f Dockerfile.allinone -t ${IMAGE_NAME}:${IMAGE_TAG} .
+echo "Platform: linux/amd64 (x86_64)"
+echo "This may take 20-40 minutes depending on your machine..."
+echo ""
+
+# Build for amd64 platform explicitly
+docker build --platform linux/amd64 -f Dockerfile.allinone -t ${IMAGE_NAME}:${IMAGE_TAG} .
 
 echo ""
 echo "=========================================="
@@ -83,14 +107,25 @@ echo "2. On target server, load the image:"
 echo "   docker load -i ${EXPORT_FILE}"
 echo ""
 echo "3. Run the container:"
 echo ""
+echo "   CPU mode:"
 echo "   docker run -d \\"
 echo "     --name system-prompt-optimizer \\"
 echo "     -p 8010:8010 \\"
-echo "     -p 11434:11434 \\"
 echo "     -v \$(pwd)/outputs:/app/outputs \\"
 echo "     --restart unless-stopped \\"
 echo "     ${IMAGE_NAME}:${IMAGE_TAG}"
 echo ""
+echo "   GPU mode (recommended if NVIDIA GPU available):"
+echo "   docker run -d \\"
+echo "     --name system-prompt-optimizer \\"
+echo "     --gpus all \\"
+echo "     -p 8010:8010 \\"
+echo "     --restart unless-stopped \\"
+echo "     ${IMAGE_NAME}:${IMAGE_TAG}"
+echo ""
+echo "   Note: Port 11434 (Ollama) is optional and only needed for debugging."
+echo "   GPU mode provides 5-10x faster inference. See GPU_DEPLOYMENT.md for details."
+echo ""
 echo "4. Access the application:"
 echo "   http://<server-ip>:8010/ui/opro.html"
 echo ""
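The CPU and GPU commands printed above differ only in the `--gpus all` flag (and the optional 11434 port mapping, which this commit drops from the defaults); generating both from one helper keeps the two blocks from drifting apart. A sketch — the helper name and flag layout are illustrative, not from the commit, and the outputs volume mount is left out for brevity:

```shell
# Build the docker run argument list: GPU mode prepends --gpus all, and
# both variants publish only the web port 8010 (11434 stays internal).
run_flags() {
    flags="-d --name system-prompt-optimizer -p 8010:8010 --restart unless-stopped"
    [ "$1" = "gpu" ] && flags="--gpus all $flags"
    echo "$flags"
}

run_flags gpu
```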