Commit Graph

12 Commits

Author SHA1 Message Date
0b5319b31c Add GPU support and improve Docker deployment
- Add GPU deployment support with NVIDIA runtime
  - Update Dockerfile.allinone with GPU environment variables
  - Add comprehensive GPU_DEPLOYMENT.md guide

- Make port 11434 (Ollama) optional for security
  - Update DEPLOYMENT.md with CPU and GPU deployment options
  - Simplify default docker run commands
  - Update healthcheck to only check web application

- Add memory requirements documentation
  - Create MEMORY_REQUIREMENTS.md with model comparison
  - Add build-8b.sh script for lower memory usage
  - Document OOM troubleshooting steps

- Improve Docker build process
  - Add BUILD_TROUBLESHOOTING.md for common issues
  - Add DISTRIBUTION.md for image distribution methods
  - Update .gitignore to exclude large binary files
  - Improve docker-entrypoint.sh with better diagnostics

- Update .dockerignore to include ollama-linux-amd64.tgz
- Add backup file exclusions to .gitignore
2025-12-08 17:08:45 +08:00
6426b73a5e fix: export only required models instead of entire Ollama directory
- Changed export-ollama-models.sh to selectively copy only qwen3:14b and qwen3-embedding:4b
- Parses manifest files to identify required blob files
- Significantly reduces Docker image size by excluding unrelated models
- Added summary showing which models were skipped

This prevents accidentally including other models (like deepseek-r1, bge-m3, etc.)
that may exist in the user's Ollama directory but are not needed for the project.
2025-12-08 12:00:11 +08:00
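
Note on 6426b73a5e: the selective export described above can be sketched roughly as follows. This is a minimal illustration, not the contents of export-ollama-models.sh. It assumes each Ollama manifest is a JSON file with a "config" object and a "layers" list whose entries carry a "digest" such as "sha256:abc...", and that blobs are stored on disk under names like "sha256-abc...". The function names are hypothetical.

```python
import json
from pathlib import Path


def required_blobs(manifest_path: Path) -> set[str]:
    """Return the blob file names referenced by one manifest.

    Assumed layout: digests look like "sha256:<hex>" in the manifest and
    the matching blob file is named "sha256-<hex>".
    """
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    entries = [manifest.get("config", {})] + list(manifest.get("layers", []))
    return {e["digest"].replace(":", "-", 1) for e in entries if "digest" in e}


def blobs_to_skip(blob_dir: Path, keep: set[str]) -> list[str]:
    """List blob files in blob_dir that no required manifest references."""
    return sorted(p.name for p in blob_dir.iterdir() if p.name not in keep)
```

Taking the union of required_blobs() over the manifests for qwen3:14b and qwen3-embedding:4b gives the set to copy; blobs_to_skip() gives the material for a "skipped" summary.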
26f8e0c648 feat: add Docker support for offline deployment with qwen3:14b
Major additions:
- All-in-One Docker image with Ollama + models bundled
- Separate deployment option for existing Ollama installations
- Changed default model from qwen3:8b to qwen3:14b
- Comprehensive deployment documentation

Files added:
- Dockerfile: Basic app-only image
- Dockerfile.allinone: Complete image with Ollama + models
- docker-compose.yml: Easy deployment configuration
- docker-entrypoint.sh: Startup script for all-in-one image
- requirements.txt: Python dependencies
- .dockerignore: Exclude unnecessary files from image

Scripts:
- export-ollama-models.sh: Export models from local Ollama
- build-allinone.sh: Build complete offline-deployable image
- build-and-export.sh: Build and export basic image

Documentation:
- DEPLOYMENT.md: Comprehensive deployment guide
- QUICK_START.md: Quick reference for common tasks

Configuration:
- Updated config.py: DEFAULT_CHAT_MODEL = qwen3:14b
- Updated frontend/opro.html: Page title to 系统提示词优化 ("System Prompt Optimization")
2025-12-08 10:10:38 +08:00
65cdcf29dc refactor: replace OPRO with simple iterative refinement
Major changes:
- Remove fake OPRO evaluation (no more fake 0.5 scores)
- Add simple refinement based on user selection
- New endpoint: POST /opro/refine (selected + rejected instructions)
- Update prompt generation to focus on comprehensive coverage instead of style variety
- All generated instructions now start with a role definition (你是一个..., i.e. "You are a...")
- Update README to reflect new approach and API endpoints

Technical details:
- Added refine_based_on_selection() in prompt_utils.py
- Added refine_instruction_candidates() in user_prompt_optimizer.py
- Added OPRORefineReq model and /opro/refine endpoint in api.py
- Updated frontend handleContinueOptimize() to use new refinement flow
- Changed prompt requirements from 'different styles' to 'comprehensive coverage'
- Added role definition requirement as first item in all prompt templates
2025-12-08 09:43:20 +08:00
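
Note on 65cdcf29dc: the selection-based refinement could look something like the sketch below. The commit names OPRORefineReq and refine_based_on_selection() but does not show their fields or signatures, so RefineRequest, build_refine_prompt, and the k parameter here are hypothetical stand-ins, and the prompt wording is only an illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RefineRequest:
    """Hypothetical request body; the real OPRORefineReq fields are not shown."""
    selected: str
    rejected: list[str] = field(default_factory=list)
    k: int = 3  # assumed: how many new candidates to ask for


def build_refine_prompt(req: RefineRequest) -> str:
    """Compose a meta-prompt asking an LLM to improve on the chosen instruction."""
    lines = [
        f"Write {req.k} improved system instructions.",
        "Each one must open with a role definition and cover the task comprehensively.",
        "",
        "Instruction the user selected (keep its strengths):",
        req.selected,
    ]
    if req.rejected:
        lines += ["", "Instructions the user rejected (avoid their weaknesses):"]
        lines += [f"- {r}" for r in req.rejected]
    return "\n".join(lines)
```

The resulting string would be sent to the chat model; no score is computed, which matches the commit's move away from evaluated OPRO.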
602875b08c refactor: remove execute instruction button to simplify UX
- Removed '执行此指令' ("Execute this instruction") button from candidate cards
- Prevents confusion between execution interactions and new task input
- Cleaner workflow: input box for new tasks, 继续优化 ("Continue optimizing") for iteration, 复制 ("Copy") for copying
- Each candidate now only has two actions: continue optimizing or copy
2025-12-06 22:41:05 +08:00
da30a0999c feat: implement session-based architecture for OPRO
- Add session layer above runs to group related optimization tasks
- Sessions use first task description as name instead of 'Session 1'
- Simplified sidebar: show sessions without expansion
- Add '+ 新建任务' ("+ New task") button in header to create runs within session
- Fix: reload sessions after creating new run
- Add debugging logs for candidate generation
- Backend: auto-update session name with first task description
2025-12-06 21:26:24 +08:00
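
Note on da30a0999c: a minimal sketch of a session that takes its name from the first task description. The Session class, its fields, and the max_len truncation are assumptions made for illustration; the commit does not show the backend data model.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Hypothetical session record grouping related optimization runs."""
    name: str = ""
    runs: list[str] = field(default_factory=list)

    def add_run(self, task_description: str, max_len: int = 30) -> None:
        """Record a run; the first one also names the session."""
        self.runs.append(task_description)
        if not self.name:
            text = task_description.strip()
            # Truncating long descriptions is an assumption, not from the commit.
            self.name = text if len(text) <= max_len else text[:max_len] + "..."
```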
1376d60ed5 feat: implement true OPRO with Gemini-style UI
- Add true OPRO system instruction optimization (vs query rewriting)
- Implement iterative optimization with performance trajectory
- Add new OPRO API endpoints (/opro/create, /opro/generate_and_evaluate, /opro/execute)
- Create modern Gemini-style chat UI (frontend/opro.html)
- Optimize performance: reduce candidates from 20 to 10 (2x faster)
- Add model selector in UI toolbar
- Add collapsible sidebar with session management
- Add copy button for instructions
- Ensure all generated prompts use simplified Chinese
- Update README with comprehensive documentation
- Add .gitignore for local_docs folder
2025-12-06 17:24:28 +08:00
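
Note on 1376d60ed5: the "iterative optimization with performance trajectory" follows the general OPRO idea, in which previously scored instructions are shown to a model that proposes new ones. The sketch below is a generic illustration of one such step, not this repository's code. The propose and score callables are hypothetical placeholders for an LLM call and an evaluator.

```python
from typing import Callable

Scored = tuple[str, float]  # (instruction text, score)


def opro_step(
    trajectory: list[Scored],
    propose: Callable[[list[Scored]], list[str]],
    score: Callable[[str], float],
    top_k: int = 5,
) -> list[Scored]:
    """Run one optimization round and return the extended trajectory."""
    # Show the proposer the best instructions so far, worst first, best last.
    best = sorted(trajectory, key=lambda pair: pair[1])[-top_k:]
    seen = {text for text, _ in trajectory}
    # Drop duplicates among the proposals and anything already scored.
    fresh = [c for c in dict.fromkeys(propose(best)) if c not in seen]
    return trajectory + [(c, score(c)) for c in fresh]
```

Calling opro_step repeatedly grows the trajectory; the approach only helps if score reflects real task performance.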
8f52fad41c Add .gitignore and remove tracked cache files 2025-12-05 16:14:00 +08:00
xxm
b4934dfe6d Update README.md 2025-12-05 15:47:44 +08:00
xxm
800bed638f Remove unrelated code 2025-12-05 07:34:51 +00:00
xxm
dd5339de32 Original code 2025-12-05 07:11:25 +00:00
xxm
045e777a11 first commit 2025-12-05 07:06:50 +00:00