This document explains the complete workflow for using Prometheus to generate 3D models.
```bash
git clone https://github.com/Caraveo/Prometheus.git
cd Prometheus
./setup.sh
```

This will:
- Create a Python virtual environment (`env/`)
- Install all dependencies (PyTorch, Shap-E, etc.)
- Install Shap-E from GitHub
- Set up the environment for Apple Silicon M-Series optimization
If you want to use material generation:
```bash
./download_material_models.sh
```

This downloads ~2-3GB of material generation models from HuggingFace.
```bash
./run.sh
```

This builds and launches the macOS app with proper focus.
Best for: Creating 3D models from text descriptions
- Select Mode: Choose "Text to 3D" from the mode selector
- Enter Prompt: Type a description (e.g., "a red sports car", "a wooden chair")
- Optional - Generate Materials: Toggle "Generate Materials" if you want PBR textures
- Generate: Click "Generate 3D Model"
- Wait: First generation takes 5-10 minutes (downloads models), subsequent ones take 1-3 minutes
- View Results:
  - PLY file saved in the `output/` directory
  - USDZ file automatically generated for iPhone/Vision Pro
  - Material maps (if enabled) saved in `output/materials/`

- `output/{prompt}.ply` - 3D mesh file (Blender, MeshLab compatible)
- `output/{prompt}.usdz` - Spatial computing format (iPhone/Vision Pro)
- `output/materials/{prompt}_albedo.png` - Albedo map (if materials enabled)
- `output/materials/{prompt}_roughness.png` - Roughness map
- `output/materials/{prompt}_metallic.png` - Metallic map
- `output/materials/{prompt}_bump.png` - Bump/normal map
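The generated PLY files are standard mesh files, so you can sanity-check one before importing it into Blender or MeshLab. A minimal header reader in pure Python (the function name and usage path are illustrative, not part of Prometheus):

```python
def read_ply_header(path):
    """Parse a PLY file header; return (format, element counts)."""
    counts = {}
    fmt = None
    with open(path, "rb") as f:
        assert f.readline().strip() == b"ply", "not a PLY file"
        for raw in f:
            line = raw.decode("ascii", errors="replace").strip()
            if line.startswith("format"):
                fmt = line.split()[1]        # "ascii" or "binary_little_endian"
            elif line.startswith("element"):
                _, name, count = line.split()
                counts[name] = int(count)    # e.g. vertex and face counts
            elif line == "end_header":
                break
    return fmt, counts
```

For example, `read_ply_header("output/a_red_sports_car.ply")` would return the encoding plus vertex/face counts, a quick way to confirm the generation actually produced geometry.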
Best for: Converting 2D images into 3D models
- Select Mode: Choose "Image to 3D" from the mode selector
- Drop Image: Drag and drop an image into the drop zone, or use the folder button
- Optional Prompt: Add a text description for additional guidance
- Optional - Generate Materials: Toggle "Generate Materials" for PBR textures
- Generate: Click "Generate 3D Model"
- Wait: Processing takes 1-3 minutes
- View Results: Same as Text-to-3D workflow
- Use clear, well-lit images for best results
- Images with depth information work better
- The prompt helps guide the 3D reconstruction
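Image-to-3D pipelines generally expect a square RGB input, so it can help to preprocess photos before dropping them in. A sketch using Pillow (the 256-pixel target is an assumption, not a documented Prometheus requirement):

```python
from PIL import Image

def prepare_image(path, size=256):
    """Center-crop to a square, resize, and convert to RGB."""
    img = Image.open(path).convert("RGB")        # drop alpha / palette modes
    w, h = img.size
    side = min(w, h)
    left, top = (w - side) // 2, (h - side) // 2
    img = img.crop((left, top, left + side, top + side))
    return img.resize((size, size), Image.LANCZOS)
```

Center-cropping keeps the subject in frame while avoiding the distortion a plain resize would introduce on non-square photos.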
Best for: Reconstructing 3D scenes from multiple photographs
- Multiple images (at least 3, preferably 20+) of the same scene from different angles
- Camera poses (can be generated with COLMAP/LLFF tools)
- Note: NeRF requires TensorFlow 1.15 and CUDA, which may not work on Apple Silicon
1. Prepare Images:
   - Take 20+ photos of your scene from different angles
   - Place all images in a single directory
   - Ensure consistent lighting and exposure
2. Generate Camera Poses (if needed):
   - Use COLMAP or LLFF tools to compute camera poses
   - Save poses as `poses_bounds.npy` in the images directory
3. Select Mode: Choose "NeRF (Multi-Image)" from the mode selector
4. Select Directory: Click the folder button and select your images directory
5. Generate: Click "Generate 3D Model"
6. Wait: NeRF training takes several hours (depending on scene complexity and hardware)
7. View Results:
   - Trained NeRF model in `output/logs/`
   - Extracted mesh in `output/nerf_mesh.ply`
   - Render video in `output/nerf_video.mp4`
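For reference, LLFF-style `poses_bounds.npy` files store one row per image: a flattened 3x5 camera matrix (a 3x4 pose plus a height/width/focal column) followed by near and far depth bounds, for 17 values total. A quick validation sketch (the helper name is illustrative):

```python
import numpy as np

def check_poses(path):
    """Validate an LLFF-style poses_bounds.npy file and return its contents."""
    data = np.load(path)
    n, cols = data.shape
    assert cols == 17, f"expected 17 columns (3x5 pose + 2 bounds), got {cols}"
    poses = data[:, :15].reshape(n, 3, 5)   # per-image 3x5 camera matrices
    bounds = data[:, 15:]                   # near/far scene depth per image
    return poses, bounds
```

Running this before starting a multi-hour training job catches a malformed or misplaced poses file early.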
- ⚠️ Apple Silicon Limitation: Original NeRF uses TensorFlow 1.15, which may not work on M-Series chips
- Consider using modern NeRF implementations (like nerfstudio) for Apple Silicon compatibility
- Training time: 2-24 hours depending on scene and hardware
- Requires CUDA GPU for reasonable training times
Best for: Adding realistic textures to generated 3D models
- A generated 3D model (from Shap-E workflows)
- MaterialAnything models downloaded (`./download_material_models.sh`)
- Generate 3D Model: Use Text-to-3D or Image-to-3D workflow first
- Enable Materials: Toggle "Generate Materials" before generating
- Wait: Material generation adds 2-5 minutes to processing time
- View Results: Material maps appear in the output section
- Albedo: Base color/diffuse texture
- Roughness: Surface roughness (affects how light scatters)
- Metallic: Metallic properties (for PBR rendering)
- Bump: Surface detail/normal map
- Import PLY mesh into Blender, Unreal Engine, Unity, etc.
- Apply material maps using PBR shader
- Adjust material properties based on the generated maps
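As an example of applying the maps, engines that follow the glTF convention expect roughness in the green channel and metallic in the blue channel of a single packed texture. A sketch using Pillow (the file paths and the white ambient-occlusion fallback are assumptions, since no AO map is generated):

```python
from PIL import Image

def pack_orm(roughness_path, metallic_path, out_path):
    """Pack roughness (G) and metallic (B) into one glTF-style ORM texture."""
    rough = Image.open(roughness_path).convert("L")
    metal = Image.open(metallic_path).convert("L").resize(rough.size)
    occlusion = Image.new("L", rough.size, 255)   # no AO map: use white
    Image.merge("RGB", (occlusion, rough, metal)).save(out_path)
```

Blender's Principled BSDF can read the separate maps directly, but packed textures save memory in game engines.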
- Setup (one-time):

  ```bash
  ./setup.sh
  ./download_material_models.sh
  ```
- Generate Model:
  - Launch app: `./run.sh`
  - Select "Text to 3D"
  - Enter prompt: "a red sports car, detailed, realistic"
  - Toggle "Generate Materials" ON
  - Click "Generate 3D Model"
  - Wait 5-10 minutes (first time) or 2-5 minutes (subsequent)
- Results:
  - `output/a_red_sports_car.ply` - 3D mesh
  - `output/a_red_sports_car.usdz` - For iPhone/Vision Pro
  - `output/materials/a_red_sports_car_albedo.png` - Car paint texture
  - `output/materials/a_red_sports_car_roughness.png` - Surface finish
  - `output/materials/a_red_sports_car_metallic.png` - Chrome/metal parts
  - `output/materials/a_red_sports_car_bump.png` - Surface details
- Use in 3D Software:
  - Import PLY into Blender
  - Create PBR material
  - Load material maps
  - Render or export
- ✅ Optimized: Shap-E uses MPS (Metal Performance Shaders) for GPU acceleration
- ✅ Fast: Text-to-3D generation takes 1-3 minutes
- ✅ Material Generation: Works in simplified mode
- ⚠️ NeRF: May not work (TensorFlow 1.15 limitation)

- ✅ Supported: Falls back to CPU mode
- ⚠️ Slower: Generation takes 3-5 minutes
- ⚠️ No GPU: No CUDA support

- ✅ Fastest: Full GPU acceleration
- ✅ NeRF: Works with TensorFlow 1.15
- ✅ Materials: Full MaterialAnything pipeline
- Run `./setup.sh` to create the virtual environment
- Run `./download_material_models.sh`
- Or disable material generation
- Expected behavior: TensorFlow 1.15 doesn't support M-Series chips
- Use modern NeRF implementations like nerfstudio instead
- First generation downloads models (~2GB) - be patient
- Ensure you're using Apple Silicon for best performance
- Check that MPS is being used (check status messages)
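To confirm which device PyTorch will pick up, you can run a quick check inside the `env/` environment. The fallback logic here is a sketch; Prometheus's own device selection may differ:

```python
def pick_device():
    """Return the best available torch device name: 'mps', 'cuda', or 'cpu'."""
    try:
        import torch
    except ImportError:
        return "cpu"                       # torch not installed: CPU only
    if getattr(torch.backends, "mps", None) and torch.backends.mps.is_available():
        return "mps"                       # Apple Silicon GPU via Metal
    if torch.cuda.is_available():
        return "cuda"                      # NVIDIA GPU
    return "cpu"

print(pick_device())
```

On an M-Series Mac this should print `mps`; if it prints `cpu`, generation will fall back to the slower CPU path.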
```
Prometheus/
├── output/
│   ├── your_model.ply           # 3D mesh
│   ├── your_model.usdz          # iPhone/Vision Pro format
│   └── materials/               # Material maps (if generated)
│       ├── your_model_albedo.png
│       ├── your_model_roughness.png
│       ├── your_model_metallic.png
│       └── your_model_bump.png
├── env/                         # Python environment
└── ...
```
| Feature | Input | Output | Time | Apple Silicon |
|---|---|---|---|---|
| Text-to-3D | Text prompt | PLY + USDZ | 1-3 min | ✅ Optimized |
| Image-to-3D | Single image | PLY + USDZ | 1-3 min | ✅ Optimized |
| NeRF | Multiple images | PLY + Video | Hours | ⚠️ May not work |
| Materials | 3D model | PBR maps | +2-5 min | ✅ Simplified mode |
For more details, see the README.md.