Transform unformatted documents into professionally styled ones with AI. DocCraft learns formatting patterns from your reference documents and applies them to your raw content.
- AI-Powered Style Learning: Uses Google Gemini AI to understand and replicate document formatting patterns
- Intelligent Block Classification: Automatically categorizes text blocks (headings, paragraphs, lists, etc.)
- Seamless DOCX Processing: Works with Microsoft Word documents (.docx format)
- User-Friendly Interface: Built with Streamlit for an intuitive web-based experience
- Batch Processing: Format entire documents in seconds
- Style Preservation: Maintains font sizes, bold, italic, underline, and other formatting attributes
- Upload Raw Document: Provide your unformatted DOCX file
- Upload Reference Document: Provide a formatted DOCX file as a style template
- AI Analysis: DocCraft analyzes the formatting patterns in your reference document
- Smart Application: The AI applies similar formatting to your raw document
- Download Result: Get your professionally formatted document instantly
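
The learn-then-apply flow above can be pictured with a small sketch. This is a toy illustration only, not DocCraft's implementation: it makes no Gemini calls, and the `Style` class, `learn_styles`, and `apply_styles` are hypothetical names. It assumes blocks have already been classified, and it picks the most common style per block type from the reference.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Style:
    """Hypothetical container for attributes such as font size, bold, italic, underline."""
    font_size: int
    bold: bool = False
    italic: bool = False
    underline: bool = False


def learn_styles(reference_blocks):
    """Map each block type to the style it most often carries in the reference.

    reference_blocks: iterable of (block_type, Style) pairs.
    """
    seen = defaultdict(Counter)
    for block_type, style in reference_blocks:
        seen[block_type][style] += 1
    return {t: counts.most_common(1)[0][0] for t, counts in seen.items()}


def apply_styles(raw_blocks, learned, default):
    """Pair each raw (block_type, text) with its learned style, or a default."""
    return [(text, learned.get(block_type, default)) for block_type, text in raw_blocks]


# Reference document: one heading and three paragraphs, already classified.
reference = [
    ("heading", Style(16, bold=True)),
    ("paragraph", Style(11)),
    ("paragraph", Style(11)),
    ("paragraph", Style(12, italic=True)),
]
learned = learn_styles(reference)

# Raw document: block types with no styling yet.
styled = apply_styles(
    [("heading", "Introduction"), ("paragraph", "Some raw text."), ("list", "An item")],
    learned,
    default=Style(11),
)
```

Block types that never appear in the reference (the `list` item here) fall back to the default style, which is one simple way to handle gaps in a template.
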
- Python 3.13+
- Google Gemini API key
- Google Cloud Service Account (optional, for enhanced features)
    git clone <repository-url>
    cd DocCraft
    pip install -r requirements.txt

Create a .env file in the project root:

    export GEMINI_API_KEY=your_gemini_api_key_here
    export GOOGLE_SERVICE_ACCOUNT_JSON='{"type": "service_account", ...}'

Then start the app:

    streamlit run app.py

Navigate to http://localhost:8501 to start using DocCraft!

To install in editable mode:

    pip install -e .

- streamlit: Web interface
- python-docx: DOCX file processing
- langchain: AI framework
- langchain-google-genai: Google Gemini integration
- google-api-python-client: Google API client
- python-dotenv: Environment variable management
- watchdog: File monitoring
From Python:

    from doccraft import DocCraft

    # Initialize DocCraft
    formatter = DocCraft(api_key="your_gemini_key")

    # Format a document
    formatted_doc = formatter.format_document(
        raw_file="unformatted.docx",
        reference_file="template.docx",
    )

    # Save the result
    formatted_doc.save("formatted_output.docx")

From the web interface:

- Start the Streamlit app:
      streamlit run app.py
- Upload your raw DOCX file
- Upload your reference/template DOCX file
- Click "Format and Download DOCX"
- Download your formatted document
- Academic Papers: Apply consistent formatting to research documents
- Business Reports: Maintain corporate style guidelines across documents
- Legal Documents: Ensure uniform formatting for legal briefs and contracts
- Technical Documentation: Standardize formatting for manuals and guides
- Content Migration: Convert documents between different style formats
| Variable | Description | Required |
|---|---|---|
| GEMINI_API_KEY | Your Google Gemini API key | Yes |
| GOOGLE_SERVICE_ACCOUNT_JSON | Google Cloud service account JSON | Optional |
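
As a rough sketch of how these two variables might be read at startup: `load_settings` is a hypothetical helper, not part of DocCraft, and the returned dictionary keys are assumptions. It treats `GEMINI_API_KEY` as required and `GOOGLE_SERVICE_ACCOUNT_JSON` as optional, matching the table.

```python
import json
import os


def load_settings(env=None):
    """Read the two documented variables from a mapping (default: os.environ)."""
    env = os.environ if env is None else env

    # Required: fail early with a clear message if the key is missing or blank.
    api_key = (env.get("GEMINI_API_KEY") or "").strip()
    if not api_key:
        raise RuntimeError("Gemini API key not found: set GEMINI_API_KEY")

    # Optional: parse the service account JSON only when it is provided.
    raw_account = env.get("GOOGLE_SERVICE_ACCOUNT_JSON")
    service_account = json.loads(raw_account) if raw_account else None

    return {"gemini_api_key": api_key, "service_account": service_account}
```

Passing a plain dictionary instead of relying on `os.environ` keeps the helper easy to test, for example `load_settings({"GEMINI_API_KEY": "abc"})`.
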
- Input: Microsoft Word (.docx)
- Output: Microsoft Word (.docx)
- Styling: Font size, bold, italic, underline, colors, alignment
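
Because only .docx is supported, a quick pre-check can reject legacy .doc files before any processing. The sketch below relies on the fact that a .docx file is a ZIP archive containing `word/document.xml`; the function name `looks_like_docx` is hypothetical, and this is not necessarily how DocCraft validates uploads.

```python
import io
import zipfile


def looks_like_docx(source):
    """Return True if source (a path or binary file object) is a ZIP with word/document.xml."""
    if not zipfile.is_zipfile(source):
        return False
    with zipfile.ZipFile(source) as archive:
        return "word/document.xml" in archive.namelist()


# Build a minimal in-memory archive that has the expected main document part.
docx_like = io.BytesIO()
with zipfile.ZipFile(docx_like, "w") as archive:
    archive.writestr("word/document.xml", "<w:document/>")

# A legacy .doc file is not a ZIP archive; these bytes stand in for one.
legacy_doc = io.BytesIO(b"\xd0\xcf\x11\xe0 not a zip archive")

is_docx = looks_like_docx(docx_like)
is_legacy_docx = looks_like_docx(legacy_doc)
```

This only checks the container, not whether the XML inside is a well-formed Word document.
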
DocCraft includes built-in rate limiting and retry logic for the Google Gemini API:
- Automatic retry on rate limit exceeded
- 60-second backoff on resource exhaustion
- Optimized prompt engineering to minimize API calls
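
The retry behaviour described above could look roughly like the following. This is a minimal sketch under stated assumptions, not DocCraft's actual code: `RateLimitError` is a local stand-in for whatever exception the Gemini client raises on quota exhaustion, and `call_with_retry`, `max_attempts`, and the injectable `sleep` parameter are hypothetical.

```python
import time


class RateLimitError(Exception):
    """Stand-in for the client's rate-limit / resource-exhausted error (assumption)."""


def call_with_retry(func, max_attempts=3, backoff_seconds=60, sleep=time.sleep):
    """Call func(), waiting backoff_seconds after each rate-limit failure.

    Re-raises the error once max_attempts calls have all failed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except RateLimitError:
            if attempt == max_attempts:
                raise
            sleep(backoff_seconds)
```

Accepting `sleep` as a parameter lets tests substitute a recorder such as `waits.append` so they do not actually pause for 60 seconds.
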
API Key Not Found
Error: Gemini API key not found
Solution: Ensure GEMINI_API_KEY is set in your environment variables

File Upload Issues
Error: Could not process DOCX file
Solution: Ensure the file is a valid .docx format (not .doc)

Memory Issues with Large Documents
Solution: Break large documents into smaller sections

- Support for PDF files
- Advanced style customization
- Bulk document processing
- Integration with Google Docs
- Custom style templates
- API endpoint for programmatic access
This is a proprietary project. For collaboration opportunities or feature requests, please contact the project maintainer.
This project is proprietary software. See LICENSE file for details.
For support, feature requests, or licensing inquiries:
- Email: pallavrai8953@gmail.com
- Issues: GitHub Issues
- Time-Saving: Format documents in seconds, not hours
- Consistency: Ensure uniform styling across all documents
- AI-Powered: Leverage cutting-edge AI for intelligent formatting
- Professional: Create polished, professional-looking documents
- Easy to Use: No technical expertise required
Made with ❤️ for document formatting excellence