🎧 AudioFlow AI

A web application that converts PDF documents into spoken audiobooks using neural text-to-speech voices.

AudioFlow turns your PDF files into MP3 audiobooks you can listen to instead of read. It pairs a dark-mode glassmorphism interface with Microsoft Edge neural text-to-speech voices to produce natural-sounding intonation.

✨ Features

Ultra-Realistic Neural TTS: Uses Microsoft Edge Neural TTS (edge-tts) for natural-sounding, high-fidelity AI voices (Aria, Swara, Elvira, Denise).
Intelligent Text Processing Engine: Automatically repairs hyphenated words broken across lines and cleans the extracted text so that sentences are read continuously, without mid-word pauses.
Premium Glassmorphism UI: A stunning, modern, and interactive frontend complete with animated gradient orbs, frosted glass components, and an AI-wave loader.
Robust PDF Text Extraction: Leverages PyMuPDF to accurately extract text, even from documents with complex layouts.
Customizable Playback: Supports multiple languages (English, Hindi, Spanish, French) and two reading speeds (Normal and Slow).
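
The hyphenation repair described in the feature list can be sketched with a few regular expressions. This is a minimal illustration and not the code in app.py; the function name clean_extracted_text is hypothetical. It assumes that a hyphen at the end of a line always marks a broken word, so a genuinely hyphenated compound split at its hyphen (for example "well-" followed by "known") would be joined too.

```python
import re


def clean_extracted_text(raw: str) -> str:
    """Join words hyphenated across line breaks and tidy whitespace.

    Hypothetical helper for illustration; not taken from app.py.
    """
    # "exam-\nple" -> "example": a hyphen followed by a line break between
    # two word characters is treated as a broken word.
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", raw)
    # A single line break inside a paragraph becomes a space; blank lines
    # (paragraph breaks) are left alone.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Collapse runs of spaces and tabs into one space.
    text = re.sub(r"[ \t]+", " ", text)
    return text.strip()
```

Cleaning the text before synthesis matters because a TTS engine reads a stray hyphen or line break as a pause, which splits the word audibly.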

🛠️ Tech Stack

Backend: Python 3, Flask
Text Extraction: PyMuPDF (fitz)
AI Audio Generation: edge-tts (Microsoft Neural TTS)
Frontend: HTML5, CSS3 (Custom Glassmorphism + Phosphor Icons)

📂 Project Structure

AudioFlow/
├── app.py                  # Main Flask application logic (AI TTS + Text Engine)
├── requirements.txt        # Python dependencies
├── README.md               # Project documentation
├── static/
│   └── style.css           # Premium glassmorphism styles and animations
└── templates/
    └── index.html          # UI template with custom SVG icons

⚙️ Installation & Setup

Follow these steps to run the application locally on your machine.

1. Clone the Repository

git clone https://github.com/abhranilsingharoy-cloud/AudioFlow.git
cd AudioFlow

2. Create a Virtual Environment (Recommended)

Windows:

python -m venv venv
venv\Scripts\activate

macOS/Linux:

python3 -m venv venv
source venv/bin/activate

3. Install Dependencies

pip install -r requirements.txt

4. Run the Application

python app.py

5. Access the Web App

Open your web browser and navigate to: http://127.0.0.1:5000/

📖 Usage Guide

Upload PDF: Click the upload box or drag and drop your PDF file into the designated area.
Select Neural Voice: Choose your preferred AI voice (e.g., English - Aria, Hindi - Swara).
Select Speed: Choose between "Normal" or "Slow" synthesis speeds.
Synthesize Audio: Click the "Synthesize Audio" button to let the neural engine process your document.
Listen: The high-quality MP3 file will be downloaded automatically once processing is complete.
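
Behind the voice and speed selectors, the choices have to be translated into edge-tts parameters. The sketch below shows one way this could look. The label-to-voice mapping, the "-25%" rate for "Slow", and the function names are assumptions for illustration, not values read from app.py; only the "English - Aria" and "Hindi - Swara" labels come from this guide.

```python
# Assumed mapping from UI labels to edge-tts voice short names.
VOICES = {
    "English - Aria": "en-US-AriaNeural",
    "Hindi - Swara": "hi-IN-SwaraNeural",
    "Spanish - Elvira": "es-ES-ElviraNeural",  # label assumed
    "French - Denise": "fr-FR-DeniseNeural",   # label assumed
}

# edge-tts takes the rate as a signed percentage string.
# The exact value used for "Slow" is an assumption.
RATES = {"Normal": "+0%", "Slow": "-25%"}


def tts_options(voice_label: str, speed_label: str) -> tuple[str, str]:
    """Translate the UI selections into (voice, rate) for edge-tts."""
    return VOICES[voice_label], RATES[speed_label]


async def synthesize(text: str, voice_label: str, speed_label: str, out_path: str) -> None:
    """Write the synthesized speech to out_path as an MP3 file."""
    import edge_tts  # third-party; imported here so the mapping works without it

    voice, rate = tts_options(voice_label, speed_label)
    await edge_tts.Communicate(text, voice, rate=rate).save(out_path)
```

With edge-tts installed, a call such as asyncio.run(synthesize("Hello", "English - Aria", "Normal", "out.mp3")) would produce the audio file.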

🛡️ Limitations

Scanned PDFs: The current text extraction method does not support OCR. Scanned images or image-only PDFs will not yield text.
File Size: Very large PDFs may take noticeably longer to process, because the Neural TTS engine sends the text to a cloud service to synthesize the audio.
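
Because scanned PDFs yield no text, it can help to detect them before starting synthesis and show a clear message instead of producing an empty audiobook. The following is a rough sketch under stated assumptions: the helper names and the 20-character threshold are hypothetical, and app.py may handle this case differently.

```python
def looks_scanned(page_texts: list[str], min_chars: int = 20) -> bool:
    """Return True when no page contains a meaningful amount of text.

    The min_chars threshold is an arbitrary assumption for illustration.
    """
    return all(len(text.strip()) < min_chars for text in page_texts)


def extract_page_texts(pdf_path: str) -> list[str]:
    """Extract the plain text of each page with PyMuPDF."""
    import fitz  # PyMuPDF; imported here so looks_scanned works without it

    with fitz.open(pdf_path) as doc:
        return [page.get_text() for page in doc]
```

A caller could run looks_scanned(extract_page_texts(path)) right after upload and reject the file early if it returns True.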

🤝 Contributing

Contributions make the open-source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

Fork the Project
Create your Feature Branch (git checkout -b feature/AmazingFeature)
Commit your Changes (git commit -m 'Add some AmazingFeature')
Push to the Branch (git push origin feature/AmazingFeature)
Open a Pull Request

📄 License

This project is proprietary and confidential. Unauthorized copying, modification, or distribution of this software, via any medium, is strictly prohibited without the express written permission of the author.

Designed and Developed by Abhranil Singha Roy
