This project combines HTML, CSS, and JavaScript into a web application powered primarily by Flask. Instead of Watson's API, it uses Google's speech-to-text and text-to-speech services, which reduces complexity while maintaining functionality. These APIs let users interact with the website by voice. Template inheritance through Jinja2 keeps design and functionality consistent across multiple pages. The project also uses Python's NLTK library for natural language processing tasks and integrates with Google Cloud services for additional functionality and scalability. Overall, the project embodies an embeddable AI concept, blending these technologies to deliver a sophisticated yet user-friendly web experience.
Set Up a Virtual Environment
- Ensure you have Python installed.
- Create a virtual environment:
python -m venv venv
- Activate the virtual environment:
- On Windows:
venv\Scripts\activate
- On macOS/Linux:
source venv/bin/activate
Install Required Libraries
- Install the necessary libraries from the requirements.txt file:
pip install -r requirements.txt
Set Up Google Cloud Services
- Sign up for Google Cloud services.
- Enable Speech-to-Text and Text-to-Speech APIs.
- Obtain API keys and set up authentication as per Google Cloud's documentation.
Run the Application
- Start the Flask server:
flask run
- Open your web browser and navigate to http://127.0.0.1:5000/ to view the application.
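If the server fails to start or the speech features do not respond, a small pre-flight check can help narrow down which setup step was missed. The sketch below is not part of the project: `in_virtual_env` and `missing_settings` are hypothetical helper names, and it assumes authentication goes through a service-account key file referenced by the `GOOGLE_APPLICATION_CREDENTIALS` environment variable. If your setup uses a plain API key instead, that check would need to change.

```python
import os
import sys


def in_virtual_env() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def missing_settings(env=None) -> list:
    """List setup problems found (hypothetical helper, not project code)."""
    env = os.environ if env is None else env
    problems = []
    # Assumption: credentials come from a service-account key file whose
    # path is stored in GOOGLE_APPLICATION_CREDENTIALS.
    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not key_path:
        problems.append("GOOGLE_APPLICATION_CREDENTIALS is not set")
    elif not os.path.isfile(key_path):
        problems.append("credentials file not found: " + key_path)
    return problems


if __name__ == "__main__":
    if not in_virtual_env():
        print("Warning: no virtual environment is active")
    for problem in missing_settings():
        print("Warning:", problem)
```

Running the file directly prints a warning for each problem it finds and prints nothing when both checks pass.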
- CognitiveClass.ai's course "Improve Customer Support with AI-powered Voice Services"
- Google Cloud's Speech-to-Text documentation
- Speech Recognition Accuracy: Sometimes the AI model fails to recognize spoken words, especially while recording customer addresses. Consider training the model with more diverse datasets.
- Bug Fixes: Occasionally, clicking the continue button results in an AttributeError in the clean_text() function due to the AI model failing to store processed words.
- Additional Input Methods: Adding alternate methods of input, such as text, could enhance usability.
- Enhancing Address Recognition: To improve address recording, add specific stop words relevant to your location, similar to how pizza size and toppings are handled. This will make the AI model more specific and accurate.
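One possible way to approach the last two points is sketched below. It is an illustration only, not the project's actual clean_text(): the signature, the assumption that the AttributeError comes from a missing (None) transcript, and the ADDRESS_STOP_WORDS set are all hypothetical. A plain Python set stands in for NLTK's stop word corpus to keep the example self-contained.

```python
import re

# Hypothetical stop words for an address prompt. The right list depends on
# your location and on how customers phrase their answers.
ADDRESS_STOP_WORDS = {"my", "address", "is", "i", "live", "at", "on", "the", "please"}


def clean_text(transcript, stop_words=ADDRESS_STOP_WORDS):
    """Lowercase a transcript, then drop punctuation and stop words.

    Returns an empty list when the recognizer produced nothing, instead of
    raising AttributeError on a None value.
    """
    if not isinstance(transcript, str):
        return []
    # Keep only runs of letters and digits; this discards punctuation.
    words = re.findall(r"[a-z0-9]+", transcript.lower())
    return [word for word in words if word not in stop_words]


print(clean_text("My address is 42 Baker Street, please."))
print(clean_text(None))
```

With this shape, the caller can treat an empty result as a cue to ask the customer to repeat the address or to fall back to typed input.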
Feel free to reach out for any further assistance or queries! ✉️