Developers often need to search large codebases for snippets that implement a specific piece of functionality or concept. Traditional keyword-based search does not capture the semantics of code or of natural language queries, so searches are slow and often miss relevant results. Contextual Code Search addresses this with semantic search over vector embeddings, which retrieves code by meaning rather than by exact keyword match.
- Semantic Search: Find relevant code snippets based on natural language queries.
- Advanced Embeddings: Utilizes state-of-the-art transformer models to generate code embeddings.
- API Service: Integrates search functionality into development environments via a FastAPI service.
- Efficient Storage & Retrieval: Indexes embedding vectors and runs similarity search over them with Faiss.
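To make the idea of semantic search over embeddings concrete, here is a minimal, dependency-free sketch of the ranking step. It is an illustration under stated assumptions, not the project's implementation: the three-dimensional vectors are hand-written stand-ins for what a transformer model would produce, the brute-force loop stands in for a Faiss index, and the names `cosine_similarity`, `rank_snippets`, `SNIPPET_VECTORS`, and `QUERY_VECTOR` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_snippets(query_vector, snippet_vectors, top_k=2):
    """Return the top_k (snippet, score) pairs, most similar first."""
    scored = [
        (snippet, cosine_similarity(query_vector, vector))
        for snippet, vector in snippet_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy embeddings written by hand (assumption: a real model would emit
# vectors with hundreds of dimensions).
SNIPPET_VECTORS = {
    "dict(zip(keys, values))": [0.9, 0.1, 0.0],
    "sorted(items, key=len)": [0.1, 0.9, 0.1],
    "open(path).read()": [0.0, 0.1, 0.9],
}

# Stands in for the embedding of "transform a list into a dictionary".
QUERY_VECTOR = [0.8, 0.2, 0.1]

for snippet, score in rank_snippets(QUERY_VECTOR, SNIPPET_VECTORS):
    print(f"{score:.3f}  {snippet}")
```

The snippet whose vector points in nearly the same direction as the query ranks first, even though the query text shares no keywords with it. A vector index such as Faiss serves the same purpose as the loop in `rank_snippets`, but is built to stay fast on large collections.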
To set up the Contextual Code Search project, follow these steps:
- Clone the repository:

      git clone https://github.com/yourusername/contextual-code-search.git
      cd contextual-code-search

- Build the Docker container:

      docker build -t contextual-code-search .

- Run the Docker container:

      docker run -p 8000:8000 contextual-code-search
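The container may need a moment before the API answers requests, for example while the embedding model loads (an assumption; the startup time is not documented here). The sketch below polls a URL until the server returns any HTTP response. It uses only the standard library, and the helper name `wait_for_http` and its parameters are hypothetical.

```python
import time
import urllib.error
import urllib.request


def wait_for_http(url, timeout=30.0, interval=0.5, request_timeout=2.0):
    """Poll url until the server answers with any HTTP status.

    Returns True as soon as a response arrives, False if timeout
    seconds pass without one.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=request_timeout):
                return True
        except urllib.error.HTTPError:
            # The server answered with an error status (for example 404),
            # which still shows that it is up.
            return True
        except OSError:
            # Connection refused, reset, or timed out: not ready yet.
            time.sleep(interval)
    return False
```

With the port mapping above, a call such as `wait_for_http("http://localhost:8000/docs")` would be one way to use it, assuming the app keeps FastAPI's default `/docs` route. An HTTP-level check is used rather than a bare TCP connect because Docker's port forwarding can accept connections on the host port before the application inside the container is listening.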
Once the Docker container is running, you can start using the API for code search:

    import requests

    response = requests.post("http://localhost:8000/search", json={"query": "transform a list into a dictionary"})
    results = response.json()
    for result in results:
        print(result)

contextual-code-search/
│
├── app/
│ ├── main.py
│ ├── models.py
│ └── utils.py
│
├── data/
│ └── code_snippets.csv
│
├── tests/
│ ├── test_api.py
│ └── test_embeddings.py
│
├── Dockerfile
├── requirements.txt
└── README.md
- Python 3.11: As the primary programming language.
- Pandas: For data manipulation and analysis.
- Faiss: For efficient similarity search and clustering of dense vectors.
- Transformers: For generating vector embeddings using state-of-the-art models.
- FastAPI: For building the API service.
- Docker: For containerization and easy deployment.
We welcome contributions from the community! To contribute, please follow these steps:
- Fork the repository.
- Create a new branch (git checkout -b feature-branch).
- Make your changes and commit them (git commit -m 'Add new feature').
- Push to the branch (git push origin feature-branch).
- Open a Pull Request.
Please ensure that your code follows the project's coding standards and includes tests for any new features.
This project is licensed under the MIT License. See the LICENSE file for more information.