Papyrus is a web application that analyzes research paper abstracts and recommends the most relevant papers to users.
It uses Natural Language Processing (NLP) and data mining techniques to measure semantic similarity between papers and deliver context-aware recommendations.
- Upload or input a research paper abstract
- NLP-based similarity analysis using TF-IDF or Sentence-BERT
- Retrieve the most relevant papers from a large dataset (e.g., arXiv)
- Topic modeling and trend detection
- RESTful API built with Django REST Framework
- Frontend built with React for a dynamic user experience
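
The TF-IDF similarity step listed above can be sketched as follows. This is a minimal, dependency-free illustration and not the project's actual code: the function names (`tokenize`, `tfidf_vectors`, `cosine`, `recommend`) are hypothetical, and the IDF is fitted on the corpus together with the query for simplicity. In the real stack, Scikit-learn's `TfidfVectorizer` would likely take the place of the hand-written weighting.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(documents):
    """Return one sparse {term: weight} TF-IDF vector per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Smoothed IDF keeps a positive weight for terms found in every document.
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(query, corpus, top_k=3):
    """Rank corpus abstracts by TF-IDF cosine similarity to the query abstract."""
    vectors = tfidf_vectors(corpus + [query])
    query_vec = vectors[-1]
    scored = [(cosine(query_vec, vec), idx) for idx, vec in enumerate(vectors[:-1])]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Return (corpus index, rounded score) pairs, best match first.
    return [(idx, round(score, 3)) for score, idx in scored[:top_k]]
```

Calling `recommend(abstract, corpus_abstracts, top_k=5)` returns index and score pairs, which an API view could map back to paper records. Sentence-BERT would replace the sparse vectors with dense embeddings while keeping the same ranking step.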
Backend: Django, Django REST Framework, Python
Frontend: React, HTML, CSS, JavaScript
Machine Learning: Scikit-learn, Sentence-BERT, Pandas, NumPy
Database: SQLite / PostgreSQL
The system uses freely available public datasets such as:
- arXiv Metadata Dataset
- Semantic Scholar Open Research Corpus
- DBLP Computer Science Bibliography
- CiteSeerX Dataset
- Microsoft Academic Graph (MAG)
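
As a rough sketch of how such a dataset might be read, the snippet below streams records from a JSON-lines file (one JSON object per line). The field names `id`, `title`, `abstract`, and `categories` are assumptions about the metadata layout, and `iter_abstracts` is a hypothetical helper; check the schema of the dataset you download before relying on it.

```python
import json


def iter_abstracts(lines, category_prefix=None):
    """Yield (paper_id, title, abstract) tuples from JSON-lines records.

    `lines` is any iterable of strings, such as an open file handle.
    The field names used here are assumed, not taken from the project.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        categories = record.get("categories", "").split()
        if category_prefix and not any(
            cat.startswith(category_prefix) for cat in categories
        ):
            continue  # keep only papers in the requested category family
        # Collapse hard line breaks and repeated spaces inside the text fields.
        title = " ".join(record.get("title", "").split())
        abstract = " ".join(record.get("abstract", "").split())
        yield record.get("id"), title, abstract
```

Because the function is a generator, a large metadata file can be processed one record at a time, for example `for paper_id, title, abstract in iter_abstracts(open(path, encoding="utf-8"), "cs."):` where `path` is wherever the file was saved.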
- Add user profile-based personalized recommendations
- Integrate FAISS or Elasticsearch for faster similarity search
- Include citation network visualization
- Implement research trend analytics and keyword frequency charts
- Deploy using Render / Vercel with CI/CD integration
- Add PDF abstract extraction using PyMuPDF or spaCy
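
For the planned keyword frequency analytics, one possible starting point is a per-year term count. This is only a sketch under assumptions: `keyword_counts_by_year` and the tiny `STOPWORDS` set are hypothetical, and the input is assumed to be `(year, abstract)` pairs.

```python
import re
from collections import Counter, defaultdict

# Deliberately small stopword list, for illustration only.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "we", "with"}


def keyword_counts_by_year(papers, top_n=3):
    """Map each year to its top_n most frequent non-stopword terms."""
    counts = defaultdict(Counter)
    for year, abstract in papers:
        words = re.findall(r"[a-z]+", abstract.lower())
        counts[year].update(word for word in words if word not in STOPWORDS)
    # Sort by year so the result is ready to plot as a time series.
    return {year: counter.most_common(top_n) for year, counter in sorted(counts.items())}
```

The returned dictionary maps each year to a list of `(term, count)` pairs, which a charting component on the frontend could render directly.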
Author: Maruf Hossain
Dept. of Computer Science and Engineering (CSE)
Green University of Bangladesh
Email: maruf.bshs@gmail.com