SPAA: Scientific Proposal Abstract Analyzer

Version 1.1.0 - November, 2025. Monterrey

Description

SPAA is an automated evaluator of scientific abstracts that integrates narrative and rhetorical analysis with domain-dependent scientific and technical verification. It applies linguistic, semantic, and structural rules, together with curated domain-specific lexicons, to assess the quality of the background, hypothesis, methodology, expected outcomes, and impact sections. Using NLP (spaCy), SPAA generates a report with expert-level feedback on the abstract's coherence, rigor, and scientific soundness.

Purpose

SPAA evaluates whether a research abstract:

States a clear problem, context, and knowledge gap.
Formulates a specific and domain-relevant hypothesis.
Describes a coherent methodological plan.
Defines expected outcomes aligned with the study goals.
Identifies scientific or societal impact.
Uses domain-appropriate terminology based on curated lexicons.
Employs proper academic tone, causal structure, and strong action verbs.

In addition, SPAA produces a consistent, reproducible evaluation report and generates a concise summary of the abstract based on keyword relevance.

Project Structure

SPAA/
├── abstract_validator.py               # Orchestrates the validation pipeline (main OOP engine)
├── loader.py                           # Class to load input text and configuration files
├── background_analysis.py              # Background section validator
├── hypothesis_analysis.py              # Hypothesis section validator
├── methodology_analysis.py             # Methodology section validator
├── outcomes_analysis.py                # Expected Outcomes section validator
├── impact_analysis.py                  # Impact section validator
├── ethics_analysis.py                  # Ethics validator
├── summarizer.py                       # Summarizes the abstract automatically
├── bloom_detection.py                  # Bloom's Taxonomy detection utilities
├── lexicon/
│   ├── <tag_1>/lexicon_<tag_1>.csv     # Lexicon list for <tag_1>
│   ├── <tag_2>/lexicon_<tag_2>.csv     # Lexicon list for <tag_2>
│   └── <tag_N>/lexicon_<tag_N>.csv     # Lexicon list for <tag_N>
├── config/
│   ├── config_keywords.json            # Keyword lists and synonyms
│   └── config_weights.json             # Scoring weights per section
├── input_data/
│   └── abstract_file.txt               # Abstract to be evaluated (structured with # sections)
├── output/
│   └── output_results.txt              # Generated validation report
└── README.md
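
To illustrate what "scoring weights per section" could mean in practice, here is a minimal sketch that combines per-section percentage scores into one weighted average. The flat `{"section": weight}` layout, the weight values, and the function name `overall_score` are assumptions made for illustration; the actual schema of config_weights.json and the way SPAA aggregates scores are not documented here.

```python
import json


def overall_score(section_scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of per-section percentage scores.

    Sections without a weight contribute nothing; an all-zero weight
    total returns 0.0 instead of dividing by zero.
    """
    total_weight = sum(weights.get(name, 0.0) for name in section_scores)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(
        score * weights.get(name, 0.0) for name, score in section_scores.items()
    )
    return weighted_sum / total_weight


# Hypothetical weights, standing in for the contents of config/config_weights.json.
weights = json.loads(
    '{"background": 0.2, "hypothesis": 0.25, "methodology": 0.25,'
    ' "outcomes": 0.15, "impact": 0.15}'
)

# Hypothetical per-section scores (%), as a validator might produce them.
scores = {
    "background": 80.0,
    "hypothesis": 60.0,
    "methodology": 70.0,
    "outcomes": 90.0,
    "impact": 50.0,
}

print(f"Overall score: {overall_score(scores, weights):.1f}%")
```

Normalizing by the total weight means the weights in the JSON file would not need to sum to exactly 1.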

Dependencies

Python 3.10
spaCy

Install dependencies with:

pip install spacy
python -m spacy download en_core_web_sm

How to Run

From the root folder:

python abstract_validator.py --tag name_of_lexicon
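
The `--tag` value selects one of the lexicon folders shown in the project structure. As a rough sketch of how such a flag could be parsed and mapped to `lexicon/<tag>/lexicon_<tag>.csv`, consider the following. The helper names `parse_args` and `lexicon_path` and the tag `materials` are hypothetical; the real handling inside abstract_validator.py may differ.

```python
import argparse
from pathlib import Path


def lexicon_path(tag: str, root: Path = Path("lexicon")) -> Path:
    """Build the CSV path for a tag, following lexicon/<tag>/lexicon_<tag>.csv."""
    return root / tag / f"lexicon_{tag}.csv"


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the command line; argv=None falls back to sys.argv."""
    parser = argparse.ArgumentParser(description="Validate a proposal abstract.")
    parser.add_argument("--tag", required=True, help="Name of the domain lexicon to use")
    return parser.parse_args(argv)


# Explicit argv so the example runs without real command-line arguments.
args = parse_args(["--tag", "materials"])  # "materials" is a hypothetical tag
path = lexicon_path(args.tag)
print(path.as_posix())
if not path.exists():
    print(f"No lexicon found for tag '{args.tag}'")
```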

Make sure your input file (abstract_file.txt) inside input_data/ follows this format:

# background
[Text about the background]

# hypothesis
[Text about the hypothesis]

# methodology
[Text about the planned methods]

# outcomes
[Text about the expected outcomes]

# impact
[Text about the long-term impact]

# keywords
keyword1, keyword2, keyword3, ..., keyword5
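
For readers who want to check their input file before running SPAA, the format above can be parsed with a few lines of Python. This is an illustrative sketch, not the code in loader.py; the function name `parse_sections` and the sample text are assumptions.

```python
def parse_sections(text: str) -> dict[str, str]:
    """Split a '# section' structured abstract into {section_name: body}."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            # A header line starts a new section; names are lowercased.
            current = stripped.lstrip("#").strip().lower()
            sections[current] = []
        elif current is not None:
            sections[current].append(stripped)
    # Join non-empty lines of each section into a single string.
    return {
        name: " ".join(part for part in lines if part)
        for name, lines in sections.items()
    }


# Hypothetical two-section sample in the expected format.
sample = (
    "# background\n"
    "Soil salinity limits crop yields.\n"
    "\n"
    "# keywords\n"
    "salinity, yield, soil\n"
)

parsed = parse_sections(sample)
keywords = [kw.strip() for kw in parsed["keywords"].split(",")]
print(parsed["background"])
print(keywords)
```

A quick check such as `set(parsed) >= {"background", "hypothesis", "methodology", "outcomes", "impact", "keywords"}` would then confirm that all expected sections are present.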

Output

SPAA prints a console message and saves a report under output/ containing:

Validation feedback section by section
Scores (%) for each evaluated part
Detection of strong/weak scientific tone
Detection of Bloom's action verbs (and their level)

Example console output:

Keywords matched: 3 of 5
Validation completed. Results saved to: output/output_results.txt
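
A count such as "Keywords matched: 3 of 5" can be produced by checking which declared keywords occur in the abstract text. The sketch below assumes case-insensitive whole-word matching against the body text; SPAA's actual matching rule (for example, whether it uses the lexicon or synonyms) is not specified here, and the function name `match_keywords` and the sample data are hypothetical.

```python
import re


def match_keywords(keywords: list[str], body: str) -> list[str]:
    """Return the keywords that occur in body as whole words, ignoring case."""
    body_lower = body.lower()
    return [
        kw
        for kw in keywords
        if re.search(r"\b" + re.escape(kw.lower()) + r"\b", body_lower)
    ]


# Hypothetical keywords and abstract text.
keywords = ["graphene", "biosensor", "glucose", "impedance", "nanoparticle"]
body = "We propose a graphene biosensor that measures glucose in sweat."

matched = match_keywords(keywords, body)
print(f"Keywords matched: {len(matched)} of {len(keywords)}")
```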

Notes

All keyword rules and scoring weights can be customized via JSON files (config/ folder).
Bloom verb detection includes synonyms to ensure broader linguistic coverage.
SPAA is designed for scientific proposal abstracts; it is not intended for finalized manuscripts.
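
The Bloom verb detection mentioned above can be pictured as a lookup from verbs (and their synonyms) to taxonomy levels. The following sketch uses a small hand-written verb table and plain regular-expression tokenization instead of spaCy, so it only matches exact word forms and does no lemmatization. The table contents and the name `detect_bloom_verbs` are assumptions, not the contents of bloom_detection.py or config_keywords.json.

```python
import re

# Hypothetical verb table: Bloom level -> verbs and synonyms.
BLOOM_VERBS = {
    "remember": {"define", "list", "identify"},
    "understand": {"describe", "explain", "summarize"},
    "apply": {"apply", "implement", "use"},
    "analyze": {"analyze", "compare", "examine"},
    "evaluate": {"evaluate", "assess", "validate"},
    "create": {"design", "develop", "synthesize"},
}


def detect_bloom_verbs(text: str) -> list[tuple[str, str]]:
    """Return (verb, level) pairs in order of appearance in text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    found = []
    for token in tokens:
        for level, verbs in BLOOM_VERBS.items():
            if token in verbs:
                found.append((token, level))
    return found


hits = detect_bloom_verbs("We will design a sensor and evaluate its accuracy.")
for verb, level in hits:
    print(f"{verb}: {level}")
```

With a lemmatizer in front of the lookup, inflected forms such as "designed" or "evaluating" would map to the same entries.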

Future Extensions

Domain taxonomies: High-level ontologies to classify concepts and detect topic alignment.
Multi-domain lexicon fusion: Combine lexicons for cross-disciplinary abstracts.
Multi-language support: Extend the analysis to abstracts in other languages (e.g., Spanish).

Author

Developed by Flavio F. Contreras-Torres (Tecnológico de Monterrey)
Monterrey, Mexico - November 2025

Versions

v.1.0.0 - April 2025. Oviedo, Spain
v.1.1.0 - November 2025. Monterrey, Mexico

License

This project is licensed under the terms of the MIT License.
See the LICENSE file for full details.
