Scopus Survey API

Web API for bibliographic survey of Scopus articles

🌐 Languages: Read in Portuguese [pt-BR]

Instituto Federal de Educação, Ciência e Tecnologia de Mato Grosso do Sul • IFMS Campus Três Lagoas
Tecnologia em Análise e Desenvolvimento de Sistemas • TADS

Federal Institute of Education, Science and Technology of Mato Grosso do Sul
Technology in Systems Analysis and Development

Data provided by Scopus® • © Elsevier

Documentation: https://mauprogramador.github.io/scopus-survey-api/
Web API: http://127.0.0.1:8000/v2/scopus-survey/en-US/search-articles
Swagger UI: http://127.0.0.1:8000/

1. Overview

This web API is designed, within its limitations, to perform systematic bibliographic surveys using data from the Scopus database, promoting access to relevant and high-quality bibliographic sources through a simple and well-documented interface, thus reducing the initial barrier to entry for students and academics.

As a free, non-commercial academic automation tool, the application integrates multiple selection criteria, including multiple query parameters, keyword combinations, and Boolean search, with mechanisms for retrieval, validation, serialization, and customized filtering of large volumes of data from the Scopus APIs.

This way, only the most relevant and recent data will be retained and returned in a CSV file, making it suitable for bibliometric studies and surveys, research, systematic reviews, etc., and allowing students to quickly gather a set of peer-reviewed literature sources for a thesis or project.

2. Configuration

Create a .env file to configure the following options:

Parameter	Description	Default
`HOST`	Sets the host address to listen on	`127.0.0.1`
`PORT`	Sets the server port on which the application will run	`8000`
`RELOAD`	Enables auto-reload on file changes for local development	`false`
`WORKERS`	Sets the number of worker processes	`1`
`LOGGING_FILE`	Enables saving logs to files	`false`
`DEBUG`	Enables debug mode and debug logs	`false`
`PROGRESS_BAR`	Displays the progress bar of the request process	`true`

The RELOAD and WORKERS options are mutually exclusive.
Setting the HOST to 0.0.0.0 makes the application externally available.

Note

The address 0.0.0.0 is not a valid domain for the Cross-Origin-Opener-Policy; use localhost instead.

Set WORKERS (maximum 4) to start multiple server processes.
In production, RELOAD, DEBUG, and PROGRESS_BAR are automatically disabled.

Tip

Take a look at the .env.example file.

3. Run

3.1. Set Up a Python Venv

You will need Python 3.12 with Pip and Venv installed.

# Create new Venv (.venv)
make venv

# Activate Venv
source .venv/bin/activate

3.2. Development (Poetry)

Install all dependency groups (app, dev, tests, docs) with Poetry, and run with Uvicorn.

# Install all dependencies groups from pyproject.toml with Poetry
(.venv) make install-dev

# Run with Poetry
(.venv) make run-dev

3.3. Production (Pip)

Install only the main dependencies (app) with Pip, and run with Gunicorn.

# Install only main dependencies from requirements.txt with Pip
(.venv) make install-prod

# Run with Gunicorn
(.venv) make run-prod

3.4. Docker

You will need Docker installed. Build the scopus-survey-api image from the Dockerfile, install only the main dependencies from requirements.txt with Pip, and run with Uvicorn.

# Run in Docker Container from Dockerfile
make docker

# Follow and show the last logs
make docker-logs

4. Important Information

4.1. Data Source

We declare that all use of the Scopus® database and its APIs, owned and maintained by © Elsevier B.V., is intended only for non-commercial academic research, without implying endorsement or affiliation, and is subject to our Terms, as well as Elsevier's Terms and Scopus's Policy. All data we handle is retrieved and obtained "AS IS" and, therefore, despite its known reliability, we do not guarantee or assume responsibility for any errors or inaccuracies in the data in the Scopus database.

Caution

You are strictly prohibited from misusing or attempting to misuse data obtained from the Scopus APIs in violation of the Elsevier API Service Agreement.

4.2. Data Manipulation

In general, the data will be preserved without any direct alteration; however, since it is obtained "AS IS", it will need to be properly validated based on the HTTP response fields from the APIs:

Those that returned a value will be kept as is;
Those that did not return any value will be set to "null" by default;
The "authors" field will be set to the first author ("dc:creator") or all authors ("authors") concatenated, depending on what is returned.
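As an illustration of these three rules, here is a minimal sketch. The function names are hypothetical, and we assume a simplified entry shape in which "authors" is already a plain list of names; the real Scopus response nests this information more deeply. The "; " separator is also an assumption.

```python
from typing import Any

NULL = "null"  # default placeholder for fields that returned no value


def normalize_entry(entry: dict[str, Any], fields: list[str]) -> dict[str, Any]:
    """Keep returned values as is; set missing or empty fields to "null"."""
    row = {}
    for field in fields:
        value = entry.get(field)
        is_empty = value is None or value == "" or value == []
        row[field] = NULL if is_empty else value
    return row


def resolve_authors(entry: dict[str, Any]) -> str:
    """Use all authors when present, otherwise fall back to the first author."""
    names = entry.get("authors")  # assumed shape: list of author names
    if names:
        return "; ".join(names)
    return entry.get("dc:creator") or NULL
```

For example, an entry that only carries "dc:creator" yields that single name, and an entry with neither field yields "null".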

Finally, the documents will be filtered and removed in the following order:

Exact duplicates, where the first one will be kept.
Exactly the same title and same author(s), where the first one will be kept.
Same author(s) with similar titles, where the one with the most recent publication date will be kept.
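The filtering order above can be sketched as follows. This is only an illustration under stated assumptions: each document is a flat dictionary with hypothetical "title", "authors", and "date" keys (dates as ISO strings), and title similarity is measured with the standard library's `difflib.SequenceMatcher` scaled to 0-100, which may differ from the metric the application actually uses. The default threshold of 80 is likewise an assumption.

```python
from difflib import SequenceMatcher


def title_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity between two titles, on a 0-100 scale."""
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def filter_documents(docs: list[dict[str, str]], threshold: float = 80.0) -> list[dict[str, str]]:
    # Step 1: drop exact duplicates, keeping the first occurrence.
    seen_docs: set[tuple] = set()
    unique = []
    for doc in docs:
        key = tuple(sorted(doc.items()))
        if key not in seen_docs:
            seen_docs.add(key)
            unique.append(doc)

    # Step 2: drop documents with the same title and author(s), keeping the first.
    seen_pairs: set[tuple[str, str]] = set()
    distinct = []
    for doc in unique:
        pair = (doc["title"], doc["authors"])
        if pair not in seen_pairs:
            seen_pairs.add(pair)
            distinct.append(doc)

    # Step 3: same author(s) with similar titles, keeping the most recent one.
    kept: list[dict[str, str]] = []
    for doc in distinct:
        for index, other in enumerate(kept):
            same_authors = doc["authors"] == other["authors"]
            if same_authors and title_similarity(doc["title"], other["title"]) >= threshold:
                if doc["date"] > other["date"]:  # ISO dates compare correctly as strings
                    kept[index] = doc
                break
        else:
            kept.append(doc)
    return kept
```

Running the steps in this order means the cheap exact comparisons shrink the list before the pairwise similarity pass, which is the most expensive step.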

4.3. Search

In accordance with the API Service Agreement and Use Policies, Elsevier will issue you an API Key that grants you a limited license to use the Scopus APIs, so that you can properly authenticate to query the Scopus database. It can be obtained by accessing the Elsevier Developer Portal and registering. If you are part of an educational institution, you can try signing in using your organization's or academic email.

About the fields we use in the search to produce more relevant results:

The combined field "TITLE-ABS-KEY" to simultaneously search for keyword combinations in abstracts, keywords, and titles, and retrieve the literature where they are found.
The "date" and "sort" fields to delimit the period of interest for publications, and sort by year and date of publication and by relevance.
Other optional additional fields that we can send by combining them with the Boolean operator "AND", such as subject area and language.

The searches will be conducted as follows:

Retrieve the total number of results found for each keyword combination, concatenating them with the Boolean operator "AND".
Effectively perform the final search with the selected combination and obtain the Scopus ID of each result in the pagination.
Retrieve a complete dataset with comprehensive metadata by Scopus ID, obtaining all fields with relevant bibliographic information for each result of the previous search.

4.4. Institutional Network

Please be aware that the API Key will only authenticate correctly if you submit it while using your academic institution's network, which must be registered with Elsevier. This does not include VPN or proxy access. Therefore, if you are fully remote and off-campus, some data may not be returned.

4.5. Quota and Rate Limits

There's a maximum limit to the number of requests we can make to Scopus APIs using your API Key. This request quota resets every seven days, is unique to each API, and you can check its availability in the details panel after each operation. If requests exceed the quota or throttling rate, an error will be returned. See the API Key Settings.

Scopus API	Weekly Quota	Rate Limit
Search API	20,000	9 req/s
Abstract Retrieval API	10,000	9 req/s
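As a rough sketch of how the remaining quota could be read after an operation, the helper below parses rate-limit headers from a response. It assumes the response carries `X-RateLimit-Limit`, `X-RateLimit-Remaining`, and `X-RateLimit-Reset` (epoch seconds), as described in Elsevier's API Key Settings page; the function name and the returned dictionary shape are hypothetical.

```python
from datetime import datetime, timezone


def parse_quota(headers: dict[str, str]) -> dict[str, object]:
    """Summarize quota usage from the (assumed) X-RateLimit-* response headers."""
    limit = int(headers["X-RateLimit-Limit"])
    remaining = int(headers["X-RateLimit-Remaining"])
    # Assumption: the reset value is a Unix timestamp in seconds.
    resets_at = datetime.fromtimestamp(int(headers["X-RateLimit-Reset"]), tz=timezone.utc)
    return {
        "limit": limit,
        "remaining": remaining,
        "used": limit - remaining,
        "resets_at": resets_at,
    }
```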

4.6. Async HTTP Client

To avoid exceeding the API's request Rate Limit and Quota when making several requests, we built an asynchronous HTTP client with flow control and error handling mechanisms to handle this large volume of requests concurrently, while respecting the API limits based on the total number of requests to be made. We employ:

asyncio.Semaphore and asyncio.sleep to control concurrency and insert additional delays;
aiohttp.ClientSession and aiohttp.ClientTimeout to manage the client session, timeout, and connection;
aiolimiter.AsyncLimiter for rate limiting;
aiohttp_retry.RetryClient and aiohttp_retry.JitterRetry for automatic retry mechanisms, with jitter, backoff, and timeout.

For retries, up to 3 attempts will be made, and for rate limiting, a dynamic strategy will be used based on the number of requests to be made.

Requests	Rate Limit	Backoff Factor	Sleep	Concurrent Requests
100	8.0 req/s	2.0	0.0s	10
500	6.0 req/s	3.0	0.15s	5
1000	5.0 req/s	3.5	0.25s	3
2000	4.0 req/s	4.5	0.35s	2
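The table can be read as a lookup keyed on the number of requests. The sketch below shows one way to select a tier and apply its concurrency and sleep values using only `asyncio`; it is not the application's client. We assume each row is an upper bound ("up to N requests") and that larger workloads reuse the last row, and the `Strategy`, `select_strategy`, and `run_throttled` names are hypothetical. Rate limiting and retries (aiolimiter, aiohttp_retry) are left out to keep the example self-contained.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass(frozen=True)
class Strategy:
    max_requests: int      # assumed upper bound for this tier
    rate_limit: float      # requests per second
    backoff_factor: float
    sleep: float           # extra delay after each request, in seconds
    concurrency: int       # simultaneous requests allowed


STRATEGIES = (
    Strategy(100, 8.0, 2.0, 0.0, 10),
    Strategy(500, 6.0, 3.0, 0.15, 5),
    Strategy(1000, 5.0, 3.5, 0.25, 3),
    Strategy(2000, 4.0, 4.5, 0.35, 2),
)


def select_strategy(total_requests: int) -> Strategy:
    """Pick the first tier that covers the workload; fall back to the strictest."""
    for strategy in STRATEGIES:
        if total_requests <= strategy.max_requests:
            return strategy
    return STRATEGIES[-1]


async def run_throttled(jobs: list[Callable[[], Awaitable]], strategy: Strategy) -> list:
    """Run the jobs with bounded concurrency and a per-request delay."""
    semaphore = asyncio.Semaphore(strategy.concurrency)

    async def guarded(job: Callable[[], Awaitable]):
        async with semaphore:
            result = await job()
            await asyncio.sleep(strategy.sleep)  # additional spacing between requests
            return result

    return await asyncio.gather(*(guarded(job) for job in jobs))
```

Each job is a zero-argument callable returning an awaitable, so the request itself only starts once a semaphore slot is free.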

5. Fields

5.1. Form Multi-Steps

STEP 1 - API Key

In this step, you will need to enter the API Key, issued by Elsevier. This is the main parameter without which the application cannot be run, as it is necessary for correct authentication and use of the Scopus APIs. If you have already conducted a survey before, you can also try downloading the previously generated CSV file, which may still be stored.

STEP 2 - Additional Params

In this step, you can enter and select multiple fields that will be combined using the AND operator and sent, when filled in, as parameters to perform a Boolean query in the Scopus database and produce more relevant results.

STEP 3 - Keyword Combination

In this step, you must select keywords based on the theme or subject of your research. These keywords will be concatenated using the Boolean operator AND, generating all possible combinations. Finally, the total number of documents found in Scopus will be retrieved, searching abstracts, keywords, and titles for each combination, thus narrowing the scope of your search based on the chosen combination and the total results returned.

STEP 4 - Final Survey

In this final step, all filled fields will be submitted for the systematic survey, which removes duplicates and filters similar documents, leaving only the most relevant and recent data.

5.2. Required Fields

API Key: The API key issued by Elsevier, obtained by accessing the Elsevier Portal and registering.
Keywords: The keywords, from a minimum of two (required) to a maximum of four, that the documents you are searching for must contain. They must be written in English, with a maximum of 70 characters, and can contain letters, numbers, spaces, hyphens, underscores, phrases, wildcards, and the Boolean operators OR and AND NOT.

Warning

Because AND NOT can generate unexpected results, it should be used in the last field.

Combination: The keyword combination option that best suits your needs based on the total number of documents found.
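The keyword constraints can be sketched as a simple validator. This is an illustration only: the function name is hypothetical, we assume the 70-character limit applies to each keyword, and we assume phrases use double quotes and wildcards use `*` and `?`. The application's real validation rules may be stricter or looser.

```python
import re

MIN_KEYWORDS = 2
MAX_KEYWORDS = 4
MAX_LENGTH = 70  # assumed to apply per keyword

# Letters, digits, spaces, hyphens, underscores, double quotes (phrases),
# and the wildcard characters * and ? (assumed syntax).
KEYWORD_PATTERN = re.compile(r'[A-Za-z0-9 _\-"*?]+')


def validate_keywords(keywords: list[str]) -> list[str]:
    """Return the trimmed keywords, or raise ValueError if a rule is broken."""
    cleaned = [keyword.strip() for keyword in keywords if keyword.strip()]
    if not MIN_KEYWORDS <= len(cleaned) <= MAX_KEYWORDS:
        raise ValueError(f"expected {MIN_KEYWORDS} to {MAX_KEYWORDS} keywords")
    for keyword in cleaned:
        if len(keyword) > MAX_LENGTH:
            raise ValueError(f"keyword longer than {MAX_LENGTH} characters")
        if not KEYWORD_PATTERN.fullmatch(keyword):
            raise ValueError(f"unsupported characters in keyword: {keyword!r}")
    return cleaned
```

The operators OR and AND NOT consist only of letters and spaces, so they pass the character check without special handling.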

5.3. Optional Fields

Date: The range of years, from the last ten years to the current year, as the target of interest for published articles. By default, the last three years are considered.
Doctype: The type in which the document is classified.
Pubstage: The publication stage of the document.
Language: The language in which the original document was written.
Open Access: Whether the indexed content is open access or not.
Srctype: The type of source from which the document originates.
Subjarea: The subject area in which the document is classified.
Pages: Whether the document is short (up to 4 pages, such as research notes) or complete (5 pages or more), by the number of pages.
Similarity Threshold: The threshold value, in the range of 0 to 100, used to filter documents with the same author(s) and similar titles, keeping the one with the most recent publication date.

6. Results

6.1. Fields Retrieved

Mapped fields of the CSV file

Field	Column	Description
link `ref=scopus`	Article Preview Page URL	Scopus article preview page URL
`dc:identifier`	Scopus ID	Article Scopus ID
`authors` or `dc:creator`	Authors	Complete author list or only the first author
`dc:title`	Title	Article title
`prism:publicationName`	Publication Name	Source title
`dc:description`	Abstract	Article complete abstract
`prism:coverDate`	Date	Publication cover date
`eid`	Electronic ID	Article Electronic ID
`prism:doi`	DOI	Digital Object Identifier
`prism:volume`	Volume	Identifier for a serial publication
`citedby-count`	Citations	Cited-by count

6.2. CSV Metadata

Since the result of the survey is a CSV file, which is essentially a dataset obtained from the Scopus APIs, we must acknowledge both Scopus and Elsevier as data sources. Therefore, we will add some metadata at the top of the file (4 lines) as comments indicating the parameters used, survey details, and the date the data was obtained.

Example:

# GeneratedBy: ScopusSurveyAPI https://github.com/mauprogramador/scopus-survey-api
# Params: api_key=..., date=2022-2025, keywords=['Python', 'Web API', 'Scopus', 'bibliographic survey'], combination=Python AND Web API AND Scopus, ratio=80
# Survey: total=5, items_per_page=5, pages_count=1, loss=0doc / 0.00%
# Source: data retrieved from Scopus APIs on November 22, 2025 via http://api.elsevier.com and http://www.scopus.com.
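Because the file starts with comment lines, a plain CSV reader would treat them as data. The sketch below shows one way to separate the metadata from the rows using only the standard library. The function name is hypothetical, and we assume a comma delimiter and that every metadata line has the form `# Key: value`.

```python
import csv
import io


def read_survey_csv(text: str) -> tuple[dict[str, str], list[dict[str, str]]]:
    """Split leading "# Key: value" comment lines from the CSV rows."""
    stream = io.StringIO(text, newline="")
    metadata: dict[str, str] = {}
    while True:
        position = stream.tell()
        line = stream.readline()
        if not line.startswith("#"):
            stream.seek(position)  # rewind so the CSV header is not lost
            break
        key, _, value = line[1:].partition(":")
        metadata[key.strip()] = value.strip()
    # The remaining stream goes to the csv module untouched, so quoted
    # fields containing commas or line breaks are still parsed correctly.
    rows = list(csv.DictReader(stream))
    return metadata, rows
```

To parse a downloaded file, read it as text first, for example with `pathlib.Path(path).read_text(encoding="utf-8")`, and pass the result to `read_survey_csv`.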

It is possible to perform more than one survey using the keyword combinations from the table. Therefore, in order to avoid confusion, we will save the CSV files with the combination used to produce that result by default.

Example:

Combination	Filename
Python AND Scopus	[API Key]_python-scopus_docs.csv
Python AND Scopus AND Web API	[API Key]_python-scopus-web-api_docs.csv
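The naming pattern in the table can be reproduced with a short helper. This is a sketch inferred from the two examples above; the function name and the exact normalization rules (lowercasing, replacing non-alphanumeric runs with hyphens) are assumptions.

```python
import re


def survey_filename(api_key: str, combination: str) -> str:
    """Build "[API Key]_keyword-keyword_docs.csv" from a keyword combination."""
    keywords = combination.split(" AND ")
    slugs = [re.sub(r"[^a-z0-9]+", "-", keyword.lower()).strip("-") for keyword in keywords]
    return f"{api_key}_{'-'.join(slugs)}_docs.csv"
```

For example, `survey_filename("KEY", "Python AND Scopus AND Web API")` returns `KEY_python-scopus-web-api_docs.csv`.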

6.3. Visualization

To visualize the documents (at least the preview), you can use:

The URL in the Article Preview Page URL column:
https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=[SCOPUS_ID]&origin=inward
The Digital Object Identifier (DOI) in the DOI column:
https://doi.org/[DOI]
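Both links can be rebuilt from the CSV columns, as in the sketch below. The function names are hypothetical, and we assume the Scopus ID may or may not carry the "SCOPUS_ID:" prefix that the Scopus APIs return in `dc:identifier`.

```python
from urllib.parse import quote


def scopus_preview_url(scopus_id: str) -> str:
    """Build the article preview URL from a Scopus ID, with or without its prefix."""
    scopus_id = scopus_id.removeprefix("SCOPUS_ID:")
    return (
        "https://www.scopus.com/inward/record.uri"
        f"?partnerID=HzOxMe3b&scp={scopus_id}&origin=inward"
    )


def doi_url(doi: str) -> str:
    """Build the resolver URL for a DOI, percent-encoding unsafe characters."""
    return f"https://doi.org/{quote(doi, safe='/')}"
```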

6.4. Performance

Keywords	Total gathered	Process time	Loss
Web API AND Scopus	25	3.63s	0doc / 0.00%
Python AND Scopus	141	19.26s	1doc / 0.71%
Bibliographic Survey	1073	246.38s (4.10m)	7doc / 0.65%

Tip

Download a sample survey CSV file and take a look.

For questions or concerns please contact me at sir.silvabmauricio@gmail.com.

Terms of Service • Privacy Policy • Cookie Policy • Attributions

License • Translations • Latest Release • Changelog
