SINTA Journal Metadata CLI

A command-line tool for retrieving journal metadata from the Indonesian academic database SINTA.

Research tool for collecting Indonesian scholarly journal metadata from the SINTA journal search interface.

English / Japanese

Overview

A command-line tool for retrieving journal metadata from the current SINTA (Science and Technology Index) journal search website in Indonesia.

This tool is designed for the current SINTA web environment centered on:

https://sinta.kemdiktisaintek.go.id

Many existing scripts and examples available online were written for older SINTA site structures or outdated domain patterns. As the SINTA platform has evolved, some of those tools no longer work reliably.

This CLI tool provides a practical way to collect journal metadata from the current SINTA journal search interface in a controlled and reproducible way.

Note: In the SINTA system, "SINTA 1" through "SINTA 6" refer to journal ranking categories, not versions of the SINTA platform.

Features

Compatible with the current SINTA domain (sinta.kemdiktisaintek.go.id)
Keyword search for journals
Search mode selection: title-only or broader search
Affiliation filtering
JSON / CSV output
UNIX pipeline friendly (stdout-first design)
Built-in randomized delay and browser-like User-Agent

Extracted Metadata

The script extracts the following fields from search results:

journal_name
sinta_level
p_issn
e_issn
affiliation
sinta_score_3y
sinta_score_overall
h_index_google
h_index_sinta
citations_google
citations_sinta

These fields are visible in the current implementation where each journal result is normalized into a Python dictionary before output.

In version 1.1.0 the tool introduces two fetch modes:

basic : returns only fields available directly from search results
detail : retrieves additional metadata from each journal profile page

Architecture

flowchart TD
    A[User runs CLI] --> B[Parse command-line arguments]
    B --> C[Build SINTA 3 search request]
    C --> D[Wait random delay]
    D --> E[HTTP GET to SINTA 3 journals page]
    E --> F[Parse HTML with BeautifulSoup]
    F --> G[Extract journal metadata]
    G --> H{Affiliation filter?}
    H -- Yes --> I[Keep matching records only]
    H -- No --> J[Keep all records]
    I --> K{Output format}
    J --> K
    K -- JSON --> L[Print JSON to stdout]
    K -- CSV --> M[Print CSV to stdout]

Installation

Python 3.9 or later is recommended.

1. Clone the repository

git clone https://github.com/kimipooh/sinta-full-cli-v3
cd sinta-full-cli-v3

2. Create a virtual environment

python3 -m venv .venv
source .venv/bin/activate

3. Install dependencies

pip install -r requirements.txt

Or install them directly:

pip install requests beautifulsoup4

The current implementation imports requests, beautifulsoup4, csv, json, argparse, sys, time, random, and re; only requests and beautifulsoup4 need to be installed separately.

Usage

Show help

python sinta-full-cli-v3.py

Get detailed information (detail mode)

python sinta-full-cli-v3.py -q "AI" --fetch-mode detail

Basic search

python sinta-full-cli-v3.py -q "Engineering"

Search titles only

python sinta-full-cli-v3.py -q "Computer Science" -m title

Export CSV

python sinta-full-cli-v3.py -q "Physics" -f csv > results.csv

Filter by affiliation

python sinta-full-cli-v3.py -q "AI" -a "Universitas"

Search all fields and inspect the first result with jq

python sinta-full-cli-v3.py -q "Physics" -m all | jq '.[0]'

Run directly as an executable

chmod +x sinta-full-cli-v3.py
./sinta-full-cli-v3.py -q "Engineering"

The examples above follow the same behavior described in the Japanese draft guide, now unified under the repository script name sinta-full-cli-v3.py.

Command-Line Arguments

Option	Long option	Description
`-q`	`--query`	Search keyword
`-m`	`--mode`	Search mode: `title` or `all`
`-a`	`--affil`	Filter by affiliation / institution
`-f`	`--format`	Output format: `json` or `csv`
	`--fetch-mode`	`basic` or `detail`

These options match the current argparse configuration in the script.

Example Output

Example command:

python sinta-full-cli-v3.py -m title -q "Buletin ekonomi moneter dan perbankan" -f csv

Example output:

journal_name,sinta_level,p_issn,e_issn,affiliation,sinta_score_3y,sinta_score_overall,h_index_google,h_index_sinta,citations_google,citations_sinta
Buletin Ekonomi Moneter dan Perbankan,S1 Accredited,N/A,N/A,"Bank Indonesia Institute, Bank Indonesia",0,0,0,0,0,0

This example is adapted from the Japanese notes provided by the author.

Using with `jq`

Because the default output is JSON, the tool works well with jq.

Show only the first result

python sinta-full-cli-v3.py -q "AI" | jq '.[0]'

Count matched journals

python sinta-full-cli-v3.py -q "AI" | jq 'length'

Show only S1 journals

python sinta-full-cli-v3.py -q "AI" | jq '.[] | select(.sinta_level == "S1")'

List journal names only

python sinta-full-cli-v3.py -q "AI" | jq -r '.[].journal_name'

Sort by overall score

python sinta-full-cli-v3.py -q "AI" | jq 'sort_by(.sinta_score_overall | tonumber) | reverse'

Responsible and Ethical Use

This tool accesses publicly available search pages.

Please use it responsibly:

A randomized delay is built in to reduce request bursts
Avoid large-scale repeated queries
If you encounter HTTP 403 responses, stop and wait before retrying
Be aware that HTML structures may change and break selectors

The current code sleeps for a random interval before each request and uses a browser-like User-Agent string.

Limitations

The tool currently scrapes the public journal search page
HTML selectors may need updates if the SINTA layout changes
The current approach is intentionally conservative and does not try to crawl all pages aggressively
Search result pagination is not fully automated in the current workflow guidance because aggressive looping may increase the risk of IP blocking

Academic Use

This tool was originally developed as a practical research utility for working with Indonesian scholarly journal data, especially where current SINTA 3-compatible tools are limited.

If you use this software in research, a citation to the repository or DOI is appreciated.

DOI

This repository is archived on Zenodo.

https://doi.org/10.5281/zenodo.18994689

Citation

If you use this software in your research, please cite:

Kitani, K. (2026).
SINTA Data Retrieval Tool (Version 1.1.0).
Zenodo. https://doi.org/10.5281/zenodo.18994689

Files in This Repository

sinta-full-cli-v3.py — main CLI script
README.md — English documentation
README_ja.md — Japanese documentation
LICENSE — MIT License
requirements.txt — Python dependencies

Author

Kimiya Kitani
Center for Southeast Asian Studies, Kyoto University

License

New in 1.1.0

Added --fetch-mode basic|detail
basic preserves the original 1.0.1 core fields
detail enriches results using journal profile pages

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
.github/workflows		.github/workflows
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CITATION.cff		CITATION.cff
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
LICENSE		LICENSE
README.md		README.md
README_ja.md		README_ja.md
requirements.txt		requirements.txt
sinta-full-cli-v3.py		sinta-full-cli-v3.py

Folders and files

Latest commit

History

Repository files navigation

SINTA Journal Metadata CLI

Overview

Features

Extracted Metadata

Architecture

Installation

1. Clone the repository

2. Create a virtual environment

3. Install dependencies

Usage

Show help

Get detailed information (detail mode)

Basic search

Search titles only

Export CSV

Filter by affiliation

Search all fields and inspect the first result with jq

Run directly as an executable

Command-Line Arguments

Example Output

Using with jq

Show only the first result

Count matched journals

Show only S1 journals

List journal names only

Sort by overall score

Responsible and Ethical Use

Limitations

Academic Use

DOI

Citation

Files in This Repository

Author

License

New in 1.1.0

About

Topics

Resources

License

Code of conduct

Uh oh!

Stars

Watchers

Forks

Releases 3

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Using with `jq`

Packages