Generates Awesome Indexes - tiny search engines built from curated sources:
- Awesome Lists (hence the name Awesome Indexes!)
- Zotero libraries and collections
- Zenodo Communities
- Any other source that can be used to create suitable JSON objects formatted as JSONL files.
The awindex tool gathers links and metadata from these sources, and uses them to build a static web page that provides a Pagefind faceted search interface. It can also package the index data as a downloadable database, to allow deeper analysis or custom visualisations to be created.
You can see a demonstration here.
To install awindex locally, you need Python 3.11 or later.
pip install git+https://github.com/digipres/awesome-indexer@main
awindex -c config -o ./index
After which, you will be able to run the awindex command.
Or, if uv is installed, the awindex tool can be run directly using:
uvx --from git+https://github.com/digipres/awesome-indexer@main awindex -c config.yaml -o ./index
By default, the awindex command reads its configuration from a file called ./config.yaml (this can be overridden at the command line; run awindex -h for help).
The tool reads the config.yaml file, downloads and caches the information sources, and generates an Awesome Index in the ./index folder.
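If the build needs to be triggered from a Python script (for example as part of a larger site build), the command line above can be wrapped with the standard library. This is only a sketch: it assumes awindex is already installed and on the PATH, uses only the -c and -o options described above, and the function names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_awindex_command(config: Path, output: Path) -> list[str]:
    """Return the argument list for an awindex run using the -c and -o options."""
    return ["awindex", "-c", str(config), "-o", str(output)]


def run_build(config: Path = Path("config.yaml"), output: Path = Path("index")) -> None:
    """Run awindex, failing early if the tool or the config file is missing."""
    if shutil.which("awindex") is None:
        raise RuntimeError("awindex was not found on PATH; install it first")
    if not config.is_file():
        raise FileNotFoundError(f"Config file not found: {config}")
    # check=True raises CalledProcessError if awindex exits with an error.
    subprocess.run(build_awindex_command(config, output), check=True)
```

Calling run_build() with no arguments mirrors the defaults described above: it reads config.yaml and writes to the index folder.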
There is a set of fields that provide some basic information about the site, and then a list of sources to read in order to build the index. For example:
title: "My Awesome Index Title"
homepage: https://my.website/page-about-this-index
description: "A brief description about this index and what's in it."
sources:
  - name: "Awesome Digital Preservation"
    homepage: "https://github.com/digipres/awesome-digital-preservation/"
    type: awesome-list
    url: "https://raw.githubusercontent.com/digipres/awesome-digital-preservation/refs/heads/main/README.md"
An example config.yaml is provided that shows how it works in more detail.
Each type of source should have a name and a homepage so people can find out more about the source that has been included in the index. Each source can also have a description, to be shown in the Awesome Index source summary.
The additional parameters for each source are...
- type: awesome-list (required)
- url: A URL to download the Markdown source content of the Awesome List. (required)
- view_url: A URL pointing to a web version of the source content that allows linking and highlighting of lines using a #L10 fragment on the end of the URL.
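To illustrate what the view_url fragment convention means in practice, here is a small sketch that builds a link to a given line. The helper name and the example URL are assumptions made for illustration; they are not taken from awindex itself.

```python
def line_link(view_url: str, line_number: int) -> str:
    """Append a #L<n> fragment to a view URL, replacing any existing fragment."""
    if line_number < 1:
        raise ValueError("line numbers start at 1")
    base = view_url.split("#", 1)[0]
    return f"{base}#L{line_number}"


# Hypothetical view_url for the example source shown earlier.
example = line_link(
    "https://github.com/digipres/awesome-digital-preservation/blob/main/README.md", 10
)
```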
Note that awindex only supports public Zotero collections at present.
- type: zotero (required)
- library_type: Either user or group (required).
- library_id: The identification number for this library, e.g. 8195999 (required).
- collection_id: The key of a specific collection within this library, e.g. ERZIYJ3T (optional). If this is specified, the index will only include records that are included in that hierarchy of collections.
- api_key: A Zotero API key (optional). This can be used to access private groups, but note that writing this directly into the config file will mean this file needs to be kept private. (See this open issue for an alternative approach).
The pyzotero documentation has more information about these fields and how to find them.
- type: zenodo (required)
- community: The unique identifier for this community, e.g. digital-preservation (required).

- type: jsonl (required)
- file: A local file path for a set of records in JSONL format, e.g. ./test/ipres-awindex-test.jsonl (required).
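The field requirements above can be checked before running a build. The following is a minimal sketch and not part of awindex: it works on an already-parsed configuration (a plain dict, however the YAML was loaded), treats name and homepage as if they were mandatory even though they are only recommended, and all function names are hypothetical.

```python
# Parameters marked "(required)" above, keyed by source type.
REQUIRED_BY_TYPE = {
    "awesome-list": ("url",),
    "zotero": ("library_type", "library_id"),
    "zenodo": ("community",),
    "jsonl": ("file",),
}
COMMON_FIELDS = ("name", "homepage", "type")


def check_source(source: dict) -> list[str]:
    """Return a list of human-readable problems for one source entry."""
    label = source.get("name", "<unnamed source>")
    problems = [f"{label}: missing '{key}'" for key in COMMON_FIELDS if not source.get(key)]
    source_type = source.get("type")
    if source_type is None:
        return problems
    if source_type not in REQUIRED_BY_TYPE:
        problems.append(f"{label}: unknown type '{source_type}'")
        return problems
    for key in REQUIRED_BY_TYPE[source_type]:
        if not source.get(key):
            problems.append(f"{label}: missing '{key}' for type '{source_type}'")
    if source_type == "zotero" and source.get("library_type") not in (None, "user", "group"):
        problems.append(f"{label}: library_type must be 'user' or 'group'")
    return problems


def check_config(config: dict) -> list[str]:
    """Collect problems across every entry in the 'sources' list."""
    problems: list[str] = []
    for source in config.get("sources", []):
        problems.extend(check_source(source))
    return problems
```

An empty list from check_config means every source has the fields this section describes; it does not confirm that the URLs or identifiers actually resolve.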
Unfortunately, the index itself won't work without a web server. If you've got Python 3+ installed, you can run:
cd index
python -m http.server 8080
The index will then be accessible at http://localhost:8080.
To share your Awesome Index, you can upload your files to a static web host like GitHub Pages, Netlify (e.g. using Netlify Drop) or these EU alternatives.
You can look at the SQLite database that the indexer generates using e.g. Datasette, like this:
uvx datasette serve index/records.db --metadata datasette-metadata.json
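The same database can also be inspected from Python with the standard sqlite3 module. The table layout is not described here, so this sketch makes no assumptions about it: it simply lists whatever tables exist and counts their rows. The function name is hypothetical.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def table_row_counts(db_path: str) -> dict[str, int]:
    """Return a mapping of table name to row count for a SQLite database file."""
    path = Path(db_path)
    if not path.is_file():
        # sqlite3.connect would silently create an empty file otherwise.
        raise FileNotFoundError(f"No database found at {path}")
    with closing(sqlite3.connect(path)) as conn:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
            )
        ]
        counts = {}
        for name in names:
            # Quote the identifier so unusual table names cannot break the query.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
        return counts
```

For example, table_row_counts("index/records.db") gives a quick overview of what the indexer wrote, which is a reasonable starting point before writing more specific queries.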
Building an index can be integrated into a GitHub Actions build like this:
- name: Install uv
  uses: astral-sh/setup-uv@v6
  with:
    python-version: 3.11
- name: Build the Awesome Index
  run: uvx --from git+https://github.com/digipres/awesome-indexer@main awindex -c _awindex/config.yaml -o ./awesome-index
There is an example here.
As well as needing Python 3.11+, the development environment needs NodeJS installed (because the Pagefind tooling used here runs on NodeJS).
The search page template uses the Jinja2 templating library and the interface is built using Bootstrap (v5).
sudo apt install python3.11
sudo apt install python3.11-venv
python3.11 -m venv .venv
source .venv/bin/activate
pip install -e .
Having installed in development mode (pip install -e .), to run from source:
python -m awindex.cli
TBA: JSONL or extend thusly