GRAIL: Grounding Relational Concepts in Neuro-Symbolic Reinforcement Learning

Hikaru Shindo¹, Henri Rößler¹, Quentin Delfosse¹, Kristian Kersting¹,²,³,⁴

¹TU Darmstadt, Germany ²Hessian AI, Germany ³German Research Center for Artificial Intelligence (DFKI), Germany ⁴Centre for Cognitive Science, TU Darmstadt, Germany

Neuro-symbolic agents rely on logical rules to infer their actions, which often requires knowledge about how objects are related to each other. Understanding concepts such as "left of" or "nearby" is therefore essential for solving abstract tasks. In existing systems, such relations are typically defined by human experts, which limits extensibility since the meaning of a concept can vary across environments.

GRAIL is a framework that grounds relational concepts through interaction with the environment. We further use large language models to provide additional weak supervision and to complement sparse reward signals. Empirical evaluations on the Atari environments Kangaroo and Seaquest demonstrate that our agents match, and in some cases exceed, the performance of logic agents with hand-crafted relational concepts.

Installation

Using Docker (Recommended)

```
docker build -t grail:base .
```

Local Installation

Install Python dependencies:
```
pip install -r requirements.txt
```

Install the logic reasoning libraries:

```
cd nsfr && pip install -e . && cd ..
cd nudge && pip install -e . && cd ..
```

(Optional) Install NEUMANN for memory-efficient reasoning. It is required for highly parallelized runs, e.g., 512 envs in Seaquest:

```
cd neumann && pip install -e . && cd ..
```

This requires PyG and its dependencies:

```
pip install torch-geometric torch-sparse torch-scatter
```

For GPU support with CUDA 12.4:

```
pip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv -f https://data.pyg.org/whl/torch-2.4.0+cu124.html
```

Install difflogic:
```
pip install -e third_party/difflogic
```

Usage

Training

Stage 1 — Train valuation functions (learns relational concept grounding with LLM weak supervision):

```
python train_valuation.py --env-name kangaroo --num-steps 128 --num-envs 5 --track
```
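
To make the idea of weak supervision concrete, here is a minimal sketch and not the repository's training loop. It assumes a hand-written rule, `proxy_closeby`, standing in for an LLM-generated proxy function, and fits the threshold of a soft "closeby" valuation to the labels that rule produces. All names, the cut-off of 30, the temperature, and the learning rate are illustrative assumptions; plain `math` is used instead of a tensor library to keep the example self-contained.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def proxy_closeby(distance):
    """Hypothetical stand-in for an LLM-written proxy: a hard rule of thumb."""
    return 1.0 if distance < 30.0 else 0.0


def fit_threshold(distances, steps=500, lr=50.0, temperature=5.0):
    """Fit the threshold of a soft 'closeby' valuation to the proxy labels.

    The valuation is sigmoid((threshold - distance) / temperature) and the
    loss is binary cross-entropy against the proxy labels.
    """
    threshold = 10.0  # deliberately poor initial guess
    for _ in range(steps):
        grad = 0.0
        for d in distances:
            p = sigmoid((threshold - d) / temperature)
            # d(BCE)/d(threshold) = (p - label) / temperature
            grad += (p - proxy_closeby(d)) / temperature
        threshold -= lr * grad / len(distances)
    return threshold


# Distances 0..99 as toy observations; the learned threshold should end up
# near the proxy's cut-off.
learned_threshold = fit_threshold(list(range(100)))
print(learned_threshold)
```

In this toy setting the proxy is the only training signal. The README describes the LLM supervision as complementing the environment's sparse rewards, so a real setup would combine both.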

Stage 2 — Train the full agent (joint symbolic + neural policy):

```
python train_blenderl.py --env-name seaquest --joint-training --num-steps 128 --num-envs 5 --gamma 0.99
```

Baselines:

```
# NUDGE (symbolic-only)
python train_nudge.py --env-name kangaroo --num-steps 128 --num-envs 5

# Neural PPO
python train_neuralppo.py --env-name kangaroo --num-steps 128 --num-envs 5
```

Evaluation

```
python eval_valuation.py --env-name kangaroo
python evaluate.py --env-name seaquest --agent-path <path-to-checkpoint>
```

Interactive Play

```
python play_gui.py --env-name kangaroo --agent-path <path-to-checkpoint>
```

Use `--track` on any training script to log to Weights & Biases.

Supported Environments

| Environment | Description |
| --- | --- |
| Kangaroo | Platformer with ladders, monkeys, and coconuts |
| Seaquest | Underwater navigation with divers, enemies, and missiles |
| Skiing | Downhill skiing with flags and obstacles |

Adding New Environments

Create a new directory under in/envs/<env_name>/ containing:

- Logic state extraction: translates raw environment states into logic representations
- Valuation functions: each relation (e.g., closeby) maps to a differentiable function computing the probability that the relation holds
- Action mapping: maps action-predicates predicted by the agent to environment actions
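
As an illustration of the valuation-function item above, the following is a minimal sketch of a soft `closeby` relation. It is an assumption-laden example rather than the repository's code: the function signature, the `threshold` and `temperature` values, and the use of plain `math` are all hypothetical. A real implementation would likely operate on batched tensors so that gradients can flow through it.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def closeby(pos_a, pos_b, threshold=32.0, temperature=4.0):
    """Probability that two objects are close to each other.

    pos_a and pos_b are (x, y) positions. threshold and temperature are
    illustrative values, not taken from the repository.
    """
    distance = math.hypot(pos_a[0] - pos_b[0], pos_a[1] - pos_b[1])
    # Smooth step: near 1 well inside the threshold, 0.5 exactly at it,
    # near 0 far outside. Smoothness is what makes the relation differentiable.
    return sigmoid((threshold - distance) / temperature)
```

A smooth function is used instead of a hard `distance < threshold` test because a hard test has zero gradient almost everywhere, which would leave nothing to learn from.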

See in/envs/freeway/ for a minimal example.
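
Logic state extraction and action mapping, two of the components listed above, could look roughly like the sketch below. Everything here is hypothetical: the `GameObject` class, the row layout of the logic state, the predicate naming convention, and the action indices are assumptions made for illustration and should be checked against an existing environment directory.

```python
from dataclasses import dataclass


@dataclass
class GameObject:
    """Hypothetical raw object as reported by the environment."""
    kind: str  # e.g. "player" or "ladder"
    x: float
    y: float
    visible: bool = True


def extract_logic_state(objects):
    """Translate raw objects into rows of [visible_flag, x, y], one per object."""
    return [[1.0 if obj.visible else 0.0, obj.x, obj.y] for obj in objects]


# Assumed mapping from action names to discrete environment action indices.
ACTION_MAP = {"noop": 0, "up": 2, "right": 3, "left": 4, "down": 5}


def predicate_to_action(predicate):
    """Map an action predicate such as 'up_ladder' to an environment action.

    Assumes the convention that the text before the first underscore names
    the action; unknown predicates fall back to the no-op action.
    """
    name = predicate.split("_")[0]
    return ACTION_MAP.get(name, ACTION_MAP["noop"])
```

For example, under these assumptions `predicate_to_action("up_ladder")` selects the action registered under `"up"`, and an unrecognized predicate falls back to `"noop"`.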

Project Structure

```
├── blendrl/           # Core BlendRL framework (agents, evaluator, explainer)
├── cleanrl/           # Clean RL baselines
├── nsfr/              # Neural-Symbolic Forward Reasoner
├── nudge/             # NUDGE symbolic RL framework
├── neumann/           # Memory-efficient graph-based reasoner
├── valuation/         # Learned valuation functions module
├── in/
│   ├── envs/          # Environment definitions and logic rules
│   ├── config/        # Hyperparameter configurations
│   └── prompts/       # LLM prompt templates for proxy function generation
├── env_src/           # Environment source code
├── third_party/       # External dependencies (difflogic)
├── train_valuation.py # Stage 1: valuation function training
├── train_blenderl.py  # Stage 2: full BlendRL agent training
├── train_nudge.py     # NUDGE baseline training
├── train_neuralppo.py # Neural PPO baseline training
├── eval_valuation.py  # Valuation evaluation
├── sim_valuation.py   # Valuation simulation
└── play_gui.py        # Interactive GUI for trained agents
```

Acknowledgements

This work was partly funded by the German Federal Ministry of Education and Research, the Hessian Ministry of Higher Education, Research, Science and the Arts (HMWK) within their joint support of the National Research Center for Applied Cybersecurity ATHENE, via the "SenPai:XReLeaS" project. The work has benefited from the Clusters of Excellence "Reasonable AI" (EXC-3057) and "The Adaptive Mind" (EXC-3066), both funded by the German Research Foundation (DFG) under Germany's Excellence Strategy.

GRAIL builds upon BlendRL (ICLR 2025).
