A toolkit for analyzing and visualizing attention mechanisms in transformer models applied to molecular data, providing insights into how models process chemical information and molecular representations.
- Interactive Artifacts
- Installation
- Dataset
- Model Architecture
- Training Pipeline
- Visualization Features
- Usage
- Results
- License
- Citation
- Acknowledgments
Install the required dependencies using pip:

```bash
pip install -r requirements.txt
```

This project utilizes the Open Molecules 2025 (OMol25) dataset, a collection of molecular structures and properties designed for machine learning applications in computational chemistry.
- Source: Hugging Face Dataset Repository
- Content: Chemical descriptions with associated molecular properties
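A minimal sketch for fetching the dataset files from the Hugging Face Hub; the `facebook/OMol25` repository id here is an assumption, so substitute the actual dataset repository:

```python
# Minimal sketch: download dataset files from the Hugging Face Hub.
# The repo_id below is an assumption; replace it with the actual
# OMol25 repository (access may be gated).
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="facebook/OMol25", repo_type="dataset")
print(f"Dataset files downloaded to {local_dir}")
```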
The core model is built upon ChemBERTa-zinc-base-v1, a specialized transformer architecture optimized for chemical data processing:
- Input: Chemical descriptions with molecular properties
- Output: Property predictions and learned molecular representations
- Architecture: BERT-based transformer with domain-specific adaptations for molecular data
- Training Objective: Dual-purpose model for both property prediction and representation learning
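As a minimal sketch of loading such a model and extracting per-layer attention maps, assuming the public `seyonec/ChemBERTa-zinc-base-v1` checkpoint and the Hugging Face `transformers` library:

```python
# Sketch: load ChemBERTa-zinc-base-v1 and collect attention maps for a
# SMILES string. Adapt the checkpoint name if you use a fine-tuned variant.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("seyonec/ChemBERTa-zinc-base-v1")
model = AutoModel.from_pretrained(
    "seyonec/ChemBERTa-zinc-base-v1", output_attentions=True
)

inputs = tokenizer("CCO", return_tensors="pt")  # ethanol as a SMILES example
with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions: one (batch, heads, seq, seq) tensor per layer
print(len(outputs.attentions), outputs.attentions[0].shape)
```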
Figure 1: Training pipeline showing data preprocessing, model architecture, and optimization steps.
This toolkit provides comprehensive visualization capabilities for understanding attention patterns in molecular transformer models.
UMAP Dimensionality Reduction: Interactive visualization of molecular embeddings in a reduced-dimensional space.
Figure 2: UMAP projection colored by number of atoms, revealing structural patterns in molecular space.
Figure 3: Sample molecule clusters formed in UMAP space, demonstrating learned chemical similarities.
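A sketch of the projection step with `umap-learn`, assuming mean-pooled embeddings and per-molecule atom counts have been saved to disk (the file names here are hypothetical placeholders):

```python
# Sketch: project pooled molecular embeddings to 2D with UMAP and color
# the points by atom count, mirroring Figure 2.
import numpy as np
import umap
import matplotlib.pyplot as plt

embeddings = np.load("embeddings.npy")  # hypothetical (n_molecules, hidden_dim)
n_atoms = np.load("n_atoms.npy")        # hypothetical per-molecule atom counts

reducer = umap.UMAP(n_neighbors=15, min_dist=0.1, random_state=42)
coords = reducer.fit_transform(embeddings)

plt.scatter(coords[:, 0], coords[:, 1], c=n_atoms, s=5, cmap="viridis")
plt.colorbar(label="number of atoms")
plt.title("UMAP projection of molecular embeddings")
plt.show()
```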
Three-dimensional visualization combining attention patterns across all model layers and heads.
Figure 4: Combined attention patterns visualized in three dimensions, showing global attention flow patterns.
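Continuing from the loading sketch above (so `outputs.attentions` is available), a sketch of how attention can be averaged over all layers and heads and rendered as a 3D surface:

```python
import numpy as np
import matplotlib.pyplot as plt

# Stack the per-layer attention tensors from the loading sketch into one
# (layers, heads, seq, seq) array, taking batch index 0.
stacked = np.stack([a[0].numpy() for a in outputs.attentions])
combined = stacked.mean(axis=(0, 1))  # average over layers and heads

# Render the combined (seq, seq) matrix as a 3D surface.
x, y = np.meshgrid(np.arange(combined.shape[1]), np.arange(combined.shape[0]))
fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.plot_surface(x, y, combined, cmap="plasma")
ax.set_xlabel("key position")
ax.set_ylabel("query position")
ax.set_zlabel("mean attention")
plt.show()
```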
Regional Focus Patterns: Analysis of whether attention heads specialize in different regions of molecular sequences.
Figure 5: Parallel coordinates plot showing focus patterns across layers and heads.
Figure 6: Bar plots comparing attention distribution to beginning, middle, and end regions of molecular sequences.
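A sketch of the underlying computation, assuming `attn` holds one layer's attention weights as a `(heads, seq, seq)` array:

```python
import numpy as np

def regional_focus(attn: np.ndarray) -> np.ndarray:
    """Attention mass each head places on the start/middle/end thirds.

    `attn` is a (heads, seq, seq) array of attention weights for one layer;
    returns a (heads, 3) array whose rows sum to 1.
    """
    seq = attn.shape[-1]
    bounds = [0, seq // 3, 2 * seq // 3, seq]
    received = attn.mean(axis=1)  # (heads, seq): average weight each key receives
    return np.stack(
        [received[:, bounds[i]:bounds[i + 1]].sum(axis=1) for i in range(3)],
        axis=1,
    )
```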
Positional Attention Distribution: Detailed analysis of how attention is distributed across different positions in molecular formulas.
Figure 7: Multi-panel analysis showing:
- Heatmap of average attention weights by position and layer
- Variance plot highlighting positions with variable attention
- Entropy plot distinguishing focused vs. diffuse attention patterns
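A sketch of the per-position statistics behind these panels, assuming `attn` is a `(layers, heads, seq, seq)` array of attention weights:

```python
import numpy as np

# `attn` is assumed to be a (layers, heads, seq, seq) array of attention weights.
mean_received = attn.mean(axis=(1, 2))       # (layers, seq): heatmap panel input
variability = attn.mean(axis=1).var(axis=1)  # (layers, seq): variance across queries
eps = 1e-12  # avoid log(0) for exactly-zero weights
entropy = -(attn * np.log(attn + eps)).sum(axis=-1)  # per-query attention entropy
mean_entropy = entropy.mean(axis=(1, 2))     # (layers,): focused (low) vs. diffuse (high)
```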
Individual Head Patterns: Comprehensive analysis of how different attention heads develop specialized functions.
Figure 8: Grid visualization of attention matrices for each layer-head combination.
Figure 9: 3D surface plots revealing attention patterns for selected specialized heads.
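A sketch of the grid visualization, again assuming `outputs.attentions` from the loading sketch above:

```python
import matplotlib.pyplot as plt

# `outputs.attentions` from the loading sketch: one (1, heads, seq, seq)
# tensor per layer. Rows are layers, columns are heads.
n_layers = len(outputs.attentions)
n_heads = outputs.attentions[0].shape[1]
fig, axes = plt.subplots(n_layers, n_heads, figsize=(1.5 * n_heads, 1.5 * n_layers))
for l, layer_attn in enumerate(outputs.attentions):
    for h in range(n_heads):
        axes[l, h].imshow(layer_attn[0, h].numpy(), cmap="viridis")
        axes[l, h].set_axis_off()
plt.show()
```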
Attention Flow Animation: Interactive visualization showing relationships between molecular tokens and positions over time.
Figure 10: Dynamic attention flow animation revealing token-to-token attention relationships.
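A sketch of one way to animate this with matplotlib's `FuncAnimation`, assuming `outputs`, `inputs`, and `tokenizer` from the loading sketch above:

```python
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

# From the loading sketch: last-layer attention averaged over heads, plus
# token strings for axis labels.
attn = outputs.attentions[-1][0].mean(dim=0).numpy()  # (seq, seq)
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])

fig, ax = plt.subplots()
bars = ax.bar(range(len(tokens)), attn[0])
ax.set_xticks(range(len(tokens)), tokens, rotation=90)
ax.set_ylim(0, 1)

def update(q):
    # Redraw the bar heights as attention from query token q.
    for bar, height in zip(bars, attn[q]):
        bar.set_height(height)
    ax.set_title(f"Attention from token {tokens[q]!r}")
    return bars

anim = FuncAnimation(fig, update, frames=len(tokens), interval=400)
plt.show()
```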
Attention River Plot: Flow visualization showing attention intensity evolution across model layers.
Figure 11: River plot visualization tracking attention flow intensity across transformer layers.
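As a simple stand-in for the river plot, a stacked-area sketch tracking each head's total attention mass across layers (assuming the same `(layers, heads, seq, seq)` array `attn` as above):

```python
import numpy as np
import matplotlib.pyplot as plt

# `attn` as above: a (layers, heads, seq, seq) array of attention weights.
head_mass = attn.sum(axis=(2, 3))  # (layers, heads): total weight per head
plt.stackplot(np.arange(head_mass.shape[0]), head_mass.T)
plt.xlabel("layer")
plt.ylabel("attention mass")
plt.title("Attention flow across layers")
plt.show()
```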
```python
from attention_viz import CombinedAttentionVisualizer

visualizer = CombinedAttentionVisualizer("path/to/attention/maps")
visualizer.run_full_analysis(sample_id="5388")  # or None for auto-detection
```

The visualization toolkit provides several key insights into molecular attention patterns:
- Layer-specific Specialization: Different transformer layers focus on distinct aspects of molecular structure
- Position-dependent Attention: Systematic variations in attention based on token position within molecular sequences
- Head Specialization: Individual attention heads develop specialized functions for different chemical features
- Hierarchical Processing: Evidence of hierarchical information processing from local to global molecular features
This project is licensed under the MIT License.
If you use this toolkit in your research, please cite:
```bibtex
@software{molecular_attention_viz,
  title={Visualizing Attention Patterns in Molecular Data},
  author={Bahador, Nooshin},
  year={2025},
  url={https://github.com/nbahador/attention-visualization-tool}
}
```

- Facebook AI Research for the OMol25 dataset
- DeepChem community for ChemBERTa model architecture