
HumanLM: Simulating Users with State Alignment Beats Response Imitation

License: Apache 2.0

Can language models truly act like specific humans — not just produce humanlike text, but reflect individual values, opinions, and communication styles? HumanLM tackles this challenge by aligning LMs to internal user states (stances, beliefs) rather than merely imitating surface-level responses.

Quick Start

1. Data Collection & Processing

We provide end-to-end tooling for collecting raw data from six sources and processing them into train/val/test splits with LLM-generated user personas. See humanual_datasets/README.md for full instructions.
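As a rough sketch of the workflow (the script name and output layout below are illustrative assumptions, not the actual interface; the authoritative commands are in humanual_datasets/README.md):

# Hypothetical invocation: process one raw source into persona-annotated splits.
# The real script names and arguments are documented in humanual_datasets/README.md.
python humanual_datasets/process.py --source reddit --output data/reddit/

# Sanity-check the resulting split sizes
wc -l data/reddit/train.jsonl data/reddit/val.jsonl data/reddit/test.jsonl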

2. Human Evaluation

The user study interface lets annotators compare their own responses against model-generated ones on Reddit posts.

# Start the required vLLM model servers
vllm serve Qwen/Qwen3-8B --dtype auto --host 0.0.0.0 --port 8000 --tensor-parallel-size 3 --max-model-len 7168
vllm serve snap-stanford/humanlm-opinions --dtype auto --host 0.0.0.0 --port 63456 --tensor-parallel-size 2 --max-model-len 7168

# Launch the Gradio annotation interface
cd user_study
python gradio_app.py          # add --debug to skip validation constraints
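Before opening the annotation interface, it can help to confirm that both model servers are reachable. The endpoints below are the standard OpenAI-compatible routes exposed by vllm serve, using the ports from the commands above; Gradio's default local address is assumed:

# List the models each vLLM server is serving (OpenAI-compatible endpoint)
curl http://localhost:8000/v1/models
curl http://localhost:63456/v1/models

# Once both respond, open the Gradio interface (by default at http://127.0.0.1:7860)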

3. Training

The VERL recipe for HumanLM training is maintained as a git submodule at humanlm_train/verl-recipe-humanlm.

If you cloned this repository without submodules, run:

git submodule update --init --recursive

If you have not cloned the repository yet, clone it together with its submodules:

git clone --recurse-submodules https://github.com/zou-group/humanlm.git

The HumanLM-specific training code and setup instructions are in humanlm_train/verl-recipe-humanlm/humanlm/README.md.
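A quick way to confirm the submodule is actually populated before following those instructions (standard git commands, nothing HumanLM-specific is assumed):

# An empty directory means the submodule was never initialized
ls humanlm_train/verl-recipe-humanlm

# Lists the pinned commit for each submodule; a leading "-" means it is uninitialized
git submodule status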

BibTeX

@article{wu2026humanlm,
  title={HUMANLM: Simulating Users with State Alignment Beats Response Imitation},
  url={https://humanlm.stanford.edu/},
  author={Wu, Shirley and Choi, Evelyn and Khatua, Arpandeep and
          Wang, Zhanghan and He-Yueya, Joy and Weerasooriya, Tharindu Cyril and
          Wei, Wei and Yang, Diyi and Leskovec, Jure and Zou, James},
  year={2026}
}
