Run Gemini-powered research workflows for agents, starting with Deep Research from the command line.
Built for agent-driven research, literature review automation, Gemini Deep Research workflows, and AI-assisted academic pipelines.
Part of the Yongan Toolkit for coding and academic research: SpeakFlow for Human Vibe Coding · Everything to MD for Agent
Gemini has powerful agent-style capabilities, but the setup path is awkward if you want:
- local CLI execution instead of a web UI
- your own Google account instead of an API key workflow
- reusable output in Markdown and JSON
- automation from Claude Code or other agents
- a setup that researchers can actually repeat
This repo packages that workflow into a minimal Python toolchain with desktop OAuth, token refresh, polling, and report export. Today the main workflow is Deep Research, but the positioning is broader: this repo is for Gemini-powered workflows that agents can actually run.
- Authenticate with Google using desktop OAuth
- Refresh access tokens automatically from `refresh_token`
- Submit prompts to the Gemini Interactions API
- Poll long-running Deep Research jobs until completion
- Save both the final Markdown report and raw JSON payload
- Support both `desktop-oauth` and `gcloud-adc` auth modes
- Provide a Claude Code skill for Gemini-powered agent workflows
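As a rough illustration of the token refresh step, the sketch below decides whether a cached token needs refreshing and builds the form body for Google's OAuth 2.0 token endpoint. It is not taken from `google_oauth.py`: the function names and the `expires_at` field in the cached token are assumptions made for this example, and the request is only constructed, not sent.

```python
import time
from typing import Optional, Tuple
from urllib.parse import urlencode

# Google's OAuth 2.0 token endpoint, used as a fallback when the
# client secret file does not carry its own token_uri.
GOOGLE_TOKEN_URI = "https://oauth2.googleapis.com/token"


def needs_refresh(token: dict, now: Optional[float] = None, skew: float = 60.0) -> bool:
    """Return True when the cached access token is missing or about to expire.

    Assumes a hypothetical token layout with `access_token` and an absolute
    `expires_at` Unix timestamp; the real token.json may differ.
    """
    now = time.time() if now is None else now
    if not token.get("access_token"):
        return True
    return token.get("expires_at", 0.0) - skew <= now


def build_refresh_request(client_config: dict, refresh_token: str) -> Tuple[str, bytes]:
    """Build the URL and form-encoded body for a refresh_token grant."""
    installed = client_config["installed"]  # Desktop app clients use this key
    form = {
        "grant_type": "refresh_token",
        "client_id": installed["client_id"],
        "client_secret": installed["client_secret"],
        "refresh_token": refresh_token,
    }
    return installed.get("token_uri", GOOGLE_TOKEN_URI), urlencode(form).encode("ascii")
```

The 60-second `skew` is an arbitrary safety margin so that a token is not used right at the edge of its lifetime.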
In Google Cloud:
- Enable `Generative Language API`
- Configure the OAuth consent screen
- Create a `Desktop app` OAuth client
- Download the JSON file as `scripts/client_secret.json`
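A quick way to confirm that the downloaded file really belongs to a Desktop app client is to look at its top-level key: Google writes Desktop app credentials under `installed`, while Web application credentials use `web`. The helper below is a minimal sketch of that check; the function names are hypothetical and not part of this repo's scripts.

```python
import json
from pathlib import Path


def check_desktop_client(config: dict) -> dict:
    """Return the `installed` section, or raise if this is not a Desktop app client."""
    if "installed" not in config:
        found = ", ".join(sorted(config)) or "no keys"
        raise ValueError(
            f"not a Desktop app client: expected top-level 'installed', found {found}"
        )
    installed = config["installed"]
    missing = [key for key in ("client_id", "client_secret") if not installed.get(key)]
    if missing:
        raise ValueError("client secret file is missing: " + ", ".join(missing))
    return installed


def load_desktop_client(path: str = "scripts/client_secret.json") -> dict:
    """Read the client secret file from disk and validate it."""
    return check_desktop_client(json.loads(Path(path).read_text(encoding="utf-8")))
```

Running `load_desktop_client()` from the repository root before the first login can catch the common mistake of creating a Web application client instead of a Desktop app client.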
cd scripts
python oauth_login.py \
--client-secret client_secret.json \
--token-file token.json
python run_deep_research.py \
--client-secret client_secret.json \
--token-file token.json \
--project-id YOUR_PROJECT_ID \
--prompt "Survey recent methods for reservoir fluid identification from logging data" \
--save-report result.md \
--save-json result.json
Prompt -> OAuth token -> Gemini Interactions API -> polling -> final report.md / result.json
More explicitly:
- `oauth_login.py` gets the first `refresh_token`
- `google_oauth.py` refreshes short-lived access tokens when needed
- `run_deep_research.py` submits the prompt and polls until the task is done
- The extracted report can be saved as Markdown for direct use in notes, docs, or downstream agent workflows
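The poll-then-export part of this flow can be sketched as follows. This is an illustration under stated assumptions, not the code in `run_deep_research.py`: the status values (`completed`, `failed`, `cancelled`), the `report` key, and the helper names are all hypothetical, and the actual API call is passed in as a `fetch` callable so the loop stays independent of the real response schema.

```python
import json
import time
from pathlib import Path
from typing import Callable, Optional

# Hypothetical status names; the real API's values may differ.
DONE_STATES = {"completed"}
FAILED_STATES = {"failed", "cancelled"}


def poll_until_done(
    fetch: Callable[[], dict],
    interval: float = 15.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Call `fetch` repeatedly until the job finishes, fails, or times out."""
    deadline = clock() + timeout
    while True:
        payload = fetch()
        status = payload.get("status")
        if status in DONE_STATES:
            return payload
        if status in FAILED_STATES:
            raise RuntimeError(f"research job ended with status {status!r}")
        if clock() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout} seconds")
        sleep(interval)


def save_outputs(
    payload: dict,
    report_path: Optional[str] = None,
    json_path: Optional[str] = None,
) -> None:
    """Write the Markdown report and/or the raw JSON payload to disk."""
    if report_path:
        # `report` is an assumed key holding the extracted Markdown text.
        Path(report_path).write_text(payload.get("report", ""), encoding="utf-8")
    if json_path:
        Path(json_path).write_text(
            json.dumps(payload, ensure_ascii=False, indent=2), encoding="utf-8"
        )
```

Injecting `sleep` and `clock` keeps the loop testable without waiting in real time, which matters for jobs that can run for many minutes.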
| Mode | When To Use |
|---|---|
| `desktop-oauth` | default choice for most users |
| `gcloud-adc` | if you already use `gcloud auth application-default login` |
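One plausible way to expose the two modes above on the command line is an `argparse` option restricted to those values. The sketch below reuses flag names that appear in this README (`--auth-mode`, `--prompt`, `--prompt-file`), but the parser itself is an assumption: treating `desktop-oauth` as the default and making the two prompt flags mutually exclusive are guesses about sensible behavior, not a description of the actual script.

```python
import argparse

AUTH_MODES = ("desktop-oauth", "gcloud-adc")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with an auth-mode switch and a single prompt source."""
    parser = argparse.ArgumentParser(prog="run_deep_research.py")
    parser.add_argument(
        "--auth-mode",
        choices=AUTH_MODES,
        default="desktop-oauth",  # assumed default, per "default choice for most users"
        help="how to obtain Google credentials",
    )
    # Assumption: exactly one prompt source must be given.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--prompt", help="prompt text given inline")
    source.add_argument("--prompt-file", help="path to a file containing the prompt")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"auth mode: {args.auth_mode}")
```

Restricting the option with `choices` means a typo such as `--auth-mode gcloud` fails immediately with a usage message instead of surfacing later as an authentication error.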
# Prompt from command line
python run_deep_research.py --prompt "Compare recent LLM-based weather downscaling methods"
# Prompt from a markdown file
python run_deep_research.py --prompt-file ../references/prompt-patterns.md
# Save both report and raw JSON
python run_deep_research.py \
--prompt "Review multimodal OCR pipelines for academic PDFs" \
--save-report report.md \
--save-json report.json
# Use gcloud ADC instead of desktop OAuth
python run_deep_research.py \
--auth-mode gcloud-adc \
--prompt "Map the research landscape of digital rock reconstruction"
- Literature review before writing a proposal or paper
- Fast scouting of unfamiliar subfields
- Collecting citations, gaps, methods, and benchmark datasets
- Producing first-draft research briefs for human refinement
- Driving Claude Code workflows that need web-scale research, not just local code reasoning
This repository includes a ready-to-use SKILL.md.
Typical prompts:
Do a deep research on recent well-log reconstruction methods.
Compare Kalman filtering based denoising approaches in petrophysics.
Produce a citation-rich review and save the report as Markdown.
- "This app is in testing mode": add your Google account under OAuth test users
- HTTP 403: enable `Generative Language API` in the selected project
- token refresh fails: delete `token.json` and run the login flow again
- consent screen looks empty: go to `Branding` and `Audience`, not the overview page
This repo is one part of the Yongan Toolkit: a small collection of coding and research tools that work well together.
| Project | What It Helps With |
|---|---|
| speakflow-for-human-vibe-coding | speak ideas, prompts, and notes directly into your workflow |
| everything-to-md-for-agent | turn papers and equations into AI-readable Markdown |
| gemini-workflows-for-agents | run Gemini-powered workflows for agents |
Recommended flow: capture ideas with speakflow-for-human-vibe-coding, research with gemini-workflows-for-agents, then process papers with everything-to-md-for-agent.
gemini-workflows-for-agents/
├── scripts/
│ ├── google_oauth.py
│ ├── oauth_login.py
│ ├── run_deep_research.py
│ ├── start_debug_chrome.ps1
│ └── stop_debug_chrome.ps1
├── references/
├── agents/
├── SKILL.md
└── README.md
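This repository is more than an ordinary Gemini API example: it is a toolkit that lets agents run Gemini workflows, with Deep Research supported first:
- Log in through OAuth with your own Google account
- Refresh tokens automatically
- Launch Deep Research directly from the CLI
- Wait for long-running tasks to finish and export a Markdown report
- Can be called directly by Claude Code as a skill
Good for:
- Running a round of deep research before writing a thesis proposal, a PhD application, or a survey
- Quickly mapping a field's families of methods, common datasets, and research gaps
- Saving research results as Markdown for later notes or agent workflows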
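This repository is a tool that lets agents run Gemini-based workflows, and it currently centers on Deep Research.
- Uses OAuth rather than an API key
- Polls long-running tasks and waits until they complete
- Can save a Markdown report and JSON
- Easy to use as a Claude Code skill
Suitable uses:
- Literature review
- Exploring research topics
- Drafting reports with citations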
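This repository is a tool that helps agents run Gemini-based workflows, and it currently centers on Deep Research.
- Uses OAuth instead of an API key
- Polls long-running jobs and waits until they complete
- Can save Markdown reports and JSON
- Easy to integrate as a Claude Code skill
Suitable uses:
- Literature surveys
- Exploring research topics
- Drafting reports with citations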
No license file is included in this repository yet. If you want broader reuse and contributions, adding a license is recommended.