What is deep-researcher?

The Deep Researcher skill is a professional-grade tool designed for exhaustive, multi-layered investigations and evidence-based synthesis. It utilizes a structured, file-based methodology to overcome context window limitations, allowing for the analysis of 5+ sources simultaneously. Specifically optimized for medical and scientific rigor, it enforces strict citation policies using PubMed MCP, ensuring all findings are backed by peer-reviewed literature, randomized controlled trials, and official clinical guidelines.

When should I use deep-researcher?

deep-researcher is useful in the following scenarios: • Medical Literature Reviews: Systematically searching and synthesizing clinical trial data from PubMed to compare the efficacy and safety profiles of different pharmaceutical interventions. • Evidence-Based Clinical Analysis: Developing comprehensive reports on medical mechanisms or treatment evolutions by cross-referencing primary research with official guidelines from organizations like ACC, ESC, or AHA. • Academic Paper Preparation: Managing complex research projects with parallel investigation threads, persistent progress tracking, and automated validation of citations and PMIDs. • Scientific Gap Analysis: Identifying contradictions in current research and documenting evidence gaps through a rigorous cross-referencing and source quality assessment workflow.

name	deep-researcher
description	Performs comprehensive, multi-layered research on any topic with structured analysis and synthesis of information from multiple sources. Uses file-based research tracking, parallel investigation threads, and context-efficient patterns for deep investigations. ALL MEDICAL CITATIONS FROM PUBMED MCP ONLY.

Deep Researcher v2.0

Comprehensive research methodology with file-based tracking, parallel execution, and context management for investigations requiring 5+ sources.

CRITICAL: All medical evidence and citations must come from PubMed MCP. No exceptions.

Research Modes

Quick Research (1-4 sources): Work in-context, no file structure needed.

Deep Research (5+ sources): Use file-based tracking below.

Research Sources (STRICT POLICY)

ALLOWED for Medical Citations

Source	Tool	Use Case
PubMed MCP	`pubmed_search_articles`, `pubmed_fetch_contents`, `pubmed_article_connections`	ALL medical evidence, trials, mechanisms
Official Guidelines	`web_fetch` to ACC/ESC/ADA/AHA URLs only	Guideline recommendations
AstraDB RAG	Knowledge pipeline	Textbook references, pre-loaded guidelines

NOT ALLOWED for Medical Citations

Source	Why Excluded	Allowed Use
~~OpenAlex~~	Quality variable	REMOVED
~~Perplexity~~	Not peer-reviewed	Trend discovery only, NEVER cite
~~General web search~~	Unreliable	Topic discovery only, NEVER cite
~~News articles~~	Not primary evidence	Background context only

PubMed Quality Filters

Prefer (Tier 1):

Randomized Controlled Trials (RCTs)
Meta-analyses and Systematic Reviews
Guidelines from ACC/ESC/ADA/AHA

Accept (Tier 2):

Large observational studies from Q1 journals
Cohort studies with >1000 patients
Registry data from established registries

Use Cautiously (Tier 3):

Case series (only if no better evidence)
Expert consensus statements
Narrative reviews (as background, not primary evidence)

Reject:

Case reports (except for rare conditions)
Letters to editor
Preprints without peer review
Animal studies (unless specifically about mechanisms)

Deep Research Workflow

Progress Tracking

Create this checklist and update after each step:

Deep Research Progress:
- [ ] Step 1: Initialize research project
- [ ] Step 2: Define scope and plan
- [ ] Step 3: Execute research threads (parallel when possible)
- [ ] Step 4: Validate and cross-reference
- [ ] Step 5: Synthesize from files
- [ ] Step 6: Generate final report

Step 1: Initialize Research Project

For research requiring 5+ sources, create a project structure:

mkdir -p ~/research_{topic}/sources
mkdir -p ~/research_{topic}/threads

Project Structure:

~/research_{topic}/
├── plan.md              # Research questions, scope, thread assignments
├── progress.md          # Living checklist, updated throughout
├── sources/
│   └── pubmed.md        # PubMed search results and abstracts
├── threads/
│   ├── thread_1.md      # Independent research thread
│   ├── thread_2.md      # Another thread
│   └── ...
├── validation.md        # Cross-reference and credibility check
├── synthesis.md         # Cross-thread analysis
└── report.md            # Final deliverable

Why file-based? Context windows fill up. Writing findings to files lets you:

Continue researching without context pressure
Synthesize from persistent storage, not memory
Produce larger, more comprehensive reports
Resume if interrupted

Step 2: Define Scope and Research Plan

Write plan.md with:

# Research Plan: {Topic}

## Primary Question
[The main thing we're trying to answer]

## Scope
- Include: [what's in scope]
- Exclude: [what's explicitly out]
- Depth: [overview | detailed | exhaustive]
- Deliverable: [report type and length]

## Research Threads

### Thread 1: {Subtopic A}
- Questions to answer: ...
- PubMed search strategy: [MeSH terms, filters]
- Expected study types: RCTs, meta-analyses, etc.
- Can run parallel? Yes/No

### Thread 2: {Subtopic B}
- Questions to answer: ...
- PubMed search strategy: ...
- Can run parallel? Yes/No

[Continue for 2-5 threads]

## Thread Dependencies
- Thread 3 depends on Thread 1 findings
- Threads 1, 2, 4 can run in parallel

## Synthesis Strategy
How will threads combine into final answer?

Planning Guidelines:

Research Type	Threads	Pattern
Simple fact-finding	1-2	Sequential
Drug comparison	1 per drug (max 5)	Parallel
Complex investigation	3-5 thematic	Mixed
Literature review	By time period or theme	Sequential

Step 3: Execute Research Threads

PubMed Search Strategy

For each thread, use structured PubMed queries:

# Example search for SGLT2 CV outcomes
pubmed_search_articles(
    queryTerm="SGLT2 inhibitor cardiovascular outcomes randomized controlled trial",
    maxResults=20,
    sortBy="relevance"
)

# Then fetch full details for top results
pubmed_fetch_contents(pmids=["PMID1", "PMID2", ...])

# Find related articles for key papers
pubmed_article_connections(
    sourcePmid="key_paper_pmid",
    relationshipType="pubmed_similar_articles"
)

Parallel Execution Pattern

For independent threads, execute PubMed searches in parallel (multiple tool calls in one turn), then write each to its thread file.

Example: Comparing SGLT2 Inhibitors

Thread 1: Empagliflozin → pubmed_search "empagliflozin cardiovascular RCT" → threads/empagliflozin.md
Thread 2: Dapagliflozin → pubmed_search "dapagliflozin cardiovascular RCT" → threads/dapagliflozin.md
Thread 3: Canagliflozin → pubmed_search "canagliflozin cardiovascular RCT" → threads/canagliflozin.md

Execute all three searches, then write findings to respective files.

Sequential Execution Pattern

For dependent threads, complete each fully before starting the next.

Thread File Format

Each threads/thread_N.md should contain:

# Thread: {Subtopic}

## PubMed Searches Executed
1. Query: [exact query] → [N results] → Top PMIDs: [list]
2. Query: [exact query] → [N results] → Top PMIDs: [list]

## Key Findings

### Finding 1: [Title]
- PMID: [number]
- Citation: [Authors, Journal, Year]
- Study type: RCT / Meta-analysis / Cohort / etc.
- Population: [N patients, characteristics]
- Key result: [HR/OR with 95% CI, p-value]
- Quality: High / Medium / Low [+ brief justification]

### Finding 2: [Title]
- PMID: [number]
...

## Contradictions Found
- PMID X says [claim], PMID Y says [different claim]
- Potential explanation: [patient population, endpoints, timing, etc.]

## Gaps Identified
- No RCT data on [specific question]
- Limited evidence in [patient subgroup]

## Thread Summary
[2-3 sentence synthesis of this thread's findings with key PMIDs cited]

Context Offloading

After every 5-7 tool calls:

Write current findings to appropriate file
Update progress.md with status
Continue with fresh context

Trigger for offload:

Context feeling "full" (responses slowing, losing track)
Switching between threads
Before any synthesis step

Step 4: Validate and Cross-Reference

Read all thread files, then create validation.md:

# Validation Report

## Facts Requiring Cross-Reference
| Claim | Thread Source | PMID | Verification Status | Confidence |
|-------|--------------|------|---------------------|------------|
| SGLT2i reduces HF hospitalization | Thread 1 | 12345678 | Confirmed by PMIDs 23456789, 34567890 | High |
| Benefit extends to HFpEF | Thread 2 | 45678901 | Conflicting: PMID 56789012 shows null | Investigate |

## Contradictions Analysis
### Contradiction 1: [Description]
- Position A: PMID [X], [study name], found [result]
- Position B: PMID [Y], [study name], found [result]
- Resolution: [Population difference / endpoint difference / timing / unresolved]

## Source Quality Assessment
| PMID | Study | Type | N | Quality | Notes |
|------|-------|------|---|---------|-------|
| 12345678 | EMPA-REG | RCT | 7,020 | High | Industry-funded but well-designed |
| 23456789 | Meta-analysis | MA | 45,000 | High | Published in Lancet |

## Validated Knowledge Base
[List of facts we're confident in, with PMIDs]

1. **SGLT2 inhibitors reduce CV death in T2DM with established CVD** (PMID: 12345678, 23456789)
2. **Benefit on HF hospitalization is consistent across the class** (PMID: 34567890, 45678901)
3. ...

Step 5: Synthesize from Files

Critical: Read from files, not memory.

# Read all thread files
cat ~/research_{topic}/threads/*.md

# Read validation
cat ~/research_{topic}/validation.md

Write synthesis.md:

# Synthesis: {Topic}

## Cross-Thread Patterns
[What themes emerge across multiple threads?]

## Key Insights
1. [Insight that required combining multiple threads]
2. [Insight that wasn't obvious in any single thread]
3. ...

## The Answer
[Direct response to the primary research question, with PMID citations]

## Evidence Strength Assessment
- **Strong evidence (multiple RCTs):** [claims]
- **Moderate evidence (single RCT or consistent observational):** [claims]
- **Limited evidence (observational only):** [claims]
- **Expert opinion / guideline extrapolation:** [claims]

## Remaining Gaps
[What we still don't know and would need to investigate further]

Step 6: Generate Final Report

Write report.md using the synthesis:

# {Title}

## Executive Summary
[3-5 sentences: question, key finding, main conclusion with strongest PMID]

## Research Question and Scope
[From plan.md]

## Methodology
- Database: PubMed via NCBI MCP
- Search date: [date]
- Total articles screened: [N]
- Articles included: [N]
- Study types: [breakdown]

## Findings

### {Theme 1}
[Narrative synthesis with inline PMID citations]

### {Theme 2}
...

## Analysis
[Patterns, implications, connections]

## Conclusions
1. [Primary conclusion with evidence level]
2. [Secondary conclusions]

## Clinical Implications
[If applicable: what this means for practice]

## Limitations
- [Search limitations]
- [Evidence gaps]
- [Potential biases]

## References
[Full reference list with PMIDs and DOIs]

1. Author A, Author B, et al. Title. Journal. Year;Vol:Pages. PMID: XXXXXXXX. DOI: XX.XXXX/XXXXX
2. ...

Parallel Research Patterns

Pattern A: Drug/Entity Comparison

Use when: Comparing 2-5 similar entities (drugs, devices, techniques)

User: "Compare CV outcomes of GLP-1 agonists"
→ Thread per drug (semaglutide, tirzepatide, liraglutide)
→ All threads parallel (same PubMed structure)
→ Comparison matrix synthesis

Pattern B: Pro/Con Analysis

Use when: Topic has debate or controversy

User: "Analyze the evidence on aggressive LDL lowering"
→ Thread 1: Evidence FOR aggressive targets (PubMed: LDL <55 outcomes)
→ Thread 2: Evidence AGAINST/concerns (PubMed: LDL lowering adverse effects)
→ Thread 3: Current guidelines (fetch ACC/ESC guideline URLs)
→ Threads 1-2 parallel, Thread 3 after

Pattern C: Evidence + Guidelines

Use when: Need both primary evidence and clinical guidance

User: "What's the evidence on TAVR durability?"
→ Thread 1: Trial data (PubMed: TAVR long-term outcomes RCT)
→ Thread 2: Registry data (PubMed: TAVR registry durability)
→ Thread 3: Guidelines (fetch ACC/ESC valve guidelines)
→ All parallel

Pattern D: Historical Evolution

Use when: Understanding how evidence has evolved

User: "How has heart failure treatment evolved?"
→ Thread 1: Pre-neurohormonal era (PubMed: heart failure treatment 1980-1990)
→ Thread 2: ACE/ARB/BB era (PubMed: heart failure ACE inhibitor landmark)
→ Thread 3: Modern era ARNI/SGLT2 (PubMed: heart failure SGLT2 ARNI)
→ Sequential (each builds context for next)

Quality Checkpoints

After Step 2 (Planning)

Research question is specific and answerable
PubMed search strategies are defined for each thread
Threads are independent where marked parallel
Expected study types are specified

After Step 3 (Execution)

Each thread has 3+ credible PubMed sources
Key claims have specific data (HR, CI, p-value)
All citations have PMIDs
Gaps and contradictions are documented
Thread summaries are written

After Step 4 (Validation)

Key facts cross-referenced across threads
Contradictions analyzed with potential explanations
Source quality assessed for each major citation
Validated knowledge base compiled

After Step 5 (Synthesis)

Cross-thread patterns identified
Primary question directly answered
Evidence strength honestly assessed
Insights go beyond any single thread

Before Delivery

Report structure matches user's requested format
All claims have PMID citations
Executive summary is truly executive (skimmable)
Reference list is complete with DOIs

Common Research Pitfalls

Pitfall	Symptom	Fix
Context overflow	Losing track of earlier findings	Write to files every 5-7 tool calls
Confirmation bias	All sources agree suspiciously	Search for contradicting evidence explicitly
Recency bias	Only 2023-2024 sources	Include landmark trials regardless of date
Source homogeneity	All RCTs, no guidelines	Add guideline thread for clinical context
Scope creep	Research expanding endlessly	Return to plan.md, enforce boundaries
Premature synthesis	Concluding before validation	Complete Step 4 before Step 5
Memory-based synthesis	Citing from recall	Read files explicitly during Step 5
Non-PubMed citations	Citing Perplexity/web	Delete and replace with PubMed source

Example: Full Research Session

User: "Research the current evidence on colchicine for cardiovascular prevention"

Step 1: Initialize

mkdir -p ~/research_colchicine_cv/sources
mkdir -p ~/research_colchicine_cv/threads

Step 2: Plan (write to plan.md)

Primary question: What's the evidence for colchicine in CV prevention?
Thread 1: Major RCTs (COLCOT, LoDoCo2, CLEAR SYNERGY)
- PubMed: "colchicine cardiovascular randomized controlled trial"
Thread 2: Mechanisms and anti-inflammatory hypothesis
- PubMed: "colchicine inflammation atherosclerosis mechanism"
Thread 3: Guidelines and clinical adoption
- Fetch: ACC/ESC guideline URLs for stable CAD
Thread 4: Safety and practical considerations
- PubMed: "colchicine adverse effects cardiovascular"
Threads 1, 2, 4 parallel; Thread 3 after 1 completes

Step 3: Execute

# Parallel searches
pubmed_search_articles(queryTerm="colchicine cardiovascular randomized controlled trial", maxResults=15)
pubmed_search_articles(queryTerm="colchicine inflammation atherosclerosis mechanism", maxResults=10)
pubmed_search_articles(queryTerm="colchicine adverse effects cardiovascular", maxResults=10)

# Fetch top results
pubmed_fetch_contents(pmids=["31733140", "32865377", "37634428"])  # COLCOT, LoDoCo2, CLEAR

# Write to thread files

Step 4: Validate

Read all thread files
Cross-reference mortality data across trials
Note: CLEAR SYNERGY neutral vs positive COLCOT/LoDoCo2
Analyze: Patient population differences (post-ACS vs chronic CAD)
Write validation.md

Step 5: Synthesize

Read from files
Pattern: Inflammation hypothesis supported, but patient selection matters
Insight: Post-ACS (COLCOT) benefit clear; chronic stable CAD (CLEAR) less certain
Write synthesis.md

Step 6: Report

Structured report with evidence summary
Clear recommendation by patient type
All PMIDs cited
Complete reference list

Integration with Other Skills

This skill provides research foundation for:

cardiology-editorial → Use research output for trial analysis
cardiology-newsletter-writer → Research before writing
youtube-script-master → Research for script evidence base
x-post-creator-skill → Research before tweet generation

Workflow:

User requests content on topic
Run deep-researcher first (this skill)
Pass validated findings to writing skill
Writing skill cites PMIDs from research output

When NOT to Use This Skill

Simple factual questions (use PubMed MCP directly)
Trend discovery (use Perplexity, but don't cite)
Non-medical topics (this skill is optimized for PubMed)
Quick content needs (use writing skill directly with inline research)

Use this skill when you need:

5+ sources synthesized
Complex multi-faceted questions
Rigorous evidence assessment
Comprehensive literature coverage

deep-researcher

When & Why to Use This Skill

Use Cases