deep-researcher

from drshailesh88

Performs comprehensive, multi-layered research on any topic with structured analysis and synthesis of information from multiple sources. Uses file-based research tracking, parallel investigation threads, and context-efficient patterns for deep investigations. ALL MEDICAL CITATIONS FROM PUBMED MCP ONLY.

Updated Jan 3, 2026

When & Why to Use This Skill

The Deep Researcher skill supports exhaustive, multi-layered investigations and evidence-based synthesis. It uses a structured, file-based methodology to work around context window limits, so an investigation can draw on 5+ sources without losing earlier findings. It is optimized for medical and scientific rigor: a strict citation policy built on PubMed MCP grounds findings in peer-reviewed literature, randomized controlled trials, and official clinical guidelines.

Use Cases

  • Medical Literature Reviews: Systematically searching and synthesizing clinical trial data from PubMed to compare the efficacy and safety profiles of different pharmaceutical interventions.
  • Evidence-Based Clinical Analysis: Developing comprehensive reports on medical mechanisms or treatment evolutions by cross-referencing primary research with official guidelines from organizations like ACC, ESC, or AHA.
  • Academic Paper Preparation: Managing complex research projects with parallel investigation threads, persistent progress tracking, and automated validation of citations and PMIDs.
  • Scientific Gap Analysis: Identifying contradictions in current research and documenting evidence gaps through a rigorous cross-referencing and source quality assessment workflow.
name: deep-researcher
description: Performs comprehensive, multi-layered research on any topic with structured analysis and synthesis of information from multiple sources. Uses file-based research tracking, parallel investigation threads, and context-efficient patterns for deep investigations. ALL MEDICAL CITATIONS FROM PUBMED MCP ONLY.

Deep Researcher v2.0

Comprehensive research methodology with file-based tracking, parallel execution, and context management for investigations requiring 5+ sources.

CRITICAL: All medical evidence and citations must come from PubMed MCP. No exceptions.


Research Modes

Quick Research (1-4 sources): Work in-context, no file structure needed.

Deep Research (5+ sources): Use file-based tracking below.


Research Sources (STRICT POLICY)

ALLOWED for Medical Citations

| Source | Tool | Use Case |
|--------|------|----------|
| PubMed MCP | pubmed_search_articles, pubmed_fetch_contents, pubmed_article_connections | ALL medical evidence, trials, mechanisms |
| Official Guidelines | web_fetch to ACC/ESC/ADA/AHA URLs only | Guideline recommendations |
| AstraDB RAG | Knowledge pipeline | Textbook references, pre-loaded guidelines |

NOT ALLOWED for Medical Citations

| Source | Why Excluded | Allowed Use |
|--------|--------------|-------------|
| OpenAlex | Quality variable | REMOVED |
| Perplexity | Not peer-reviewed | Trend discovery only, NEVER cite |
| General web search | Unreliable | Topic discovery only, NEVER cite |
| News articles | Not primary evidence | Background context only |

PubMed Quality Filters

Prefer (Tier 1):

  • Randomized Controlled Trials (RCTs)
  • Meta-analyses and Systematic Reviews
  • Guidelines from ACC/ESC/ADA/AHA

Accept (Tier 2):

  • Large observational studies from Q1 journals
  • Cohort studies with >1000 patients
  • Registry data from established registries

Use Cautiously (Tier 3):

  • Case series (only if no better evidence)
  • Expert consensus statements
  • Narrative reviews (as background, not primary evidence)

Reject:

  • Case reports (except for rare conditions)
  • Letters to editor
  • Preprints without peer review
  • Animal studies (unless specifically about mechanisms)
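
The tiers above can be expressed as a small lookup, which is handy when triaging a long list of search results. This is a minimal sketch, not part of the skill itself: the `Tier` enum, the study-type labels, and the `classify` helper are all hypothetical names, and the handling of small cohorts and of the two stated exceptions (placing them in Tier 3) is an assumption, since the lists above do not say where they land. Journal quartile is not checked.

```python
from enum import Enum
from typing import Optional


class Tier(Enum):
    PREFER = 1   # Tier 1
    ACCEPT = 2   # Tier 2
    CAUTION = 3  # Tier 3
    REJECT = 4


# Hypothetical study-type labels mapped to the tiers listed above.
BASE_TIER = {
    "rct": Tier.PREFER,
    "meta-analysis": Tier.PREFER,
    "systematic review": Tier.PREFER,
    "guideline": Tier.PREFER,
    "observational": Tier.ACCEPT,
    "registry": Tier.ACCEPT,
    "case series": Tier.CAUTION,
    "expert consensus": Tier.CAUTION,
    "narrative review": Tier.CAUTION,
    "case report": Tier.REJECT,
    "letter": Tier.REJECT,
    "preprint": Tier.REJECT,
    "animal study": Tier.REJECT,
}


def classify(
    study_type: str,
    n_patients: Optional[int] = None,
    rare_condition: bool = False,
    mechanism_question: bool = False,
) -> Tier:
    """Return the evidence tier for one study, following the lists above."""
    key = study_type.strip().lower()
    if key == "cohort":
        # Cohorts with >1000 patients are Tier 2; treating smaller
        # cohorts as Tier 3 is an assumption of this sketch.
        return Tier.ACCEPT if (n_patients or 0) > 1000 else Tier.CAUTION
    if key == "case report" and rare_condition:
        return Tier.CAUTION  # stated exception; the tier is an assumption
    if key == "animal study" and mechanism_question:
        return Tier.CAUTION  # stated exception; the tier is an assumption
    if key not in BASE_TIER:
        raise ValueError(f"Unknown study type: {study_type!r}")
    return BASE_TIER[key]
```

For example, `classify("cohort", n_patients=5000)` returns `Tier.ACCEPT`, while `classify("case report")` returns `Tier.REJECT` unless `rare_condition=True` is passed.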

Deep Research Workflow

Progress Tracking

Create this checklist and update after each step:

Deep Research Progress:
- [ ] Step 1: Initialize research project
- [ ] Step 2: Define scope and plan
- [ ] Step 3: Execute research threads (parallel when possible)
- [ ] Step 4: Validate and cross-reference
- [ ] Step 5: Synthesize from files
- [ ] Step 6: Generate final report

Step 1: Initialize Research Project

For research requiring 5+ sources, create a project structure:

mkdir -p ~/research_{topic}/sources
mkdir -p ~/research_{topic}/threads

Project Structure:

~/research_{topic}/
├── plan.md              # Research questions, scope, thread assignments
├── progress.md          # Living checklist, updated throughout
├── sources/
│   └── pubmed.md        # PubMed search results and abstracts
├── threads/
│   ├── thread_1.md      # Independent research thread
│   ├── thread_2.md      # Another thread
│   └── ...
├── validation.md        # Cross-reference and credibility check
├── synthesis.md         # Cross-thread analysis
└── report.md            # Final deliverable

Why file-based? Context windows fill up. Writing findings to files lets you:

  • Continue researching without context pressure
  • Synthesize from persistent storage, not memory
  • Produce larger, more comprehensive reports
  • Resume if interrupted
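
The project structure above can also be scaffolded from Python when shell access is awkward. This is a minimal sketch under stated assumptions: `init_project` and `PLACEHOLDER_FILES` are hypothetical names, and creating empty placeholder files up front is a convenience the skill does not require. Thread files are left to be created as threads start.

```python
from pathlib import Path
from typing import Optional

# Files from the project tree above; thread files are created later.
PLACEHOLDER_FILES = [
    "plan.md",
    "progress.md",
    "sources/pubmed.md",
    "validation.md",
    "synthesis.md",
    "report.md",
]


def init_project(topic: str, base: Optional[Path] = None) -> Path:
    """Create research_{topic}/ with sources/ and threads/ under base."""
    root = (base or Path.home()) / f"research_{topic}"
    (root / "sources").mkdir(parents=True, exist_ok=True)
    (root / "threads").mkdir(parents=True, exist_ok=True)
    for name in PLACEHOLDER_FILES:
        # touch() leaves existing files untouched, so reruns are safe
        (root / name).touch(exist_ok=True)
    return root
```

Calling `init_project("colchicine_cv")` mirrors the two `mkdir -p` commands and is safe to rerun when resuming an interrupted project.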

Step 2: Define Scope and Research Plan

Write plan.md with:

# Research Plan: {Topic}

## Primary Question
[The main thing we're trying to answer]

## Scope
- Include: [what's in scope]
- Exclude: [what's explicitly out]
- Depth: [overview | detailed | exhaustive]
- Deliverable: [report type and length]

## Research Threads

### Thread 1: {Subtopic A}
- Questions to answer: ...
- PubMed search strategy: [MeSH terms, filters]
- Expected study types: RCTs, meta-analyses, etc.
- Can run parallel? Yes/No

### Thread 2: {Subtopic B}
- Questions to answer: ...
- PubMed search strategy: ...
- Can run parallel? Yes/No

[Continue for 2-5 threads]

## Thread Dependencies
- Thread 3 depends on Thread 1 findings
- Threads 1, 2, 4 can run in parallel

## Synthesis Strategy
How will threads combine into final answer?

Planning Guidelines:

| Research Type | Threads | Pattern |
|---------------|---------|---------|
| Simple fact-finding | 1-2 | Sequential |
| Drug comparison | 1 per drug (max 5) | Parallel |
| Complex investigation | 3-5 thematic | Mixed |
| Literature review | By time period or theme | Sequential |
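
The "Thread Dependencies" section of plan.md determines which threads can run together. One way to derive the execution order is a topological sort that emits batches of mutually independent threads. This is a sketch using Python's standard `graphlib` module (Python 3.9+); the `parallel_batches` helper and the thread names are hypothetical, and the dependency map encodes the example from the plan template (Thread 3 depends on Thread 1; Threads 1, 2, 4 are independent).

```python
from graphlib import TopologicalSorter


def parallel_batches(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group threads into batches; each batch can run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # threads whose dependencies are done
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Each thread maps to the set of threads it depends on.
deps = {
    "thread_1": set(),
    "thread_2": set(),
    "thread_3": {"thread_1"},
    "thread_4": set(),
}
print(parallel_batches(deps))
# [['thread_1', 'thread_2', 'thread_4'], ['thread_3']]
```

The first batch corresponds to the parallel execution pattern; later batches wait for earlier ones, which corresponds to the sequential pattern.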

Step 3: Execute Research Threads

PubMed Search Strategy

For each thread, use structured PubMed queries:

# Example search for SGLT2 CV outcomes
pubmed_search_articles(
    queryTerm="SGLT2 inhibitor cardiovascular outcomes randomized controlled trial",
    maxResults=20,
    sortBy="relevance"
)

# Then fetch full details for top results
pubmed_fetch_contents(pmids=["PMID1", "PMID2", ...])

# Find related articles for key papers
pubmed_article_connections(
    sourcePmid="key_paper_pmid",
    relationshipType="pubmed_similar_articles"
)

Parallel Execution Pattern

For independent threads, execute PubMed searches in parallel (multiple tool calls in one turn), then write each to its thread file.

Example: Comparing SGLT2 Inhibitors

Thread 1: Empagliflozin → pubmed_search_articles "empagliflozin cardiovascular RCT" → threads/empagliflozin.md
Thread 2: Dapagliflozin → pubmed_search_articles "dapagliflozin cardiovascular RCT" → threads/dapagliflozin.md
Thread 3: Canagliflozin → pubmed_search_articles "canagliflozin cardiovascular RCT" → threads/canagliflozin.md

Execute all three searches, then write findings to respective files.

Sequential Execution Pattern

For dependent threads, complete each fully before starting the next.

Thread File Format

Each threads/thread_N.md should contain:

# Thread: {Subtopic}

## PubMed Searches Executed
1. Query: [exact query] → [N results] → Top PMIDs: [list]
2. Query: [exact query] → [N results] → Top PMIDs: [list]

## Key Findings

### Finding 1: [Title]
- PMID: [number]
- Citation: [Authors, Journal, Year]
- Study type: RCT / Meta-analysis / Cohort / etc.
- Population: [N patients, characteristics]
- Key result: [HR/OR with 95% CI, p-value]
- Quality: High / Medium / Low [+ brief justification]

### Finding 2: [Title]
- PMID: [number]
...

## Contradictions Found
- PMID X says [claim], PMID Y says [different claim]
- Potential explanation: [patient population, endpoints, timing, etc.]

## Gaps Identified
- No RCT data on [specific question]
- Limited evidence in [patient subgroup]

## Thread Summary
[2-3 sentence synthesis of this thread's findings with key PMIDs cited]
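
Keeping every finding in the same shape makes the later validation and synthesis steps easier to automate. The following is a minimal sketch of rendering one "Finding" entry in the format above; the `Finding` dataclass and `render_finding` function are hypothetical helpers, and the sample values are placeholders, not real study data.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    title: str
    pmid: str
    citation: str
    study_type: str
    population: str
    key_result: str
    quality: str


def render_finding(index: int, f: Finding) -> str:
    """Render one finding as a markdown block in the thread file format."""
    return "\n".join([
        f"### Finding {index}: {f.title}",
        f"- PMID: {f.pmid}",
        f"- Citation: {f.citation}",
        f"- Study type: {f.study_type}",
        f"- Population: {f.population}",
        f"- Key result: {f.key_result}",
        f"- Quality: {f.quality}",
    ])


# Placeholder values only; fill these in from pubmed_fetch_contents output.
example = Finding(
    title="[Title]",
    pmid="12345678",
    citation="[Authors, Journal, Year]",
    study_type="RCT",
    population="[N patients, characteristics]",
    key_result="[HR/OR with 95% CI, p-value]",
    quality="High [+ brief justification]",
)
print(render_finding(1, example))
```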

Context Offloading

After every 5-7 tool calls:

  1. Write current findings to appropriate file
  2. Update progress.md with status
  3. Continue with fresh context

Trigger for offload:

  • Context feeling "full" (responses slowing, losing track)
  • Switching between threads
  • Before any synthesis step
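
The "every 5-7 tool calls" rule can be tracked with a simple counter. This is a sketch only: `OffloadTracker` is a hypothetical helper, and the default threshold of 5 is just the lower bound of the range above. The other triggers (switching threads, starting synthesis) still need to be handled separately.

```python
class OffloadTracker:
    """Counts tool calls and signals when findings should be written to file."""

    def __init__(self, threshold: int = 5) -> None:
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold
        self.calls_since_offload = 0

    def record_call(self) -> bool:
        """Record one tool call; return True when an offload is due."""
        self.calls_since_offload += 1
        return self.calls_since_offload >= self.threshold

    def mark_offloaded(self) -> None:
        """Reset the counter after findings and progress.md are written."""
        self.calls_since_offload = 0
```

Call `record_call()` after each PubMed tool call; when it returns `True`, write findings to the thread file, update progress.md, then call `mark_offloaded()`.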

Step 4: Validate and Cross-Reference

Read all thread files, then create validation.md:

# Validation Report

## Facts Requiring Cross-Reference
| Claim | Thread Source | PMID | Verification Status | Confidence |
|-------|--------------|------|---------------------|------------|
| SGLT2i reduces HF hospitalization | Thread 1 | 12345678 | Confirmed by PMIDs 23456789, 34567890 | High |
| Benefit extends to HFpEF | Thread 2 | 45678901 | Conflicting: PMID 56789012 shows null | Investigate |

## Contradictions Analysis
### Contradiction 1: [Description]
- Position A: PMID [X], [study name], found [result]
- Position B: PMID [Y], [study name], found [result]
- Resolution: [Population difference / endpoint difference / timing / unresolved]

## Source Quality Assessment
| PMID | Study | Type | N | Quality | Notes |
|------|-------|------|---|---------|-------|
| 12345678 | EMPA-REG | RCT | 7,020 | High | Industry-funded but well-designed |
| 23456789 | Meta-analysis | MA | 45,000 | High | Published in Lancet |

## Validated Knowledge Base
[List of facts we're confident in, with PMIDs]

1. **SGLT2 inhibitors reduce CV death in T2DM with established CVD** (PMID: 12345678, 23456789)
2. **Benefit on HF hospitalization is consistent across the class** (PMID: 34567890, 45678901)
3. ...

Step 5: Synthesize from Files

Critical: Read from files, not memory.

# Read all thread files
cat ~/research_{topic}/threads/*.md

# Read validation
cat ~/research_{topic}/validation.md

Write synthesis.md:

# Synthesis: {Topic}

## Cross-Thread Patterns
[What themes emerge across multiple threads?]

## Key Insights
1. [Insight that required combining multiple threads]
2. [Insight that wasn't obvious in any single thread]
3. ...

## The Answer
[Direct response to the primary research question, with PMID citations]

## Evidence Strength Assessment
- **Strong evidence (multiple RCTs):** [claims]
- **Moderate evidence (single RCT or consistent observational):** [claims]
- **Limited evidence (observational only):** [claims]
- **Expert opinion / guideline extrapolation:** [claims]

## Remaining Gaps
[What we still don't know and would need to investigate further]
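
The four evidence-strength levels in the synthesis template can be assigned with a simple rule. This is a sketch under stated assumptions: `evidence_strength` is a hypothetical helper, and reading "consistent observational" as two or more observational studies that agree is an interpretation, not something the template defines.

```python
def evidence_strength(
    n_rcts: int,
    n_observational: int = 0,
    observational_consistent: bool = False,
) -> str:
    """Map supporting-study counts for one claim to an evidence level."""
    if n_rcts >= 2:
        return "strong"  # multiple RCTs
    if n_rcts == 1:
        return "moderate"  # single RCT
    if n_observational >= 2 and observational_consistent:
        return "moderate"  # consistent observational (assumed: 2+ agreeing)
    if n_observational >= 1:
        return "limited"  # observational only
    return "expert opinion / guideline extrapolation"
```

For example, a claim backed by three RCTs is rated `"strong"`, while a claim backed by a single registry analysis is rated `"limited"`.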

Step 6: Generate Final Report

Write report.md using the synthesis:

# {Title}

## Executive Summary
[3-5 sentences: question, key finding, main conclusion with strongest PMID]

## Research Question and Scope
[From plan.md]

## Methodology
- Database: PubMed via PubMed MCP
- Search date: [date]
- Total articles screened: [N]
- Articles included: [N]
- Study types: [breakdown]

## Findings

### {Theme 1}
[Narrative synthesis with inline PMID citations]

### {Theme 2}
...

## Analysis
[Patterns, implications, connections]

## Conclusions
1. [Primary conclusion with evidence level]
2. [Secondary conclusions]

## Clinical Implications
[If applicable: what this means for practice]

## Limitations
- [Search limitations]
- [Evidence gaps]
- [Potential biases]

## References
[Full reference list with PMIDs and DOIs]

1. Author A, Author B, et al. Title. Journal. Year;Vol:Pages. PMID: XXXXXXXX. DOI: XX.XXXX/XXXXX
2. ...
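
A small formatter keeps reference entries consistent with the pattern shown at the end of the report template. This is a sketch: `format_reference` is a hypothetical helper, and abbreviating to "et al" after the first two authors is an assumption inferred from the sample entry.

```python
def format_reference(
    authors: list[str],
    title: str,
    journal: str,
    year: str,
    volume: str,
    pages: str,
    pmid: str,
    doi: str,
) -> str:
    """Format one entry as: Authors. Title. Journal. Year;Vol:Pages. PMID. DOI."""
    if len(authors) > 2:
        # Assumption: list the first two authors, then "et al"
        author_str = ", ".join(authors[:2]) + ", et al"
    else:
        author_str = ", ".join(authors)
    return (
        f"{author_str}. {title}. {journal}. "
        f"{year};{volume}:{pages}. PMID: {pmid}. DOI: {doi}"
    )


# Placeholder values only, mirroring the sample entry in the template.
print(format_reference(
    ["Author A", "Author B", "Author C"],
    "Title", "Journal", "Year", "Vol", "Pages",
    "XXXXXXXX", "XX.XXXX/XXXXX",
))
```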

Parallel Research Patterns

Pattern A: Drug/Entity Comparison

Use when: Comparing 2-5 similar entities (drugs, devices, techniques)

User: "Compare CV outcomes of GLP-1 agonists"
→ Thread per drug (semaglutide, liraglutide, tirzepatide; note that tirzepatide is a dual GIP/GLP-1 agonist)
→ All threads parallel (same PubMed structure)
→ Comparison matrix synthesis

Pattern B: Pro/Con Analysis

Use when: Topic has debate or controversy

User: "Analyze the evidence on aggressive LDL lowering"
→ Thread 1: Evidence FOR aggressive targets (PubMed: LDL <55 outcomes)
→ Thread 2: Evidence AGAINST/concerns (PubMed: LDL lowering adverse effects)
→ Thread 3: Current guidelines (fetch ACC/ESC guideline URLs)
→ Threads 1-2 parallel, Thread 3 after

Pattern C: Evidence + Guidelines

Use when: Need both primary evidence and clinical guidance

User: "What's the evidence on TAVR durability?"
→ Thread 1: Trial data (PubMed: TAVR long-term outcomes RCT)
→ Thread 2: Registry data (PubMed: TAVR registry durability)
→ Thread 3: Guidelines (fetch ACC/ESC valve guidelines)
→ All parallel

Pattern D: Historical Evolution

Use when: Understanding how evidence has evolved

User: "How has heart failure treatment evolved?"
→ Thread 1: Pre-neurohormonal era (PubMed: heart failure treatment 1980-1990)
→ Thread 2: ACE/ARB/BB era (PubMed: heart failure ACE inhibitor landmark)
→ Thread 3: Modern era ARNI/SGLT2 (PubMed: heart failure SGLT2 ARNI)
→ Sequential (each builds context for next)

Quality Checkpoints

After Step 2 (Planning)

  • Research question is specific and answerable
  • PubMed search strategies are defined for each thread
  • Threads are independent where marked parallel
  • Expected study types are specified

After Step 3 (Execution)

  • Each thread has 3+ credible PubMed sources
  • Key claims have specific data (HR, CI, p-value)
  • All citations have PMIDs
  • Gaps and contradictions are documented
  • Thread summaries are written

After Step 4 (Validation)

  • Key facts cross-referenced across threads
  • Contradictions analyzed with potential explanations
  • Source quality assessed for each major citation
  • Validated knowledge base compiled

After Step 5 (Synthesis)

  • Cross-thread patterns identified
  • Primary question directly answered
  • Evidence strength honestly assessed
  • Insights go beyond any single thread

Before Delivery

  • Report structure matches user's requested format
  • All claims have PMID citations
  • Executive summary is truly executive (skimmable)
  • Reference list is complete with DOIs

Common Research Pitfalls

| Pitfall | Symptom | Fix |
|---------|---------|-----|
| Context overflow | Losing track of earlier findings | Write to files every 5-7 tool calls |
| Confirmation bias | All sources agree suspiciously | Search for contradicting evidence explicitly |
| Recency bias | Only 2023-2024 sources | Include landmark trials regardless of date |
| Source homogeneity | All RCTs, no guidelines | Add guideline thread for clinical context |
| Scope creep | Research expanding endlessly | Return to plan.md, enforce boundaries |
| Premature synthesis | Concluding before validation | Complete Step 4 before Step 5 |
| Memory-based synthesis | Citing from recall | Read files explicitly during Step 5 |
| Non-PubMed citations | Citing Perplexity/web | Delete and replace with PubMed source |

Example: Full Research Session

User: "Research the current evidence on colchicine for cardiovascular prevention"

Step 1: Initialize

mkdir -p ~/research_colchicine_cv/sources
mkdir -p ~/research_colchicine_cv/threads

Step 2: Plan (write to plan.md)

  • Primary question: What's the evidence for colchicine in CV prevention?
  • Thread 1: Major RCTs (COLCOT, LoDoCo2, CLEAR SYNERGY)
    • PubMed: "colchicine cardiovascular randomized controlled trial"
  • Thread 2: Mechanisms and anti-inflammatory hypothesis
    • PubMed: "colchicine inflammation atherosclerosis mechanism"
  • Thread 3: Guidelines and clinical adoption
    • Fetch: ACC/ESC guideline URLs for stable CAD
  • Thread 4: Safety and practical considerations
    • PubMed: "colchicine adverse effects cardiovascular"
  • Threads 1, 2, 4 parallel; Thread 3 after 1 completes

Step 3: Execute

# Parallel searches
pubmed_search_articles(queryTerm="colchicine cardiovascular randomized controlled trial", maxResults=15)
pubmed_search_articles(queryTerm="colchicine inflammation atherosclerosis mechanism", maxResults=10)
pubmed_search_articles(queryTerm="colchicine adverse effects cardiovascular", maxResults=10)

# Fetch top results
pubmed_fetch_contents(pmids=["31733140", "32865377", "37634428"])  # intended: COLCOT, LoDoCo2, CLEAR SYNERGY; confirm each PMID against the search results before citing

# Write to thread files

Step 4: Validate

  • Read all thread files
  • Cross-reference mortality data across trials
  • Note: CLEAR SYNERGY neutral vs positive COLCOT/LoDoCo2
  • Analyze: Patient population differences (COLCOT and CLEAR SYNERGY enrolled post-MI patients; LoDoCo2 enrolled chronic coronary disease)
  • Write validation.md

Step 5: Synthesize

  • Read from files
  • Pattern: Inflammation hypothesis supported, but patient selection matters
  • Insight: Chronic coronary disease (LoDoCo2) benefit supported; post-MI results conflict (COLCOT positive, CLEAR SYNERGY neutral)
  • Write synthesis.md

Step 6: Report

  • Structured report with evidence summary
  • Clear recommendation by patient type
  • All PMIDs cited
  • Complete reference list

Integration with Other Skills

This skill provides the research foundation for:

  • cardiology-editorial → Use research output for trial analysis
  • cardiology-newsletter-writer → Research before writing
  • youtube-script-master → Research for script evidence base
  • x-post-creator-skill → Research before tweet generation

Workflow:

  1. User requests content on topic
  2. Run deep-researcher first (this skill)
  3. Pass validated findings to writing skill
  4. Writing skill cites PMIDs from research output

When NOT to Use This Skill

  • Simple factual questions (use PubMed MCP directly)
  • Trend discovery (use Perplexity, but don't cite)
  • Non-medical topics (this skill is optimized for PubMed)
  • Quick content needs (use writing skill directly with inline research)

Use this skill when you need:

  • 5+ sources synthesized
  • Complex multi-faceted questions
  • Rigorous evidence assessment
  • Comprehensive literature coverage