long-session

from eihli

Set up artifacts for multi-session autonomous work. Use when: (1) starting overnight/long autonomous runs, (2) user says "long session", "overnight", "autonomous", (3) work will span multiple sessions without user interaction.

Updated Jan 11, 2026

When & Why to Use This Skill

This Claude skill provides a structured framework for preparing long-running, autonomous agent sessions. It walks through five phases (explore, set goals, plan, write tests, confirm and launch), each gated on user confirmation, so the agent can then run unattended against clear, test-defined completion criteria.

Use Cases

  • Overnight Autonomous Development: Setting up a comprehensive environment and success criteria for the AI to implement complex features or refactor codebases while the user is offline.
  • Multi-Session Project Management: Creating persistent artifacts like 'features.json' and 'claude-progress.txt' to maintain state and context across different interaction sessions, preventing loss of progress.
  • Structured TDD Loops: Establishing a rigorous 'Test-Before-Code' workflow where the agent is provided with failing tests and clear completion promises to ensure high-quality autonomous output.
  • Complex Task Delegation: Breaking down large, ambiguous requests into manageable phases with explicit user confirmation checkpoints before launching an unattended execution loop.

Long Session: Ralph Loop Prep

Structured prep before entering autonomous loop. Each phase requires user confirmation.

Phase 1: Explore

Understand the codebase before planning.

- Read README, CLAUDE.md, project structure
- Identify tech stack, patterns, conventions
- Find existing tests and how they run
- Summarize findings to user (keep brief)
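
One way to approach "find existing tests and how they run" is to look for well-known marker files at the project root. The sketch below is only a heuristic: the `RUNNER_MARKERS` table and both function names are hypothetical, and the commands are the usual defaults for each tool, not something this skill prescribes.

```python
from pathlib import Path

# Hypothetical table: marker file -> the command that usually runs its tests.
# Extend or replace it to match the project being explored.
RUNNER_MARKERS = {
    "pytest.ini": "pytest",
    "conftest.py": "pytest",
    "jest.config.js": "npx jest",
    "Cargo.toml": "cargo test",
    "go.mod": "go test ./...",
}


def commands_for_files(filenames):
    """Return likely test commands for the given file names, without duplicates."""
    present = set(filenames)
    commands = []
    for marker, command in RUNNER_MARKERS.items():
        if marker in present and command not in commands:
            commands.append(command)
    return commands


def detect_test_commands(root="."):
    """Apply commands_for_files to the entries directly under root."""
    return commands_for_files(p.name for p in Path(root).iterdir())
```

A result of `[]` does not mean the project has no tests; it only means none of the listed markers sit at the top level, so the README or CLAUDE.md remains the better source.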

Phase 2: Goals

Ask user what they want built.

Questions to ask:
- What feature(s) should this session produce?
- Any constraints or requirements?
- What does "done" look like?
- How long can this run unattended? (set --max-iterations)

Create/update features.json:

{
  "features": [
    {"id": 1, "name": "feature name", "done_when": "criteria", "tests": []}
  ],
  "max_iterations": 50
}
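
A minimal sketch of building and sanity-checking that structure before writing it to disk. The key names come from the example above; the helper names `make_features` and `validate_features` and the specific checks are assumptions, not part of the skill.

```python
import json

REQUIRED_KEYS = ("id", "name", "done_when", "tests")


def make_features(items, max_iterations=50):
    """Build the features.json structure from (name, done_when) pairs."""
    features = [
        {"id": i, "name": name, "done_when": done_when, "tests": []}
        for i, (name, done_when) in enumerate(items, start=1)
    ]
    return {"features": features, "max_iterations": max_iterations}


def validate_features(doc):
    """Return a list of problems; an empty list means the document looks usable."""
    problems = []
    features = doc.get("features")
    if not isinstance(features, list) or not features:
        problems.append("features must be a non-empty list")
        features = []
    seen_ids = set()
    for index, feature in enumerate(features):
        missing = [key for key in REQUIRED_KEYS if key not in feature]
        if missing:
            problems.append(f"feature {index} is missing {missing}")
        if feature.get("id") in seen_ids:
            problems.append(f"duplicate id {feature.get('id')}")
        seen_ids.add(feature.get("id"))
    limit = doc.get("max_iterations")
    if not isinstance(limit, int) or isinstance(limit, bool) or limit <= 0:
        problems.append("max_iterations must be a positive integer")
    return problems


def to_json(doc):
    """Serialize in the same shape as the example above."""
    return json.dumps(doc, indent=2) + "\n"
```

Writing the file is then `open("features.json", "w").write(to_json(doc))`, done only once `validate_features(doc)` returns an empty list.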

Phase 3: Plan

Design implementation approach.

For each feature:
- Files to create/modify
- Dependencies or prerequisites
- Risks or unknowns
- Order of implementation

Present plan to user. Do NOT proceed without explicit approval.
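
If the plan records prerequisites per feature, the order of implementation can be derived rather than guessed. A small sketch using the standard library's `graphlib` (Python 3.9+); the feature names are made up and `implementation_order` is a hypothetical helper.

```python
from graphlib import TopologicalSorter


def implementation_order(prerequisites):
    """Order features so that every prerequisite comes before its dependents.

    prerequisites maps a feature name to the set of features it depends on.
    A circular dependency raises graphlib.CycleError.
    """
    return list(TopologicalSorter(prerequisites).static_order())


# Hypothetical plan: the UI needs the API, and the API needs the schema.
plan = {"schema": set(), "api": {"schema"}, "ui": {"api"}}
order = implementation_order(plan)
```

A `CycleError` at this stage is useful: it points to a risk that belongs in the plan shown to the user before anything runs unattended.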

Phase 4: Write Tests

TDD: tests define "done" for the ralph loop.

- Write failing tests that capture each feature's acceptance criteria
- Tests should be runnable (pytest, jest, etc.)
- Verify tests fail for the right reasons
- Update features.json with test file paths

Show user the test names/descriptions. Confirm they capture intent.
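
For projects that use pytest, "fail for the right reasons" can be partly checked from the exit code: 1 means tests ran and some failed, while 2 (interrupted, which includes collection errors such as a bad import) and 5 (no tests collected) usually mean the test file itself is broken. The sketch below assumes pytest is installed; the function names and wording of the labels are hypothetical. Exit code 1 still does not distinguish a failed assertion from an unexpected exception inside a test, so the output is worth reading.

```python
import subprocess
import sys

# Exit codes as documented by pytest; the labels are this sketch's own wording.
PYTEST_OUTCOMES = {
    0: "passing",
    1: "failing",
    2: "interrupted",
    5: "no tests collected",
}


def classify_pytest_exit(code):
    """Map a pytest exit code to a short label."""
    return PYTEST_OUTCOMES.get(code, f"unexpected exit code {code}")


def run_and_classify(test_path):
    """Run pytest on one path and classify the result."""
    result = subprocess.run(
        [sys.executable, "-m", "pytest", "-q", test_path],
        capture_output=True,
        text=True,
    )
    return classify_pytest_exit(result.returncode)


def record_test_paths(features_doc, feature_id, paths):
    """Add test file paths to the matching feature's "tests" list."""
    for feature in features_doc["features"]:
        if feature["id"] == feature_id:
            for path in paths:
                if path not in feature["tests"]:
                    feature["tests"].append(path)
            return
    raise KeyError(feature_id)
```

Before the loop starts, only "failing" is the wanted outcome for a new test. "passing" suggests the test does not yet constrain anything, or that the feature already exists.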

Phase 5: Confirm & Launch

Final checklist before autonomous loop:

1. User approved plan? [y/n]
2. Tests written and failing? [y/n]
3. Tests capture all features? [y/n]
4. max_iterations set? [value]
5. Completion criteria clear? [y/n]
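
The checklist can also be held as data so that nothing gets skipped. This is a sketch only: the `LaunchChecklist` class and its field names are hypothetical, mapped one-to-one onto the five items above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LaunchChecklist:
    plan_approved: bool = False
    tests_written_and_failing: bool = False
    tests_cover_all_features: bool = False
    max_iterations: Optional[int] = None
    completion_criteria_clear: bool = False

    def blockers(self):
        """Return the checklist items that still prevent launch."""
        pending = []
        if not self.plan_approved:
            pending.append("user has not approved the plan")
        if not self.tests_written_and_failing:
            pending.append("tests are not written and failing")
        if not self.tests_cover_all_features:
            pending.append("tests do not capture all features")
        if self.max_iterations is None or self.max_iterations <= 0:
            pending.append("max_iterations is not set")
        if not self.completion_criteria_clear:
            pending.append("completion criteria are unclear")
        return pending
```

An empty list from `blockers()` corresponds to "all confirmed".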

If all confirmed, create ralph prompt:

Implement features to make all tests pass.

Features: [list from features.json]
Tests: [test commands]
Constraints: [from CLAUDE.md + user input]

After each iteration:
- Run tests
- If all pass + lint clean + types pass + features.json updated → output <promise>COMPLETE</promise>
- If stuck after 5 iterations on same issue → document blocker, continue

Output <promise>COMPLETE</promise> only when:
- All tests green
- No type errors
- Lint clean
- features.json updated with done_when satisfied
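
The bracketed placeholders in the prompt can be filled from features.json. A minimal sketch, assuming the structure created in Phase 2; `build_ralph_prompt` is a hypothetical helper, and the way each feature is rendered (name followed by its done_when in parentheses) is one choice among many.

```python
def build_ralph_prompt(features_doc, test_commands, constraints):
    """Fill the prompt template from features.json data and user input."""
    features = "; ".join(
        f"{item['name']} ({item['done_when']})" for item in features_doc["features"]
    )
    lines = [
        "Implement features to make all tests pass.",
        "",
        f"Features: {features}",
        f"Tests: {', '.join(test_commands)}",
        f"Constraints: {'; '.join(constraints) or 'none'}",
        "",
        "After each iteration:",
        "- Run tests",
        "- If all pass + lint clean + types pass → output <promise>COMPLETE</promise>",
        "- If stuck after 5 iterations on same issue → document blocker, continue",
        "",
        "Output <promise>COMPLETE</promise> only when:",
        "- All tests green",
        "- No type errors",
        "- Lint clean",
        "- features.json updated with done_when satisfied",
    ]
    return "\n".join(lines)
```

Generating the prompt from the same file the loop later updates keeps the feature list and the completion criteria from drifting apart.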

Then launch:

/ralph-loop "<prompt>" --max-iterations <n> --completion-promise "COMPLETE"
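
A sketch of assembling that line from its parts. It uses only the flags shown above and refuses to build a command without a positive iteration limit. How `/ralph-loop` parses quotes inside the prompt is not stated here, so the backslash escaping below is an assumption, and `ralph_loop_command` is a hypothetical helper.

```python
def _double_quote(text):
    """Wrap text in double quotes, escaping backslashes and embedded quotes.

    Assumption: the command accepts shell-style backslash escapes.
    """
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def ralph_loop_command(prompt, max_iterations, promise="COMPLETE"):
    """Build the launch line; max_iterations is mandatory as a safety net."""
    if not isinstance(max_iterations, int) or max_iterations <= 0:
        raise ValueError("max_iterations must be a positive integer")
    return (
        f"/ralph-loop {_double_quote(prompt)} "
        f"--max-iterations {max_iterations} "
        f"--completion-promise {_double_quote(promise)}"
    )
```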

Artifacts Created

File                 Purpose
features.json        Feature list with completion criteria
claude-progress.txt  Session handoff notes (updated by ralph)
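
The format of claude-progress.txt is not specified here. As one possible shape, the sketch below appends a timestamped entry per iteration; the entry layout and both function names are assumptions.

```python
from datetime import datetime, timezone


def format_progress_entry(iteration, summary, blockers=(), timestamp=None):
    """Render one handoff note; timestamp defaults to the current UTC time."""
    stamp = timestamp or datetime.now(timezone.utc).isoformat(timespec="seconds")
    lines = [f"[{stamp}] iteration {iteration}", f"summary: {summary}"]
    lines += [f"blocker: {blocker}" for blocker in blockers]
    return "\n".join(lines) + "\n\n"


def append_progress(entry, path="claude-progress.txt"):
    """Append an entry so that earlier sessions' notes are preserved."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(entry)
```

Appending rather than overwriting lets a later session read the full history, including blockers documented after repeated failed iterations.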

Constraints

  • User confirms each phase - no skipping ahead
  • Tests before loop - ralph needs pass/fail signal
  • One feature at a time in implementation (ralph handles this)
  • max_iterations always set - safety net