Question 1

What is Server Management?

Accepted Answer

This Claude skill serves as a comprehensive guide for server management and production operations, emphasizing architectural principles and strategic decision-making over simple command memorization. It provides structured frameworks for process management, monitoring, log rotation, scaling strategies, and security protocols to ensure high availability and system stability.

Question 2

When should I use Server Management?

Accepted Answer

Server Management is useful in the following scenarios:
• Establishing robust process management using PM2 or systemd to ensure auto-recovery and zero-downtime reloads for production applications.
• Designing a multi-tiered monitoring and alerting strategy to track system health, performance metrics, and resource utilization effectively.
• Implementing structured logging and rotation policies to maintain system auditability while preventing disk space exhaustion.
• Determining the optimal scaling approach (vertical vs. horizontal) based on real-time resource bottlenecks and traffic patterns.
• Executing a prioritized troubleshooting workflow to systematically resolve service outages, resource leaks, or connectivity issues.
• Hardening server security through SSH-only access, firewall configuration, and regular patch management principles.

name	server-management
description	Server management principles and decision-making. Process management, monitoring strategy, and scaling decisions. Teaches thinking, not commands.
allowed-tools	Read, Write, Edit, Glob, Grep, Bash

Tool Selection

Scenario	Tool
Node.js app	PM2 (clustering, reload)
Any app	systemd (Linux native)
Containers	Docker/Podman
Orchestration	Kubernetes, Docker Swarm

Process Management Goals

Goal	What It Means
Restart on crash	Auto-recovery
Zero-downtime reload	No service interruption
Clustering	Use all CPU cores
Persistence	Survive server reboot
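
The "restart on crash" goal can be sketched in a few lines. This is a minimal illustration of the idea, not how PM2 or systemd implement it; the names `backoff_delays` and `supervise` and the default delays are assumptions made for the example.

```python
import time
from typing import Callable, List


def backoff_delays(base: float = 1.0, factor: float = 2.0,
                   cap: float = 30.0, attempts: int = 5) -> List[float]:
    """Return the wait (in seconds) before each restart attempt, capped at `cap`."""
    return [min(cap, base * factor ** i) for i in range(attempts)]


def supervise(task: Callable[[], None], max_restarts: int = 5,
              sleep: Callable[[float], None] = time.sleep) -> int:
    """Run `task`, restarting it after a crash; return how many restarts were used."""
    delays = backoff_delays(attempts=max_restarts)
    restarts = 0
    while True:
        try:
            task()
            return restarts          # clean exit: stop supervising
        except Exception:
            if restarts >= max_restarts:
                raise                # give up and surface the failure
            sleep(delays[restarts])  # wait longer after each consecutive crash
            restarts += 1
```

Growing the delay between restarts keeps a process that crashes on startup from consuming the machine in a tight restart loop, and the restart limit makes a persistent failure visible instead of hiding it.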

What to Monitor

Category	Key Metrics
Availability	Uptime, health checks
Performance	Response time, throughput
Errors	Error rate, types
Resources	CPU, memory, disk
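
As a rough sketch of how the performance and error metrics above could be derived from raw request samples: the `Request` record and the helper names are hypothetical, and a real deployment would normally rely on its monitoring tool for these numbers.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    duration_ms: float  # response time of one request
    status: int         # HTTP status code returned


def error_rate(requests: List[Request]) -> float:
    """Fraction of requests that ended in a 5xx status."""
    if not requests:
        return 0.0
    return sum(r.status >= 500 for r in requests) / len(requests)


def percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile (pct in 0-100) of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def throughput(requests: List[Request], window_seconds: float) -> float:
    """Requests handled per second over the sampling window."""
    return len(requests) / window_seconds
```

A high percentile such as p95 is usually more informative than the average, because a handful of very slow requests can hide behind a healthy-looking mean.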

Alert Severity Strategy

Level	Response
Critical	Immediate action
Warning	Investigate soon
Info	Review daily
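
One way to express the severity levels above is a pair of thresholds per metric. The sketch below assumes a metric where higher is worse (disk usage, error rate); the threshold values in the usage note are illustrative only.

```python
from enum import Enum


class Severity(Enum):
    INFO = "review daily"
    WARNING = "investigate soon"
    CRITICAL = "immediate action"


def classify(value: float, warning_at: float, critical_at: float) -> Severity:
    """Map a metric reading to a severity level (higher value means worse)."""
    if value >= critical_at:
        return Severity.CRITICAL
    if value >= warning_at:
        return Severity.WARNING
    return Severity.INFO
```

For example, `classify(0.93, warning_at=0.80, critical_at=0.90)` treats 93% disk usage as critical. Keeping the thresholds as parameters makes it easy to tune them per metric so that critical alerts stay rare and actionable.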

Server Management

When & Why to Use This Skill

Use Cases

1. Process Management Principles

Tool Selection

Process Management Goals

2. Monitoring Principles

What to Monitor

Alert Severity Strategy

Monitoring Tool Selection

3. Log Management Principles

Log Strategy

Log Principles

4. Scaling Decisions

When to Scale

Scaling Strategy

5. Health Check Principles

What Constitutes Healthy

Health Check Implementation

6. Security Principles

7. Troubleshooting Priority

8. Anti-Patterns

Monitoring Tool Selection

Need	Options
Simple/Free	PM2 metrics, htop
Full observability	Grafana, Datadog
Error tracking	Sentry
Uptime	UptimeRobot, Pingdom

Log Strategy

Log Type	Purpose
Application logs	Debug, audit
Access logs	Traffic analysis
Error logs	Issue detection
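
Log rotation keeps these logs from filling the disk. A minimal sketch using the standard library's `RotatingFileHandler` follows; the function name `build_logger`, the size limit, and the backup count are assumptions chosen for illustration.

```python
import logging
from logging.handlers import RotatingFileHandler


def build_logger(name: str, path: str, max_bytes: int = 10_000_000,
                 backups: int = 5) -> logging.Logger:
    """Create a logger whose file is rotated by size, keeping `backups` old files."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching a second handler on repeated calls
        handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
        )
        logger.addHandler(handler)
    logger.propagate = False  # keep output out of the root logger's handlers
    return logger
```

With these settings disk usage is bounded at roughly `max_bytes * (backups + 1)`. Services managed by systemd or PM2 often delegate rotation to the platform instead (for example logrotate), in which case the application only needs to write to stdout.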

When to Scale

Symptom	Solution
High CPU	Add instances (horizontal)
High memory	Increase RAM or fix leak
Slow response	Profile first, then scale
Traffic spikes	Auto-scaling
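
The symptom table can be read as a small decision procedure. The sketch below is one possible encoding; the function name and the threshold defaults are hypothetical, and traffic spikes are left out because they call for auto-scaling rather than a one-off decision.

```python
from typing import List


def recommend(cpu_pct: float, mem_pct: float, p95_ms: float, *,
              cpu_limit: float = 80.0, mem_limit: float = 85.0,
              latency_limit_ms: float = 500.0) -> List[str]:
    """Suggest scaling actions from current resource and latency readings."""
    actions: List[str] = []
    if cpu_pct >= cpu_limit:
        actions.append("add instances (horizontal)")
    if mem_pct >= mem_limit:
        actions.append("increase RAM or fix leak")
    # Slow responses without resource pressure point at the code or a
    # dependency, so profiling comes before any scaling.
    if p95_ms >= latency_limit_ms and not actions:
        actions.append("profile first, then scale")
    return actions
```

Returning a list rather than a single action reflects that several symptoms can appear together, for example high CPU alongside a memory leak.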

Scaling Strategy

Type	When to Use
Vertical	Quick fix, single instance
Horizontal	Sustainable, distributed
Auto	Variable traffic
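
For the auto-scaling case, a common proportional rule (the same shape as the one documented for the Kubernetes Horizontal Pod Autoscaler) sizes the fleet so the observed metric returns to its target. The function below is a sketch; its name, parameters, and replica bounds are assumptions.

```python
import math


def desired_replicas(current: int, observed: float, target: float,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Scale the instance count in proportion to observed/target load."""
    wanted = math.ceil(current * observed / target)
    # Clamp so a metric glitch cannot scale to zero or without limit.
    return max(min_replicas, min(max_replicas, wanted))
```

With 4 instances at 90% CPU and a 60% target, `desired_replicas(4, 90, 60)` asks for 6 instances; at 30% CPU it asks for 2.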

What Constitutes Healthy

Check	Meaning
HTTP 200	Service responding
Database connected	Data accessible
Dependencies OK	External services reachable
Resources OK	CPU/memory not exhausted
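
A health endpoint typically runs each of these checks and reports healthy only when all of them pass. The sketch below is framework-independent; `health_report` and the check names in the usage note are hypothetical.

```python
from typing import Callable, Dict, Tuple


def health_report(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, Dict[str, str]]:
    """Run every check; return an HTTP status code and a per-check result."""
    results: Dict[str, str] = {}
    for name, check in checks.items():
        try:
            results[name] = "ok" if check() else "fail"
        except Exception:
            results[name] = "fail"  # a check that raises counts as unhealthy
    status = 200 if all(v == "ok" for v in results.values()) else 503
    return status, results
```

A caller might pass `{"database": ping_db, "cache": ping_cache}`, where both functions are supplied by the application. Reporting the per-check results alongside the status code tells an operator which dependency failed without digging through logs.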

Security Principles

Area	Principle
Access	SSH keys only, no passwords
Firewall	Only needed ports open
Updates	Regular security patches
Secrets	Environment vars, not files
Audit	Log access and changes
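
The "environment vars, not files" principle can be enforced at startup by failing fast when a secret is missing and by never logging its full value. This is a minimal sketch; `require_secret`, `redact`, and the variable name in the usage note are assumptions.

```python
import os
from typing import Mapping


class MissingSecret(RuntimeError):
    """Raised when a required secret is absent from the environment."""


def require_secret(name: str, env: Mapping[str, str] = os.environ) -> str:
    """Return the secret stored in `name`, or fail loudly if it is unset or empty."""
    value = env.get(name, "")
    if not value:
        raise MissingSecret(f"{name} is not set")
    return value


def redact(value: str, keep: int = 4) -> str:
    """Mask a secret for audit logs, leaving only the last `keep` characters."""
    if len(value) <= keep:
        return "*" * len(value)
    return "*" * (len(value) - keep) + value[-keep:]
```

Calling `require_secret("DATABASE_URL")` during startup turns a misconfigured deployment into an immediate, visible failure rather than an error on the first request.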

Anti-Patterns

❌ Don't	✅ Do
Run as root	Use non-root user
Ignore logs	Set up log rotation
Skip monitoring	Monitor from day one
Manual restarts	Auto-restart config
No backups	Regular backup schedule
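
The first anti-pattern can be guarded against in the application itself. The sketch below assumes a Unix-like system, since `os.geteuid` is not available on Windows, and the function name `refuse_root` is hypothetical.

```python
import os
from typing import Optional


def refuse_root(euid: Optional[int] = None) -> None:
    """Abort startup when the process is running with root privileges."""
    if euid is None:
        euid = os.geteuid()  # effective user ID of the current process
    if euid == 0:
        raise PermissionError("refusing to run as root; use a dedicated service user")
```

Calling `refuse_root()` at the top of the entry point complements, rather than replaces, setting a non-root user in the service manager or container configuration.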