Preventing Silent Bot Deaths: 24/7 Supervision Strategies


How to keep your Discord bot running without mysterious crashes. Covers process managers, health checks, automatic restarts, and monitoring for bots that die silently.

Written by Jochem, Infrastructure Expert, 5-10 years experience in game server hosting, VPS infrastructure, and 24/7 streaming solutions.

The worst kind of bot failure is the silent one. Your bot crashes at 3 AM, no errors in the logs, and you don't notice until someone messages you at noon asking why the bot is down.


Why Bots Die Silently

Cause                        | Frequency   | Detection
Memory leak (OOM kill)       | Common      | System logs only
Discord gateway disconnect   | Common      | No error if not caught
Unhandled promise rejection  | Very common | Process exits silently
Host restarts                | Occasional  | No notification
Rate limiting                | Rare        | Bot stops responding but stays "alive"

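The most insidious is the memory leak. Your Node.js or Python bot slowly consumes more RAM until the kernel's out-of-memory (OOM) killer terminates it. The process just disappears: no crash log and no error output from the bot itself. The only trace is an entry in the system's kernel log (for example, in the output of dmesg or journalctl -k).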
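
To confirm that a disappearance was an OOM kill, you can search the kernel log for the kill message. The sketch below is a hypothetical helper, not part of any tool named in this article. It assumes the message wording used by recent Linux kernels ("Killed process <pid> (<name>)"), which can differ between kernel versions.

```python
import re

# Assumed wording of the kernel's OOM message; it may vary by kernel version.
OOM_PATTERN = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(kernel_log: str) -> list[tuple[int, str]]:
    """Return (pid, process name) for every OOM kill line found in the log text."""
    return [(int(pid), name) for pid, name in OOM_PATTERN.findall(kernel_log)]
```

Feed it the text printed by dmesg or journalctl -k. If your bot's process name (node or python) shows up in the result, a memory leak is the likely culprit.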

Process Managers

PM2 (Node.js)

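# Install PM2 globally
npm install -g pm2
# Start the bot under PM2 with a readable name
pm2 start bot.js --name "my-bot"
# Save the process list so it can be restored after a reboot
pm2 save
# Print the command that registers PM2 as a boot service, then run that command
pm2 startup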

PM2 automatically:

  • Restarts the bot if it crashes
  • Starts the bot on system boot (once pm2 startup and pm2 save have been run)
  • Logs all output (including crash reasons)
  • Monitors memory usage

Set a memory limit to prevent OOM kills:

pm2 start bot.js --max-memory-restart 200M

When the bot exceeds 200MB RAM, PM2 restarts it cleanly instead of letting the OS kill it violently.
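
The same threshold idea can also be applied from inside a Python bot: check resident memory periodically and exit deliberately once it crosses a limit, so the supervisor restarts the process. This is a minimal sketch under stated assumptions. It is Linux-only because it reads /proc/self/status, the function names are hypothetical, and the 200 MB limit simply mirrors the example above.

```python
import sys
from typing import Optional

LIMIT_MB = 200  # mirrors the 200M threshold above; tune for your bot


def parse_rss_mb(status_text: str) -> Optional[float]:
    """Extract resident memory (VmRSS) in MB from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:     123456 kB".
            return int(line.split()[1]) / 1024
    return None


def current_rss_mb() -> Optional[float]:
    """Read this process's resident memory; returns None where /proc is unavailable."""
    try:
        with open("/proc/self/status") as f:
            return parse_rss_mb(f.read())
    except OSError:
        return None


def exit_if_over_limit(limit_mb: float = LIMIT_MB) -> None:
    """Exit with a non-zero code so the supervisor restarts the bot."""
    rss = current_rss_mb()
    if rss is not None and rss > limit_mb:
        print(f"Memory {rss:.1f}MB exceeds {limit_mb}MB, exiting for restart", file=sys.stderr)
        sys.exit(1)
```

Calling exit_if_over_limit() from a periodic task in the bot gives a controlled exit with a log line, which is easier to diagnose than an OOM kill.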

Systemd (Any Language)

[Unit]
Description=My Discord Bot
After=network.target

[Service]
Type=simple
User=botuser
WorkingDirectory=/home/botuser/bot
ExecStart=/usr/bin/node bot.js
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target

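Restart=always means the bot restarts after any exit, whether it was a clean exit, a crash, or an OOM kill, and RestartSec=10 makes systemd wait 10 seconds before each restart. The one exception is a deliberate stop through systemd itself: after systemctl stop, the service stays down until you start it again.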
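
Conceptually, a restart policy with a delay is a loop: run the process, wait for it to exit, pause, and start it again. The Python sketch below only illustrates that behavior. It is not how systemd or PM2 is implemented, and supervise and max_restarts are hypothetical names. For a real bot, use a proper process manager.

```python
import subprocess
import sys
import time


def supervise(command: list[str], restart_delay: float = 10.0, max_restarts: int = 3) -> list[int]:
    """Run `command`, restarting it after every exit, and return the exit codes seen.

    `max_restarts` bounds the loop for demonstration; a real supervisor runs indefinitely.
    """
    exit_codes = []
    for attempt in range(max_restarts + 1):
        result = subprocess.run(command)
        exit_codes.append(result.returncode)
        print(f"Run {attempt + 1} exited with code {result.returncode}", file=sys.stderr)
        if attempt < max_restarts:
            time.sleep(restart_delay)  # plays the same role as RestartSec
    return exit_codes
```

The delay matters: without it, a bot that crashes on startup would be relaunched in a tight loop.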

Health Checks

Internal Health Check

Your bot should report its own status:

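// Every 5 minutes, check that the bot is still connected to the gateway.
// Assumes discord.js v13 or later, where client.isReady() is available.
setInterval(() => {
    // A ping value alone does not prove the connection is alive, so check readiness instead
    if (!client.isReady()) {
        console.error('Gateway connection lost, exiting so the process manager restarts the bot');
        process.exit(1); // PM2 or systemd will restart
    }

    // rss is the memory the OS accounts to the whole process, not just the JavaScript heap
    const memUsage = process.memoryUsage().rss / 1024 / 1024;
    console.log(`Health: OK | Memory: ${memUsage.toFixed(1)}MB | Ping: ${client.ws.ping}ms`);
}, 300000);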

External Health Check

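Run a separate monitoring script alongside the bot. The check below confirms that the bot token is still valid and that the Discord API is reachable. A successful response does not prove that the bot process itself is running, so pair it with a process or heartbeat check:

import asyncio
import os

import aiohttp

# Assumption: the token is provided through the BOT_TOKEN environment variable
BOT_TOKEN = os.environ["BOT_TOKEN"]

async def check_bot() -> bool:
    """Return True if Discord accepts the bot token."""
    try:
        timeout = aiohttp.ClientTimeout(total=10)
        async with aiohttp.ClientSession(timeout=timeout) as session:
            headers = {'Authorization': f'Bot {BOT_TOKEN}'}
            # /users/@me returns the bot's own user object when the token is valid
            async with session.get('https://discord.com/api/v10/users/@me', headers=headers) as resp:
                return resp.status == 200
    except (aiohttp.ClientError, asyncio.TimeoutError):
        return False

if __name__ == '__main__':
    print('OK' if asyncio.run(check_bot()) else 'FAILED')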
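
A REST call made with the bot token can succeed even when the bot process is gone, so an external check benefits from a signal that only a running bot can produce. One option is a heartbeat file: the bot writes a timestamp on a timer, and the monitor treats a stale or missing file as "down". This is a minimal sketch; the file path, function names, and the 10-minute threshold are assumptions for illustration.

```python
import os
import tempfile
import time

# Hypothetical location; any path both the bot and the monitor can access works.
HEARTBEAT_PATH = os.path.join(tempfile.gettempdir(), "my-bot.heartbeat")


def write_heartbeat(path: str = HEARTBEAT_PATH) -> None:
    """Called by the bot on a timer, for example each time its own health check passes."""
    with open(path, "w") as f:
        f.write(str(time.time()))


def heartbeat_is_fresh(path: str = HEARTBEAT_PATH, max_age_seconds: float = 600.0) -> bool:
    """Called by the monitor: True if the bot wrote a heartbeat recently."""
    try:
        with open(path) as f:
            last_beat = float(f.read().strip())
    except (OSError, ValueError):
        return False  # a missing or unreadable file counts as down
    return time.time() - last_beat <= max_age_seconds
```

With a 5-minute heartbeat, a 10-minute threshold tolerates one missed beat before the monitor raises an alert. This approach also catches a bot that is still running but hung, which a process check alone would miss.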

Notification System

When the bot goes down, you need to know immediately:

Discord Webhook Alert

#!/bin/bash
# check_bot.sh - Run via cron every 2 minutes

BOT_PROCESS="bot.js"
WEBHOOK="https://discord.com/api/webhooks/YOUR_WEBHOOK"

if ! pgrep -f "$BOT_PROCESS" > /dev/null; then
    curl -H "Content-Type: application/json" \
        -d '{"content":"Bot is DOWN! Process not found. Auto-restart triggered."}' \
        "$WEBHOOK"
    
    cd /home/botuser/bot && pm2 restart my-bot
fi
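
A check that runs every two minutes will post a new alert on every run for as long as the bot stays down. A common refinement is to alert only when the state changes. The Python sketch below shows that logic together with the webhook request; the function names are hypothetical, and storing the previous state between runs (in a file, for instance) is left to the caller.

```python
import json
import urllib.request
from typing import Optional


def alert_message(was_up: bool, is_up: bool) -> Optional[str]:
    """Return a message only when the state changed since the previous check."""
    if was_up and not is_up:
        return "Bot is DOWN! Process not found."
    if not was_up and is_up:
        return "Bot is back UP."
    return None  # no change, stay quiet


def build_request(webhook_url: str, message: str) -> urllib.request.Request:
    """Build the JSON POST for a Discord webhook without sending it."""
    body = json.dumps({"content": message}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumption: an explicit User-Agent avoids rejections of urllib's default one
            "User-Agent": "my-bot-monitor/1.0",
        },
        method="POST",
    )


def send_alert(webhook_url: str, message: str) -> None:
    """Send the alert; network errors propagate to the caller."""
    with urllib.request.urlopen(build_request(webhook_url, message), timeout=10):
        pass
```

The monitor would call alert_message() with the previous and current state, and call send_alert() only when it returns a message.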

Uptime Monitoring

If your bot has a web dashboard or health endpoint:

  • UptimeRobot (free) pings your endpoint every 5 minutes
  • Sends notifications via email, SMS, or webhook
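
A bot without a web dashboard can still expose a tiny HTTP endpoint for an uptime monitor to poll. The sketch below uses only the Python standard library. The /health path, port 8080, and the is_healthy callback are assumptions for illustration; for a discord.py bot the callback might be something like lambda: bot.is_ready().

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable


def make_health_server(is_healthy: Callable[[], bool], port: int = 8080) -> HTTPServer:
    """Create a server answering GET /health with 200 when healthy and 503 otherwise."""

    class HealthHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/health":
                self.send_error(404)
                return
            healthy = is_healthy()
            body = b"ok" if healthy else b"unhealthy"
            self.send_response(200 if healthy else 503)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep monitor polls out of the bot's log

    return HTTPServer(("0.0.0.0", port), HealthHandler)


def start_in_background(server: HTTPServer) -> threading.Thread:
    """Serve from a daemon thread so the endpoint does not block the bot."""
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return thread
```

Returning 503 while the process is alive but disconnected lets the uptime monitor catch the "stays alive but stops responding" case as well as a full crash.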

Hosting Matters

Silent deaths are more common on:

  • Free hosting (Replit, Glitch) that kills idle processes
  • Shared hosting with aggressive resource limits
  • Home computers that sleep or restart

Space-Node's Discord bot hosting starts FREE for small bots. The always-on infrastructure means your bot's process manager keeps running even when you're asleep.

The difference between a bot that's "usually online" and one that has 99.9% uptime is entirely about monitoring and automatic recovery. Set up PM2, health checks, and alerts, and your bot will recover from crashes before anyone notices.


About the Author

Jochem is an infrastructure expert with 5-10 years of experience in game server hosting, VPS infrastructure, and 24/7 streaming solutions.

Since 2023
500+ servers hosted
4.8/5 avg rating

I specialize in Minecraft, FiveM, Rust, and 24/7 streaming infrastructure, operating enterprise-grade AMD Ryzen 9 hardware in Netherlands datacenters.


