Real-Time Invite Tracking and Fake Account Detection

Discord servers live and die by their communities. Server admins need to know where their members are coming from, which invite links are working, which promoters are bringing real users, and which are flooding the server with fake accounts. I built a system to answer those questions in real time.

The Invite Tracking Problem

Discord provides basic invite tracking through its API, but the data is limited. You can see invite codes and their use counts, but not which user used which invite: the API gives you a snapshot of counts, not per-join attribution.

The solution is differential tracking: capture the invite counts before and after each join event, then calculate which invite code's count increased.

// On member join:
// 1. Fetch current invite counts
// 2. Compare with cached counts from before the join
// 3. The invite whose count increased by 1 = the invite used
// 4. Update cache
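
The comparison in steps 2 and 3 can be sketched as a small pure function. This is a minimal illustration, not the bot's actual code: the name find_used_invite is hypothetical, invite counts are assumed to be plain dicts of code to use count, and an invite missing from the cached snapshot is assumed to have started at zero uses.

```python
from typing import Dict, Optional


def find_used_invite(before: Dict[str, int], after: Dict[str, int]) -> Optional[str]:
    """Return the invite code whose use count rose by exactly one.

    Returns None when no invite changed, or when the diff is ambiguous
    (more than one invite changed, or one invite jumped by more than one).
    """
    # Collect every invite whose count went up, with the size of the increase.
    increased = {
        code: uses - before.get(code, 0)
        for code, uses in after.items()
        if uses > before.get(code, 0)
    }
    if len(increased) == 1:
        code, delta = next(iter(increased.items()))
        if delta == 1:
            return code
    return None
```

Returning None for an ambiguous diff keeps a bad guess out of the attribution data; the caller can record the join as unattributed instead.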

This sounds simple, but at scale it gets tricky. If two members join within milliseconds of each other, both handlers can compare against the same cached counts, so the diff no longer points to a single invite. I solved this with a per-guild mutex that serialises join event processing.
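
One way to express a per-guild mutex is a lazily created asyncio.Lock per guild ID. The sketch below assumes an asyncio-based bot running in a single process; GuildLocks and handle_join are hypothetical names, and the process callback stands in for the fetch, compare, and update steps.

```python
import asyncio
from collections import defaultdict
from typing import Awaitable, Callable, DefaultDict, TypeVar

T = TypeVar("T")


class GuildLocks:
    """Hands out one asyncio.Lock per guild, created on first use."""

    def __init__(self) -> None:
        self._locks: DefaultDict[str, asyncio.Lock] = defaultdict(asyncio.Lock)

    def for_guild(self, guild_id: str) -> asyncio.Lock:
        return self._locks[guild_id]


async def handle_join(
    locks: GuildLocks, guild_id: str, process: Callable[[], Awaitable[T]]
) -> T:
    # Joins in the same guild run one at a time; other guilds are not blocked.
    async with locks.for_guild(guild_id):
        return await process()
```

A lock like this only serialises work inside one process. If several processes handle the same guild, it would need to be replaced with a shared lock.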

Building the Attribution System

Each join produces an attribution record:

{
  "guildId": "123456",
  "userId": "789012",
  "inviterId": "345678",
  "inviteCode": "abc123",
  "joinedAt": "2023-03-10T14:22:00Z",
  "accountAge": "2 days",
  "flags": ["new_account"]
}

This record links the new member to the person who invited them. Over time, you build a complete map of who invited whom, which is exactly what community managers need when running referral programmes or identifying the highest-value community ambassadors.
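
As an illustration of how such records turn into a who-invited-whom map, here is a minimal sketch. The Attribution class mirrors a subset of the fields in the record above using snake_case names, and build_invite_map is a hypothetical helper, not part of the original system.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import DefaultDict, Dict, Iterable, List


@dataclass(frozen=True)
class Attribution:
    guild_id: str
    user_id: str
    inviter_id: str
    invite_code: str


def build_invite_map(records: Iterable[Attribution]) -> Dict[str, List[str]]:
    """Group invited user IDs under the member who invited them."""
    invited_by: DefaultDict[str, List[str]] = defaultdict(list)
    for record in records:
        invited_by[record.inviter_id].append(record.user_id)
    return dict(invited_by)
```

Sorting the resulting map by list length gives a simple inviter ranking for a referral programme.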

Fake Account Detection

This is where it gets interesting. Fake accounts, created in bulk to inflate server numbers or spam, have telltale patterns:

Heuristic 1: Account Age

Accounts created within the last 7 days are flagged. Accounts created within the last 24 hours are escalated. Legitimate users occasionally have new accounts, but bulk-created fake accounts almost always have creation dates within hours of joining.
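
Account age can be derived from the user ID itself, because Discord IDs are snowflakes whose upper bits hold a millisecond timestamp relative to the Discord epoch (the first second of 2015). The sketch below applies the two thresholds above; the "new_account" flag comes from the record shown earlier, while "very_new_account" and the function names are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import List

DISCORD_EPOCH_MS = 1420070400000  # 2015-01-01T00:00:00Z in Unix milliseconds


def account_created_at(user_id: int) -> datetime:
    """Decode the creation time stored in the top bits of a snowflake ID."""
    unix_ms = (user_id >> 22) + DISCORD_EPOCH_MS
    return datetime.fromtimestamp(unix_ms / 1000, tz=timezone.utc)


def age_flags(user_id: int, joined_at: datetime) -> List[str]:
    """Flag accounts younger than 7 days; escalate those younger than 24 hours."""
    age = joined_at - account_created_at(user_id)
    if age < timedelta(hours=24):
        return ["new_account", "very_new_account"]  # second flag name is hypothetical
    if age < timedelta(days=7):
        return ["new_account"]
    return []
```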

Heuristic 2: Join Velocity

If the same invite link is used by 10 accounts within 5 minutes, that is suspicious. Legitimate invite sharing produces a steady trickle of joins. Fake account floods produce bursts.
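
A sliding window per invite code is one straightforward way to detect this kind of burst. The sketch below uses the thresholds from the text (10 joins within 5 minutes) as defaults; BurstDetector is a hypothetical name, and timestamps are assumed to be Unix seconds arriving in non-decreasing order.

```python
from collections import defaultdict, deque
from typing import DefaultDict, Deque


class BurstDetector:
    """Tracks recent join times per invite and reports bursts."""

    def __init__(self, max_joins: int = 10, window_seconds: float = 300.0) -> None:
        self.max_joins = max_joins
        self.window_seconds = window_seconds
        self._joins: DefaultDict[str, Deque[float]] = defaultdict(deque)

    def record_join(self, invite_code: str, timestamp: float) -> bool:
        """Record a join and return True if the invite is now in a burst."""
        window = self._joins[invite_code]
        window.append(timestamp)
        # Drop joins that have fallen out of the window.
        while window and timestamp - window[0] > self.window_seconds:
            window.popleft()
        return len(window) >= self.max_joins
```

Both thresholds are constructor arguments, so they can be tuned per server.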

Heuristic 3: Behavioural Signals

After joining, fake accounts exhibit predictable behaviour:

No activity. They join and go silent. Real users at least look around.
Immediate DM spam. Fake accounts often send direct messages to members within minutes of joining.
Default avatar. While legitimate users can have default avatars, the combination of a new account + default avatar + burst join pattern is a strong signal.

Scoring System

Each heuristic that fires contributes a weighted amount to a combined risk score between 0 and 1:

Account age < 24h: 0.4
Burst join pattern: 0.3
Default avatar: 0.1
No activity after 1 hour: 0.2

A combined score above 0.7 triggers automatic action (configurable per server: kick, ban, quarantine role, or just flag for review).
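
The weighted sum and threshold can be sketched as follows. The signal names are hypothetical labels for the four rows above, and the weights are held as integer hundredths so that the "above 0.7" comparison is not affected by floating-point rounding.

```python
from typing import Iterable

# Weights from the table above, in hundredths (40 means 0.4).
WEIGHTS = {
    "account_age_under_24h": 40,
    "burst_join": 30,
    "default_avatar": 10,
    "inactive_after_1h": 20,
}
ACTION_THRESHOLD = 70  # automatic action only when the score is above 0.7


def risk_points(signals: Iterable[str]) -> int:
    """Sum the weights of the signals that fired; unknown names are ignored."""
    return sum(WEIGHTS.get(signal, 0) for signal in set(signals))


def risk_score(signals: Iterable[str]) -> float:
    """Combined risk score between 0 and 1."""
    return risk_points(signals) / 100


def should_act(signals: Iterable[str]) -> bool:
    return risk_points(signals) > ACTION_THRESHOLD
```

With these weights, a brand-new account in a burst scores exactly 0.7 and is not actioned by itself; one more signal, such as a default avatar, pushes it over the threshold.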

Real-Time Dashboard

Admins see their invite data in real time:

Leaderboard: which members have invited the most people
Invite breakdown: which links are performing best
Risk alerts: flagged accounts with their risk scores
Retention data: what percentage of invited members are still active after 7 days

The retention metric was the most popular feature. Server admins could finally see not just who was joining, but who was staying. An inviter who brings 100 members who all leave within a day is less valuable than one who brings 10 members who become active community participants.
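
A per-inviter retention figure could be computed along these lines. The post does not define "still active after 7 days" precisely, so this sketch assumes a member counts as retained when their last recorded activity is at least 7 days after they joined, and that members who joined less than 7 days ago are left out of the calculation. InvitedMember and retention_by_inviter are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Iterable, Optional


@dataclass
class InvitedMember:
    inviter_id: str
    joined_at: datetime
    last_active_at: Optional[datetime]  # None if the member never did anything


def retention_by_inviter(
    members: Iterable[InvitedMember],
    now: datetime,
    window: timedelta = timedelta(days=7),
) -> Dict[str, float]:
    """Percentage of each inviter's members still active after the window."""
    eligible: Dict[str, int] = {}
    retained: Dict[str, int] = {}
    for member in members:
        if now - member.joined_at < window:
            continue  # too recent to judge
        eligible[member.inviter_id] = eligible.get(member.inviter_id, 0) + 1
        last = member.last_active_at
        if last is not None and last - member.joined_at >= window:
            retained[member.inviter_id] = retained.get(member.inviter_id, 0) + 1
    return {
        inviter: 100.0 * retained.get(inviter, 0) / count
        for inviter, count in eligible.items()
    }
```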

The Scale Challenge

Processing join events for thousands of servers means handling thousands of invite cache comparisons per minute. The invite cache alone, storing invite counts for every active invite across every server, consumed significant memory.

I moved from in-memory caching to Redis, keyed by guild ID. This allowed horizontal scaling (multiple bot shards sharing the same cache) and persistence across restarts. The trade-off was latency, since Redis adds a network hop compared to in-memory access, but the reliability was worth it.

Lessons Learned

Differential tracking is fragile. Race conditions are real. Protect shared state with locks.
Heuristics beat ML at this scale. I considered training a classifier, but hand-tuned heuristics were faster to implement, easier to explain to admins, and more predictable in production.
Admins want control. Every automated action should be configurable. What is spam in one server is normal in another.

As I described in Building Discord Bots at Scale, the event processing pipeline needs to be efficient, and invite tracking is one of the most processing-intensive features in the entire bot.

Related Posts

MongoDB Patterns I Learned from 7,500 Discord Servers

Building a Zero-Fee Payment API: Technical Decisions

Building Discord Bots at Scale: Chat Guard Architecture
