Pong AI — DQN Self-Play

How two Deep Q-Network agents learn Pong through self-play, each training against the other's checkpoint.

How the AI works

Pong uses DQN with self-play. Two agents compete; each one trains against a frozen checkpoint of its opponent, so the difficulty scales up automatically as both improve.
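
The checkpoint schedule described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the `Agent` class, its `train_episode` placeholder, the `refresh_every` interval, and the idea of refreshing both frozen copies at the same time are all assumptions made for the example. A real agent would hold a Q-network and run DQN updates inside `train_episode`.

```python
import copy
import random


class Agent:
    """Hypothetical stand-in for a DQN agent; `weights` mimics network parameters."""

    def __init__(self, seed):
        rng = random.Random(seed)
        self.weights = [rng.uniform(-1.0, 1.0) for _ in range(4)]
        self.updates = 0

    def train_episode(self, opponent):
        # Placeholder for one episode of DQN training against `opponent`.
        # A real agent would play Pong here and take gradient steps.
        self.weights = [w + 0.01 for w in self.weights]
        self.updates += 1

    def snapshot(self):
        # Frozen copy: further training of `self` does not change it.
        return copy.deepcopy(self)


def self_play(episodes, refresh_every):
    """Train two agents, each against a frozen checkpoint of the other."""
    left, right = Agent(seed=0), Agent(seed=1)
    frozen_left, frozen_right = left.snapshot(), right.snapshot()
    refreshes = 0
    for episode in range(1, episodes + 1):
        # Each live agent only ever faces the other's frozen checkpoint.
        left.train_episode(opponent=frozen_right)
        right.train_episode(opponent=frozen_left)
        # Assumed schedule: refresh both checkpoints every `refresh_every` episodes.
        if episode % refresh_every == 0:
            frozen_left, frozen_right = left.snapshot(), right.snapshot()
            refreshes += 1
    return left, right, frozen_left, frozen_right, refreshes
```

Between refreshes the opponent is stationary, which keeps the learning target stable; each refresh then raises the difficulty to match the opponent's latest skill.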

State, actions, reward

  • State: paddle and ball positions and the ball's velocity.
  • Actions: move the paddle up, down, or stay.
  • Reward: +1 for scoring, -1 for conceding.
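
As a rough sketch, the three pieces listed above could be encoded like this. The field names, the inclusion of both paddles in the state, the zero reward on non-scoring steps, the discount factor `gamma=0.99`, and treating a scored point as the end of an episode are assumptions for illustration; the source only specifies the quantities in the list. `td_target` is the standard one-step DQN target.

```python
from dataclasses import dataclass

# The three discrete actions available to each paddle.
ACTIONS = ("up", "down", "stay")


@dataclass
class PongState:
    # Hypothetical field layout: positions and ball velocity, as in the list above.
    own_paddle_y: float
    opponent_paddle_y: float
    ball_x: float
    ball_y: float
    ball_vx: float
    ball_vy: float

    def as_vector(self):
        # Flat feature vector that a Q-network would take as input.
        return [
            self.own_paddle_y,
            self.opponent_paddle_y,
            self.ball_x,
            self.ball_y,
            self.ball_vx,
            self.ball_vy,
        ]


def reward(scored, conceded):
    # +1 for scoring, -1 for conceding; 0 on every other step (assumed).
    if scored:
        return 1.0
    if conceded:
        return -1.0
    return 0.0


def td_target(r, next_q_values, done, gamma=0.99):
    # One-step DQN target: r + gamma * max_a' Q(s', a'), or just r at episode end.
    if done:
        return r
    return r + gamma * max(next_q_values)
```

For example, `td_target(reward(True, False), [0.0, 0.0, 0.0], done=True)` evaluates to `1.0`, since a terminal step contributes only its immediate reward.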

Why self-play matters

Against a fixed opponent, an agent can overfit to that one opponent's habits and weaknesses. Self-play creates an ever-improving curriculum: as one side gets better, the other must too, pushing both toward strong, general play.

What you see on screen

You watch two learned policies rally against each other. There is no hand-coded paddle AI, just two networks that taught themselves the game.
