
Breakout AI — DQN with ε-greedy

How a Deep Q-Network learns Breakout — reward shaping, experience replay, and the epsilon-decay curve.

How the AI works

The Breakout agent is trained with a Deep Q-Network (DQN). It moves the paddle to keep the ball alive and clear bricks, and learns a Q-value for each action from its own experience.
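As a rough illustration of "learning from experience", the sketch below shows an experience replay buffer and the one-step Q-learning target that a DQN regresses toward. The transition layout, the `ReplayBuffer` and `td_target` names, and the discount factor `gamma = 0.99` are assumptions for illustration, not details taken from this demo.

```python
import random
from collections import deque
from typing import Deque, List, Sequence, Tuple

# (state, action, reward, next_state, done); this layout is an assumption.
Transition = Tuple[Sequence[float], int, float, Sequence[float], bool]


class ReplayBuffer:
    """Fixed-size store of past transitions, sampled uniformly at random."""

    def __init__(self, capacity: int) -> None:
        # A deque with maxlen silently drops the oldest entry once full.
        self.buffer: Deque[Transition] = deque(maxlen=capacity)

    def push(self, transition: Transition) -> None:
        self.buffer.append(transition)

    def sample(self, batch_size: int) -> List[Transition]:
        # Uniform sampling breaks the correlation between consecutive frames.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self) -> int:
        return len(self.buffer)


def td_target(
    reward: float,
    next_q_values: Sequence[float],
    done: bool,
    gamma: float = 0.99,  # assumed discount factor
) -> float:
    """One-step target: r + gamma * max_a' Q(s', a'), or just r at episode end."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)
```

In a full training loop the network's prediction Q(s, a) would be pulled toward `td_target(...)` for each sampled transition; that step depends on the deep learning library and is left out here.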

State, actions, reward

  • State: paddle position, ball position and velocity.
  • Actions: move the paddle left, right, or stay.
  • Reward: positive for breaking bricks, negative for losing the ball.
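The state, action, and reward described above could be encoded along these lines. This is a minimal sketch: the field names, the five-value state vector, and the reward magnitudes (+1 per brick, -1 for a lost ball) are assumptions, since the page only gives the signs of the rewards.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Action(IntEnum):
    """The three paddle actions listed above."""
    LEFT = 0
    RIGHT = 1
    STAY = 2


@dataclass
class State:
    """Paddle position plus ball position and velocity (field names assumed)."""
    paddle_x: float
    ball_x: float
    ball_y: float
    ball_vx: float
    ball_vy: float

    def as_vector(self) -> List[float]:
        # Flat feature vector of the kind a Q-network would take as input.
        return [self.paddle_x, self.ball_x, self.ball_y, self.ball_vx, self.ball_vy]


BRICK_REWARD = 1.0        # assumed magnitude; the source only says "positive"
BALL_LOST_PENALTY = -1.0  # assumed magnitude; the source only says "negative"


def step_reward(bricks_broken: int, ball_lost: bool) -> float:
    """Reward for one time step: bonus per brick, penalty if the ball is lost."""
    reward = BRICK_REWARD * bricks_broken
    if ball_lost:
        reward += BALL_LOST_PENALTY
    return reward
```

The relative size of the two constants is the reward-shaping knob: a larger penalty would push the agent toward safe play, a larger brick bonus toward aggressive play.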

Exploration vs exploitation

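An ε-greedy policy picks a random action with probability ε and the action with the highest Q-value otherwise. Training starts with a high ε, so play is almost random, and ε decays over time so the agent increasingly exploits what it has learned. This balances trying new moves against using what works.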
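A small sketch of ε-greedy selection with a decaying ε follows. The exponential schedule and the values `eps_start = 1.0`, `eps_end = 0.05`, and `decay_steps = 10_000` are assumptions chosen for illustration; the demo may use a different schedule.

```python
import math
import random
from typing import Optional, Sequence


def epsilon_at(
    step: int,
    eps_start: float = 1.0,        # assumed starting exploration rate
    eps_end: float = 0.05,         # assumed floor
    decay_steps: float = 10_000.0, # assumed decay time constant
) -> float:
    """Exponentially decay epsilon from eps_start toward eps_end."""
    return eps_end + (eps_start - eps_end) * math.exp(-step / decay_steps)


def select_action(
    q_values: Sequence[float],
    epsilon: float,
    rng: Optional[random.Random] = None,
) -> int:
    """Return a random action with probability epsilon, else the greedy one."""
    rng = rng or random
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore
    # Exploit: index of the largest Q-value (first one wins on ties).
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

For example, `select_action(q, epsilon_at(step))` would be called once per frame, with `step` counting frames since training began.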

What you see on screen

The epsilon-decay curve and the live Q-value bars show the agent shifting from exploring to exploiting as it gets better at clearing the wall.
