Snake AI — Deep Q-Network (DQN)

How a Deep Q-Network learns to play Snake — states, rewards, replay buffer, and the live Q-value chart.

How the AI works

Snake is trained with a Deep Q-Network (DQN), a reinforcement learning method. The agent learns a function Q(state, action) that estimates the expected cumulative future reward of taking each move in the current state, then picks the action with the highest value.
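
As a rough illustration of "pick the action with the highest value", the sketch below selects the greedy action from a vector of Q-values. The numbers are made up for the example, and the function name greedy_action is hypothetical; in the real agent the Q-values would come from the network.

```python
import numpy as np

# The three moves the agent can choose from, as listed on this page.
ACTIONS = ["turn left", "go straight", "turn right"]

def greedy_action(q_values):
    """Return the index of the action with the highest estimated Q-value."""
    return int(np.argmax(q_values))

# Hypothetical Q-values for a single state (not taken from the trained agent).
q = np.array([0.12, 0.87, -0.40])
best = greedy_action(q)
print(ACTIONS[best])  # prints "go straight"
```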

State, actions, reward

  • State: danger in each direction, current heading, and the relative direction to the food.
  • Actions: turn left, go straight, turn right.
  • Reward: positive for eating, negative for dying, and a small shaping reward for moving toward the food.
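
One plausible way to encode the state and reward described above is sketched here. The exact layout is an assumption: 3 danger flags relative to the heading, 4 one-hot heading flags, and 4 food-direction flags, on a square grid with y growing downward. The reward magnitudes (10, -10, 0.1) are also hypothetical, since the page gives only their signs.

```python
import numpy as np

# Headings in clockwise order so a turn is an index shift: up, right, down, left.
# Coordinates are (x, y) with y growing downward (an assumption).
DIRS = [(0, -1), (1, 0), (0, 1), (-1, 0)]

def is_danger(cell, body, grid):
    """A cell is dangerous if it lies outside the grid or on the snake's body."""
    x, y = cell
    return not (0 <= x < grid and 0 <= y < grid) or cell in body

def encode_state(head, heading, food, body, grid):
    """Build an 11-element vector: 3 danger, 4 heading, 4 food-direction flags."""
    hx, hy = head
    danger = []
    for turn in (-1, 0, 1):  # left, straight, right relative to the heading
        dx, dy = DIRS[(heading + turn) % 4]
        danger.append(is_danger((hx + dx, hy + dy), body, grid))
    heading_flags = [heading == i for i in range(4)]
    # Food is above, to the right of, below, to the left of the head.
    food_flags = [food[1] < hy, food[0] > hx, food[1] > hy, food[0] < hx]
    return np.array(danger + heading_flags + food_flags, dtype=np.float32)

def reward(ate_food, died, old_dist, new_dist):
    """Hypothetical magnitudes: large terminal rewards, small shaping bonus."""
    if died:
        return -10.0
    if ate_food:
        return 10.0
    return 0.1 if new_dist < old_dist else 0.0

# Snake head in the top-left corner, heading right, food three cells to the right.
state = encode_state(head=(0, 0), heading=1, food=(3, 0), body={(0, 0)}, grid=5)
print(state)
```

In this example only the "left" danger flag is set, because turning left from a rightward heading would leave the top edge of the grid.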

How it learns

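Experiences are stored in a replay buffer and sampled in mini-batches, which breaks the correlation between consecutive frames. An ε-greedy policy takes a random action with probability ε; as ε shrinks over training, the agent explores early and exploits later. A separate target network supplies the values used in the update targets, which keeps those targets from shifting with every gradient step and stabilizes the updates.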
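
The three ingredients above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code behind this page: the class and function names, the buffer capacity, the 11-element state size, and the discount factor gamma = 0.9 are all hypothetical. The target-network output is passed in as a plain array so the example runs without a deep learning library.

```python
import random
from collections import deque

import numpy as np

class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # oldest experiences drop off first

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Draw a uniform random mini-batch and stack it into arrays."""
        batch = random.sample(list(self.buffer), batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return (np.stack(states), np.array(actions),
                np.array(rewards, dtype=np.float32),
                np.stack(next_states), np.array(dones, dtype=np.float32))

    def __len__(self):
        return len(self.buffer)

def epsilon_greedy(q_values, epsilon):
    """Random action with probability epsilon, otherwise the greedy one."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return int(np.argmax(q_values))

def td_targets(rewards, next_q_target, dones, gamma=0.9):
    """r + gamma * max_a' Q_target(s', a'); no bootstrap term on terminal steps."""
    return rewards + gamma * next_q_target.max(axis=1) * (1.0 - dones)

# Fill the buffer with dummy transitions and draw one mini-batch.
buf = ReplayBuffer(capacity=100)
for i in range(5):
    s = np.zeros(11, dtype=np.float32)
    buf.push(s, i % 3, 0.1, s, False)
states, actions, rewards, next_states, dones = buf.sample(3)

# Targets for two transitions; the second one ended the episode.
targets = td_targets(np.array([1.0, -10.0]),
                     np.array([[0.5, 2.0, 1.0], [3.0, 1.0, 0.0]]),
                     np.array([0.0, 1.0]))
print(targets)
```

In a full training loop, next_q_target would be the target network's output for the sampled next states, and the target network's weights would be copied from the main network at intervals rather than trained directly.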

What you see on screen

The Q-value chart updates every frame, so you can watch the agent's value estimate for each action shift as it learns to chase food and avoid walls.
