NLP, Generative AI

Poetry Generation — BERT / GPT-2 / T5 Fine-tuned

Fine-tuned BERT, GPT-2, and T5 on the Poetry Foundation corpus for creative poem generation. 10 saved checkpoints. Vocabulary diversity analysis per poet. Beam search + temperature sampling. Model dashboard comparing all 3 architectures.

Models fine-tuned: 3 (BERT/GPT-2/T5)
Saved checkpoints: 10
Generation approach: Beam + temperature
T5 conditioning: Prompt-based

Dataset

Poetry Foundation: 10K+ poems, diverse eras and styles

Approach

Fine-tune BERT (masked LM) + GPT-2 (causal) + T5 (seq2seq) on Poetry Foundation corpus

Tech Stack

Python, PyTorch, HuggingFace Transformers, GPT-2, BERT, T5, Tokenizers

Keywords

GPT-2, BERT, T5, Fine-tuning, Poetry, HuggingFace, Beam Search, Language Model

Visualizations: 5 charts

Deep Dive

A multi-model poetry generation pipeline that fine-tunes three transformer architectures on the Poetry Foundation corpus.

Dataset — Poetry Foundation

  • Hundreds of poets across multiple eras and styles
  • Variable-length poems (50–500 tokens typical)
  • Preprocessing: special tokens [POEM_START] / [POEM_END], tokenizer per model
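
The preprocessing step above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper name `wrap_poem`, the whitespace handling, and the sample text are assumptions; only the two marker strings come from the description above.

```python
# Boundary markers named in the preprocessing description.
POEM_START = "[POEM_START]"
POEM_END = "[POEM_END]"


def wrap_poem(text: str) -> str:
    """Collapse whitespace inside each line and add the boundary markers.

    Blank lines are kept so stanza breaks survive (an assumed design choice).
    """
    lines = [" ".join(line.split()) for line in text.strip().splitlines()]
    body = "\n".join(lines)
    return f"{POEM_START}\n{body}\n{POEM_END}"


# Hypothetical sample text, invented for illustration.
sample = "  The river   bends\n\n toward evening. "
wrapped = wrap_poem(sample)
print(wrapped)
```

With HuggingFace tokenizers, markers like these are usually registered through `tokenizer.add_special_tokens({"additional_special_tokens": [...]})`, followed by `model.resize_token_embeddings(len(tokenizer))` so the model has embeddings for the new ids. Whether the project did exactly this for each of the three tokenizers is not stated.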

Three Transformer Architectures Fine-tuned

| Model | Type | Checkpoints | Approach |
|---|---|---|---|
| BERT | Encoder (masked LM) | 3 (ep 2103, 4206, 6309) | Masked token prediction → fill-in-the-blank generation |
| GPT-2 | Decoder (causal LM) | 3 (step 500, 1000, 1206) | Left-to-right auto-regressive generation |
| T5 | Encoder-Decoder (seq2seq) | 4 (step 188–752) | Prompt-conditioned generation ("Write a poem about: ...") |

All 10 checkpoints saved with full weights (model.safetensors), tokenizer, and training state.
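
For the T5 row, prompt-conditioned training needs (input, target) text pairs. The sketch below shows one way such pairs might be built. The prompt prefix is taken from the table; the function name, the dictionary keys, and the idea of using a topic string as the condition are assumptions, since the page does not say how topics were derived.

```python
# Prompt prefix quoted in the model table; everything else is illustrative.
PROMPT_PREFIX = "Write a poem about: "


def make_t5_pair(topic: str, poem: str) -> dict:
    """Build one seq2seq training example: prompt as input, poem as target."""
    return {
        "input_text": PROMPT_PREFIX + topic.strip(),
        "target_text": poem.strip(),
    }


# Hypothetical topic and poem text, invented for illustration.
pair = make_t5_pair(" autumn rain ", "Grey water on the sill.\n")
print(pair["input_text"])
print(pair["target_text"])
```

At generation time the same prefix would be applied to a new topic, and the decoder output is read as the poem.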

Generation Strategies (GPT-2 / T5)

| Strategy | Output Style |
|---|---|
| Greedy decoding | Deterministic, often repetitive |
| Beam search (k=4) | More coherent, structured |
| Temperature sampling (T=0.7) | Creative but controlled |
| Top-k sampling (k=50) | Best balance of quality + diversity |
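
To make the difference between these strategies concrete, here is a small self-contained sketch over a toy bigram "language model". The probability table is invented for illustration and has nothing to do with the fine-tuned checkpoints; the defaults (k=4 beams, T=0.7, top-k 50) mirror the settings listed above.

```python
import math
import random

# Toy next-token probabilities keyed by the previous token (hypothetical numbers).
TOY_LM = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"moon": 0.4, "sea": 0.3, "sky": 0.3},
    "a": {"storm": 0.9, "bird": 0.1},
    "moon": {"</s>": 1.0}, "sea": {"</s>": 1.0}, "sky": {"</s>": 1.0},
    "storm": {"</s>": 1.0}, "bird": {"</s>": 1.0},
}


def next_logprobs(seq):
    """Log-probabilities of the next token given the last token of seq."""
    return {tok: math.log(p) for tok, p in TOY_LM[seq[-1]].items()}


def greedy_decode(max_len=5):
    """Always take the single most likely next token."""
    seq = ["<s>"]
    while seq[-1] != "</s>" and len(seq) < max_len:
        options = next_logprobs(seq)
        seq.append(max(options, key=options.get))
    return seq


def beam_search(k=4, max_len=5):
    """Keep the k highest-scoring partial sequences at every step."""
    beams = [(0.0, ["<s>"])]
    for _ in range(max_len - 1):
        candidates = []
        for score, seq in beams:
            if seq[-1] == "</s>":  # finished beams carry over unchanged
                candidates.append((score, seq))
                continue
            for tok, lp in next_logprobs(seq).items():
                candidates.append((score + lp, seq + [tok]))
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:k]
        if all(seq[-1] == "</s>" for _, seq in beams):
            break
    return beams[0][1]


def top_k_sample(logprobs, k=50, temperature=0.7, rng=random):
    """Sample from the k most likely tokens after temperature scaling."""
    top = sorted(logprobs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    weights = [math.exp(lp / temperature) for _, lp in top]
    return rng.choices([tok for tok, _ in top], weights=weights, k=1)[0]


print("greedy:", greedy_decode())
print("beam  :", beam_search(k=4))
print("top-k :", top_k_sample(next_logprobs(["<s>", "the"]), k=2))
```

In this toy table greedy decoding commits to "the" because it is the best first step, while beam search keeps "a" alive long enough to find the higher-probability full sequence. With the HuggingFace `generate` method, the corresponding switches are `num_beams`, `do_sample`, `temperature`, and `top_k`.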

Vocabulary Analysis

Word frequency distribution and per-poet vocabulary coverage: experimental/modernist poets show the highest unique-token diversity, while Shakespeare-style formal diction concentrates into fewer high-frequency tokens.
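
A per-poet diversity measure of this kind could look like the sketch below. The exact metric used in the project is not stated, so the type-token ratio here is an assumption, as are the function name, the regex word splitter, and the two invented "poets".

```python
import re
from collections import Counter


def vocab_stats(poems):
    """Word counts and type-token ratio for one poet's poems."""
    tokens = [t for poem in poems for t in re.findall(r"[a-z']+", poem.lower())]
    counts = Counter(tokens)
    total = len(tokens)
    return {
        "tokens": total,
        "unique": len(counts),
        # Share of distinct words among all words; 0.0 for an empty corpus.
        "type_token_ratio": len(counts) / total if total else 0.0,
        "top": counts.most_common(3),
    }


# Hypothetical corpora, invented for illustration.
corpus = {
    "poet_a": ["The rose, the rose, the red red rose"],
    "poet_b": ["Glass moths unspool a violet static"],
}
stats = {poet: vocab_stats(poems) for poet, poems in corpus.items()}
for poet, s in stats.items():
    print(poet, s["tokens"], s["unique"], round(s["type_token_ratio"], 3))
```

Type-token ratio falls as a corpus grows, so comparing poets fairly usually means sampling the same number of tokens from each before computing it.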

Generated Output

  • Fine-tuned GPT-2 produces the most fluid free verse
  • T5 conditional generation handles style/topic prompts best
  • BERT masked generation is useful for poetic constraint satisfaction (filling specific positions)
  • Lower temperature → classical-sounding meter; higher → surreal imagery
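
The temperature effect in the last bullet follows from how temperature reshapes the next-token distribution: dividing logits by a small T sharpens it toward the most likely tokens, and a large T flattens it. The sketch below uses made-up logits to show this; the helper names are illustrative.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Softmax over logits / temperature, computed stably."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; higher means a flatter distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical logits for four candidate next tokens.
logits = [3.0, 1.5, 0.5, 0.0]
for t in (0.5, 0.7, 1.0, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(f"T={t}: top prob={probs[0]:.2f}, entropy={entropy(probs):.2f}")
```

As T rises, the top token's probability drops and entropy grows, which is consistent with the more surprising word choices seen at higher temperatures.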