NLP · Generative AI

Poetry Generation — BERT / GPT-2 / T5 Fine-tuned

Fine-tuned BERT, GPT-2, and T5 on the Poetry Foundation corpus for creative poem generation, with 10 saved checkpoints across the three models. Includes a per-poet vocabulary diversity analysis, beam search and temperature sampling for decoding, and a dashboard comparing all 3 architectures.

Models fine-tuned: 3 (BERT/GPT-2/T5)

Saved checkpoints: 10

Generation approach: Beam + temperature

T5 conditioning: Prompt-based

Dataset

Poetry Foundation: 10K+ poems, diverse eras and styles

Approach

Fine-tune BERT (masked LM) + GPT-2 (causal) + T5 (seq2seq) on Poetry Foundation corpus

Tech Stack

Python, PyTorch, HuggingFace Transformers, GPT-2, BERT, T5, Tokenizers

Keywords

GPT-2, BERT, T5, Fine-tuning, Poetry, HuggingFace, Beam Search, Language Model

Visualizations: 5 Charts

Deep Dive

A multi-model poetry generation pipeline that fine-tunes three transformer architectures on the Poetry Foundation corpus.

Dataset — Poetry Foundation

▸Hundreds of poets across multiple eras and styles
▸Variable-length poems (50–500 tokens typical)
▸Preprocessing: poems are delimited with the special tokens [POEM_START] / [POEM_END] and tokenized with each model's own tokenizer
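
The preprocessing step above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper names `wrap_poem` and `t5_pair` are hypothetical, and whether the T5 targets also carried the boundary markers is an assumption. Only the marker strings and the "Write a poem about: ..." prompt format come from this page.

```python
from __future__ import annotations

# Boundary markers named in the preprocessing notes.
POEM_START = "[POEM_START]"
POEM_END = "[POEM_END]"


def wrap_poem(text: str) -> str:
    """Trim whitespace around each line and add the boundary markers.

    Blank lines inside the poem are kept so stanza breaks survive.
    """
    lines = [line.strip() for line in text.strip().splitlines()]
    body = "\n".join(lines)
    return f"{POEM_START}\n{body}\n{POEM_END}"


def t5_pair(topic: str, text: str) -> tuple[str, str]:
    """Build a (prompt, target) pair for prompt-conditioned seq2seq training.

    Assumption: the target reuses the same wrapped form as the causal LM input.
    """
    return f"Write a poem about: {topic}", wrap_poem(text)


if __name__ == "__main__":
    print(wrap_poem("  Roses are red \n Violets are blue  "))
    print(t5_pair("rain", "Rain falls"))
```

With Hugging Face tokenizers, new markers like these are typically registered through `tokenizer.add_special_tokens({"additional_special_tokens": [...]})`, followed by `model.resize_token_embeddings(len(tokenizer))` so the embedding matrix matches the enlarged vocabulary.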

Three Transformer Architectures Fine-tuned

Model	Type	Checkpoints	Approach
BERT	Encoder (masked LM)	3 (ep 2103, 4206, 6309)	Masked token prediction → fill-in-the-blank generation
GPT-2	Decoder (causal LM)	3 (step 500, 1000, 1206)	Left-to-right auto-regressive generation
T5	Encoder-Decoder (seq2seq)	4 (step 188–752)	Prompt-conditioned generation ("Write a poem about: ...")

All 10 checkpoints saved with full weights (model.safetensors), tokenizer, and training state.

Generation Strategies (GPT-2 / T5)

Strategy	Output Style
Greedy decoding	Deterministic, often repetitive
Beam search (k=4)	More coherent, structured
Temperature sampling (T=0.7)	Creative but controlled
Top-k sampling (k=50)	Best balance of quality + diversity
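
To make the differences between these strategies concrete, here is a small sketch over a hypothetical toy bigram table (`TOY_MODEL` and its probabilities are invented for illustration and stand in for a fine-tuned model's next-token distribution). It shows greedy decoding, beam search, and top-k sampling with temperature. The table is chosen so that greedy picks a locally best first word while beam search finds a higher-probability full sequence.

```python
import math
import random

# Hypothetical next-token probabilities, keyed by the previous token.
TOY_MODEL = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"moon": 0.5, "sea": 0.3, "the": 0.2},
    "a": {"moon": 0.9, "sea": 0.1},
    "moon": {"</s>": 1.0},
    "sea": {"</s>": 1.0},
}


def greedy(max_len=5):
    """Always take the single most probable next token."""
    seq = ["<s>"]
    while seq[-1] != "</s>" and len(seq) < max_len:
        probs = TOY_MODEL[seq[-1]]
        seq.append(max(probs, key=probs.get))
    return seq


def beam_search(k=4, max_len=5):
    """Keep the k best partial sequences by cumulative log-probability."""
    beams = [(0.0, ["<s>"])]
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            if seq[-1] == "</s>":  # finished beams carry over unchanged
                candidates.append((score, seq))
                continue
            for tok, p in TOY_MODEL[seq[-1]].items():
                candidates.append((score + math.log(p), seq + [tok]))
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:k]
        if all(seq[-1] == "</s>" for _, seq in beams):
            break
    return beams[0][1]


def top_k_sample(k=50, temperature=1.0, rng=None, max_len=5):
    """Sample from the k most probable tokens after temperature scaling."""
    rng = rng or random.Random()
    seq = ["<s>"]
    while seq[-1] != "</s>" and len(seq) < max_len:
        probs = TOY_MODEL[seq[-1]]
        top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
        # p ** (1/T) equals softmax(logits / T) up to normalisation.
        weights = [p ** (1.0 / temperature) for _, p in top]
        seq.append(rng.choices([t for t, _ in top], weights=weights)[0])
    return seq


if __name__ == "__main__":
    print("greedy:", greedy())
    print("beam  :", beam_search(k=4))
    print("top-k :", top_k_sample(k=50, temperature=0.7, rng=random.Random(0)))
```

In Hugging Face `generate`, the corresponding controls are `num_beams`, `do_sample`, `temperature`, and `top_k`; the exact settings used for this project beyond k=4, T=0.7, and k=50 are not stated on this page.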

Vocabulary Analysis

Word frequency distributions and per-poet vocabulary coverage show that experimental and modernist poets have the highest unique-token diversity, while Shakespeare-style formal diction concentrates into fewer high-frequency tokens.
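
One simple way to compute per-poet diversity figures like these is a type-token ratio (unique tokens divided by total tokens). The sketch below is an assumption about how such an analysis could look, not the project's actual metric: the regex tokenizer, the `vocab_stats` helper, and the sample poets are all hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokenizer; a stand-in for a real model tokenizer."""
    return re.findall(r"[a-z']+", text.lower())


def vocab_stats(poems_by_poet):
    """Return token count, unique count, type-token ratio and top words per poet."""
    stats = {}
    for poet, poems in poems_by_poet.items():
        counts = Counter(tok for poem in poems for tok in tokenize(poem))
        total = sum(counts.values())
        stats[poet] = {
            "tokens": total,
            "unique": len(counts),
            "ttr": len(counts) / total if total else 0.0,
            "top": counts.most_common(3),
        }
    return stats


if __name__ == "__main__":
    # Hypothetical miniature corpus, for illustration only.
    sample = {
        "Poet A": ["the sea the sea"],
        "Poet B": ["glass orchard humming static"],
    }
    for poet, s in vocab_stats(sample).items():
        print(poet, s)
```

Type-token ratio falls as a corpus grows, so comparing poets with very different amounts of text would need equal-sized samples or a length-normalized measure.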

Generated Output

▸GPT-2 fine-tuning produces most fluid free verse
▸T5 conditional generation handles style/topic prompts best
▸BERT masked generation useful for poetic constraint satisfaction (fill specific positions)
▸Lower temperature → classical-sounding meter; higher → surreal imagery
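
The temperature effect in the last point follows from how temperature rescales logits before the softmax: dividing by T < 1 sharpens the distribution toward the most likely tokens, while T > 1 flattens it and gives rare tokens more mass. A minimal sketch, with made-up logits used only for illustration:

```python
import math


def softmax_with_temperature(logits, temperature):
    """Softmax over logits / temperature, computed in a numerically stable way."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    z = sum(exps)
    return [e / z for e in exps]


def entropy(probs):
    """Shannon entropy in nats; higher means a flatter, more diverse distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.1]  # hypothetical next-token scores
    for t in (0.7, 1.0, 1.5):
        probs = softmax_with_temperature(logits, t)
        print(f"T={t}: probs={[round(p, 3) for p in probs]} entropy={entropy(probs):.3f}")
```

Entropy rises with temperature here, which matches the observation that lower settings stay close to the model's most expected phrasing and higher settings wander into less likely word choices.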
