Book Recommender Systems — Full Taxonomy

A complete recommender-system taxonomy implemented on the Book-Crossing dataset (1.1M ratings): User-CF, Item-CF, SVD/NMF/ALS, Content-Based, Hybrid, NCF, AutoRec, and GRU4Rec. User-CF reaches RMSE 1.6645, Precision@10 0.6629, and Recall@10 0.6910.

User-CF RMSE: 1.6645
Precision@10: 0.6629
Recall@10: 0.6910
Architectures: 8 (CF → DL)

Dataset

BookCrossing: 1,149,780 ratings, 271,360 books, 278,858 users

Approach

Full taxonomy: User-CF → Item-CF → MF → Content-Based → Hybrid → NCF/AutoRec/GRU4Rec

Tech Stack
Python, PyTorch, Surprise (SVD/NMF), SciPy (ALS), scikit-learn
Keywords
Collaborative Filtering, SVD, NCF, GRU4Rec, Matrix Factorization, Recommender Systems
Deep Dive

Comprehensive implementation of all major recommender paradigms on the Book-Crossing dataset.

Dataset

  • Raw: 271,360 books, 278,858 users, 1,149,780 ratings
  • Filtered (explicit ratings only, rating ≥ 1): 118,699 ratings, 7,027 users, 9,438 books
  • Rating scale: 1–10 (explicit) + implicit feedback (stored as rating 0)
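
The explicit-only filter can be sketched as follows. This is a minimal illustration, assuming ratings are held as `(user, book, rating)` tuples and that implicit interactions carry rating 0, as in the public Book-Crossing dump. The helper names `keep_explicit` and `summarize` and the toy rows are hypothetical; any further filtering that produced the counts above is not stated in the source and is not reproduced here.

```python
def keep_explicit(ratings):
    """Keep only explicit ratings (1-10); rating 0 is assumed to mark implicit feedback."""
    return [(user, book, rating) for user, book, rating in ratings if rating >= 1]


def summarize(ratings):
    """Count ratings, distinct users, and distinct books."""
    users = {user for user, _, _ in ratings}
    books = {book for _, book, _ in ratings}
    return {"ratings": len(ratings), "users": len(users), "books": len(books)}


# Toy rows (hypothetical), two of them implicit.
sample = [("u1", "b1", 0), ("u1", "b2", 8), ("u2", "b2", 10), ("u3", "b3", 0)]
explicit = keep_explicit(sample)
print(summarize(explicit))
```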

Full Taxonomy

Recommender Systems
├── 1. Collaborative Filtering
│   ├── User-based CF (cosine similarity, K=20)
│   ├── Item-based CF
│   └── Matrix Factorization: SVD / NMF / ALS
├── 2. Content-Based Filtering
│   └── TF-IDF on book metadata (title, author, genre)
├── 3. Hybrid
│   ├── Weighted combination (CF + CB)
│   └── Switching (CB for cold-start users)
└── 4. Deep Learning
    ├── NCF (Neural Collaborative Filtering)
    ├── AutoRec (Autoencoder CF)
    └── GRU4Rec (Session-based sequential)
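
As an illustration of the first branch of the taxonomy, user-based CF with cosine similarity and K=20 neighbours can be sketched in plain Python. This is a simplified sketch, not the project's implementation: it assumes ratings are stored as nested dicts `{user: {book: rating}}`, uses a similarity-weighted average without mean-centering, and the names `cosine`, `predict`, and the toy data are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse rating dicts {book: rating}."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def predict(ratings, user, book, k=20):
    """Similarity-weighted average over the k most similar users who rated `book`."""
    neighbours = [
        (cosine(ratings[user], other_ratings), other_ratings[book])
        for other, other_ratings in ratings.items()
        if other != user and book in other_ratings
    ]
    # Keep positively similar neighbours, most similar first, capped at k.
    neighbours = sorted(
        (n for n in neighbours if n[0] > 0), key=lambda n: n[0], reverse=True
    )[:k]
    if not neighbours:
        return None  # no usable neighbour: the caller needs a fallback
    total = sum(sim for sim, _ in neighbours)
    return sum(sim * rating for sim, rating in neighbours) / total


# Toy data (hypothetical): bob's tastes resemble alice's, carol's do not.
ratings = {
    "alice": {"dune": 9, "emma": 3},
    "bob": {"dune": 8, "emma": 2, "ulysses": 7},
    "carol": {"dune": 2, "emma": 9, "ulysses": 3},
}
print(predict(ratings, "alice", "ulysses"))
```

The prediction for alice lands closer to bob's rating than to carol's, because bob's rating vector is more similar to hers. Returning `None` when no neighbour has rated the book makes the cold-start case explicit instead of hiding it behind a default score.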

Key Results (User-CF)

Metric          Value
RMSE            1.6645
Precision@10    0.6629
Recall@10       0.6910
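
For reference, the three metrics can be computed as in the sketch below. The source does not state how relevance was defined for Precision@10 and Recall@10, so this assumes the common formulation: a ranked list of recommended books per user and a set of books considered relevant for that user. The function names are hypothetical.

```python
import math


def rmse(pairs):
    """Root mean squared error over (predicted, actual) rating pairs."""
    pairs = list(pairs)
    return math.sqrt(sum((pred - actual) ** 2 for pred, actual in pairs) / len(pairs))


def precision_recall_at_k(ranked, relevant, k=10):
    """Precision@k and Recall@k for one user's ranked recommendation list."""
    top_k = ranked[:k]
    hits = sum(1 for book in top_k if book in relevant)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


print(rmse([(3, 5), (7, 5)]))
print(precision_recall_at_k(["a", "b", "c", "d"], {"a", "c", "z"}, k=2))
```

Per-user precision and recall are typically averaged over all test users to obtain a single figure.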

Cold-Start Handling

New users (fewer than 5 ratings) fall back to content-based or popularity-based recommendations. The hybrid switching strategy avoids the failure of collaborative filtering when a user has too little rating history.
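
The switching logic can be sketched as below. The threshold of 5 ratings comes from the text; everything else is an assumption for illustration: the choice to use content-based recommendations when a user has some history and popularity when there is none, the `{user: {book: rating}}` layout, and the names `recommend`, `most_popular`, `cf_recommend`, and `cb_recommend`.

```python
from collections import Counter

MIN_RATINGS = 5  # cold-start threshold stated in the text


def most_popular(ratings, n=10):
    """Books ranked by how many users rated them."""
    counts = Counter(book for books in ratings.values() for book in books)
    return [book for book, _ in counts.most_common(n)]


def recommend(user, ratings, cf_recommend, cb_recommend, n=10):
    """Switching hybrid: CF for established users, CB or popularity otherwise."""
    history = ratings.get(user, {})
    if len(history) >= MIN_RATINGS:
        return cf_recommend(user, n)
    if history:
        # Some ratings exist: match on book metadata (assumed split, see lead-in).
        return cb_recommend(history, n)
    # No ratings at all: fall back to globally popular books.
    return most_popular(ratings, n)


# Stub recommenders (hypothetical) so the routing can be observed.
def cf_stub(user, n):
    return ["cf-pick"][:n]


def cb_stub(history, n):
    return ["cb-pick"][:n]


ratings = {
    "veteran": {"b1": 8, "b2": 7, "b3": 9, "b4": 6, "b5": 10},
    "newcomer": {"b1": 9},
}
for name in ("veteran", "newcomer", "anonymous"):
    print(name, recommend(name, ratings, cf_stub, cb_stub, n=1))
```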

Key Insight

GRU4Rec (session-based) is the most practically valuable architecture here: it handles anonymous users and captures short-term intent without needing a stored user history, which is the default situation for most real-world e-commerce sessions.