BERT4Rec
Sequential recommendation using BERT-style bidirectional self-attention
BERT4Rec adapts BERT's masked language modeling (Cloze) objective to sequential recommendation: during training, randomly selected items in a user's interaction sequence are masked, and the model learns to predict them from both the preceding and following items. Bidirectional self-attention lets it capture long-range dependencies in user interaction sequences, making it well suited to session-based and sequential recommendation.
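As a rough illustration of that training objective, the sketch below masks random positions in a toy sequence; the MASK_ID constant, the 0.2 masking probability, and the item IDs are illustrative assumptions, not values fixed by this page.

```python
import random

MASK_ID = 0      # assumed reserved ID for the mask token
MASK_PROB = 0.2  # illustrative masking probability

def mask_sequence(items, mask_prob=MASK_PROB):
    """Cloze-style masking: hide random items so the model must recover
    them from both left and right context."""
    inputs, labels = [], []
    for item in items:
        if random.random() < mask_prob:
            inputs.append(MASK_ID)  # replace the item with the mask token
            labels.append(item)     # the model is trained to predict it
        else:
            inputs.append(item)
            labels.append(-100)     # conventional "ignore" label: no loss here
    return inputs, labels

# One user's time-ordered history of item IDs
inputs, labels = mask_sequence([12, 7, 93, 41, 55])
```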
When to use:
- Sequential recommendation where the order of interactions matters
- Session-based recommendation (e.g., streaming, e-commerce browsing sequences)
- When attention-based sequence modeling outperforms simpler collaborative filtering
Input: Ordered user interaction sequences (user history as a time-ordered item list)
Output: Ranked list of next-item recommendations
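For concreteness, here is a minimal sketch of assembling that input from raw logs, assuming interaction records of the form (user_id, item_id, timestamp); the field layout and values are illustrative:

```python
from collections import defaultdict

# Illustrative raw interactions: (user_id, item_id, timestamp)
events = [
    ("u1", 42, 1000), ("u1", 7, 1005), ("u1", 93, 1010),
    ("u2", 7, 1002), ("u2", 55, 1007),
]

# Group per user, then sort by timestamp to get time-ordered item lists
histories = defaultdict(list)
for user, item, ts in events:
    histories[user].append((ts, item))

sequences = {u: [item for _, item in sorted(h)] for u, h in histories.items()}
# sequences == {"u1": [42, 7, 93], "u2": [7, 55]}
```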
Model Settings (set during training, used at inference)
Max Sequence Length (set during training) Maximum history length per user. Interactions beyond this length are truncated from the oldest end (see the sketch after the settings list).
Hidden Size (default: 64) Dimensionality of the transformer hidden states.
N Layers (default: 2) Number of transformer encoder layers.
N Attention Heads (default: 2) Number of self-attention heads.
Dropout Rate (default: 0.2) Dropout for regularization.
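To make these settings concrete, here is a minimal sketch of how they could map onto a standard transformer encoder in PyTorch, together with the oldest-first truncation described under Max Sequence Length. This is an illustrative reconstruction under assumed values (MAX_SEQ_LEN = 50, NUM_ITEMS = 10_000), not the actual implementation behind this page.

```python
import torch
import torch.nn as nn

MAX_SEQ_LEN = 50    # Max Sequence Length (assumed; set during training)
HIDDEN_SIZE = 64    # Hidden Size default
N_LAYERS = 2        # N Layers default
N_HEADS = 2         # N Attention Heads default
DROPOUT = 0.2       # Dropout Rate default
NUM_ITEMS = 10_000  # assumed catalog size; ID 0 is reserved for the mask token

def truncate(seq, max_len=MAX_SEQ_LEN):
    # Keep the most recent max_len interactions; drop from the oldest end
    return seq[-max_len:]

class BERT4RecEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.item_emb = nn.Embedding(NUM_ITEMS + 1, HIDDEN_SIZE)
        self.pos_emb = nn.Embedding(MAX_SEQ_LEN, HIDDEN_SIZE)
        layer = nn.TransformerEncoderLayer(
            d_model=HIDDEN_SIZE, nhead=N_HEADS,
            dropout=DROPOUT, batch_first=True,
        )
        # No causal mask: attention is bidirectional, as in BERT
        self.encoder = nn.TransformerEncoder(layer, num_layers=N_LAYERS)
        self.out = nn.Linear(HIDDEN_SIZE, NUM_ITEMS + 1)  # per-position item logits

    def forward(self, item_ids):  # item_ids: (batch, seq_len)
        positions = torch.arange(item_ids.size(1), device=item_ids.device)
        h = self.item_emb(item_ids) + self.pos_emb(positions)
        return self.out(self.encoder(h))  # (batch, seq_len, NUM_ITEMS + 1)
```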
Inference Settings
No dedicated inference-time settings. The model predicts the most likely next items given the user's interaction sequence.
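In the original BERT4Rec formulation, next-item prediction appends the mask token to the end of the (truncated) sequence and ranks items by the model's output at that position. A minimal sketch, reusing the illustrative BERT4RecEncoder, MASK_ID, and truncate defined above:

```python
def recommend(model, history, k=10):
    """Return the top-k item IDs ranked as most likely next items."""
    seq = truncate(history, MAX_SEQ_LEN - 1) + [MASK_ID]  # reserve one slot for [mask]
    item_ids = torch.tensor([seq])       # batch of one: shape (1, seq_len)
    with torch.no_grad():
        logits = model(item_ids)[0, -1]  # logits at the masked final position
    logits[MASK_ID] = float("-inf")      # never recommend the mask token itself
    return torch.topk(logits, k).indices.tolist()

# Usage (untrained model, so the ranking here is meaningless; for shape only)
model = BERT4RecEncoder().eval()
print(recommend(model, [12, 7, 93, 41, 55]))
```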