In PyTorch, model.eval() and with torch.no_grad(): are not equivalent, but they are often used together for model evaluation or inference. Here’s how they differ and how they interact:
1. model.eval()
- Purpose: Sets the model to evaluation mode.
- What it does:
- Switches layers like dropout and batch normalization to their inference behavior: dropout becomes a no-op, and batch norm uses its stored running statistics instead of per-batch statistics.
- Sets each module's training flag to False, which also stops batch norm from updating its running statistics.
- When to use: Before running inference or evaluation, to ensure the model behaves correctly.
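As a small sketch of the eval-mode effect (the model and shapes here are made up for illustration): in eval mode, dropout is a no-op, so repeated forward passes on the same input are deterministic, yet gradients are still being tracked.

```python
import torch
import torch.nn as nn

# Toy model with a dropout layer (illustrative only).
model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))
x = torch.ones(1, 4)

model.eval()          # dropout becomes a no-op; batch norm (if any) would use running stats
y1 = model(x)
y2 = model(x)

# Deterministic outputs in eval mode, but autograd is still active.
assert torch.equal(y1, y2)
assert y1.requires_grad
```

Note that eval() alone does not save you the memory cost of the autograd graph, which is why it is usually paired with torch.no_grad().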
2. with torch.no_grad():
- Purpose: Temporarily disables gradient computation.
- What it does:
- Stops PyTorch from tracking gradients, which reduces memory usage and speeds up computation.
- Does not change the model’s evaluation/training mode.
- When to use: During inference or evaluation to avoid unnecessary gradient calculations.
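Conversely, a minimal sketch of no_grad in isolation (toy layer for illustration): the output carries no autograd graph, but the model remains in training mode, so dropout and batch norm would still behave as in training.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)   # modules default to training mode
x = torch.ones(1, 4)

with torch.no_grad():     # suspend gradient tracking inside this block
    y = model(x)

assert y.requires_grad is False   # no graph was recorded
assert model.training is True     # mode is unchanged by no_grad()
```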
Key Differences

| Aspect | model.eval() | with torch.no_grad(): |
|---|---|---|
| Disables gradients | ❌ No | ✅ Yes |
| Disables dropout | ✅ Yes | ❌ No |
| Batch norm uses running stats | ✅ Yes | ❌ No |
| Changes model state | ✅ Yes | ❌ No |
How They Work Together
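A typical inference snippet combines both (the model, input sizes, and batch here are made-up placeholders):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 2), nn.Dropout(p=0.5))

model.eval()                          # 1) dropout/batch norm switch to inference behavior
with torch.no_grad():                 # 2) skip gradient tracking: less memory, faster
    logits = model(torch.randn(8, 4))
    preds = logits.argmax(dim=1)

assert logits.requires_grad is False  # no autograd graph was built
assert not model.training             # model stays in eval mode after the block
```

Using only one of the two gives you half the benefit: eval() alone still builds the autograd graph, and no_grad() alone still applies training-time dropout and batch statistics.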
Summary
- model.eval() changes the model's behavior (e.g., dropout, batch norm).
- with torch.no_grad(): disables gradient computation.
- They are complementary, not equivalent.