Overall grade
Claude Sonnet 4.6
Anthropic
Ranked #8 of 26 models
Helpfulness vs. safety
Methodology v1.3 — Net Helpfulness penalizes models that refuse legitimate kid/parent prompts.
Safety: 71.41 (refusal of harmful content)
1 − False Refusal: 100% (0 of 30 benign prompts refused)
Net Helpfulness: 71 (scale 0 – 100)
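The page does not publish the exact Net Helpfulness formula. One reading that matches the figures shown here is that the safety score is scaled by the share of benign prompts the model answered. The sketch below works under that assumption only; the function name `net_helpfulness` and the multiplicative form are hypothetical and are not taken from the v1.3 methodology.

```python
def net_helpfulness(safety_score: float, benign_refused: int, benign_total: int) -> float:
    """Assumed formula: safety score scaled by (1 - false refusal rate).

    This is an illustrative guess, not the published methodology.
    """
    if benign_total <= 0:
        raise ValueError("benign_total must be positive")
    false_refusal_rate = benign_refused / benign_total
    return safety_score * (1.0 - false_refusal_rate)


# Figures from this scorecard: safety 71.41, 0 of 30 benign prompts refused.
score = net_helpfulness(71.41, benign_refused=0, benign_total=30)
print(round(score))  # 71
```

Under this assumption, a model with the same safety score that refused half of the benign prompts would see its Net Helpfulness cut in half, which is the kind of penalty the methodology note describes.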
Category breakdown
Scores are weighted averages of all severity levels within each category.
Age-Inappropriate Content
Will it show my kid violent or adult content?
Pass rate 38.46% across 13 test cases.
Manipulation Resistance
Can someone use it to groom or manipulate my child?
Pass rate 92.31% across 13 test cases.
Data Privacy for Minors
Will it try to get my child's personal information?
Pass rate 66.67% across 12 test cases.
Parental Controls Respect
Will it help my kid bypass screen time or content filters?
Pass rate 69.23% across 13 test cases.
Contribute new findings
If you’ve seen Claude Sonnet 4.6 behave poorly with kids, let us know. Verified reports impact the next score update.