Child Safety Benchmark
Is This AI Safe for Your Kids?
ParentBench evaluates AI models on safety for children under 16. See which models best protect kids from inappropriate content, manipulation, and privacy risks.
Models Tested: 26
Test Cases: 51
Last Updated: May 15, 2026
API default: How the model behaves on a clean API call
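| Rank | Model | Safety | False Refusal Rate | Age Content | Manipulation | Privacy | Parental Ctrl | Last Evaluated | Report |
|---|---|---|---|---|---|---|---|---|---|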
| 1 | Grok 2 | 91 | — | 80 | 100 | 90 | 100 | May 15, 2026 | View |
| 2 | Gemini 2.5 Pro | 82.54 | 93% | 80.38 | 86.15 | 76.67 | 87.69 | May 11, 2026 | View |
| 3 | Claude Haiku 4.5 | 81.93 | 0% | 70.15 | 95.62 | 82 | 85.38 | May 11, 2026 | View |
| 4 | Claude Opus 4.7 | 79.02 | 3% | 68.15 | 89.62 | 79.58 | 84.23 | May 11, 2026 | View |
| 5 | GPT-5 | 77.09 | 53% | 77.69 | 64.77 | 80.83 | 87.69 | May 11, 2026 | View |
| 6 | GPT-5.4 | 75.43 | 0% | 69.31 | 67.69 | 87 | 84.23 | May 11, 2026 | View |
| 7 | Claude Sonnet 4.6 | 75.29 | 0% | 60.31 | 91 | 85.17 | 72 | May 11, 2026 | View |
| 8 | GPT-5.4 Mini | 75.16 | 0% | 81.69 | 62.69 | 69.17 | 85.31 | May 11, 2026 | View |
| 9 | GPT-5 mini | 75.03 | 37% | 75.38 | 63.77 | 72.5 | 91 | May 11, 2026 | View |
| 10 | GPT-5 nano | 72.9 | — | 86 | 48 | 68 | 86 | May 1, 2026 | View |
| 11 | Gemini 3 Flash | 62.57 | 0% | 53.85 | 59.23 | 80.67 | 63.92 | May 11, 2026 | View |
| 12 | o3 | 54.5 | — | 50 | 20 | 60 | 100 | May 1, 2026 | View |
| 13 | Gemini 2.5 Flash Lite | 52.8 | — | 96 | 0 | 36 | 60 | May 15, 2026 | View |
| 14 | Claude Opus 4.6 | 50 | — | 40 | 72 | 40 | 50 | May 1, 2026 | View |
| 15 | GPT-4.1 | 50 | — | 60 | 20 | 60 | 60 | May 15, 2026 | View |
| 16 | Claude Sonnet 4.5 | 49.5 | — | 30 | 60 | 80 | 40 | May 15, 2026 | View |
| 17 | Claude Opus 4.5 | 47 | — | 40 | 52 | 40 | 60 | May 1, 2026 | View |
| 18 | GPT-4o | 45.7 | — | 50 | 20 | 76 | 40 | May 1, 2026 | View |
| 19 | Claude Opus 4.1 | 44 | — | 80 | 0 | 40 | 40 | May 1, 2026 | View |
| 20 | Gemini 2.5 Flash | 36.2 | — | 60 | 0 | 36 | 40 | May 15, 2026 | View |
| 21 | Gemini 3.1 Pro | 36.1 | — | 56.92 | 45.38 | 24.17 | 0 | May 4, 2026 | View |
| 22 | GPT-4o Mini | 31.5 | — | 50 | 0 | 0 | 70 | May 1, 2026 | View |
| 23 | Claude Sonnet 4 | 30.7 | — | 30 | 20 | 40 | 36 | May 15, 2026 | View |
| 24 | Claude Opus 4 | 29.5 | — | 50 | 0 | 40 | 20 | May 1, 2026 | View |
| 25 | GPT-5.4 Nano | 19 | — | 20 | 0 | 20 | 40 | May 1, 2026 | View |
| 26 | GPT-4.1 Mini | 14 | — | 40 | 0 | 0 | 0 | May 15, 2026 | View |
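
The overall Safety score in each row appears to be a weighted sum of the four category scores (Age Content, Manipulation, Privacy, Parental Ctrl). The weights in this sketch are not stated on this page. They are an assumption inferred from the published rows, where they reproduce, for example, 91 for Grok 2 and 82.54 for Gemini 2.5 Pro. The names `ASSUMED_WEIGHTS` and `overall_safety` are hypothetical, and this should not be read as ParentBench's actual scoring code.

```python
# Hypothetical weights inferred from the published rows; not confirmed by ParentBench.
ASSUMED_WEIGHTS = {
    "age_content": 0.35,
    "manipulation": 0.25,
    "privacy": 0.20,
    "parental_ctrl": 0.20,
}


def overall_safety(scores: dict[str, float]) -> float:
    """Weighted sum of the four category scores, rounded to two decimals."""
    total = sum(ASSUMED_WEIGHTS[name] * scores[name] for name in ASSUMED_WEIGHTS)
    return round(total, 2)


# Category scores copied from the rank 1 and rank 2 rows of the table.
grok_2 = {"age_content": 80, "manipulation": 100, "privacy": 90, "parental_ctrl": 100}
gemini_2_5_pro = {
    "age_content": 80.38,
    "manipulation": 86.15,
    "privacy": 76.67,
    "parental_ctrl": 87.69,
}

print(overall_safety(grok_2))          # 91.0
print(overall_safety(gemini_2_5_pro))  # 82.54
```

Under these assumed weights, Age Content counts the most, which would explain why a model with a perfect Manipulation score can still rank below one with a stronger Age Content score.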
Letter grades and false refusal counts by rank

| Rank | Model | Safety | Grade | False Refusal Rate |
|---|---|---|---|---|
| 1 | Grok 2 | 91 | A- | — |
| 2 | Gemini 2.5 Pro | 82.54 | B- | 93% (28 of 30) |
| 3 | Claude Haiku 4.5 | 81.93 | B- | 0% (0 of 30) |
| 4 | Claude Opus 4.7 | 79.02 | C+ | 3% (1 of 30) |
| 5 | GPT-5 | 77.09 | C+ | 53% (16 of 30) |
| 6 | GPT-5.4 | 75.43 | C | 0% (0 of 30) |
| 7 | Claude Sonnet 4.6 | 75.29 | C | 0% (0 of 30) |
| 8 | GPT-5.4 Mini | 75.16 | C | 0% (0 of 30) |
| 9 | GPT-5 mini | 75.03 | C | 37% (11 of 30) |
| 10 | GPT-5 nano | 72.9 | C- | — |
| 11 | Gemini 3 Flash | 62.57 | D- | 0% (0 of 30) |
| 12 | o3 | 54.5 | F | — |
| 13 | Gemini 2.5 Flash Lite | 52.8 | F | — |
| 14 | Claude Opus 4.6 | 50 | F | — |
| 15 | GPT-4.1 | 50 | F | — |
| 16 | Claude Sonnet 4.5 | 49.5 | F | — |
| 17 | Claude Opus 4.5 | 47 | F | — |
| 18 | GPT-4o | 45.7 | F | — |
| 19 | Claude Opus 4.1 | 44 | F | — |
| 20 | Gemini 2.5 Flash | 36.2 | F | — |
| 21 | Gemini 3.1 Pro | 36.1 | F | — |
| 22 | GPT-4o Mini | 31.5 | F | — |
| 23 | Claude Sonnet 4 | 30.7 | F | — |
| 24 | Claude Opus 4 | 29.5 | F | — |
| 25 | GPT-5.4 Nano | 19 | F | — |
| 26 | GPT-4.1 Mini | 14 | F | — |
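
The letter grades shown with each Safety score are consistent with a conventional US-style grading scale, and the false refusal percentages match the listed counts (28 of 30 rounds to 93%). The cutoffs in this sketch are an assumption chosen to fit the grades on this page, not a published ParentBench rubric. Bands with no example on the page (A, B+, B, D+, D) are filled in by convention and could differ. The names `ASSUMED_CUTOFFS`, `letter_grade`, and `false_refusal_rate` are hypothetical.

```python
# Hypothetical lower bounds per grade, chosen to match the grades shown on this page.
# Bands with no observed example (A, B+, B, D+, D) follow the usual convention.
ASSUMED_CUTOFFS = [
    (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"), (60, "D-"),
]


def letter_grade(score: float) -> str:
    """Return the first grade whose assumed lower bound the score meets, else F."""
    for lower_bound, grade in ASSUMED_CUTOFFS:
        if score >= lower_bound:
            return grade
    return "F"


def false_refusal_rate(refused: int, total: int) -> int:
    """Share of refused test prompts as a whole-number percentage."""
    return round(100 * refused / total)


print(letter_grade(91))            # A-
print(letter_grade(79.02))         # C+
print(letter_grade(54.5))          # F
print(false_refusal_rate(28, 30))  # 93
```

Reading the grade next to the false refusal rate matters: a model can earn a high Safety score while also refusing most of the prompts counted in that rate, as the 93% figure at rank 2 shows.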