Child Safety Benchmark

Is This AI Safe for Your Kids?

ParentBench evaluates AI models on safety for children under 16. See which models best protect kids from inappropriate content, manipulation, and privacy risks.

Last Updated: May 15, 2026

API default: How the model behaves on a clean API call

26 models

Rank	Model	Safety	False Refusal Rate	Age Content	Manipulation	Privacy	Parental Ctrl	Last Evaluated	Report
1	Grok 2	91 A-	—	80	100	90	100	May 15, 2026	View
2	Gemini 2.5 Pro	82.54 B-	93%	80.38	86.15	76.67	87.69	May 11, 2026	View
3	Claude Haiku 4.5	81.93 B-	0%	70.15	95.62	82	85.38	May 11, 2026	View
4	Claude Opus 4.7	79.02 C+	3%	68.15	89.62	79.58	84.23	May 11, 2026	View
5	GPT-5	77.09 C+	53%	77.69	64.77	80.83	87.69	May 11, 2026	View
6	GPT-5.4	75.43 C	0%	69.31	67.69	87	84.23	May 11, 2026	View
7	Claude Sonnet 4.6	75.29 C	0%	60.31	91	85.17	72	May 11, 2026	View
8	GPT-5.4 Mini	75.16 C	0%	81.69	62.69	69.17	85.31	May 11, 2026	View
9	GPT-5 mini	75.03 C	37%	75.38	63.77	72.5	91	May 11, 2026	View
10	GPT-5 nano	72.9 C-	—	86	48	68	86	May 1, 2026	View
11	Gemini 3 Flash	62.57 D-	0%	53.85	59.23	80.67	63.92	May 11, 2026	View
12	o3	54.5 F	—	50	20	60	100	May 1, 2026	View
13	Gemini 2.5 Flash Lite	52.8 F	—	96	0	36	60	May 15, 2026	View
14	Claude Opus 4.6	50 F	—	40	72	40	50	May 1, 2026	View
15	GPT-4.1	50 F	—	60	20	60	60	May 15, 2026	View
16	Claude Sonnet 4.5	49.5 F	—	30	60	80	40	May 15, 2026	View
17	Claude Opus 4.5	47 F	—	40	52	40	60	May 1, 2026	View
18	GPT-4o	45.7 F	—	50	20	76	40	May 1, 2026	View
19	Claude Opus 4.1	44 F	—	80	0	40	40	May 1, 2026	View
20	Gemini 2.5 Flash	36.2 F	—	60	0	36	40	May 15, 2026	View
21	Gemini 3.1 Pro	36.1 F	—	56.92	45.38	24.17	0	May 4, 2026	View
22	GPT-4o Mini	31.5 F	—	50	0	0	70	May 1, 2026	View
23	Claude Sonnet 4	30.7 F	—	30	20	40	36	May 15, 2026	View
24	Claude Opus 4	29.5 F	—	50	0	40	20	May 1, 2026	View
25	GPT-5.4 Nano	19 F	—	20	0	20	40	May 1, 2026	View
26	GPT-4.1 Mini	14 F	—	40	0	0	0	May 15, 2026	View
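
The page does not state how the four category scores combine into the overall Safety score. The rows above are, however, consistent with a fixed weighted average, so the sketch below shows that arithmetic. The weights (0.35 Age Content, 0.25 Manipulation, 0.20 Privacy, 0.20 Parental Ctrl) are an assumption inferred from the published numbers, not taken from the methodology, and the names `WEIGHTS` and `overall_safety` are hypothetical.

```python
# Assumed weights, inferred by fitting the published rows; the page itself
# does not state them, so treat these as a hypothesis, not the methodology.
WEIGHTS = {
    "age_content": 0.35,
    "manipulation": 0.25,
    "privacy": 0.20,
    "parental_ctrl": 0.20,
}


def overall_safety(scores: dict[str, float]) -> float:
    """Weighted average of the four category scores, rounded to 2 decimals."""
    total = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    return round(total, 2)


# Category scores copied from the table rows for Grok 2 and Gemini 2.5 Pro.
grok_2 = {"age_content": 80, "manipulation": 100, "privacy": 90, "parental_ctrl": 100}
gemini_25_pro = {
    "age_content": 80.38,
    "manipulation": 86.15,
    "privacy": 76.67,
    "parental_ctrl": 87.69,
}

print(overall_safety(grok_2))         # published Safety score: 91
print(overall_safety(gemini_25_pro))  # published Safety score: 82.54
```

Under these assumed weights, Age Content counts most heavily, which would explain why Grok 2 ranks first with 91 even though the plain average of its four scores is 92.5.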

False Refusal Rate, with counts, for the models where it is reported:

Gemini 2.5 Pro	93% (28 of 30)
Claude Haiku 4.5	0% (0 of 30)
Claude Opus 4.7	3% (1 of 30)
GPT-5	53% (16 of 30)
GPT-5.4	0% (0 of 30)
Claude Sonnet 4.6	0% (0 of 30)
GPT-5.4 Mini	0% (0 of 30)
GPT-5 mini	37% (11 of 30)
Gemini 3 Flash	0% (0 of 30)
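
The False Refusal Rate percentages appear to be the refusal count divided by 30 and rounded to a whole percent (for example, 28 of 30 is shown as 93%). A minimal sketch of that arithmetic follows; the function name `false_refusal_pct` is hypothetical, and the rounding rule is an assumption that happens to match the published figures.

```python
def false_refusal_pct(refused: int, total: int = 30) -> int:
    """Share of refused cases as a whole-number percentage (assumed rounding)."""
    return round(100 * refused / total)


# Counts copied from the page: Gemini 2.5 Pro, GPT-5, GPT-5 mini, Claude Opus 4.7.
for refused in (28, 16, 11, 1):
    print(f"{refused} of 30 -> {false_refusal_pct(refused)}%")
```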

Read about the methodology v1.3.0