🟥 Red Report: #2
This report analyzes 10 top AI models through SentraCoreAI™’s autonomous trust layer. Audits evaluated hallucination rate, bias drift, injection vulnerability, compliance risk, and framing consistency.
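To illustrate how a composite score of this kind could be assembled, here is a minimal sketch of a weighted aggregation over the five audit dimensions named above. The metric names come from the report; the weights, the 0–100 normalization, and the example inputs are illustrative assumptions, not SentraCoreAI™'s actual methodology.

```python
# Hypothetical weights per audit dimension (assumed, must sum to 1.0).
WEIGHTS = {
    "hallucination": 0.30,       # factuality: 100 minus hallucination %
    "bias_drift": 0.20,
    "injection_defense": 0.25,   # e.g. % of injection attempts blocked
    "compliance": 0.15,
    "framing_consistency": 0.10,
}

def composite_score(metrics: dict[str, float]) -> int:
    """Weighted average of per-metric scores (each 0-100, higher is better)."""
    return round(sum(WEIGHTS[name] * score for name, score in metrics.items()))

# Example inputs loosely based on the Claude row (other values assumed).
claude = {
    "hallucination": 94,       # <6% hallucination rate
    "bias_drift": 85,
    "injection_defense": 93,   # 93% injection block
    "compliance": 70,
    "framing_consistency": 60,
}
print(composite_score(claude))  # → 85
```

With different (undisclosed) weights and normalizations, the same inputs could produce the published score of 82; the point is only the shape of the calculation, not the exact figures.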
📊 SentraScore™ & Certification Badges
| Model | Company | Score | Badge | Summary |
|---|---|---|---|---|
| Claude | Anthropic | 82 | ![]() | Low hallucination, high neutrality. Hallucination <6%, 93% injection block, strong on legal framing. |
| ChatGPT (GPT-4o) | OpenAI | 76 | ![]() | Strong factual grounding, political drift noted. Hallucination: 11%. Occasional prompt avoidance on legal questions. |
| Mistral 7B | Mistral AI | 73 | ![]() | Good factuality, limited injection defense. Low hallucination under controlled prompts. Manual tuning advised. |
| Gemini | Google DeepMind | 68 | ![]() | Factual drift, framing inconsistency. Bias triggered under ambiguous prompts. Lowered stability in compliance outputs. |
| Command R+ | Cohere | 65 | ![]() | Strong RAG use, injection defense needs tightening. Great retrieval, but hallucination occurs under long prompts. |
| Perplexity | Perplexity AI | 64 | ![]() | Live queries strong, consistency weak. Jailbreak resistance average. Depends heavily on query phrasing. |
| Grok | xAI | 61 | ![]() | Entertaining but unpredictable. Hallucination: 22%. Inconsistent under political phrasing. |
| Inflection Pi | Inflection AI | 59 | ![]() | Emotive bias and framing vulnerabilities. Fails structured compliance prompts. Persuasive tone undermines neutrality. |
| Bard / Duet AI | Google | 58 | ![]() | Stable under facts, but evades traps. System prompt exposure remains unresolved. |
| Meta AI (LLaMA) | Meta | 57 | ![]() | High hallucination rate, poor framing resilience. Citations often fabricated. Failed legal prompt series. |