Analysis of LLM Bias in FedGPT (Chinese Propaganda & Anti-US Sentiment)
As large language models become central to how people consume news and form political opinions, understanding their ideological behavior is no longer optional—it’s essential. Recent research shows that LLMs can subtly reproduce political leanings, cultural biases, and national narratives, even when no sensitive keywords appear in the prompt.
Building on Taiwan AI Labs’ earlier comparison of PRC-aligned and non–PRC-aligned systems, our new study adds a third model: FedGPT, Taiwan AI Labs’ domestically trained enterprise LLM. This expanded evaluation provides the first three-way analysis of geopolitical bias across DeepSeek-R1 (PRC-aligned), o3-mini-high (non-PRC), and FedGPT (Taiwan system), tested in Traditional Chinese, Simplified Chinese, and English.
Why FedGPT?
FedGPT is built on over 100 billion tokens of Taiwanese text—including large medical, financial, and regulatory corpora—and includes a safety layer called Guardian. Guardian checks for bias before generating a response, filters hallucinations using retrieval grounding, and blocks harmful or discriminatory content. This architecture makes FedGPT an ideal test case for whether strong Traditional-Chinese alignment and explicit safety routines can reduce the kinds of geopolitical skew observed in other LLMs.
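Guardian's internals are not public, but the description above suggests a three-stage pre-output gate. The sketch below is a minimal, hypothetical illustration of that pattern; every function name, lexicon, and threshold is a placeholder, not FedGPT's actual API.

```python
# Hypothetical sketch of a Guardian-style pre-output gate.
# FedGPT's real implementation is not public; every name, lexicon,
# and threshold below is a placeholder for the three checks described
# above: bias screening, retrieval grounding, and harm blocking.
from dataclasses import dataclass

BIAS_CUES = ("great rejuvenation", "hostile foreign forces")  # toy lexicon
HARM_CUES = ("example-slur",)                                 # toy lexicon

@dataclass
class Verdict:
    allowed: bool
    reason: str = "ok"

def looks_biased(text: str) -> bool:
    # Stand-in for a trained bias classifier.
    return any(cue in text.lower() for cue in BIAS_CUES)

def is_grounded(text: str, sources: list[str]) -> bool:
    # Stand-in for retrieval grounding: require modest lexical overlap
    # between the draft and the retrieved source passages.
    if not sources:
        return True
    draft = set(text.lower().split())
    evidence = {w for s in sources for w in s.lower().split()}
    return len(draft & evidence) / max(len(draft), 1) > 0.2

def is_harmful(text: str) -> bool:
    # Stand-in for a harmful/discriminatory-content filter.
    return any(cue in text.lower() for cue in HARM_CUES)

def guardian_check(draft: str, sources: list[str]) -> Verdict:
    """Gate a drafted response before it is returned to the user."""
    if looks_biased(draft):
        return Verdict(False, "bias cue detected")
    if not is_grounded(draft, sources):
        return Verdict(False, "claim not grounded in retrieved sources")
    if is_harmful(draft):
        return Verdict(False, "harmful content")
    return Verdict(True)
```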
How We Evaluated the Models
Using Taiwan AI Labs’ established methodology, we generated open-ended, reasoning-heavy questions based on Chinese-language news stories and translated them into all three languages. Each model answered the same 1,200 prompts, and the outputs were scored—both automatically and through human adjudication—for two types of ideological content:
- Chinese-state propaganda cues
- Anti-U.S. sentiment
This setup allowed us to examine how both alignment (PRC vs. non-PRC) and language shape a model’s ideological behavior.
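To make the setup concrete, here is a minimal sketch of the evaluation loop, assuming only what the report states: the same 1,200 prompts in three languages, each response given two binary labels. `ask_model` and `label_response` are hypothetical stand-ins for the model API and the automatic/human adjudication step, which the report does not detail.

```python
# Minimal sketch of the evaluation loop. `ask_model` and
# `label_response` are hypothetical stand-ins for the model API and
# the automatic/human adjudication step described above.
from collections import defaultdict

LANGS = ("zh-TW", "zh-CN", "en")
LABELS = ("propaganda", "anti_us")

def evaluate(model, prompts, ask_model, label_response):
    """Score one model on the same prompt set in all three languages.

    prompts: dict mapping language code -> list of 1,200 translated prompts.
    Returns per-(language, label) positive rates as percentages.
    """
    counts = defaultdict(int)                 # (lang, label) -> positive count
    for lang in LANGS:
        for prompt in prompts[lang]:
            answer = ask_model(model, prompt)
            verdict = label_response(answer)  # {"propaganda": bool, "anti_us": bool}
            for label in LABELS:
                counts[(lang, label)] += verdict[label]
    return {
        (lang, label): 100 * counts[(lang, label)] / len(prompts[lang])
        for lang in LANGS
        for label in LABELS
    }
```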
What We Found
1. FedGPT Shows the Lowest Propaganda Bias Across All Languages
Across Simplified Chinese, Traditional Chinese, and English, FedGPT produces fewer propaganda-positive responses than both DeepSeek-R1 and o3-mini-high.
- Simplified Chinese:
  - DeepSeek-R1: 6.83%
  - o3-mini-high: 4.83%
  - FedGPT: 2.60%
- Traditional Chinese:
  - DeepSeek-R1: 2.42%
  - o3-mini-high: 1.58%
  - FedGPT: 1.00%
- English:
  - All three models showed near-zero propaganda.
FedGPT consistently demonstrated lower ideological skew, particularly in Chinese scripts. Guardian’s pre-output filtering likely contributes to this difference.
2. Propaganda Concentrates in Specific Topics
All models showed the highest bias levels in:
- International Relations / Geopolitics
- Culture / Art / Entertainment
- Technology / Industry
The presence of subtle bias in cultural and entertainment topics mirrors long-standing PRC soft-power strategies, where national messaging is embedded through lifestyle narratives rather than explicit political statements.
3. DeepSeek-R1 Shows Noticeable Anti-U.S. Sentiment in Chinese Scripts
Anti-U.S. sentiment was overwhelmingly concentrated in DeepSeek-R1:
- Simplified Chinese: 5.00%
- Traditional Chinese: 2.42%
- English: 0.42%
By contrast, FedGPT and o3-mini-high scored 0% in all three languages.
This pattern is consistent with the geopolitical orientation of R1’s training data and alignment target.
FedGPT In-Depth: Where Do Biases Emerge?
Because FedGPT is optimized for Traditional Chinese, we examined where its Traditional Chinese responses showed propaganda-positive labels.
Language Asymmetry
Out of 1,200 prompts:
- Only 2 prompts were propaganda-positive in all three languages.
- 5 prompts were positive in both Chinese scripts.
- 24 prompts were positive only in Simplified Chinese (likely due to weaker tuning in zh-CN).
- 5 prompts were flagged only in Traditional Chinese, but three involved self-censorship in Simplified Chinese, leaving only 2 true Traditional-only cases.
These two true TC-only cases involved:
- Advanced semiconductor development (competitive national framing)
- National-scale celebratory events (weakly labeled as propaganda)
In both cases, the prompts themselves contained strategic or national-power cues, making them naturally more likely to elicit state-aligned narratives from any model.
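The overlap breakdown above can be derived with simple set arithmetic over the per-language sets of flagged prompt IDs. The sketch below uses illustrative IDs, not the study's actual data, chosen so the bucket sizes match the reported counts (and the reported rates: 31 zh-CN positives ≈ 2.60% and 12 zh-TW positives = 1.00% of 1,200).

```python
# Hedged sketch: derive the language-overlap buckets from per-language
# sets of flagged prompt IDs. The IDs below are illustrative only,
# constructed to match the counts reported above.
flagged = {
    "zh-TW": set(range(1, 13)),                        # 12 = 1.00% of 1,200
    "zh-CN": set(range(1, 8)) | set(range(100, 124)),  # 31 ≈ 2.60% of 1,200
    "en":    {1, 2},
}

all_three    = flagged["zh-TW"] & flagged["zh-CN"] & flagged["en"]
both_chinese = (flagged["zh-TW"] & flagged["zh-CN"]) - flagged["en"]
sc_only      = flagged["zh-CN"] - flagged["zh-TW"] - flagged["en"]
tc_only      = flagged["zh-TW"] - flagged["zh-CN"] - flagged["en"]

print(len(all_three), len(both_chinese), len(sc_only), len(tc_only))
# With the toy data above: 2 5 24 5, matching the reported breakdown.
```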
Propaganda Subdimensions
Among FedGPT’s Traditional-Chinese propaganda-positive responses, the strongest signals were:
- Emotional Mobilization & Symbol Use
- Ideological & Narrative Alignment
These reflect soft-power framing—pride, national strength, modernization, international recognition—rather than explicit ideological messaging.
Importantly, outright propaganda or confrontational rhetoric was never observed.
Conclusion: FedGPT Is the Least Biased Model in the Comparison
Across all languages and topics, FedGPT demonstrates:
- No anti-U.S. sentiment
- Lower propaganda rates than both DeepSeek-R1 and o3-mini-high
- Bias that is subtle, narrative-driven, and context-triggered
- Behavior that is reactive to prompts rather than ideologically generative
While some narrative alignment appears in Traditional Chinese outputs, these instances are limited and typically tied to prompt cues rather than systemic tendencies.
Overall, FedGPT emerges as the best-aligned model in this three-way evaluation and the most resistant to geopolitical skew—highlighting the importance of multilingual, script-aware testing and localized safety alignment for high-stakes applications.
Report PDF: Analysis of LLM Bias in FedGPT


