How to disable external entity references in PHP to prevent XXE

Created: October 17, 2025

Model used for the answer: GPT-5 Thinking by Chat01

Question

How do I disable external entity references in PHP to prevent XXE?

Thinking

Addressing XXE prevention

The question asks how to disable external entity references in PHP to defend against XXE attacks. We can provide code examples for older versions using libxml_disable_entity_loader, but this function has been deprecated since PHP 8.0. External entity loading is disabled by default in libxml 2.9.0 and later, which is the minimum libxml version PHP 8.x requires.

Clarifying XXE prevention steps

In PHP 8.0, libxml_disable_entity_loader() became deprecated because external entity loading is already disabled by default (since libxml 2.9.0), so calling it is unnecessary. For XXE prevention, rely on conservative parser options instead. In particular, do not set LIBXML_NOENT: despite its name, it turns entity substitution on.

Setting proper security options

To prevent XXE, avoid using LIBXML_NOENT as it expands entities (both internal and external). Instead, use LIBXML_NONET to block network access while parsing. In DOMDocument, keep DTD loading off by not passing LIBXML_DTDLOAD, and keep entity substitution off by setting $dom->substituteEntities = false. Pass parser options explicitly when loading XML, and avoid any configuration that allows external loading.

Preventing DTD Processing and Securing XML

To prevent DTD processing, avoid using LIBXML_DTDLOAD. In PHP, we can use libxml_set_external_entity_loader to block external entities, although it's not commonly used. For PHP 8, external entity loading is disabled by default, but XInclude might still be risky. To secure DOMDocument or SimpleXML, use LIBXML_NONET to block network access and ensure DTD-related options like LIBXML_DTDLOAD and LIBXML_NOENT are not set. Additionally, disable resolveExternals and validateOnParse properties.

Thought for 47s

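Answer

Below is an "XXE protection cheat sheet" covering the most common ways to parse XML in PHP (DOMDocument / SimpleXML / XMLReader), plus notes on legacy compatibility.

General principles

  1. Reject DTD / DOCTYPE: if the input contains <!DOCTYPE, reject it outright.
  2. Block network access: pass LIBXML_NONET when parsing, which prevents remote entities and external DTDs from being fetched over the network.
  3. Do not expand entities: do not use LIBXML_NOENT (despite its name, it turns entity substitution on, which is what XXE exploits).
  4. Do not explicitly load DTDs: do not pass LIBXML_DTDLOAD / LIBXML_DTDATTR, and do not validate during parsing, since that can load external resources.
  5. In PHP 8+, stop using libxml_disable_entity_loader: it has been deprecated since PHP 8.0 and is unnecessary, because PHP 8 requires libxml 2.9.0 or later, where external entity loading is disabled by default.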

DOMDocument (recommended approach)

php
function dom_load_xml_safe(string $xml): DOMDocument
{
    // 1) Reject any XML that carries a DOCTYPE outright (the safest option)
    if (stripos($xml, '<!DOCTYPE') !== false) {
        throw new RuntimeException('DTD/DOCTYPE is not allowed; refusing to parse.');
    }

    libxml_use_internal_errors(true); // collect parse errors instead of emitting them

    $dom = new DOMDocument();

    // Turn these behaviors off explicitly (false is already the default; restated for emphasis)
    $dom->resolveExternals   = false; // do not resolve external entities
    $dom->substituteEntities = false; // do not expand entities
    $dom->validateOnParse    = false; // do not validate while parsing (validation can trigger external access)

    // 2) Block network access and suppress warnings
    $options = LIBXML_NONET | LIBXML_NOERROR | LIBXML_NOWARNING;

    if (!$dom->loadXML($xml, $options)) {
        throw new RuntimeException('Invalid XML or parse failure.');
    }

    return $dom;
}
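A byte-level search for <!DOCTYPE can miss a declaration when the document arrives in another encoding such as UTF-16. As a complementary check, the sketch below parses with conservative options and then rejects the document if the parser itself reports a doctype node. The function name dom_load_xml_reject_doctype is hypothetical and not part of the original answer, and the sketch assumes libxml 2.9.0 or later, where external entities are not loaded by default.

```php
<?php
// Hypothetical helper: parse first with conservative options, then reject
// any document for which the parser itself reports a DOCTYPE node.
function dom_load_xml_reject_doctype(string $xml): DOMDocument
{
    $previous = libxml_use_internal_errors(true); // keep parse errors internal
    try {
        $dom = new DOMDocument();
        // No LIBXML_NOENT / LIBXML_DTDLOAD; LIBXML_NONET blocks network fetches.
        $loaded = $dom->loadXML($xml, LIBXML_NONET);
        libxml_clear_errors();
    } finally {
        libxml_use_internal_errors($previous); // restore the caller's setting
    }

    if (!$loaded) {
        throw new RuntimeException('Invalid XML or parse failure.');
    }
    if ($dom->doctype !== null) {
        // The parser saw a DOCTYPE, whatever the byte encoding of the input was.
        throw new RuntimeException('DTD/DOCTYPE is not allowed.');
    }

    return $dom;
}
```

The string check and the doctype check can be combined: the first rejects obvious cases before any parsing, the second catches declarations the byte search did not recognize.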

Parsing a file (rather than a string)

php
$dom = new DOMDocument();
$dom->resolveExternals   = false;
$dom->substituteEntities = false;
$dom->validateOnParse    = false;
$dom->load('/path/local.xml', LIBXML_NONET | LIBXML_NOERROR | LIBXML_NOWARNING);

SimpleXML

php
function simplexml_load_string_safe(string $xml): SimpleXMLElement
{
    // Reject any XML that carries a DOCTYPE
    if (stripos($xml, '<!DOCTYPE') !== false) {
        throw new RuntimeException('DTD/DOCTYPE is not allowed.');
    }

    libxml_use_internal_errors(true); // collect parse errors instead of emitting them

    // Block network access and suppress warnings
    $options = LIBXML_NONET | LIBXML_NOERROR | LIBXML_NOWARNING;

    $sx = simplexml_load_string($xml, 'SimpleXMLElement', $options);
    if ($sx === false) {
        throw new RuntimeException('Invalid XML or parse failure.');
    }

    return $sx;
}

XMLReader (streaming parser, suited to large files)

php
$reader = new XMLReader();

// Block network access
$reader->XML($xml, null, LIBXML_NONET);

// Explicitly disable DTD loading, entity substitution, and validation
$reader->setParserProperty(XMLReader::LOADDTD, false);
$reader->setParserProperty(XMLReader::SUBST_ENTITIES, false);
$reader->setParserProperty(XMLReader::VALIDATE, false);

// Start reading
while ($reader->read()) {
    // ...
}

$reader->close();
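The "reject DOCTYPE" principle can also be enforced while streaming, because XMLReader reports a DOCTYPE as a node of type XMLReader::DOC_TYPE. The following is a minimal sketch under that assumption; the function name xmlreader_element_names is hypothetical, and collecting element names merely stands in for real processing.

```php
<?php
// Hypothetical helper: stream through the document, abort on a DOCTYPE node,
// and collect element names as a stand-in for real processing.
function xmlreader_element_names(string $xml): array
{
    $previous = libxml_use_internal_errors(true); // keep parse errors internal
    libxml_clear_errors();
    $reader = new XMLReader();
    try {
        if (!$reader->XML($xml, null, LIBXML_NONET)) {
            throw new RuntimeException('Could not open XML input.');
        }
        $names = [];
        while ($reader->read()) {
            if ($reader->nodeType === XMLReader::DOC_TYPE) {
                throw new RuntimeException('DTD/DOCTYPE is not allowed.');
            }
            if ($reader->nodeType === XMLReader::ELEMENT) {
                $names[] = $reader->name;
            }
        }
        // read() returns false at end of input and on parse errors alike,
        // so check whether libxml recorded an error.
        if (libxml_get_last_error() !== false) {
            throw new RuntimeException('Invalid XML or parse failure.');
        }
        return $names;
    } finally {
        $reader->close();
        libxml_clear_errors();
        libxml_use_internal_errors($previous); // restore the caller's setting
    }
}
```

Because the DOCTYPE, when present, precedes the root element, the loop stops before any element content is processed.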

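Legacy projects (PHP 7.x and earlier): compatibility notes

  • The traditional approach wraps the parsing call like this:
    php
    $prev = libxml_disable_entity_loader(true);
    // parse...
    libxml_disable_entity_loader($prev);
  • In PHP 8+ this is no longer the right approach (the function is deprecated and unnecessary). If you still maintain an old environment, also combine it with the DOCTYPE rejection and LIBXML_NONET shown above as a safety net.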
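For code that must run on both PHP 7.x and PHP 8+, the legacy call can be guarded by a version check so that the deprecated function is never invoked on newer runtimes. This is a sketch only; the wrapper name with_legacy_entity_loader_disabled is hypothetical.

```php
<?php
// Hypothetical wrapper: toggles the legacy switch only on PHP < 8.0 and always
// restores the previous state, even if the parsing callback throws.
function with_legacy_entity_loader_disabled(callable $parse)
{
    $legacy   = PHP_VERSION_ID < 80000;
    $previous = $legacy ? libxml_disable_entity_loader(true) : null;
    try {
        return $parse();
    } finally {
        if ($legacy) {
            libxml_disable_entity_loader($previous);
        }
    }
}

// Usage: parse a string with network access blocked.
$doc = with_legacy_entity_loader_disabled(function () {
    $dom = new DOMDocument();
    $dom->loadXML('<root/>', LIBXML_NONET);
    return $dom;
});
```

Note that on PHP 7.x, disabling the entity loader can also make file-based calls such as DOMDocument::load() fail, so the callback here parses from a string.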

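Other hardening suggestions

  • Whitelist input: accept only the expected sources and formats, and limit input size (to guard against oversized or "billion laughs"-style payloads).
  • Switch to JSON: if XML is not strictly required, prefer JSON to shrink this attack surface.
  • Review third-party libraries: some libraries use LIBXML_NOENT or load DTDs internally, so check their parser options.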
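The thinking notes mention libxml_set_external_entity_loader as a further, less common safeguard. It can act as a backstop when a third-party library turns on risky options behind your back. The sketch below installs a resolver that refuses every external entity; the helper name install_deny_all_entity_loader is hypothetical.

```php
<?php
// Hypothetical helper: install a resolver that refuses every external entity.
// Returning null tells libxml that the entity could not be resolved.
function install_deny_all_entity_loader(): bool
{
    return libxml_set_external_entity_loader(
        static function ($publicId, $systemId, $context) {
            return null;
        }
    );
}

install_deny_all_entity_loader();
```

The loader is a process-wide setting, so it affects every libxml-based parser in the request. Passing null to libxml_set_external_entity_loader restores the default behavior.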

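At a glance: which options should you avoid?

  • ❌ LIBXML_NOENT (expands entities and is easily exploited for XXE)
  • ❌ LIBXML_DTDLOAD / LIBXML_DTDATTR (load and process the DTD)
  • ❌ Validating during parsing (validateOnParse = true often leads to external access)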
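When reviewing existing code, the list above can be turned into a small audit check on the options bitmask. This is a minimal sketch; the function name find_risky_libxml_options is hypothetical, and it covers only the three flags named in this answer.

```php
<?php
// Hypothetical audit helper: returns the names of risky flags present in a
// libxml options bitmask, so callers can log or reject them.
function find_risky_libxml_options(int $options): array
{
    $risky = [
        'LIBXML_NOENT'   => LIBXML_NOENT,
        'LIBXML_DTDLOAD' => LIBXML_DTDLOAD,
        'LIBXML_DTDATTR' => LIBXML_DTDATTR,
    ];

    $found = [];
    foreach ($risky as $name => $flag) {
        if (($options & $flag) === $flag) {
            $found[] = $name;
        }
    }

    return $found;
}

// Usage: an empty array means none of the listed flags are set.
$problems = find_risky_libxml_options(LIBXML_NONET | LIBXML_NOERROR | LIBXML_NOWARNING);
```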

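Following the templates above should, in most cases, be enough to disable external entity references and prevent XXE. If you paste your current parsing code, I can convert it directly into a safe version.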
