HTML Sanitizer ¶
The HTML Sanitizer component was introduced in Symfony 6.1.
The HTML Sanitizer component aims at sanitizing/cleaning untrusted HTML code (e.g. created by a WYSIWYG editor in the browser) into HTML that can be trusted. It is based on the HTML Sanitizer W3C Standard Proposal.
The HTML sanitizer creates a new HTML structure from scratch, taking only the elements and attributes that are allowed by configuration. This means that the returned HTML is very predictable (it only contains allowed elements), but it does not work well with badly formatted input (e.g. invalid HTML). The sanitizer is targeted for two use cases:
- Preventing security attacks based on XSS or other techniques that rely on executing malicious code in visitors' browsers;
- Generating HTML that always respects a certain format (only certain tags, attributes, hosts, etc.) so that the resulting output can be styled consistently with CSS. This also protects your application against attacks that rely on, e.g., changing the CSS of the whole page.
Installation ¶
You can install the HTML Sanitizer component with:
    $ composer require symfony/html-sanitizer
Basic Usage ¶
Use the HtmlSanitizer class to sanitize the HTML. In the Symfony framework, this class is available as the html_sanitizer service. This service is autowired when type-hinting for HtmlSanitizerInterface:
    // src/Controller/BlogPostController.php
    namespace App\Controller;

    // ...
    use Symfony\Bundle\FrameworkBundle\Controller\AbstractController;
    use Symfony\Component\HtmlSanitizer\HtmlSanitizerInterface;
    use Symfony\Component\HttpFoundation\Request;
    use Symfony\Component\HttpFoundation\Response;

    class BlogPostController extends AbstractController
    {
        public function createAction(HtmlSanitizerInterface $htmlSanitizer, Request $request): Response
        {
            // fall back to an empty string when the field is missing
            $unsafeContents = $request->request->get('post_contents', '');
            $safeContents = $htmlSanitizer->sanitize($unsafeContents);
            // ... proceed using the safe HTML
        }
    }
Note
The default configuration of the HTML sanitizer allows all "safe" elements and attributes, as defined by the W3C Standard Proposal. In practice, this means that the resulting code will not contain any scripts, styles or other elements that can cause the website to behave or look different. Later in this article, you'll learn how to fully customize the HTML sanitizer.
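Outside the framework, the sanitizer can also be instantiated by hand. The following is a minimal sketch, assuming the component was installed with Composer and that the autoloader lives at the usual vendor/autoload.php path; the input string is purely illustrative:

```php
// standalone-example.php (hypothetical file name)
require __DIR__.'/vendor/autoload.php';

use Symfony\Component\HtmlSanitizer\HtmlSanitizer;
use Symfony\Component\HtmlSanitizer\HtmlSanitizerConfig;

// start from the "safe elements" baseline; the config object is immutable,
// so each method call returns a new instance
$config = (new HtmlSanitizerConfig())->allowSafeElements();
$htmlSanitizer = new HtmlSanitizer($config);

// illustrative untrusted input
$unsafeContents = '<p onclick="steal()">Hello <script>alert(1)</script>world</p>';

// <script> and onclick are not on the allow list, so they should not
// appear in the sanitized result
$safeContents = $htmlSanitizer->sanitize($unsafeContents);
```

Because only allowed elements and attributes are copied into the new structure, anything not explicitly permitted by the configuration is left out of the result.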
Sanitizing HTML for a Specific Context ¶
The default sanitize() method cleans the HTML code for usage in the <body> element. Using the sanitizeFor() method, you can instruct the HTML sanitizer to customize this for the <head> or a more specific HTML tag:
    // tags not allowed in <head> will be removed
    $safeInput = $htmlSanitizer->sanitizeFor('head', $userInput);

    // encodes the returned HTML using HTML entities
    $safeInput = $htmlSanitizer->sanitizeFor('title', $userInput);
    $safeInput = $htmlSanitizer->sanitizeFor('textarea', $userInput);

    // uses the <body> context, removing tags only allowed in <head>
    $safeInput = $htmlSanitizer->sanitizeFor('body', $userInput);
    $safeInput = $htmlSanitizer->sanitizeFor('section', $userInput);
Sanitizing HTML from Form Input ¶
The HTML sanitizer component directly integrates with Symfony Forms, to sanitize the form input before it is processed by your application.
You can enable the sanitizer in TextType forms, or any form extending this type (such as TextareaType), using the sanitize_html option:
    // src/Form/BlogPostType.php
    namespace App\Form;

    // ...
    use Symfony\Component\Form\AbstractType;
    use Symfony\Component\OptionsResolver\OptionsResolver;

    class BlogPostType extends AbstractType
    {
        // ...

        public function configureOptions(OptionsResolver $resolver): void
        {
            $resolver->setDefaults([
                'sanitize_html' => true,
                // use the "sanitizer" option to use a custom sanitizer (see below)
                //'sanitizer' => 'app.post_sanitizer',
            ]);
        }
    }
Sanitizing HTML in Twig Templates ¶
Besides sanitizing user input, you can also sanitize HTML code before outputting it in a Twig template using the sanitize_html filter:
    {{ post.body|sanitize_html }}

    {# you can also use a custom sanitizer (see below) #}
    {{ post.body|sanitize_html('app.post_sanitizer') }}
Configuration ¶
The behavior of the HTML sanitizer can be fully customized. This allows you to explicitly state which elements, attributes and even attribute values are allowed.
You can do this by defining a new HTML sanitizer in the configuration:
    # config/packages/html_sanitizer.yaml
    framework:
        html_sanitizer:
            sanitizers:
                app.post_sanitizer:
                    block_elements:
                        - h1
This configuration defines a new html_sanitizer.sanitizer.app.post_sanitizer service. This service will be autowired for services having an HtmlSanitizerInterface $appPostSanitizer parameter.
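As a sketch of how that named autowiring might be consumed, the class below injects the custom sanitizer through its constructor. The PostFormatter class, its namespace and its format() method are hypothetical names used only for illustration; the part taken from the text above is the HtmlSanitizerInterface $appPostSanitizer parameter:

```php
// src/Service/PostFormatter.php (hypothetical service, for illustration only)
namespace App\Service;

use Symfony\Component\HtmlSanitizer\HtmlSanitizerInterface;

final class PostFormatter
{
    public function __construct(
        // the parameter name "$appPostSanitizer" selects the
        // "app.post_sanitizer" sanitizer instead of the default one
        private HtmlSanitizerInterface $appPostSanitizer,
    ) {
    }

    public function format(string $untrustedHtml): string
    {
        return $this->appPostSanitizer->sanitize($untrustedHtml);
    }
}
```

With a differently named parameter (e.g. $htmlSanitizer), the same type-hint would be expected to resolve to the default html_sanitizer service.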
Allow Element Baselines ¶
You can start the custom HTML sanitizer by using one of the two baselines:
- Static elements: all elements and attributes on the baseline allow lists from the W3C Standard Proposal (this does not include scripts).
- Safe elements: all elements and attributes from the "static elements" list, excluding elements and attributes that can also lead to CSS injection/click-jacking.
    # config/packages/html_sanitizer.yaml
    framework:
        html_sanitizer:
            sanitizers:
                app.post_sanitizer:
                    # enable either of these
                    allow_safe_elements: true
                    allow_static_elements: true
Allow Elements ¶
This adds elements to the allow list. For each element, you can also specify the allowed attributes on that element. If not given, all allowed attributes from the W3C Standard Proposal are allowed.
    # config/packages/html_sanitizer.yaml
    framework:
        html_sanitizer:
            sanitizers:
                app.post_sanitizer:
                    # ...
                    allow_elements:
                        # allow the <article> element and 2 attributes
                        article: ['class', 'data-attr']
                        # allow the <img> element and preserve the src attribute
                        img: 'src'
                        # allow the <h1> element with all safe attributes
                        h1: '*'
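For standalone use, the same allow list can be sketched with the HtmlSanitizerConfig builder. This is an illustrative equivalent of the YAML above, not a required form; the variable names are arbitrary:

```php
use Symfony\Component\HtmlSanitizer\HtmlSanitizer;
use Symfony\Component\HtmlSanitizer\HtmlSanitizerConfig;

$config = (new HtmlSanitizerConfig())
    // allow the <article> element and 2 attributes
    ->allowElement('article', ['class', 'data-attr'])
    // allow the <img> element and preserve the src attribute
    ->allowElement('img', 'src')
    // allow the <h1> element with all safe attributes
    ->allowElement('h1', '*');

$postSanitizer = new HtmlSanitizer($config);
```

No baseline is enabled in this sketch, so only the three listed elements would be kept.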
Block and Drop Elements ¶
You can also block (the element will be removed, but its children will be kept) or drop (the element and its children will be removed) elements.
This can also be used to remove elements from the allow list.
    # config/packages/html_sanitizer.yaml
    framework:
        html_sanitizer:
            sanitizers:
                app.post_sanitizer:
                    # ...
                    # remove <div>, but process the children
                    block_elements: ['div']
                    # remove <figure> and its children
                    drop_elements: ['figure']
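The difference between blocking and dropping is easiest to see on a small input. The sketch below assumes standalone use and the "safe elements" baseline; the sample markup is invented for illustration:

```php
use Symfony\Component\HtmlSanitizer\HtmlSanitizer;
use Symfony\Component\HtmlSanitizer\HtmlSanitizerConfig;

$config = (new HtmlSanitizerConfig())
    ->allowSafeElements()
    // remove <div>, but process the children
    ->blockElement('div')
    // remove <figure> and its children
    ->dropElement('figure');

$postSanitizer = new HtmlSanitizer($config);

// illustrative input: a <div> wrapping a paragraph and a figure
$input = '<div><p>Kept text</p><figure><img src="/a.png"></figure></div>';

// the <p> is expected to survive without its <div> wrapper, while the
// <figure> and the <img> inside it are expected to be gone
$output = $postSanitizer->sanitize($input);
```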
Allow Attributes ¶
Using this option, you can specify which attributes will be preserved in the returned HTML. The attribute will be allowed on the given elements, or on all elements allowed before this setting.
    # config/packages/html_sanitizer.yaml
    framework:
        html_sanitizer:
            sanitizers:
                app.post_sanitizer:
                    # ...
                    allow_attributes:
                        # allow "src" on <iframe> elements
                        src: ['iframe']

                        # allow "data-attr" on all elements currently allowed
                        data-attr: '*'
Drop Attributes ¶
This option allows you to disallow attributes that were allowed before.
    # config/packages/html_sanitizer.yaml
    framework:
        html_sanitizer:
            sanitizers:
                app.post_sanitizer:
                    # ...
                    allow_attributes:
                        # allow the "data-attr" on all safe elements...
                        data-attr: '*'

                    drop_attributes:
                        # ...except for the <section> element
                        data-attr: ['section']
                        # disallows "style" on any allowed element
                        style: '*'
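In standalone use, allowing and dropping attributes can be sketched with allowAttribute() and dropAttribute(). The example assumes the "safe elements" baseline as a starting point, which the YAML above leaves implicit behind its "# ..." line:

```php
use Symfony\Component\HtmlSanitizer\HtmlSanitizer;
use Symfony\Component\HtmlSanitizer\HtmlSanitizerConfig;

$config = (new HtmlSanitizerConfig())
    // assumed baseline; the attributes below apply to the elements it allows
    ->allowSafeElements()
    // allow "data-attr" on all elements currently allowed...
    ->allowAttribute('data-attr', '*')
    // ...except for the <section> element
    ->dropAttribute('data-attr', ['section'])
    // disallow "style" on any allowed element
    ->dropAttribute('style', '*');

$postSanitizer = new HtmlSanitizer($config);
```

The calls are written in the same order as the YAML keys: the attribute is allowed broadly first and then removed again for one element.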
Force Attribute Values ¶
Using this option, you can force an attribute with a given value on an element. For instance, use the following config to always set rel="noopener noreferrer" on each <a> element (even if the original one didn't contain a rel attribute):
    # config/packages/html_sanitizer.yaml
    framework:
        html_sanitizer:
            sanitizers:
                app.post_sanitizer:
                    # ...
                    force_attributes:
                        a:
                            rel: noopener noreferrer
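A standalone sketch of the same rule, assuming the "safe elements" baseline so that <a> elements are allowed in the first place:

```php
use Symfony\Component\HtmlSanitizer\HtmlSanitizer;
use Symfony\Component\HtmlSanitizer\HtmlSanitizerConfig;

$config = (new HtmlSanitizerConfig())
    ->allowSafeElements()
    // arguments: element name, attribute name, forced value
    ->forceAttribute('a', 'rel', 'noopener noreferrer');

$postSanitizer = new HtmlSanitizer($config);

// illustrative input: a link without any rel attribute
$output = $postSanitizer->sanitize('<a href="https://symfony.com">Symfony</a>');
```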
Force/Allow Link URLs ¶
Besides allowing/blocking elements and attributes, you can also control the URLs of <a> elements:
    # config/packages/html_sanitizer.yaml
    framework:
        html_sanitizer:
            sanitizers:
                app.post_sanitizer:
                    # ...

                    # if `true`, all URLs will be forced using the `https://` scheme (instead
                    # of e.g. `http://` or `mailto:`)
                    force_https_urls: true

                    # specifies the allowed URL schemes. If the URL has a different scheme, the
                    # attribute will be dropped
                    allowed_link_schemes: ['http', 'https', 'mailto']

                    # specifies the allowed hosts, the attribute will be dropped if the
                    # URL contains a different host
                    allowed_link_hosts: ['symfony.com']

                    # whether to allow relative links (i.e. URLs without scheme and host)
                    allow_relative_links: true
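The link options map onto builder methods of HtmlSanitizerConfig. The following standalone sketch mirrors the YAML above and assumes the "safe elements" baseline; the host list is just the example value used in this article:

```php
use Symfony\Component\HtmlSanitizer\HtmlSanitizer;
use Symfony\Component\HtmlSanitizer\HtmlSanitizerConfig;

$config = (new HtmlSanitizerConfig())
    ->allowSafeElements()
    // force the https:// scheme on URLs
    ->forceHttpsUrls()
    // links with any other scheme lose their URL attribute
    ->allowLinkSchemes(['http', 'https', 'mailto'])
    // links pointing to any other host lose their URL attribute
    ->allowLinkHosts(['symfony.com'])
    // keep links without scheme and host, e.g. "/blog"
    ->allowRelativeLinks();

$postSanitizer = new HtmlSanitizer($config);
```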
Force/Allow Media URLs ¶
Like link URLs, you can also control the URLs of other media in the HTML. The following attributes are checked by the HTML sanitizer: src, href, lowsrc, background and ping.
    # config/packages/html_sanitizer.yaml
    framework:
        html_sanitizer:
            sanitizers:
                app.post_sanitizer:
                    # ...

                    # if `true`, all URLs will be forced using the `https://` scheme (instead
                    # of e.g. `http://` or `data:`)
                    force_https_urls: true

                    # specifies the allowed URL schemes. If the URL has a different scheme, the
                    # attribute will be dropped
                    allowed_media_schemes: ['http', 'https', 'mailto']

                    # specifies the allowed hosts, the attribute will be dropped if the URL
                    # contains a different host
                    allowed_media_hosts: ['symfony.com']

                    # whether to allow relative URLs (i.e. URLs without scheme and host)
                    allow_relative_medias: true
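The media options have their own builder methods, separate from the link ones. A standalone sketch mirroring the YAML above, again assuming the "safe elements" baseline:

```php
use Symfony\Component\HtmlSanitizer\HtmlSanitizer;
use Symfony\Component\HtmlSanitizer\HtmlSanitizerConfig;

$config = (new HtmlSanitizerConfig())
    ->allowSafeElements()
    ->forceHttpsUrls()
    // media URLs with any other scheme lose their attribute
    ->allowMediaSchemes(['http', 'https', 'mailto'])
    // media URLs pointing to any other host lose their attribute
    ->allowMediaHosts(['symfony.com'])
    // keep media URLs without scheme and host, e.g. "/img/logo.png"
    ->allowRelativeMedias();

$postSanitizer = new HtmlSanitizer($config);
```

Keeping link and media rules separate makes it possible, for example, to accept links to any host while only embedding media from a trusted one.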
Custom Attribute Sanitizers ¶
Controlling the link and media URLs is done by the UrlAttributeSanitizer. You can also implement your own attribute sanitizer, to control the value of other attributes in the HTML. Create a class implementing AttributeSanitizerInterface and register it as a service. After this, use with_attribute_sanitizers to enable it for an HTML sanitizer:
    # config/packages/html_sanitizer.yaml
    framework:
        html_sanitizer:
            sanitizers:
                app.post_sanitizer:
                    # ...
                    with_attribute_sanitizers:
                        - App\Sanitizer\CustomAttributeSanitizer

                    # you can also disable previously enabled custom attribute sanitizers
                    #without_attribute_sanitizers:
                    #    - App\Sanitizer\CustomAttributeSanitizer
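As an illustration of what such a class might look like, the sketch below restricts the class attribute to a fixed set of names. The allow list and the filtering logic are hypothetical; the three methods are those of AttributeSanitizerInterface, and the sketch assumes that returning null from sanitizeAttribute() causes the attribute to be left out:

```php
// src/Sanitizer/CustomAttributeSanitizer.php
namespace App\Sanitizer;

use Symfony\Component\HtmlSanitizer\HtmlSanitizerConfig;
use Symfony\Component\HtmlSanitizer\Visitor\AttributeSanitizer\AttributeSanitizerInterface;

final class CustomAttributeSanitizer implements AttributeSanitizerInterface
{
    // hypothetical allow list of CSS class names
    private const ALLOWED_CLASSES = ['note', 'warning'];

    public function getSupportedElements(): ?array
    {
        // null: apply to every element
        return null;
    }

    public function getSupportedAttributes(): ?array
    {
        return ['class'];
    }

    public function sanitizeAttribute(string $element, string $attribute, string $value, HtmlSanitizerConfig $config): ?string
    {
        // split the attribute value on whitespace, ignoring empty chunks
        $classes = preg_split('/\s+/', trim($value), -1, PREG_SPLIT_NO_EMPTY) ?: [];
        $kept = array_intersect($classes, self::ALLOWED_CLASSES);

        // assumption: a null return value makes the sanitizer omit the attribute
        return $kept ? implode(' ', $kept) : null;
    }
}
```

The class attribute still has to be allowed by the sanitizer configuration for this code to be reached. In standalone use, an instance can be passed to HtmlSanitizerConfig::withAttributeSanitizer() instead of being referenced from YAML.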