Sanitizer
A class that sanitizes HTML elements which load external resources automatically. This is achieved by setting the attributes that load the external resources to a temporary URI/URL and saving the original attribute values in temporary attributes, so that the resources can be reloaded upon consent.
Elements that will be sanitized are:
- <link href="" />
- <script src="" />
- <iframe src="" />
- <embed src="" />
- <img src="" srcset="" />
- <audio src="" />
- <video src="" poster="" />
- <source src="" srcset="" />
- <track src="" />
- <object data="" />
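The attribute swap described above can be illustrated with a small standalone sketch. This is not the library's implementation: the helper `blockScriptSources()` is hypothetical, it handles only double-quoted `src` attributes on `<script>` tags, and the placeholder URI is an arbitrary choice. The attribute name `data-consent-original-src` follows the naming documented for `$attributes`.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of the library): points the src of every
 * <script> tag at a placeholder and keeps the original URL in a
 * data-consent-original-src attribute.
 */
function blockScriptSources(string $html, string $placeholder): string
{
    // Group 1: attributes before src, group 2: original URL, group 3: the rest.
    $pattern = '/<script\b([^>]*?)\ssrc\s*=\s*"([^"]*)"([^>]*)>/i';

    $result = preg_replace_callback(
        $pattern,
        static function (array $m) use ($placeholder): string {
            return sprintf(
                '<script%s src="%s" data-consent-original-src="%s"%s>',
                $m[1],
                htmlspecialchars($placeholder, ENT_QUOTES),
                $m[2],
                $m[3]
            );
        },
        $html
    );

    // preg_replace_callback() returns null on a regex error; keep the input then.
    return $result ?? $html;
}

$html = '<script defer src="https://example.com/tracker.js"></script>';
echo blockScriptSources($html, 'data:text/javascript;base64,'), PHP_EOL;
```

A client-side script can later copy the saved value back into `src` once consent has been given.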
Example:
// sanitizing the response before returning it to the client
// e.g. in kernel response event listener
$condition = function ($data) {
    // only html responses or check additionally for some consent cookie
    return strpos($data, '<!DOCTYPE html>') !== false;
};

$uris = [
    'link'   => sprintf('data:text/css;charset=UTF-8;base64,%s', base64_encode('body::after{content:"Blocked! Consent Please.";color:orangered}')),
    'script' => sprintf('data:text/javascript;charset=UTF-8;base64,%s', base64_encode('console.log("Blocked! Consent Please.")')),
    'iframe' => sprintf('data:text/html;charset=UTF-8;base64,%s', base64_encode('<div>Blocked! Consent Please.</div>')),
];

$whitelist = [
    'unpkg.com',
    'cdnjs.cloudflare.com',
];

$appends = [
    'body' => [
        '<script defer src="/path/to/client-side-code.js"></script>',
    ],
];

$sanitizedHTML = (new \MAKS\GDPRTools\Backend\Sanitizer())
    ->setData($html)
    ->setCondition($condition)
    ->setURIs($uris)
    ->setWhitelist($whitelist)
    ->setAppends($appends)
    ->sanitize()
    ->append('<script id="sanitization">{"sanitized":true}</script>', 'body') // add additional appends.
    ->get();

// sanitizing using the shorthand
$sanitizedHTML = (new \MAKS\GDPRTools\Backend\Sanitizer())->sanitizeData($html, $condition, $uris, $whitelist, $appends);

// sanitizing app entry
// (1) rename index.php to app.php
// (2) create index.php with following content
// (3) the result will simply be returned to the client
require '/path/to/src/Backend/Sanitizer.php';
\MAKS\GDPRTools\Backend\Sanitizer::sanitizeApp('./app.php', $condition, $uris, $whitelist, $appends);
Table of Contents
- ELEMENTS = [ // element => attributes 'link' => ['href'], 'script' => ['src'], 'iframe' => ['src'], 'embed' => ['src'], 'img' => ['src', 'srcset'], 'audio' => ['src'], 'video' => ['src', 'poster'], 'source' => ['src', 'srcset'], 'track' => ['src'], 'object' => ['data'], ]
- HTML elements that load external resources.
- INJECTION_MODE_AFTER = 'AFTER'
- `AFTER` injection mode.
- INJECTION_MODE_APPEND = 'APPEND'
- `APPEND` injection mode.
- INJECTION_MODE_BEFORE = 'BEFORE'
- `BEFORE` injection mode.
- INJECTION_MODE_PREPEND = 'PREPEND'
- `PREPEND` injection mode.
- VERSION = 'v1.4.2'
- Package version.
- INJECTION_MODES = [ // mode => [search => replacement] self::INJECTION_MODE_PREPEND => ['/(<\s*{target}[^>]*>)/i' => '$1{data}'], self::INJECTION_MODE_APPEND => ['/(<\/\s*{target}\s*>)/i' => '{data}$1'], self::INJECTION_MODE_BEFORE => ['/(<\s*{target}[^>]*>)/i' => '{data}$1'], self::INJECTION_MODE_AFTER => ['/(<\/\s*{target}\s*>)/i' => '$1{data}'], ]
- Injection modes search and replacements.
- $attributes : array<string|int, mixed>
- The overrides for the names of the attributes added after the sanitization.
- $appends : array<string|int, mixed>
- The list of appends for each target.
- $condition : callable
- The condition to check before sanitizing.
- $data : string
- The data to sanitize.
- $prepends : array<string|int, mixed>
- The list of prepends for each target.
- $result : string
- The result after the sanitization.
- $uris : array<string|int, mixed>
- The temporary URIs/URLs to replace the original sources with.
- $whitelist : array<string|int, mixed>
- The list of the whitelisted domains that should not be sanitized.
- __construct() : mixed
- Sanitizer constructor.
- append() : static
- Appends some data to the current data/result.
- get() : string
- Returns the current result and resets class internal state.
- inject() : static
- Injects data around or into an element (modes: `PREPEND`, `APPEND`, `BEFORE`, `AFTER`).
- prepend() : static
- Prepends some data in the current data/result.
- sanitize() : static
- Sanitizes the current data.
- sanitizeApp() : void
- Sanitize the HTML resulting from including the passed path.
- sanitizeData() : string
- Sanitize the given HTML code.
- setAppends() : static
- Sets the list of appends for each target.
- setCondition() : static
- Sets the condition to check that determines whether to sanitize the data or not.
- setData() : mixed
- Sets the data to sanitize.
- setPrepends() : static
- Sets the list of prepends for each target.
- setURIs() : static
- Sets the temporary URIs/URLs to set for each sanitized element.
- setWhitelist() : static
- Sets the list of whitelisted domains that should not be sanitized.
- bootstrap() : void
- Use this method instead of `self::__construct()` to bootstrap the object.
- getDomains() : array<int, string>
- Returns a list of domains that should not be sanitized.
- getReplaceCallback() : callable
- Returns the callback to replace the elements to sanitize.
- getSearchPattern() : string
- Returns the search pattern to find the elements to sanitize.
- getURIs() : array<string, string>
- Returns the Data-URIs to set to the sanitized elements.
Constants
ELEMENTS
HTML elements that load external resources.
public
array<string, array<string|int, mixed>>
ELEMENTS = [
    // element => attributes
    'link'   => ['href'],
    'script' => ['src'],
    'iframe' => ['src'],
    'embed'  => ['src'],
    'img'    => ['src', 'srcset'],
    'audio'  => ['src'],
    'video'  => ['src', 'poster'],
    'source' => ['src', 'srcset'],
    'track'  => ['src'],
    'object' => ['data'],
]
Available elements are:
- <link href="" />
- <script src="" />
- <iframe src="" />
- <embed src="" />
- <img src="" srcset="" />
- <audio src="" />
- <video src="" poster="" />
- <source src="" srcset="" />
- <track src="" />
- <object data="" />
INJECTION_MODE_AFTER
`AFTER` injection mode.
public
string
INJECTION_MODE_AFTER = 'AFTER'
INJECTION_MODE_APPEND
`APPEND` injection mode.
public
string
INJECTION_MODE_APPEND = 'APPEND'
INJECTION_MODE_BEFORE
`BEFORE` injection mode.
public
string
INJECTION_MODE_BEFORE = 'BEFORE'
INJECTION_MODE_PREPEND
`PREPEND` injection mode.
public
string
INJECTION_MODE_PREPEND = 'PREPEND'
VERSION
Package version.
public
string
VERSION = 'v1.4.2'
INJECTION_MODES
Injection modes search and replacements.
protected
array<string|int, mixed>
INJECTION_MODES = [
    // mode => [search => replacement]
    self::INJECTION_MODE_PREPEND => ['/(<\s*{target}[^>]*>)/i' => '$1{data}'],
    self::INJECTION_MODE_APPEND  => ['/(<\/\s*{target}\s*>)/i' => '{data}$1'],
    self::INJECTION_MODE_BEFORE  => ['/(<\s*{target}[^>]*>)/i' => '{data}$1'],
    self::INJECTION_MODE_AFTER   => ['/(<\/\s*{target}\s*>)/i' => '$1{data}'],
]
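To make the search/replacement templates concrete, the sketch below applies them to a small document. It is an illustration under assumptions rather than the class's actual code: `MODES` copies the templates with literal mode names as keys, and `injectSketch()` is a hypothetical function that expands `{target}` and `{data}` itself. A callback is used instead of a plain replacement string so that `$` or `\` inside the injected data is never read as a back-reference.

```php
<?php

declare(strict_types=1);

// Templates copied from INJECTION_MODES, keyed by the literal mode names.
const MODES = [
    'PREPEND' => ['/(<\s*{target}[^>]*>)/i' => '$1{data}'],
    'APPEND'  => ['/(<\/\s*{target}\s*>)/i' => '{data}$1'],
    'BEFORE'  => ['/(<\s*{target}[^>]*>)/i' => '{data}$1'],
    'AFTER'   => ['/(<\/\s*{target}\s*>)/i' => '$1{data}'],
];

/**
 * Hypothetical re-implementation of the injection step for illustration.
 */
function injectSketch(string $html, string $data, string $target, string $mode = 'APPEND'): string
{
    $rule     = MODES[$mode] ?? MODES['APPEND']; // unknown modes fall back to APPEND
    $template = array_key_first($rule);
    $pattern  = str_replace('{target}', preg_quote($target, '/'), $template);
    $output   = $rule[$template];

    $result = preg_replace_callback(
        $pattern,
        static function (array $m) use ($output, $data): string {
            // strtr() swaps both placeholders in one pass, without rescanning.
            return strtr($output, ['$1' => $m[1], '{data}' => $data]);
        },
        $html,
        1 // only the first matching tag is touched
    );

    return $result ?? $html;
}

$html = '<body><p>Hi</p></body>';
foreach (array_keys(MODES) as $mode) {
    echo $mode, ': ', injectSketch($html, '<!--x-->', 'body', $mode), PHP_EOL;
}
```

`PREPEND` and `BEFORE` match the opening tag, `APPEND` and `AFTER` match the closing tag; the position of `{data}` relative to `$1` decides whether the data lands inside or outside the element.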
Properties
$attributes
The overrides for the names of the attributes added after the sanitization.
public
static array<string|int, mixed>
$attributes = []
Available attributes are:
- data-consent-element
- data-consent-attribute
- data-consent-value
- data-consent-alternative
- data-consent-original-{{ attribute:[href|src|srcset|poster|data] }} (e.g. data-consent-original-src)
$appends
The list of appends for each target.
private
array<string|int, mixed>
$appends
$condition
The condition to check before sanitizing.
private
callable
$condition
$data
The data to sanitize.
private
string
$data
$prepends
The list of prepends for each target.
private
array<string|int, mixed>
$prepends
$result
The result after the sanitization.
private
string
$result
$uris
The temporary URIs/URLs to replace the original sources with.
private
array<string|int, mixed>
$uris
$whitelist
The list of the whitelisted domains that should not be sanitized.
private
array<string|int, mixed>
$whitelist
Methods
__construct()
Sanitizer constructor.
public
__construct() : mixed
Return values
mixed
append()
Appends some data to the current data/result.
public
append(string|array<string|int, mixed> $data[, string $target = 'body' ]) : static
This method is useful for adding <script> or <link> elements to the <head> and/or <body> elements.
NOTE: This method will append the data whether the data has changed (sanitized) or not.
Parameters
- $data : string|array<string|int, mixed>
  The data to inject, a string or an array of strings.
- $target : string = 'body'
  [optional] The target to append to. It's advisable to only add to top-level elements (i.e. <head>, <body>). The data will be appended to the first matching element only.
Return values
static
get()
Returns the current result and resets class internal state.
public
get() : string
Return values
string
inject()
Injects data around or into an element (modes: `PREPEND`, `APPEND`, `BEFORE`, `AFTER`).
public
inject(string|array<string|int, mixed> $data, string $target[, string $mode = self::INJECTION_MODE_APPEND ]) : static
Parameters
- $data : string|array<string|int, mixed>
  The data to inject, a string or an array of strings.
- $target : string
  The target to inject in. It's advisable to only use top-level and unique elements (i.e. <head>, <body>). The data will be injected in or around the first matching element only.
- $mode : string = self::INJECTION_MODE_APPEND
  The mode of injection. One of PREPEND, APPEND, BEFORE, or AFTER (defaults and falls back to APPEND).
Return values
static
prepend()
Prepends some data in the current data/result.
public
prepend(string|array<string|int, mixed> $data[, string $target = 'head' ]) : static
This method is useful for adding <script> or <link> elements to the <head> and/or <body> elements.
NOTE: This method will prepend the data whether the data has changed (sanitized) or not.
Parameters
- $data : string|array<string|int, mixed>
  The data to inject, a string or an array of strings.
- $target : string = 'head'
  [optional] The target to prepend in. It's advisable to only add to top-level elements (i.e. <head>, <body>). The data will be prepended in the first matching element only.
Return values
static
sanitize()
Sanitizes the current data.
public
sanitize() : static
Return values
static
sanitizeApp()
Sanitize the HTML resulting from including the passed path.
public
static sanitizeApp(string $path[, callable|null $condition = null ][, array<string, string>|null $uris = null ][, array<int, string>|null $whitelist = null ][, array<string, array<string, string[]|string>>|null $appends = null ][, array<string, array<string, string[]|string>>|null $prepends = null ][, array<string, array<string, string[]|string>>|null $injections = null ]) : void
NOTE: This method should be the last step in the application as it will flush all opened buffers to the client.
Parameters
- $path : string
  The path to the app entry point (index.php).
- $condition : callable|null = null
  [optional] The condition to check before sanitizing. The passed callback will be executed when calling self::sanitize() to check if the data should be sanitized. The callback will be passed the data and must return a boolean (signature: fn (string $data): bool). The callback should check for a cookie or something in the data (HTML) to determine whether to sanitize the data or not.
- $uris : array<string, string>|null = null
  [optional] The temporary URIs/URLs to set for each sanitized element. An associative array where keys are element names and values are the URIs (base64 encoded data) or normal URLs.
- $whitelist : array<int, string>|null = null
  [optional] An array of domains that should not be sanitized. Sub-domains must be specified separately.
- $appends : array<string, array<string, string[]|string>>|null = null
  [optional] The data to append. An associative array where keys are the targets to append to and values are a string or an array of strings of the data to append.
- $prepends : array<string, array<string, string[]|string>>|null = null
  [optional] The data to prepend. An associative array where keys are the targets to prepend to and values are a string or an array of strings of the data to prepend.
- $injections : array<string, array<string, string[]|string>>|null = null
  [optional] The data to inject. An associative array where keys are the injection modes and values have the same shape as $prepends or $appends.
Return values
void: The buffer will simply be flushed to the client.
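The general idea of sanitizing an included entry point can be sketched with PHP's output-control functions: capture what the script prints, post-process it, then send it. The code below is a simplified illustration and not the method's actual body. `captureOutput()` is a hypothetical helper, the temporary file stands in for `app.php`, and the `str_replace()` call stands in for the real sanitization. Unlike `sanitizeApp()`, it closes only the buffer it opened.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: runs a PHP file and returns whatever it printed.
 */
function captureOutput(string $path): string
{
    ob_start();
    try {
        require $path;

        return (string) ob_get_contents();
    } finally {
        ob_end_clean(); // always close the buffer this function opened
    }
}

// Stand-in for app.php, written to a temporary file for the demo.
$path = tempnam(sys_get_temp_dir(), 'app');
file_put_contents($path, '<?php echo "<!DOCTYPE html><p>Hello</p>";');

$html = captureOutput($path);
unlink($path);

// Stand-in for the sanitization step.
echo str_replace('Hello', 'Hello (filtered)', $html), PHP_EOL;
```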
sanitizeData()
Sanitize the given HTML code.
public
sanitizeData(string $data[, callable|null $condition = null ][, array<string, string>|null $uris = null ][, array<int, string>|null $whitelist = null ][, array<string, array<string, string[]|string>>|null $appends = null ][, array<string, array<string, string[]|string>>|null $prepends = null ][, array<string, array<string, string[]|string>>|null $injections = null ]) : string
Parameters
- $data : string
  The HTML code to sanitize.
- $condition : callable|null = null
  [optional] The condition to check before sanitizing. The passed callback will be executed when calling self::sanitize() to check if the data should be sanitized. The callback will be passed the data and must return a boolean (signature: fn (string $data): bool). The callback should check for a cookie or something in the data (HTML) to determine whether to sanitize the data or not.
- $uris : array<string, string>|null = null
  [optional] The temporary URIs/URLs to set for each sanitized element. An associative array where keys are element names and values are the URIs (base64 encoded data) or normal URLs.
- $whitelist : array<int, string>|null = null
  [optional] An array of domains that should not be sanitized. Sub-domains must be specified separately.
- $appends : array<string, array<string, string[]|string>>|null = null
  [optional] The data to append. An associative array where keys are the targets to append to and values are a string or an array of strings of the data to append.
- $prepends : array<string, array<string, string[]|string>>|null = null
  [optional] The data to prepend. An associative array where keys are the targets to prepend to and values are a string or an array of strings of the data to prepend.
- $injections : array<string, array<string, string[]|string>>|null = null
  [optional] The data to inject. An associative array where keys are the injection modes and values have the same shape as $prepends or $appends.
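The nesting of `$injections` is easiest to see with a concrete value. The sketch below assumes the shape mode => [target => string|string[]], as read from the parameter descriptions; `flattenInjections()` is a hypothetical helper that only lists what would be injected where, and it does not call the library.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: flattens an $injections-style array into
 * [mode, target, snippet] triples.
 *
 * Assumed shape: mode => [target => string|string[]].
 */
function flattenInjections(array $injections): array
{
    $steps = [];
    foreach ($injections as $mode => $targets) {
        foreach ($targets as $target => $data) {
            // A lone string is treated as a one-item list.
            foreach ((array) $data as $snippet) {
                $steps[] = [$mode, $target, $snippet];
            }
        }
    }

    return $steps;
}

$injections = [
    'BEFORE' => ['body' => '<!-- page starts -->'],
    'AFTER'  => ['body' => ['<!-- page ends -->', '<!-- sanitized -->']],
];

foreach (flattenInjections($injections) as [$mode, $target, $snippet]) {
    printf("%s %s: %s\n", $mode, $target, $snippet);
}
```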
Return values
string: The sanitized HTML code.
setAppends()
Sets the list of appends for each target.
public
setAppends(array<string, array<string, string[]|string>> $appends) : static
Parameters
- $appends : array<string, array<string, string[]|string>>
  The data to append. An associative array where keys are the targets to append to and values are a string or an array of strings of the data to append.
Return values
static
setCondition()
Sets the condition to check that determines whether to sanitize the data or not.
public
setCondition(callable $condition) : static
Parameters
- $condition : callable
  The condition to check before sanitizing. The passed callback will be executed when calling self::sanitize() to check if the data should be sanitized. The callback will be passed the data and must return a boolean (signature: fn (string $data): bool). The callback should check for a cookie or something in the data (HTML) to determine whether to sanitize the data or not.
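A condition that combines both checks mentioned above (a cookie and the HTML itself) might look like the following sketch. The factory `makeConsentCondition()`, the cookie name `consent`, and the value `granted` are assumptions made for illustration; only the callback signature `fn (string $data): bool` comes from the documentation.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical factory for a condition callback.
 *
 * @param array<string, string> $cookies Usually $_COOKIE.
 */
function makeConsentCondition(array $cookies, string $cookieName = 'consent'): callable
{
    return static function (string $data) use ($cookies, $cookieName): bool {
        $isHtml     = stripos($data, '<!DOCTYPE html>') !== false;
        $hasConsent = ($cookies[$cookieName] ?? '') === 'granted';

        // Sanitize only HTML documents requested without prior consent.
        return $isHtml && !$hasConsent;
    };
}

$condition = makeConsentCondition($_COOKIE);
var_dump($condition('<!DOCTYPE html><html></html>'));
```

Passing the cookies in as an argument, rather than reading `$_COOKIE` inside the closure, keeps the condition easy to test.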
Return values
static
setData()
Sets the data to sanitize.
public
setData(string $data) : mixed
Parameters
- $data : string
Return values
mixed
setPrepends()
Sets the list of prepends for each target.
public
setPrepends(array<string, array<string, string[]|string>> $prepends) : static
Parameters
- $prepends : array<string, array<string, string[]|string>>
  The data to prepend. An associative array where keys are the targets to prepend in and values are a string or an array of strings of the data to prepend.
Return values
static
setURIs()
Sets the temporary URIs/URLs to set for each sanitized element.
public
setURIs(array<string, string> $uris) : static
Parameters
- $uris : array<string, string>
  An associative array where keys are element names (see self::ELEMENTS array keys) and values are the URIs (base64 encoded data) or normal URLs.
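Building the base64 data URIs by hand is repetitive, so a small helper can keep the `$uris` array readable. The function `dataUri()` below is a hypothetical convenience and not part of the class, and `/img/blocked.png` is a placeholder path used to show that a normal URL is also a valid value.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: wraps content in a base64-encoded data URI.
 */
function dataUri(string $mime, string $content): string
{
    return sprintf('data:%s;charset=UTF-8;base64,%s', $mime, base64_encode($content));
}

// Keys must be element names from self::ELEMENTS.
$uris = [
    'script' => dataUri('text/javascript', 'console.log("Blocked! Consent Please.")'),
    'iframe' => dataUri('text/html', '<div>Blocked! Consent Please.</div>'),
    'img'    => '/img/blocked.png', // a normal URL works as well
];

print_r($uris);
```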
Return values
static
setWhitelist()
Sets the list of whitelisted domains that should not be sanitized.
public
setWhitelist(array<int, string> $whitelist) : static
Parameters
- $whitelist : array<int, string>
  An array of domains that should not be sanitized. Sub-domains must be specified separately.
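The note about sub-domains is consistent with an exact host comparison, which the sketch below illustrates. How the class actually matches domains is not shown in this reference, so `isWhitelisted()` is a hypothetical stand-in that demonstrates why `cdn.example.com` would need its own entry next to `example.com`.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical check: exact, case-insensitive host comparison.
 *
 * @param array<int, string> $whitelist
 */
function isWhitelisted(string $url, array $whitelist): bool
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return false; // relative or malformed URL: there is no host to compare
    }

    return in_array(strtolower($host), array_map('strtolower', $whitelist), true);
}

$whitelist = ['unpkg.com', 'cdnjs.cloudflare.com'];

var_dump(isWhitelisted('https://unpkg.com/vue', $whitelist));
var_dump(isWhitelisted('https://cdn.unpkg.com/vue', $whitelist));
```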
Return values
static
bootstrap()
Use this method instead of `self::__construct()` to bootstrap the object.
protected
bootstrap() : void
Return values
void
getDomains()
Returns a list of domains that should not be sanitized.
private
getDomains() : array<int, string>
Return values
array<int, string>
getReplaceCallback()
Returns the callback to replace the elements to sanitize.
private
getReplaceCallback() : callable
Return values
callable
getSearchPattern()
Returns the search pattern to find the elements to sanitize.
private
getSearchPattern() : string
Return values
string
getURIs()
Returns the Data-URIs to set to the sanitized elements.
private
getURIs() : array<string, string>