Sanitizer
A class that sanitizes HTML elements which load external resources automatically. This is achieved by setting the attributes that load the external resources to a temporary URI/URL and saving the original attribute values in temporary attributes, so that the resources can be reloaded once consent is given.
Elements that will be sanitized are:
- <link href="" />
- <script src="" />
- <iframe src="" />
- <embed src="" />
- <img src="" srcset="" />
- <audio src="" />
- <video src="" poster="" />
- <source src="" srcset="" />
- <track src="" />
- <object data="" />
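The attribute swap described above can be sketched for a single element type. This is only an illustration of the idea, not the class's actual implementation: the function name `sketchSanitizeScripts`, the regular expression, and the placeholder URI are assumptions, and only the `data-consent-original-src` attribute name is taken from this page.

```php
<?php

/**
 * Hypothetical helper (not part of the package): moves the `src` of each
 * <script> tag into `data-consent-original-src` and points `src` at a
 * placeholder URI instead.
 */
function sketchSanitizeScripts(string $html, string $placeholder): string
{
    // Match an opening <script ...> tag that carries a double-quoted src attribute.
    $pattern = '/<script\b([^>]*?)\ssrc="([^"]*)"([^>]*)>/i';

    $result = preg_replace_callback($pattern, function (array $m) use ($placeholder): string {
        return sprintf(
            '<script%s src="%s" data-consent-original-src="%s"%s>',
            $m[1],                                      // attributes before src
            htmlspecialchars($placeholder, ENT_QUOTES), // temporary URI
            $m[2],                                      // original src, kept for later
            $m[3]                                       // attributes after src
        );
    }, $html);

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

$html = '<script defer src="https://example.com/app.js"></script>';
echo sketchSanitizeScripts($html, 'data:text/javascript,');
```

Client-side code could later copy the saved value back into `src` once the visitor consents.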
Example:
// sanitizing the response before returning it to the client
// e.g. in kernel response event listener
$condition = function ($data) {
    // only html responses or check additionally for some consent cookie
    return strpos($data, '<!DOCTYPE html>') !== false;
};
$uris = [
    'link' => sprintf('data:text/css;charset=UTF-8;base64,%s', base64_encode('body::after{content:"Blocked! Consent Please.";color:orangered}')),
    'script' => sprintf('data:text/javascript;charset=UTF-8;base64,%s', base64_encode('console.log("Blocked! Consent Please.")')),
    'iframe' => sprintf('data:text/html;charset=UTF-8;base64,%s', base64_encode('<div>Blocked! Consent Please.</div>')),
];
$whitelist = [
    'unpkg.com',
    'cdnjs.cloudflare.com',
];
$appends = [
    'body' => [
        '<script defer src="/path/to/client-side-code.js"></script>',
    ],
];
$sanitizedHTML = (new \MAKS\GDPRTools\Backend\Sanitizer())
    ->setData($html)
    ->setCondition($condition)
    ->setURIs($uris)
    ->setWhitelist($whitelist)
    ->setAppends($appends)
    ->sanitize()
    ->append('<script id="sanitization">{"sanitized":true}</script>', 'body') // add additional appends.
    ->get();
// sanitizing using the shorthand
$sanitizedHTML = (new \MAKS\GDPRTools\Backend\Sanitizer())->sanitizeData($html, $condition, $uris, $whitelist, $appends);
// sanitizing app entry
// (1) rename index.php to app.php
// (2) create index.php with following content
// (3) the result will simply be returned to the client
require '/path/to/src/Backend/Sanitizer.php';
\MAKS\GDPRTools\Backend\Sanitizer::sanitizeApp('./app.php', $condition, $uris, $whitelist, $appends);
Table of Contents
- ELEMENTS = [ // element => attributes 'link' => ['href'], 'script' => ['src'], 'iframe' => ['src'], 'embed' => ['src'], 'img' => ['src', 'srcset'], 'audio' => ['src'], 'video' => ['src', 'poster'], 'source' => ['src', 'srcset'], 'track' => ['src'], 'object' => ['data'], ]
- HTML elements that load external resources.
- INJECTION_MODE_AFTER = 'AFTER'
- `AFTER` injection mode.
- INJECTION_MODE_APPEND = 'APPEND'
- `APPEND` injection mode.
- INJECTION_MODE_BEFORE = 'BEFORE'
- `BEFORE` injection mode.
- INJECTION_MODE_PREPEND = 'PREPEND'
- `PREPEND` injection mode.
- VERSION = 'v1.5.0'
- Package version.
- INJECTION_MODES = [ // mode => [search => replacement] self::INJECTION_MODE_PREPEND => ['/(<\\s*{target}[^>]*>)/i' => '$1{data}'], self::INJECTION_MODE_APPEND => ['/(<\\/\\s*{target}\\s*>)/i' => '{data}$1'], self::INJECTION_MODE_BEFORE => ['/(<\\s*{target}[^>]*>)/i' => '{data}$1'], self::INJECTION_MODE_AFTER => ['/(<\\/\\s*{target}\\s*>)/i' => '$1{data}'], ]
- Injection modes search and replacements.
- $attributes : array<string|int, mixed>
- The overrides for the names of the attributes added after the sanitization.
- $appends : array<string|int, mixed>
- The list of appends for each target.
- $condition : callable
- The condition to check before sanitizing.
- $data : string
- The data to sanitize.
- $prepends : array<string|int, mixed>
- The list of prepends for each target.
- $result : string
- The result after the sanitization.
- $uris : array<string|int, mixed>
- The temporary URIs/URLs to replace the original sources with.
- $whitelist : array<string|int, mixed>
- The list of the whitelisted domains that should not be sanitized.
- __construct() : mixed
- Sanitizer constructor.
- append() : static
- Appends some data to the current data/result.
- get() : string
- Returns the current result and resets class internal state.
- inject() : static
- Injects data around or into an element (modes: `PREPEND`, `APPEND`, `BEFORE`, `AFTER`).
- prepend() : static
- Prepends some data in the current data/result.
- sanitize() : static
- Sanitizes the current data.
- sanitizeApp() : void
- Sanitize the HTML resulting from including the passed path.
- sanitizeData() : string
- Sanitize the given HTML code.
- setAppends() : static
- Sets the list of appends for each target.
- setCondition() : static
- Sets the condition to check that determines whether to sanitize the data or not.
- setData() : mixed
- Sets the data to sanitize.
- setPrepends() : static
- Sets the list of prepends for each target.
- setURIs() : static
- Sets the temporary URIs/URLs to set for each sanitized element.
- setWhitelist() : static
- Sets the list of whitelisted domains that should not be sanitized.
- bootstrap() : void
- Use this method instead of `self::__construct()` to bootstrap the object.
- getDomains() : array<int, string>
- Returns a list of domains that should not be sanitized.
- getReplaceCallback() : callable
- Returns the callback to replace the elements to sanitize.
- getSearchPattern() : string
- Returns the search pattern to find the elements to sanitize.
- getURIs() : array<string, string>
- Returns the Data-URIs to set to the sanitized elements.
Constants
ELEMENTS
HTML elements that load external resources.
public
array<string, array<string|int, mixed>>
ELEMENTS
= [
// element => attributes
'link' => ['href'],
'script' => ['src'],
'iframe' => ['src'],
'embed' => ['src'],
'img' => ['src', 'srcset'],
'audio' => ['src'],
'video' => ['src', 'poster'],
'source' => ['src', 'srcset'],
'track' => ['src'],
'object' => ['data'],
]
Available elements are:
- <link href="" />
- <script src="" />
- <iframe src="" />
- <embed src="" />
- <img src="" srcset="" />
- <audio src="" />
- <video src="" poster="" />
- <source src="" srcset="" />
- <track src="" />
- <object data="" />
INJECTION_MODE_AFTER
`AFTER` injection mode.
public
string
INJECTION_MODE_AFTER
= 'AFTER'
INJECTION_MODE_APPEND
`APPEND` injection mode.
public
string
INJECTION_MODE_APPEND
= 'APPEND'
INJECTION_MODE_BEFORE
`BEFORE` injection mode.
public
string
INJECTION_MODE_BEFORE
= 'BEFORE'
INJECTION_MODE_PREPEND
`PREPEND` injection mode.
public
string
INJECTION_MODE_PREPEND
= 'PREPEND'
VERSION
Package version.
public
string
VERSION
= 'v1.5.0'
INJECTION_MODES
Injection modes search and replacements.
protected
array<string|int, mixed>
INJECTION_MODES
= [
// mode => [search => replacement]
self::INJECTION_MODE_PREPEND => ['/(<\\s*{target}[^>]*>)/i' => '$1{data}'],
self::INJECTION_MODE_APPEND => ['/(<\\/\\s*{target}\\s*>)/i' => '{data}$1'],
self::INJECTION_MODE_BEFORE => ['/(<\\s*{target}[^>]*>)/i' => '{data}$1'],
self::INJECTION_MODE_AFTER => ['/(<\\/\\s*{target}\\s*>)/i' => '$1{data}'],
]
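How the four search/replacement pairs behave can be checked with plain `preg_replace_callback()`. The sketch below copies the patterns from the constant above; the names `SKETCH_MODES` and `sketchInject` are hypothetical, and the way `{target}` and `{data}` are substituted is an assumption about how the class might use them, not its confirmed implementation.

```php
<?php

// Search/replacement pairs copied from the INJECTION_MODES constant above.
const SKETCH_MODES = [
    'PREPEND' => ['/(<\\s*{target}[^>]*>)/i' => '$1{data}'],
    'APPEND'  => ['/(<\\/\\s*{target}\\s*>)/i' => '{data}$1'],
    'BEFORE'  => ['/(<\\s*{target}[^>]*>)/i' => '{data}$1'],
    'AFTER'   => ['/(<\\/\\s*{target}\\s*>)/i' => '$1{data}'],
];

/**
 * Hypothetical helper: injects $data relative to the first <$target> element.
 */
function sketchInject(string $html, string $data, string $target, string $mode = 'APPEND'): string
{
    // Unknown modes fall back to APPEND, as documented for inject().
    $rules = SKETCH_MODES[$mode] ?? SKETCH_MODES['APPEND'];

    foreach ($rules as $search => $template) {
        $pattern = str_replace('{target}', preg_quote($target, '/'), $search);

        // Limit of 1: only the first matching tag is touched.
        $html = preg_replace_callback(
            $pattern,
            function (array $m) use ($template, $data): string {
                // Fill the template by hand so "$" or "\" inside $data stay literal.
                return strtr($template, ['$1' => $m[1], '{data}' => $data]);
            },
            $html,
            1
        ) ?? $html;
    }

    return $html;
}

$page = '<html><head></head><body><p>x</p></body></html>';
echo sketchInject($page, '<!-- injected -->', 'body', 'PREPEND');
```

`PREPEND` and `BEFORE` match the opening tag, while `APPEND` and `AFTER` match the closing tag; the replacement decides on which side of the matched tag the data ends up.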
Properties
$attributes
The overrides for the names of the attributes added after the sanitization.
public
static array<string|int, mixed>
$attributes
= []
Available attributes are:
- data-consent-element
- data-consent-attribute
- data-consent-value
- data-consent-alternative
- data-consent-original-{{ attribute:[href|src|srcset|poster|data] }}, e.g. data-consent-original-src
$appends
The list of appends for each target.
private
array<string|int, mixed>
$appends
$condition
The condition to check before sanitizing.
private
callable
$condition
$data
The data to sanitize.
private
string
$data
$prepends
The list of prepends for each target.
private
array<string|int, mixed>
$prepends
$result
The result after the sanitization.
private
string
$result
$uris
The temporary URIs/URLs to replace the original sources with.
private
array<string|int, mixed>
$uris
$whitelist
The list of the whitelisted domains that should not be sanitized.
private
array<string|int, mixed>
$whitelist
Methods
__construct()
Sanitizer constructor.
public
__construct() : mixed
Return values
mixed
append()
Appends some data to the current data/result.
public
append(string|array<string|int, mixed> $data[, string $target = 'body' ]) : static
This method is useful to add some <script> or <link> to the <head> and/or <body> elements.
NOTE: This method will append the data whether the data has changed (sanitized) or not.
Parameters
- $data : string|array<string|int, mixed>
-
The data to inject, a string or an array of strings.
- $target : string = 'body'
-
[optional] The target to append to. It's advisable to only add to top-level elements (i.e. <head>, <body>). The data will be appended to the first element only.
Return values
static
get()
Returns the current result and resets class internal state.
public
get() : string
Return values
string
inject()
Injects data around or into an element (modes: `PREPEND`, `APPEND`, `BEFORE`, `AFTER`).
public
inject(string|array<string|int, mixed> $data, string $target[, string $mode = self::INJECTION_MODE_APPEND ]) : static
Parameters
- $data : string|array<string|int, mixed>
-
The data to inject, a string or an array of strings.
- $target : string
-
The target to inject in. It's advisable to only use top-level and unique elements (i.e. <head>, <body>). The data will be injected in or around the first element only.
- $mode : string = self::INJECTION_MODE_APPEND
-
The mode of injection. One of PREPEND, APPEND, BEFORE, or AFTER (defaults and falls back to APPEND).
Return values
static
prepend()
Prepends some data in the current data/result.
public
prepend(string|array<string|int, mixed> $data[, string $target = 'head' ]) : static
This method is useful to add some <script> or <link> to the <head> and/or <body> elements.
NOTE: This method will prepend the data whether the data has changed (sanitized) or not.
Parameters
- $data : string|array<string|int, mixed>
-
The data to inject, a string or an array of strings.
- $target : string = 'head'
-
[optional] The target to prepend in. It's advisable to only add to top-level elements (i.e. <head>, <body>). The data will be prepended in the first element only.
Return values
static
sanitize()
Sanitizes the current data.
public
sanitize() : static
Return values
static
sanitizeApp()
Sanitize the HTML resulting from including the passed path.
public
static sanitizeApp(string $path[, callable|null $condition = null ][, array<string, string>|null $uris = null ][, array<int, string>|null $whitelist = null ][, array<string, array<string, string[]|string>>|null $appends = null ][, array<string, array<string, string[]|string>>|null $prepends = null ][, array<string, array<string, string[]|string>>|null $injections = null ]) : void
NOTE: This method should be the last step in the application as it will flush all opened buffers to the client.
Parameters
- $path : string
-
The path to app entry point (index.php).
- $condition : callable|null = null
-
[optional] The condition to check before sanitizing. The passed callback will be executed when calling self::sanitize() to check if the data should be sanitized. The callback will be passed the data and must return a boolean (signature: fn (string $data): bool). The callback should check for a cookie or something in the data (HTML) to determine whether to sanitize the data or not.
- $uris : array<string, string>|null = null
-
[optional] The temporary URIs/URLs to set for each sanitized element. An associative array where keys are element names and values are the URIs (base64 encoded data) or normal URLs.
- $whitelist : array<int, string>|null = null
-
An array of domains that should not be sanitized. Sub-domains must be specified separately.
- $appends : array<string, array<string, string[]|string>>|null = null
-
[optional] The data to append. An associative array where keys are the targets to append to and values are a string or an array of strings of the data to append.
- $prepends : array<string, array<string, string[]|string>>|null = null
-
[optional] The data to prepend. An associative array where keys are the targets to prepend to and values are a string or an array of strings of the data to prepend.
- $injections : array<string, array<string, string[]|string>>|null = null
-
[optional] The data to inject. An associative array where keys are the modes and values are the same as $prepends or $appends.
Return values
void: The buffer will simply be flushed to the client.
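This page does not state how the output of the included file is captured. The note about flushing buffers suggests PHP output buffering, so the following is a minimal sketch under that assumption. The helper name `sketchCapture` and the temporary file standing in for `app.php` are hypothetical.

```php
<?php

/**
 * Hypothetical helper: runs a PHP file and returns everything it printed.
 */
function sketchCapture(string $path): string
{
    ob_start(); // collect output instead of sending it to the client

    try {
        require $path; // the included script echoes its HTML as usual
    } finally {
        $output = ob_get_clean(); // stop buffering and fetch what was collected
    }

    return $output === false ? '' : $output;
}

// Usage with a throwaway file standing in for app.php.
$app = tempnam(sys_get_temp_dir(), 'app');
file_put_contents($app, '<!DOCTYPE html><html><body><?= "Hello" ?></body></html>');

$html = sketchCapture($app);
unlink($app);

// At this point the captured HTML could be sanitized before being echoed.
echo $html;
```

Capturing the output this way is what would make it possible to rewrite the HTML before anything reaches the client, which fits the advice to call the method as the last step of the application.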
sanitizeData()
Sanitize the given HTML code.
public
sanitizeData(string $data[, callable|null $condition = null ][, array<string, string>|null $uris = null ][, array<int, string>|null $whitelist = null ][, array<string, array<string, string[]|string>>|null $appends = null ][, array<string, array<string, string[]|string>>|null $prepends = null ][, array<string, array<string, string[]|string>>|null $injections = null ]) : string
Parameters
- $data : string
-
The HTML code to sanitize.
- $condition : callable|null = null
-
[optional] The condition to check before sanitizing. The passed callback will be executed when calling self::sanitize() to check if the data should be sanitized. The callback will be passed the data and must return a boolean (signature: fn (string $data): bool). The callback should check for a cookie or something in the data (HTML) to determine whether to sanitize the data or not.
- $uris : array<string, string>|null = null
-
[optional] The temporary URIs/URLs to set for each sanitized element. An associative array where keys are element names and values are the URIs (base64 encoded data) or normal URLs.
- $whitelist : array<int, string>|null = null
-
An array of domains that should not be sanitized. Sub-domains must be specified separately.
- $appends : array<string, array<string, string[]|string>>|null = null
-
[optional] The data to append. An associative array where keys are the targets to append to and values are a string or an array of strings of the data to append.
- $prepends : array<string, array<string, string[]|string>>|null = null
-
[optional] The data to prepend. An associative array where keys are the targets to prepend to and values are a string or an array of strings of the data to prepend.
- $injections : array<string, array<string, string[]|string>>|null = null
-
[optional] The data to inject. An associative array where keys are the modes and values are the same as $prepends or $appends.
Return values
string: The sanitized HTML code.
setAppends()
Sets the list of appends for each target.
public
setAppends(array<string, array<string, string[]|string>> $appends) : static
Parameters
- $appends : array<string, array<string, string[]|string>>
-
The data to append. An associative array where keys are the target to append to and values are a string or array of the data to append.
Return values
static
setCondition()
Sets the condition to check that determines whether to sanitize the data or not.
public
setCondition(callable $condition) : static
Parameters
- $condition : callable
-
The condition to check before sanitizing. The passed callback will be executed when calling self::sanitize() to check if the data should be sanitized. The callback will be passed the data and must return a boolean (signature: fn (string $data): bool). The callback should check for a cookie or something in the data (HTML) to determine whether to sanitize the data or not.
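A condition that combines both checks mentioned above might look like the following sketch. It assumes that returning `true` means "sanitize", as the example at the top of this page suggests. The factory name `sketchCondition`, the cookie name `consent`, and the value `granted` are made up for illustration.

```php
<?php

/**
 * Hypothetical factory: builds a condition that asks for sanitization only
 * when the data is an HTML document and no consent cookie is present.
 */
function sketchCondition(array $cookies, string $cookieName = 'consent'): callable
{
    return function (string $data) use ($cookies, $cookieName): bool {
        // Only full HTML documents are candidates for sanitization.
        $isHtml = stripos($data, '<!DOCTYPE html>') !== false;

        // Assumed convention: the cookie holds "granted" once the visitor consented.
        $hasConsent = ($cookies[$cookieName] ?? '') === 'granted';

        return $isHtml && !$hasConsent;
    };
}

// The resulting closure matches the documented signature fn (string $data): bool.
$condition = sketchCondition($_COOKIE);
```

Passing the cookies in as an argument, rather than reading `$_COOKIE` inside the closure, keeps the condition easy to test.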
Return values
static
setData()
Sets the data to sanitize.
public
setData(string $data) : mixed
Parameters
- $data : string
Return values
mixed
setPrepends()
Sets the list of prepends for each target.
public
setPrepends(array<string, array<string, string[]|string>> $prepends) : static
Parameters
- $prepends : array<string, array<string, string[]|string>>
-
The data to prepend. An associative array where keys are the target to prepend in and values are a string or array of the data to prepend.
Return values
static
setURIs()
Sets the temporary URIs/URLs to set for each sanitized element.
public
setURIs(array<string, string> $uris) : static
Parameters
- $uris : array<string, string>
-
An associative array where keys are element names (see self::ELEMENTS array keys) and values are the URIs (base64 encoded data) or normal URLs.
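The array can be built with a small helper that wraps placeholder content in a base64 data URI, mirroring the `sprintf()` calls in the example at the top of this page. The helper name `sketchDataUri` is hypothetical and not part of the package.

```php
<?php

/**
 * Hypothetical helper: wraps content in a base64-encoded data URI.
 */
function sketchDataUri(string $mime, string $content): string
{
    return sprintf('data:%s;charset=UTF-8;base64,%s', $mime, base64_encode($content));
}

// Keys are element names taken from the ELEMENTS constant.
$uris = [
    'link'   => sketchDataUri('text/css', 'body::after{content:"Blocked! Consent Please."}'),
    'script' => sketchDataUri('text/javascript', 'console.log("Blocked! Consent Please.")'),
    'iframe' => sketchDataUri('text/html', '<div>Blocked! Consent Please.</div>'),
];
```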
Return values
static
setWhitelist()
Sets the list of whitelisted domains that should not be sanitized.
public
setWhitelist(array<int, string> $whitelist) : static
Parameters
- $whitelist : array<int, string>
-
An array of domains that should not be sanitized. Sub-domains must be specified separately.
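The requirement to list sub-domains separately is consistent with an exact host comparison. How the class actually matches domains is not documented here, so the sketch below is only an assumption that illustrates the consequence. `sketchIsWhitelisted` is a hypothetical name.

```php
<?php

/**
 * Hypothetical check: is the host of $url listed verbatim in $whitelist?
 */
function sketchIsWhitelisted(string $url, array $whitelist): bool
{
    $host = parse_url($url, PHP_URL_HOST);

    // Relative URLs have no host; this sketch treats them as not whitelisted.
    if (!is_string($host)) {
        return false;
    }

    // Exact, case-insensitive comparison: "cdn.unpkg.com" does not match "unpkg.com".
    return in_array(strtolower($host), array_map('strtolower', $whitelist), true);
}

$whitelist = ['unpkg.com', 'cdnjs.cloudflare.com'];

var_dump(sketchIsWhitelisted('https://unpkg.com/vue', $whitelist));
var_dump(sketchIsWhitelisted('https://cdn.unpkg.com/vue', $whitelist));
```

Under this assumption, both `unpkg.com` and `cdn.unpkg.com` would need their own entries for resources from either host to be left untouched.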
Return values
static
bootstrap()
Use this method instead of `self::__construct()` to bootstrap the object.
protected
bootstrap() : void
Return values
void
getDomains()
Returns a list of domains that should not be sanitized.
private
getDomains() : array<int, string>
Return values
array<int, string>
getReplaceCallback()
Returns the callback to replace the elements to sanitize.
private
getReplaceCallback() : callable
Return values
callable
getSearchPattern()
Returns the search pattern to find the elements to sanitize.
private
getSearchPattern() : string
Return values
string
getURIs()
Returns the Data-URIs to set to the sanitized elements.
private
getURIs() : array<string, string>