I have a custom post textbox that I want to sanitize using wp_kses before I update my post meta.
I was looking for examples of common $allowed settings, but I have only seen this example:
$allowed = array(
    'a' => array( // only allow <a> tags
        'href' => array() // and those anchors can only have an href attribute
    )
);
What is a typical wp_kses $allowed setting? Can someone provide an example of what they normally filter for?
- This question is out of scope for the site as there is more than one correct answer. If you narrow the scope of the question, present a usage case and ask for someone to provide you with things you should include that would be much better. – mor7ifer Commented Mar 7, 2012 at 16:35
- I want to have a rich text box where the user can just enter regular text, bold, links, italics... – redconservatory Commented Mar 7, 2012 at 17:31
- The problem was I am looking for. – Frank Commented Jul 2, 2021 at 9:31
6 Answers
I would disagree with the solution posted by @JaredCobb; wp_kses() is much more flexible than the method he presented. It can strip out unwanted attributes from tags without destroying the tags themselves. For example, if the user put in <strong class='foo'>, wp_kses() would return <strong> if you did not allow class, whereas strip_tags() can only remove the <strong> tag completely or, if the tag is allowed, keep it with all of its attributes.
@redconservatory: The allowed-tags array you'll want to use is as follows:
$args = array(
    //formatting
    'strong' => array(),
    'em'     => array(),
    'b'      => array(),
    'i'      => array(),

    //links
    'a'      => array(
        'href' => array()
    )
);
This will allow bold and italics with no attributes, as well as anchor tags with an href attribute, and nothing else. It uses the whitelisting principle, which @jaredcobb rightly noted is the better way to go here.
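To tie this back to the original question, here is a minimal sketch of how an array like this might be applied before saving post meta. The function names my_allowed_rich_text_tags() and my_save_rich_text_meta() and the meta key _my_rich_text are hypothetical, made up for illustration; wp_kses() and update_post_meta() are the WordPress functions, so the save helper only works when WordPress is loaded.

```php
<?php
/**
 * Hypothetical helper: returns the whitelist from the answer above.
 */
function my_allowed_rich_text_tags() {
    return array(
        // formatting, no attributes allowed
        'strong' => array(),
        'em'     => array(),
        'b'      => array(),
        'i'      => array(),
        // links, href only
        'a'      => array(
            'href' => array(),
        ),
    );
}

/**
 * Hypothetical helper: sanitize the textbox value, then store it.
 * Requires WordPress (wp_kses and update_post_meta).
 */
function my_save_rich_text_meta( $post_id, $raw_value ) {
    // Anything not in the whitelist (tags or attributes) is stripped here.
    $clean = wp_kses( $raw_value, my_allowed_rich_text_tags() );
    update_post_meta( $post_id, '_my_rich_text', $clean ); // assumed meta key
    return $clean;
}
```

Keeping the whitelist in its own function means the same array can be reused when the value is output again, if you prefer to sanitize at both ends.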
I would start out with the same $allowedtags array that WordPress uses for its comments. You can find the array in the [wordpress directory]/wp-includes/kses.php file. These seem like sensible defaults to me, and a good starting point. Here is the array...
$allowedtags = array(
    'a' => array(
        'href' => true,
        'title' => true,
    ),
    'abbr' => array(
        'title' => true,
    ),
    'acronym' => array(
        'title' => true,
    ),
    'b' => array(),
    'blockquote' => array(
        'cite' => true,
    ),
    'cite' => array(),
    'code' => array(),
    'del' => array(
        'datetime' => true,
    ),
    'em' => array(),
    'i' => array(),
    'q' => array(
        'cite' => true,
    ),
    'strike' => array(),
    'strong' => array(),
);
I would NOT use PHP's strip_tags as a replacement for wp_kses.
You should never use strip_tags to filter an unknown user's content!
I have created a quick video explaining Why WordPress’ wp_kses() is better than PHP’s strip_tags() for security.
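A small sketch of the underlying problem, using only plain PHP so it runs without WordPress. The sample input is made up; the point is that strip_tags() decides which tags survive but never looks at their attributes.

```php
<?php
// Untrusted input: an allowed tag carrying an inline event handler.
$input = '<strong onmouseover="alert(1)">hello</strong>';

// Allow <strong>; strip_tags() keeps the tag exactly as it was written.
$output = strip_tags( $input, '<strong>' );

// The onmouseover handler is still present in the result.
echo $output === $input ? "attributes kept\n" : "attributes removed\n";
```

With wp_kses() and an allowed-tags entry of 'strong' => array(), the tag would be kept but the attribute would be dropped.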
There you go. This works both in WordPress and outside of WordPress.
<?php
$str = ' I am <strong>stronger</strong> and cooler every single day <aaaaa>. ';
echo orbisius_html_util::strip_tags($str);

/**
 * Util HTML class
 * @author Svetoslav Marinov (SLAVI) | http://orbisius
 */
class orbisius_html_util {
    /**
     * Uses WP's wp_kses to clear some of the html tags but allow some attribs
     * usage: orbisius_html_util::strip_tags($str);
     * uses WordPress' wp_kses()
     * @param string $buffer string buffer
     * @return string cleaned up text
     */
    public static function strip_tags($buffer) {
        // Attributes allowed on every whitelisted tag
        static $default_attribs = array(
            'id' => array(),
            'class' => array(),
            'title' => array(),
            'style' => array(),
            'data' => array(),
            'data-mce-id' => array(),
            'data-mce-style' => array(),
            'data-mce-bogus' => array(),
        );

        $allowed_tags = array(
            'div' => $default_attribs,
            'span' => $default_attribs,
            'p' => $default_attribs,
            'a' => array_merge( $default_attribs, array(
                'href' => array(),
                'target' => array('_blank', '_top'),
            ) ),
            'u' => $default_attribs,
            'i' => $default_attribs,
            'q' => $default_attribs,
            'b' => $default_attribs,
            'ul' => $default_attribs,
            'ol' => $default_attribs,
            'li' => $default_attribs,
            'br' => $default_attribs,
            'hr' => $default_attribs,
            'strong' => $default_attribs,
            'blockquote' => $default_attribs,
            'del' => $default_attribs,
            'strike' => $default_attribs,
            'em' => $default_attribs,
            'code' => $default_attribs,
        );

        if (function_exists('wp_kses')) { // WP is here
            $buffer = wp_kses($buffer, $allowed_tags);
        } else {
            // No WordPress: fall back to PHP's strip_tags(), which keeps the
            // allowed tags but does NOT filter their attributes.
            $tags = array();

            foreach (array_keys($allowed_tags) as $tag) {
                $tags[] = "<$tag>";
            }

            $buffer = strip_tags($buffer, join('', $tags));
        }

        $buffer = trim($buffer);

        return $buffer;
    }
}
I've only used wp_kses when I've specifically needed to allow or filter attributes of HTML tags (for example, I want users to be allowed an <img> tag with a src="" attribute, but I don't want them to be able to put href="" or style="" or anything else on the image tag). In that case, wp_kses comes in handy because (as you can see in the example you created) you can filter down very specifically. I've rarely used wp_kses, though, because I find that a couple of native PHP functions (below) do the trick and are easier to understand when I look at the code several months later.
If you want to completely remove HTML tags (except maybe allow a few) then I always use strip_tags. You can pass in a string of allowed tags (like <p> <br> <strong>) or whatever other harmless tags you like. This gives the user some control over formatting, if that's applicable for your use case. I like strip_tags because it takes a whitelist approach to sanitizing your data (meaning that everything gets stripped except what you explicitly whitelist).
If your goal is to allow them to put any HTML into the content, but you just want to show their text as they entered it (like code examples) then use htmlspecialchars. This will convert HTML characters into their encoded counterparts so you can safely output it to the page.
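As a rough illustration of that last case, here is a minimal plain-PHP sketch. The sample input is made up, and UTF-8 is an assumption about the page encoding.

```php
<?php
// What the user typed, including markup that should be shown, not rendered.
$input = '<b>bold</b> & "quoted"';

// Encode &, <, > and both quote styles; UTF-8 is assumed as the page encoding.
$safe = htmlspecialchars( $input, ENT_QUOTES, 'UTF-8' );

echo $safe, "\n"; // &lt;b&gt;bold&lt;/b&gt; &amp; &quot;quoted&quot;
```

The browser then displays the original characters as text instead of interpreting them as HTML.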
You might come across code using str_replace which "looks" for particular bad tags. I really don't recommend that approach because it takes a blacklist approach to sanitizing data and you've got to constantly make sure your blacklist is up to date.
I guess to sum up, it depends on what your metaboxes are used for. If you're protecting against input from users (who might be malicious) I'd recommend strip_tags and just allow some of the harmless tags. If you have a good business case to really micromanage the tags and specific attributes of the user's content, use wp_kses.
You could also use the wp_kses_post function, which is used on post content and requires only the data as a parameter.
More info here: http://codex.wordpress/Function_Reference/wp_kses_post
@Svetoslav Marinov
I've added this code just after $buffer = trim($buffer);
$string_limpa = array(
    '<div><p><\/div>' => '<br>',
    '<div><br><\/div>'=> '<br>',
    '<div align="left"><br><\/div>' => '<br>',
    '<div align="center"><br><\/div>' => '<br>',
    '<div align="right"><br><\/div>' => '<br>',
    '<div style="text-align: left;"><br><\/div>' => '<br>',
    '<div style="text-align: center;"><br><\/div>' => '<br>',
    '<div style="text-align: right;"><br><\/div>' => '<br>',
    '<div style="text-align: justify;"><br><\/div>' => '<br>',
    'class="Apple-style-span"' => '<br>',
    '<p><br></p>' => '<br>',
    '<p><b></p>' => '<br>',
    '<p><i></p>' => '<br>',
    '<p><u></p>' => '<br>',
    '\r' => '<br>',
    '\n' => '<br>',
    '\t' => ' ',
    '\0' => ' ',
    '\x0B' => '<br>',
    '<p style="text-align: center;"><br></p>' => '<br>'
);

return strtr($buffer, $string_limpa);
I added it to try to clean the HTML and keep pasted hidden characters from breaking the code, but it does not fully work: it cleans the HTML, but the hidden characters still remain.
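One possible cause, assuming the hidden characters are real control characters such as carriage returns and tabs: PHP only interprets escapes like \r, \n and \t inside double-quoted strings. In single quotes, '\r' is a literal backslash followed by the letter r, so strtr() never finds it in the pasted text. A minimal sketch of the difference (the sample string is made up):

```php
<?php
// Sample input containing a real CRLF line break and a real tab.
$buffer = "line one\r\nline two\tend";

// Single-quoted keys are literal backslash sequences, so nothing matches.
$single = strtr( $buffer, array( '\r' => '<br>', '\n' => '<br>', '\t' => ' ' ) );

// Double-quoted keys are the actual control characters. strtr() tries the
// longest key first, so "\r\n" becomes a single <br>.
$double = strtr( $buffer, array(
    "\r\n" => '<br>',
    "\r"   => '<br>',
    "\n"   => '<br>',
    "\t"   => ' ',
) );

var_dump( $single === $buffer ); // bool(true): input unchanged
echo $double, "\n";              // line one<br>line two end
```

The same rule applies to the '<\/div>' keys: inside single quotes the backslash is kept, so those entries only match text that literally contains <\/div>.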