
Only allow specific HTML code

Question by Progger99 | 2011-12-29 at 22:00

I have a form on my website so that visitors can comment on the articles in my blog. I would now like to give users the ability to use certain formatting and links.

However, only certain HTML tags should be allowed: I do not want users to break the design of my site, nor do I want anyone to inject scripts or other malicious code this way.

So what I need is a function that keeps tags like <a>, <br> or <p> but filters out everything else, such as <script>. I have considered writing such a function with regular expressions, but I am stuck. Can someone help me?


The magic word is strip_tags. See php.net/manual/en/function.strip-tags.php; you will find everything important on the topic there.
2011-12-30 at 18:23


To explain it in a bit more detail: The function strip_tags() expects a string as a parameter and optionally the allowed tags. Example:

$s='<p>word</p><br><br>';
echo strip_tags($s);
// output 1: 'word'
echo strip_tags($s,'<p><a>');
// output 2: '<p>word</p>'

In output 1, no tags are allowed, so only 'word' is printed. The second output is different because the tags <p> and <a> are allowed. Since $s contains <p> but no <a>, all <p> tags are kept. The line break <br> is removed because this tag is not allowed. If you wrote strip_tags($s, '<p><a><br>') instead, the <br> line breaks would be kept as well.
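
As a small sketch of that last variant, the snippet below reuses the same sample string and also allows <br>. The variable names $withBreaks and $sameResult are only illustrative. The array form of the second parameter assumes PHP 7.4 or newer; on older versions only the string form is available.

```php
<?php
// Same sample string as in the example above.
$s = '<p>word</p><br><br>';

// Allow <p>, <a> and <br>; any other tag would be removed.
$withBreaks = strip_tags($s, '<p><a><br>');
echo $withBreaks, "\n"; // <p>word</p><br><br>

// Assumption: PHP 7.4+, where the allowed tags may be passed as an array of names.
$sameResult = strip_tags($s, ['p', 'a', 'br']);
echo $sameResult, "\n"; // <p>word</p><br><br>
```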
2011-12-30 at 20:54

Best Answer

Attention! You should not rely solely on strip_tags! Inside a permitted HTML tag, malicious users can still place code in an event attribute such as onmouseover. strip_tags does not remove attributes, so this code stays in the output.
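
To make the issue concrete, here is a minimal sketch with a made-up input string ($input and $result are illustrative names). The disallowed <b> tag is removed, but the event attribute on the allowed <p> tag passes through untouched.

```php
<?php
// Hypothetical comment text: an allowed <p> tag carrying an event handler.
$input = '<p onmouseover="alert(1);">Hover <b>me</b></p>';

// Only <p> is allowed, so <b> is stripped, but attributes of <p> are not inspected.
$result = strip_tags($input, '<p>');

echo $result, "\n"; // <p onmouseover="alert(1);">Hover me</p>
```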

You can overcome this problem like this:

// string with tags and malicious code
$txt = '<p class="x" onmouseover="alert(1);">
          Text Text Text <strong>Text</strong>
        </p>';
 
// clean up
$txt = strip_tags($txt, '<p>');
$regex = "#<(/?\w+)\s+[^>]*>#is";
$txt = preg_replace($regex, '<${1}>',$txt);
 
// output
echo $txt; // '<p>Text Text Text Text</p>' (plus the line breaks and indentation of the input)

First, this code uses strip_tags() to delete all tags except <p> from the string. Then a regular expression removes all attributes from the remaining tags. This removes the onmouseover handler from the p tag, but also the class and any other potentially unwanted attributes. If you wish to keep certain attributes, you can adapt the code accordingly.
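
One possible way to keep a single attribute is sketched below. It is only an illustration, not a complete sanitizer: the function name clean_comment, the choice of <p> and <a> as allowed tags, and the rule that only a double-quoted href starting with http:// or https:// survives are all assumptions made for this example. Instead of editing the attributes in place, every remaining tag is rebuilt from scratch.

```php
<?php
// Hypothetical helper: keeps <p> and <a>, and on <a> only a plain http(s) href.
function clean_comment(string $html): string
{
    $allowed = ['p', 'a'];

    // Step 1: drop every tag except <p> and <a>.
    $html = strip_tags($html, '<p><a>');

    // Step 2: rebuild each remaining tag instead of trusting its attributes.
    $clean = preg_replace_callback(
        '#<(/?)(\w+)([^>]*)>#s',
        function (array $m) use ($allowed): string {
            $name = strtolower($m[2]);

            // Drop anything tag-like whose name is not on the allow-list.
            if (!in_array($name, $allowed, true)) {
                return '';
            }
            // Closing tags never carry attributes.
            if ($m[1] === '/') {
                return "</$name>";
            }
            // Links keep a double-quoted href, but only with an http(s) URL.
            if ($name === 'a' && preg_match('#\shref="(https?://[^"\'<>\s]*)"#i', $m[3], $href)) {
                return '<a href="' . $href[1] . '">';
            }
            // Every other tag loses all of its attributes.
            return "<$name>";
        },
        $html
    );

    // preg_replace_callback() returns null on a regex error; fail closed in that case.
    return $clean ?? '';
}

$comment = '<p class="x" onmouseover="alert(1);">See '
         . '<a href="https://example.com/page" onclick="evil()">this</a> '
         . '<strong>now</strong></p>';

echo clean_comment($comment), "\n";
// <p>See <a href="https://example.com/page">this</a> now</p>
```

Restricting href to http(s) URLs means that a link such as href="javascript:alert(1)" is reduced to a bare <a> tag. Because the pattern stops at the first >, a > inside a quoted attribute value can still split a tag in unexpected ways; the allow-list check in the callback is there so that any tag surfacing from such leftovers is dropped rather than passed through.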
2011-12-31 at 20:25
