
Only allow specific HTML code

Question by Progger99 | 29/12/2011 at 22:00

I have a form on my website so that visitors can comment on the articles in my blog. I would now like to give users the ability to use certain formatting and links.

However, only certain HTML tags should be allowed, because I do not want users to break the design of my site, nor do I want anyone to inject scripts or malicious code this way.

So what I need is a function that keeps tags like <a>, <br> or <p> but filters out everything else, such as <script> and similar tags. I have considered writing a function with regular expressions, but I am stuck there. Can someone help me?


SmartUser


The magic word is strip_tags. Have a look at php.net/manual/en/function.strip-tags.php. There you will find everything important on this topic!
30/12/2011 at 18:23


Axuter


To explain it in a bit more detail: The function strip_tags() expects a string as a parameter and optionally the allowed tags. Example:

$s='<p>word</p><br><br>';
echo strip_tags($s);
// output 1: 'word'
echo strip_tags($s,'<p><a>');
// output 2: '<p>word</p>'

In output 1, no tags are allowed, so only 'word' is printed. The second output is different: there, the HTML tags <p> and <a> are allowed. Since the string $s contains <p> but no <a>, all <p> tags are kept. The line break <br> is removed because this tag is not allowed. If you wrote strip_tags($s, '<p><a><br>'), the line break <br> would be kept as well.
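
For the comment form from the question, the same call can be wrapped in a small helper. This is only a sketch: the name filter_comment and the chosen allow-list are assumptions, and passing the allowed tags as an array needs PHP 7.4 or later (on older versions, use the string form '<p><a><br>' instead). Note that strip_tags() removes only the tags themselves, so the text between <script> and </script> stays in the result.

```php
<?php
// Hypothetical helper, not part of PHP itself; the allow-list is an assumption.
function filter_comment(string $input): string
{
    // Array form of the second parameter requires PHP 7.4 or later.
    return strip_tags($input, ['p', 'a', 'br']);
}

echo filter_comment('<p>Hello<br><script>alert(1)</script></p>');
// prints: <p>Hello<br>alert(1)</p>
```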
30/12/2011 at 20:54


Stefan Trost


Attention! You should not rely solely on strip_tags! Inside a permitted HTML tag, malicious users can still place code in an event attribute such as onmouseover. strip_tags alone does not remove such attributes.

You can overcome this problem like this:

// string with tags and malicious code
$txt = '<p class="x" onmouseover="alert(1);">
          Text Text Text <strong>Text</strong>
        </p>';
 
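// clean up: keep only <p> tags (pass $txt here, not an undefined variable)
$txt = strip_tags($txt, '<p>');
// match any remaining tag that carries attributes and keep only its name
$regex = "#<(/?\w+)\s+[^>]*>#is";
$txt = preg_replace($regex, '<${1}>', $txt);
 
// output
echo $txt; // <p> Text Text Text Text </p> (original whitespace and line breaks are kept)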

First, this code uses strip_tags() to remove all tags except <p> from the string (the <strong> tags are dropped, but their text is kept). Then, a regular expression removes all attributes from the remaining tags. Thus, the onmouseover handler disappears from the p tag, along with class and any other potentially unwanted attributes. If you wish to keep certain attributes, you can adapt the regular expression accordingly.
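
As a rough idea of what keeping certain attributes could look like, here is a sketch that removes all attributes but keeps a double-quoted http or https href on <a> tags. The function name clean_comment, the allowed tags and the href rule are my own assumptions, not a tested or complete sanitizer. For anything security-critical, a dedicated HTML sanitizing library is probably the safer choice.

```php
<?php
// Hypothetical helper: the name, the allow-list and the href rule are assumptions.
function clean_comment(string $html): string
{
    // Step 1: drop every tag except <p>, <br> and <a>.
    $html = strip_tags($html, '<p><br><a>');

    // Step 2: rebuild each remaining tag from its name only,
    // keeping a double-quoted http(s) href on opening <a> tags.
    return preg_replace_callback(
        '#<(/?)(\w+)([^>]*)>#s',
        function (array $m): string {
            $closing = $m[1];           // '/' for closing tags, '' otherwise
            $name = strtolower($m[2]);  // tag name
            $attrs = $m[3];             // raw attribute text, discarded by default
            if ($closing === '' && $name === 'a'
                && preg_match('#\bhref\s*=\s*"(https?://[^"\s<>]*)"#i', $attrs, $h)) {
                return '<a href="' . $h[1] . '">';
            }
            return '<' . $closing . $name . '>';
        },
        $html
    ) ?? '';
}

echo clean_comment('<p class="x" onmouseover="alert(1);">See <a href="https://example.com" onclick="x()">this</a></p>');
// prints: <p>See <a href="https://example.com">this</a></p>
```

Links with other schemes, for example href="javascript:...", do not match the pattern, so such a tag is reduced to a plain <a> without any attributes.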
31/12/2011 at 20:25
