
PHP: strip_tags should replace with spaces

Question by Compi | 2016-09-23 at 17:26

I am using the PHP function strip_tags() for removing all HTML tags from a string.

This works quite well overall, but there are some small problems with strings like this:

<p>First Sentence.</p><p>Second Sentence.</p>
<table>
<tr><td>First Column</td><td>Second Column</td></tr>
</table>

The function simply removes all tags, which produces the following result:

First Sentence.Second Sentence.
First ColumnSecond Column

Of course, this is very inelegant, because after each paragraph, or after the end of each DIV container, there is no space between the closing punctuation mark and the text that follows. It is even worse with tables: after strip_tags() has removed the tags, the contents of all cells and columns are stuck together without any spacing.

I would prefer that each tag is replaced by a space instead of simply being deleted without regard to the surrounding content. So I would like to get the following result from the example above:

First Sentence. Second Sentence.
First Column Second Column

How can I get strip_tags() to replace every tag with a blank? If the result is output as HTML, it would not even be a problem if the string contained several blanks in a row, because browsers collapse them when rendering.

Best Answer (1 Vote)

Unfortunately, strip_tags() does not provide a parameter with which you can control the replacement.
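
For reference, the optional second argument of strip_tags() is a list of tags to keep, not a replacement string. A minimal sketch (the sample string is made up for illustration):

```php
<?php
// Sample input, chosen only for illustration.
$html = '<p>Hello <b>World</b></p>';

// The second argument lists tags that should be kept.
// Everything else is removed without any replacement character.
$result = strip_tags($html, '<b>');

echo $result, "\n"; // Hello <b>World</b>
```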

However, you can prepare your string before calling strip_tags(). For example:

$s = '<p>ABC</p><p>DEF</p>';
$s = str_replace('><', '> <', $s);

echo strip_tags($s);  // ABC DEF

As you can see, before calling strip_tags(), we replace each occurrence of "><" with "> <". This works regardless of the actual tag: "</p><p>" becomes "</p> <p>" in the same way that "</td><td>" becomes "</td> <td>".

The result of strip_tags() then contains the spaces at the corresponding positions.
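
Note that replacing "><" only helps where two tags are directly adjacent. A tag that directly follows text, such as "First line<br>Second line", is not affected. One possible variation, sketched here under the assumption that a space before every tag is acceptable, is to insert the space in front of each "<" instead. The function name strip_tags_spaced is hypothetical and not part of PHP:

```php
<?php
// Hypothetical helper name; not a built-in PHP function.
function strip_tags_spaced(string $html): string
{
    // Put a space in front of every "<" so that each tag leaves a gap,
    // even when it directly follows text.
    $spaced = str_replace('<', ' <', $html);

    // Let strip_tags() do the actual tag removal.
    $text = strip_tags($spaced);

    // Collapse runs of whitespace into one space and trim both ends.
    return trim(preg_replace('/\s+/', ' ', $text));
}

echo strip_tags_spaced('<p>ABC</p><p>DEF</p>'), "\n";      // ABC DEF
echo strip_tags_spaced('First line<br>Second line'), "\n"; // First line Second line
```

The trade-off is that inline tags inside a word also produce a space, so "un<b>believ</b>able" would come out as "un believ able".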
2016-09-23 at 23:14

3 Votes

Instead of using the function strip_tags(), you can also do the job with your own regular expression. Here is an example:

$s = '<p>ABC</p><p>DEF</p>';
$s = preg_replace('#<[^>]+>#', ' ', $s); // " ABC  DEF "
$s = preg_replace('#\s+#', ' ', $s);     // " ABC DEF "

echo trim($s);                           // "ABC DEF"

The second line replaces each tag with a blank, so that afterwards there are two blanks between "ABC" and "DEF", plus one blank at the beginning and one at the end of the string.

If we want to avoid this, we can use the third line, which collapses every run of whitespace (not only double spaces, but also tabs and line breaks) into a single space. Additionally, we can use trim() to cut off the blanks at the beginning and the end of the string before the output.
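
Applied to the example from the question, the same idea could be sketched as follows. The function name tags_to_spaces is hypothetical, and this variant handles each line separately so that the line break between the paragraphs and the table row is kept. Keep in mind that the simple pattern <[^>]+> assumes no ">" appears inside attribute values or comments:

```php
<?php
// Hypothetical helper name; not a built-in PHP function.
function tags_to_spaces(string $html): string
{
    // Replace every tag with a single space.
    $text = preg_replace('#<[^>]+>#', ' ', $html);

    $lines = [];
    foreach (preg_split('/\R/', $text) as $line) {
        // Collapse runs of spaces and tabs, then trim the line.
        $line = trim(preg_replace('/[ \t]+/', ' ', $line));

        // Skip lines that contained nothing but tags.
        if ($line !== '') {
            $lines[] = $line;
        }
    }

    return implode("\n", $lines);
}

$html = <<<'HTML'
<p>First Sentence.</p><p>Second Sentence.</p>
<table>
<tr><td>First Column</td><td>Second Column</td></tr>
</table>
HTML;

echo tags_to_spaces($html), "\n";
// First Sentence. Second Sentence.
// First Column Second Column
```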
2016-09-24 at 14:28
