Remove defined HTML tags including content from a string with PHP

Speaks many languages, but currently only uses PHP, JavaScript and WordPress (I consider it its own language, since it's a big sandbox).

Recently, a friend asked me for a script that removes pre tags, including their content, from a string. He uses a script to calculate the reading time of his content, but doesn't want the pre tags to count toward it. So if you ever have a similar problem, this post might help you.

Determine tag

First we decide which tag to target. In our example, all pre tags should be removed, so our regular expression looks like this:

/<pre[^>]*>([\s\S]*?)<\/pre[^>]*>/m

If it should only affect h1 headings, it looks like this:

/<h1[^>]*>([\s\S]*?)<\/h1[^>]*>/m

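And if all links are to be filtered out, it looks like this:

/<a\b[^>]*>([\s\S]*?)<\/a\b[^>]*>/m

You can use the regular expression for pretty much any tag. For short names such as a or p, put \b after the tag name, as above; otherwise <a[^>]*> would also match longer names that start with the same letters, such as abbr or article.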
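To make the parts of the pattern a bit more tangible, here is a small sketch with made-up sample HTML. It shows that [^>]* lets the opening tag carry attributes, that [\s\S]*? also crosses line breaks, and that the lazy ? stops at the first closing tag, so two separate blocks stay two separate matches:

```php
<?php
// Sample HTML is made up for illustration.
$html = "<p>Intro</p>\n<pre class=\"php\">\necho 'one';\n</pre>\n<p>Middle</p>\n<pre>two</pre>";

$regex = '/<pre[^>]*>([\s\S]*?)<\/pre[^>]*>/m';

// preg_match_all() returns the number of full matches found.
$count = preg_match_all($regex, $html, $matches);

// $matches[0] holds the full matches, $matches[1] the captured inner content.
echo $count, "\n";               // 2
echo trim($matches[1][0]), "\n"; // echo 'one';
echo $matches[1][1], "\n";       // two
```

With a greedy [\s\S]* instead, the first match would run from the first opening tag to the last closing tag and swallow the paragraph in between.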

Replace content

Now we want to replace the tags, including their content, with an empty string so that they no longer appear in the string. This could look like this:

$regex = '/<pre[^>]*>([\s\S]*?)<\/pre[^>]*>/m';
$string = 'My long text with <pre>some code</pre> and so on.';
$string = preg_replace($regex, '', $string);
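It can help to look at what is actually left afterwards. A minimal sketch using the same sample string; the whitespace cleanup at the end is my own addition and only makes sense if you don't care about line breaks (for counting words, for instance):

```php
<?php
$regex  = '/<pre[^>]*>([\s\S]*?)<\/pre[^>]*>/m';
$string = 'My long text with <pre>some code</pre> and so on.';

// The spaces on both sides of the removed tag stay, leaving a double space.
$stripped = preg_replace($regex, '', $string);
echo $stripped, "\n"; // My long text with  and so on.

// Optional cleanup (an assumption, not part of the original script):
// collapse every run of whitespace into a single space.
$collapsed = preg_replace('/\s+/', ' ', $stripped);
echo $collapsed, "\n"; // My long text with and so on.
```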

Now all pre tags are replaced with an empty string. From this, you can also build a function that is quite flexible:

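function pxbt_strip_tag(string $tag, string $string): string {
    // Escape the tag name; \b keeps 'p' from also matching <pre>.
    $tag = preg_quote($tag, '/');
    $regex = '/<' . $tag . '\b[^>]*>([\s\S]*?)<\/' . $tag . '\b[^>]*>/m';
    // preg_replace() returns null on error; fall back to the unchanged input.
    return preg_replace($regex, '', $string) ?? $string;
}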
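A detail worth checking when you pass short tag names such as p: a pattern that only tests how the name starts will also catch longer tags. The sketch below compares such a prefix-only pattern with a variant that uses a word boundary; the sample HTML and the variable names are made up for illustration:

```php
<?php
$html = '<p>Intro</p><pre>code</pre><p>End</p>';

// Prefix-only pattern: "<p" followed by anything up to ">" also matches "<pre>".
$loose = '/<p[^>]*>([\s\S]*?)<\/p[^>]*>/m';

// Variant with \b: the tag name has to end right after "p".
$strict = '/<p\b[^>]*>([\s\S]*?)<\/p\b[^>]*>/m';

$looseResult  = preg_replace($loose, '', $html);
$strictResult = preg_replace($strict, '', $html);

var_dump($looseResult);  // string(0) ""
var_dump($strictResult); // string(15) "<pre>code</pre>"
```

Neither pattern handles a tag nested inside another tag of the same name; for that, an HTML parser is probably the safer route.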

You can use this function as often as you like. Here are a few examples:

$string = 'My string';

// remove h1
$string = pxbt_strip_tag('h1', $string);

// remove p
$string = pxbt_strip_tag('p', $string);

// remove pre
$string = pxbt_strip_tag('pre', $string);
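
Coming back to the reading time from the beginning: I don't know what my friend's script looks like, so the following is only a rough sketch of how the pieces could fit together. The function name example_reading_time and the rate of 200 words per minute are assumptions of mine:

```php
<?php
// Hypothetical helper: estimates the reading time in minutes, ignoring <pre> blocks.
// The default of 200 words per minute is an assumption, not a value from the post.
function example_reading_time(string $html, int $wordsPerMinute = 200): int {
    // Drop <pre> blocks including their content.
    $withoutPre = preg_replace('/<pre\b[^>]*>[\s\S]*?<\/pre\b[^>]*>/i', '', $html);

    // preg_replace() returns null on error; fall back to the original input.
    $text = strip_tags($withoutPre ?? $html);

    // Count the remaining words and round up, with a minimum of one minute.
    $words = str_word_count($text);
    return (int) max(1, ceil($words / $wordsPerMinute));
}

// 400 words of text plus a long code block that should not count.
$html = '<p>' . str_repeat('word ', 400) . '</p><pre>' . str_repeat('code ', 1000) . '</pre>';
$minutes = example_reading_time($html);
echo $minutes, "\n"; // 2
```

Without the first step, the 1000 words inside the pre tag would be counted as well and the estimate would come out at 7 minutes.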

Pixelbart

Just a simple web developer who prefers to work with WordPress and PHP. He is a freelance and employed web developer from Germany. Now tries to write here regularly and uses deepl to translate for it.