You should not use regular expressions to process HTML. Even a small change in the HTML can make your regex fail. An extra space, a change in the tag's attributes, a comment, or a more complex structure can break even a gigantic regex.
Processing HTML with the DOM is very easy; it is the tool designed for exactly that.
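To make the fragility concrete, here is a minimal sketch. The pattern below is a hypothetical naive attempt at capturing the content of a <b> tag, written only for illustration; it is not taken from any particular codebase.

```php
<?php
// Hypothetical naive pattern, for illustration only.
$pattern = '/<b>(.*?)<\/b>/';

$samples = [
    '<b>contenido</b>',            // plain tag: matches
    '<b class="x">contenido</b>',  // attribute added: no match
    '<b >contenido</b>',           // extra space in the tag: no match
    "<b>conte\nnido</b>",          // newline inside: '.' does not match "\n" without the /s modifier
];

$results = [];
foreach ($samples as $sample) {
    // preg_match returns 1 on a match and 0 when nothing matches
    $results[] = preg_match($pattern, $sample);
}

print_r($results); // [1, 0, 0, 0]
```

Each variation could be patched with a more elaborate pattern, but every patch tends to leave other valid HTML uncovered.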
The DOM is simply generated as follows:
$html = 'cualquiercosa<b>contenido</b>cualquiercosa';
// Build the DOM
$dom = new DOMDocument;
$dom->loadHTML($html, LIBXML_COMPACT | LIBXML_HTML_NOIMPLIED | LIBXML_NONET);
We can then get all the <b> elements:
// Get all the <b> tags
$b_nodelist = $dom->getElementsByTagName('b');
Then we iterate over the list of results, getting the text inside each tag (without any nested tags):
// Loop over each <b>
foreach ($b_nodelist as $b) {
    // Get the text content of the tag
    $texto = $b->textContent;
    echo "\n\nContenido del B:\n" . $texto; // => contenido
}
Result:
Contenido del B:
contenido
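The same steps can be wrapped in a small reusable function. This is only a sketch: the name `textOfTags` is hypothetical, and it assumes you want libxml's parser warnings about imperfect markup collected silently rather than printed. It uses the default `loadHTML` behaviour (only `LIBXML_NONET` is passed), so the parser adds the implied `<html>` and `<body>` wrappers.

```php
<?php
// Hypothetical helper: returns the text content of every <$tag> element in $html.
function textOfTags(string $html, string $tag): array
{
    // Collect parser warnings internally instead of emitting them
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument;
    $dom->loadHTML($html, LIBXML_NONET);

    // Discard the collected warnings and restore the previous setting
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $texts = [];
    foreach ($dom->getElementsByTagName($tag) as $node) {
        $texts[] = $node->textContent;
    }
    return $texts;
}

// Attributes, extra spaces, uppercase tags, comments and nested tags are all handled
$html = 'cualquiercosa<b class="x">contenido</b> <!-- <b>no</b> --> <B >otro <i>mas</i></B>';
print_r(textOfTags($html, 'b')); // ['contenido', 'otro mas']
```

Note that the `<b>` inside the comment is ignored and the nested `<i>` tag is dropped from the text, which is the kind of input that tends to trip up a regex.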
Demo:
See the 3v4l.org demo