Capture text in quotation marks between two tags


I want to match any text in double quotes that appears between a specific pair of tags. For example:

codigo cualquiera 1

[begin]

 "hola" <--- match a este

 123

 hola "hola" <-- math a este otro

[end]
[otro]

  "hola" <--- a este no

[otro]

I need a regular expression that captures the text "hola" (in double quotes) when it appears between [begin] and [end], but not the hola that is not enclosed in double quotes.

I tried:

[begin](\s*|.*)|".*"(\s*|.*)*[end]

This matches all the content between [begin] and [end], but what I want is to capture only the text that is enclosed in double quotes.

    
asked by wizeuce 01.11.2017 at 22:56

1 answer


Text within quotation marks, within [begin] and [end]

Regular expression:

/(?:\G(?!\A)|\[begin])[^["]*(?:\[(?!end])[^["]*)*"([^"]*)"/

Description:

  • (?:\G(?!\A)|\[begin]) - Matches the opening tag, or resumes from the position just after the closing quote of the previous match.
    • Option 1: \G(?!\A) - matches at the position where the previous match ended. \G also matches at the start of the string, so we add the negative lookahead (?!\A), which fails at the start of the string, to rule that case out.
    • Option 2: \[begin] - literal [begin]
  • [^["]*(?:\[(?!end])[^["]*)* - Consume all characters up to quotes or the [end] tag.

    • [^["]* - any character other than [ or " .
    • (?:\[(?!end])[^["]*)* - optionally followed by a [ that is not part of [end] and more characters to consume with [^["]* , this repeated 0 or more times.

    This structure is analogous to using .*? but it is more efficient. For more details, see Unrolling the loop .

  • "([^"]*)" - The text in quotes, captured by the group (the parentheses) so that it is available under its own index in the results array.
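
To illustrate why the (?!\A) guard described above matters, here is a minimal sketch that runs the pattern with and without it. The sample string and the variable names ($sample, $tail, $guarded, $unguarded) are made up for this illustration; they are not part of the original question.

```php
<?php
// Hypothetical sample: one quoted string before [begin], one inside, one after [end].
$sample = '"before" [begin] "inside" [end] "after"';

// Shared tail of the pattern: skip to the next quote without crossing [end], then capture.
$tail = '[^["]*(?:\[(?!end])[^["]*)*"([^"]*)"';

$guarded   = '/(?:\G(?!\A)|\[begin])' . $tail . '/'; // pattern from the answer
$unguarded = '/(?:\G|\[begin])' . $tail . '/';       // same pattern without (?!\A)

preg_match_all($guarded, $sample, $m);
$withGuard = $m[1];

preg_match_all($unguarded, $sample, $m);
$withoutGuard = $m[1];

echo implode(', ', $withGuard), "\n";    // inside
echo implode(', ', $withoutGuard), "\n"; // before, inside
```

Without the guard, \G succeeds at offset 0, so the first match starts before any [begin] and the chain of matches continues into the block. In both variants the quoted text after [end] is left alone, because the (?!end]) lookahead stops the scan there.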

Demo: link


Code:

<?php
$texto = 'codigo cualquiera 1

[begin]

 "hola" <--- match a este

 123

 hola "hola" <-- math a este otro

 "chau!"

[end]
[otro]

  "hola" <--- a este no

[otro]';



// Apply the regex
$regex = '/(?:\G(?!\A)|\[begin])[^["]*(?:\[(?!end])[^["]*)*"([^"]*)"/';
preg_match_all($regex, $texto, $resultado);

// Print the captures of the first group, which is an array in $resultado[1]
echo implode("\n", $resultado[1]);

Result:

hola
hola
chau!
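
If a single pattern built on \G feels hard to maintain, a two-pass approach is a possible alternative. This is a different technique from the one above, shown only as a sketch: first extract each [begin]...[end] block, then collect the quoted strings inside it. The function name quotedInsideBlocks and the test string are hypothetical.

```php
<?php
// Two-pass sketch: (1) find each [begin]...[end] block, (2) collect quoted strings in it.
function quotedInsideBlocks(string $text): array
{
    $found = [];

    // Pass 1: lazily capture everything between [begin] and the nearest [end].
    // The s modifier lets the dot match newlines.
    preg_match_all('/\[begin\](.*?)\[end\]/s', $text, $blocks);

    foreach ($blocks[1] as $block) {
        // Pass 2: capture the text between each pair of double quotes.
        preg_match_all('/"([^"]*)"/', $block, $quotes);
        $found = array_merge($found, $quotes[1]);
    }

    return $found;
}

$demo = "[begin] \"a\" x \"b\" [end] \"c\" [begin]\n\"d\"\n[end]";
$demoResult = quotedInsideBlocks($demo);

echo implode("\n", $demoResult), "\n"; // prints a, b and d on separate lines
```

One behavioural difference to keep in mind: this version only looks inside blocks that have a closing [end], whereas the single pattern above keeps matching quotes after a [begin] even when no [end] follows.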


    
answered by 01.11.2017 / 23:27