How can I select the lines between two patterns?

This is a self-answered question, based on my own question and answer in How to select lines between two patterns?

I have a file like the following, and I would like to print the lines that appear between the patterns PAT1 and PAT2.

1
2
PAT1
3    - first block
4
PAT2
5
6
PAT1
7    - second block
PAT2
8
9
PAT1
10    - third block

I have read How to select lines between two marker patterns which may occur multiple times with awk / sed, but I am curious to see all the possible combinations, both printing the patterns and not printing them.

How can I select the lines between two patterns?

    
asked by fedorqui 14.03.2017 at 16:16

1 answer

Prints lines between PAT1 and PAT2

$ awk '/PAT1/,/PAT2/' fichero
PAT1
3    - first block
4
PAT2
PAT1
7    - second block
PAT2
PAT1
10    - third block

Or, using variables:

awk '/PAT1/{flag=1} flag; /PAT2/{flag=0}' fichero

How does it work?

  • The regular expression /PAT1/ matches the lines that contain PAT1; likewise, /PAT2/ matches the lines that contain PAT2.
  • /PAT1/{flag=1} sets flag when PAT1 appears on a line.
  • /PAT2/{flag=0} clears flag when PAT2 appears on a line.
  • flag acts as a pattern on its own. When its value is 1 it evaluates as true and triggers the default action of awk, {print $0}, which prints the current line. As a result, every line from a PAT1 match through the next PAT2 match is printed. If the last PAT1 has no PAT2 after it, everything from that PAT1 to the end of the file is printed as well.
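
As a rough sketch of the points above, the following spells out the implicit print action and checks that, on the sample input, the flag version prints the same lines as the range pattern. The helper function sample and the variable names flag_out and range_out are only illustrative, not part of the original answer.

```shell
# Hypothetical helper: emits the sample input from the question.
sample() {
    printf '%s\n' 1 2 PAT1 '3    - first block' 4 PAT2 5 6 \
        PAT1 '7    - second block' PAT2 8 9 PAT1 '10    - third block'
}

# Same logic as the flag one-liner, with the default action written out.
flag_out=$(sample | awk '
    # start printing at the opening pattern
    /PAT1/ { flag = 1 }
    # explicit form of the bare "flag" pattern
    flag   { print $0 }
    # stop once the closing pattern has been printed
    /PAT2/ { flag = 0 }
')

# Range-pattern form, for comparison.
range_out=$(sample | awk '/PAT1/,/PAT2/')

[ "$flag_out" = "$range_out" ] && echo "same output on this sample"
```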

Prints lines between PAT1 and PAT2 - not including PAT1 and PAT2

$ awk '/PAT1/{flag=1; next} /PAT2/{flag=0} flag' fichero
3    - first block
4
7    - second block
10    - third block

This uses next to skip the line containing the PAT1 pattern, so it is not printed.

The next can be dropped by rearranging the rules so that flag is tested before PAT1 sets it: awk '/PAT2/{flag=0} flag; /PAT1/{flag=1}' fichero.
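
A quick way to convince yourself that the two variants agree is to run both on the sample input and compare the results. This is only a sketch; the helper function sample and the variable names with_next and reordered are made up for the illustration.

```shell
# Hypothetical helper: emits the sample input from the question.
sample() {
    printf '%s\n' 1 2 PAT1 '3    - first block' 4 PAT2 5 6 \
        PAT1 '7    - second block' PAT2 8 9 PAT1 '10    - third block'
}

# Variant 1: skip the PAT1 line explicitly with next.
with_next=$(sample | awk '/PAT1/{flag=1; next} /PAT2/{flag=0} flag')

# Variant 2: no next; flag is tested before the PAT1 rule can set it.
reordered=$(sample | awk '/PAT2/{flag=0} flag; /PAT1/{flag=1}')

printf '%s\n' "$reordered"
[ "$with_next" = "$reordered" ] && echo "both variants agree on this sample"
```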

Prints lines between PAT1 and PAT2 - including PAT1

$ awk '/PAT1/{flag=1} /PAT2/{flag=0} flag' fichero
PAT1
3    - first block
4
PAT1
7    - second block
PAT1
10    - third block

Placing flag at the end means each line is tested after both pattern rules have updated flag for that line. On a PAT1 line flag is already 1, so the line is printed; on a PAT2 line flag is already 0, so it is not.

Prints lines between PAT1 and PAT2 - including PAT2

$ awk 'flag; /PAT1/{flag=1} /PAT2/{flag=0}' fichero
3    - first block
4
PAT2
7    - second block
PAT2
10    - third block

Placing flag at the beginning means each line is tested against the value left by the previous lines, before the rules for the current line run. The PAT2 line is therefore printed (flag is still 1), while the PAT1 line is not (flag is still 0).
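
The effect of where the bare flag rule sits can be seen on a tiny input. The sketch below uses a made-up five-line input (a, PAT1, b, PAT2, c) and joins each result onto one line with paste; the names block, start_only and end_only are illustrative only.

```shell
# Hypothetical minimal input: one block surrounded by other lines.
block() { printf '%s\n' a PAT1 b PAT2 c; }

# flag tested last: the PAT1 line already sees flag=1, the PAT2 line sees flag=0.
start_only=$(block | awk '/PAT1/{flag=1} /PAT2/{flag=0} flag' | paste -sd' ' -)

# flag tested first: each line is judged by the state left by earlier lines.
end_only=$(block | awk 'flag; /PAT1/{flag=1} /PAT2/{flag=0}' | paste -sd' ' -)

echo "flag last:  $start_only"   # flag last:  PAT1 b
echo "flag first: $end_only"     # flag first: b PAT2
```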

Prints lines between PAT1 and PAT2 - excluding the lines between the last PAT1 and the end of the file if no other PAT2 appears

This is based on Ed Morton's solution.

awk 'flag{
        if (/PAT2/)
           {printf "%s", buf; flag=0; buf=""}
        else
            buf = buf $0 ORS
     }
     /PAT1/ {flag=1}' fichero

On a single line:

$ awk 'flag{ if (/PAT2/){printf "%s", buf; flag=0; buf=""} else buf = buf $0 ORS}; /PAT1/{flag=1}' fichero
3    - first block
4
7    - second block

# note that the third block does not appear, since there is no PAT2 pattern after the last PAT1

This stores the selected lines in a buffer: once a line has matched PAT1, every following line is appended to the buffer until PAT2 is found. At that point the buffered lines are printed and the buffer is emptied. A block that never reaches a PAT2 stays in the buffer and is never printed.
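
To make the difference visible, the sketch below runs the buffered version next to the plain exclusive version on the sample input; the only line that differs is the one from the unterminated third block. The helper function sample and the variable names buffered and unbuffered are illustrative assumptions.

```shell
# Hypothetical helper: emits the sample input from the question.
sample() {
    printf '%s\n' 1 2 PAT1 '3    - first block' 4 PAT2 5 6 \
        PAT1 '7    - second block' PAT2 8 9 PAT1 '10    - third block'
}

# Buffered: a block is printed only once its closing PAT2 has been seen.
buffered=$(sample | awk '
    flag {
        # block closed: flush the buffer, otherwise keep collecting
        if (/PAT2/) { printf "%s", buf; flag = 0; buf = "" }
        else        { buf = buf $0 ORS }
    }
    # start collecting from the line after PAT1
    /PAT1/ { flag = 1 }
')

# Unbuffered: lines are printed as they arrive, so an open block still shows up.
unbuffered=$(sample | awk '/PAT1/{flag=1; next} /PAT2/{flag=0} flag')

printf 'buffered:\n%s\n\nunbuffered:\n%s\n' "$buffered" "$unbuffered"
```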

    
answered by 14.03.2017 / 16:16