RegEx: Capture text between backslashes (Path)


I hope you can help me. Using the program "The Regex Coach", I am trying to capture the names of the directories and subdirectories of a path with a regular expression. Example:

  

Projects\UP/PS (21063)\2789 (spain)/Ref/15\Email

I want to capture the text between the backslashes, leaving out the first element of the path. To be specific, I want to capture the following:

  

UP/PS

I used ([^\\]+)\s (it does not work: I also get the blank at the end)

  

(21063)

I used \((.*?)\) (works perfectly)

  

2789 (spain)/Ref/15

I cannot get the match to start after the second backslash

  

Email

I cannot get the match to start after the third backslash

I have tried a thousand ways without success. I hope it is clearer now (sorry for not explaining it better the first time). Thanks for the help! :)

    
asked by dani3c 13.08.2018 at 10:35

2 answers


An expression that is very easy to understand is the following:

(.*?)\\(.*?)\s(.*?)\\(.*?)\\(.*)

It uses the non-greedy quantifier, *? , and capture groups. It reads as follows:

  • (.*?)\\ captures (a group in parentheses) every character ( .* ) up to the first backslash. The ? is what makes it stop at the first backslash it finds ( non-greedy ) instead of continuing to the last one, which would be the default behavior ( greedy ).
  • (.*?)\s Analogous to the previous one: captures (another group) every character up to the first space.
  • The following groups work the same way, except the last one.
  • (.*) captures everything that remains.
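
As a rough illustration of the breakdown above, here is a minimal Python sketch (Python's re module is an assumption on my part; the question uses The Regex Coach, whose regex flavor may differ in details). The sample path is the one from the question, assumed to contain no spaces around the separators. It also contrasts the lazy and greedy forms of the first group.

```python
import re

# Sample path from the question (assumed to have no spaces around \ and /).
path = r"Projects\UP/PS (21063)\2789 (spain)/Ref/15\Email"

# The lazy pattern from the answer; \\ in the regex matches one literal backslash.
lazy = re.compile(r"(.*?)\\(.*?)\s(.*?)\\(.*?)\\(.*)")
groups = lazy.match(path).groups()
print(groups)

# Lazy vs. greedy: the lazy form stops at the first backslash,
# the greedy form runs on to the last one.
lazy_first = re.match(r"(.*?)\\", path).group(1)
greedy_first = re.match(r"(.*)\\", path).group(1)
print(lazy_first)
print(greedy_first)
```

The first group (Projects) is still captured here; if it is not wanted, it can simply be ignored when reading the groups.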


    
answered 13.08.2018 at 12:16

See whether this is what you are looking for:

([^\\]+)\\(\S+)\s([^\\]+)\\([^\\]+)\\(.+)

The result is:

Group 1 = Projects
Group 2 = UP/PS
Group 3 = (21063)
Group 4 = 2789 (spain)/Ref/15
Group 5 = Email


Explained:

([^\\]+)   # Any character except a backslash, 1 or more times
\\         # A literal backslash
(\S+)      # Any character except whitespace, 1 or more times
\s         # One whitespace character (includes tabs, line breaks...)
([^\\]+)   # Any character except a backslash, 1 or more times
\\         # A literal backslash
([^\\]+)   # Any character except a backslash, 1 or more times
\\         # A literal backslash
(.+)       # Any character except a line break, 1 or more times
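
The commented breakdown above can be tried almost verbatim in Python using re.VERBOSE, which ignores whitespace and # comments inside the pattern. This is only a sketch under assumptions: Python is not the tool used in the question, and the sample path is taken from the question with no spaces around the separators.

```python
import re

# Sample path from the question (assumed to have no spaces around \ and /).
path = r"Projects\UP/PS (21063)\2789 (spain)/Ref/15\Email"

pattern = re.compile(r"""
    ([^\\]+)   # group 1: anything except a backslash
    \\         # a literal backslash
    (\S+)      # group 2: non-whitespace characters
    \s         # one whitespace character
    ([^\\]+)   # group 3: anything except a backslash
    \\         # a literal backslash
    ([^\\]+)   # group 4: anything except a backslash
    \\         # a literal backslash
    (.+)       # group 5: the rest of the line
""", re.VERBOSE)

match = pattern.match(path)
names = match.groups()
for number, name in enumerate(names, start=1):
    print(f"Group {number} = {name}")
```

Because [^\\]+ can never run past a backslash, each group is confined to its own path segment without relying on lazy quantifiers.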
    
answered 13.08.2018 at 12:11