RegExp for URL


Good day! I have a regular expression that extracts only the part of a URL that interests me. I have two types of URL: link link

So far, with the regexp ".*\/(.*\/.+)" I get: page/login?execution=s3p1

The problem is that I need only this part: page/login, using a single regexp for both examples. I have tried another expression that gives me what I want, but it fails on the URL without "?execution=s3p1", where there is no question mark.

Thank you in advance for all the help you can give me

    
asked by Ricardio 04.12.2018 at 11:36

1 answer


Specifically with regex:

I would do it using something like:

/\/(\w+)\/(\w+)(\?{1}.*)?$/

That means:

\/        a slash
(\w+)     a text segment (capture group)
\/        another slash
(\w+)     another text segment (capture group)
(\?{1}.*)? query string "?blabla" (capture group, optional)
$         end of the string

Because the end of the string is explicit, the pattern does not pick up text + slash + text pairs from the middle of the URL, only the last two segments.
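
As a quick check of that point, here is a minimal sketch that runs the pattern with and without the final $ on the same URL. The names unanchored, anchored and firstTwoGroups are made up for this illustration.

```javascript
// Same pattern as above, once without and once with the end-of-string anchor.
var unanchored = /\/(\w+)\/(\w+)(\?{1}.*)?/,
  anchored = /\/(\w+)\/(\w+)(\?{1}.*)?$/,
  url = 'https://www.miurl.com.co/path/path2/page2/login';

// Hypothetical helper: join the first two capture groups, or return null.
function firstTwoGroups(re, str) {
  var m = re.exec(str);
  return m ? m.slice(1, 3).join('/') : null;
}

var withoutAnchor = firstTwoGroups(unanchored, url); // first text/text pair found
var withAnchor = firstTwoGroups(anchored, url);      // last text/text pair

console.log('without $', withoutAnchor); // path/path2
console.log('with $', withAnchor);       // page2/login
```

Without the anchor the engine stops at the first pair it finds from the left, which is why the $ matters here.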

Since the URL may or may not have a query string, the whole query string is also a capture group, and an optional one. This means:

  • Without a query string, the last capture group matches nothing; because it is followed by ? the match still succeeds and the third captured element is empty (undefined in JavaScript).
  • With a query string, it has to be of the form ?xxxxxx .
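
To see both cases, a small sketch that prints only the third capture group for the two example URLs (the variable names con_qs and sin_qs are just for this illustration):

```javascript
var expreg = /\/(\w+)\/(\w+)(\?{1}.*)?$/,
  con_qs = expreg.exec('https://www.miurl.com.co/path/path/page1/login?execution=sp1'),
  sin_qs = expreg.exec('https://www.miurl.com.co/path/path2/page2/login');

// Index 3 of the result holds the optional query-string group.
console.log(con_qs[3]); // ?execution=sp1
console.log(sin_qs[3]); // undefined, the group did not take part in the match
```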

A pattern like this would also let you parse a URL that has (although it is not valid) two query strings, of the form

 https://www.miurl.com/aaa/bbb?param=1/ccc/bbb/?param=2

This works because the first occurrence of the pattern slash-text-slash-text-question mark-everything else already satisfies the expression, and a second occurrence does not change the result (there is no recursive search).
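
A short sketch to check that behaviour on the odd URL above; url_rara, ruta and resto are names invented for the example:

```javascript
var expreg = /\/(\w+)\/(\w+)(\?{1}.*)?$/,
  url_rara = 'https://www.miurl.com/aaa/bbb?param=1/ccc/bbb/?param=2',
  m = expreg.exec(url_rara);

// The first two groups give the path; the third swallows everything from the first "?".
var ruta = m ? m.slice(1, 3).join('/') : null;
var resto = m ? m[3] : null;

console.log(ruta);  // aaa/bbb
console.log(resto); // ?param=1/ccc/bbb/?param=2
```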

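// url_con_qs: URL with a query string; url_sin_qs: URL without one.
var expreg = /\/(\w+)\/(\w+)(\?{1}.*)?$/,
  url_con_qs = 'https://www.miurl.com.co/path/path/page1/login?execution=sp1',
  url_sin_qs = 'https://www.miurl.com.co/path/path2/page2/login',
  exec1 = expreg.exec(url_con_qs),
  exec2 = expreg.exec(url_sin_qs);

// exec returns null when nothing matches, so check before slicing.
// Elements 1 and 2 of the result are the two captured path segments.
if (exec1) {
  console.log('url con qs',
    exec1.slice(1, 3).join('/'));
}

if (exec2) {
  console.log('url sin qs',
    exec2.slice(1, 3).join('/'));
}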

A more elaborate version could also support URLs that end with an optional slash, or with a slash followed by a query string. You probably do not need it.

/\/(\w+)\/(\w+)(\/?|\/?\?{1}.*)$/
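
As a rough check of this variant, the sketch below runs it over four URLs: the two from the question plus the same two with a trailing slash added (those two, and the names expreg2, urls and rutas, are assumptions for the example):

```javascript
var expreg2 = /\/(\w+)\/(\w+)(\/?|\/?\?{1}.*)$/,
  urls = [
    'https://www.miurl.com.co/path/path2/page2/login',
    'https://www.miurl.com.co/path/path2/page2/login/',
    'https://www.miurl.com.co/path/path/page1/login?execution=sp1',
    'https://www.miurl.com.co/path/path/page1/login/?execution=sp1'
  ];

// For each URL, join the two captured segments, or keep null if there is no match.
var rutas = urls.map(function (url) {
  var m = expreg2.exec(url);
  return m ? m.slice(1, 3).join('/') : null;
});

console.log(rutas.join(', '));
// page2/login, page2/login, page1/login, page1/login
```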

A shorter way:

Given a string of the form

 https://www.miurl.com.co/path/path2/page2/login?lalala

Or without a query string:

 https://www.miurl.com.co/path/path2/page2/login

You could split it on ? , keep the first part, split that on / , take the last two values, and join them with /

// Drop the query string, split the rest on "/", keep the last two segments.
var url_con_qs = 'https://www.miurl.com.co/path/path/page1/login?execution=sp1',
  url_sin_qs = 'https://www.miurl.com.co/path/path2/page2/login',
  ruta1 = url_con_qs.split('?')[0].split('/').slice(-2).join('/'),
  ruta2 = url_sin_qs.split('?')[0].split('/').slice(-2).join('/');

console.log('url con qs', ruta1);
console.log('url sin qs', ruta2);
    
answered by 04.12.2018 at 12:22