Extract a piece of text from the middle of a string

0

I need to extract a piece of text from the middle of a string, but I do not know how.

The text is as follows:

string cadena = "jy193UAhUHJsAKHV4rD904PBAWCC0wAA&url=http%3A%2F%2Fwww.icali.es%2FPORTAL_ICALI%2FprintPortal.do%3FurlPagina%3DS005013001%2Fes_ES.html&usg=AFQjCNH-c6dVemIxU_GaSYgoGPNXWVztIA";

The text I have to extract is a URL delimited by &url= and &usg= . I think this URL is encoded in hexadecimal, if I'm not mistaken, and if you know how to convert it back to normal text, that would be great.

Example:

&url=(the URL goes here)&usg=

I thought about solving it with regular expressions. How could I do it?

    
asked by Alonso 27.06.2017 at 11:40

5 answers

2

Hi, I see that all the answers rely on System.Web ; in case you cannot or do not want to use it, here is another possible solution.

// Requires: using System; using System.Linq;
string cadena = "jy193UAhUHJsAKHV4rD904PBAWCC0wAA&url=http%3A%2F%2Fwww.icali.es%2FPORTAL_ICALI%2FprintPortal.do%3FurlPagina%3DS005013001%2Fes_ES.html&usg=AFQjCNH-c6dVemIxU_GaSYgoGPNXWVztIA";
string url = null;
if (cadena.Contains("&url=") && cadena.Contains("&usg="))
{
    // Split on both markers and keep the piece that looks like a URL.
    var subCadena = cadena.Split(new string[] { "&url=", "&usg=" }, StringSplitOptions.RemoveEmptyEntries)
                          .FirstOrDefault(x => x.StartsWith("http"));
    // Decode the percent-encoded characters (%3A becomes ':', %2F becomes '/').
    if (subCadena != null)
        url = Uri.UnescapeDataString(subCadena);
}

Greetings, and I hope it helps you.

    
answered by 27.06.2017 at 12:36
1

If you can reference the System.Web assembly in your project, you can use the HttpUtility.ParseQueryString method to parse the string into a NameValueCollection and work with its parameters. The linked documentation includes this example:

<%@ Page Language="C#"%>

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<script runat="server">

  protected void Page_Load(object sender, EventArgs e)
  {
    String currurl = HttpContext.Current.Request.RawUrl;
    String querystring = null ;

    // Check to make sure some query string variables
    // exist and if not add some and redirect.
    int iqs = currurl.IndexOf('?');
    if (iqs == -1)
    {
      String redirecturl = currurl + "?var1=1&var2=2+2%2f3&var1=3";
      Response.Redirect(redirecturl, true); 
    }
    // If query string variables exist, put them in
    // a string.
    else if (iqs >= 0)
    {
      querystring = (iqs < currurl.Length - 1) ? currurl.Substring(iqs + 1) : String.Empty;
    }

    // Parse the query string variables into a NameValueCollection.
    NameValueCollection qscoll = HttpUtility.ParseQueryString(querystring);

    // Iterate through the collection.
    StringBuilder sb = new StringBuilder("<br />");
    foreach (String s in qscoll.AllKeys)
    {
      sb.Append(s + " - " + qscoll[s] + "<br />");
    }

    // Write the result to a label.
    ParseOutput.Text = sb.ToString();

  }
</script>

<html xmlns="http://www.w3.org/1999/xhtml" >
<head runat="server">
    <title>HttpUtility ParseQueryString Example</title>
</head>
<body>
    <form id="form1" runat="server">
      Query string variables are:
      <asp:Label  id="ParseOutput"
                  runat="server" />
    </form>
</body>
</html>
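
Applied to the string from the question, the same method can also be called outside a web page. This is only a minimal sketch, assuming the project can reference System.Web (on .NET Core and later, HttpUtility lives in the System.Web.HttpUtility assembly); the class and method names are made up for the example:

```csharp
using System;
using System.Collections.Specialized;
using System.Web;

public static class EjemploParseQueryString
{
    // Hypothetical helper: returns the decoded value of the "url"
    // parameter, or null if the parameter is not present.
    public static string ObtenerUrl(string cadena)
    {
        // ParseQueryString splits on '&' and '=' and URL-decodes each value.
        NameValueCollection parametros = HttpUtility.ParseQueryString(cadena);
        return parametros["url"];
    }

    public static void Main()
    {
        string cadena = "jy193UAhUHJsAKHV4rD904PBAWCC0wAA&url=http%3A%2F%2Fwww.icali.es%2FPORTAL_ICALI%2FprintPortal.do%3FurlPagina%3DS005013001%2Fes_ES.html&usg=AFQjCNH-c6dVemIxU_GaSYgoGPNXWVztIA";
        // Prints http://www.icali.es/PORTAL_ICALI/printPortal.do?urlPagina=S005013001/es_ES.html
        Console.WriteLine(ObtenerUrl(cadena));
    }
}
```

Since the values come back already decoded, no separate call to UrlDecode or Uri.UnescapeDataString is needed here.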
    
answered by 27.06.2017 at 12:20
1

You can also parse the string without the System.Web assembly, generating a NameValueCollection and accessing its elements (based on Rick Strahl's blog post "A .NET QueryString and Form Data Parser").

Create a parser class

using System;
using System.Collections.Generic;
using System.Collections.Specialized;

/// <summary>
/// A query string or UrlEncoded form parser and editor 
/// class that allows reading and writing of urlencoded
/// key value pairs used for query string and HTTP 
/// form data.
/// 
/// Useful for parsing and editing querystrings inside
/// of non-Web code that doesn't have easy access to
/// the HttpUtility class.                
/// </summary>
/// <remarks>
/// Supports multiple values per key
/// </remarks>
public class UrlEncodingParser : NameValueCollection
{

    /// <summary>
    /// Holds the original Url that was assigned if any
    /// Url must contain // to be considered a url
    /// </summary>
    private string Url { get; set; }

    /// <summary>
    /// Always pass in a UrlEncoded data or a URL to parse from
    /// unless you are creating a new one from scratch.
    /// </summary>
    /// <param name="queryStringOrUrl">
    /// Pass a query string or raw Form data, or a full URL.
    /// If a URL is parsed the part prior to the ? is stripped
    /// but saved. Then when you write the original URL is 
    /// re-written with the new query string.
    /// </param>
    public UrlEncodingParser(string queryStringOrUrl = null)
    {
        Url = string.Empty;

        if (!string.IsNullOrEmpty(queryStringOrUrl))
        {
            Parse(queryStringOrUrl);
        }
    }


    /// <summary>
    /// Assigns multiple values to the same key
    /// </summary>
    /// <param name="key"></param>
    /// <param name="values"></param>
    public void SetValues(string key, IEnumerable<string> values)
    {
        foreach (var val in values)
            Add(key, val);
    }

    /// <summary>
    /// Parses the query string into the internal dictionary
    /// and optionally also returns this dictionary
    /// </summary>
    /// <param name="query">
    /// Query string key value pairs or a full URL. If URL is
    /// passed the URL is re-written in Write operation
    /// </param>
    /// <returns></returns>
    public NameValueCollection Parse(string query)
    {
        if (Uri.IsWellFormedUriString(query, UriKind.Absolute))
            Url = query;

        if (string.IsNullOrEmpty(query))
            Clear();
        else
        {
            int index = query.IndexOf('?');
            if (index > -1)
            {
                if (query.Length >= index + 1)
                    query = query.Substring(index + 1);
            }

            var pairs = query.Split('&');
            foreach (var pair in pairs)
            {
                int index2 = pair.IndexOf('=');
                if (index2 > 0)
                {
                    Add(pair.Substring(0, index2), pair.Substring(index2 + 1));
                }
            }
        }

        return this;
    }

    /// <summary>
    /// Writes out the urlencoded data/query string or full URL based 
    /// on the internally set values.
    /// </summary>
    /// <returns>urlencoded data or url</returns>
    public override string ToString()
    {
        string query = string.Empty;
        foreach (string key in Keys)
        {
            string[] values = GetValues(key);
            foreach (var val in values)
            {
                query += key + "=" + Uri.EscapeUriString(val) + "&";
            }
        }
        query = query.Trim('&');

        if (!string.IsNullOrEmpty(Url))
        {
            if (Url.Contains("?"))
                query = Url.Substring(0, Url.IndexOf('?') + 1) + query;
            else
                query = Url + "?" + query;
        }

        return query;
    }
}

Create the console application

using System;

namespace TestExample
{
    class Program
    {
        static void Main()
        {
            var query = "jy193UAhUHJsAKHV4rD904PBAWCC0wAA&url=http%3A%2F%2Fwww.icali.es%2FPORTAL_ICALI%2FprintPortal.do%3FurlPagina%3DS005013001%2Fes_ES.html&usg=AFQjCNH-c6dVemIxU_GaSYgoGPNXWVztIA";

            var urlQuery = new UrlEncodingParser(query);

            foreach (string key in urlQuery.Keys)
            {
                Console.WriteLine($"{key} {Uri.UnescapeDataString(urlQuery[key])}");
            }

            Console.ReadKey();
        }
    }
}

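Program output

url http://www.icali.es/PORTAL_ICALI/printPortal.do?urlPagina=S005013001/es_ES.html
usg AFQjCNH-c6dVemIxU_GaSYgoGPNXWVztIA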

    
answered by 27.06.2017 at 12:41
1

As @Mariano says, a regex does not seem necessary for what you're trying to achieve. What I would do is split the string using Split with the two delimiter strings as parameters, and then check which of the resulting pieces starts with http . Then, to decode the URL, you can use the UrlDecode method of System.Web :

// Requires: using System; using System.Linq; and a reference to System.Web
string cadena = "jy193UAhUHJsAKHV4rD904PBAWCC0wAA&url=http%3A%2F%2Fwww.icali.es%2FPORTAL_ICALI%2FprintPortal.do%3FurlPagina%3DS005013001%2Fes_ES.html&usg=AFQjCNH-c6dVemIxU_GaSYgoGPNXWVztIA";
var url = System.Web.HttpUtility.UrlDecode(
    cadena.Split(new string[] { "&url=", "&usg=" }, StringSplitOptions.RemoveEmptyEntries)
          .FirstOrDefault(x => x.StartsWith("http")));

// url now contains http://www.icali.es/PORTAL_ICALI/printPortal.do?urlPagina=S005013001/es_ES.html

Actually, instead of having to import the System.Web assembly, it's better to use Uri.UnescapeDataString , as @Gerardo suggests.
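
If you still prefer a regular expression, as the question asks, it could look roughly like this. This is only a sketch: the class and method names are made up for the example, and the pattern assumes the URL always sits between &url= and &usg= as in the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EjemploRegex
{
    // Lazily captures everything between "&url=" and the next "&usg=".
    private static readonly Regex PatronUrl = new Regex("&url=(.*?)&usg=");

    // Hypothetical helper: returns the decoded URL, or null if there is no match.
    public static string ObtenerUrl(string cadena)
    {
        Match coincidencia = PatronUrl.Match(cadena);
        if (!coincidencia.Success)
            return null;
        // Group 1 holds the still-encoded URL; decode the %XX sequences.
        return Uri.UnescapeDataString(coincidencia.Groups[1].Value);
    }

    public static void Main()
    {
        string cadena = "jy193UAhUHJsAKHV4rD904PBAWCC0wAA&url=http%3A%2F%2Fwww.icali.es%2FPORTAL_ICALI%2FprintPortal.do%3FurlPagina%3DS005013001%2Fes_ES.html&usg=AFQjCNH-c6dVemIxU_GaSYgoGPNXWVztIA";
        // Prints http://www.icali.es/PORTAL_ICALI/printPortal.do?urlPagina=S005013001/es_ES.html
        Console.WriteLine(ObtenerUrl(cadena));
    }
}
```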

    
answered by 27.06.2017 at 11:57
1

Another, somewhat cruder, solution would be to take a substring using the indexes of those two markers. Something like this:

string cadena = "jy193UAhUHJsAKHV4rD904PBAWCC0wAA&url=http%3A%2F%2Fwww.icali.es%2FPORTAL_ICALI%2FprintPortal.do%3FurlPagina%3DS005013001%2Fes_ES.html&usg=AFQjCNH-c6dVemIxU_GaSYgoGPNXWVztIA";
// Position just after "&url=": the index of the marker plus its length.
int indice1 = cadena.IndexOf("&url=") + "&url=".Length;
// Position of "&usg=", searching from where the URL starts.
int indice2 = cadena.IndexOf("&usg=", indice1);
// Subtract the indexes to know how many characters to take.
int caracteres = indice2 - indice1;
// Finally, take a substring from the first index with that many characters.
string cadena2 = cadena.Substring(indice1, caracteres);
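
The snippet above assumes both markers are present; if one is missing, IndexOf returns -1 and the Substring call can fail or return the wrong text. A possible way to wrap the same idea in a reusable method with that check is sketched below (ExtraerEntre and the class name are hypothetical, not part of any library):

```csharp
using System;

public static class EjemploSubstring
{
    // Hypothetical helper: returns the text between the markers inicio and fin,
    // or null if either marker cannot be found.
    public static string ExtraerEntre(string texto, string inicio, string fin)
    {
        int i = texto.IndexOf(inicio, StringComparison.Ordinal);
        if (i < 0)
            return null;
        i += inicio.Length; // move past the opening marker

        // Look for the closing marker only after the opening one.
        int j = texto.IndexOf(fin, i, StringComparison.Ordinal);
        if (j < 0)
            return null;

        return texto.Substring(i, j - i);
    }

    public static void Main()
    {
        string cadena = "jy193UAhUHJsAKHV4rD904PBAWCC0wAA&url=http%3A%2F%2Fwww.icali.es%2FPORTAL_ICALI%2FprintPortal.do%3FurlPagina%3DS005013001%2Fes_ES.html&usg=AFQjCNH-c6dVemIxU_GaSYgoGPNXWVztIA";
        string codificada = ExtraerEntre(cadena, "&url=", "&usg=");
        // The extracted text is still percent-encoded, so decode it before use.
        Console.WriteLine(Uri.UnescapeDataString(codificada));
    }
}
```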
    
answered by 27.06.2017 at 12:24