"replaceAll()" in Java does not work correctly

1

Good morning,

I am using a short line of code to clean up price information obtained by scraping. Since the prices come from different countries, they use different currencies, and I want to keep only the number.

For this I use:

price.select("span").first().text().replace(".", "").replaceAll("SG$|CAD|R$|HUF|€|₽|incl. GST|$|R|₹|£|¥|₩|NT$","")

What is the problem? There are three symbols that are not removed: $, SG$ (Singapore currency) and NT$ (Taiwan currency).

I could understand the Taiwan currency not working: the plain dollar sign ($) appears before NT$ in the pattern, so I would expect the $ to be removed and the NT to remain. However, neither of those things happens.

I cannot find the problem; all the other symbols are removed correctly.

    
asked by JetLagFox 12.03.2017 at 01:26

5 answers

1

Good morning,

In the end, the solution was this:

price.select("s").first().text().replace(".", "").replaceAll("SG\\$|CAD|R\\$|HUF|€|₽|incl. GST|NT\\$|R|₹|£|¥|₩|\\$","");
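For anyone who wants to try the escaped pattern without jsoup, here is a minimal sketch on plain strings. The class name PriceCleaner, the method clean and the sample inputs are made up for illustration. Two assumptions differ from the line above: only the ASCII symbols are listed (the others can be appended to the alternation), and the currency markers are removed before the periods so that "incl. GST" can still match.

```java
import java.util.regex.Pattern;

class PriceCleaner {
    // In Java source, "\\$" is the regex \$, which matches a literal dollar sign.
    // Multi-character symbols come before the bare R and $ so they are tried first.
    private static final Pattern CURRENCY =
        Pattern.compile("incl\\. GST|SG\\$|NT\\$|R\\$|CAD|HUF|R|\\$");

    // Hypothetical helper: strip the currency markers, then the periods.
    static String clean(String raw) {
        String withoutCurrency = CURRENCY.matcher(raw).replaceAll("");
        return withoutCurrency.replace(".", "").trim();
    }

    public static void main(String[] args) {
        System.out.println(clean("SG$1.000"));       // 1000
        System.out.println(clean("NT$142"));         // 142
        System.out.println(clean("R$45 incl. GST")); // 45
    }
}
```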
    
answered by 14.03.2017 / 14:55
1

Remember that special characters such as $ are represented in HTML by a code, not by the literal character.

For example, the equivalent of $ is &#36;

So change your replace call to use the code instead of the literal.

That is the only cause I can think of.

Here is a table of special characters: link

    
answered by 12.03.2017 at 01:59
0

EDIT: The string you want to clean has several problems. You could perhaps solve it with a regex, but, for example, apart from the problem with the $ sign, you remove the periods first, and one of the currency markers itself contains a period (incl. GST).

So I propose this solution. It is demonstrated on a made-up test string:

public class Reemplazar
{
  public static void main(String[] args)
  {
    /* MADE-UP TEST STRING:
     * In your own code use this variable:
     * String sSinMoneda=price.select("span").first().text(); 
     * The one below is only for testing the program
     */

    String sSinMoneda="SG$100. $1000 SG$1000 200NT$ 1CAD 50HUF 20€ ₽456 incl. GST 6875 R45 ₹67 £658 ¥89 ₩234 NT$142";

    // This marker contains a period (.), so remove it before removing the periods themselves
    sSinMoneda = sSinMoneda.replace("incl. GST", "");
    sSinMoneda = sSinMoneda.replace(".", "");
    sSinMoneda = sSinMoneda.replaceAll("CAD|HUF|€|₽|incl. GST|R|₹|£|¥|₩","");
    // All the replacements that involve the $ sign (replace() treats it literally)
    sSinMoneda = sSinMoneda.replace("SG$","");
    sSinMoneda = sSinMoneda.replace("R$","");
    sSinMoneda = sSinMoneda.replace("NT$","");

    // Leave the bare $ for the very end
    sSinMoneda = sSinMoneda.replace("$","");

    System.out.println(sSinMoneda);
  }
}

Result

  

100 1000 1000 200 1 50 20 456 6875 45 67 658 89 234 142

    
answered by 12.03.2017 at 05:29
0

Hi, try putting a backslash '\' before each '$'. In a Java string literal the backslash itself has to be doubled, so for example:

replaceAll("SG\\$|CAD|R\\$|HUF|€|₽|incl. GST|\\$|R|₹|£|¥|₩|NT\\$","")
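To see why the backslash matters, here is a small sketch comparing the two patterns. The class and method names are hypothetical. Unescaped, $ is an anchor for the end of the input, so "SG$" only matches the letters SG when they are the last thing in the string.

```java
class DollarAnchorDemo {
    // Unescaped: the regex means "SG followed by end of input".
    static String unescaped(String s) {
        return s.replaceAll("SG$", "");
    }

    // Escaped: the regex means the three literal characters S, G, $.
    static String escaped(String s) {
        return s.replaceAll("SG\\$", "");
    }

    public static void main(String[] args) {
        System.out.println(unescaped("SG$100")); // SG$100 (nothing removed)
        System.out.println(unescaped("100SG"));  // 100 (SG was at the end)
        System.out.println(escaped("SG$100"));   // 100
    }
}
```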
    
answered by 13.03.2017 at 13:23
-1

Your problem is that you are using, as literal characters, symbols that have a special meaning in regular expressions (see the documentation):
    1.- The . character matches any character (I see that is why you handle it separately, but you can include it in the regexp)
    2.- The $ character matches the end of a line

To match one of these characters literally, wrap it in square brackets []. If you change your regular expression to the one in the code below, it should work (it also handles the period, which saves you the separate replace):

String ejemplo = "SG$100. $1000 SG$1000 200NT$ 1CAD 50HUF 20€ ₽456 incl. GST 6875 R45 ₹67 £658 ¥89 ₩234 NT$142";
ejemplo = ejemplo.replaceAll("SG[$]|CAD|R[$]|HUF|€|₽|incl. GST|[$]|R|₹|£|¥|₩|NT[$]|[.]","");
System.out.println(ejemplo);

With this the output is:

  

100 1000 1000 200 1 50 20 456 6875 45 67 658 89 234 142
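As an alternative to bracketing each special character by hand (this is not part of the answer above), the pattern could be built from a list of literal symbols with Pattern.quote, which escapes whatever it is given. A minimal sketch, where the class name QuotedSymbols, the SYMBOLS list and the strip method are all made up, and only ASCII symbols are listed for brevity:

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class QuotedSymbols {
    // Literal symbols to remove; no manual escaping is needed here.
    static final List<String> SYMBOLS =
        Arrays.asList("$", "R", "SG$", "NT$", "R$", "CAD", "HUF", "incl. GST", ".");

    // Longest symbols first, so "SG$" is tried before "$" and "R$" before "R".
    static final Pattern PATTERN = Pattern.compile(
        SYMBOLS.stream()
            .sorted((a, b) -> Integer.compare(b.length(), a.length()))
            .map(Pattern::quote)
            .collect(Collectors.joining("|")));

    static String strip(String text) {
        return PATTERN.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(strip("SG$100. $1000 200NT$ 1CAD R45")); // 100 1000 200 1 45
    }
}
```

Sorting by length is a design choice of this sketch: it keeps the result independent of the order in which the symbols are listed.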

    
answered by 13.03.2017 at 13:12