How to use split()?

I have a .txt file that I read, and I need to break its contents apart into separate words:

PRU.U,
 PRU.D,
 EJEM.T,
EJEM.C

For example, PRU stands for prueba and U stands for uno, and so on, each code with its respective word, but I do not know how to use split or StringTokenizer.

// This would be an example of the text that is read

Ademasde    
PRU.U,
//      
 PRU.D,
//      
 EJEM.T,
//      
EJEM.C

    Scanner input;

try{

            input = new Scanner(new File("C:\\Users\\USUARIO\\Desktop\\EJEMPLO.txt"));

            while (input.hasNextLine()) {

                String line = input.nextLine();

                String[] ON = line.split("\\stu.");
                for (int x = 0; x < ON.length; x++) {
                    String Tabla = "EJEMPLO ";

                    System.out.println(Tabla + ON[x].replace(",", ""));

                }
            }
        } catch (FileNotFoundException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }

    }

}
    
asked by TGAB99 30.11.2018 at 06:10

1 answer

You need to correct your code.

With the line

String line = input.nextLine();

you are already splitting the input by lines; that is, the first time you call it you will get PRU.U, // in the variable line.
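
As a minimal sketch of what nextLine() hands back, the following reads the sample text from an in-memory string instead of the file. The class name LineDemo, the readLines helper, and the exact sample string are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class LineDemo {
    // Hypothetical helper: collects every line the Scanner yields,
    // without the trailing line terminator.
    static List<String> readLines(String text) {
        List<String> lines = new ArrayList<>();
        try (Scanner input = new Scanner(text)) {
            while (input.hasNextLine()) {
                lines.add(input.nextLine());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Assumed layout of the file: one code per line, followed by "//".
        String sample = "PRU.U, //\n PRU.D, //\n EJEM.T, //\nEJEM.C";
        for (String line : readLines(sample)) {
            // Brackets make leading and trailing spaces visible.
            System.out.println("[" + line + "]");
        }
    }
}
```

Each call to nextLine() already returns one complete line, so whatever comes next only has to pick the code out of that single line.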

We will use the regular expression (\p{Alpha}*\.\p{Alpha}*).* which is composed of the following elements:

  • \p{...} indicates that we will match a named class of characters.
  • \p{Alpha} matches alphabetic characters (letters only, no digits; by default only ASCII letters in Java).
  • * indicates that the preceding element may appear 0 or more times.
  • \. is a literal dot (period) character.
  • . is any character.
  • anything in parentheses is a capturing group; groups are numbered from left to right, starting at 1.

Now, instead of using split, we will replace as follows:

    String nobreArchivo = line.replaceAll("(\\p{Alpha}*\\.\\p{Alpha}*).*", "$1");

(The backslashes are doubled because the regular expression is written inside a Java string literal.)

In the second parameter, the dollar sign indicates that we want the text captured by the group whose number follows it. In our case we capture only one group, so that is the one returned: group 1.
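
To see the replacement at work, here is a small sketch that applies it to the sample lines and then splits the result on the dot, which is what the question asked about. The class name CodeExtractor, the helper names, the trim() call, and the two-entry lookup map are assumptions for illustration; the map holds only the PRU and U pairs mentioned in the question.

```java
import java.util.HashMap;
import java.util.Map;

public class CodeExtractor {
    // Same regular expression as above, with backslashes doubled for Java.
    static final String CODE_PATTERN = "(\\p{Alpha}*\\.\\p{Alpha}*).*";

    // Keeps only group 1 (letters, a dot, letters) and drops what follows.
    // trim() is an extra step that removes leading spaces the regex leaves alone.
    static String extractCode(String line) {
        return line.replaceAll(CODE_PATTERN, "$1").trim();
    }

    // split() takes a regex, so the dot must be escaped to mean a literal dot.
    static String[] splitCode(String code) {
        return code.split("\\.");
    }

    public static void main(String[] args) {
        // Hypothetical lookup table, filled only with the pairs from the question.
        Map<String, String> words = new HashMap<>();
        words.put("PRU", "prueba");
        words.put("U", "uno");

        String code = extractCode("PRU.U, //      ");
        for (String part : splitCode(code)) {
            System.out.println(part + " -> " + words.getOrDefault(part, "?"));
        }
    }
}
```

Escaping the dot in split("\\.") matters: an unescaped "." matches every character, so the resulting array would come back empty.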

Addendum

In case you need them, you can also use the following matchers:

  • \w matches a word character (letter, digit, or underscore); the uppercase \W matches any non-word character
  • \b matches the start or end of a word (you can put a \b on each side of a pattern, so that whatever matches between them has to begin and end at a word boundary)
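
As a rough illustration of these two matchers, the sketch below collects every run of word characters delimited by word boundaries. The class name WordDemo and the words helper are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordDemo {
    // \b\w+\b : one or more word characters, starting and ending at a word boundary.
    private static final Pattern WORD = Pattern.compile("\\b\\w+\\b");

    // Hypothetical helper: returns each whole word found in the text, in order.
    static List<String> words(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // The dot and the comma are not word characters, so they act as separators.
        System.out.println(words(" EJEM.T,"));
    }
}
```

With this approach the two halves of a code such as EJEM.T come out as separate matches without calling split at all.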
answered 30.11.2018 at 18:28