Univocity Parsers is a library that makes it easy to read CSV files.
For example, to read a CSV file:
CsvParserSettings settings = new CsvParserSettings();
//the file used in the example uses '\n' as the line separator sequence.
//the line separator sequence is defined here so the file is parsed the same way on every system
//(Windows uses '\r\n', and classic Mac OS used '\r').
settings.getFormat().setLineSeparator("\n");
// creates a CSV parser
CsvParser parser = new CsvParser(settings);
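// parses all rows in one go.
// getReader(...) is a helper that is not shown here; it is assumed to return a java.io.Reader for the given resource.
List<String[]> allRows = parser.parseAll(getReader("/examples/example.csv"));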
To do what you describe in the question, it is enough to read the first file and store its rows in a Map, then read the second file, store it in another Map or a List, and finally combine the two.
Here is an example that may help, although you will probably need to adapt it:
First we read one of the CSV files and generate a Map:
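import com.univocity.parsers.common.ParsingContext;
import com.univocity.parsers.common.processor.RowProcessor;
import com.univocity.parsers.csv.CsvParser;
import com.univocity.parsers.csv.CsvParserSettings;
import com.univocity.parsers.csv.CsvWriter;
import com.univocity.parsers.csv.CsvWriterSettings;
import java.io.File;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public static void main(String... args) {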
//First we parse one file (ideally the smaller one)
CsvParserSettings settings = new CsvParserSettings();
//here we tell the parser to read the CSV headers
settings.setHeaderExtractionEnabled(true);
CsvParser parser = new CsvParser(settings);
//Parse all data into a list.
List<String[]> records = parser.parseAll(new File("/path/to/csv1.csv"));
//Convert that list into a map. The first column of this input will produce the keys.
Map<String, String[]> mapOfRecords = toMap(records);
//this is where the magic happens.
processFile(new File("/path/to/csv2.csv"), new File("/path/to/diff.csv"), mapOfRecords);
}
This is the code to generate a Map from the list of records:
/* Converts a list of records to a map. Uses element at index 0 as the key */
private static Map<String, String[]> toMap(List<String[]> records) {
HashMap<String, String[]> map = new HashMap<String, String[]>();
for (String[] row : records) {
//column 0 will always have an ID.
map.put(row[0], row);
}
return map;
}
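One detail of this conversion is worth seeing in isolation: because column 0 is the key, a later row with the same ID silently replaces an earlier one. The following is only a minimal sketch using the plain JDK and made-up in-memory rows instead of a parsed file. The class name ToMapSketch, the sample data, and the decision to skip rows with no columns are assumptions for illustration, not part of the original answer.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ToMapSketch {

    /* Same idea as toMap above: the element at index 0 is the key.
       Assumption: rows with no columns are skipped instead of failing on row[0]. */
    static Map<String, String[]> toMap(List<String[]> records) {
        // LinkedHashMap keeps the order in which IDs were first seen.
        Map<String, String[]> map = new LinkedHashMap<>();
        for (String[] row : records) {
            if (row == null || row.length == 0) {
                continue;
            }
            // put() replaces any row stored earlier under the same ID.
            map.put(row[0], row);
        }
        return map;
    }

    public static void main(String[] args) {
        // Hypothetical sample rows; ID "1" appears twice.
        List<String[]> records = Arrays.asList(
                new String[]{"1", "Ana", "Madrid"},
                new String[]{"2", "Luis", "Lima"},
                new String[]{"1", "Ana", "Bogota"});

        Map<String, String[]> byId = toMap(records);
        System.out.println(byId.size());                    // prints 2
        System.out.println(Arrays.toString(byId.get("1"))); // prints [1, Ana, Bogota]
    }
}
```

If duplicate IDs are possible in your files, you may want to detect them here rather than let the last row win.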
Note: the original question (in English) was about comparing two files. In your case you may instead need to add the missing data from the second file at this stage; the example can be adapted.
With the map of records, we can process the second CSV file and write every row that is new or has changed to an output file:
private static void processFile(final File input, final File output, final Map<String, String[]> mapOfExistingRecords) {
//configures a new parser again
CsvParserSettings settings = new CsvParserSettings();
settings.setHeaderExtractionEnabled(true);
//All parsed rows will be submitted to the following Processor. This way you won't have to store all rows in memory.
settings.setProcessor(new RowProcessor() {
//will write the changed rows to another file
CsvWriter writer;
@Override
public void processStarted(ParsingContext context) {
CsvWriterSettings writerSettings = new CsvWriterSettings(); //configure at will
writer = new CsvWriter(output, writerSettings);
}
@Override
public void rowProcessed(String[] row, ParsingContext context) {
// Incoming rows will have the ID at index 0.
// If the map contains the ID, we'll get the matching row; otherwise we get null.
String[] existingRow = mapOfExistingRecords.get(row[0]);
if (!Arrays.equals(row, existingRow)) {
writer.writeRow(row);
}
}
@Override
public void processEnded(ParsingContext context) {
writer.close();
}
});
CsvParser parser = new CsvParser(settings);
//the parse() method will submit all rows to the RowProcessor defined above. All differences will be
//written to the output file.
parser.parse(input);
}
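To make clear what ends up in the output file, here is the same comparison reduced to in-memory rows with the plain JDK. It is only a sketch: the class CsvDiffSketch, the sample data, and the removedIds method are hypothetical additions, not part of the original answer or of the library. It shows that rows which are new or changed in the second file are reported, while rows that exist only in the first file are not, unless you add a separate step for them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CsvDiffSketch {

    /* Mirrors rowProcessed above: keeps rows that are new or differ from the stored row. */
    static List<String[]> changedOrNew(Map<String, String[]> existing, List<String[]> incoming) {
        List<String[]> out = new ArrayList<>();
        for (String[] row : incoming) {
            // get() returns null for an unknown ID and Arrays.equals(row, null) is false,
            // so rows that only exist in the second file are kept as well.
            if (!Arrays.equals(row, existing.get(row[0]))) {
                out.add(row);
            }
        }
        return out;
    }

    /* Hypothetical extra step: IDs present in the first file but missing from the second. */
    static Set<String> removedIds(Map<String, String[]> existing, List<String[]> incoming) {
        Set<String> removed = new LinkedHashSet<>(existing.keySet());
        for (String[] row : incoming) {
            removed.remove(row[0]);
        }
        return removed;
    }

    public static void main(String[] args) {
        // Hypothetical contents of the first file, keyed by ID.
        Map<String, String[]> existing = new HashMap<>();
        existing.put("1", new String[]{"1", "Ana", "Madrid"});
        existing.put("2", new String[]{"2", "Luis", "Lima"});
        existing.put("3", new String[]{"3", "Marta", "Quito"});

        // Hypothetical contents of the second file: 1 unchanged, 2 changed, 4 new, 3 missing.
        List<String[]> incoming = Arrays.asList(
                new String[]{"1", "Ana", "Madrid"},
                new String[]{"2", "Luis", "Cusco"},
                new String[]{"4", "Eva", "Paris"});

        for (String[] row : changedOrNew(existing, incoming)) {
            System.out.println(Arrays.toString(row)); // prints [2, Luis, Cusco] then [4, Eva, Paris]
        }
        System.out.println(removedIds(existing, incoming)); // prints [3]
    }
}
```

If you need to merge the files rather than compare them, the same lookup by ID can be used to decide which rows from the second file to append to the first.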
Source: Stack Overflow in English: Java compare two csv files