Filter data from a CSV table

1

I have an ASCII file and this I do the conversion from excel and I separate it by spaces or also with commas.

But I want to filter only some data with all its row that have a common pattern (9999) and based on this create a new table that contains the previous data and at the end do a summation of the time. This is an example of my table:

Fecha,     Tiempo,  Error, Dato2,Dato3
21092016, 07:09:50,  5.2,   AA,  BB
21092016, 07:10:30,  9999,  AA,  BB
21092016, 07:11:50,  4.2,   AA,  BB
21092016, 07:12:10,  5.2,   AA,  BB
21092016, 07:15:20,  9999,  AA,  BB
21092016, 07:17:53,  9999,  AA,  BB
21092016, 07:18:10,  5.2,   AA,  BB
21092016, 07:19:00,  4.2,   AA,  BB

I could make a dynamic table, but the curious thing about this filtering is that when I pull the row that contains 9999 I want it to also filter the next data that does not contain 9999 bone (4.2 or 5.2 or less than 9999). This in order to count the time it took that error that would be more or less as it should look the final table:

  Fecha,     Tiempo,  Error, Dato2, Dato3, TiempoError
    21092016, 07:10:30,  9999,  AA,  BB,    00:01:20 (Resta 07:11:50 - 07:10:30)
    21092016, 07:11:50,  4.2,   AA,  BB,       -
    21092016, 07:15:20,  9999,  AA,  BB,    00:02:50 (Resta 07:18:10 - 07:15:20)
    21092016, 07:17:53,  9999,  AA,  BB,       -
    21092016, 07:18:10,  5.2,   AA,  BB,       -
    21092016, 07:20:10,  5.2,   AA,  BB,       -
    21092016, 07:21:10,  4.2,   AA,  BB,       -

                        TOTAL TIEMPO ERROR: 00:04:10

What logic would help me detect the next data, after a 9999? What do you recommend doing, programming c # for csv? Macros excel? Dynamic tables?

I would really appreciate it! Greetings

    
asked by David Tod 25.09.2016 в 04:23
source

1 answer

2

It seems to be a simple CSV , meaning that the fields do not include the string NewLine or the comma. Therefore from c # you can process the file as text, extracting the lines that behave as records:

string path = "ruta del archivo CSV";
string[] lineas = System.IO.File.ReadAllLines(path);

Then with a loop for you can go through register by record, searching for error 9999 :

for ( int i = 0; i < lineas.Length - 1; i++ )    { /* ... */ }

The -1 is important to prevent it from passing the vector boundary; because within the loop the line indexed by i will be accessed, and if it corresponds to the error 9999 , then it will try to read the line i + 1 , from which it is deduced that i can take as value maximum lineas.Length - 2

Example for the solution:

// Establece la ruta completa del archivo CSV.
string path = @"A:\CSV.txt";

// Lee todas las líneas del archivo CSV.
string[] lineas = System.IO.File.ReadAllLines(path);

// El resultado se mostrará en un DataTable.
DataTable dt = new DataTable();
dt.Columns.Add("Fecha");
dt.Columns.Add("Tiempo");
dt.Columns.Add("Error");
dt.Columns.Add("Dato2");
dt.Columns.Add("Dato3");
dt.Columns.Add("TiempoError");

// Acumula el total de la diferencia de tiempos.
TimeSpan total = TimeSpan.Zero;

// Campos para las líneas que se leerán en el bucle for.
string[] campos1, campos2;

// Bucle for para recorrer línea a línea.
for ( int i = 0; i < lineas.Length - 1; i++ )
{
    campos1 = lineas[i].Split(',');

    int num;    Int32.TryParse( campos1[2], out num);

    // Verifica la condición solicitada.
    if (num == 9999)
    {
        campos2 = lineas[i + 1].Split(',');

        // Lee los tiempos para obtener la diferencia.
        DateTime dt1, dt2;

        // TryParseExact() permite especificar un formato personalizado,
        // en este caso es "ddMMyyyyHH:mm:ss"
        DateTime.TryParseExact(campos1[0].Trim() + campos1[1].Trim() , "ddMMyyyyHH:mm:ss",
                               null, System.Globalization.DateTimeStyles.None, out dt1);
        DateTime.TryParseExact(campos2[0].Trim() + campos2[1].Trim() , "ddMMyyyyHH:mm:ss",
                               null, System.Globalization.DateTimeStyles.None, out dt2);

        // Las operaciones entre DateTime siempre se deben realizar con TimeSpan.
        TimeSpan dif = dt2 - dt1;
        total += dif;

        // Se agrega al DataTable la fila del tiempo inicial.
        DataRow dr1 = dt.NewRow();
        dr1.ItemArray = campos1;
        // Incluye la diferencia entre los tiempos de ambas líneas.
        dr1["TiempoError"] = dif.ToString();
        dt.Rows.Add(dr1);

        // Agrega al DataTable la fila del tiempo final, si no cumple
        // con la condición Error == 9999.
        // En caso de cumplir con dicha condición, esta línea será
        // evaluada en la siguiente iteración.
        // Así se evita la duplicación de la fila.
        Int32.TryParse(campos2[2], out num);
        if (num != 9999)
        {
            DataRow dr2 = dt.NewRow();
            dr2.ItemArray = campos2;
            dt.Rows.Add(dr2);
        }
    }    
}
// Agrega la línea que muestra el total de TiempoError.
DataRow dr = dt.NewRow();            
dr[4] = "Total tiempo error:";
dr[5] = total.ToString();
dt.Rows.Add(dr);

// Muestra el resultado en una DataGridView.
this.dataGridView1.DataSource = dt;

Keep in mind that 9999 is what is usually called magic number , so it is convenient to use a constant for it.

Links of interest:

answered by 25.09.2016 / 08:42
source