List of CSV lists in Python

My code reads the CSV linked after the code and modifies its data. The problem is that when I try to save the modified data to a new CSV, only the first row is copied instead of all of them.

Code:

import csv
with open('report_2017_12_06_23_06_16UTC.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        row['fillRate'] = float(row['fillRate'])
        row['fillRate'] = row['fillRate']*100
        row['fillRate'] = '%.2f' % row['fillRate']
        row['fillRate'] = str(row['fillRate'])
        if row['domain'] == row['ddomain']:
            row['domain'] = row['ddomain']
        else: 
            row['domain'] =  row['ddomain']+" (MD != MR)"
        a = (row['trafficChannel'], row['date'], row['domain'], row['country'], row['opportunities'], row['impressions'], row['fillRate'])

        your_list = list(a)

        your_list = ', '.join(your_list)

with open("file.csv", "w") as output:
    for items in your_list:
        output.write(str(items))

CSV

    
asked by Martin Bouhier 07.12.2017 at 15:07
1 answer

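The first problem is that your_list = list(a) runs for every row of the CSV, so each iteration overwrites the previous value and at the end your_list only holds the last row. The later ', '.join(your_list) also turns it into a single string, so the final for loop writes that string character by character. For your approach to work, you should append each modified row to a list (list.append) inside the for loop and afterwards iterate over that list, writing each sublist as a row.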
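A minimal sketch of that list-based approach, assuming Python 3. The helper name collect_rows, the SAMPLE string and the io.StringIO objects are only there to make the example self-contained; with real files you would pass the opened report and write to open("file.csv", "w", newline="") instead.

```python
import csv
import io

# Two rows taken from the question's CSV, standing in for the real report file.
SAMPLE = """trafficChannel,trafficChannelId,date,domain,ddomain,country,opportunities,impressions,fillRate
aaaa,,11/29/17,juegos.com,juegos.com,ES,994,8,0.00804829
aaaa,,11/30/17,urldelivery.com,chatytvgratishd.me,CO,490,1,0.002040816
"""

COLS = ("trafficChannel", "date", "domain", "country",
        "opportunities", "impressions", "fillRate")


def collect_rows(csvfile):
    """Return a list with one sublist per modified row (hypothetical helper)."""
    rows = []
    for row in csv.DictReader(csvfile):
        row["fillRate"] = "{:.2f}".format(float(row["fillRate"]) * 100)
        if row["domain"] != row["ddomain"]:
            row["domain"] = row["ddomain"] + " (MD != MR)"
        # Append inside the loop so every row is kept, not only the last one.
        rows.append([row[col] for col in COLS])
    return rows


rows = collect_rows(io.StringIO(SAMPLE))

output = io.StringIO()  # stand-in for the output file
writer = csv.writer(output)
writer.writerow(COLS)    # header
writer.writerows(rows)   # one sublist per CSV row
result = output.getvalue()
```

Here csv.writer takes care of the separators and quoting, so there is no need to join the values by hand.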

However, building a list is inefficient (the whole CSV ends up in memory and the list is slow to build). Instead, you should use csv.DictWriter to write each row directly as you iterate over the original file.

Your code may look something like this:

import csv

input_file = 'report_2017_12_06_23_06_16UTC.csv'
output_file = "file.csv"

# In Python 3 the csv module recommends opening its files with newline=''
with open(input_file, newline='') as csvfile, \
        open(output_file, "w", newline='') as output:
    reader = csv.DictReader(csvfile)
    writer = csv.DictWriter(output, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        # Fraction -> percentage with two decimals
        row['fillRate'] = '{:.2f}'.format(float(row['fillRate']) * 100)
        # Flag the rows whose domain and ddomain differ
        row['domain'] = (row['ddomain'] if row['domain'] == row['ddomain']
                         else row['ddomain'] + " (MD != MR)")
        writer.writerow(row)

Edit:

If you do not want all the columns, simply pass an iterable with the ones you want to the fieldnames parameter of csv.DictWriter and set the extrasaction parameter to 'ignore':

import csv

input_file = 'report_2017_12_06_23_06_16UTC.csv'
output_file = "file.csv"

# In Python 3 the csv module recommends opening its files with newline=''
with open(input_file, newline='') as csvfile, \
        open(output_file, "w", newline='') as output:
    reader = csv.DictReader(csvfile)
    # Only these columns are written, in this order
    cols = ("trafficChannel", "date", "domain", "country",
            "opportunities", "impressions", "fillRate")
    writer = csv.DictWriter(output, fieldnames=cols, extrasaction='ignore')

    writer.writeheader()
    for row in reader:
        # Fraction -> percentage with two decimals
        row['fillRate'] = '{:.2f}'.format(float(row['fillRate']) * 100)
        # Flag the rows whose domain and ddomain differ
        row['domain'] = (row['ddomain'] if row['domain'] == row['ddomain']
                         else row['ddomain'] + " (MD != MR)")
        writer.writerow(row)

For the example CSV that you show:

trafficChannel,trafficChannelId,date,domain,ddomain,country,opportunities,impressions,fillRate
aaaa,,11/29/17,juegos.com,juegos.com,ES,994,8,0.00804829
aaaa,,11/29/17,vinted.cz,vinted.cz,CZ,203,1,0.004926108
aaaa,,11/29/17,collinsdictionary.com,collinsdictionary.com,BE,421,1,0.002375297
aaaa,,11/30/17,urldelivery.com,chatytvgratishd.me,CO,490,1,0.002040816
aaaa,,11/30/17,androidpit.com.br,androidpit.com.br,BR,125,2,0.016
aaaa,,12/1/17,eredmenyek.com,eredmenyek.com,HU,230,1,0.004347826

We get:

trafficChannel,date,domain,country,opportunities,impressions,fillRate
aaaa,11/29/17,juegos.com,ES,994,8,0.80
aaaa,11/29/17,vinted.cz,CZ,203,1,0.49
aaaa,11/29/17,collinsdictionary.com,BE,421,1,0.24
aaaa,11/30/17,chatytvgratishd.me (MD != MR),CO,490,1,0.20
aaaa,11/30/17,androidpit.com.br,BR,125,2,1.60
aaaa,12/1/17,eredmenyek.com,HU,230,1,0.43
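
As for why extrasaction is needed: by default csv.DictWriter uses extrasaction='raise' and throws a ValueError when a row has keys that are not in fieldnames. A small sketch of the difference, where the row dictionary and the io.StringIO objects are made up for the example and stand in for real data and files:

```python
import csv
import io

# A row with one key (trafficChannelId) that is not among the wanted columns.
row = {"trafficChannel": "aaaa", "trafficChannelId": "", "fillRate": "0.80"}
cols = ("trafficChannel", "fillRate")

# Default extrasaction='raise': the extra key is rejected with ValueError.
strict = csv.DictWriter(io.StringIO(), fieldnames=cols)
try:
    strict.writerow(row)
    raised = False
except ValueError:
    raised = True

# extrasaction='ignore': keys missing from fieldnames are silently dropped.
out = io.StringIO()
lenient = csv.DictWriter(out, fieldnames=cols, extrasaction="ignore")
lenient.writeheader()
lenient.writerow(row)
written = out.getvalue()
```

With 'ignore' you can keep modifying the full row from DictReader and let the writer pick only the columns listed in fieldnames.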
    
answered 07.12.2017 at 16:29