Accessing the elements of a list of lists by column - Python

Hello, I am new to Python and I am struggling with lists.

Here is what I have developed so far.

Currently I have a program that loads the following data from a .csv file:

1,4.0,?,?,none,?
2,2.0,3.0,?,none,38
2,2.5,2.5,?,tc,39

I then store it in a list, splitting each line on the commas, so that I end up with a list of lists of the following form:

['1', '4.0', '?', '?', 'none', '?\n']
['2', '2.0', '3.0', '?', 'none', '38\n']
['2', '2.5', '2.5', '?', 'tc', '39\n']

From that list I have to calculate the average of the elements in each column. For example, the first average uses the elements 1, 2 and 2, the second uses 4.0, 2.0 and 2.5, and so on.

My question is: how do I access those elements? Until now I was looping over the list with a for, but that returns each row, that is, ['1', '4.0', '?', '?', 'none', '?\n'] and so on. After trying different approaches, I cannot get the first element of every row, then the second, and so on until the columns run out.

I have the following function, to which I pass the list described above, in order to obtain the elements of the first column, then the second, and so on.

import sys

def promedio(lista):
    for elem in lista:
        print elem

def main():
    print sys.argv[1]
    lista = []

    with open(sys.argv[1],'r') as f:
        for line in f:
            lista.append(line.split(','))
    print f.close()
    lista.pop()
    True

    promedio(lista)

if __name__ == '__main__':
    main()

In short, main processes the data and then passes the list to a function that will calculate the average of the numeric values (not yet implemented).

Any ideas? Thanks in advance.

    
asked by fiticida 01.11.2017 в 16:54

1 answer

You can use zip or itertools.izip (which returns an iterator in Python 2.x, as zip does in Python 3) to obtain the columns easily. If you pass it a series of iterables, it returns tuples that group the elements found at the same position in each one (as a list in Python 2, as an iterator in Python 3).

The * operator unpacks the outer list, so that each nested list (each row) is passed to zip as a separate argument.
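As a minimal illustration of the unpacking (written for Python 3, and assuming the three rows from the question are hard-coded instead of read from a file), the following sketch shows that zip(*filas) is the same as passing each row to zip by hand:

```python
# Rows as they look after splitting each CSV line on commas (hard-coded sample).
filas = [
    ['1', '4.0', '?', '?', 'none', '?'],
    ['2', '2.0', '3.0', '?', 'none', '38'],
    ['2', '2.5', '2.5', '?', 'tc', '39'],
]

# zip(*filas) unpacks the outer list, so it is equivalent to
# zip(filas[0], filas[1], filas[2]).
columnas = list(zip(*filas))
columnas_explicitas = list(zip(filas[0], filas[1], filas[2]))

print(columnas[0])                      # first column: ('1', '2', '2')
print(columnas == columnas_explicitas)  # True
```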

To avoid including the line breaks in the last column, use the str.strip() method:

import sys 

def promedio(lista):
    columnas = zip(*lista)
    for columna in columnas:
        print(columna)

def main():
    lista = []
    with open(sys.argv[1],'r') as f:
        for line in f:
            lista.append(line.strip().split(','))
    promedio(lista)

if __name__ == '__main__':
    main()

That gives us the following output:

('1', '2', '2')
('4.0', '2.0', '2.5')
('?', '3.0', '2.5')
('?', '?', '?')
('none', 'none', 'tc')
('?', '38', '39')

Another option is to use nested for loops and access the elements by index:

def promedio(lista):
    for i in range(len(lista[0])):
        columna = [fila[i] for fila in lista]
        print(columna)

When you use the with statement (a context manager), you do not have to close the file explicitly; it is closed automatically when the block ends.
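To check this behaviour, here is a small sketch for Python 3. It assumes a throwaway temporary file in place of the real .csv, and the variable names are only illustrative:

```python
import os
import tempfile

# Create a small temporary CSV file so the example is self-contained.
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False) as tmp:
    tmp.write('1,4.0,?\n2,2.0,3.0\n')
    ruta = tmp.name

with open(ruta, 'r') as f:
    filas = [line.strip().split(',') for line in f]
    abierto_dentro = not f.closed  # the file is still open inside the block

cerrado_fuera = f.closed  # closed automatically once the block ends

print(abierto_dentro, cerrado_fuera)
os.remove(ruta)  # clean up the temporary file
```

Calling f.close() after the block is therefore redundant, although it is harmless.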

There are several ways to calculate the average. The first step is to convert the values to float. Values that cannot be converted can be replaced with 0 if you want them to count toward the average. We can do this with a small helper function.

One possibility is the following:

import sys 
import itertools


def promedio(lista):
    media =  [sum(columna)/len(columna) for columna in itertools.izip(*lista)]
    print(media)


def to_float(string):
    try:
        return float(string)
    except ValueError:
        return 0

def main():
    lista = []
    with open(sys.argv[1],'r') as f:
        lista = [[to_float(valor) for valor in line.strip().split(',')] for line in f]
    promedio(lista)

if __name__ == '__main__':
    main()

This gives us a list with the mean of each column (the two 0 entries correspond to the columns that contain no numeric values):

[1.6666666666666667, 2.8333333333333335, 1.8333333333333333, 0, 0, 25.666666666666668]
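Treating unknown values as 0 pulls the averages down. If the missing entries should be left out instead, one possible variant is sketched below. It is written for Python 3, where the built-in zip already returns an iterator. The helper names to_float_or_none and promedio_sin_faltantes are hypothetical, and the sample rows are hard-coded. Values that cannot be converted become None, and only the remaining ones are averaged:

```python
def to_float_or_none(texto):
    """Return texto as a float, or None if it is not numeric."""
    try:
        return float(texto)
    except ValueError:
        return None


def promedio_sin_faltantes(filas):
    """Mean of each column, ignoring non-numeric entries.

    A column with no numeric values gets None instead of a mean.
    """
    medias = []
    for columna in zip(*filas):
        numeros = [v for v in map(to_float_or_none, columna) if v is not None]
        medias.append(sum(numeros) / len(numeros) if numeros else None)
    return medias


filas = [
    ['1', '4.0', '?', '?', 'none', '?'],
    ['2', '2.0', '3.0', '?', 'none', '38'],
    ['2', '2.5', '2.5', '?', 'tc', '39'],
]
medias = promedio_sin_faltantes(filas)
print(medias)
```

With the sample rows, the third column averages only 3.0 and 2.5, giving 2.75, the last column gives 38.5, and the two columns without numeric data yield None.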

    
answered by 01.11.2017 / 17:36