openpyxl: import a cell range and convert it into a dictionary

0

In an Excel sheet I have a table with two columns (Date and Value). The dates are in dd/mm/yy format. With the script below I import the data contained in the range A63:B80 of that sheet.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import openpyxl
# Open the file and store it in doc
doc = openpyxl.load_workbook('cotizaciones.xlsx')
# available sheets
print doc.get_sheet_names()
# select the sheet to work with
hoja = doc.get_sheet_by_name('Sheet')
hoja.title
# select a range
seleccion = hoja['A63':'B80']
dicc_cotiz = {}
for filas in seleccion:
    for columnas in filas:
        print columnas.value

With this loop I can see the imported data, in this format:

2017-06-15 00:00:00
44.14

I have two questions: a) How can I create a dictionary from this data? b) How can I change the format of the date so that it becomes the dictionary key in the form 2017-06-15 or, better yet, 15/06/2017?

    
asked by efueyo on 18.09.2017 at 20:25

2 answers

1

Since the dates (if they are parsed correctly) are datetime objects, you just have to use strftime to convert them to a string in the desired format. With a dictionary comprehension it would look something like this:

dicc_cotiz = {fecha.value.strftime('%d/%m/%Y'): valor.value
                  for fecha,  valor in seleccion}

strftime takes a string with the desired output format, in this case day/month/year. You can modify it to your liking; the Python documentation for the datetime module lists the directives that can be used to specify day, month, year, time, etc.
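As a quick illustration of those directives, here is a minimal sketch that formats a hard-coded datetime in the two styles asked about in the question. The variable fecha is only a stand-in for the value of a date cell; it does not come from the workbook.

```python
from datetime import datetime

# Hypothetical stand-in for the value read from a date cell.
fecha = datetime(2017, 6, 15)

formato_dia_mes = fecha.strftime('%d/%m/%Y')  # day/month/four-digit year
formato_iso = fecha.strftime('%Y-%m-%d')      # year-month-day

print(formato_dia_mes)  # 15/06/2017
print(formato_iso)      # 2017-06-15
```

%d is the zero-padded day, %m the zero-padded month and %Y the four-digit year; use %y if you want a two-digit year.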

If dictionary comprehensions confuse you, the same code with a regular for loop would be:

dicc_cotiz = {}
for fecha, valor in seleccion:
    dicc_cotiz[fecha.value.strftime('%d/%m/%Y')] = valor.value

Since you select two columns, every row is a tuple of two cells. Instead of using indices, which are less clear, you can unpack the tuple directly in the for statement; that is what for fecha, valor in seleccion does.
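To see that unpacking and indexing give the same result, here is a small sketch that can be run without a workbook. Celda is a hypothetical stand-in that models only the .value attribute of an openpyxl cell, and the sample rows are made up.

```python
from collections import namedtuple
from datetime import datetime

# Hypothetical stand-in for an openpyxl cell: only .value is modelled.
Celda = namedtuple('Celda', ['value'])

# Made-up rows shaped like the selection: a tuple of (date cell, value cell) tuples.
seleccion = (
    (Celda(datetime(2017, 6, 15)), Celda(44.14)),
    (Celda(datetime(2017, 6, 16)), Celda(45.02)),
)

# Access by index.
por_indice = {}
for fila in seleccion:
    por_indice[fila[0].value.strftime('%d/%m/%Y')] = fila[1].value

# Unpack each row directly in the for statement.
por_desempaquetado = {}
for fecha, valor in seleccion:
    por_desempaquetado[fecha.value.strftime('%d/%m/%Y')] = valor.value

print(por_indice == por_desempaquetado)  # True
```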

The output would be (example):

>>> dicc_cotiz
{'15/06/2017': 44, '16/06/2017': 45, '17/06/2017': 14}

Keep in mind that if you have repeated dates, only the last one will be stored (a dictionary cannot have duplicate keys).

I am assuming that all date cells hold valid date values; otherwise you will have to filter out those rows.
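One possible way to do that filtering, as a sketch only: keep a row only when the value of its first cell is a datetime. Celda is again a hypothetical stand-in for an openpyxl cell and the rows are made up; empty cells are represented by None, which is what openpyxl returns for them.

```python
from collections import namedtuple
from datetime import datetime

# Hypothetical stand-in for an openpyxl cell: only .value is modelled.
Celda = namedtuple('Celda', ['value'])

# Made-up selection containing one valid row and two rows to discard.
seleccion = (
    (Celda(datetime(2017, 6, 15)), Celda(44.14)),
    (Celda(None), Celda(None)),            # empty row
    (Celda('no es fecha'), Celda(12.0)),   # text where a date was expected
)

# Keep only the rows whose first cell actually holds a datetime.
dicc_cotiz = {
    fecha.value.strftime('%d/%m/%Y'): valor.value
    for fecha, valor in seleccion
    if isinstance(fecha.value, datetime)
}

print(dicc_cotiz)  # {'15/06/2017': 44.14}
```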

    
answered on 18.09.2017 at 21:07
1

For the values read to be turned into a dictionary without losing data, each date must be unique. Keep in mind that if a date is repeated, the dictionary will hold only the last value read for it. That is, with these cells:

15/06/2016  44,14
15/06/2016  50

The resulting dictionary would be {"15/06/2016": 50}
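If repeated dates need to be kept rather than overwritten, one option (not something the question asks for, just a sketch) is to collect the values of each date in a list. The rows below reuse the two example cells, already converted to (formatted date, value) pairs.

```python
from collections import defaultdict

# The two example rows, already converted to (formatted date, value) pairs.
filas = [('15/06/2016', 44.14), ('15/06/2016', 50)]

# Plain assignment: the second row overwrites the first.
ultimo = {}
for fecha, valor in filas:
    ultimo[fecha] = valor

# Alternative: accumulate every value seen for each date.
todos = defaultdict(list)
for fecha, valor in filas:
    todos[fecha].append(valor)

print(ultimo)       # {'15/06/2016': 50}
print(dict(todos))  # {'15/06/2016': [44.14, 50]}
```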

By adapting your code a bit, you could do the following:

dicc_cotiz = {}
for fila in seleccion:
    fecha = fila[0].value.strftime("%d/%m/%Y")
    valor = fila[1].value
    dicc_cotiz[fecha] = valor

First we iterate over the selection, which yields one row at a time, each row being a tuple of cells. We access the cells by index: fila[0] is the cell in the first column of the row being read.

As for column A, from what you describe you are receiving a datetime object, so calling its strftime method takes care of the format.

We update the dictionary in the usual way: diccionario[clave] = valor.

    
answered on 18.09.2017 at 21:15