Get list of child nodes in XML

Question

I have a problem with a Python exercise in which I have to use an XML file. The statement is this:

Given a list of province identifiers, show the name of each province and all the municipalities corresponding to the identifiers in the list.

The file in question is very long, so here is a fragment to show roughly what needs to be done:

<lista>

   <provincia id="01">

      <nombre>Alava</nombre>

   <localidades>

     <localidad c="0">Aberasturi</localidad>
     <localidad c="0">Abetxuko</localidad>
     <localidad c="0">Abezia</localidad>
     <localidad c="0">Abornikano</localidad>
     <localidad c="0">Acebedo</localidad>

   <provincia id="02">

      <nombre>Barcelona</nombre>

   <localidades>

     <localidad c="0">ej1</localidad>
     <localidad c="0">ej2</localidad>
     <localidad c="0">ej3</localidad>
     <localidad c="0">ej4</localidad>
     <localidad c="0">ej5</localidad>

The thing is, I need to get the format:

 Provincia: Alava 01

 Localidades:

 Aberasturi 01
 Abetxuko 01
 Abezia 01

 etc...

and so on for each province, but I do not know how to do it. I have managed to get the provinces with their respective localities, but I am not able to get the id of each province and print it after the province name and after each of its localities.

Here is my code, in case anyone can help:

from lxml import etree

doc = etree.parse('provinciasypoblaciones.xml')
raiz=doc.getroot()

for i in range(len(raiz)):
    provincia=raiz[i]
    print(provincia[0].text)

    for j in range(len(provincia[1])):
        print(provincia[1][j].text)

python python-3.x xml

asked by ShJod 15.02.2018 at 17:02

1 answer

score 1 · Answer 1

Well, the idea is already there. You only need to access the attributes, for example through the attrib dictionary of each node.
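
As a minimal sketch of that attribute access: the snippet below uses the standard library's xml.etree.ElementTree on an inline string so that it runs without the file. That substitution is an assumption made for illustration; lxml.etree elements expose the same attrib and get() for this purpose.

```python
import xml.etree.ElementTree as ET

# Inline stand-in for one <provincia> node of the real file (illustrative only).
provincia = ET.fromstring('<provincia id="01"><nombre>Alava</nombre></provincia>')

# attrib is a dict holding the node's XML attributes.
print(provincia.attrib)           # {'id': '01'}
print(provincia.attrib['id'])     # 01

# get() returns a default instead of raising KeyError when the attribute is missing.
print(provincia.get('id'))        # 01
print(provincia.get('c', 'n/a'))  # n/a
```

Attribute values always come back as strings, so '01' keeps its leading zero.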

I recommend not using range and indexing to go through a list. Iterate directly with for ... in instead, which is more readable and usually a little more efficient. One possible way to get what you want would be:

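from lxml import etree

doc = etree.parse('provinciasypoblaciones.xml')
raiz = doc.getroot()

# Each child of the root is a <provincia> element
for provincia in raiz:
    # Look up the two children by tag name instead of by position
    nombre = provincia.find('nombre')
    localidades = provincia.find('localidades')
    # The province id lives in the element's attribute dictionary
    id_provincia = provincia.attrib['id']
    print('\nProvincia: {} {}\n  Localidades:'.format(nombre.text, id_provincia))
    for localidad in localidades:
        print('    {} {}'.format(localidad.text, id_provincia))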

For your example XML (after adding the missing closing tags), that would print:

Provincia: Alava 01
  Localidades:
    Aberasturi 01
    Abetxuko 01
    Abezia 01
    Abornikano 01
    Acebedo 01

Provincia: Barcelona 02
  Localidades:
    ej1 02
    ej2 02
    ej3 02
    ej4 02
    ej5 02
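
The exercise statement also asks to show only the provinces whose id appears in a given list. A possible sketch of that filter follows. The names listar_provincias and ids_buscados are made up for illustration, the XML is a shortened inline copy of the fragment from the question with its tags properly closed, and it is parsed with the standard library's xml.etree.ElementTree so that it runs without the file. With lxml you would presumably pass doc.getroot() as raiz instead.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed copy of the fragment from the question (illustrative data).
XML = """
<lista>
  <provincia id="01">
    <nombre>Alava</nombre>
    <localidades>
      <localidad c="0">Aberasturi</localidad>
      <localidad c="0">Abetxuko</localidad>
    </localidades>
  </provincia>
  <provincia id="02">
    <nombre>Barcelona</nombre>
    <localidades>
      <localidad c="0">ej1</localidad>
      <localidad c="0">ej2</localidad>
    </localidades>
  </provincia>
</lista>
"""


def listar_provincias(raiz, ids_buscados):
    """Return the output lines for the provinces whose id is in ids_buscados."""
    buscados = set(ids_buscados)  # set membership tests are fast
    lineas = []
    for provincia in raiz.findall('provincia'):
        id_prov = provincia.get('id')
        if id_prov not in buscados:
            continue  # skip provinces that were not requested
        lineas.append('Provincia: {} {}'.format(provincia.findtext('nombre'), id_prov))
        lineas.append('  Localidades:')
        # iter() walks every <localidad> descendant of this province
        for localidad in provincia.iter('localidad'):
            lineas.append('    {} {}'.format(localidad.text, id_prov))
    return lineas


raiz = ET.fromstring(XML.strip())
print('\n'.join(listar_provincias(raiz, ['02'])))
```

With ids_buscados set to ['02'], only the Barcelona block is printed. The ids are compared as strings, so the list should contain '02' rather than the integer 2.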