How to parse an XML document and print it to the console


Good afternoon. I need help parsing an XML document in Python; specifically, I want to know the best way to print it to the console (bash or cmd). I have already managed to parse the document, but some of its elements are repeated, for example <p>. How do I print every element together with its children when elements repeat? In other words, I want to print <p> and its children (tag and text), and print it again each time it repeats. Could you explain how to do this? Thanks in advance to anyone who can help. Regards!

XML document:

<?xml version="1.0" encoding="UTF-8"?>
    <r_l>text 1</r_l>
      <t>text 2</t>
      <o>text 3</o>
      <d>text 4</d>
        <t>text 5</t>
        <i_n>text 6</i_n>
        <ln>text 7</ln>
        <fn>text 8</fn>
        <fi>text 9</fi>
        <t>text 10</t>
        <i_n>text 11</i_n>
        <ln>text 12</ln>
        <fn>text 13</fn>
        <p_t>text 14</p_t>
        <fi>text 15</fi>
asked by mega_yizuz 30.05.2017 at 20:28

1 answer


If you do not want to use external libraries, you can use the code from this answer to a similar question in English. That answer defines a function, indent, that adds the spaces and line breaks needed to show each element of the XML tree at a position matching its depth. It works by rewriting the whitespace-only text and tail of each element in place, so the tree has to be serialized afterwards to see the result. I include the code here for convenience.

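from xml.etree import ElementTree

def indent(elem, level=0):
    """Rewrite whitespace-only text/tail in place so the tree prints indented."""
    i = "\n" + level * "  "  # newline plus this element's own indentation
    if len(elem):
        # Text before the first child: indent one level deeper.
        if not elem.text or not elem.text.strip():
            elem.text = i + "  "
        if not elem.tail or not elem.tail.strip():
            elem.tail = i
        for subelem in elem:
            indent(subelem, level + 1)
        # The last child's tail precedes this element's closing tag.
        if not subelem.tail or not subelem.tail.strip():
            subelem.tail = i
    else:
        # Leaf: its tail places whatever follows at this element's depth.
        if level and (not elem.tail or not elem.tail.strip()):
            elem.tail = i
    return elem

root = ElementTree.parse('mi_archivo.xml').getroot()
indent(root)
ElementTree.dump(root)  # writes the indented tree to stdout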
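
On Python 3.9 or newer, the standard library includes ElementTree.indent, which does the same job as the hand-written function above. The following is a minimal sketch under that version assumption; the sample string and the wrapper names <doc> and <p> are made up for illustration, since the full structure of the original file is not shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: <doc> and <p> are illustrative wrapper names.
SAMPLE = (
    "<doc>"
    "<p><t>text 5</t><fi>text 9</fi></p>"
    "<p><t>text 10</t><fi>text 15</fi></p>"
    "</doc>"
)

def pretty(xml_text):
    """Parse xml_text and return it indented with two spaces per level."""
    root = ET.fromstring(xml_text)
    ET.indent(root, space="  ")  # available from Python 3.9
    return ET.tostring(root, encoding="unicode")

if __name__ == "__main__":
    print(pretty(SAMPLE))
```

For a file on disk, the same call should work on the tree returned by ET.parse('mi_archivo.xml').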

If you prefer to use an external library, you can use BeautifulSoup. Its 'xml' parser relies on lxml, so install both. Depending on your version of Python, run:

# Python 2
pip install beautifulsoup4 lxml
# Python 3
pip3 install beautifulsoup4 lxml

Finally, to use it in your source code:

from bs4 import BeautifulSoup

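# The 'xml' feature requires lxml to be installed.
with open('mi_archivo.xml') as f:
    bs = BeautifulSoup(f, 'xml')
print(bs.prettify())  # parenthesized form works on Python 2 and 3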
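
If the goal is to print each repeated element separately with the tag and text of its children, rather than the whole document, one possible approach with the standard library is sketched below. It assumes Python 3; the function name describe and the sample data (including the <doc> wrapper) are hypothetical and only there to make the example self-contained.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample with a repeated <p> element.
SAMPLE = """<doc>
  <p><t>text 5</t><i_n>text 6</i_n></p>
  <p><t>text 10</t><i_n>text 11</i_n></p>
</doc>"""

def describe(elem, level=0):
    """Return one 'tag: text' line per element, indented by depth."""
    text = (elem.text or "").strip()
    label = "{}: {}".format(elem.tag, text) if text else elem.tag
    lines = ["  " * level + label]
    for child in elem:
        lines.extend(describe(child, level + 1))
    return lines

if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    # iter("p") visits every <p> in document order, however often it repeats.
    for p in root.iter("p"):
        print("\n".join(describe(p)))
```

Because describe is recursive, children nested more deeply inside <p> are printed as well, each one level further to the right.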
answered by 31.05.2017 / 11:51