Classify selected nodes by attribute with XPath [duplicate]


I am scraping a website whose data is in a table with the following structure:

<tbody>
   <tr class='Leaguestitle'>
      <td>...</td>
      <td>...</td>
   </tr>
   <tr id='tr1_abababa'>
      <td>...</td>
      <td>...</td>
   </tr>
   <tr id='tr2_abababa'>..</tr>
    .
    .
   <tr id='tr1_acacaca'>..</tr>
   <tr id='tr2_acacaca'>..</tr>
   <tr align='center'>..</tr>
    .
    .
   <tr id='tr1_cbcbcbc'>..</tr>
   <tr id='tr2_cbcbcbc'>..</tr>
</tbody>

This structure repeats periodically. I am interested in three kinds of nodes: the one with a class attribute, which gives me a header; those whose id attribute contains tr1; and the one with an align attribute, which marks the end of the data I am interested in. To get them, I build a list with the three types of nodes like this:

allrows = table.find_elements_by_xpath("//tr[@class='Leaguestitle' or contains(@id,'tr1') and not (@align='center')]")

I want to iterate over the list and, depending on the node, send it to one sublist if it has the class attribute, to another if it has the id attribute, and end the program if it is the node with the align attribute.

The problem is that the selected tr nodes do not have the structure shown at the beginning, that is:

<tr id='tr1_abababa'>
  <td>...</td>
  <td>...</td>
</tr>

Instead, I get:

  <td>...</td>
  <td>...</td>

Since the tr node's id attribute is not present, and neither are class or align, I cannot route the node to one list or the other.

How could I do this classification in a Pythonic way?

    
asked by puppet 23.12.2017 at 22:50

1 answer


When you iterate over the allrows list, each element is a WebElement, and you can ask a WebElement for any of its attributes with get_attribute(name), where name is the attribute's name (getAttribute is the Java spelling; the Python bindings use get_attribute). For example:

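titleList = []
tr1List = []
for i in allrows:
    # i is a WebElement
    # i.get_attribute("class") should be a string
    # (get_attribute is the Python spelling of Java's getAttribute)
    if i.get_attribute("class") == "Leaguestitle":
        titleList.append(i)
    elif "tr1" in (i.get_attribute("id") or ""):
        tr1List.append(i)
    elif i.get_attribute("align") == "center":
        break  # end of the data of interest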

Using a test such as "tr1" in i.get_attribute("id") and other string operations, you can classify all the rows in a single loop.
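The single-loop idea can be sketched without a browser. This is only an illustration under some assumptions: FakeRow is a hypothetical stand-in that mimics nothing but get_attribute, classify_rows is a name made up here, and the list is assumed to include the align='center' row (the XPath in the question excludes it, so it would need something like "or @align='center'" instead of the "and not" clause).

```python
class FakeRow:
    """Hypothetical stand-in for a Selenium WebElement (only get_attribute)."""

    def __init__(self, **attrs):
        self._attrs = attrs

    def get_attribute(self, name):
        # Missing attributes come back as None here; the loop guards for that.
        return self._attrs.get(name)


def classify_rows(rows):
    """Split rows into title rows and 'tr1' rows, stopping at align='center'."""
    titles, data = [], []
    for row in rows:
        if row.get_attribute("align") == "center":
            break  # end-of-data marker: ignore everything after it
        if row.get_attribute("class") == "Leaguestitle":
            titles.append(row)
        elif "tr1" in (row.get_attribute("id") or ""):
            data.append(row)
    return titles, data


# Made-up rows in the same order as the table in the question.
rows = [
    FakeRow(**{"class": "Leaguestitle"}),
    FakeRow(id="tr1_abababa"),
    FakeRow(id="tr2_abababa"),
    FakeRow(id="tr1_acacaca"),
    FakeRow(align="center"),
    FakeRow(id="tr1_cbcbcbc"),
]
titles, data = classify_rows(rows)
print(len(titles), [r.get_attribute("id") for r in data])
# prints: 1 ['tr1_abababa', 'tr1_acacaca']
```

With real WebElements the same function should work unchanged, since it only relies on get_attribute.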

Another option is to build two separate lists with two different find_elements calls. That way you do not have to classify anything; you get the two lists already separated.

# The leading '.' keeps the search inside table; a bare '//' searches the whole page.
leaguesTitlerows = table.find_elements_by_xpath(".//tr[@class='Leaguestitle']")
tr1rows = table.find_elements_by_xpath(".//tr[contains(@id,'tr1') and not (@align='center')]")
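The two-query approach can also be tried offline. The sketch below swaps Selenium for Python's standard xml.etree.ElementTree and runs on made-up sample markup, so it is an approximation rather than the code above: ElementTree supports only a subset of XPath and has no contains(), so the id test is done in Python.

```python
import xml.etree.ElementTree as ET

# Made-up sample mirroring the table in the question.
SAMPLE = """
<tbody>
  <tr class="Leaguestitle"><td>League A</td></tr>
  <tr id="tr1_abababa"><td>match 1</td></tr>
  <tr id="tr2_abababa"><td>detail 1</td></tr>
  <tr align="center"><td>end</td></tr>
</tbody>
"""

table = ET.fromstring(SAMPLE)

# One query per row type; no classification loop is needed afterwards.
title_rows = table.findall(".//tr[@class='Leaguestitle']")
# ElementTree has no contains(), so filter the rows that have an id in Python.
tr1_rows = [tr for tr in table.findall(".//tr[@id]") if "tr1" in tr.get("id")]

print([tr.get("class") for tr in title_rows])  # prints: ['Leaguestitle']
print([tr.get("id") for tr in tr1_rows])       # prints: ['tr1_abababa']
```

One trade-off to keep in mind: two separate lists no longer record where each title sits relative to the data rows, so if you need to know which rows belong to which header, or where the align='center' marker falls, the single loop is the safer choice.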
    
answered 02.01.2018 at 15:37