Find the positions of several characters in a string

0

I'm stuck on an exercise that I haven't been able to solve yet. I have a string, dna = "ATGCGAGTTGATA", and I need to find the positions of the letters G and C. This is what I have tried so far:

for base in dna:
    if base == "G":
        print(dna.find("G"))
        continue

However, this only prints the first position of G, not the other positions where G occurs. I would really appreciate any help.

    
asked by Aaron 10.11.2018 at 23:27

4 answers

1

Using a dictionary (Python 3.6 or later, since it relies on f-strings):

dna = "ATGCGAGTTGATA"

resultados = {'G': [], 'C': []}

for indice, valor in enumerate(dna):
    if valor in ['G', 'C']:
        resultados[valor].append(indice)

print(f"Positions of G: {resultados['G']}")
print(f"Positions of C: {resultados['C']}")
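
As a minimal sketch, the same idea can be wrapped in a reusable function in case more than two letters need to be tracked. The name find_positions and its parameters are hypothetical and are not part of the answer above.

```python
def find_positions(sequence, letters):
    """Map each letter in `letters` to the list of indices where it occurs."""
    # Hypothetical helper: start with one empty list per letter of interest.
    positions = {letter: [] for letter in letters}
    for index, char in enumerate(sequence):
        if char in positions:
            positions[char].append(index)
    return positions


dna = "ATGCGAGTTGATA"
result = find_positions(dna, "GC")
print(result)  # {'G': [2, 4, 6, 9], 'C': [3]}
```

Using the dictionary itself for the membership test keeps the letters of interest in a single place.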
    
answered 11.11.2018 at 19:17
0

The call dna.find("G") searches the whole string for the letter G again, and it only ever returns the first match. Your for loop already visits every character, so you do not need find at all. You do not need the continue either: it skips the rest of the loop body and moves on to the next iteration, and since no code follows it, it has no effect here. What you are looking for is simple: use enumerate to get each character together with its index.

dna = "ATGCGAGTTGATA"
for i, base in enumerate(dna):
    if base == 'G':
        print('G found at index ' + str(i))

I hope this points you in the right direction.

    
answered 11.11.2018 at 00:32
0

To solve your problem, you must add the following to your code:

  • A counter variable to keep track of the current position
  • A list to save the positions of the letter G
  • A list to save the positions of the letter C

The code would look like this:

dna = "ATGCGAGTTGATA"
pos = 0  # Counter for the current position
posG = []  # Positions of the letter G
posC = []  # Positions of the letter C

for base in dna:
    if base == "G":
        posG.append(pos)
    if base == "C":
        posC.append(pos)
    pos += 1

print("Positions of G: " + str(posG))
print("Positions of C: " + str(posC))

Your original code does not work because find returns only the lowest index at which the searched substring occurs.
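
If you would still rather use find, one possible sketch is to call it repeatedly and pass the optional start argument so that each search resumes after the previous match. The helper name all_indices is hypothetical.

```python
def all_indices(text, sub):
    """Collect every index at which `sub` occurs in `text`, using str.find."""
    indices = []
    start = text.find(sub)
    while start != -1:  # find returns -1 when there is no further match
        indices.append(start)
        # Resume the search one position past the previous match.
        start = text.find(sub, start + 1)
    return indices


dna = "ATGCGAGTTGATA"
pos_g = all_indices(dna, "G")
pos_c = all_indices(dna, "C")
print(pos_g)  # [2, 4, 6, 9]
print(pos_c)  # [3]
```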

    
answered 11.11.2018 at 00:32
0

You can use regular expressions, through the re library, to search for character patterns.

finditer returns an iterator of match objects, each of which gives the matched text and the position where it starts:

import re

dna="ATGCGAGTTGATA"

g = re.compile("G")
c = re.compile("C")

g_find = g.finditer(dna)
c_find = c.finditer(dna)

for x in g_find:
    print("Found " + x.group() + " at position " + str(x.start()))
for x in c_find:
    print("Found " + x.group() + " at position " + str(x.start()))
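
As a variation, and only as a sketch, a single character class such as [GC] can find both letters in one pass over the string. Grouping the results in a dictionary is an assumption about the output format you want, and the variable name positions is hypothetical.

```python
import re

dna = "ATGCGAGTTGATA"

# One list of positions per letter; the pattern [GC] matches either letter.
positions = {"G": [], "C": []}
for match in re.finditer("[GC]", dna):
    positions[match.group()].append(match.start())

print(positions)  # {'G': [2, 4, 6, 9], 'C': [3]}
```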
    
answered 12.11.2018 at 16:54