The problem is that you never compare a base in one sequence with the corresponding base in the other to check whether a mutation occurred. You could do something like:
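
    # Transitions: purine <-> purine (A, G) or pyrimidine <-> pyrimidine (C, T)
    transicion = {'A':'G', 'G':'A', 'T':'C', 'C':'T'}
    # Transversions: purine <-> pyrimidine
    transversion = {'A':('T','C'), 'G':('T','C'), 'T':('A','G'), 'C':('A','G')}

    def rel_trans(s1, s2):
        res = 0
        # Walk both sequences in parallel, one pair of bases at a time
        for i, j in zip(s1, s2):
            if transicion[i] == j or j in transversion[i]:
                res += 1
        return res
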
Or using a generator expression:

    def rel_trans(s1, s2):
        return sum(transicion[i] == j or j in transversion[i] for i, j in zip(s1, s2))

    s1 = 'CAACGCA'
    s2 = 'TGTCTGA'
    print(rel_trans(s1, s2))

Output:

    5

This makes little sense unless you count transitions and transversions separately. The approach above counts every possible substitution, and for that you do not need the dictionaries (unless you are looking only for some specific substitutions rather than all 12 possible ones). Provided both strings are valid sequences that contain only the characters A, G, T and C, it is enough to count the positions where the bases differ:

    def rel_trans(s1, s2):
        # Any position where the two bases differ is a substitution
        return sum(i != j for i, j in zip(s1, s2))

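To check the claim above, the following sketch compares the dictionary-based test with a plain `i != j` over all 16 ordered pairs of bases. It is only an illustration: `is_substitution_dict` and `count_substitutions` are hypothetical names that do not appear in the question, and the validation step assumes that unequal lengths or characters other than A, G, T, C should raise an error.

```python
from itertools import product

BASES = "AGTC"
transicion = {'A': 'G', 'G': 'A', 'T': 'C', 'C': 'T'}
transversion = {'A': ('T', 'C'), 'G': ('T', 'C'), 'T': ('A', 'G'), 'C': ('A', 'G')}


def is_substitution_dict(i, j):
    """Dictionary-based test, as used in rel_trans above."""
    return transicion[i] == j or j in transversion[i]


# For valid bases the dictionary test is the same as "the bases differ".
assert all(is_substitution_dict(i, j) == (i != j)
           for i, j in product(BASES, repeat=2))


def count_substitutions(s1, s2):
    """Count differing positions after validating the input (assumed policy)."""
    if len(s1) != len(s2):
        raise ValueError("sequences must have the same length")
    invalid = set(s1 + s2) - set(BASES)
    if invalid:
        raise ValueError("invalid characters: {}".format(sorted(invalid)))
    return sum(i != j for i, j in zip(s1, s2))


print(count_substitutions('CAACGCA', 'TGTCTGA'))  # 5
```

Without the validation, `zip` silently stops at the shorter sequence, and the dictionary-based version raises a `KeyError` on any character that is not one of the four bases.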
Something more informative would be:

    transicion = {'A':'G', 'G':'A', 'T':'C', 'C':'T'}
    transversion = {'A':('T','C'), 'G':('T','C'), 'T':('A','G'), 'C':('A','G')}

    def mutaciones_por_sustitucion(s1, s2):
        # Each entry is (1-based position, base in s1, base in s2)
        ts = [(ind+1, i, j) for ind, (i, j) in enumerate(zip(s1, s2)) if transicion[i] == j]
        tv = [(ind+1, i, j) for ind, (i, j) in enumerate(zip(s1, s2)) if j in transversion[i]]

        print('Encontradas {} mutaciones por sustitucion:'.format(len(ts)+len(tv)))
        print(' Transiciones ({}): '.format(len(ts)))
        for m in ts:
            print(' Posicion {}: {} cambiada por {}'.format(m[0], m[1], m[2]))
        print(' Transversiones ({}): '.format(len(tv)))
        for m in tv:
            print(' Posicion {}: {} cambiada por {}'.format(m[0], m[1], m[2]))

    s1 = 'CAACGCA'
    s2 = 'TGTCTGA'
    mutaciones_por_sustitucion(s1, s2)

Output:

    Encontradas 5 mutaciones por sustitucion:
     Transiciones (2): 
     Posicion 1: C cambiada por T
     Posicion 2: A cambiada por G
     Transversiones (3): 
     Posicion 3: A cambiada por T
     Posicion 5: G cambiada por T
     Posicion 6: C cambiada por G
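
If you would rather get the data back than print it, a variant along these lines might be handy. This is a minimal sketch, not part of the original code: `classify_substitutions`, `PURINES` and `PYRIMIDINES` are hypothetical names, and instead of the two dictionaries it relies on the fact that a transition keeps the base within the same chemical class (purine or pyrimidine) while a transversion switches class. It assumes both strings contain only A, G, T and C; any other character would be wrongly reported as a transversion.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def classify_substitutions(s1, s2):
    """Return (transitions, transversions) as lists of (position, old, new).

    Positions are 1-based, matching the printed report above.
    """
    ts, tv = [], []
    for pos, (i, j) in enumerate(zip(s1, s2), start=1):
        if i == j:
            continue  # no mutation at this position
        # Same chemical class -> transition; different class -> transversion
        same_class = {i, j} <= PURINES or {i, j} <= PYRIMIDINES
        (ts if same_class else tv).append((pos, i, j))
    return ts, tv


ts, tv = classify_substitutions('CAACGCA', 'TGTCTGA')
print(len(ts), len(tv))  # 2 3
```

Returning the two lists lets the caller decide what to do with them, for example computing the ratio `len(ts) / len(tv)` when `tv` is not empty.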