Count all the characters except a few

I'm trying to count all the characters in a string except for four that I want to exclude.

For example:

Using something like s.count, I want to count every character of a string s except four of them: a, c, t and g. How can I count all the characters except those four?

A concrete example would be:

s = 'acaaaaattgggaaacccccbvbm2xyyuuooopp5585'
s.count('a')
s.count('t')
s.count('g')
s.count('c')
# wanted (pseudocode): s.count(all except 'a', 't', 'c', 'g')
    
asked by user7491985 28.09.2017 at 22:59

2 answers

One of the simplest ways is to use a list comprehension:

s = 'acaaaaattgggaaacccccbvbm2xyyuuooopp5585'
excepto = ['a','t','c','g']

print("Count: {0}".format(len([l for l in s if l not in excepto])))

The expression [l for l in s if l not in excepto] builds a list of the characters of s that are not in the list excepto. All that remains is to take its length with len to get the number of characters.
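
The same idea can be written without building the intermediate list. The following is only a sketch: the function name count_except is hypothetical, and converting the exclusions to a set is an assumption made here to keep membership tests cheap.

```python
def count_except(text, excluded):
    """Count the characters of text that are not in excluded."""
    excluded = set(excluded)  # set lookups are O(1) on average
    # sum() over a generator expression avoids creating a temporary list
    return sum(1 for ch in text if ch not in excluded)


s = 'acaaaaattgggaaacccccbvbm2xyyuuooopp5585'
print(count_except(s, 'atcg'))  # 19
```

For a string as short as the one in the question, the difference from the list comprehension should be negligible; the generator form mainly helps with very long strings.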

    
answered 28.09.2017 at 23:14

Following the same idea that Patricio explains in his answer, a somewhat more efficient approach (although the difference will not be noticeable on short strings like yours) is to use str.translate to delete the excluded characters from the string:

print(len(s.translate({ord(c): None for c in 'atgc'})))
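
As a possible variant, the deletion table can also be built with str.maketrans, whose third argument lists the characters to delete. The variable names delete_bases and remaining are chosen here only for illustration.

```python
s = 'acaaaaattgggaaacccccbvbm2xyyuuooopp5585'

# The third argument of str.maketrans maps each listed character to None,
# which tells str.translate to drop it from the result.
delete_bases = str.maketrans('', '', 'atgc')

remaining = s.translate(delete_bases)
print(remaining)       # bvbm2xyyuuooopp5585
print(len(remaining))  # 19
```

Building the table once and reusing it may be worthwhile if many strings have to be processed with the same exclusions.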

However, if, as you imply, you also need the number of times 'a', 'g', 't' and 'c' each appear, a better fit is collections.Counter, which tallies every character in a single pass instead of scanning the string four times with str.count. This applies even more if what you are counting are the bases of a genetic sequence, where most of the characters are expected to be A, T, G and C:

from collections import Counter

s = 'acaaaaattgggaaacccccbvbm2xyyuuooopp5585'
c = Counter(s)  # maps each character to its number of occurrences
print("'a' appears {} times.".format(c["a"]))
print("'t' appears {} times.".format(c["t"]))
print("'g' appears {} times.".format(c["g"]))
print("'c' appears {} times.".format(c["c"]))
print("Characters other than 'a', 't', 'g' and 'c' appear {} times.".format(len(s) - c["a"] - c["t"] - c["g"] - c["c"]))

Output:

'a' appears 9 times.
't' appears 2 times.
'g' appears 3 times.
'c' appears 6 times.
Characters other than 'a', 't', 'g' and 'c' appear 19 times.
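
If the set of characters of interest may change, the subtraction can be generalized. This is a minimal sketch: the helper name base_report and its bases parameter are hypothetical, and it assumes bases contains no repeated characters.

```python
from collections import Counter


def base_report(sequence, bases='atgc'):
    """Return (per-base counts, number of characters not in bases)."""
    counts = Counter(sequence)
    # A Counter returns 0 for missing keys, so absent bases are safe to look up
    per_base = {b: counts[b] for b in bases}
    others = len(sequence) - sum(per_base.values())
    return per_base, others


per_base, others = base_report('acaaaaattgggaaacccccbvbm2xyyuuooopp5585')
print(per_base)  # {'a': 9, 't': 2, 'g': 3, 'c': 6}
print(others)    # 19
```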

    
answered 29.09.2017 at 00:47