With if Dna_bases in i you are asking whether a set (set) is inside a string (str). When the right-hand operand is a string, the in operator expects a string on the left as well, so it can only test whether one string is contained in another; with a set on the left, Python raises a TypeError.
In any case, that logic would not prove anything: even if every base of the set appeared in the chain, the string could still contain characters that are not in the set, other than u or t respectively.
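As a small illustration of both points, the sketch below first triggers the error and then shows why a plain membership test is not enough. The names dna_bases, chain, error_message and has_t are only illustrative, and the second string is a made-up example.

```python
dna_bases = {'a', 't', 'c', 'g'}
chain = 'ttgaatgccttacaact'

# A set on the left of `in` with a string on the right is rejected outright.
try:
    dna_bases in chain
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Even a valid substring test only shows that one base occurs somewhere;
# it says nothing about the remaining characters of the string.
has_t = 't' in 'tgxxn4'
print(has_t)
```

The second test succeeds even though the string contains characters that are not bases at all, which is why the whole chain has to be checked.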
What you want is to check whether all the bases of a chain belong to one of the sets, or to neither. To do this, build a set from the bases of each chain and take the set difference with Dna_bases and with Rna_bases.
When you pass a string to the set constructor, a set is created containing every distinct character of the string. For example:
>>> c = 'aucgcgauacgacgu'
>>> set_c = set(c)
>>> set_c
{'c', 'u', 'a', 'g'}
If you subtract another set from it, you get the elements that are in the first set but not in the second:
>>> {'c', 'u', 'a', 'g'} - {'a', 'u', 'c', 'g'}
set()
>>> {'c', 'u', 'a', 'g'} - {'a', 't', 'c', 'g'}
{'u'}
This applies to our problem because, if all the bases of the chain are in the set, the difference is an empty set, which evaluates as false in a condition.
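For comparison, here is a minimal sketch showing that testing for an empty difference gives the same answer as a subset test with <= (or issubset). The variable names are illustrative and not taken from the question.

```python
dna_bases = {'a', 't', 'c', 'g'}
chain_set = set('aucgcgauacgacgu')

# Characters of the chain that are not valid DNA bases.
leftover = chain_set - dna_bases

# An empty set is falsy, so `not leftover` is True only for a pure DNA chain.
is_dna_by_difference = not leftover

# Equivalent subset test; same as chain_set.issubset(dna_bases).
is_dna_by_subset = chain_set <= dna_bases

print(leftover)
print(is_dna_by_difference, is_dna_by_subset)
```

Either form works; the difference has the small advantage of telling you which characters caused the mismatch.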
Your code would be something like this:
Result = []
Dna_bases = {'a', 't', 'c', 'g'}
Rna_bases = {'a', 'u', 'c', 'g'}
chain_list = ['ttgaatgccttacaact', 'aucgcgauacgacgu', 'aaacggacgacgxxn4']

for i in chain_list:
    chain_set = set(i)  # distinct characters of the chain
    if not chain_set - Dna_bases:    # nothing outside the DNA bases
        Result.append('DNA')
    elif not chain_set - Rna_bases:  # nothing outside the RNA bases
        Result.append('RNA')
    else:
        Result.append('UKN')

print('Result =', Result)
Output:
Result = ['DNA', 'RNA', 'UKN']
Bear in mind that a string such as "acagcc" is classified as DNA even though it could equally be RNA, since it contains neither t nor u. If that case can occur, you could do something like:
Result = []
Dna_bases = {'a', 't', 'c', 'g'}
Rna_bases = {'a', 'u', 'c', 'g'}
chain_list = ["acagcc", 'ttgaatgccttacaact', 'aucgcgauacgacgu', 'aaacggacgacgxxn4']

for i in chain_list:
    chain_set = set(i)  # distinct characters of the chain
    # Only a, c and g present: the chain fits both alphabets.
    if (not chain_set - Dna_bases) and (not chain_set - Rna_bases):
        Result.append('DNA/RNA')
    elif not chain_set - Dna_bases:
        Result.append('DNA')
    elif not chain_set - Rna_bases:
        Result.append('RNA')
    else:
        Result.append('UKN')

print('Result =', Result)
Output:
Result = ['DNA/RNA', 'DNA', 'RNA', 'UKN']
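If you prefer something reusable, the same logic could be wrapped in a function. This is only a sketch: the name classify_chain is hypothetical, and two behaviours are my own assumptions rather than part of the question, namely that upper and lower case are treated alike and that an empty string is reported as 'UKN' (with the loop above, an empty string would have an empty difference with both sets).

```python
DNA_BASES = frozenset('atcg')
RNA_BASES = frozenset('aucg')


def classify_chain(chain):
    """Return 'DNA', 'RNA', 'DNA/RNA' or 'UKN' for a single chain (hypothetical helper)."""
    bases = set(chain.lower())  # assumption: case does not matter
    if not bases:               # assumption: an empty chain is unknown
        return 'UKN'
    is_dna = bases <= DNA_BASES
    is_rna = bases <= RNA_BASES
    if is_dna and is_rna:
        return 'DNA/RNA'
    if is_dna:
        return 'DNA'
    if is_rna:
        return 'RNA'
    return 'UKN'


chain_list = ['acagcc', 'ttgaatgccttacaact', 'aucgcgauacgacgu', 'aaacggacgacgxxn4']
result = [classify_chain(chain) for chain in chain_list]
print('Result =', result)
```

For the four example chains this gives the same classification as the loop above.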