How do I replace characters in a string using a character map?

Question:

Code purpose:

I provide a string and look up its characters in a dictionary; for every character that matches a key, I want to print the corresponding dictionary value.

Here’s the code:

def to_rna(dna_input):
    dna_rna = {'A':'U', 'C':'G', 'G':'C', 'T':'A'}
    rna = []
    for key in dna_rna.iterkeys():
        if key in dna_input:
            rna.append(dna_rna[key])
    print "".join(rna)

to_rna("ACGTGGTCTTAA") #the string input

Problem:

The result should be ‘UGCACCAGAAUU’, but all I get is ‘UGAC’. The problem appears to be that the string contains duplicate characters and the loop ignores them. How do I loop so that the dictionary value is returned once for every occurrence of its key in the string?

Asked By: shaz

Answer #1:

If you want to output a character for every character in dna_input, you need to iterate over the characters in dna_input. Note that the dictionary’s get() method lets you supply a default for characters that aren’t in your dictionary. In the function further down I replace them with nothing (an empty string); if desired, you could use a placeholder such as n or X instead:

rna.append(dna_rna.get(char, 'n'))

Your code iterated over only the 4 keys of the dna_rna dictionary, so it could append at most 4 characters regardless of the length of the input.
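To make the difference concrete, here is a minimal sketch in Python 3 syntax (an assumption; the code in the question is Python 2). The names keys_loop and chars_loop are made up for illustration: the first mirrors the question's loop over the dictionary keys, the second loops over the input characters.

```python
DNA_RNA = {'A': 'U', 'C': 'G', 'G': 'C', 'T': 'A'}

def keys_loop(dna_input):
    # Mirrors the question: one pass per dictionary key, so at most 4 appends.
    rna = []
    for key in DNA_RNA:
        if key in dna_input:
            rna.append(DNA_RNA[key])
    return "".join(rna)

def chars_loop(dna_input):
    # One pass per input character, so every occurrence is translated.
    rna = []
    for char in dna_input:
        rna.append(DNA_RNA.get(char, ''))
    return "".join(rna)

dna = "ACGTGGTCTTAA"
print(len(dna), len(keys_loop(dna)), len(chars_loop(dna)))  # prints: 12 4 12
```

The order of the four characters produced by keys_loop follows the dictionary's iteration order, which is arbitrary in Python 2; that is presumably why the question shows ‘UGAC’.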

def to_rna(dna_input):
    dna_rna = {'A':'U', 'C':'G', 'G':'C', 'T':'A'}
    rna = []
    for char in dna_input:
        rna.append(dna_rna.get(char, ''))
    print "".join(rna)

to_rna("ACGTGGTCTTAA") #the string input

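However, this isn’t the most efficient way to translate a string; the built-in translate() string method performs the same per-character mapping in a single call and is typically faster.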
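If you want to check the efficiency claim on your own machine, a rough timing sketch could look like the following. It assumes Python 3 (str.maketrans rather than the Python 2 string module), and the function names are illustrative only. No timings are quoted because they vary by machine and Python version.

```python
import timeit

DNA_RNA = {'A': 'U', 'C': 'G', 'G': 'C', 'T': 'A'}
# str.maketrans builds a translation table for str.translate (Python 3).
TABLE = str.maketrans("ACGT", "UGCA")

def to_rna_loop(dna_input):
    # Per-character dictionary lookup; unknown characters are dropped.
    rna = []
    for char in dna_input:
        rna.append(DNA_RNA.get(char, ''))
    return "".join(rna)

def to_rna_translate(dna_input):
    # Single call; characters missing from the table pass through unchanged.
    return dna_input.translate(TABLE)

dna = "ACGTGGTCTTAA" * 1000
# Both approaches agree on input that contains only A, C, G and T.
assert to_rna_loop(dna) == to_rna_translate(dna)

print("loop:     ", timeit.timeit(lambda: to_rna_loop(dna), number=100))
print("translate:", timeit.timeit(lambda: to_rna_translate(dna), number=100))
```

Note that the two functions differ on unexpected letters: the loop drops them, while this translation table leaves them in place.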

Answered By: jgritty

Answer #2:

You could use translate(). Edit: I added a regex that rewrites the translation table so that every character other than A, C, G, or T comes out as - (a good idea that @jh44tx had):

import string
import re
rna_trans = string.maketrans("ACGTU","UGCA-")
rna_trans = re.sub("[^UGCA]","-",rna_trans)
print "ACGTGGTCTTAA".translate(rna_trans)
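The snippet above is Python 2, and string.maketrans is not available in current Python 3 releases. As a hedged sketch of one way to get similar behaviour in Python 3 (not part of the original answer), a collections.defaultdict can stand in for the regex step, since str.translate looks up each character's code point in the table it is given:

```python
from collections import defaultdict

# str.maketrans returns a dict keyed by code point; wrapping it in a
# defaultdict makes every character outside the table translate to '-'.
RNA_TRANS = defaultdict(lambda: '-', str.maketrans("ACGT", "UGCA"))

def to_rna(dna_input):
    return dna_input.translate(RNA_TRANS)

print(to_rna("ACGTGGTCTTAA"))   # UGCACCAGAAUU
print(to_rna("ACGTGGTCTTAAX"))  # UGCACCAGAAUU-
```

One side effect to be aware of: the defaultdict stores an entry for each unknown character it is asked about, so the table grows slightly as new bad characters appear.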

Since the mappings are 1:1, you can also create a reverse translation table:

rev_rna_trans = string.maketrans("UGCAT","ACGT-")
rev_rna_trans = re.sub("[^ACGT]","-",rev_rna_trans)
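For comparison, a round trip in Python 3 might be sketched as follows. This is an illustration under the assumption of Python 3, not the answer's own code; make_table is a hypothetical helper, and the reverse mapping is derived by inverting the forward dictionary, which only works because the mapping is 1:1.

```python
from collections import defaultdict

DNA_TO_RNA = {'A': 'U', 'C': 'G', 'G': 'C', 'T': 'A'}
# Inverting the dict is safe only because every value is unique (1:1 mapping).
RNA_TO_DNA = {rna: dna for dna, rna in DNA_TO_RNA.items()}

def make_table(mapping):
    # Hypothetical helper: str.maketrans accepts a dict with single-character
    # keys; the defaultdict turns any other character into '-'.
    return defaultdict(lambda: '-', str.maketrans(mapping))

RNA_TRANS = make_table(DNA_TO_RNA)
REV_RNA_TRANS = make_table(RNA_TO_DNA)

rna = "ACGTGGTCTTAA".translate(RNA_TRANS)
print(rna)                           # UGCACCAGAAUU
print(rna.translate(REV_RNA_TRANS))  # ACGTGGTCTTAA
```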
Answered By: woot

Answer #3:

Since you know that each letter of the input will be translated to exactly one output letter, you’re better off looping over each letter of the input:

def to_rna(dna_input):
    dna_rna = {'A':'U', 'C':'G', 'G':'C', 'T':'A'}
    rna = []
    for x in dna_input:
        rna.append(dna_rna[x])
    return ''.join(rna)

Or you could write it as a list comprehension:

def to_rna(dna_input):
    dna_rna = {'A':'U', 'C':'G', 'G':'C', 'T':'A'}
    return ''.join([dna_rna[x] for x in dna_input])
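One consequence of indexing with dna_rna[x] rather than get() is that a letter missing from the dictionary raises KeyError instead of being skipped. If that strictness is what you want, a sketch along these lines (Python 3 syntax assumed, and to_rna_strict is a hypothetical name) reports where the bad letter was found:

```python
DNA_RNA = {'A': 'U', 'C': 'G', 'G': 'C', 'T': 'A'}

def to_rna_strict(dna_input):
    rna = []
    for position, letter in enumerate(dna_input):
        try:
            rna.append(DNA_RNA[letter])
        except KeyError:
            # Replace the bare KeyError with a message that names the position.
            raise ValueError(
                "unexpected letter %r at position %d" % (letter, position)
            ) from None
    return ''.join(rna)

print(to_rna_strict("ACGTGGTCTTAA"))  # UGCACCAGAAUU
try:
    to_rna_strict("ACGX")
except ValueError as err:
    print(err)  # unexpected letter 'X' at position 3
```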
Answered By: Zanapher

Answer #4:

Just in case you think you’ll get junk letters sometimes, you can do this:

def to_rna(dna_input):
    dna_rna={'A':'U','C':'G','G':'C','T':'A'}
    rna=[]
    for char in dna_input:
        if char in dna_rna:
            rna.append(dna_rna[char])
        else:
            rna.append('-')
    print "".join(rna)

to_rna("ACGTGGTCTTAAX")

and the result is:
UGCACCAGAAUU-

Answered By: jh44tx

Answer #5:

You can do this as a list comprehension. Because this becomes a one-liner it pretty much makes the function superfluous:

def to_rna(dna_input):
    dna_rna = {'A':'U', 'C':'G', 'G':'C', 'T':'A'}
    return "".join([dna_rna.get(x, '') for x in dna_input])
Answered By: Nathaniel Ford
