Python Symbol Comparison

I have st = 'aaaabbсaa'. My task: wherever a character repeats in the string, I must write the character followed by the number of repeats. My code (but it doesn't work): st = 'aa

Solution 1:

This looks like a task for itertools.groupby.

from itertools import groupby
data = 'aaaabbсaa'
compressed = ''.join('{}{}'.format(key, len(list(group))) for key, group in groupby(data))
print(compressed)

Result

a4b2с1a2

This might help to understand what's happening here.

from itertools import groupby
data = 'aaaabbсaa'
for key, group in groupby(data):
    print(key, len(list(group)))

Result

a 4
b 2
с 1
a 2
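
If the result is needed as a value rather than printed, the same groupby idea can be wrapped in a function. This is only a sketch: the name run_length_encode is hypothetical, and sum(1 for _ in group) is simply an alternative to len(list(group)) that avoids building a temporary list.

```python
from itertools import groupby


def run_length_encode(text):
    """Return each run of equal characters as the character plus the run length."""
    # groupby yields one (key, group) pair per run of consecutive equal items.
    return ''.join(
        f'{key}{sum(1 for _ in group)}' for key, group in groupby(text)
    )


print(run_length_encode('aaaabbcaa'))  # a4b2c1a2
```

An empty input produces an empty string, because groupby yields no groups for it.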

Solution 2:

You've got three problems with your code.

First, as gnibbler points out, all of your if/elif conditions are the same. And you don't need a separate condition for each letter, you just need to print the variable (like st[i]) instead of a literal (like "a").

Second, you're trying to print out the current run length for each character in the run, instead of after the entire run. So, if you get this working, instead of a4b2c1a2 you're going to get a1a2a3a4b1b2c1a1a2. You need to keep track of the current run length for each character in the run, but then only print it out when you get to a different character.

Finally, you've got two off-by-one errors. First, when i starts at 0, st[i - 1] is st[-1], which is the last character; you don't want to compare with that. Second, when i finally gets to j-1 at the end, you've got a leftover run that you need to deal with.
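
Both off-by-one pitfalls can be seen in isolation. The sketch below is illustrative only: the string "abc" and the helper runs_without_flush are hypothetical, chosen to make each problem visible.

```python
# Pitfall 1: at i == 0, the index i - 1 is -1, which wraps to the last character.
sample = "abc"
i = 0
wrapped = sample[i - 1]
print(wrapped)  # c


# Pitfall 2: if a run is only emitted when the character changes,
# the final run is never emitted unless it is handled after the loop.
def runs_without_flush(text):
    out = []
    cnt = 1
    for i in range(1, len(text)):
        if text[i] == text[i - 1]:
            cnt += 1
        else:
            out.append(text[i - 1] + str(cnt))
            cnt = 1
    return ''.join(out)  # deliberately missing the final run


print(runs_without_flush("aaaabbcaa"))  # a4b2c1
```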

So, the smallest change to your code is:

st = "aaaabbcaa"
cnt = 0

j = len(st)
i = 0
while i < j:
    if i == 0 or st[i] == st[i - 1]:
        cnt += 1
    else:
        print(st[i - 1] + str(cnt), end="")
        cnt = 1

    i += 1
print(st[i - 1] + str(cnt))

As a side note, one really easy way to improve this: range(len(st)) gives you all the numbers from 0 up to but not including len(st), so you can get rid of j and the manual i loop and just use for i in range(len(st)):.
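
A sketch of that range-based variant follows. To make it easy to check, it collects the pieces in a list and returns them instead of printing; the function name encode_with_range and the empty-string guard are additions for illustration, not part of the original code.

```python
def encode_with_range(st):
    pieces = []
    cnt = 0
    for i in range(len(st)):
        # i == 0 guards against comparing with st[-1] on the first pass.
        if i == 0 or st[i] == st[i - 1]:
            cnt += 1
        else:
            pieces.append(st[i - 1] + str(cnt))
            cnt = 1
    if st:
        # Flush the run that was still open when the loop ended.
        pieces.append(st[-1] + str(cnt))
    return ''.join(pieces)


print(encode_with_range("aaaabbcaa"))  # a4b2c1a2
```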

But you can improve this even further by looping over an iterable of (st[i], st[i-1]) pairs; then you don't need the indexes at all. This is pretty easy with zip and slicing. And then you don't need the special handling for the edges either:

st = "aaaabbcaa"
cnt = 1
for current, previous in zip(st[1:]+" ", st):
    if current == previous:
        cnt += 1
    else:
        print(previous + str(cnt), end="")
        cnt = 1

I think Matthias's groupby solution is more pythonic, and simpler (there are still a lot of things you could get wrong with this, like starting with cnt = 0), but this should be mostly understandable to a novice out of the box. (If you don't understand the zip(st[1:]+" ", st), try printing out st[1:], list(zip(st[1:], st)), and list(zip(st[1:]+" ", st)) and it should be clearer.)
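
Those suggested printouts might look like the following sketch. The variable names pairs_short and pairs_padded are hypothetical. The point is that zip stops at the shorter input, so without the padding the last character never appears as previous, and the final run is never closed.

```python
st = "aaaabbcaa"

print(st[1:])  # aaabbcaa

# Without padding, zip stops one pair early.
pairs_short = list(zip(st[1:], st))
# With the trailing " ", there is one extra pair whose "current" is the pad.
pairs_padded = list(zip(st[1:] + " ", st))

print(len(pairs_short), len(pairs_padded))  # 8 9
print(pairs_padded[-1])  # (' ', 'a')
```

The padding only works as a sentinel because it differs from the last character; a string that itself ends in a space would need a different sentinel.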

Solution 3:

This is kind of a silly way to go about it, but:

def encode(s):
    _lastch = s[0]
    out = []
    count = 0
    for ch in s:
        if ch == _lastch:
            count += 1
        else:
            out.append(_lastch + str(count))
            _lastch = ch
            count = 1
    out.append(_lastch + str(count))
    return ''.join(out)

Example

>>> st = "aaaabbcaa"
>>> encode(st)
'a4b2c1a2'
