We suppose a set of symbols with their frequencies, given as a list of fr(S,F) terms. Example: [fr(a,45),fr(b,13),fr(c,12),fr(d,16),fr(e,9),fr(f,5)]. Our objective is to construct a list hc(S,C) terms, where C is the Huffman code word for the symbol S. In our example, the result could be Hs = [hc(a,'0'), hc(b,'101'), hc(c,'100'), hc(d,'111'), hc(e,'1101'), hc(f,'1100')] [hc(a,'01'),...etc.]. The task shall be performed by the predicate huffman/2 defined as follows:

# stats: [(letter, count)]

# return: [(count, letter OR (left, right)]

def make_tree(stats):

pool = sorted([(count, letter) for letter, count in stats])

while len(pool) > 1:

first = pool.pop(0)

second = pool.pop(0)

node = (first[0]+second[0], (first, second))

pool.append(node)

pool.sort()

return pool[0]

def codes_from_tree(node, path, results):

count, letter_or_trees = node

letter = left = right = None

if type(letter_or_trees) == tuple:

left, right = letter_or_trees

else:

letter = letter_or_trees

if letter:

results.append((letter, path))

else:

codes_from_tree(left, path+"0", results)

codes_from_tree(right, path+"1", results)

def huffman(stats):

results = []

tree = make_tree(stats)

codes_from_tree(tree, "", results)

#to sort by huffman code value

#results.sort(key=lambda x: int(x[1], 2))

results.sort()

return results

That solution *feels* a little hacky for me, but I can't think of why it bothers me. It might be that the last time I wrote this was in java, and it was a lot more work. Maybe this just seems too easy.

## No comments:

Post a Comment