lair of the dustbunny: January 2009

Friday, January 23, 2009

Higher Order Perl (Python Style) : Chapter 3 - Caching and Memoization


### 3.1 Caching Fixes Recursion

# sub RGB_to_CMYK {
#   my ($r, $g, $b) = @_;
#   my ($c, $m, $y) = (255-$r, 255-$g, 255-$b);
#   my $k = $c < $m ? ($c < $y ? $c : $y)
#                   : ($m < $y ? $m : $y);  # Minimum
#   for ($c, $m, $y) { $_ -= $k }
#   [$c, $m, $y, $k];
# }
def rgb_to_cmyk(r,g,b):
   c, m, y = 255-r, 255-g, 255-b
   k = min([c,m,y])
   c, m, y = c - k, m - k, y - k
   return c, m, y, k

# my %cache;
# sub RGB_to_CMYK {
#   my ($r, $g, $b) = @_;
#   my $key = join ',', $r, $g, $b;
#   return $cache{$key} if exists $cache{$key};
#   my ($c, $m, $y) = (255-$r, 255-$g, 255-$b);
#   my $k = $c < $m ? ($c < $y ? $c : $y)
#                   : ($m < $y ? $m : $y);  # Minimum
#   for ($c, $m, $y) { $_ -= $k }
#   return $cache{$key} = [$c, $m, $y, $k];
# }
CACHE = {}
def rgb_to_cmyk(r,g,b):
   if (r,g,b) not in CACHE:
       c, m, y = 255-r, 255-g, 255-b
       k = min([c,m,y])
       c, m, y = c - k, m - k, y - k
       CACHE[(r,g,b)] = (c,m,y,k)
   return CACHE[(r,g,b)]


# Compute the number of pairs of rabbits alive in month n
# sub fib {
#    my ($month) = @_;
#    if ($month < 2) { 1 }
#    else {
#        fib($month-1) + fib($month-2);
#    }
# }
def fib(month):
   if month < 2:
       return 1
   else:
       return fib(month - 1) + fib(month - 2)


### 3.2 Inline Caching

# # Compute the number of pairs of rabbits alive in month n
# { my %cache;
#   sub fib {
#     my ($month) = @_;
#     unless (exists $cache{$month}) {
#       if ($month < 2) { $cache{$month} = 1 }
#       else {
#         $cache{$month} = fib($month-1) + fib($month-2);
#       }
#     }
#     return $cache{$month};
#   }
# }
# NOTE: we include the function itself as part of
#       the hash key since python can't do the
#       scoping technique above w/o hackery (AFAIK).
#       Of course the more natural python way
#       would probably be just to use a decorator
CACHE = {}
def fib(month):
   if (fib, month) not in CACHE:
       if month < 2:
           CACHE[(fib, month)] = 1
       else:
           CACHE[(fib, month)] = fib(month - 1) + fib(month - 2)
   return CACHE[(fib, month)]


# sub fib {
#   my %cache;
#   ...
# }
def fib(month):
   CACHE = {}
   ...

### 3.3 Good Ideas

# sub some_function {
#   $result = some computation involving @_;
#   return $result;
# }
def some_function(*x):
   result = some computation involving x
   return result

# { my %cache;
#   sub some_function_with_caching {
#     my $key = join ',', @_;
#     return $cache{$key} if exists $cache{$key};
#     $result = the same computation involving @_;
#     return $cache{$key} = $result;
#   }
# }
CACHE = {}
def some_function_with_caching(*x):
   key = (some_function_with_caching, x)
   if key not in CACHE:
       CACHE[key] = the same computation involving *x
   return CACHE[key]


### 3.4 Memoization

# use Memoize;
# memoize 'fib';
# # Compute the number of pairs of rabbits alive in month n
# sub fib {
#   my ($month) = @_;
#   if ($month < 2) { 1 }
#   else {
#       fib($month-1) + fib($month-2);
#   }
# }
# Just for sake of argument let's pretend we have a "memoize"
# module that provides a memoize decorator
@memoize
def fib(month):
   if month < 2:
       return 1
   else:
       return fib(month - 1) + fib(month - 2)

### 3.5 The Memoize Module

# sub memoize {
#   my ($func) = @_;
#   my %cache;
#   my $stub = sub {
#     my $key = join ',', @_;
#     $cache{$key} = $func->(@_) unless exists $cache{$key};
#     return $cache{$key};
#   };
#   return $stub;
# }

# One reasonable way to do this in python is as a decorator.  To keep
# things simple we'll leave out **kwargs since we can't (simply) use
# dicts as parts of keys.  We also assume for now that args contains
# only things that can be in a dict key
def memoize(func):
   CACHE = {}
   def memoized_func(*args):
       if args not in CACHE:
           CACHE[args] = func(*args)
       return CACHE[args]
   return memoized_func

# NOTE: running the above with fib actually gave
# "RuntimeError: maximum recursion depth exceeded"
# for month = 1000.  But if you do some smaller
# values first to "seed" it it will handle the
# larger values.




#$fastfib = memoize(\&fib);
@memoize
def fib(month):
   ....
# or
# fastfib = memoize(fib)


# sub make_counter {
#   my $n = shift;
#   return sub { print "n is ", $n++ };
# }
# my $x = make_counter(7);
# my $y = make_counter(20);

# We cheat a little here since python's scoping
# won't let's us work with the n directly
def make_counter(n):
   x = [n]
   def counter():
       print "n is", x[0]
       x[0] += 1
   return counter
x = make_counter(7)
y = make_counter(20)


### 3.6 Caveats

# use Memoize;
# sub iota {
#   my $n = shift;
#   return [1 .. $n];
# }
# memoize 'iota';
# $i10 = iota(10);
# $j10 = iota(10);
# pop @$i10;
# print @$j10;
@memoize
def iota(n):
   return range(n)
i10 = iota(10)
j10 = iota(10)
i10.pop()
print j10


# package Octopus;
# sub new {
#   my ($class, %args) = @_;
#   $args{tentacles} = 8;
#   bless \%args => $class;
# }

# sub name {
#   my $self = shift;
#   if (@_) { $self->{name} = shift }
#   $self->{name};
# }

# my $junko = Octopus->new(favorite_food => "crab cakes");
# $junko->name("Junko");
# my $fenchurch = Octopus->new(favorite_food => "crab cakes");
# $fenchurch->name("Fenchurch");

# # This prints "Fenchurch" -- oops!
# print "The name of the FIRST octopus is ", $junko->name, "\n";
class Octopus(object):
   def __init__(self, **kwargs):
       self.tentacles = 8
       self.__dict__.update(kwargs)

   def name(self, new_name=None):
       if new_name != None:
           self.name = new_name
       return self.name

# OK, this one has me stumped.  The text says
# that "new" was memoized, but I'm not seeing it
# Either a typo on the books part or a thinko on mine
# But the lesson about caching mutable items is well taken


#@result = sort { -M $a <=> -M $b } @files;
result = sorted(files, key=(lambda x: os.path.getmtime(x)))


### 3.7 Key Generation

#my $key = join ',', @_;
key = ",".join([str(x) for x in key_list])
# or just use a tuple and avoid problems in the first place
key = tuple(key_list)


# my @args = @_;
# s/([\\,])/\\$1/g for @args;
# my $key = join ",", @args;
key = ",".join([str(x).replace(",","\,").replace("\\","\\\\") for x in key_list])


# sub memoize {
#   my ($func, $keygen) = @_;
#   my %cache;
#   my $stub = sub {
#     my $key = $keygen ? $keygen->(@_) : join ',', @_;
#     $cache{$key} = $func->(@_) unless exists $cache{$key};
#     return $cache{$key};
#   };
#   return $stub;
# }
# There are other ways to do this, but we want
# to stay close to the perl version.  Now
# memoize will need to be invoked as
#@memoize() # need the "()" here
#@memoize(keygen_func)
def memoize(keygen=None):
   CACHE = {}
   def outer_func(func):
       def inner_func(*args):
           key = args
           if keygen:
               key = keygen(args)
           if key not in CACHE:
               CACHE[key] = func(*args)
           return CACHE[key]

       return inner_func
   return outer_func


# sub memoize {


#   my ($func, $keygen) = @_;
#   my %cache;
#   my $stub = $keygen ?
#     sub { my $key = $keygen->(@_);
#           $cache{$key} = $func->(@_) unless exists $cache{$key};
#           return $cache{$key};
#         }
#   :
#     sub { my $key = join ',', @_;
#           $cache{$key} = $func->(@_) unless exists $cache{$key};
#           return $cache{$key};
#         }
#   ;
#   return $stub;
# }
def memoize(keygen=None):
   CACHE = {}
   def outer_func(func):
       def inner_func_no_keygen(*args):
           key = args
           if key not in CACHE:
               CACHE[key] = func(*args)
           return CACHE[key]

       def inner_func_with_keygen(*args):
           key = keygen(args)
           if key not in CACHE:
               CACHE[key] = func(*args)
           return CACHE[key]

       return keygen and inner_func_with_keygen or inner_func_no_keygen
   return outer_func


# $memoized = memoize(\&fib, q{my @args = @_;
#                              s/([\\,])/\\$1/g for @args;
#                              join ',', @args;
#                             });

# sub memoize {
#   my ($func, $keygen) = @_;
#   $keygen ||= q{join ',', @_};
#   my %cache;
#   my $newcode = q{
#     sub { my $key = do { KEYGEN };
#           $cache{$key} = $func->(@_) unless exists $cache{$key};
#           return $cache{$key};
#         }
#   };
#   $newcode =~ s/KEYGEN/$keygen/g;
#   return eval $newcode;
# }

# $keygen ||= 'join \',\', @_';

# my $newcode = "
#   sub { my \$key = do { $keygen };
#         \$cache{\$key} = \$func->(\@_) unless exists \$cache{\$key};
#         return \$cache{\$key};
#       }
# ";


# sub memoize {
#   my ($func, $keygen) = @_;
#   my $keyfunc;
#   if ($keygen eq '') {
#     $keygen = q{join ',', @_}
#   } elsif (UNIVERSAL::isa($keygen, 'CODE')) {
#     $keyfunc = $keygen;
#     $keygen = q{$keyfunc->(@_)};
#   }
#   my %cache;
#   my $newcode = q{
#     sub { my $key = do { KEYGEN };
#           $cache{$key} = $func->(@_) unless exists $cache{$key};
#           return $cache{$key};
#         }
#   };
#   $newcode = ̃ s/KEYGEN/$keygen/g;
#   return eval $newcode;
# }

# if (ref($keygen) eq 'CODE') { ... }

"""
UNCLE!  This seems too far off the path for a sane python translation.
Actually I can think of a way to do it, you just wouldn't, so unless
there is a great public outcry, I will just skip trying to force this
down the python's throat.
"""

# sub example {
#   my %args = @_;
#   $args{A} = 32 unless defined $args{A};
#   $args{B} = 17 unless defined $args{B};
#   # ...
# }
def example(**args):
   args.setdefault("A", 32)
   args.setdefault("B", 17)
   #

# example(C => 99);
# example(C => 99, A => 32);
# example(A => 32, C => 99);
# example(B => 17, C => 99);
# example(C => 99, B => 17);
# example(A => 32, C => 99, B => 17);
# example(B => 17, A => 32, C => 99);
# (etc.)

example(C=99)
example(C=99, A=32)
example(A=32, C=99)
example(B=17, C=99)
example(C=99, B=17)
example(A=32, C=99, B=17)
example(B=17, A=32, C=99)
(etc.)

# NOTE: the problem in python is that
#       we just can't use dicts as a hash
#       otherwise this would be a no brainer


# sub {
#   my %h = @_;
#   $h{A} = 32 unless defined $h{A};
#   $h{B} = 17 unless defined $h{B};
#   join ",", @h{'A','B','C'};
# }
# we give this function a name so that we don't
# have to work around python's lambda limitations
def _keygen(a, h):
   h.setdefault("A", 32)
   h.setdefault("B", 17)
   # a tuple key is probably more natural
   return tuple([x[1] for x in sorted(h.items())])
   # but we can make a string just as simply
   # return ','.join([str(x[1]) for x in sorted(args.items())])

# keygen only has to work over the set of expected values so we can go ahead and
# do something like: 
# have the key be a tuple of (args, kwargs)
# where kwargs has been adapted to a tuple of key,value pair tuples
# BUT now our keygen needs to handle *Both* types of inputs
def memoize(keygen=None):
   CACHE = {}
   def outer_func(func):
       def inner_func(*args, **kwargs):
           kwargs_ = tuple(sorted(kwargs.items()))
           key = (args, kwargs_)
           if keygen:
               key = keygen(args, kwargs)
           if key not in CACHE:
               print "updating cache:", key
               CACHE[key] = func(*args, **kwargs)
           return CACHE[key]

       return inner_func
   return outer_func


# sub example {
#   my ($a, $b, $c) = @_;
#   $a = 32 unless defined $a;
#   $b = 17 unless defined $b;
#   # more calculation here ...
# }
def example(a=32, b=17,c=None):
   # more calculation here ...

# my ($a, $b, $c) = @_;
# $a = 32 unless defined $a;
# $b = 17 unless defined $b;
# join ',', $a, $b, $c;

# first three lines handled by machinery of argument passing in python
return ",".join([str(x) for x in (a,b,c)])


# sub { my $key = do { $_[0] = 32 unless defined $_[0];
#                      $_[1] = 17 unless defined $_[1];
#                      join ',', @_;
#                    };
#       $cache{$key} = $func->(@_) unless exists $cache{$key};
#       return $cache{$key};
#     }

# Hmm... very likely I'm missing a sublety here but
# is seems that python's default args already handles
# this case.  Plus the fact that (if I'm understanding
# correctly), python doesn't map well to the "do { }"
# syntax.  I'll punt for now.


# sub set_to_57 {
#   $_[0] = 57;
# }
# my $x = 119;
# set_to_57($x);

# you can't take advantage of this (or get bit by, depending
# on your perspective) in python, but cause the args are pass
# by value.


# sub safe_function {
#    my ($n) = @_;
#    $n = 57; # does *not* set $x to 57
# }
# my $x = 119;
# safe_function($x);
def safe_function(n):
   n = 57
x = 119
safe_function(x)


# memoize(\&example, q{
#   my ($a, $b, $c) = @_;
#   $a = 32 unless defined $a;
#   $b = 17 unless defined $b;
#   @_ = ($a, $b, $c);         # line 5
#   join ',', @_;
# });
# We avoid sending the keygen function as a string
# since it is too unpythonic.
# Here we are assuming that the arguments are positional
# and "None" if not available.  Usually we would use
# key word args in this situation
def _keygen(args, kwargs):
   print "args=",args,"kwargs=",kwargs
   a = args[0] == None and 32 or args[0]
   b = args[1] == None and 17 or args[1]
   c = args[2]
   return (a,b,c)
   # or return ",".join([str(x) for x in (a,b,c)])

@memoize(_keygen)
def exercise(*args):
   a,b,c = (args + [None]*3)[:3]
   ...


# sub is_in {
#   my ($needle, $haystack) = @_;
#   for my $item (@$haystack) {
#     return 1 if $item == $needle;
#   }
#   return;
# }
def is_in(needle, haystack):
   for item in haystack:
       if item == needle:
           return True
   return False

# if (is_in($my_id, \@employee_ids)) { ... }
if is_in(my_id, employee_ids):
   ...

#sub { join ",", $_[0], @{$_[1]} }
# In python the most obvious solution here
# is again to just use tuple()


# sub find_share {
#   my ($target, $treasures) = @_;
#   return [] if $target == 0;
#   return    if $target < 0 || @$treasures == 0;
#   my ($first, @rest) = @$treasures;
#   my $solution = find_share($target-$first, \@rest);
#   return [$first, @$solution] if $solution;
#   return         find_share($target       , \@rest);
# }


def find_share(target, treasures):
   if target == 0:
       return []
   if target < 0 or len(treasures) == 0:
       return
   first, rest = treasures[0], treasures[1:]
   solution = find_share(target-first, rest)
   if solution != None:
       return [first] + solution
   return find_share(target, rest)



# sub delivery_charge {
#   my ($quantity_ordered) = @_;
#   my ($hour, $day_of_week) = (localtime)[2,6];
#   # perform complex computation involving $weight, $gross_cost,
#   #     $hour, $day_of_week, and $quantity_ordered
#   # ...
#   return $delivery_charge;
# }
import time
def delivery_charge(quantity_ordered):
   localtime = time.localtime()
   hour, day_of_week = localtime[2], localtime[6]
   # perform complex computation involving $weight, $gross_cost,
   #     $hour, $day_of_week, and $quantity_ordered
   # ...
   return delivery_charge


# sub delivery_charge_key {
#   join ',', @_, (localtime)[2,6];
# }
def delivery_charge_key(quantity_ordered):
   localtime = time.localtime()
   return ",".join([str(x) for x in (quantity_ordered,localtime[2], localtime[6])])
   ## or just return (quantity_ordered,localtime[2], localtime[6])


# sub delivery_charge_key {
#   my ($hour, $day_of_week) = (localtime)[2,6];
#   my $weekend = $day_of_week == 0 || $day_of_week == 6;
#   join ',', @_, $hour, $weekend;
# }
def delivery_charge_key(quantity_ordered):
   localtime = time.localtime()
   hour, day_of_week = localtime[2], localtime[6]
   weekend = day_of_week in (0,6)
   return ",".join([str(x) for x in (quantity_ordered, hour, weekend)])
   ## or just return (quantity_ordered, hour, weekend)


### 3.8 Caching in Object Methods


# package Investor;
# # Compute total amount currently invested
# sub total {
#   my $self = shift;
#   # ... complex computation performed here ...
#   return $total;
# }
class Investor(object):
   def total(self):
       # ... complex computation performed here ...
       return total

# # Compute total amount currently invested
# { my %cache;
#   sub total {
#     my $self = shift;
#     return $cache{$self} if exists $cache{$self};
#     # ... complex computation performed here ...
#     return $cache{$self} = $total;
#   }
# }
# One way to simulate this in python would be to
# have a cache "class" variable
class Investor(object):
   _cache = {}
   def total(self):
       # ... complex computation performed here ...\
       key = (self, Investor.total)
       if key not in Investor._cache:
           # ... complex computation performed here ...
           Investor._cache[key] = total
       return Investor._cache[key]


# # here 90,000 is returned from the cache
# $old_total = $old_object->total();
# undef $old_object;
# $new_object = Investor->new();
# $new_total = $new_object->total();
old_total = old_object.total()
del old_object
new_object = Investor()
new_total = new_object.total()

# NOTE: in addition to manually invoked "destroy"
# method recommended in the text, we could
# also have __del__ method, though that might not be
# reliable.  Also could just stick in a unix timestamp
# as part of the key so new objects with same memory location
# won't be confusing


# # Compute total amount currently invested
# sub total {
#   my $self = shift;
#   return $self->{cached_total} if exists $self->{cached_total};
#   # ... complex computation performed here ...
#   return $self->{cached_total} = $total;
# }
def total(self):
   if self.cached_total == None:
       # ... complex computation performed here ...
       self.cached_total = total
   return self.cached_total

# # Compute total amount currently invested
# { my %cache;
#   sub total {
#     my $self = shift;
#     return $cache{$self} if exists $cache{$self};
#     # ... complex computation performed here ...
#     return $cache{$self} = $total;
#   }
#   sub expire_total {
#     my $self = shift;
#     delete $cache{$self};
#   }
# }
# sub invest {
#   my ($self, $amount, ...) = @_;
#   $self->expire_total;
#   ...
# }
class Investor(object):
   _cache = {}
   def total(self):
       # ... complex computation performed here ...\
       key = (self, Investor.total)
       if key not in Investor._cache:
           # ... complex computation performed here ...
           Investor._cache[key] = total
       return Investor._cache[key]


   def expire_total(self):
       key = (self, Investor.total)
       del Investor._cache[key]

   def invest(self, ammount, ...):
       self.expire_total()
       ...


# # Compute total amount currently invested
# sub total {
#   my $self = shift;
#   return $self->{cached_total} if exists $self->{cached_total};
#   # ... complex computation performed here ...
#   return $self->{cached_total} = $total;
# }
# sub invest {
#   my ($self, $amount, ...) = @_;
#   delete $self->{cached_total};
#   ...
# }

def total(self):
   if self.cached_total == None:
       # ... complex computation performed here ...
       self.cached_total = total
   return self.cached_total

def invest(self, amount, ...):
   self.cached_total = None
   ...


# sub memoize_method {
#   my ($method, $key) = @_;
#   return sub {
#     my $self = shift;
#     return $self->{$key} if exists $self->{$key};
#     return $self->{$key} = $method->($self, @_);
#   };
# }
# As a decorator
def memoize_method(key):
   def outter(method):
       def inner(self):
           if self.__dict__.get(key) == None:
               print "cache miss:", method, key, self
               self.__dict__[key] = method(self)
           return self.__dict__[key]
       return inner
   return outter


#*Investor::total = memoize_method(\&Investor::total, 'cached_total');
#$investor_bob->total;

@memoize_method("cached_total")
def total(self):
   ...

investor_bob.total()


# $memoized_total = memoize_method(\&Investor::total, 'cached_total');
# $investor_bob->$memoized_total;

# I suppose you could also use the pre-decorator syntax:

def total(self):
   ...
total = memoize_method("cached_total")(total)

investor_bob.total()




### 3.9 Persistent Caches


# use DB_File;
# sub memoize {
#   my ($func, $keygen, $file) = @_;
#   my %cache;
#   if (defined $file) {
#     tie %cache => 'DB_File', $file, O_RDWR|O_CREAT, 0666
#       or die "Couldn’t access cache file $file: $!; aborting";
#   }
#   my $stub = sub {
#     my $key = $keygen ? $keygen->(@_) : join ',', @_;
#     $cache{$key} = $func->(@_) unless exists $cache{$key};
#     return $cache{$key};
#   };
#   return $stub;
# }
import anydbm

def memoize(keygen=None, cache_file=None):
   if cache_file:
       cache = anydbm.open(cache_file, 'c')
   else:
       cache = {}
   def outer_func(func):
       def inner_func(*args):
           key = str(args)
           if keygen:
               key = keygen(args)
           if key not in cache:
               cache[key] = func(*args)
           return cache[key]

       return inner_func
   return outer_func



# sub memoize {
#     my ($func, $keygen, $cache) = @_;
#     $cache = {} unless defined $cache;
#     my $stub = sub {
#        my $key = $keygen ? $keygen->(@_) : join ',', @_;
#        $cache->{$key} = $func->(@_) unless exists $cache->{$key};
#        return $cache->{$key};
#     };
#     return $stub;
# }
def memoize(keygen=None, cache=None):
   if cache == None:
       cache = {}
   def outer_func(func):
       def inner_func(*args):
           key = str(args)
           if keygen:
               key = keygen(args)
           if key not in cache:
               cache[key] = func(*args)
           return cache[key]

       return inner_func
   return outer_func

### 3.10 Alternatives To Memoization

# @sorted_numbers = sort { $a <=> $b } @numbers;
# This is default behavior for sort
sorted_numbers = sorted(numbers)



# @sorted_numbers = sort numerically @numbers;
# sub numerically { $a <=> $b }
sorted_numbers = sorted(numbers, cmp=numerically)
def numerically(a,b):
   return cmp(a,b)



# @sorted_dates = sort chronologically @dates;

# %m2n =
#    ( jan =>  1, feb =>  2, mar =>  3,
#      apr =>  4, may =>  5, jun =>  6,
#      jul =>  7, aug =>  8, sep =>  9,
#      oct => 10, nov => 11, dec => 12, );

# sub chronologically {
#   my ($am, $ad, $ay) =
#     ($a =~ /(\w{3}) (\d+), (\d+)/);
#   my ($bm, $bd, $by) =
#     ($b =~ /(\w{3}) (\d+), (\d+)/);
#              $ay  <=>         $by
#   || $m2n{lc $am} <=> $m2n{lc $bm}
#   ||         $ad  <=>         $bd;
# }

sorted_dates = sorted(dates, chronologically)

m2n = {
   "jan" : 1,  "feb" : 2,  "mar" : 3,
   "apr" : 4,  "may" : 5,  "jun" : 6,
   "jul" : 7,  "aug" : 8,  "sep" : 9,
   "oct" : 10, "nov" : 11, "dec" : 12
   }

def chronologically(a,b):
   am, ad, ay = re.search("(\w{3}) (\d+), (\d+)", a).groups()
   bm, bd, by = re.search("(\w{3}) (\d+), (\d+)", b).groups()

   return (cmp(int(ay), int(by)) or
           cmp(m2n[am.lower()],  m2n[bm.lower()]) or
           cmp(int(ad), int(bd)))

#     ## or more simply
#     return cmp((int(ay), m2n[am.lower()], int(ad)),
#                (int(by), m2n[bm.lower()], int(bd)))



# @sorted_dates = sort chronologically @dates;
# %m2n =
#    ( jan =>  1, feb =>  2, mar =>  3,
#      apr =>  4, may =>  5, jun =>  6,
#      jul =>  7, aug =>  8, sep =>  9,
#      oct => 10, nov => 11, dec => 12, );
# sub chronologically {
#   my ($am, $ad, $ay) = split_date($a);
#   my ($bm, $bd, $by) = split_date($b);
#              $ay  <=>         $by
#   || $m2n{lc $am} <=> $m2n{lc $bm}
#   ||         $ad  <=>         $bd;
# }
# sub split_date {
#   $_[0] =~ /(\w{3}) (\d+), (\d+)/;
# }


sorted_dates = sorted(dates, chronologically)

m2n = {
   "jan" : 1,  "feb" : 2,  "mar" : 3,
   "apr" : 4,  "may" : 5,  "jun" : 6,
   "jul" : 7,  "aug" : 8,  "sep" : 9,
   "oct" : 10, "nov" : 11, "dec" : 12
   }

def chronologically(a,b):
   am, ad, ay = split_date(a)
   bm, bd, by = split_date(b)

   return (cmp(int(ay), int(by)) or
           cmp(m2n[am.lower()],  m2n[bm.lower()]) or
           cmp(int(ad), int(bd)))

def split_date(dt):
   return re.search("(\w{3}) (\d+), (\d+)", dt).groups()



@sorted_dates = sort chronologically @dates;  COD E L IB RARY
                                            chrono-3
# %m2n =
#    ( jan =>  1, feb =>  2, mar =>  3,
#      apr =>  4, may =>  5, jun =>  6,
#      jul =>  7, aug =>  8, sep =>  9,
#      oct => 10, nov => 11, dec => 12, );

# sub chronologically {
#   date_to_string($a) cmp date_to_string($b)
# }

# sub date_to_string {
#   my ($m, $d, $y) = ($_[0] =~ /(\w{3}) (\d+), (\d+)/);
#   sprintf "%04d%02d%02d", $y, $m2n{lc $m}, $d;
# }

m2n = {
   "jan" : 1,  "feb" : 2,  "mar" : 3,
   "apr" : 4,  "may" : 5,  "jun" : 6,
   "jul" : 7,  "aug" : 8,  "sep" : 9,
   "oct" : 10, "nov" : 11, "dec" : 12
   }

def chronologically(a,b):
   return cmp(date_to_string(a), date_to_string(b))

def date_to_string(dt):
   m, d, y = re.search("(\w{3}) (\d+), (\d+)", dt).groups()
   return "%04d%02d%02d" % (int(y), m2n[m.lower()], int(d))

# { my %cache;
#   sub chronologically {
#     ($cache{$a} ||= date_to_string($a))
#        cmp
#     ($cache{$b} ||= date_to_string($b))
#   }
# }
# In python we can't do the scope trick with cache
# but we ignore that for now
CACHE = {}
def chronologically(a, b):
   if a not in CACHE:
       CACHE[a] = date_to_string(a)
   if b not in CACHE:
       CACHE[b] = date_to_string(b)

   return cmp(CACHE[a], CACHE[b])


# { my %cache;
#   sub by_total_invested {
#     ($cache{$a} ||= total_invested($a))
#        <=>
#     ($cache{$b} ||= total_invested($b))
#   }
# }
CACHE = {}
def by_total_invested(a,b):
   if not CACHE.get(a):
       CACHE[a] = total_invested(a)
   if not CACHE.get(b):
       CACHE[b] = total_invested(b)

   return cmp(CACHE[a], CACHE[b])


# { my %cache;
#   sub by_total_invested {
#    (exists $cache{$a} ? $cache{$a} : ($cache{$a} = total_invested($a)))
#        <=>
#    (exists $cache{$b} ? $cache{$b} : ($cache{$b} = total_invested($b)))
#   }
# }
CACHE = {}
def by_total_invested(a,b):
   if a not in CACHE:
       CACHE[a] = total_invested(a)
   if b not in CACHE:
       CACHE[b] = total_invested(b)

   return cmp(CACHE[a], CACHE[b])

# { my %cache;
#   sub by_total_invested {
#     ($cache{$a} ||= total_invested($a) || "0e0")
#        <=>
#     ($cache{$b} ||= total_invested($b) || "0e0")
#   }
# }
# I can't think of a reason to try to emulate this trick
# (or how, frankly)

### 3.11 Evangelism

### 3.12 The Benefits of Speed

# sub function {
#   if (++$CALLS == 100) { memoize 'function'}
#   ...
# }
def function():
   CALL += 1
   if CALLS == 100:
       function = memoize(function)
   ...

# use Time::HiRes 'time';
# my (%time, %calls);
# sub profile {
#   my ($func, $name) = @_;
#   my $stub = sub {
#     my $start = time;
#     my $return = $func->(@_);
#     my $end = time;
#     my $elapsed = $end - $start;
#     $calls{$name} += 1;
#     $time{$name} += $elapsed;
#     return $return;
#   };
#   return $stub;
# }
# This is another good case for a decorator
import time
time_ = {}
calls = {}
def profile(func):
   calls.setdefault(func.__name__, 0)
   time_.setdefault(func.__name__, 0)
   def stub(*args):
       start = time.time()
       return_ = func(*args)
       end = time.time()
       elapsed = end - start
       calls[func.__name__] += 1
       time_[func.__name__] += elapsed
       return return_
   return stub

@profile
def foo(x):
   return x

# END {
#   printf STDERR "%-12s %9s %6s\n", "Function", "# calls", "Elapsed";
#   for my $name (sort {$time{$b} <=> $time{$a}} (keys %time)) {
#     printf STDERR "%-12s %9d %6.2f\n", $name, $calls{$name}, $time{$name};
#   }
# }
print >>sys.stderr, "%-12s %9s %6s\n" % ("Function", "# calls", "Elapsed")
for name, _time in sorted(time_.items(), key=lambda x: x[1]):
   print >>sys.stderr, "%-12s %9d %6.2f\n" % (name, calls[name], time_[name])

Sunday, January 18, 2009

99 problems - python - 57

Binary search trees (dictionaries)

Use the predicate add/3, developed in chapter 4 of the course, to write a predicate to construct a binary search tree from a list of integer numbers.


# Example:

# * construct([3,2,5,7,1],T).
# T = t(3, t(2, t(1, nil, nil), nil), t(5, nil, t(7, nil, nil)))

# Then use this predicate to test the solution of the problem P56.

# Example:

# * test-symmetric([5,3,18,1,4,12,21]).
# Yes
# * test-symmetric([3,2,5,7,1]).
# No
def construct(nums):
    if not nums:
        return None
    ordered_nums = sorted(nums)
    center_index = (len(nums) / 2) 
    return (ordered_nums[center_index], construct(ordered_nums[:center_index]), construct(ordered_nums[center_index+1:]))

def test_symmetric(nums):
    return symmetric_binary_tree(construct(nums))

Wednesday, January 14, 2009

Higher Order Perl (Python Style) : Chapter 2


#
# Dispatch Tables
#

# sub read_config {
#   my ($filename) = @_;
#   open my($CF), $filename or return; # Failure
#   while (<$CF>) {
#     chomp;
#     my ($directive, $rest) = split /\s+/, $_, 2;
#     if ($directive eq 'CHDIR') {
#       chdir($rest) or die "Couldn't chdir to '$rest': $!; aborting";
#     } elsif ($directive eq 'LOGFILE') {
#       open STDERR, ">>", $rest
#         or die "Couldn't open log file '$rest': $!; aborting";
#     } elsif ($directive eq 'VERBOSITY') {
#       $VERBOSITY = $rest;
#     } elsif ($directive eq ...) {
#       ...
#     } ...
#     } else {
#       die "Unrecognized directive $directive on line $. of $filename; aborting";
#     }
#   }
#   return 1; # Success
# }
def read_config(filename):
    global VERBOSITY
    for i, line in enumerate(file(filename)):
        line = line.strip()
        directive, rest = line.split(None, 1)
        try:
            if directive == "CHDIR":
                os.chdir(rest)
            elif directive == "LOGFILE":
                sys.stderr = file(rest, "w")
            elif directive == "VERBOSITY":
                VERBOSITY = rest
            elif directive == ...:
                ...
            else:
                sys.exit("Unrecognized directive %s in line %s of %s; aborting" % (directive, i, filename))
        except StandardError, why:
            sys.exit(why)
    return 1 # success


# sub read_config {                               
#   my ($filename, $actions) = @_;
#   open my($CF), $filename or return; # Failure
#   while (<$CF>) {
#     chomp;
#     my ($directive, $rest) = split /\s+/, $_, 2;
#     if (exists $actions->{$directive}) {
#       $actions->{$directive}->($rest);
#     } else {
#       die "Unrecognized directive $directive on line $. of $filename; aborting";
#     }
#   }
#   return 1; # Success
# }
def read_config(filename, actions):
    for i, line in enumerate(file(filename)):
        line = line.strip()
        directive, rest = line.split(None, 1)
        if directive in actions:
            actions[directive](rest)
        else:
            sys.exit("Unrecognized directive %s in line %s of %s; aborting" % (directive, i, filename))
    return 1 # success

# $dispatch_table =
# { CHDIR      => \&change_dir,
#   LOGFILE    => \&open_log_file,
#   VERBOSITY  => \&set_verbosity,
#   ...        =>  ...,
# };
dispatch_table = {
    "CHDIR"     : change_dir,
    "LOGFILE"   : open_log_file,
    "VERBOSITY" : set_verbosity,
    ...         : ...,
    }

# sub change_dir {
#   my ($dir) = @_;
#   chdir($dir)
#     or die "Couldn't chdir to '$dir': $!; aborting";
# }
# sub open_log_file {
#   open STDERR, ">>", $_[0]
#     or die "Couldn't open log file '$_[0]': $!; aborting";
# }
# sub set_verbosity {
#   $VERBOSITY = shift
# }
def change_dir(dir):
    try:
        os.chdir(dir)
    except OSError, why:
        sys.exit(why)        
def open_log_file(log_file):
    try:
        sys.stderr = file(log_file, "w")
    except OSError, why:
        sys.exit(why)
def set_verbosity(verbosity):
    global VERBOSITY
    VERBOSITY = verbosity

# $dispatch_table =
#   { CHDIR      => sub { my ($dir) = @_;
#                        chdir($dir) or
#                          die "Couldn't chdir to '$dir' $!; aborting";
#                                                       :
#                       },
#     LOGFILE    => sub { open STDERR, ">>", $_[0] or
#                           die "Couldn't open log file '$_[0]': $!; aborting";
#                       },
#     VERBOSITY  => sub { $VERBOSITY = shift },
#     ...        => ...,
# };

# This doesn't map well since lambda doesn't allow
# statements. 


# 'DEFINE' => \&define_config_directive,
"DEFINE" : define_config_directive,

# sub define_config_directive {
#   my $rest = shift;
#   $rest =~ s/∧ \s+//;
#   my ($new_directive, $def_txt) = split /\s+/, $rest, 2;
#   if (exists $CONFIG_DIRECTIVE_TABLE{$new_directive}) {
#     warn "$new_directive already defined; skipping.\n";
#     return;
#   }
#   my $def = eval "sub { $def_txt }";
#   if (not defined $def) {
#     warn "Could not compile definition for '$new_directive': $@; skipping.\n";
#     return;
#   }
#   $CONFIG_DIRECTIVE_TABLE{$new_directive} = $def;
# }

def define_config_directive(rest):
    rest = rest.strip()
    new_directive, def_txt = rest.split(None, 1)

    if new_directive in CONFIG_DIRECTIVE_TABLE:
        print "%s already defined; skipping." % new_directive
        return

    try:
        _def = eval("""lambda x: %s""" % def_txt)
    except StandardError, why:
        print "Could not compile definition for '%s': %s" % (new_directive, why)
        return

    CONFIG_DIRECTIVE_TABLE[new_directive] = _def

    
# This doesn't map well to the python way of doing things since
# lambdas are sort of limited.  Instead we would
# probably just define the functions with a name directly.
# On the other hand if you really must doing everything "lambda style"
# you can probably get farther than you'd first guess with clever uses
# of and/or connectors in your lambda expression


# sub read_config {
#   my ($filename, $actions) = @_;
#   open my($CF), $filename or return; # Failure
#   while (<$CF>) {
#     chomp;
#     my ($directive, $rest) = split /\s+/, $_, 2;
#     if (exists $actions->{$directive}) {
#       $actions->{$directive}->($rest, $actions);
#     } else {
#       die "Unrecognized directive $directive on line $. of $filename; aborting";
#     }
#   }
#   return 1; # Success
# }
def read_config(filename, actions):
    for i, line in enumerate(file(filename)):
        line = line.strip()
        directive, rest = line.split(None, 1)
        if directive in actions:
            actions[directive](rest, actions)
        else:
            sys.exit("Unrecognized directive %s in line %s of %s; aborting" % (directive, i, filename))
    return 1 # success


# sub define_config_directive {
#   my ($rest, $dispatch_table) = @_;
#   $rest =~ s/∧ \s+//;
#   my ($new_directive, $def_txt) = split /\s+/, $rest, 2;
#   if (exists $dispatch_table->{$new_directive}) {
#     warn "$new_directive already defined; skipping.\n";
#     return;
#   }
#   my $def = eval "sub { $def_txt }";
#   if (not defined $def) {
#     warn "Could not compile definition for '$new_directive': $@; skipping.\n";
#     return;
#   }
#   $dispatch_table->{$new_directive} = $def;
# }
def define_config_directive(rest, dispatch_table):
    rest = rest.strip()
    new_directive, def_txt = rest.split(None, 1)

    if new_directive in dispatch_table:
        print "%s already defined; skipping." % new_directive
        return

    try:
        _def = eval("""lambda x: %s""" % def_txt)
    except StandardError, why:
        print "Could not compile definition for '%s': %s" % (new_directive, why)
        return

    dispatch_table[new_directive] = _def



# sub read_config {
#   my ($filename, $actions, $user_param) = @_;
#   open my($CF), $filename or return; # Failure
#   while (<$CF>) {
#     my ($directive, $rest) = split /\s+/, $_, 2;
#     if (exists $actions->{$directive}) {
#       $actions->{$directive}->($rest, $user_param, $actions);
#     } else {
#       die "Unrecognized directive $directive on line $. of $filename; aborting";
#     }
#   }
#   return 1; # Success
# }
def read_config(filename, actions, user_param):
    for i, line in enumerate(file(filename)):
        line = line.strip()
        directive, rest = line.split(None, 1)
        if directive in actions:
            actions[directive](rest, user_param, actions)
        else:
            sys.exit("Unrecognized directive %s in line %s of %s; aborting" % (directive, i, filename))
    return 1 # success

#read_config($filename, $dispatch_table, \@dirs);
read_config(filename, dispatch_table, dirs)

#read_config($filename, $dispatch_table, []);
read_config(filename, dispatch_table, [])


# sub read_config {
#   my ($filename, $actions, $user_param) = @_;
#   open my($CF), $filename or return; # Failure
#   while (<$CF>) {
#     my ($directive, $rest) = split /\s+/, $_, 2;
#     if (exists $actions->{$directive}) {
#       $actions->{$directive}->($directive, $rest, $actions, $user_param);
#     } else {
#       die "Unrecognized directive $directive on line $. of $filename; aborting";
#     }
#   }
#   return 1; # Success
# }
def read_config(filename, actions, user_param):
    for i, line in enumerate(file(filename)):
        line = line.strip()
        directive, rest = line.split(None, 1)
        if directive in actions:
            actions[directive](directive, rest, user_param, actions)
        else:
            sys.exit("Unrecognized directive %s in line %s of %s; aborting" % (directive, i, filename))
    return 1 # success


# sub set_var {
#   my ($var, $val) = @_;
#   $$var = $val;
# }
# OK, this is pretty unnatural at this point
# we would almost certianly work instead with a
# global or passed in dictionary
# (not tested)
def set_var(var, val):
    exec "global %s" % var
    exec "%s = %r" % (var, val)


# sub set_var {
#   my ($var, $val, undef, $config_hash) = @_;
#   $config_hash->{$var} = $val;
# }
def set_var(var, val, _, config_hash):
    config_hash[var] = val



# sub open_input_file {
#   my ($handle, $filename) = @_;
#   unless (open $handle, $filename) {
#     warn "Couldn't open $handle file '$filename': $!; ignoring.\n";
#   }
# }
# yet more unnatural hackery (not tested)
# again, we'd probably work with a global or passed in dictionary
def open_input_file(handle, filename):
    exec "global %s" % handle
    exec """
try:
    %s = file('%s')
except StandardError, why:
    print "Counldn't open %s file '%s': %%s; ignoring." %% why
return """ % (handle, filename, handle, filename)
    

# sub read_config {
#   my ($filename, $actions, $userparam) = @_;
#   open my($CF), $filename or return; # Failure
#   while (<$CF>) {
#     chomp;
#     my ($directive, $rest) = split /\s+/, $_, 2;
#     my $action = $actions->{$directive} || $actions->{_DEFAULT_};
#     if ($action) {
#       $action->($directive, $rest, $actions, $userparam);
#     } else {
#       die "Unrecognized directive $directive on line $. of $filename; aborting";
#     }
#   }
#   return 1; # Success
# }
def read_config(filename, actions, user_param):
    for i, line in enumerate(file(filename)):
        line = line.strip()
        directive, rest = line.split(None, 1)
        action = actions[directive] or actions["__DEFAULT__"]
        if action:
            action[directive](directive, rest, user_param, actions)
        else:
            sys.exit("Unrecognized directive %s in line %s of %s; aborting" % (directive, i, filename))
    return 1 # success

# sub no_such_directive {
#   my ($directive) = @_;
#   warn "Unrecognized directive $directive at line $.; ignoring.\n";
# }
# python doesn't have a trivial way to get the line number
# so we leave that out, but may get close with inspect
def no_such_directive(directive):
    print "Unrecognized directive %s; ignoring." % directive

# sub no_such_directive {
#   my ($bad, $rest, $table) = @_;
#   my ($best_match, $best_score);
#   for my $good (keys %$table) {
#     my $score = score_match($bad, $good);
#     if ($score > $best_score) {
#       $best_score = $score;
#       $best_match = $good;
#     }
#   }
#   warn "Unrecognized directive $bad at line $.;\n";
#   warn "\t(perhaps you meant $best_match?)\n";
# }
def no_such_directive(bad, rest, table):
    best_match = None
    best_score = 0
    
    for good in table:
        score = score_match(bad, good)
        if score > best_score:
            best_score = score
            best_match = good

    print "Unrecognized directive %s." % bad
    print "\t(perhaps you meant %s?)" % best_match


# $address_actions =
#   { _DEFAULT_ => sub { my ($id, $addr, $act, $aref) = @_;
#                        push @$aref, [$id, $addr];
#                      },
#   };
# read_config($ADDRESS_FILE, $address_actions, \@address_array);
address_actions = {
    "__DEFAULT__" : lambda id, addr, act, aref: aref.append([id, addr])
    }
read_config(ADDRESS_FILE, address_actions, address_array)


# my $result = evaluate($ARGV[0]);
# print "Result: $result\n";
# sub evaluate {
#   my @stack;
#   my ($expr) = @_;
#   my @tokens = split /\s+/, $expr;
#   for my $token (@tokens) {
#     if ($token =~ /∧ \d+$/) { # It's a number
#       push @stack, $token;
#     } elsif ($token eq '+') {
#        push @stack, pop(@stack) + pop(@stack);
#     } elsif ($token eq '-') {
#        my $s = pop(@stack);
#        push @stack, pop(@stack) - $s
#     } elsif ($token eq '*') {
#        push @stack, pop(@stack) * pop(@stack);
#     } elsif ($token eq '/') {
#        my $s = pop(@stack);
#        push @stack, pop(@stack) / $s
#     } else {
#       die "Unrecognized token '$token'; aborting";
#     }
#    }
#   return pop(@stack);
# }
result = evaluate(sys.argv[1])
print "Result: %s" % result
def evaluate(expr):
    stack = []
    tokens = expr.split()
    for token in tokens:
        if token.isdigit():
            stack.insert(0, int(token))
        elif token == "+":
            stack.insert(0, (stack.pop(0) + stack.pop(0)))
        elif token == "-":
            s = stack.pop(1)
            stack.insert(0, (stack.pop(0) - s))
        elif token == "*":
            stack.insert(0, (stack.pop(0) * stack.pop(0)))
        elif token == "/":
            s = stack.pop(1)
            stack.insert(0, (stack.pop(0) / s))
        else:
            sys.exit("Unrecognized toke '%s'; aborting" % token)
    return stack.pop(0)



# my @stack;
# my $actions = {
#   '+' => sub { push @stack, pop(@stack) + pop(@stack) },
#   '*' => sub { push @stack, pop(@stack) * pop(@stack) },
#   '-' => sub { my $s = pop(@stack); push @stack, pop(@stack) - $s },
#   '/' => sub { my $s = pop(@stack); push @stack, pop(@stack) / $s },
#   'NUMBER' => sub { push @stack, $_[0] },
#   '_DEFAULT_' => sub { die "Unrecognized token '$_[0]'; aborting" }
# };
# my $result = evaluate($ARGV[0], $actions);
# print "Result: $result\n";
# sub evaluate {
#   my ($expr, $actions) = @_;
#   my @tokens = split /\s+/, $expr;
#
#   for my $token (@tokens) {
#     my $type;
#     if ($token =~ /∧ \d+$/) { # It's a number
#       $type = 'NUMBER';
#     }
#     my $action = $actions->{$type}
#               || $actions->{$token}
#               || $actions->{_DEFAULT_};
#     $action->($token, $type, $actions);
#   }
#   return pop(@stack);
# }
stack = []
actions = {
    "+" : (lambda *x: stack.insert(0,  stack.pop(0)    + stack.pop(0))),
    "-" : (lambda *x: stack.insert(0, -stack.pop(0)    + stack.pop(0))),
    "*" : (lambda *x: stack.insert(0,  stack.pop(0)    * stack.pop(0))),
    "/" : (lambda *x: stack.insert(0, (1/stack.pop(0)) * stack.pop(0))),
    "NUMBER" : (lambda *x: stack.insert(0, int(x[0]))),
    "__DEFAULT__" : (lambda *x: sys.exit("Unrecognized token '%s'; aborting" % x[0]))
}

result = evaluate(sys.argv[1], actions)
print "Result: %s" % result
def evaluate(expr, actions):
    tokens = expr.split()

    for token in tokens:
        type = None
        if token.isdigit():
            type = "NUMBER"
        action = actions.get(type or token) or actions["__DEFAULT__"]
        action(token, type, actions)
    return stack.pop(0)


#sqrt' => sub { push @stack, sqrt(pop(@stack)) },
"sqrt" : (lambda *x: stack.insert(0, math.sqrt(stack.pop(0))))

# my $actions = {
#   'NUMBER'    => sub { push @stack,   $_[0] },
#   '_DEFAULT_' => sub { my $s = pop(@stack);
#                        push @stack,
#                           [ $_[0], pop(@stack), $s ]
#                        },
# };
actions = {
    "NUMBER" : (lambda *x: stack.insert(0, int(x[0]))),
    "__DEFAULT__" : (lambda *x: stack.insert(0, [x[0], stack.pop(1), stack.pop(0)]))
}


# sub AST_to_string {
#   my ($tree) = @_;
#   if (ref $tree) {
#     my ($op, $a1, $a2) = @$tree;
#     my ($s1, $s2) = (AST_to_string($a1),
#                      AST_to_string($a2));
#     "($s1 $op $s2)";
#   } else {
#     $tree;
#   }
# }
def AST_to_string(tree):
    if tree:
        op, a1, a2 =  tree
        s1, s2 = AST_to_string(a1), AST_to_string(a2)
        return "%s %s %s" % (s1, op, s2)
    else:
        return tree

# sub elementfunc {
#   my $table = { h1        => sub { shift; my $text = join '', @_;
#                                    print $text; return $text ;
#                                  }
#                 _DEFAULT_ => sub { shift; my $text = join '', @_;
#                                               return $text ;
#               };
#   my ($element) = @_;
#   my $tag = $element->{_tag};
#   my $action = $table->{$tag} || $table{_DEFAULT_};
#   return $action->(@_);
# }
def elementfunc(element):
    table = {"h1"          : (lambda html, results: "".join(results)),
             "__DEFAULT__" : (lambda html, results: "".join(results))}

    tag = element["_tag"]
    action = table.get(tag) or table["__DEFAULT__"]
    return action(element)



# sub walk_html {
#   my ($html, $textfunc, $elementfunc_table) = @_;
#   return $textfunc->($html) unless ref $html; # It's a plain string
#   my ($item, @results);
#   for $item (@{$html->{_content}}) {
#     push @results, walk_html($item, $textfunc, $elementfunc_table);
#   }
#   my $tag = $html->{_tag};
#   my $elementfunc = $elementfunc_table->{$tag}
#                  || $elementfunc_table->{_DEFAULT_}
#                  || die "No function defined for tag '$tag'";
#   return $elementfunc->($html, @results);
# }
def walk_html(html, textfunc, elementfunc_table):
    if type(html) == str:
        return textfunc(html)

    results = []
    for item in html["_content"]:
        results.append(walk_html(item, textfunc, elementfunc_table))
    
    tag = html["_tag"]
    elementfunc = elementfunc_table.get(tag) or elementfunc_table.get("__DEFAULT__")
    if not elementfunc:
        sys.exit("No function defined for tag '%s'" % tag)
    return elementfunc(html, results)


# sub walk_html {
#   my ($html, $textfunc, $elementfunc, $userparam) = @_;
#   return $textfunc->($html, $userparam) unless ref $html;
#   my ($item, @results);
#   for $item (@{$html->{_content}}) {
#     push @results, walk_html($item, $textfunc, $elementfunc, $userparam);
#   }
#   return $elementfunc->($html, $userparam, @results);
# }
def walk_html(html, textfunc, elementfunc, userparam):
    if type(html) == str:
        return textfunc(html)

    results = []
    for item in html["_content"]:
        results.append(walk_html(item, textfunc, elementfunc, userparam))

    return elementfunc(html, results, userparam)
        
# walk_html($html_text,
#  # $textfunc
#  sub { my ($text, $aref) = @_;
#        push @$aref, $text },
#  # $elementfunc does nothing
#  sub { },
#  # user parameter
#  \@text_array
# );
def walk_html(html_text,
              lambda text, aref: aref.append(text),
              lambda x,y: [],
              text_array)

Higher Order Perl (Python Style) : Chapter 1


#
# Recursion and Callbacks
#

# sub binary {
#   my ($n) = @_;
#   return $n if $n == 0 || $n == 1;

#   my $k = int($n/2);
#   my $b = $n % 2;

#   my $E = binary($k);

#   return $E . $b;
# }
def binary(n):
    if n in (0,1):
        return str(n)

    k = n / 2
    b = n % 2

    return binary(k) + str(b)


# sub factorial {
#   my ($n) = @_;
#   return factorial($n-1) * $n;
# }
def factorial(n):
    return factorial(n-1) * n


# sub factorial {
#   my ($n) = @_;
#   return 1 if $n == 0;
#   return factorial($n-1) * $n;
# }
def factorial(n):
    if n == 0:
        return 1
    return factorial(n-1) * n

# sub hanoi {
#   my ($n, $start, $end, $extra) = @_;
#   if ($n == 1) {
#     print "Move disk #1 from $start to $end.\n";  # Step 1
#   } else {
#     hanoi($n-1, $start, $extra, $end);            # Step 2
#     print "Move disk #$n from $start to $end.\n"; # Step 3
#     hanoi($n-1, $extra, $end, $start);            # Step 4
#   }
# }
def hanoi(n, start, end, extra):
    if n == 1:
        print "Move disk #1 from %(start)s to %(end)s." % locals()
    else:
        hanoi(n-1, start, extra, end)
        print "Move disk #%(n)s from %(start)s to %(end)s." % locals()
        hanoi(n-1, extra, end, start)
        


# sub hanoi {
#   my ($n, $start, $end, $extra, $move_disk) = @_;
#   if ($n == 1) {
#     $move_disk->(1, $start, $end);
#   } else {
#     hanoi($n-1, $start, $extra, $end, $move_disk);
#     $move_disk->($n, $start, $end);
#     hanoi($n-1, $extra, $end, $start, $move_disk);
#   }
# }
def hanoi(n, start, end, extra, move_disk):
    if n == 1:
        move_disk(1, start, end)
    else:
        hanoi(n-1, start, extra, end, move_disk)
        move_disk(n, start, end)
        hanoi(n-1, extra, end, start, move_disk)
    

# sub print_instruction {
#   my ($disk, $start, $end) = @_;
#   print "Move disk #$disk from $start to $end.\n";
# }
# hanoi(3, 'A', 'C', 'B', \&print_instruction);
def print_instruction(disk, start, end):
    print "Move disk %(disk)s from %(start)s to %(end)s" % locals()

hanoi(3, "A", "C", "B", print_instruction)




# @position = ('', ('A') x 3); # Disks are all initially on peg A

# sub check_move {
#     my $i;
#     my ($disk, $start, $end) = @_;

#     if ($disk < 1 || $disk > $#position) {
#        die "Bad disk number $disk. Should be 1..$#position.\n";
#     }

#     unless ($position[$disk] eq $start) {
#        die "Tried to move disk $disk from $start, but it is on peg
#                                                       $position[$disk].\n";
#     }

#     for $i (1 .. $disk-1) {
#        if ($position[$i] eq $start) {
#            die "Can't move disk $disk from $start because $i is on top of it.\n";
#        } elsif ($position[$i] eq $end) {
#            die "Can't move disk $disk to $end because $i is already there.\n";
#        }
#     }

#     $position[$disk] = $end;
# }

# hanoi(3, 'A', 'C', 'B', \&check_move);
position = [""] + ["A"] * 3

def check_move(disk, start, end):
    if disk < 1 or disk > len(position) - 1:
        sys.exit("Bad disk number %s. Should be 1..%s" % (disk, len(position)-1))

    if position[disk] != start:
        sys.exit("Tried to move disk %s from %s, but it is on peg %s" % (disk, start, position[disk]))

    for i in range(1, disk):
        if position[i] == start:
            sys.exit("Can't move disk %s from %s because %s is on top of it." % (disk, start, i))
        elif position[i] == end:
            sys.exit("Can't move disk %s to %s because %s is already there." % (disk, end, i))

    position[disk] = end
    print position

hanoi(3, "A", "C", "B", check_move)




# sub total_size {
#    my ($top) = @_;
#    my $total = -s $top;

#    return $total if -f $top;
#    unless (opendir DIR, $top) {
#       warn "Couldn’t open directory $top: $!; skipping.\n";
#       return $total;
#    }

#    my $file;
#    while ($file = readdir DIR) {
#      next if $file eq '.' || $file eq '..';
#      $total += total_size("$top/$file");
#    }

#    closedir DIR;
#    return $total;
# }

# NOTE: in python you'd probably prefer
#       os.walk
# ALSO: it would be unnatural to duplicate the 
#       buggy first version of this function so we'll
#       just be satisfied with it working. :) 
def total_size(top):
    total = os.path.getsize(top)

    if os.path.isfile(top):
        return total

    try:
        for filename in os.listdir(top):
            total += total_size(os.path.join(top, filename))
    except OSError:
        print "Couldn't open directory: %s; skipping." % top
        return total
    return total

# sub dir_walk {
#   my ($top, $code) = @_;
#   my $DIR;
#   $code->($top);
#   if (-d $top) {
#     my $file;
#     unless (opendir $DIR, $top) {
#       warn "Couldn’t open directory $top: $!; skipping.\n";
#       return;
#     }
#     while ($file = readdir $DIR) {
#       next if $file eq '.' || $file eq '..';
#       dir_walk("$top/$file", $code);
#     }
#   }
# }

# the traditional python way would be to use os.walk()
def dir_walk(top, code):
    code(top)
    
    if os.path.isdir(top):
        try:
            for filename in os.listdir(top):
                dir_walk(os.path.join(top, filename), code)
        except OSError:
            print "Couldn't open directory: %s; skipping." % top


        
# sub print_dir {
#   print $_[0], "\n";
# }
def print_dir(x):
    print x

#dir_walk('.', sub { printf "%6d %s\n", -s $_[0], $_[0] } );
dir_walk('.', lambda x: sys.stdout.write("%6d %s\n" % (os.path.getsize(x), x)))

#dir_walk('.', sub { print $_[0], "\n" if -l $_[0] && ! -e $_[0] });
dir_walk('.', lambda x: (os.path.islink(x) and os.path.exists(os.readlink(x)) and sys.stdout.write("%s\n" % x)))

# my $TOTAL = 0;
# dir_walk('.', sub { $TOTAL += -s $_[0] });
# print "Total size is $TOTAL.\n";

# can't run += or global w/in lambda
total = 0
def accumulate(x):
    global total
    total += os.path.getsize(x)
dir_walk('.', accumulate)
print "Total size is %s." % total


# sub dir_walk { 
#   my ($top, $filefunc, $dirfunc) = @_;
#   my $DIR;
#   if (-d $top) {
#     my $file;
#     unless (opendir $DIR, $top) {
#       warn "Couldn’t open directory $code: $!; skipping.\n";
#       return;
#     }
#     my @results;
#     while ($file = readdir $DIR) {
#       next if $file eq '.' || $file eq '..';
#       push @results, dir_walk("$top/$file", $filefunc, $dirfunc);
#     }
#     return $dirfunc->($top, @results);
#   } else {
#     return $filefunc->($top);
#   }
# }

# sub file_size { -s $_[0] }

# sub dir_size {
#   my $dir = shift;
#   my $total = -s $dir;
#   my $n;
#   for $n (@_) { $total += $n }
#   return $total;
# }

# $total_size = dir_walk('.', \&file_size, \&dir_size);
def dir_walk(top, filefunc, dirfunc):
    if os.path.isdir(top):
        results = []
        try:
            for filename in os.listdir(top):
                results.append(dir_walk(os.path.join(top, filename), filefunc, dirfunc))
        except OSError:
            print "Couldn't open directory: %s; skipping." % top
            return
        return dirfunc(top, results)
    else:
        return filefunc(top)

def file_size(x):
    return os.path.getsize(x)

def dir_size(d, accum):
    _dirsize = os.path.getsize(d)
    total = _dirsize + sum(accum)
    return total

# sub dir_size {
#   my $dir = shift;
#   my $total = -s $dir;
#   my $n;
#   for $n (@_) { $total += $n }
#   printf "%6d %s\n", $total, $dir;
#   return $total;
# }
def dir_size(d, accum):
    _dirsize = os.path.getsize(d)
    total = _dirsize + sum(accum)
    print "%6d %ss" % (total, d)
    return total




# sub file {
#   my $file = shift;
#   [short($file), -s $file];
# }
# sub short {
#   my $path = shift;
#   $path = ̃ s{.*/}{};
#   $path;
# }


def _file(x):
    return os.path.basename(x), os.path.getsize(x)

# sub dir {
#   my ($dir, @subdirs) = @_;
#   my %new_hash;
#   for (@subdirs) {
#     my ($subdir_name, $subdir_structure) = @$_;
#     $new_hash{$subdir_name} = $subdir_structure;
#   }
#   return [short($dir), \%new_hash];
# }
def _dir(dir, subdirs):
    new_hash = {}
    for subdir_name, subdir_structure in subdirs:
        new_hash[subdir_name] = subdir_structure
    return os.path.basename(dir), new_hash


# sub print_filename { print $_[0], "\n" }
# dir_walk('.', \&print_filename, \&print_filename);

# we add a result parameter so that the function
# can work as dirfunc
def print_filename(x, result=None):
    print x
dir_walk(".", print_filename, print_filename)


# sub dangles {
#   my $file = shift;
#   print "$file\n" if -l $file && ! -e $file;
# }
# dir_walk('.', \&dangles, sub {});
def dangles(x):
    if os.path.islink(x) and not os.path.exists(os.readlink(x)):
        print x
dir_walk('.', dangles, lambda x,y: ())



# sub dir_walk {
#   my ($top, $filefunc, $dirfunc) = @_;
#   my $DIR;
#   if (-d $top) {
#     my $file;
#     unless (opendir $DIR, $top) {
#       warn "Couldn’t open directory $top: $!; skipping.\n";
#       return;
#     }
#     my @results;
#     while ($file = readdir $DIR) {
#       next if $file eq '.' || $file eq '..';
#       push @results, dir_walk("$top/$file" $filefunc, $dirfunc);
#                                           ,
#     }
#     return $dirfunc ? $dirfunc->($top, @results) : () ;
#   } else {
#     return $filefunc ? $filefunc->($top): () ;
#   }
# }


def dir_walk(top, filefunc=None, dirfunc=None):
    if os.path.isdir(top):
        results = []
        try:
            for filename in os.listdir(top):
                results.append(dir_walk(os.path.join(top, filename), filefunc, dirfunc))
        except OSError:
            print "Couldn't open directory: %s; skipping." % top
            return
        return dirfunc and dirfunc(top, results)
    else:
        return filefunc and filefunc(top)


#@all_plain_files = dir_walk('.', sub { $_[0] }, sub { shift; return @_ });
all_plain_files = dir_walk(".", lambda x: x, lambda x, y: y)


# { _tag => "p",
#   _content => [ "But I don't ",
#                 { _tag => "font",
#                   _content => [ "want" ],
#                   color => "red",
#                   size => 3,
#                 },
#                 " to go to bed now!",
#                 ],
# }

# so I'm going to punt on this.  I believe I could
# probably get something usefully similar from
# BeautifulSoup or lxml but I don't really
# want to vary from the original perl code *too* much
# so I'll just use a hardcoded dict representation
# and see how far I get with that.
tree =  { "_tag"      : "(document)",
          "_content" : [{"_tag" : "h1",
                         "_content" : [ "What Junior Said Next",
                                     ],                    
                         },
                        "\n\n",
                        {"_tag" : "p",
                         "_content" : [ "But I don't ",
                                        { "_tag" : "font",
                                          "_content" : [ "want" ],
                                          "color" : "red",
                                          "size" : 3,
                                          },
                                        " to go to bed now!",
                                        ],
                         }
                        ]
          }   
# sub untag_html {
#   my ($html) = @_;
#   return $html unless ref $html;   # It’s a plain string
#   my $text = '';
#   for my $item (@{$html->{_content}}) {
#     $text .= untag_html($item);
#   }
#   return $text;
# }
def untag_html(html):
    if type(html) != dict:
        return html

    text = ""
    for item in html["_content"]:
        text += untag_html(item)
    return text

# sub walk_html {
#   my ($html, $textfunc, $elementfunc) = @_;
#   return $textfunc->($html) unless ref $html; # It’s a plain string
#   my @results;
#   for my $item (@{$html->{_content}}) {
#     push @results, walk_html($item, $textfunc, $elementfunc);
#   }
#   return $elementfunc->($html, @results);
# }
# we are going to cheat hear a little and also
# pass around a concatenator function
# since perl's "push" function has some 
# "magic" properties that don't map well to python
def appender(accum, item):
    accum.append(item)
def extender(accum, item):
    accum.extend(item)
def walk_html(html, textfunc, elementfunc, 
              concatenator=appender):
    if type(html) == str:
        return textfunc(html)

    results = []
    for item in html["_content"]:
        concatenator(results, walk_html(item, textfunc, elementfunc, concatenator))

    return elementfunc(html, results)

# walk_html($tree, sub { $_[0] },
#                  sub { shift; join '', @_ });
walk_html(tree, lambda x: x, lambda x,y: "".join(y))

# sub print_if_h1tag {
#   my $element = shift;
#   my $text = join '', @_;
#   print $text if $element->{_tag} eq 'h1';
#   return $text;
# }
def print_if_h1tag(element, results):
    text = "".join(results)
    if element["_tag"] == "h1":
        print text
    return text

#walk_html($tree, sub { $_[0] }, \&print_if_h1tag);
walk_html(tree, lambda x: x, print_if_h1tag)


# @tagged_texts = walk_html($tree, sub { ['MAYBE', $_[0]] },
#                                  \&promote_if_h1tag);
# sub promote_if_h1tag {
#   my $element = shift;
#   if ($element->{_tag} eq 'h1') {
#     return ['KEEPER', join '', map {$_->[1]} @_];
#   } else {
#     return @_;
#   }
# }
tagged_texts = walk_html(tree, lambda x: [["MAYBE", x]],
                               promote_if_h1tag,
                               extender)
def promote_if_h1tag(element, results):
    if element["_tag"] == "h1":
        return [["KEEPER", "".join([ _x[1] for _x in results])]]
    else:
        return results


# sub extract_headers {
#   my $tree = shift;
#   my @tagged_texts = walk_html($tree, sub { ['MAYBE', $_[0]] },
#                                       \&promote_if_h1tag);
#   my @keepers = grep { $_->[0] eq 'KEEPER'} @tagged_texts;
#   my @keeper_text = map { $_->[1] } @keepers;
#   my $header_text = join '', @keeper_text;
#   return $header_text;
# }
def extract_headers(tree):
    tagged_texts = walk_html(tree, lambda x: [["MAYBE", x]],
                                   promote_if_h1tag,
                                   extender)
    keepers = [x for x in tagged_texts if x[0] == "KEEPER"]
    keeper_text = [x[1] for x in keepers]
    header_text = "".join(keeper_text)
    return header_text


# sub extract_headers {
#   my $tree = shift;
#   my @tagged_texts = walk_html($tree, sub { ['MAYBE', $_[0]] },
#                                       \&promote_if_h1tag);
#   join '', map { $_->[1] } grep { $_->[0] eq 'KEEPER'} @tagged_texts;
# }
def extract_headers(tree):
    tagged_texts = walk_html(tree, lambda x: [["MAYBE", x]],
                                   promote_if_h1tag,
                                   extender)
    return "".join([x[1] for x in tagged_texts if x[0] == "KEEPER"])


# sub promote_if_h1tag {
#   my $element = shift;
#   if ($element->{_tag} =~ /^h\d+$/) {
#     return ['KEEPER', join '', map {$_->[1]} @_];
#   } else {
#     return @_;
#   }
# }
def promote_if_h1tag(element, results):
    if re.search("^h\d+", element["_tag"]):
        return [["KEEPER", "".join([ _x[1] for _x in results])]]
    else:
        return results


# sub promote_if {
#   my $is_interesting = shift;
#   my $element = shift;
#   if ($is_interesting->($element->{_tag}) {
#     return ['KEEPER', join '', map {$_->[1]} @_];
#   } else {
#     return @_;
#   }
# }
def promote_if(is_interesting, element, results):
    if is_interesting(element["_tag"]):
        return [["KEEPER", "".join([ _x[1] for _x in results])]]
    else:
        return results


# my @tagged_texts = walk_html($tree,
#                               sub { ['maybe', $_[0]] },
#                               sub { promote_if(
#                                        sub { $_[0] eq 'h1'},
#                                        $_[0])
#                               });
tagged_texts = walk_html(tree, lambda x: [["MAYBE", x]],
                               lambda y,z: (promote_if((lambda _y: _y == "h1"), y,z)),
                               extender)


# sub fib {
#   my ($month) = @_;
#   if ($month < 2) { 1 }
#   else {
#     fib($month-1) + fib($month-2);
#   }
# }
def fib(month):
    if month < 2:
        return 1
    else:
        return fib(month-1) + fib(month-2)


# sub find_share {
#   my ($target, $treasures) = @_;
#   return [] if $target == 0;
#   return    if $target < 0 || @$treasures == 0;
#   my ($first, @rest) = @$treasures;
#   my $solution = find_share($target-$first, \@rest);
#   return [$first, @$solution] if $solution;
#   return         find_share($target       , \@rest);
# }
def find_share(target, treasures):
    if target == 0:
        return []
    if target < 0 or len(treasures) == 0:
        return
    first, rest = treasures[0], treasures[1:]
    solution = find_share(target-first, rest)
    if solution != None:
        return [first] + solution
    return find_share(target, rest)


# sub partition {
#   my $total = 0;
#   my $share_2;
#   for my $treasure (@_) {
#     $total += $treasure;
#   }
#   my $share_1 = find_share($total/2, [@_]);
#   return unless defined $share_1;
#   my %in_share_1;
#   for my $treasure (@$share_1) {
#     ++$in_share_1{$treasure};
#   }
#   for my $treasure (@_) {
#     if ($in_share_1{$treasure}) {
#       --$in_share_1{$treasure};
#     } else {
#       push @$share_2, $treasure;
#     }
#   }
#   return ($share_1, $share_2);
# }
# NOTE: would be more pythonic to use sum, sets()
def partition(treasures):
    share_2 = []
    total = 0

    for treasure in treasures:
        total += treasure

    share_1 = find_share(total/2, treasures)
    if not share_1:
        return 

    in_share_1 = {}
    for treasure in share_1:
        in_share_1.setdefault(treasure, 0)
        in_share_1[treasure] += 1

    for treasure in treasures:
        if treasure in in_share_1:
            in_share_1[treasure] -= 1
        else:
            share_2.append(treasure)
    
    return share_1, share_2

# a more pythonic way:
def partition_(treasures):
    share_1 = find_share(sum(treasures)/2, treasures)
    if not share_1:
        return 

    return share_1, list(set(treasures) - set(share_1))

Higher Order Perl (Python Style) TOC

Higher Order Perl has many things of interest for a python programmer. As I've been going through this book (a free copy of which can be viewed here) I've been translating (as literally as possible) the perl code to python. I will continue to update this page as I finish each chapter. In general my plan is to stay as close as possible to the original perl unless the python transliteration is impossible (perl lambda are more expressive) or too unpythonic (e.g. some times a list comprehension or a set datatype does wonders for brevity). But still in many cases there will be a more pythonic way to write many of these examples. This is left as an exercise for the reader. :)

Please let me know if you see any serious translation errors or have other suggestions.

Friday, January 9, 2009

emacs python mode from scratch: stage 18 - miscellany

stage 19 - miscellany

A grab bag of functions to finish us up:


python-jython-packages
python-maybe-jython
python-fill-paragraph
python-shift-left
python-shift-right
python-outline-level
python-current-defun
python-mark-block

python-jython-packages
python-maybe-jython

I don't really care too much about jython support so this won't get much more that a brief inspection. python-jython-packages seems to be a list of module names that if found in a package will cause jython mode to be used

python-maybe-jython is the function that uses the above list among other things to magically determine whether or not to use python or jython mode.
python-fill-paragraph

Provides a python aware fill-paragraph-function. The usual process would be to define paragraph-start and paragraph-separate and let fill do it's magic, but apparently this doesn't work quite right for the last paragraph in a multiline string. So instead This function narrows the string itself and does the fill action on that.
python-shift-left
python-shift-right

Move some code left or right by the requested number of columns (defaulting to the prima facie indentation level). If there is no region selected then just work with current line. This works by iterating over every line and running indent-rigidly. There is less code for the python-shift-right variation since you don't have to look out for the case where you run out of room for shifting.
python-outline-level

get outline-level. This is just the multiple of python-indent of the current indentation level.
python-current-defun

Work up an indentation level at a time (ie def or class) and keep track of the names of the def and class entities along the way.

This is used by add-log (add-log.el). apparently this is some sore of functionality for working with change logs. this function gives a way to refer to the name of the function where point currently resides. But honestly I'm not really sure how this fits into the big picture. This may well be some feature i depend on, but I'm having trouble figuring out what features use this and why.

If you run M-x add-change-log-entry it opens up something called a change log. I guess I'll just keep my eyes open for this in the future.
python-mark-block

Mark the block. Just as you'd expect.

So that's pretty much the end except for some little details here and there. This was an interesting exercise though if I had thought about how long this was going to take I'm not sure I would have started down this path. I definitely learned a lot, but the main thing I learned was how little I really know about emacs. Now that I'm finished I find a nice little tutorial on how language modes work. Well at least I know what they are talking about now. My main lesson I guess is that I really need to learn emacslisp if I want to become a master. But even then I will still have a lot to learn about the actual environment. But I'm getting there.

Thursday, January 8, 2009

100 Pushups: week 6 milestone - 60

OK, so after almost 7 month of the hilariously named 6 week program for achieving 100 pushups I finally hit 60. And actually it's even better than that because I did a total of 188 reps (in sets of 18 to 30) immediately preceding this.

So I'm making progress. There is probably a more efficient training regimen, but I'm getting there.

I don't know if this really is what got me there tonight but several times earlier in the day I had rehearsed in my head the last 5 pushups. 56. 57. 58. 59. 60! I visualized getting stronger on each one. And I felt sort of lame and new agey doing this, but I really wanted to hit this milestone tonight so I was willing to allow a little "woo" in my life if it was going to give me the edge.

OK, just 40 more. *sigh*

Friday, January 2, 2009

Mindfulness++

I'm one of those guys who always talks about how they hate new year's resolutions and then makes a bunch anyway. Hi, nice to meet you.

One thing that has been weighing on me for a while and especially since I've had kids is that my concentration seems to be a faint shadow of it's former glory. It's probably partly age and partly the induced ADD of having to watch over young children. And don't forget the internets. Damn those tubes!

In any case I find myself so easily distracted, even when I'm doing something that I really enjoy. There seems to be a constantly growing wave of research and advice on this topic and much of it points to meditation/mindfulness as being a practice that will increase your ability to focus. Most recently I read this when going through Pragmatic Thinking and Learning: Refactor your Wetware.

I've actually had as a goal for at least a year to do 8 minutes a night of "watching my breath". Sometimes I'll go weeks of doing this every night. Sometimes I go weeks without even thinking about it.

So there are two parts to this resolution. I want to make this a do not skip part of my nightly routine. And I also want to increase the time to 20 minutes. There is probably nothing magic about the number 8 or 20, but I more often see recommended numbers closer to (at least) 20, so that will be my target. So the other part is to increase my time limit by 1 minute a month all of this year. This will bring me ever so slowly and gently to 20 minutes.

So with any luck, by the end of the year laser beams will be coming to me for advice on how to be so focused.

Any long term mindfulness practitioners out there with advice/encouragement?

Thursday, January 1, 2009

One New Language a Year: Smalltalk

The Pragmatic Programmers recommend learning a new programming language every year. I thought for a long time that this was a good idea, but didn't really act on it. Instead I'd dabble in a dozen languages a year. While this has built my intuition as to what various languages are like it has not made me a competent programmer in a new language. So for this year I'm going to actually pick a language, make a learning schedule and keep on it for the entire year. Now I almost certainly won't be able to stop myself from messing around with other languages to some extent (e.g. I'm slowly working through the Real World Haskell book), but my goal is to direct more of my focus and energy on a smaller target and hopefully get to the point where I really have an additional tool to use for solving real problems.

So then comes the question of which language to learn. In one sense I should probably pick a language that I don't know at all. The problem is that I've played with a lot of languages over the years. I don't think there are a lot of programming concepts/paradigms that I'm not at least familiar with. But there are a lot that I have not mastered. So I will just pick from the list of languages of which I'm not a master. A very healthy list.

Here is a short list of languages that I at least briefly considered:

smalltalk
haskell
something lispy (common lisp, scheme, emacslisp)
clojure (also lispy, but new and different)
erlang
mozart/oz
ruby
perl 6
javascript

All the languages above have some intriguing features but in the end I chose smalltalk (specifically squeak, but perhaps soon pharo). The deciding factors were:

My kids are getting close to an age where I might introduce them to programming so I'd like to have some mastery of a kid friendly programming language. While I think python fits that requirement pretty well, etoys seems like it might fit it even better (scratch also looks interesting).
I want to bump up my familiarity with deeper OO programming practices. In particular, I want to bump up my mastery of design patterns. Smalltalk is one of the languages where design patterns were first developed
I want to have a language where I can return to one of my first loves: graphical simulations. I could clearly do that in python, but well, learning a new language is always a good excuse to play around with graphics libraries.
I want to work on something that is really different in a fundamental way. Being forced to use the squeak development environment (instead of emacs) and an "image" instead of text files fits the bill
seaside just seems inherently cool and I'd like to get my mind around continuations.

I imagine my plan will change as I learn more and follow my interests, but as a first pass:

Learn etoys

This will give me a chance to start off sort of easy and genuinely playful. I've been collecting some links and project ideas but initially I am planning on looking at: Powerful Ideas in the Classroom, Idioms for Composing Games with Etoys, Etoys chapter in Learn Programming With Robots, and poke around in the squeakland site.
Intro to smalltalk/squeak reading/tutorials.
Some items that stand out as worth perusing include: Laser Game Tutorial, Squeak By Example, Design Patterns Smalltalk Companion (I've had this on the shelf for a while and it's really time I read it), Mark Guzdial's "Blue" book, and browse around on this list of smalltalk books.
Get familiar with seaside
In particular I will work on porting over a Django app (which used to be a quixote app) to seaside. I'm not planning on migrating it to seaside for real, but it's nice to have a real world project with lots of ugly edge cases in mind to see how a framework really works.
Depend on smalltalk
By the end of the year I hope to have some project that I maintain (even if just for myself) that keeps me involved with writing and maintaining smalltalk code.

Note: Haskell came in a very close second. It was actually a very hard decision. It seems like Haskell is poised to take off and it would be nice to be part of that wave. On the other hand. I already know that that language is pretty challenging to make progress on and will still be waiting for me next year.

FWIW: I recently came across this list of languages worth learning.

lair of the dustbunny