Saturday, July 10, 2010

Compilation and Compression

Quick, which is bigger in general: source code or the resulting compiled executable?

Try to answer that before reading on. If you are being nitpicky, consider source code that has no comments and minimal whitespace. And consider *any* sort of language compilation process: python to bytecode, c to machine code, etc.

Recently I was reading "What Is Thought?" by Eric Baum (a great mix of AI, computational theory, cognitive science, evolutionary theory), and he stated matter-of-factly that when you compile a program it gets bigger. He was using this as an analogy for how DNA is (relatively) concise but the things it builds (e.g. all the connections in the human brain) are much, much bigger (in the Shannon information sense).

That seemed completely wrong to me. Humans are inefficient and verbose. Machines are efficient and concise. Surely some optimizing compiler (or even just a regular compiler) will show my code to be a joke and transform it into a small jewel-like essence that I wasn't capable of at my high level of abstraction.

And so I started looking at examples. .py to .pyc files. c to binaries. .hs to binaries. And wouldn't you know it, more often than not the resulting "compiled" entity was larger.

I was surprised that my intuition was so far off on this. I asked around among my co-workers, and more often than not they had the same faulty intuition as me. So at least I'm not alone.

Now I think my intuition is properly adjusted, and here's how I think about it.

Humans aren't inefficient and verbose. Quite the opposite. Saying something like:
print "hello world"

as a simple python program is a very terse way of making *lots* of low level details happen. In fact all programming languages (no matter how high or low you think of them) are very abstract (i.e. concise) compared to what is really going on. Even assembler is a massive collection of shortcuts and abstractions compared to the actual physics going on.

So no matter what language you are working in you are essentially working with shortcuts that need to actually be fleshed out into more precise and verbosely specific instructions.
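Here is a minimal sketch of the .py to bytecode comparison, assuming Python 3 (so the print is written as a function call). It uses only the built-in compile() and the standard marshal module; the helper name compiled_size is made up for illustration. The marshalled code object is roughly what ends up in a .pyc file, minus a small header, so treat the number as an approximation.

```python
import marshal


def compiled_size(source: str) -> int:
    """Return the byte length of the marshalled code object for `source`.

    Hypothetical helper for illustration: a real .pyc file holds this
    serialized code object plus a small header.
    """
    code = compile(source, "<example>", "exec")
    return len(marshal.dumps(code))


source = 'print("hello world")'
source_bytes = len(source.encode("utf-8"))
bytecode_bytes = compiled_size(source)

# Print both sizes side by side so the growth is easy to see.
print("source bytes:  ", source_bytes)
print("bytecode bytes:", bytecode_bytes)
```

The exact bytecode size varies between Python versions, but for a one-liner like this the compiled form comes out larger than the source.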

What's really funny is that from reading "What Is Thought?" I actually see a deep relationship between compilation and compression that I had never considered before. A little Information Theory is a dangerous thing...

Saturday, July 3, 2010

My Plan for Programming Language World Domination

I've recently been playing around with prolog (of all things). This happened after reading in "Coders at Work" that Erlang was originally written in prolog. This blew my mind. My impression from a few weeks of prolog usage in a programming languages course and another few weeks in an AI course was that it was a *cute* idea but not really anything like a general purpose language. I guess this just emphasizes the point that you don't really know enough to have a valid opinion after such a short exposure. FWIW, I've been pleasantly surprised at the feel of declarative programming.

This sort of reactivated a crazy idea I had in my youth. Back in high school I had this notion of being a polyglot. At one point I had a different language for each day of the week (French on Mondays, German on Tuesdays, etc.). Needless to say this overwhelmed me and I never really mastered any of them (though I retained an interest in linguistics).

But I still love this idea of learning lots of languages. And now it occurs to me that I *can* do this with programming languages. The main flaw (besides the audacity) of my original goal was that I didn't have access to any native speakers with whom to really practice my language skills. But with programming languages you can always practice to your heart's content.

So here is my plan. I will rotate through my list of "keeper" languages as I run through my usual diet of "99 problems", "euler problems", ACM contest problems and other puzzlers I come across. After I confirm that I can comfortably solve relatively interesting (though smallish) problems in my current list of "keeper" languages I'll start work on getting another language up to speed to add to my Bat utility belt.

Part of my interest in such a plan is that I just love learning to think in new ways. OK, that's a lie. Sometimes it's interesting. Sometimes it is maddening. But in either case it feels important and useful, like going for long runs and eating right. I don't love exercise, but sometimes it feels good and I'm more worried about what will happen if I *don't* do it than about the discomfort/nuisance of actually doing it.

My other concern is related to the fact that I quite luckily have been able to program primarily in python for the last 9 years. This has been fun, but I worry that I could be missing out on other cool emerging technologies and that while python is a decent way to express your thoughts, it is not perfect and I shouldn't stop exercising different ways to bend computers to my will.

So here is my initial list of languages that I presume that I can currently solve programming puzzles in: python, prolog and haskell. Haskell is my language of the year and prolog has been quickly ramping up again. After I verify that these guys are reliably at my fingertips the next few languages will likely be from this short list:

  • oz: Gives me an excuse to read CTM and it feels like an important laboratory for programming ideas
  • c: I feel guilty that I've lost my fluency in this. I want to get comfortable with thinking at that level again.
  • java: this is a lingua franca that I should be comfortable with (though I think it has the least to teach me). I used to be paid to do this, but I don't miss it.
  • a lisp (commonlisp/scheme/clojure/emacslisp): Do I need a reason? OK, probably scheme so that I can go through SICP all the way and be able to wield the ninja star of continuations with confidence.
  • smalltalk: Is there a more direct road to OO mastery? I've spent some time before but probably didn't give it enough of a chance. Also being different is sort of the point of this exercise.

There are many more that seem worth cultivating, but I don't dare write them down for fear of overwhelming myself too early on.

Anyway, here we go again.

Friday, February 26, 2010

One Language a Year: Haskell - update 2

Not much to report, except that I am still trucking along through Real World Haskell (currently chapter 7). This is much better than last year (the year of smalltalk) where I was already having lots of trouble with motivation by this point.

I've been thinking a lot about what the chances are that haskell will become one of my daily use power tools like python is. Or what the chances are that haskell will "take off" the way python did.

I've already (mostly) gotten over one of my haskell phobias: dealing with "do" blocks. When I only had a passing understanding, I always got confused with "<-" and "lets" and "return"s in do blocks (and nested do blocks) and the rules seemed rather arbitrary, etc. Of course I'm no expert now, but at least I don't look at them as magical things. They have a precise use and logic and now I (mostly) get them.

A couple of things from python I miss in haskell: list slicing syntax (e.g. x[2:]) and default args/kwargs. Neither of these has really hurt me yet, but just the thought that I don't have them available makes me sad.
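For anyone who hasn't used them, here is a small sketch of the two python conveniences in question. The function describe and its parameters are made up purely for illustration.

```python
xs = [10, 20, 30, 40]
tail = xs[2:]  # slice: everything from index 2 to the end


def describe(name, greeting="hello", **options):
    """Build a greeting string.

    `greeting` falls back to a default when omitted, and any extra
    keyword arguments are collected into the `options` dict.
    """
    suffix = "!" if options.get("excited") else "."
    return f"{greeting} {name}{suffix}"


print(tail)
print(describe("world"))
print(describe("world", greeting="hi", excited=True))
```

The caller only spells out the arguments that differ from the defaults, which is the part that is easy to miss once it is gone.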

Monday, February 1, 2010

How to Run

I came across this site a few weeks ago and it has really rehabilitated my capacity for running. I'm not a big time runner but I try to run at least once a week and once a month I like to go at least 5 miles. The problem is that my knees haven't been very happy about this project for quite some time. In fact this summer I was trying to ramp up for a half marathon and by the time I got to 8 mile training sessions my knees (one in particular) really laid down the law. Basically they said stop that. I would end up taking ibuprofen and doing hot/cold treatments for a few days. This did not seem like a very wise course to continue and so I backed off.

So I've been trying the idea of barefoot running for the last month or so and the change is striking. I'm not actually going "barefoot". There is about a foot of snow in my front yard currently, but the running style is still the same. The idea is essentially to run on the front pads of your foot rather than landing on your heel. Rather than sending a shock straight from your heel to your knees and hips, you absorb most of it in the front of your foot.

I did my 5 mile route today and instead of finishing in agony and limping into my house I feel like I could easily go and do it again tomorrow.

It definitely takes concentration and practice to run this way and it doesn't feel completely natural quite yet, but at this point I don't think I could go back to "normal" running again. The main difference currently is that my calves get much more of a workout, which I can still feel, and the front pads of my feet are just a little tender. They are absorbing a little more impact so I'm not surprised. It's not the feeling of my foot being abused, just the tenderness you feel as you are developing tougher skin.

My wife has also adopted this style. She has suffered from plantar fasciitis for quite a while and she has noticed a huge reduction in foot pain.

So the question is, how is this just now getting around? I sure wish I knew about this 20 years ago. I could have saved my body a lot of wear and tear (possibly removing the need for my knee surgery of 6 years ago). Has this been known forever and I was just oblivious? I'd honestly never heard of the idea but now it seems sort of obvious.

You can't help but wonder how many other easy fixes for modern problems there are out there.

One Language a Year: Haskell - update 1

So one month in and so far Haskell is treating me much better than Smalltalk did a year ago. What is better? Well to start with my main learning period is a 20 to 30 minute window each morning before work. For whatever reason Smalltalk put me directly to sleep and I would often literally find my forehead mashed into my keyboard. I haven't had that problem at all yet with Haskell. One thing is for sure, if you want to learn a new programming language, you have to be awake.

What is the difference? For one it may just be that I like functional rather than object oriented thinking better. I'm fairly comfortable with object design ideas, but things like design patterns more often than not seem like bandaids over language problems rather than powerful solution cookie cutters. Also I admit that I generally expect object based solutions to be over engineered. There is such a thing as a beautiful OO solution, but they seem to be the exception in my experience.

But probably the number one success factor for me is the fantastic book: Real World Haskell. 4 chapters in and I find the pacing quite nice. And there is no shying away from mundane things like reading and writing files and reading command line arguments. I swear I have a Haskell book that doesn't do any discussion of IO until chapter 17. I get that IO is about monads and monads are serious mind benders, but you still need to crank out a "Hello World" program early on. RWH gets that and many other things right.

One of my main rules for making sure I understand everything as I'm going along is to retype in *all* code from scratch as I come across it and to do so without directly copying it. In other words I have to understand it enough to retype it in directly from memory. This is probably something I should always have done, but cutting and pasting is just so easy and there is so little time....

In any case, the year of Haskell is doing quite nicely so far.

Sunday, January 17, 2010

Is anything new/unique to Python?

Looking at haskell I see some of my friends from python. Significant whitespace, list comprehensions, iterators (sort of like laziness), tuples. And then it struck me that all of these features in python are derivative. Not that that is a bad thing. But I was surprised that I couldn't think of anything definitively new/unique to python. (Obviously the way these things are combined/balanced is unique)
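As a quick reminder of what the python side of those shared features looks like, here is a small sketch (the variable names are just for illustration). The generator expression over itertools.count() is the "sort of like laziness" part: nothing is computed until islice asks for it.

```python
from itertools import count, islice

# List comprehension: built eagerly, all at once.
even_squares = [n * n for n in range(10) if n % 2 == 0]

# Generator expression over an infinite iterator: values are produced
# only on demand, so taking the first three terminates.
lazy_squares = (n * n for n in count())
first_three = list(islice(lazy_squares, 3))

# Tuple: a fixed-size, immutable grouping that can be unpacked.
pair = (1, "one")
number, word = pair

print(even_squares, first_three, number, word)
```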

Hey, internets, did python invent any language features/syntax?

The only thing that I can come up with is maybe the __blah__ (unders before and after) style of exposing syntax overriding.
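To make that concrete, here is a minimal sketch of the double-underscore style: a made-up Vector class that hooks into +, indexing, len() and repr() just by defining the matching __blah__ methods.

```python
class Vector:
    """Toy 2D vector, used only to illustrate the __blah__ hooks."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        # Called for `self + other`.
        return Vector(self.x + other.x, self.y + other.y)

    def __getitem__(self, index):
        # Called for `self[index]`.
        return (self.x, self.y)[index]

    def __len__(self):
        # Called for `len(self)`.
        return 2

    def __repr__(self):
        # Called for `repr(self)` and at the interactive prompt.
        return f"Vector({self.x}, {self.y})"


total = Vector(1, 2) + Vector(3, 4)
print(total, total[1], len(total))
```

The class never registers anything or inherits from a special base; the method names alone are what tie it into the built-in syntax.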