Wednesday, July 30, 2008

emacs python mode from scratch: stage 1 - syntax coloring

First a note on my method. In order to keep my regular python mode functional while I'm playing around, my copy of the python mode uses different function names where possible (prepend "db-" to the normal name) and I'm targeting .pydb files (instead of .py).

So far this seems to work fine. I have the following in my .emacs file:

;; (load-library "db-python-1")
;; (add-to-list 'auto-mode-alist '("\\.pydb\\'" . db-python-mode))

After making changes, I re-run the load-library command and it seems to pick them up. (In case it's not clear, my stage 1 library is in a file called "db-python-1.el" and it provides db-python-mode.) I plan to change the library file name as I progress and leave the db-python-mode name constant.

I am surprised at how much stuff has to happen to get a mode working. But even more surprising is how understandable it is. I definitely don't understand all the working parts so far, and not everything I've tried has worked, but in general the pieces are falling pleasantly into place.

For stage 1, syntax coloring *mostly* works. For whatever reason I wasn't able to get comments to color correctly. My first guess was that one or more of the following:

(set (make-local-variable 'parse-sexp-lookup-properties) t)
(set (make-local-variable 'parse-sexp-ignore-comments) t)
(set (make-local-variable 'comment-start) "# ")

would do it, but so far to no avail. Since I didn't get it working, I don't really understand the lines above, so by my rules I can't add them to the file. I'm hoping that at some point, as I'm schlepping code from one place to another, it *will* start working and everything will fall into place.
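A possible explanation, offered as an assumption rather than something tested against this stage's code: font-lock colors comments from the buffer's syntax table, and a mode derived from fundamental-mode starts with a table in which "#" does not begin a comment. The sketch below defines a table under the name that define-derived-mode looks for (the mode name plus "-syntax-table"), so it would need to be evaluated before the mode definition. The quote entry is an extra guess at what a Python mode wants.

```elisp
;; Hypothetical sketch, not part of db-python-1.el as posted.
;; define-derived-mode reuses an existing <mode>-syntax-table variable,
;; so this defvar has to be evaluated before the mode is defined.
(defvar db-python-mode-syntax-table
  (let ((table (make-syntax-table)))
    ;; "#" starts a comment and a newline ends it.
    (modify-syntax-entry ?# "<" table)
    (modify-syntax-entry ?\n ">" table)
    ;; Assumption: single quotes should delimit strings like double quotes.
    (modify-syntax-entry ?' "\"" table)
    table)
  "Syntax table for `db-python-mode' (illustrative assumption).")
```

With a table like this in place, the nil in the second slot of font-lock-defaults (keywords-only) should let font-lock apply the comment face without any extra keyword rules.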

I'm intrigued by the rx syntax for regular expressions. A little verbose but it might be better than the fairly non-standard emacs regex syntax. It's interesting to look at the actual regular expression that rx produces. For instance the keyword list that is ORed together dumps out as:


Which seems pretty bad. If you look carefully you can see that it has done some consolidating of terms. I'm not sure how useful it is to do it like that, but if nothing else the rx style is quite readable compared to the above final output.
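To look at the generated regular expression yourself, something like the following can be evaluated in the *scratch* buffer. It is a minimal sketch using a cut-down keyword list. The variable name db-python-demo-keyword-regexp is made up for illustration, and the exact grouping in the printed string depends on the Emacs version.

```elisp
;; Hypothetical demo variable; only a small sample of the keywords.
(defvar db-python-demo-keyword-regexp
  (rx symbol-start (or "if" "import" "in" "is") symbol-end)
  "Regexp built by `rx' from a small sample of keywords.")

;; Show the string that rx expanded to.
(message "%s" db-python-demo-keyword-regexp)

;; Whole keywords match; identifiers that merely start with one do not.
(string-match-p db-python-demo-keyword-regexp "x in y")    ; non-nil
(string-match-p db-python-demo-keyword-regexp "index = 1") ; nil
```

Because rx expands at macro-expansion time, the variable just holds an ordinary string, so it can be pasted into re-builder or compared against a hand-written pattern.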

I also removed db-python-font-lock-syntactic-keywords since adding it in didn't seem to do anything and I really want to understand what all the pieces do.

One thing I had never noticed before is that globals at the top level of the module get a special syntax coloring.

So without further ado, here is what I have so far, in db-python-1.el:

(defvar db-python-font-lock-keywords
  `(,(rx symbol-start
         ;; From v 2.4 reference.
         ;; def and class dealt with separately below
         (or "and" "assert" "break" "continue" "del" "elif" "else"
             "except" "exec" "finally" "for" "from" "global" "if"
             "import" "in" "is" "lambda" "not" "or" "pass" "print"
             "raise" "return" "try" "while" "yield"
             ;; Future keywords
             "as" "None" "with"
             ;; Not real keywords, but close enough to be fontified as such
             "self" "True" "False")
         ;; Close the keyword form at a symbol boundary.
         symbol-end)
    ;; Definitions
    (,(rx symbol-start (group "class") (1+ space) (group (1+ (or word ?_))))
     (1 font-lock-keyword-face) (2 font-lock-type-face))
    (,(rx symbol-start (group "def") (1+ space) (group (1+ (or word ?_))))
     (1 font-lock-keyword-face) (2 font-lock-function-name-face))
    ;; Top-level assignments are worth highlighting.
    (,(rx line-start (group (1+ (or word ?_))) (0+ space) "=")
     (1 font-lock-variable-name-face))
    (,(rx "@" (1+ (or word ?_))) ; decorators
     (0 font-lock-preprocessor-face))))

(define-derived-mode db-python-mode fundamental-mode "(DB)Python-1"
  (set (make-local-variable 'font-lock-defaults)
       '(db-python-font-lock-keywords nil nil nil nil
         ;; . db-python-font-lock-syntactic-keywords)
         )))

(provide 'db-python)
