Tuesday, January 5, 2010

OSX, Vim, and Python

I've been using python and vim for the past few years as my sole development environment since I hate GUIs, and I love how simple and powerful vim really is. I started off with a simple tutorial, and then expanded from there to develop my own custom IDE. I recently discovered that vim7 has support for running python commands inline. This only furthers my love of vim.

I also recently discovered a nice little checker for vim called pyflakes. This program allows you to check python scripts for simple errors that are detected at compile time in languages that support that sort of thing. It includes being able to check for un-imported modules, typos, and even things that you imported that you didn't use within that module.
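To give a feel for the "imported but never used" kind of check, here is a minimal sketch built on Python's standard ast module. It is not how pyflakes itself is implemented, only an illustration of the idea; the function name unused_imports is made up for this example, and it is written for a current Python 3 interpreter.

```python
import ast


def unused_imports(source):
    """Return the sorted names that `source` imports but never references."""
    tree = ast.parse(source)  # raises SyntaxError if the code cannot be parsed
    imported = set()
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b" binds the name "a"; "import a as x" binds "x"
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                if alias.name != "*":  # star imports bind unknown names; skip them
                    imported.add(alias.asname or alias.name)
        elif isinstance(node, ast.Name):
            used.add(node.id)
    return sorted(imported - used)
```

Calling unused_imports on the text of a module that imports os and sys but only touches sys.argv reports os. A real checker does far more (scopes, redefinitions, undefined names), so treat this as a toy.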

The problem is that I'm using Mac OSX, which comes with a vim that isn't compiled with "--enable-pythoninterp", so the pyflakes vim plugin doesn't work. The tutorial page I found suggests a simple workaround: install MacVim. That build, though, is compiled against python 2.5, not the 2.6 which is standard now in Snow Leopard. Additionally, the syntax highlighting in MacVim is horrible for python compared to the built-in one that ships with OSX, and I still prefer to run vim in the console.

Solving this issue involves a few steps:

1) Build MacVim from source. Of course you'll need the 10.6 XCode to do this, but after that it's pretty straightforward. The only thing to watch is that you have to run "./configure --enable-pythoninterp --with-macsdk=10.6" instead of just "./configure". Other than that you can follow the instructions from the MacVim site for compiling your own version.

2) Copy the MacVim.app folder into your /Applications folder

3) Add [ -x "/Applications/MacVim.app/Contents/MacOS/Vim" ] && alias vim=/Applications/MacVim.app/Contents/MacOS/Vim to your shell rc (for me it's .zshrc, but for most people on OSX it's probably .bashrc or .bash_profile, since bash is the default shell). I noticed that simply copying the Vim binary to your PATH, or symlinking it, doesn't work. I'm not quite sure why, but the alias works well enough for me.

4) Copy the python syntax file from /usr/share/vim/vim72/syntax/python.vim to ~/.vim/syntax/python.vim

5) Download and install the pyflakes.vim plugin

6) Open up a new terminal and edit an already created python source file.

After you've done all of that you should see that anything that would normally show up as an error while running pyflakes now automatically gets highlighted as an error.
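If nothing gets highlighted, a quick sanity check is whether the vim you are actually launching was built with python support. The output of "vim --version" lists features as "+python" or "-python". Here is a small sketch of that check in Python; the helper names are made up for this example, and it deliberately does not count "+python3", which is a separate feature.

```python
import subprocess


def has_python_support(version_text):
    """True if `vim --version` output lists the +python feature."""
    # Features show up as whitespace-separated tokens such as "+python",
    # "+python/dyn" or "-python". "+python3" is a different feature.
    return any(tok == "+python" or tok.startswith("+python/")
               for tok in version_text.split())


def vim_has_python(vim_path="vim"):
    """Run the given vim binary and inspect its feature list."""
    output = subprocess.check_output([vim_path, "--version"])
    return has_python_support(output.decode("utf-8", "replace"))
```

Passing the full path to the MacVim binary from step 3 as vim_path checks that build instead of whatever "vim" resolves to on your PATH.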

I've also been using the pyflakes binary included with the pyflakes.vim plugin to check everything in my modules before building and uploading them to our servers. The only issue I've noticed is that sometimes you need to import a module even though you don't need to use it (for example with boto this is used in the sdb.db module to enable reverse-references). I'm still not quite sure how to ignore those or get around that.

One other thing that doesn't work is doing conditional imports, for example to get JSON support the manuals tell you to do:


try:
   import json
except ImportError:
   import simplejson as json



But in pyflakes, this will result in an error.

Friday, December 18, 2009

Aeron Chairs

Newstex has a tradition of giving all of its employees something related to the business we do (Authoritative Content Aggregation) every Christmas. One year it was a Kindle, once an Apple TV, and once an iPhone. This year, however, Newstex bought the most expensive desk chairs ever produced.

It took a while to adjust this chair, since there are about 400 different settings to change. It can be relatively comfortable, but it's one of those things that we all look at yet never want to buy (who spends $850 on a chair?). In any event, this is the exact same chair as we used to have at the old RIT DataCenter when we had to stay up late running scripts which should have been handled by a good crontab.

I've realized today that I've moved quite a bit away from that original job in the DCO. Back then I was running scripts by hand that should have easily been automated, and today I work for a company automating processes that most people would consider far too complex for a computer to handle. Now that we've migrated our entire operations into Amazon Web Services, we have the scalability and availability of the largest companies, without any big investments in hardware, or having to worry about hardware upgrades at all. Just today I performed a live upgrade on our websites without a single second of downtime.

Saturday, December 12, 2009

Paging SDB results in boto

Boto can be a great tool if you're querying against SDB, and it helps you out by managing paging automatically for you so you don't have to keep querying it for the next set of results. If you're dealing with a web-based application, however, you have to deal with your own paging and simply iterating forever over a large result set will eventually time out your connections. To solve this, you can use the built-in paging system provided by boto.

Every time you query using "db.select" in boto, you get back a result set. Most people probably just think of this as an iterator, since it does all the magic behind the scenes and only queries when you start iterating. It also stores that magical "next_token" within itself so it can query for the next page of results from SDB. Normally, you wouldn't even notice this attribute, but if you're dealing with a service that needs to return in a short amount of time, it can be quite useful.

Additionally, there are two important keyword arguments you can specify to the "select" command on any domain. These are max_items, and next_token. The max_items keyword tells boto to return after it has yielded that number of results, instead of simply handling the paging automatically for you. It's also quite important to add the limit SDB command to your query or boto will return in the middle of the result set and you will lose those middle results!

Ok, now to the code:


>>> import boto
>>> sdb = boto.connect_sdb()
>>> db = sdb.get_domain("default")
>>> rs = db.select("SELECT * FROM `default` LIMIT 10", max_items=10)
Notice that we set "LIMIT" and "max_items" both to 10.

Also note that "rs" is the result set of your select query, but the query only runs once you start iterating, so rs.next_token should be blank for now:

>>> rs.next_token
>>> for i in rs:
...     print i
Your first 10 results will print out, now rs.next_token is set:

>>> rs.next_token
u'r........'
Now you can pass that next_token back to the SAME select, it must be the EXACT same query for next_token to work:
>>> rs2 = db.select("SELECT * FROM `default` LIMIT 10", max_items=10, next_token=rs.next_token)
>>> for i in rs2:
...     print i
Your next 10 results will print out


After you get to this point, it's a simple process to rinse-repeat. Once you run out of results, rs.next_token will be empty.
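For a web handler, the rinse-repeat step can be wrapped in a small helper that fetches exactly one page and hands back the token for the next request. This is only a sketch: fetch_page is a made-up name, and the only things it assumes about boto are what's shown above, namely that domain.select accepts max_items and next_token and that the result set exposes next_token after iteration.

```python
def fetch_page(domain, query, page_size, next_token=None):
    """Fetch a single page of results plus the token for the following page.

    `query` must be the EXACT same string on every call and should carry a
    LIMIT equal to `page_size`, as described above.
    """
    rs = domain.select(query, max_items=page_size, next_token=next_token)
    items = list(rs)  # the query only runs once we iterate
    # next_token is only populated after iterating; empty means no more pages
    return items, rs.next_token
```

The caller would send the returned token to the client (for example in a "next page" link) and pass it back in as next_token on the following request.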

Monday, September 7, 2009

Snow Leopard and lxml

As many of you might already know, I've been dealing a lot with XML and python recently. I also recently upgraded to Mac OS 10.6 Snow Leopard. To my surprise, after the upgrade I noticed that I was now being pushed from python 2.5 to python 2.6. At first I was a little concerned since everything I had been doing was in 2.5, and all of our servers are running 2.5, but I soon found that I could continue to build for 2.5 and upload my eggs, so I welcomed the change.

After a while, I became aware that subversion was also upgraded. Building pysvn again proved to be a difficult task. I upgraded from python 2.5.1 to python 2.5.2 in order to try to also install the development libraries required to build pysvn from source. I had many issues there and then decided to move on and just install subversion 1.6.1 and the python 2.6 egg from the pysvn website.

I then discovered that lxml wasn't installed properly, and every time I tried to install the egg it told me that lxml was a pre-requisite of lxml... a bit of a confusing error message, but basically the change came about because 10.6 builds for 64 bit by default. Eventually, after attempting to build lxml for quite a while, I discovered the issue was that it wasn't working with gcc-4.2, the default compiler. If you get an error like this:

/Developer/SDKs/MacOSX10.4u.sdk/usr/include/stdarg.h:4:25: error: stdarg.h: No such file or directory
...
lipo: can't figure out the architecture type of: /var/folders/....
Then you have to do two things.

First of all, make sure you install the new XCode including the Mac OS X 10.4 SDK (an optional install that is unchecked by default). For some reason most of the python modules I've found seem to rely on the 10.4 SDK and they even back-port the install for 10.3.

Secondly, you have to change your default compiler from gcc 4.2 to gcc 4.0. This can be done by removing the symlink at /usr/bin/gcc and re-linking it to gcc-4.0:

sudo rm /usr/bin/gcc
sudo ln -s /usr/bin/gcc-4.0 /usr/bin/gcc

After all that is done, you should be able to download lxml and build it using the standard Mac build command:
python setup.py install --static-deps
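Since the whole mess traces back to 32 vs 64 bit builds, it can help to confirm which one your python interpreter is actually running as before and after rebuilding. A minimal sketch (pointer_bits is just a name picked for this example):

```python
import struct


def pointer_bits():
    """Return the pointer width, in bits, of the running interpreter."""
    # "P" is the struct format code for a native pointer (void *)
    return struct.calcsize("P") * 8
```

This reports 64 on a 64 bit interpreter and 32 on a 32 bit one. It only describes the interpreter that runs it, not how any particular egg was compiled.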

Friday, July 17, 2009

REST Method calling

For a few years now, I've been strongly encouraging usage of REST interfaces when building any sort of application. Providing a simple REST interface on top of your application means that you can provide a service and have someone else deal with creating the interfaces. This allows you to have a simple JavaScript client, a Dashboard widget, an iPhone client, and even a Command line client all operating on the same back-end.

The one problem I've found with REST interfaces is that they only allow you to operate on resources with 4 methods: GET, POST, PUT, and DELETE. These methods, as stated by most, allow you to Read, Create, Update, and Delete objects ONLY. In working with my applications, I often find the need to let the client perform other operations directly on the object. One possible solution to this problem is to overload the POST method with an optional "ACTION" parameter associated with it. This, however, always seemed hack-ish to me and more of something you'd see in SOAP than REST.

My proposed solution was to use HTTP to its fullest and allow arbitrary HTTP verbs to be used on objects and collections. For example, if you had a user object at:
/users/moyer
REST says you can do:
GET /users/moyer
To perform a "GET" (which would read the object) operation on the object, or
DELETE /users/moyer
To perform a "DELETE" operation on the object, so why not
RESETPW /users/moyer
To perform a "RESETPW" operation on the object? This turns a simple REST interface into a fully developed remote procedure calling system built directly off of the HTTP specifications.

I have tested this usage using the python httplib standard library, proxying through apache and hitting a CherryPy and Paste backend server. All of my tests suggest that both the client libraries and the server libraries support arbitrary extensions of the HTTP specification. This is the natural progression for extending REST to support more methods on an individual object.
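Here is a self-contained sketch of the idea using only the standard library. It is not the CherryPy/Paste setup described above: it uses the toy server from http.server, which dispatches each request to a method named do_<VERB>, and http.client, which is what httplib is called in Python 3. The handler, the helper name call_verb, and the response text are all made up for illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class UserHandler(BaseHTTPRequestHandler):
    """Toy handler: a RESETPW request is routed to do_RESETPW."""

    def do_RESETPW(self):
        body = ("password reset for %s" % self.path).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def call_verb(verb, path):
    """Start a throwaway local server, send one request, return (status, body)."""
    server = HTTPServer(("127.0.0.1", 0), UserHandler)  # port 0: pick a free port
    thread = threading.Thread(target=server.serve_forever)
    thread.daemon = True
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
        conn.request(verb, path)  # the method is just a string on the wire
        response = conn.getresponse()
        result = (response.status, response.read().decode("utf-8"))
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()
```

Calling call_verb("RESETPW", "/users/moyer") gets a 200 back from do_RESETPW, while a verb the handler does not define is answered with 501 by the stock server.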

Tuesday, July 7, 2009

OSX WebDav Client

In working on my custom WebDAV server, I discovered an interesting note about the OSX built-in WebDAV client. First of all, if you use Finder to copy files (instead of cp in terminal), it sends all files over in chunked transfer-encoding. This is entirely allowed within the WebDAV specification, but it doesn't help that they also send it using the WRONG header value. They send the encoding method as "Chunked" instead of "chunked".

While working with Python Paste, I found that chunked encoding wasn't supported at all, so I switched over to using CherryPy. After testing locally and getting everything to work (CherryPy apparently ignores case), I deployed this system to production behind an apache server to add in SSL support (Yes, I plan to use Nginx in the future, but we have apache set up here already). After setting up the proxy, I noticed that I could no longer send files to my WebDAV server and a strange message was appearing in the apache error.log:

[error] proxy: Chunked Transfer-Encoding is not supported
[error] [client xxx.xxx.xxx.xxx] Handler for proxy-server returned invalid result code 22

After a lot of googling, I found a single page (written in German) which describes the issue.

The end result was that I had to enable mod_headers, and place this line in my apache config under the virtual host:

RequestHeader edit Transfer-Encoding Chunked chunked early

This fixes the header before mod_proxy gets to it and changes the "Chunked" to "chunked" which tells apache how to handle the request body.
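To make it clear what that directive does to a request, here is the same rewrite sketched in Python on a plain dict of headers. This is purely illustrative, has nothing to do with how mod_headers is implemented, and fix_transfer_encoding is a name invented for this example. Like "RequestHeader edit", it replaces the first match of the pattern in the header value.

```python
import re


def fix_transfer_encoding(headers):
    """Return a copy of `headers` with "Chunked" rewritten to "chunked"."""
    fixed = dict(headers)
    value = fixed.get("Transfer-Encoding")
    if value is not None:
        # replace only the first match, leaving the rest of the value alone
        fixed["Transfer-Encoding"] = re.sub("Chunked", "chunked", value, count=1)
    return fixed
```

A request without a Transfer-Encoding header, or one that already says "chunked", passes through unchanged.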


Monday, March 23, 2009

Now that's service

After fighting with my mac pro all week, I finally took it in on Friday to get the graphics card replaced. I had deduced this was the problem, but still I expected to get some heat from the apple store. After bringing the computer up to the store, an apple rep quickly came out to meet me and carry the beast the rest of the way to the table. He pushed up my appointment by half an hour, and then I even got them to look at it even earlier because someone hadn't shown up yet. 

Not only did they not ask me a dozen questions, they didn't even plug the thing in. I said "The graphics card is dead" and they said "Ok, we'll get you a new one".... Thank you apple for not treating me like an idiot.

They had to order it, which took a total of a day. Less than 24 hours later I got a phone call saying I could pick it up. They even took it out to my car for me... and all of this was under warranty so I didn't pay anything. Now that's what I call service. Yes, Apples may be more expensive, but this is what you're paying for and it's WELL worth it.