Django installation gotchas

Standard

In this blog I have earlier declared my love for both CodeIgniter and WordPress, but I am very unfaithful when it comes to frameworks and programming tools, so now I am testing out the Python framework Django instead (but I am still a newbie). It’s my experiences with Google App Engine that lead me into the Python/Django world, and when I am now considering what framework to use for an upcoming big project Django is definitly on the shortlist. It will not be a Google App Engine project due to that some of the limitations of that platform makes it a bad fit for the project in question, so it would be my first standalone Django project. So far I have only installed all the stuff I need and got it to play nicely together, but that was a real learning experience in itself, and by sharing it in this post I hope to save some other fellow geek some time and peace of mind.

Before getting started I want to thank Alexis Bellido at ventanazul.com for his patients with my Django questions, check out his post Django questions and answers with a Swedish guy for some more info on Django setup (if you didnt figure it out I am that Swedish guy).

Installation guides
When I set up Django locally I wanted to make my local development environment as similar to a production environment as possible, it will hopefully make production deployment easier down the line. To do this I am running Python 2.5, Django 1.0, PostgreSQL 8.3 (and thus the neccessary python driver psycopg2) and mod_python on Apache. There are some good installation guides out there, so setting up this stuff was mostly a walk in the park, check out the install guide in the Django Book, the quick install guide at Django Project or the install guide at WebMonkey. Of course there are some gotchas that took me quite some time to figure out…

Uninstall old versions of Django
Before getting started I already had Django 0.96 installed, and I  just installed Django 1.0 over it assuming that it was going to replace the old version. I was wrong and when I tried to run my new Django installation I got errors like “NameError: name ‘url’ is not defined” and “ImportError: cannot import name WEEKDAYS_ABBR” – neither which made much sense to me. It turns out that if you do not uninstall an old Django installation first you get a mix of new and old Django files, and that just do not work very well. So take care to acctually read and follow the instructions at Remove any old versions of Django.

Use the right version of the db driver (duh!)
To get the PostgreSQL working correctly with your Python install you have to use a version of the driver Psycopg2 that matches both your Python version and your Apache version, otherwise things just do not work and you do not really get a helpful error message. You can get the window versions of Psycopg2 here.

Configure Apache for mod_python
Once Django and Apache are installed you can create your new Django project (via django-admin.py startproject myNewProject) where Apache can find it (in htdocs for example). To be able to use Apache as your webserver you also need to install and configure mod_python. Installing is straight forward (see the install gudies above for more info), but configuring Apache took me some time. You need to edit httpd.conf and add the following to the end of the file:

<Location "/myNewProject">
SetHandler python-program
PythonHandler django.core.handlers.modpython
SetEnv DJANGO_SETTINGS_MODULE settings
PythonOption django.root /myNewProject
PythonDebug On
PythonPath "['C:\Program Files\Apache Group\Apache2\htdocs\myNewProject'] + sys.path"
</Location>

myNewProject is of course the name of your own Django project.

Hopefully this is of use to someone other than me, if not, then at least I have my notes organised for the next time. If I have missed anything or gotten anything wrong I would very much appreciate your feedback!

Google App Engine limitations, and how to get around them

Standard

Using Google App Engine has great advantages, but there are also serious limitations in the platform that it’s important to be aware of. Not all applications can be implemented and executed with Google App Engine. Google is working hard on removing a lot of these limitations, and they will probably do so eventually, but in the meantime that isn’t really any help if you are writing a web app today.

This post was originally published in swedish on Mashup.se, my blog about swedish mashups and APIs.

Only Python
If you don’t know Python you don’t have choice if you want to use Google App Engine, you just have to learn. It’s not a difficult programming language to learn if you already know how to program. Django, the Python framework that really speeds up writing applications for Google App Engine, is also quite easy to pick up (tip: Use the Google App Engine helper for Django). Before I started with Google App Engine I hadn’t written a single line pf Python and after a few intense days of concentration and coffee consumption I knew Python, Django and Google App Engine quite well.

Most 3rd party python libraries work perfectly on Google App Engine (Django is just one example), but there are limitations. Only libraries that are 100% Python can be used, so if the library has any code in C it can’t be used on Google App Engine. If a librarry has any code that makes a HTTP request or similar it can’t be used on Google App Engine either. All HTTP requests have to be done via Google App Engines URL Fetch API.

Differences between local dev environment and the production environment
One of the advantages with Google App Engine is that there is a local development server that simulates how Google App Engine works when an application is in production. This makes it easy to develop an application locally and then deploy it to the live servers on Google App Engine. As a developer it is important to pay attention to the differences between executing an application on the local server compared to the production environment. It’s no fun to spend time writing code that just don’t work once deployed.

The biggest difference betweent the two environments are that the requests made to 3rd parties work differently. If you have an application that uses the Delicous API it will work fine locally, but once deployed it wont work at all. The reason is that Delicous is blocking all requests from Google App Engine IP-addresses. Same thing is true for the Twitter API due to some HTTP headers that Google App Engine sets (Twitter claims to have fixed this now, haven’t had a chance to test yet). To avoid these problems you need to test your application often in the live environment, especially when you use APIs to call other services.

Datastore
The Google App Engine datastore has some limitations due to being a distributed database. The most obvious limitation is that there isn’t an “OR” operator in GQL (the Datastore version of SQL), but that is easily handled when coding. A more annoying limitations is that it is not possible to create a new entity (data object) in the Datastore via the Google App Engine dashboard unless there already exists an entity of this type, and that there is a real noticable delay between what one can see in the Datastorer in the Dashboard compared to what really is stored in the Datastore. This makes it very hard to really check what data that is stored in the Datastore in any given moment, which makes debugging more difficult.

No scheduled processes
One of the most restrictive limitations with Google App Engine is that there is no way to start a process, other than via a HTTP request. There is no type of recurring scheduled process (like a cron job) and no triggers or hooks to use to start a process when a special event occures. Almost all web applications need som kind of scheduled process to function correctly – to clean up old data, send emails, consolidate statistics or fetch data from an RSS feed once an hour.

The easiest way to get around this limitation is to create a cron job on another server that calls an URL in the Google App Engine application. If you have a lot of visitors you can also perform the background process as part of a user request. For example you could import data from an RSS feed when a user logs in to your application. This will of course make the user experience slow and there is no guarantee that users will perform the action to trigger the background process att the right times.

No matter which solution is implemented one will quickly bump into the next limitation of Google App Engine, that only short processes are permitted.

Only short processes
Something I have learned the hard way is that Google App Engine only is built to handle applications where a users makes a request and quickly gets an answer back. There is no support for a process that  executes for a longer time, and for the time being this limitations seems to be approximatly 9 secondes/process. After that an exception is thrown and the process killed. It does not end there, even if you are nowhere close to use the assigned CPU resources you can quickly use too many resources with long running processes since they have their own unique resource pool. If you use too many resources you application is shut down for 24 hours, and right now there is no way to buy extra resources.

This is a really serious limitation in Google App Engine that it is really difficult to get around. If you need heavy processes it is recommended that you use Amazon EC2 or something similar. To handle (not get around) the limitation you have to handle exceptions in a nice way and use transactions. More about this in William Vambenepe’s very informative post Emulating a long-running process (and a scheduler) in Google App Engine. He has some tips on how to get around this limitation, even if it is not recommended since you risk that your application is shut down.

More limitations
There are more limitations, read  Google App Engine: The good, the bad, and the ugly? for a longer list. Just keep in mind that some of the limitations mentioned in that post already have been addressed by Google.

To summerize you need to really know what limitations there are in Google App Engine before you spend time and energy on developing an application. If the limitations are not a problem then there is a lot to gain by using Google App Engine.

10 Easy Steps to use Google App Engine as your own CDN

Standard

When the big boys run websites (basically any brand name site you can think off) they use Content Delivery Networks (CDN’s) to host most of their content, especially images, stylesheets, files to download and other static content. The reason they do this is that the less they have to host themselves the less load they have on their servers, and the more content they can host closer to the end user the quicker the user can download it. The most famous CDN is probably Akamai, that almost run their own parallell internet. Akamai and other CDN providers cost big bucks though, so it is nothing for us mere mortals.

But thanks to Google anyone can now run their own CDN for free on Googles servers. It is really easy to set up and storing files for downloads, stylesheets etc on Google instead of on your own site takes the load of your servers (and consumes much less bandwidth of your hosting account) and speeds things up for the end users. It’s a win win situation, and it is also really really cool! If you are interested in more information about how to get the maximum performance from your web site then I recommend you read the excellent post Performance on a Shoe String on 24ways.

What is Google App Engine?
Using Google App Engine you can run web applications on Google’s servers. That means that you can benefit from Google’s huge world-wide server farms, it means that it is really easy to scale and to integrate with othe Google applications (for example using Google authentication in your applications). At the moment you have to write your applications in Python (don’t worry: no coding at all needed to use Google App Engine as a CDN, just keep reading), but hopefully they will expand it to other languages soon (personally I want to run PHP and CodeIgniter on Google App Engine!). The App Engine is Googles response to Amazons very successfull web services S3 (for storage) and EC2 (for computing). Amazons services are very powerful, but they do require a deeper level of technical knowledge to use than Google App Engine.

Currently Google App Engine is in a Preview Release (= beta), but it is free for anyone to join, all you need is a Google Account and a cell phone (more about this later). What you get is 500MB of free storage and around 5 million free pageviews a month, if you use more than that there is a small cost (see the Google App Engine blog for more details). The cost for these extra resources are almost the same as for Amazons Web Services, and with the freebies and ease of use thrown in Google App Engine is a bargin.

How to set up your own CDN
To use Google App Engine as your own personal CDN you need to install some things on your computer and edit a few configuration files. All this work is a one time thing though, after that all you need to do is run a simple program to upload new files to Google. Sorry to say that the scripts you download is for Windows only, if you are on a Mac or using Linux then you need to make your on script to do what deploy_digitalistic_cdn.bat does (if you do so please add this to the comments of this post for any one else to use).

  1. Since Google App Engine only works with the programming language Python you need to download and install Python on your computer. If you have a Mac or run Linux you most probably already have Python installed, so you can skip this step. Download the correct installation file for your OS from Python 2.5.2 from http://www.python.org/download/ and install it. Use the default settings, except install it under “Program Files” instead of directly on the C: drive (or install it wherever you want, but in then you need to modify the scripts below.
  2. Download Google App Engine SDKDownload the Google App Engine SDK from http://code.google.com/appengine/downloads.html and install it. During the SDK installation it will check if you have Python or not, so if you have a Python installation problem you will know it already here. The Google App Engine SDK is needed to be able to write and upload applications to Google. Just use the default settings when installing the SDK.
  3. Sign up for Google App Engine at appengine.google.com. For this you need a Google account (your GMail address for example, if you dont have one it is free to create one).
  4. GAE SMS verificationCreate Google ApplicationOnce you are signed up you need to create an application, so just click on the button “Create an Application” and give your application a name (called “application identifier”). This name needs to be unique among all users applications, so it might take a while to find a unique one. In my case I used “digitalisticcdn”. Save your new application. After you have created your first application you need to specify your cell phone number. Google will then send you a SMS with a code that you enter into their site. This confirms that you are the owner of this Google App Engine account (so don’t use it for spamming ;).
  5. Download the file http://digitalisticcdn.appspot.com/files/digitalisticcdn.zip (hosted on my private CDN!) and unzip it to your harddrive. If you want you can rename the unzipped directory from “digitalisticcdn” to whatevery you want, for example the name of your own application. It doesnt really matter, it just makes easier for you to keep track of things in the future.
  6. Use a text editor to edit the app.yaml file in the digitalisticcdn directory. Change “application: digitalisticcdn” to “application: <your application identifier” and save the file. This will tell Google App Engine what application to upload your files to.
  7. Now it is time to add all the images, stylesheets, files, videos etc you want to upload to Google to the folders in the digitalisticcdn directory. Put all images into the /images folder etc. You can create any number of subfolders inside the images, files, stylesheets etc folders (for example /images/webhostninja.com/ninja.gif). You can always add more files at a later time, so if you just want to set things up to work you can skip this step for now. There is already an image in the /images folder for you to test that all is working as it should be.
  8. Download http://digitalisticcdn.appspot.com/files/deploy_digitalistic_cdn.bat and edit it in a text editor. This file needs to point out your Python installation, your Google App Engine installation and your digitalisticcdn directory. If you installed the Google App Engine SDK in the default directory and Python in C:/Program Files/ then you don’t have to worry about those settings. Just change the last part of the file to point to your digitalisticcdn directory. Keep in mind that all paths with spaces in needs to be surrounded by quotes.
  9. Double click on the newly changed deploy_digitalistic_cdn.bat file to upload all the files in the digitalisticcdn directory to Google. The first time you do this you need to specify your Google username and password.
  10. Ninja from WebHostNinja.comYou now have your own private CDN! Go to <your-application-identifier>.appspot.com/images/ninja.gif (in my case digitalisticcdn.appspot.com/images/ninja.gif) to see that it works.

How to use your private CDN
To use the files you upload to your Google App Engine CDN you just need to use the URL to the file on your site. If you want to show an image of a cool ninja from WebHostNinja.com you would just use digitalisticcdn.appspot.com/images/ninja.gif as your image source in your HTML. The same goes for stylesheets, files to download or whatever else you want to share on your CDN.

At any time you can add new files to your digitalisticcdn subdirectories (/images, /stylesheets etc) and run the deploy_digitalistic_cdn.bat file to upload them to Google. If you remove files from your digitalisticcdn directory and then run the bat file they will be deleted from your Google application.

You can check the statistics of your Google Application at appengine.google.com. For example you can see how much bandwidth and disk space you are using. It will take quite a lot of files and usage for you to use up the resources you get for free, but if you have a super popular site then it is worth taking a look here every now and then.

If you have read so far and found all this usefull then please Digg it. Thanks!