Today the Knowledge@Wharton tech team put into the wild something I’ve been working on for some time: a new platform for Knowledge@Wharton and India Knowledge@Wharton. The new platform consists of the following:
- Windows 2008 Load Balanced Cluster
- Core Services Code Base and ColdFusion 8
- Development and Publishing System
Windows 2008 Load Balanced Cluster
We built a two-node cluster using Windows 2008 64-bit Enterprise Edition. One node is a VMware instance, and one node is a blade server. I like this configuration as I only have to worry about a machine warranty on one node, but I have the backup of a hardware-based node if something goes wrong with our VMware installation. Not that such an event is likely, but I would prefer not to tempt fate.
I’ve mentioned before that we don’t use load balancing so much for load as for availability. By having dual-node clusters for our production environments we buy ourselves zero-downtime patch cycles. We did have a little trouble getting NLB working on Windows 2008, but we got it fixed after talking to Microsoft support.
The upgrade went really smoothly. I’m used to using CNAMEs to handle this sort of move, but due to SSL considerations knowledge.wharton.upenn.edu has an A record. The easiest way to make the change was to add the new nodes to the existing Windows 2003 cluster, then remove the Windows 2003 nodes. It worked like a charm, and I think it will be my new procedure as it was shockingly easy.
Core Services Code Base and ColdFusion 8
In looking to upgrade Knowledge and India Knowledge to ColdFusion 8 I had to touch a lot of the code. Not so much because there was a problem with it, but because we wanted to take advantage of new features. In the course of doing that I discovered that the main Knowledge site and India contained a lot of duplicated code between them. I was able to centralize it and then add new features to both sites. There are two main features that I added to the central code base: cached queries and search driven folksonomy.
Caching the queries was pretty trivial. I rolled my own instead of using an existing caching framework or native ColdFusion caching. I wanted an easy to flush cache system that didn’t need to be too complex. Because of the highly normalized nature of the database, I couldn’t get a tremendous performance boost through indexing; caching however has proven to be the correct solution by a long shot. It makes sense; we have a lot of frequently read, rarely written data here. I’m just surprised at the overall boost to the site we accomplished with one fix.
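To make the "easy to flush" idea concrete, here is a minimal sketch of that kind of cache. It is written in Python rather than ColdFusion and is not the code running on the site; the class name `QueryCache`, the time-to-live default, and the prefix-based flush are all assumptions made for illustration.

```python
import time


class QueryCache:
    """Tiny in-memory cache for frequently read, rarely written query results.

    Hypothetical illustration only: names and defaults are assumptions.
    """

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key, loader):
        """Return the cached value for key, calling loader() on a miss or expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader()  # the expensive database query would run here
        self._entries[key] = (now + self._ttl, value)
        return value

    def flush(self, prefix=""):
        """Drop every entry whose key starts with prefix (everything by default).

        Returns the number of entries removed.
        """
        doomed = [k for k in self._entries if k.startswith(prefix)]
        for k in doomed:
            del self._entries[k]
        return len(doomed)
```

With something like this, publishing an article could call `flush("article:")` so the next read repopulates the cache, while unrelated cached queries stay warm.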
“Search driven folksonomy” is a cool idea that my boss Dave had back in 2006. It was running for a while, then got deactivated for some reason, and I just re-implemented it. Basically, instead of having people manually tag articles, we use our search referral keywords to tag articles automatically: when an article reaches some critical number of hits for a keyword, that keyword becomes a tag on that article. We’ve enabled the collection piece for now and will enable tag display once we tweak the model a bit after getting some real data.
I can’t take credit for the look and feel. This was done by Dave and a co-worker, Sanjay. They worked on pushing Knowledge to a more current centered layout, along with a few other tweaks to accommodate advertising without compromising the editorial content.
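The collection-then-promotion model can be sketched roughly as follows. This is Python, not the site's ColdFusion, and several things are assumptions: the threshold value, the `q` query parameter for the search terms, treating the whole search phrase as one keyword, and the names `keyword_from_referrer` and `SearchTagger`.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs

TAG_THRESHOLD = 25  # hypothetical "critical number" of hits


def keyword_from_referrer(referrer, param="q"):
    """Pull the search phrase out of a referrer URL, or None if there is none.

    Assumes the search engine passes the phrase in a `q` parameter.
    """
    query = parse_qs(urlparse(referrer).query)
    phrase = " ".join(query.get(param, [""])[0].lower().split())
    return phrase or None


class SearchTagger:
    """Counts search hits per (article, keyword) and promotes frequent ones."""

    def __init__(self, threshold=TAG_THRESHOLD):
        self.threshold = threshold
        self.hits = Counter()  # (article_id, keyword) -> hit count

    def record(self, article_id, referrer):
        """Collection piece: note which keyword brought a reader to an article."""
        keyword = keyword_from_referrer(referrer)
        if keyword is not None:
            self.hits[(article_id, keyword)] += 1

    def tags_for(self, article_id):
        """Display piece: keywords that have reached the threshold."""
        return sorted(
            kw
            for (aid, kw), count in self.hits.items()
            if aid == article_id and count >= self.threshold
        )
```

Keeping `record` separate from `tags_for` mirrors the rollout described above: collection can run on its own while the threshold is tuned against real data.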
The one thing I contributed here was a custom tag that converted an article to an array of tags. Then the article custom tag was able to wrap around other custom tags and display them in the flow of the article at set positions in the array, or the next pre-determined location in the array, or at the end. It made for a very flexible way to showcase link suggestions or article tools within the flow of the article, thereby freeing up space for the aforementioned ads.
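As a rough illustration of the three placement rules (a set position, the next pre-determined location, or the end), here is a small Python sketch. It is not the ColdFusion custom tag itself: splitting the article on blank lines, the default slot positions, and the function names are all assumptions.

```python
def split_paragraphs(article_text):
    """Turn an article into a list of blocks (assumed: separated by blank lines)."""
    return [p.strip() for p in article_text.split("\n\n") if p.strip()]


def place_inserts(paragraphs, inserts, default_slots=(2, 5)):
    """Weave extra blocks into the article flow.

    inserts is a list of (content, position) pairs where position is:
      - an int N: show the content after the first N paragraphs,
      - "next":   use the next unused pre-determined slot, else the end,
      - "end":    show the content after the last paragraph.
    default_slots is a hypothetical set of pre-determined locations.
    """
    end = len(paragraphs)
    free_slots = [s for s in default_slots if s < end]
    by_slot = {}
    for content, position in inserts:
        if position == "next":
            slot = free_slots.pop(0) if free_slots else end
        elif position == "end":
            slot = end
        else:
            slot = min(max(int(position), 0), end)  # clamp to the article
        by_slot.setdefault(slot, []).append(content)

    flow = []
    for i in range(end + 1):
        flow.extend(by_slot.get(i, []))  # blocks that belong before paragraph i
        if i < end:
            flow.append(paragraphs[i])
    return flow
```

For example, `place_inserts(split_paragraphs(text), [("[article tools]", 1), ("[related links]", "next")])` would put the tools after the first paragraph and the links at the first pre-determined slot, falling back to the end for short articles.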
Development and Publishing System
This was the hardest part of the whole thing to tackle, because I was asking people to change the way they worked. But Dave and Sanjay were open to it, especially since I promised that it would make their lives much easier after a little bit of pain.
The old model consisted of doing development on a shared development server with no source control. Changes were manually pushed to production. Occasional copies were made of the code. Communication about changes was ad hoc and not necessarily as frequent as the changes.
The new model pushes development to local installs of ColdFusion. Source control is handled through Subversion hosted on Unfuddle.com. Communication about changes occurs on every update, thanks to Unfuddle’s notification system. The shared development server gets automatically updated from the trunk on every svn commit via svn commit hooks. Then to move the code around I have one-click ANT tasks that handle updating development from Subversion, updating staging from development, updating production from staging, and a unified task that does all of the updating in sequence (Subversion to dev to staging to production in one click, with a warning that you should only do this if you are sure about it). All of this is to accommodate the various publishing needs we have. I then wrote ColdFusion that calls the ANT tasks, and an AIR application that calls the ColdFusion. This gives us a one-click publishing tool that we can run from a browser or a desktop application.
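The promotion chain can be modeled in a few lines. This Python sketch is only an illustration of the ordering and the "are you sure" guard on the all-in-one task; it does not reproduce the real ANT build file, and the stage names, `plan`, `publish`, and the `run_step` callback (which would stand in for invoking an ANT task) are hypothetical.

```python
# Ordered environments that code is promoted through (names are illustrative).
STAGES = ["subversion", "development", "staging", "production"]


def plan(source, target):
    """List the single-hop updates needed to move code from source to target."""
    start, stop = STAGES.index(source), STAGES.index(target)
    if stop <= start:
        raise ValueError("target must come after source in the chain")
    return [(STAGES[k], STAGES[k + 1]) for k in range(start, stop)]


def publish(source, target, run_step, confirmed=False):
    """Run each hop in order via run_step(src, dst).

    A multi-hop publish (for example Subversion straight to production)
    refuses to run unless the caller explicitly confirms it.
    """
    steps = plan(source, target)
    if len(steps) > 1 and not confirmed:
        raise PermissionError("multi-stage publish requires confirmation")
    for src, dst in steps:
        run_step(src, dst)  # in a real setup this would call a build task
    return steps
```

A browser page or desktop client would then only need to call something like `publish("staging", "production", run_step)` for the everyday case, and pass `confirmed=True` for the full chain.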
We replaced one node of the cluster yesterday, fixed a few bugs, then replaced the other node today – all in all, a very smooth upgrade. I’m extremely happy. It’s a lot to accomplish in 3 months. Mostly, after years of working on very backend systems which never get touched by users, it’s extremely gratifying to work on something that I can show off.