Adrian Rossouw is a Node.js developer based in Cape Town, South Africa.
He has helped to build and launch several applications in Node.js since finding the platform in 2010.

In a previous life he was a core contributor to Drupal and founded the Aegir Hosting System.

Why I started a Node.js user group 27 May 2012

I wrote this article to be able to plug the Node.js Cape Town user group I started recently. Our next meeting will be on the 7th of July 2012, where I will be tempting fate with a live-coding demonstration of building a basic node application.

My decision to start NodeCPT has a couple of different motivations, which I will deal with mostly separately in this post. This post was originally going to include a guide to starting your own user group, but I decided that it was best to split that into another post in the near future.

The big picture.

Why a Node.js group and not just JavaScript?

I firmly believe that JavaScript is going to be one of the most crucial technologies in the coming years, supplanting a lot of technologies that have become very entrenched in the industry up to now. I really want to help that along however I can, because I am incredibly excited about what the future holds for us.

I think that the value proposition of JavaScript on the server is not being able to share code with the client, but rather being able to share developers. Using the same language on the front end and back end breaks down barriers and ultimately encourages more people to understand the ‘full stack’.

I also find Node.js exciting because it is an amazing opportunity to shake off...

Read more →


Replacing CouchDB views with ElasticSearch 20 May 2012

Originally published on 09 April 2011 on the DevelopmentSeed intranet.

I edited this post to provide more context, so that the references to the project internals actually make sense to those who didn’t work on it.
This internal post eventually led to a public blog post, but this is the journal of my concrete experiences with ElasticSearch.

This devlog follows directly on from my post about performance improvements.

The main gist of the previous post was that I was having trouble fixing some bugs that occurred during a bulk import because of the glacial pace at which the process ran, a pace directly attributable to the need to generate incredibly large CouchDB views.

What started as a straightforward task forced me to try some very interesting approaches, eventually turning into a completely experimental branch of the project that I have been fiddling with in my free time over the last weekend.

We have ElasticSearch available, let’s use it.

Since I last wrote about ElasticSearch, we have implemented it for the search functionality on the background and analysis pages. Knowing the kind of indexes we were building this incredibly slow view for in CouchDB, I thought it might be faster, if not more straightforward, to simply index the data with ElasticSearch as well, allowing us to make use of its extensive query capabilities.

One concession I did make in my experiment, however, was to save the ‘materialized’ latest values and their respective years inside the object in CouchDB. This not only made the queries and indexing simpler, but also made a whole bunch of the code around comparisons and displays cleaner. It would probably have simplified things for the maps too, so I am of the opinion that we should have done this a long time ago regardless.
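As a rough illustration of what storing ‘materialized’ latest values could look like, here is a minimal sketch. The document shape (indicator name mapped to sparse per-year values), the function name `materializeLatest`, and the sample indicator names are all assumptions made for this example; they are not taken from the project's actual code.

```javascript
// Hypothetical shape: indicator name -> { year: value }, with sparse years.
// For each indicator, record the most recent year that has a value, plus
// that value, alongside the original per-year data.
function materializeLatest(indicators) {
  const result = {};
  for (const [name, byYear] of Object.entries(indicators)) {
    const years = Object.keys(byYear)
      .map(Number)
      .filter((year) => byYear[year] !== null && byYear[year] !== undefined)
      .sort((a, b) => a - b);
    if (years.length === 0) continue; // no recorded value for this indicator
    const latestYear = years[years.length - 1];
    result[name] = { latest: byYear[latestYear], latestYear, values: byYear };
  }
  return result;
}

// Made-up sample data for one district.
const district = {
  enrollment: { 2008: 1200, 2010: 1350 },
  graduationRate: { 2009: 0.81 },
  classSize: {}, // indicator with no recorded years is skipped
};

const materialized = materializeLatest(district);
console.log(materialized.enrollment.latest, materialized.enrollment.latestYear);
```

With the latest value and year stored on the document itself, a query or comparison can read `latest` directly instead of scanning every year at read time.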

About the data

The data for each of the school districts is split into around 65 indicators, which are then split (sparsely) into up to 10 values for each recorded year. The most complex view we have is used to compare each of the school districts to each other, based on these indicators. We end up with 57059 entries in the view for every dataset that is uploaded, and there are multiple datasets in the system at any one time.

Having this amount of data in CouchDB and the views is not a problem, but user-initiated batch imports into the system paint a very different performance picture from the traditional user-contributed content workflow. What was killing us was having to import > 16k records all at once, not simply having > 16k records in the database.

...

Read more →


JSON Schema and CouchDB bulk import performance 15 May 2012

Originally published on 05 April 2011 on the DevelopmentSeed intranet.

This project launched successfully; check out the release announcement for more info.
I have revised and double checked this article before publishing, but I am unable to re-run the benchmarks.

During the last week I have been responsible for fixing a number of critical issues on the FEBP project to get it ready for the client to be able to work with it. One of the areas I ended up needing to spend a lot of time on was improving performance related to data import and validation.

Background

On the FEBP site, the administrator is able to upload a CSV file describing the data for each of the data sets, which then changes how we interpret and display this data, and they can upload new versions of the data itself. Each dataset and schema is versioned, and the site has the concept of an “active version”.

When the admin uploads a new version of the data, once it is imported they are required to validate it against a schema, and are able to preview the new...

Read more →


Dynamic Form Generation with JSON Schema. 06 May 2012

Originally published on 05 December 2010 on the DevelopmentSeed intranet.
I have taken the time to revise and double check the information contained within it.

Foreword Added on 06 May 2012

This is one of the earliest posts I wrote while learning to use Node.js. It was written during a phase where I was still trying to turn Node into Drupal. It is also one of the first times I realized that trying to do so was a mistake.

You simply do not need a system that automagically generates forms based on declarative control structures. In my Drupal days, one of my largest contributions was the Drupal Forms API, which worked on similar principles, so this was a very difficult lesson for me to learn.

While I don’t think what I was trying to accomplish in this article is the correct approach, the technology I was researching to help me solve it is actually really powerful and useful.

Other people have also realized that schemas could be used to generate forms:

Whenever you find yourself having to come up with a format to declare anything in JSON Schema, whether it be how a config file is structured or some other problem, I urge you to take a look to see if there isn’t already an agreed upon way...

Read more →