A Trip to the North

This December I got to experience the North, above the Arctic Circle, on an adventurous trip to Finnish Lapland (Rovaniemi), Tromso and Svalbard (home to some of the northernmost permanently inhabited settlements in the world). Starting from Bristol, we traveled to London and took a connecting flight to Rovaniemi via Helsinki. There we mushed a husky sled, drove a snowmobile through the forest and onto the frozen Kemijoki river, visited the Ranua Wildlife Park and the Snowman World Igloo Hotel, and went snowboarding.

Continue Reading ...
Introduction to SQL and NoSQL Databases
Image source: udemy

SQL (Structured Query Language) and NoSQL (commonly expanded as “Not only SQL”) are the two most popular approaches to querying data in database management systems. Strictly speaking, SQL is a query language, while NoSQL is an umbrella term for non-relational databases, each with its own query interface. They are not the only options, however; other query languages include XPath, XBase++ and RDF query languages such as SPARQL (SPARQL Protocol and RDF Query Language).
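To make the contrast concrete, here is a minimal Python sketch that asks the same question in both styles. The SQL half uses the standard-library `sqlite3` module. The document half is only a stand-in: `find` is a hypothetical helper written for this example, not the API of any real NoSQL database, and the table, field names and records are made up for illustration.

```python
import sqlite3

# Relational style: a fixed schema, queried with SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("Ada", "London"), ("Linus", "Helsinki")],
)
sql_result = [
    row[0]
    for row in conn.execute("SELECT name FROM users WHERE city = ?", ("Helsinki",))
]
conn.close()

# Document style: schema-free records, queried by example.
# Note that the two documents do not need to share the same fields.
documents = [
    {"name": "Ada", "city": "London", "languages": ["English"]},
    {"name": "Linus", "city": "Helsinki"},
]


def find(collection, criteria):
    """Hypothetical helper: return documents whose fields match all criteria."""
    return [
        doc
        for doc in collection
        if all(doc.get(key) == value for key, value in criteria.items())
    ]


doc_result = [doc["name"] for doc in find(documents, {"city": "Helsinki"})]

print(sql_result, doc_result)
```

Both queries select the same record. The difference lies in how the data is modelled and how the question is phrased.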

Continue Reading ...
Parallelism vs. Concurrency

Recently I looked into parallelizing my code and came across two terms that are often used interchangeably but do not mean the same thing: concurrency and parallelism. Both appear to perform tasks at the ‘same time’, since the tasks overlap within the same time span. However, this is achieved in fundamentally different ways. In fact, concurrency can be achieved on a single processor, while parallelism requires at least two. This is because concurrency interleaves multiple tasks, while parallelism divides a single task into multiple subtasks that execute simultaneously.
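The distinction can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names are invented for this example, the tasks are toy countdowns and sums, and the round-robin scheduler is a deliberately simplified stand-in for what an operating system or event loop does.

```python
from concurrent.futures import ProcessPoolExecutor


def countdown(name, n):
    """A task written as a generator: every yield is a point where it can be paused."""
    for i in range(n, 0, -1):
        yield f"{name}:{i}"


def run_concurrently(tasks):
    """Concurrency: interleave several tasks on ONE processor, round-robin."""
    log = []
    queue = list(tasks)
    while queue:
        task = queue.pop(0)
        try:
            log.append(next(task))  # let the task run one step
            queue.append(task)      # not finished yet: back of the queue
        except StopIteration:
            pass                    # finished: drop it
    return log


def split(data, parts):
    """Divide the input of a single task into roughly equal chunks."""
    size = max(1, -(-len(data) // parts))  # ceiling division, at least 1
    return [data[i:i + size] for i in range(0, len(data), size)]


def parallel_sum(data, workers=2):
    """Parallelism: sum each chunk in a separate process, then combine."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum, split(data, workers)))


if __name__ == "__main__":
    print(run_concurrently([countdown("A", 2), countdown("B", 3)]))
    print(parallel_sum(list(range(1_000_000))))
```

In `run_concurrently` only one task makes progress at any instant, yet both advance over the same period. In `parallel_sum` the chunks can genuinely run at the same moment, provided the machine has more than one core available.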

Continue Reading ...
Confusion Matrix - Alternative Visualization

When performing multi-class classification, confusion matrices do a good job of presenting the results while preserving all the information: the percentage of correct classifications, the percentage of misclassifications, and the classes each class gets confused with. It’s when the number of classes grows beyond about five that these visualizations start to become unwieldy. The matrices become too large to present anywhere, whether on a presentation slide or in a manuscript figure. The issue is further amplified with hierarchical classification, where we want to show inherited (mis)classifications down a tree.
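As a reminder of what such a matrix holds, here is a small pure-Python sketch. The labels and predictions are made up, the function names are my own, and I am assuming the common convention where rows are the true classes and columns the predicted ones.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Count (true, predicted) pairs; rows are true classes, columns predicted."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


def row_percentages(matrix):
    """Convert each row to percentages of that true class."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([100.0 * c / total if total else 0.0 for c in row])
    return result


labels = ["a", "b", "c"]
actual = ["a", "a", "b", "b", "c", "c"]
predicted = ["a", "b", "b", "b", "c", "a"]

matrix = confusion_matrix(actual, predicted, labels)
percentages = row_percentages(matrix)

for label, row in zip(labels, percentages):
    print(label, row)
```

The diagonal gives the per-class accuracy and every off-diagonal cell names a specific confusion. With k classes the matrix has k × k cells, which is why it stops fitting on a slide so quickly.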

Continue Reading ...
Visualizing Hierarchical Trees - XML and JSON Generator
Image source: mbostock

Let’s start off with a quick script. I’m sure you’ve heard of the D3 library. If not, it’s a highly versatile JavaScript library for visualizing data. It has been used to recreate some already-familiar visualizations (e.g. the correlation matrix) with added functionality and interactivity, or at least with less effort, and to introduce new concepts, too many to choose from. The example gallery alone is quite inspiring. Some of the examples are cool visualizations, though I’m not sure they always suit the purpose. Anyway, enough rambling.

I recently needed to visualize a hierarchical tree, so I customized the Radial Reingold-Tilford Tree. It is driven by a JSON file, which I wanted to generate automatically from a MATLAB cell array holding bacterial taxonomic classifications, where columns represent tree levels and each row is a child node at the bottom-most level. It’s a very quick-and-dirty script which first determines the unique bottom-level child nodes and then recursively groups nodes that share a common ancestor/parent node.
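For readers without MATLAB, the same idea can be sketched in Python. This is not the script described above: it works top-down, grouping rows by their first column and recursing on the rest, rather than starting from the bottom-level nodes. The `build_tree` name, the `"root"` label and the sample rows are assumptions for illustration, and the output uses the nested `name`/`children` layout that D3's hierarchy examples typically read.

```python
import json


def build_tree(rows, name="root"):
    """Turn rows of [level1, level2, ...] into a nested name/children dict."""
    node = {"name": name}
    groups = {}
    for row in rows:
        if row:  # an empty row means we are below the deepest level
            groups.setdefault(row[0], []).append(row[1:])
    if groups:
        node["children"] = [
            build_tree(sub_rows, child) for child, sub_rows in groups.items()
        ]
    return node


# Illustrative taxonomy: columns are phylum, class, genus.
rows = [
    ["Firmicutes", "Bacilli", "Lactobacillus"],
    ["Firmicutes", "Bacilli", "Bacillus"],
    ["Firmicutes", "Clostridia", "Clostridium"],
    ["Proteobacteria", "Gammaproteobacteria", "Escherichia"],
]

tree = build_tree(rows)
print(json.dumps(tree, indent=2))
```

Rows that share a prefix end up under the same parent, so the two Bacilli genera become siblings. Leaves carry no `children` key.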

Continue Reading ...