This December I got to experience the North, above the Arctic Circle, on an adventurous trip to Finnish Lapland (Rovaniemi), Tromsø and Svalbard (a Norwegian archipelago with some of the northernmost settlements in the world). Starting from Bristol, we traveled to London and took a connecting flight to Rovaniemi via Helsinki. There we mushed a husky sled, drove a snowmobile through the forest and onto the frozen Kemijoki river, visited the Ranua Wildlife Park and the Snowman World Igloo Hotel, and went snowboarding.
SQL (Structured Query Language) and NoSQL (commonly defined as “Not only Structured Query Language”) are the 2 most popular approaches to querying data in database management systems. Strictly speaking, only SQL is a query language; NoSQL is an umbrella term for non-relational databases, each with its own query interface. They are not the only options, however: other query languages include XPath, Xbase++ and RDF query languages such as SPARQL (SPARQL Protocol and RDF Query Language).
Recently I looked into parallelizing my code and came across 2 terms which are often used interchangeably but do not mean the same thing: concurrency and parallelism. Both make tasks appear to run at the ‘same time’, because the tasks' lifetimes overlap. However, they achieve this in fundamentally different ways. In fact, concurrency can be achieved on a single processor, while parallelism requires at least 2. This is because concurrency interleaves multiple tasks, switching between them so that only one is actually executing at any instant, while parallelism divides a single task into subtasks which execute simultaneously on separate processors.
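To make the difference concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions: the names `task`, `run_concurrently`, `split`, `partial_sum` and `run_in_parallel` are hypothetical, and the sum of squares is a stand-in for any task that can be divided. The first half interleaves two tasks on a single thread with `asyncio`; the second half splits one task into chunks and hands them to separate processes.

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor


async def task(name, steps, log):
    """One task that hands control back to the event loop after every step."""
    for i in range(steps):
        log.append(f"{name}{i}")
        await asyncio.sleep(0)  # yield so the other task gets a turn


async def run_concurrently():
    """Concurrency: two tasks interleave on one thread, one step at a time."""
    log = []
    await asyncio.gather(task("A", 3, log), task("B", 3, log))
    return log


def partial_sum(chunk):
    """The independent piece of work each worker process performs."""
    return sum(x * x for x in chunk)


def split(data, parts):
    """Divide one task's input into at most `parts` roughly equal chunks."""
    size = max(1, -(-len(data) // parts))  # ceiling division, at least 1
    return [data[i:i + size] for i in range(0, len(data), size)]


def run_in_parallel(data, workers=2):
    """Parallelism: the chunks are processed simultaneously in separate processes."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, split(data, workers)))
```

Calling `asyncio.run(run_concurrently())` returns the steps of A and B in alternating order, even though only one of them is ever executing at a given instant. `run_in_parallel(list(range(1000)))` should be called from under an `if __name__ == "__main__":` guard, since worker processes may re-import the script, and it only overlaps in time when at least 2 cores are available.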
When performing multi-class classification, confusion matrices do a good job of presenting the results while preserving all the information: % correct classification accuracy, % misclassifications and the misclassification classes for each predicted class. It's when the number of classes gets beyond ~5 that these visualizations start to become inappropriate. The matrices become too large to be presented anywhere, whether on a presentation slide or in a figure in a manuscript. The issue is further amplified when we have hierarchical classification, where we want to show inherited (mis)classifications down a tree.
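As a small reminder of what such a matrix contains, here is a minimal Python sketch. The labels and predictions are made up for illustration, the function names are hypothetical, and I assume the common convention of rows for actual classes and columns for predicted classes.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes (assumed convention)."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. correctly classified."""
    total = sum(map(sum, matrix))
    return sum(matrix[i][i] for i in range(len(matrix))) / total


# Made-up example data, not results from any real classifier.
labels = ["bird", "cat", "dog"]
actual = ["cat", "cat", "dog", "dog", "bird"]
predicted = ["cat", "dog", "dog", "dog", "bird"]

matrix = confusion_matrix(actual, predicted, labels)
for label, row in zip(labels, matrix):
    print(f"{label:>5}", row)
print("accuracy:", accuracy(matrix))
```

A matrix for k classes has k × k cells, so 3 classes give 9 cells but 20 classes already give 400, which is why the figure stops being readable so quickly.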
I’ve recently needed to visualize a hierarchical tree, so I customized the Radial Reingold-Tilford Tree. The visualization reads its data from a JSON file, which I wanted to generate automatically from a MATLAB cell array of bacterial taxonomic classifications, where columns represent tree levels and rows represent the different children/nodes at the bottom-most level. It’s a very quick-and-dirty script which first determines the unique bottom-level child nodes and then recursively finds nodes with a common ancestor/parent node.
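The same kind of conversion can be sketched in a few lines of Python. This is not the MATLAB script itself: it is a simplified top-down variant that groups rows by their first column and recurses, rather than starting from the bottom-level nodes. The function name `build_tree`, the `"root"` label and the three example rows are my own assumptions, and the `name`/`children` layout is the nested format that D3 tree examples commonly read.

```python
import json


def build_tree(rows, name="root"):
    """Recursively group rows by their first column into a name/children dict."""
    node = {"name": name}
    groups = {}  # dicts keep insertion order, so children follow input order
    for row in rows:
        if row:  # an empty row means we are past the bottom-most level
            groups.setdefault(row[0], []).append(row[1:])
    if groups:
        node["children"] = [build_tree(rest, child) for child, rest in groups.items()]
    return node


# Illustrative rows only: one row per bottom-level node, one column per tree level.
rows = [
    ("Proteobacteria", "Gammaproteobacteria", "Escherichia"),
    ("Proteobacteria", "Gammaproteobacteria", "Salmonella"),
    ("Firmicutes", "Bacilli", "Bacillus"),
]

tree = build_tree(rows)
print(json.dumps(tree, indent=2))
```

Rows that share a parent end up under a single node, so `Escherichia` and `Salmonella` both appear as children of one `Gammaproteobacteria` entry.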