Warning: contains spoilers

If you have seen the new addition to the Star Wars franchise - The Force Awakens - you have probably noticed similarities to the plots of the earlier movies, especially Episode IV: A New Hope. Does the similarity in the story translate into similarity in the social network of the new film? I downloaded the movie script, extracted the social network of characters and compared it to the social networks from the earlier movies.

Read my analysis of social networks from Episodes I to VI here.

Network part

[Continue reading ...]

Some of us are looking forward to Christmas, and some of us are looking forward to the new film in the Star Wars franchise, The Force Awakens. Meanwhile, I decided to look at the whole 6-movie cycle from a quantitative point of view and extract the Star Wars social networks, both within each film and across the whole Star Wars universe. Looking at the social network structure reveals some surprising differences between the original trilogy and the prequels.

Star Wars logo

[Continue reading ...]

In my previous blog post I visualized data on James Bond films both with Google Charts and with ggplot2. Because I skipped the code relating to ggplot2, here I'd like to look in detail at how to use ggplot2 from F#.

Currently ggplot2 is my go-to visualization library (unless I need to embed a plot - check out the James Bond bubble chart!). Here I summarize some of my experiences with using ggplot2 from F# through the RProvider. I also put together a simple wrapper around the most common ggplot2 functions to simplify the usage.

ggplot2 bar plot

[Continue reading ...]

Earlier this week I read an interesting article on visualizing the box office success of James Bond films using R and ggplot2 by Christoph Safferling (you can find it here). The blog post shows how to pull information from Wikipedia and visualize the budget, box office and rating of each film - all this using R. While reading the blog post, I couldn't help wondering how a similar analysis would look in F# using the HTML type provider from the F# Data library.

007 Logo

[Continue reading ...]

This blog post marks day 15 of the amazing F# Advent Calendar. Christmas is getting closer - soon we will have time to relax and perhaps read a nice book. Do you know who wrote the classic Christmas story, 'A Christmas Carol'? All sources claim it was Charles Dickens, but how can we be sure? I'll look at how this book compares to his other books in terms of the language used. I'll also analyse other classic works of literature from the Victorian and Edwardian eras and look at the similarity of their language. In the end, I'll try to find out if it really was Charles Dickens who wrote 'A Christmas Carol'.

First edition of Christmas Carol, 1843

[Continue reading ...]

For my academic research, I recently wrote a library in F# for fitting basic Gaussian process regression that I used to model time-series gene expression data. I am releasing the code publicly as an Ariadne package. In its current state, you can use the library to model various time-series data.

Gaussian process regression

[Continue reading ...]

Fans of different programming languages always argue about the benefits of their language of choice. It is difficult to use objective criteria in a debate like this. Terms like 'clarity' or 'maintainability' are too vague and subjective. What if we used some tools from network science to compare projects written in different languages?

In this blog post I use network analysis to investigate how complex dependency graphs are and if they differ between C# and F#. It turns out that F# and C# dependency networks have quite different structures and use different local network patterns. For example, I'll describe specific types of cyclic dependencies that frequently appear only in C# projects.

Examples of motifs on 3 and 4 nodes

[Continue reading ...]

Have you ever wondered who you should follow on Twitter to get more interesting F# content? Recently I've written a chapter on social network analysis for the new F# Deep Dives book. The chapter shows how to download data on social connections from Twitter and how to do some exploratory data analysis, such as finding accounts that people find worth following.

I worked with a network around the F# Software Foundation's account. What emerged is a nice picture of how the F# community looks on Twitter and which users are the most central to the network. Since the results are quite interesting, I'd like to share them with the wider F# community.

Snapshot of the F# Twitter network

[Continue reading ...]