Evelina Gabasova (@evelgab)
"F# empowers users to tackle complex computing problems with simple, maintainable and robust code."
Recognizing languages
What is the language of this text?
A Csillagok haboruja egy uropera filmsorozatnak, irodalmi muveknek es szamitogepes jatekoknak a neve.
This is Hungarian, of course! 
[ NEAREST NEIGHBOUR CLASSIFIER ]
Get sample text from Wikipedia pages (done)
Calculate features: frequencies of letter pairs
Compare languages using their features
Example using sample English text "the three"



Now calculate probabilities of the pairs



            th    e_    ee    el
English     0.3   0.2   0.2   0.1
Portuguese  0.0   0.2   0.1   0.3
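The feature calculation can be sketched in F# as follows. This is a minimal illustration rather than the code from the talk: the function name `pairProbabilities` is hypothetical, spaces are written as `_` to match the table, and lower-casing the text is an assumption. The numbers in the table are illustrative, so the output need not match them.

```fsharp
// Hypothetical helper: relative frequencies of adjacent letter pairs.
// Spaces are written as '_' to match the notation used in the table.
let pairProbabilities (text: string) : Map<string, float> =
    let normalized = text.ToLowerInvariant().Replace(' ', '_')
    let pairs =
        normalized
        |> Seq.pairwise                                  // ('t','h'), ('h','e'), ...
        |> Seq.map (fun (a, b) -> string a + string b)   // "th", "he", ...
        |> Seq.toList
    let total = float pairs.Length
    pairs
    |> List.countBy id                                   // count each distinct pair
    |> List.map (fun (pair, count) -> pair, float count / total)
    |> Map.ofList

// Features of the sample text from the slide.
let features = pairProbabilities "the three"
```

Each language then gets one such map, built from its Wikipedia sample text.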
Distance is the sum of squares of differences.
            th    e_    ee    el
English     0.3   0.2   0.2   0.1
Portuguese  0.0   0.2   0.1   0.3
Difference  0.3   0.0   0.1   0.2
Sum of squares: \(0.09+0.0+0.01+0.04 = 0.14\)
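The same calculation as a short F# sketch. The name `distance` and the choice to treat a pair missing from one language as probability 0.0 are assumptions made for illustration; the feature values are the ones from the table.

```fsharp
// Sum of squared differences between two feature maps.
// A pair that is missing from one map counts as probability 0.0 (assumption).
let distance (a: Map<string, float>) (b: Map<string, float>) : float =
    let keysOf m = m |> Map.toSeq |> Seq.map fst |> Set.ofSeq
    Set.union (keysOf a) (keysOf b)
    |> Seq.sumBy (fun key ->
        let pa = defaultArg (Map.tryFind key a) 0.0
        let pb = defaultArg (Map.tryFind key b) 0.0
        (pa - pb) ** 2.0)

// Values taken from the table.
let english    = Map.ofList [ ("th", 0.3); ("e_", 0.2); ("ee", 0.2); ("el", 0.1) ]
let portuguese = Map.ofList [ ("th", 0.0); ("e_", 0.2); ("ee", 0.1); ("el", 0.3) ]

// Approximately 0.14, up to floating-point rounding.
let d = distance english portuguese
```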
              English  Spanish  Portuguese  Czech
Unknown text  0.10     0.14     0.25        0.27
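The nearest neighbour step then picks the language with the smallest distance. A minimal sketch, using the distances from the table; the function name `nearest` is hypothetical.

```fsharp
// Pick the label with the smallest distance to the unknown text.
let nearest (distances: (string * float) list) : string =
    distances |> List.minBy snd |> fst

// Distances from the table.
let distances =
    [ ("English", 0.10); ("Spanish", 0.14); ("Portuguese", 0.25); ("Czech", 0.27) ]

let guess = nearest distances   // "English"
```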
[ PERCEPTRON ]
[ LOGISTIC REGRESSION ]
\(f(x) = \frac{1}{1 + e^{-x}}\)
Initial weights can be generated randomly
Improve weights using gradient descent
Repeat recursively until the error is small enough or a maximum number of steps is reached
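The three steps above can be sketched in F# roughly like this. It is an illustration under stated assumptions, not the code from the talk: the toy data, the learning rate, the mean squared error used as the stopping check, and all function names are hypothetical. The update uses the standard gradient of the cross-entropy loss, and the first feature of every example is a constant 1.0 acting as a bias term.

```fsharp
// Logistic (sigmoid) function.
let sigmoid (x: float) = 1.0 / (1.0 + exp (-x))

// Prediction: sigmoid of the weighted sum of the features.
let predict (weights: float[]) (features: float[]) =
    Array.fold2 (fun acc w x -> acc + w * x) 0.0 weights features |> sigmoid

// Stopping check: mean squared difference between prediction and label (assumption).
let meanSquaredError (weights: float[]) (data: (float[] * float) list) =
    data |> List.averageBy (fun (x, y) -> (predict weights x - y) ** 2.0)

// One gradient descent step using the cross-entropy gradient (prediction - label) * x.
let step (rate: float) (weights: float[]) (data: (float[] * float) list) =
    let gradient =
        data
        |> List.map (fun (x, y) ->
            let diff = predict weights x - y
            Array.map (fun xi -> diff * xi) x)
        |> List.reduce (Array.map2 (+))
    Array.map2 (fun w g -> w - rate * g / float data.Length) weights gradient

// Repeat recursively until the error is small enough or the steps run out.
let rec train rate maxSteps tolerance weights data =
    if maxSteps = 0 || meanSquaredError weights data < tolerance then weights
    else train rate (maxSteps - 1) tolerance (step rate weights data) data

// Toy data (hypothetical): bias feature 1.0 plus one input; label 0.0 or 1.0.
let data =
    [ ([| 1.0; 0.0 |], 0.0); ([| 1.0; 1.0 |], 0.0)
      ([| 1.0; 3.0 |], 1.0); ([| 1.0; 4.0 |], 1.0) ]

// Initial weights generated randomly, with a fixed seed for repeatability.
let rnd = System.Random(42)
let initial = Array.init 2 (fun _ -> rnd.NextDouble() - 0.5)

let trained = train 0.5 5000 0.01 initial data
```

Because `train` calls itself in tail position, the F# compiler can turn the recursion into a loop, so a large step limit does not grow the stack.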
FsLab Package www.fslab.org
@evelgab  
evelina@evelinag.com  
github.com/evelinag  
evelinag.com 