By: Wayland Radin and Andrew Erland
May the force be with you
Inquiring minds on the ID team wanted to determine, once and for all, where certain Star Wars icons fell on the light vs. dark spectrum of the force. Naturally, ID looked to Relativity Analytics (specifically Active Learning) to answer this burning question.
Our team imported Wikipedia articles for the Star Wars universe, identified five Jedi records and coded them Light, and then found five Sith records and coded them Dark. With enough characters identified as Light and Dark, we then turned the Active Learning (AL) model loose on the remainder of the characters.
The force is strong with this one
With only 10 coded characters to learn from, the AL model was able to accurately determine which of the remaining characters were affiliated with the Light or Dark side.
Characters associated with the Light side of the force were almost entirely ranked above 50, and the two that ranked below 50 were in fact double agents! The characters associated with the Dark side generally ranked around 50 or lower.
We decided to take it a step further and calculated Recall and Precision to make sure that our model really was as accurate as it seemed. Recall is the share of all Light characters that the model identifies as Light. There are 26 Light characters overall, 25 of which are identified as such by the model; thus our Recall is 25/26 = 96%. Precision, the share of characters the model identifies as Light that really are Light, is also a promising 74%.
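For readers who want to check the arithmetic, here is a minimal Python sketch of the two calculations. The 26 and 25 come straight from our results; the post does not report how many Dark characters were flagged as Light, so `assumed_false_positives = 9` is only an illustrative value chosen because it is consistent with the 74% Precision figure, not a number from the model output. The function and variable names are ours as well.

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Share of all truly Light characters that the model found."""
    return true_positives / (true_positives + false_negatives)


def precision(true_positives: int, false_positives: int) -> float:
    """Share of characters flagged as Light that really are Light."""
    return true_positives / (true_positives + false_positives)


# Figures reported in this post.
light_total = 26   # Light characters overall
light_found = 25   # Light characters the model identified as Light
light_missed = light_total - light_found

# ASSUMPTION: not reported in the post; illustrative value that is
# consistent with the stated 74% Precision.
assumed_false_positives = 9

r = recall(light_found, light_missed)
p = precision(light_found, assumed_false_positives)

print(f"Recall:    {r:.0%}")
print(f"Precision: {p:.0%}")
print(f"Flagged as Light but not Light: {1 - p:.0%}")
```

The last line is simply one minus Precision: the portion of the model's Light picks that a reviewer would have to throw back.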
Overall, we can be confident that we are correctly identifying most of the Light characters with minimal “review” required of those incorrectly identified. After coding only the minimum 10 records, this model will correctly pull in 96% of the droids we are looking for, while about 26% of what it pulls in will be droids we are not.