What nationality would Star Wars characters be in the real world?

JABBA the Hutt, Admiral Ackbar and Chewbacca are Scottish according to some light-hearted language analysis by data scientists in Scotland.

Chewbacca and Han Solo in Star Wars: The Force Awakens

Ahead of the release of Star Wars: The Force Awakens, scientists at The Data Lab in Edinburgh have analysed several hundred characters from the Star Wars films and associated series to determine from which language each name is most likely to have come.

The scientists took a list of more than 500 names and applied an n-gram model, a technique from artificial intelligence, to each one.

The n-gram model, from the field known as natural language processing, first splits the name into a sequence of single, double, and triple character strings. For example, the name “Luke” decomposes into the strings “l”, “u”, “k”, “e”, “lu”, “uk”, “ke”, “luk”, and “uke”.
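The decomposition described above can be sketched in a few lines of Python. This is only an illustration of the splitting step; the function name `char_ngrams` and the `max_n` parameter are made up for this example and are not taken from The Data Lab's analysis.

```python
def char_ngrams(name, max_n=3):
    """Return every character n-gram of length 1 to max_n, shortest first."""
    text = name.lower()
    return [
        text[i:i + n]
        for n in range(1, max_n + 1)        # n-gram length: 1, 2, 3
        for i in range(len(text) - n + 1)   # every start position that fits
    ]


print(char_ngrams("Luke"))
# ['l', 'u', 'k', 'e', 'lu', 'uk', 'ke', 'luk', 'uke']
```

The output matches the nine strings listed for “Luke” in the paragraph above.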

The original Star Wars movie premiered in London in December 1977

Using a piece of software called textcat, the scientists compared the frequencies of the resulting strings with those found in dozens of language corpora.

From this the software is able to calculate probabilities of a given name coming from each of the languages. The most likely language is noted for each character name.
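To give a feel for how string frequencies can be turned into per-language probabilities, here is a toy sketch in Python. It is not textcat's implementation, whose details this article does not give. The scoring method (add-one smoothed log-likelihoods, normalised to sum to one), the function names and the two tiny sample sentences standing in for language corpora are all assumptions made for illustration, so the numbers it prints mean nothing beyond the example.

```python
import math
from collections import Counter


def char_ngrams(text, max_n=3):
    """Character n-grams of length 1 to max_n, as in the "Luke" example."""
    text = text.lower()
    return [text[i:i + n]
            for n in range(1, max_n + 1)
            for i in range(len(text) - n + 1)]


def build_profile(sample):
    """Count the n-grams of every word in a sample of one language."""
    counts = Counter()
    for word in sample.split():
        counts.update(char_ngrams(word))
    return counts


def log_likelihood(name, profile):
    """Add-one smoothed log-likelihood of the name's n-grams under a profile."""
    total = sum(profile.values())
    vocab = len(profile) + 1  # one extra slot for n-grams never seen
    return sum(math.log((profile[gram] + 1) / (total + vocab))
               for gram in char_ngrams(name))


def language_probabilities(name, profiles):
    """Normalise the per-language scores so that they sum to one."""
    scores = {lang: log_likelihood(name, p) for lang, p in profiles.items()}
    top = max(scores.values())  # subtract the maximum for numerical stability
    weights = {lang: math.exp(s - top) for lang, s in scores.items()}
    norm = sum(weights.values())
    return {lang: w / norm for lang, w in weights.items()}


# Hypothetical stand-ins for real corpora: one short sentence per language.
SAMPLES = {
    "english": "the quick brown fox jumps over the lazy dog",
    "german": "der schnelle braune fuchs springt über den faulen hund",
}
PROFILES = {lang: build_profile(text) for lang, text in SAMPLES.items()}

probabilities = language_probabilities("Vader", PROFILES)
print(max(probabilities, key=probabilities.get), probabilities)
```

A real analysis would build each profile from a large body of text and cover dozens of languages. With samples this small the result for any single name is close to arbitrary.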

The technique is normally applied to larger bodies of text and is typically used to categorise written works by similarity, author or subject matter. In this instance the language analysis has been done as a bit of fun and is not intended to be taken too seriously.

Several of the best-known Star Wars characters are given in this article. For the full list of 500 characters visit the blog on The Data Lab’s website.


Admiral Gial Ackbar: Scottish Gaelic

Padmé Amidala: Tagalog

Wedge Antilles: Danish

Jar Jar Binks: Middle Frisian

C-3PO: Catalan

Chewbacca: Scottish Gaelic

Salacious B. Crumb: Catalan

Count Dooku: Slovakian

Jango Fett: Swedish

Boba Fett: Hungarian

Bib Fortuna: Basque

General Grievous: Breton

Jabba the Hutt: Scottish

Qui-Gon Jinn: Scottish Gaelic

Obi-Wan Kenobi: Slovenian

Owen Lars: German

Darth Maul: Welsh

Princess Leia Organa: Romansh

Emperor Sheev Palpatine: Slovenian

R2-D2: Indonesian

Sebulba: Indonesian

Darth Sidious: Irish

Anakin Skywalker: Tagalog

Luke Skywalker: Middle Frisian

Han Solo: Norwegian

Grand Moff Wilhuff Tarkin: German

Darth Vader: German

Watto: Italian

Mace Windu: French

Yoda: Bosnian

The names span a huge number of different languages, from the readily familiar to the rather more obscure. Middle Frisian, for example, was spoken around the Netherlands, Germany and southern Denmark in the 17th and 18th centuries, whilst Tagalog is a modern-day language from the Philippines.

In addition to those in the list above, other characters whose names most likely derive from Scottish or Scottish Gaelic include:


Abeloth: Scottish

Queen Apailana: Scottish

Cad Bane: Scottish Gaelic

Cin Drallig: Scottish Gaelic

Mama the Hutt: Scottish

Mawhonic: Scottish

Mon Mothma: Scottish

Ree-Yees: Scottish

Sy Snootles: Scottish

Captain Grear Typho: Scottish Gaelic

There appears to be a connection between the names of the Hutt characters and Scottish. In addition to Jabba the Hutt, each of Borvo the Hutt, Gardulla the Hutt, Mama the Hutt, Rotta the Hutt, Ziro the Hutt, and Zorba the Hutt maps to Scottish, as does Sy Snootles, the lead vocalist in Jabba’s house band in Episode VI - Return of the Jedi.

Dr Richard Carter and Dr Roman Popat work for The Data Lab as data scientists. The Data Lab is a Scottish Innovation Centre focused on helping Scotland generate significant economic, social and scientific value from data through collaboration, education and community building.