Google Normalized Distance

Sometimes a statistical approach simply wins over attempts to model the real world scientifically. Here is an example. Google Normalized Distance (more commonly written Normalized Google Distance, or NGD) measures the relatedness between two words or concepts. It compares the number of search results returned when the two terms are searched together with the number returned when each is searched individually.
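To make that comparison concrete, here is a minimal sketch of the standard NGD formula computed from raw hit counts. The class name NgdSketch, the method name and the sample counts are made up for illustration. The jar mentioned in this post may compute things differently, and the total number of indexed pages (n) is an assumption you have to supply yourself.

```java
// Hypothetical helper, not part of any library mentioned in this post.
final class NgdSketch {

    /**
     * Normalized Google Distance from hit counts:
     *   NGD(x, y) = (max(log fx, log fy) - log fxy) / (log n - min(log fx, log fy))
     *
     * fx, fy : hits for each term searched alone
     * fxy    : hits for both terms searched together
     * n      : assumed total number of pages indexed by the search engine
     */
    static double distance(double fx, double fy, double fxy, double n) {
        if (fx <= 0 || fy <= 0 || n <= 0) {
            throw new IllegalArgumentException("hit counts and n must be positive");
        }
        if (fxy <= 0) {
            // The terms never appear together: treat them as infinitely far apart.
            return Double.POSITIVE_INFINITY;
        }
        double logFx = Math.log(fx);
        double logFy = Math.log(fy);
        double numerator = Math.max(logFx, logFy) - Math.log(fxy);
        double denominator = Math.log(n) - Math.min(logFx, logFy);
        return numerator / denominator;
    }

    public static void main(String[] args) {
        // Made-up counts, only to show the call.
        System.out.println(distance(1000, 100, 10, 1_000_000));
    }
}
```

The base of the logarithm does not matter because it cancels in the ratio. Terms that always appear together get a distance of 0, and the distance grows as the joint count shrinks relative to the individual counts.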

If you think in Java, here is a good implementation of it. Use this jar to let your app find the semantic relatedness between two concepts.

A few results:

result for Agra & Taj Mahal: 0.37951525964462646
result for Agra & Delhi: 0.43014626260551725

The lower the score, the greater the semantic relatedness between the two concepts.


If you are working behind a firewall, you might need to set the following properties before making any requests.

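// Route HTTP traffic through your proxy (replace the placeholder values).
System.setProperty("http.proxyHost", "yourproxy");
System.setProperty("http.proxyPort", "yourport");

// WordSenseGoogler comes from the jar mentioned above.
WordSenseGoogler wordSense = new WordSenseGoogler();

// Print the distance for each pair; a lower value means more closely related.
System.out.println("result for Agra & Taj Mahal: " + wordSense.getNormalizedGoogleDistance("Agra", "TajMahal"));
System.out.println("result for Agra & Delhi: " + wordSense.getNormalizedGoogleDistance("Agra", "Delhi"));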
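
On the firewall point: if the search requests go over HTTPS rather than plain HTTP, the JDK reads a separate pair of properties, https.proxyHost and https.proxyPort. Here is a small sketch that sets both pairs in one place. The class name ProxySetup, the method name and the host and port values are placeholders. I am also assuming the library uses the JDK's built-in URL connection classes, which honor these properties; a different HTTP client may ignore them.

```java
// Hypothetical helper for illustration; host and port below are placeholders.
final class ProxySetup {

    /** Sets the standard JDK proxy properties for both HTTP and HTTPS. */
    static void useProxy(String host, int port) {
        String portText = Integer.toString(port);
        System.setProperty("http.proxyHost", host);
        System.setProperty("http.proxyPort", portText);
        System.setProperty("https.proxyHost", host);
        System.setProperty("https.proxyPort", portText);
    }

    public static void main(String[] args) {
        // Call this once, before the first network request is made.
        useProxy("proxy.example.com", 8080);
        System.out.println(System.getProperty("https.proxyHost")
                + ":" + System.getProperty("https.proxyPort"));
    }
}
```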