Why is it difficult for machines to comprehend images?

We have done considerable work with words. There are search engines that can find a match among billions of documents in a blink. However, the same is not true of images or videos. I think the reason is inherent in the representation.

A language consists of characters, words and some basic rules. There is a finite number of characters used to represent any language. A dictionary contains most of the words. Each word has a meaning of its own, and a contextual meaning when associated with other words. So in a way we have traced written language in its structural form and given machines a similar encoding, and the problem is deemed solved.

With images, however, there is no fixed set of rules at the macroscopic level; it is safe to say the set is infinite. At a broad level we currently consider only a few categories, like trees, people, cars and houses, and try to label them. Now the question is why this is inherently difficult. How do we tell whether two visual references refer to the same object, e.g. two images of a room, each with a bed, pillow, windows and carpet?

How does the machine label it as a room, a bed or a pillow? One way is to identify each object independently using features encoded in the machine model, say the shape of an object: a chair will have four legs and a back support. The second is to use information from the spatial domain. An object may be difficult to recognize by itself, yet identifiable in relation to other objects. For example, a pillow is just a rectangular shape in a 2D image, but when we see a room and a pillow on the bed, our confidence that it is a pillow increases. Now say there is a book and a pillow on the bed. By further encoding the sharpness of edges, we can distinguish the book from the pillow. Scale is another feature: a pillow will have a size proportional to the objects surrounding it.
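The idea that context raises our confidence can be sketched as a simple reweighting of scores. This is only an illustration under made-up assumptions: the labels, the shape scores, the scene prior and the default weight of 0.01 are all hypothetical numbers, not the output of any real detector.

```python
def rescore_with_context(shape_scores, context_prior, default_weight=0.01):
    """Combine per-label shape scores with a scene prior and renormalize."""
    # Weight each shape score by how likely that label is in this scene.
    # Labels missing from the prior get a small default weight (an assumption).
    combined = {
        label: score * context_prior.get(label, default_weight)
        for label, score in shape_scores.items()
    }
    total = sum(combined.values())
    if total == 0:
        return combined  # nothing to normalize
    return {label: value / total for label, value in combined.items()}


# Hypothetical scores: a soft rectangle seen in isolation is ambiguous.
shape_scores = {"pillow": 0.4, "book": 0.4, "laptop": 0.2}
# Hypothetical prior: what we expect to find lying on a bed.
on_bed_prior = {"pillow": 0.7, "book": 0.2, "laptop": 0.1}

rescored = rescore_with_context(shape_scores, on_bed_prior)
best_label = max(rescored, key=rescored.get)
print(best_label, round(rescored[best_label], 2))
```

With these numbers the pillow and the book start out tied on shape alone, and the bed context is what breaks the tie.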

For a moment, let us think about how we see things. As we grow, we create a prototype of the world as a visual reference in our head. We see cars, trucks and bikes when we go out on the road. The next time we see a road, we know what to anticipate. If we go inside a house, there are a number of objects we can expect. So when we see an image, there is a limited set of objects that need to be mapped or matched against.

Essentially, our machine models should make use of this context. But to use it, the machine needs a similar model of its own: say, a prototype of the world like a 3D computer game, in which each object is described in enough detail.

A generalized representation of the world with enough details. Image from: http://www.3dcity-world.com/3dcity/

Sometimes objects are occluded or only partly visible, say a table and a chair.

With context it is much easier to predict. Say we see a room with a car in it. If we see the car outside the window, on a road (with sufficient information available that it is outside), it is a real car. However, if it is inside the room, it must be a toy car (a miniature model), or the room is a garage. In that case the room will not have the things we see in a normal room, and our model should choose garage.
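The car-in-a-room reasoning can be sketched as picking the scene whose expected objects best match what was detected, then interpreting the car accordingly. The object lists and the overlap measure below are assumptions chosen for illustration, not a real scene-recognition method.

```python
# Hypothetical lists of objects we expect in each kind of scene.
EXPECTED_OBJECTS = {
    "bedroom": {"bed", "pillow", "window", "carpet", "lamp"},
    "garage": {"car", "toolbox", "shelf", "bicycle", "garage door"},
}


def guess_scene(detected):
    """Return the scene whose expected objects overlap most with the detections."""
    detected = set(detected)

    def overlap(scene):
        expected = EXPECTED_OBJECTS[scene]
        # Jaccard overlap: shared objects divided by all objects in either set.
        return len(detected & expected) / len(detected | expected)

    return max(EXPECTED_OBJECTS, key=overlap)


def interpret_indoor_car(detected):
    """Decide what a car seen indoors most plausibly is, given the other objects."""
    return "real car" if guess_scene(detected) == "garage" else "toy car"


print(interpret_indoor_car(["car", "bed", "pillow", "carpet"]))
print(interpret_indoor_car(["car", "toolbox", "shelf"]))
```

The first call sees mostly bedroom objects and so treats the car as a toy; the second sees garage objects and treats it as a real car.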

At a low level, digital images are still a grid of pixels, with each pixel encoding an intensity value (a grayscale level, or one value per color channel). We can iterate through an image by running two nested loops, one over the height and one over the width. Inside those loops we have to do all our matching, recognition and detection to make machines understand our world and see things the way humans do.
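As a minimal sketch of those two loops, here is a walk over a tiny grayscale image stored as a list of rows. The 3x4 image and its values are made up for illustration; a real image would come from a file and be far larger.

```python
# A tiny hypothetical grayscale image: each value runs from 0 (black) to 255 (white).
image = [
    [0, 50, 100, 150],
    [25, 75, 125, 175],
    [200, 210, 220, 255],
]

height = len(image)
width = len(image[0])

brightest_value = -1
brightest_position = None
total = 0

for y in range(height):        # outer loop walks down the rows
    for x in range(width):     # inner loop walks across one row
        value = image[y][x]
        total += value
        if value > brightest_value:
            brightest_value = value
            brightest_position = (y, x)

mean_value = total / (height * width)
print(brightest_position, brightest_value, mean_value)
```

Finding the brightest pixel and the mean is about the simplest thing these loops can do; any recognition step has to be built from the same per-pixel visits.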


Camera, Images and EXIF

Last year I lost my camera during a trip north of San Francisco. I was sad for about a week, till those memory cells faded. Then it struck me: why do we lose things? There should be a way to locate them, like we do with mobile phones.

An alternative is to do an exhaustive search of all the images shared on the internet and find any that were taken with your camera. Digital cameras store metadata in EXIF format, which can include even the serial number of your camera. A serial number can uniquely identify a device. Stolen Camera Finder works on this basis and claims to crawl all the images. However, it returned zero results when I searched for mine. Although I have published many pictures taken with the same camera in my Picasa album, this search engine failed to identify them. There can be many reasons why this approach would fail.


1. Crawlers aren’t good enough. I wish Google would start such a service; it would help so many.

2. Images with missing EXIF info. I ran a test using EXIF-py, a Python library, on different images from Facebook, Picasa, Flickr and the web, and found that only a very few preserved this information. Facebook seems to remove EXIF information from pictures. I saw the serial number for a few images in Picasa. In some cases only the model number is present, but not the serial number.
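A test like the one in point 2 can be sketched roughly as follows. This is not the original script. It assumes EXIF-py is the library published today as the `exifread` package, and that the serial number, when present, sits under a tag whose name contains "SerialNumber"; the exact tag name varies by manufacturer, and the sample dictionaries are made up.

```python
def find_serial_numbers(tags):
    """Return {tag_name: value} for every tag whose name mentions a serial number."""
    # Assumption: serial tags contain "SerialNumber" in their name. This also
    # matches lens serial numbers, so inspect the tag names in the result.
    return {
        name: str(value)
        for name, value in tags.items()
        if "serialnumber" in name.lower().replace(" ", "")
    }


def read_exif_tags(path):
    """Read EXIF tags from an image file using the third-party exifread package."""
    import exifread  # imported here so the helper above works without it

    with open(path, "rb") as image_file:
        return exifread.process_file(image_file)


# Hypothetical tag dictionaries standing in for two downloaded images.
with_serial = {"Image Model": "ExampleCam 100", "EXIF BodySerialNumber": "0123456789"}
stripped = {"Image Model": "ExampleCam 100"}

print(find_serial_numbers(with_serial))
print(find_serial_numbers(stripped))
```

For a real file, `find_serial_numbers(read_exif_tags("photo.jpg"))` would do the same check, where `photo.jpg` is a placeholder path. An empty result means the image cannot be tied to a particular camera this way.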


So it seems most image-editing software does not write EXIF information. This makes it difficult to search for your lost camera by simply crawling images, and we need a different approach to solve this problem.

Writing experience on iPad

Last month I got my hands on an iPad. This device is wonderful. It opens up a lot of possibilities for how we can consume & interact with information in a more natural way. Since then, I have been installing numerous applications in order to be more organized. I would like to mention one app called “habit pro”. As we know, habits are formed by repeated actions, so in this app you can jot down things you would like to be part of your daily life. Later you mark whether you did them or not, and monitor your performance over time.


I was also searching for a daily journal/diary kind of app where I can do my everyday introspection & make a note of it (not just 4 or 5 lines of tweet-like messages). I have been using a conventional diary for about 11 years now. The problem is that I have a lot of them and lost a few while moving (our lives are so mobile these days). So I was looking for an app which would translate the same experience for me. I found a few in which you can either type or write with a stylus. I thought a stylus would be the more natural transition. In search of one, I started behaving like a consumer. I read about different kinds on Amazon and watched a couple of videos on YouTube. Finally I went to Fry’s yesterday evening to try them out. There were quite a few of them, ranging from $6 to $39. They were good for occasional marking or drawing, but when I started writing, they couldn’t compare to how it feels to write with a pencil on paper.

First, the iPad screen is smaller than a conventional A4 sheet & the writing resolution is much coarser, so you can write only a few lines on one screen. On top of that, you are writing on glass, so the feedback is poor, and you might touch some part of the screen with your palm and activate some other widget, which is annoying.

Can happiness be kept isolated?

500+ status updates on Facebook in the past 12 hours. A feeling of happiness & joy is being spread across the world by Indians on their 2nd World Cup win, after 28 years. How can so many people in the world be tied by one string, & how can a moment of pride bring happiness to all faces? I am a little fascinated by this thought right now.

Most of the time our lives are driven by a search for happiness, which we mostly get from personal accomplishments & the related rewards. At other times it is like today, when your community does something good. It is something for the whole community to cherish. Some rare occasions, like the first man “landing on the moon”, were a reason for happiness for the whole of mankind.

Now my question is: what really is happiness, & what is the degree of happiness in various situations? How can happiness be quantified? Is it a personal feeling or a group phenomenon? And how long does a happy moment last, or does that even matter?

Can a man live in isolation & be happy about it? Evolution wouldn’t have been possible had that been the case. In the movie “Into the Wild”, they say “happiness is real when shared”. Sometimes I correlate that with my own experiences. I have been hiking, both with friends & alone. Hiking brings happiness to me. The happiness lies in the way my body gets utterly tired & I lose my ego; in addition, it gives a feeling of personal accomplishment at having gone through a physical challenge. I get this condition satisfied whether I hike with friends or without. But what is the difference in the degree of happiness between the two cases? When I have hiked alone, I cannot talk about it to anyone after I come back home, since no one would understand. Language cannot convey the full experience to someone who has not witnessed it.

However, when it was done in a group, you just mention it & you share & sync with each other. There is a frequency match & it turns into a bigger joy. A feeling of comradeship comes over you when you look into your friends’ eyes & recall how you were all together during that event. Sometimes the same feeling is shared even with strangers. Say you are running on the trail on a Saturday morning. The sun is about to rise & you pass another runner coming from the opposite direction. You look into their eyes & greet them. An energy transfer happens at that moment too & you feel good about it.

There is something subtle about humans & what brings happiness to them. It seems there could be a mathematical equation that clearly defines this; it is just that there are too many variables to take into account, & they may vary in degree depending on the person’s past experiences.

Are you running?

Last Sunday we participated in a local half marathon here in San Jose. Seven of us ran & all completed it, each with a glittering Rock n Roll souvenir. This evening I thought I would pen down a few thoughts on running, hence this post.

Why Running?

I would rather ask, why not? It is good physical exercise & can be done almost anywhere. You don’t need any special equipment besides a pair of good shoes. Jogging in the morning gives you fresh air & can be considered the equivalent of a short yoga session. Besides improving stamina & endurance, it boosts one’s confidence too. Moreover, any physical activity has a direct effect on your psyche. So if you are feeling low or suffocating with boredom, get on to the trail & just run, boy, run. I have found this very effective. And if you are already playing some sport like football, basketball, hockey or tennis, running can still complement your overall fitness.


Rock n Roll 2010 San Jose

How to get started?

  1. Good friends: if you have people around you who are disciplined, run actively & appear fit, make friends with them. They can be inspiring & may even pull you along in case you lag behind.
  2. Self-consciousness: Many people start an activity on an impulse but fail to keep it up over a longer duration. Impulse is good for creating an environment or getting a quick start, but to continue you need insight into the motives behind your actions. You should do it for yourself and not because someone else wants you to do it.
  3. Rhythmic music: I would say this is more of a personal choice. I have found that I cover significantly more distance when running with music than without. During the initial period you will feel like stopping when you are out of breath. But music can help you ignore that thought & keep stepping with every beat.
  4. Participate in group activities: Running in a group can boost each other’s morale & can develop a feeling of comradeship. If possible, participate in local marathons & keep collecting your souvenirs. They also look good in your drawing room & make a good conversation starter.
  5. Inspire others: Inspiring others can have a feedback effect on you. You become more disciplined & focused. One of our friends, who inspired us, has run 15 marathons so far & every time there is a local event, he is the first to register for it. This creates a healthy community around you.

Besides these things, always keep yourself hydrated. Glucose or an energy drink after a run can help the body recover. Do not run every day (if you run long distances). Give your body some time to recover before you hit the trail again. A good warmup is the most important part of your run. Spend at least 10 minutes on warmup for a 30-minute run; let the body be aware of your endeavours. During practice, applications like “Run Keeper” can help you track your progress or share it with friends.

During a marathon there will be stalls for water & glucose. You don’t need to drink at each stop. Know your body & do not drink or eat much during the run. Keep a check on the weather; in case of rain, keep your mp3 player, iPod or phone protected in a plastic cover.

I think this is all I had to say. Happy running!!

Living with Constant Noise

Noise is in the air and you cannot do much about it. I know this is the sad part, but not any more. In this post I will share a workaround I found a month ago.

Here I will discuss some interesting uses of earmuffs. They are mainly used by workers in noisy environments, but for the past month I have been using a pair every day. I am almost addicted to them now. In the past I used to be woken up by barking dogs in the middle of the night. I could do nothing but press my pillow against my head & wait for the dog (which belonged to my neighbor) to go silent again. If you live by a roadside, how can you avoid the honks & the hammering pistons of lousy motorcycles? It is almost inevitable. But not any more. Whatever the source of the sound, this instrument takes care of it. For college-going kids it can be an excellent companion.

Earmuff uses: read, meditate & leisure

But being made out of hard plastic, they restrict your sleeping position. You can only lie on your back with the earmuffs on. If you can fund me, we can research making soft muffs which are flexible & feel like wearing no muffs at all. I remember last Sunday night I was feeling a little cold, and it was all quiet. For a moment I thought of taking my earmuffs off, but it was not long before I heard my roommate snoring at his loudest. Thankfully I had the solution. I remember someone saying “you cannot change the world but yourself”.

Besides these uses, if you are looking for perfect noise-cancelling headphones, you can make a pair out of these. Just put on your iPod earphones & cover them with the earmuffs. All this comes in at under $20.

Disclaimer: I am not sure whether constant use of earmuffs can cause any long-term injury. So please consult your physician before using them daily.