How Google Images Doesn’t Work

Google Images, a service from the lovely people of Google, allows you to search for images in much the same way you use ‘regular’ Google Search to find text. It’s a simple idea, but its implementation doesn’t work well.
How It Does Work
With any of Google’s technologies, we’re left to make easy assumptions and sometimes outright guesses based on the behavior we can observe from its use. With Google Images, there are a few rules we can assume to hold some truth as of its implementation on 9/27/2004.
- Words in filenames have more worth than any other factor. If you want an image to appear in Google Images search results, name that image with the word you want it to match in a search. Example: The filename fountain.jpg will increase its chances of being included in an image search for the word ‘fountain’.
- Multiple words can be associated with an image in the image filename by separating those words with dashes or underscores or even spaces (represented by %20 in the URI). Example: The filename new-york-city.jpg will have a better chance of matching an image search for ‘New York City’.
- A page’s worth in regular Google Search is clearly factored into its image ranking on Google Images.
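The filename rules above can be sketched in a few lines of Python. This is only an illustration of the naming convention, not anything Google publishes: the function name descriptive_filename and its parameters are made up for this example, and it assumes plain ASCII descriptions.

```python
import re


def descriptive_filename(description: str, extension: str = "jpg", separator: str = "-") -> str:
    """Turn a short description into a lowercase, separator-delimited filename.

    Hypothetical helper: the name and defaults are assumptions for this sketch.
    """
    # Replace every run of characters other than a-z and 0-9 with one separator.
    words = re.sub(r"[^a-z0-9]+", separator, description.lower())
    # Trim any separator left over at either end of the name.
    stem = words.strip(separator)
    return f"{stem}.{extension}"


print(descriptive_filename("New York City"))
print(descriptive_filename("Fountain", "png"))
```

Passing separator="_" gives the underscore variant, which the observations above suggest is treated the same way as dashes.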
Why It Doesn’t Work
Web gallery systems that name image files with numbers (e.g. 21.jpg, 231-32.png) are badly handicapped in Google Images because of its extreme bias towards filenames as clues. I do not use any automated gallery system on this site, but I do number the images in my various galleries and, as a result, my images are generally ignored by Google Images.
The easiest complaint to lodge against Google Images, and a possible fix for its current state of affairs, is that it does not seem to pay attention to the alt attribute of the image tag. The alt attribute is specifically designed to allow you to describe what the image actually is or what it portrays. For example:
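To see how many images in a gallery fall into this trap, a rough check like the one below can flag filenames that carry no usable words. The function name and the three-letter threshold are arbitrary assumptions for this sketch. It says nothing about how Google actually scores a filename.

```python
import re
from pathlib import PurePosixPath


def has_descriptive_words(image_path: str) -> bool:
    """Return True if the filename stem contains a run of three or more letters.

    Hypothetical heuristic: three letters is an arbitrary cut-off for "a word".
    """
    # Drop any directories and the extension, e.g. "gallery/21.jpg" -> "21".
    stem = PurePosixPath(image_path).stem
    return re.search(r"[A-Za-z]{3,}", stem) is not None


for name in ["21.jpg", "231-32.png", "gallery/new-york-city.jpg"]:
    print(name, has_descriptive_words(name))
```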
<img src="new-york-city.jpg" alt="New York City with the sun setting behind it." />
In theory, Google should be using this information as a huge clue to the content of the image. For the most part though, it does not seem to do so.
Those who use XHTML Strict are actually required to use the alt attribute for each and every image on their site. Obvious Diversion not only uses XHTML Strict but also complies with Section 508, a set of guidelines to make online content accessible to people with disabilities. (Side note: complying with either standard is not nearly as complicated or as time-consuming as you may think, and you’ll end up contributing to a richer and more intelligent web. Accessify is a great place to start.)
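If you want to check your own pages for images without alt text, a small audit with Python’s standard html.parser module is one way to do it. This is a minimal sketch, not a validator: the class and function names are invented for this example, and it treats an empty alt as missing, which is stricter than the standards require (an empty alt is legitimate for purely decorative images).

```python
from html.parser import HTMLParser


class AltAudit(HTMLParser):
    """Collect the src of every img tag whose alt is absent or blank."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <img ... /> are also routed here by default.
        if tag != "img":
            return
        attributes = dict(attrs)
        alt = attributes.get("alt")
        # Assumption of this sketch: a blank alt counts as missing.
        if alt is None or not alt.strip():
            self.missing.append(attributes.get("src") or "(no src)")


def images_missing_alt(html: str) -> list[str]:
    """Return the src values of images that lack usable alt text."""
    audit = AltAudit()
    audit.feed(html)
    audit.close()
    return audit.missing


page = (
    '<p><img src="new-york-city.jpg" alt="New York City with the sun setting behind it." />'
    '<img src="21.jpg" /></p>'
)
print(images_missing_alt(page))
```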
An Example of Failure
A quick note before jumping into the example. I fully recognize that a single example does not establish any truth beyond that specific example, much as a single testimonial carries questionable worth compared to a peer-reviewed study, but I believe the following to be indicative of Google Images’ general implementation at this time.
I have a number of galleries of photographs from my time in Taipei. One gallery in particular is of a day trip to a town called Danshui. A regular Google search for ‘danshui’ puts my gallery in spot 8 out of 3,990 results, much to my surprise. Here are the Google Search results for Danshui. Here are the Google Images results for Danshui. Of the 501 results that Google Images returns for Danshui, how many from my gallery show up? Zero. Clearly, Google Search believes this particular page is relevant to the word Danshui, so why doesn’t Google Images come to the same conclusion? The success of Google Search in finding appropriate content displays the failure of Google Images to do the same.
Every image in my various galleries has appropriate alt content. The advantage to this is that not only can blind users gain some worth from the gallery, but Google, the world’s largest blind user, should be able to use this information to have a greater understanding of what that JPEG or PNG actually portrays.
I do not believe that Google Images currently pays attention to the alt attribute, but let’s say for a moment it did. There may be some deficits in how I am currently using the alt attribute in the eyes of any search engine. Not a single one of the images has the word ‘Danshui’ in its alt attribute. Why? In the context of the page, it seems unnecessary. The title of the page includes ‘Danshui’, the h2 on the page includes ‘Danshui’, and the one-sentence description before the images also has ‘Danshui’. In this context, including the word Danshui in each alt attribute seems far too heavy-handed, and for a blind user having the page read to them, it would be incredibly irritating.
Any Conclusions?
It’s tough to be conclusive about a technology that could change tomorrow and will certainly shift over time, but I believe the deficits in Google Images’ implementation displayed above are true as of this article’s writing, and I sincerely hope they are proven false by future implementations.
What should a user do today when adding images to the web? Focus on smart and relevant alt text for your images while placing those images in a relevant context. Read about the experience created by browsers for people with disabilities. It is better to be more concerned with the human experience than to optimize for a search engine. Focus on the user, and let Google worry about how to make Google better.

Comments (4)


Nothing seems to have changed in a coherent way. The spider is very slow to turn up (site indexed same day in Google, images who knows when). Having the name of the subject in the image name doesn’t necessarily give good results either.
A search for sanchis brought up an image labelled s2.jpg.
Confusing. If you have more recent intelligence on how to optimise for Google Images, please let me know.
If anything, the shorthand guide is to name your files appropriately (using dashes or underscores for filenames with multiple words) and make sure you include alt text that makes sense.
Thanks for the update. I suppose the biggest problem now is long lag between spidering and cataloging. My older images turn up all the time but the fresher stuff not at all.
One of the things I am running into is that I store a lot of my images offsite in larger format on account of the Typepad account storage limits (recently much improved). I have no idea how that will affect Google’s image service (the jpg file and the html are on two different servers).
I’ll let you know in a few months or so.
None of my 500 images (pattern images) are searched by Google. They are named appropriately, I believe, but buried in the PHP workings of a Zen Cart business program. Should I just create HTML pages with these images in a separate folder that Google can see? And how would I do that? Thanks.