Monday, January 16, 2012

Bing Vision API

As part of the Windows Phone 7.5 Mango upgrade, Microsoft added a nifty feature to Bing search called Bing Vision. Bing Vision can scan barcodes, book covers, album covers and posters. It can also perform OCR very capably. Bing Vision appears to be Microsoft's response to Google Goggles. Just like Google Goggles, Bing Vision does not seem to have a public API. This post is an attempt at reverse-engineering the image search portion of the Bing Vision API. I have not been able to reverse-engineer the OCR portion of Bing Vision yet, but once I do, I will post it here.

Warning: this API is not publicly released by Microsoft, so a lot of it is subject to change. Use at your own risk.

It is important to note that Bing Vision's image search is limited to searching products. If I were to feed it an image of the Mona Lisa, I would get back a list of Mona Lisa books and frames for sale; it would not identify the painting as a portrait by the Italian artist Leonardo da Vinci.


That said, the beauty of Bing Vision is in its simplicity. You simply feed the API call an image, and you get back the result as XML. Sounds simple, right? Now let's dig deeper.

The API that we are going to examine in this post is Bing Vision's image search. Send a POST request to http://wp.bingvision.ar.glbdns.microsoft.com/ImageSearchV2.ashx with the following headers:

    Pragma: no-cache
    Content-Type: image/jpeg

with the image you would like to search as the body of the POST request.

The following working code sample illustrates how you could issue your request in C#. Note the HTTP requests here are synchronous just for simplicity.

namespace BingImageSearch.NET
{
    using System;
    using System.Net;
    using System.IO;

    class BingImageSearch
    {
        static void Main(string[] args)
        {
            string path = @"C:\image.jpg";
            // File.ReadAllBytes reads the entire file; a single Stream.Read
            // call is not guaranteed to fill the whole buffer.
            byte[] imageByteArray = File.ReadAllBytes(path);

            BingImageSearch.BingImageQuery(imageByteArray);
        }

        private static void BingImageQuery(byte[] image)
        {
            HttpWebRequest request =
                (HttpWebRequest)WebRequest.Create("http://wp.bingvision.ar.glbdns.microsoft.com/ImageSearchV2.ashx");

            request.Method = "POST";
            BingImageSearch.AddHeaders(request);

            using (Stream stream = request.GetRequestStream())
            {
                stream.Write(image, 0, image.Length);
                stream.Flush();
            }

            // Dispose the response so the underlying connection is released.
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            {
                BingImageSearch.PrintResponse(response);
            }
        }

        private static void AddHeaders(HttpWebRequest request)
        {
            request.ContentType = "image/jpeg";
            request.Headers["Pragma"] = "no-cache";
            request.KeepAlive = true;
        }

        private static void PrintResponse(HttpWebResponse response)
        {
            using (StreamReader reader = new StreamReader(response.GetResponseStream()))
            {
                Console.WriteLine(reader.ReadToEnd());
            }
        }
    }
}
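
The sample above blocks on every call. For an asynchronous variant, here is a minimal sketch that uses HttpClient instead of HttpWebRequest. HttpClient is my substitution and is not what the sample above uses; the class name BingVisionAsync and its two methods are hypothetical names chosen for illustration. The endpoint and headers are the ones described earlier in this post, and since the API is undocumented, the endpoint may change or stop responding.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

// Hypothetical helper class; not part of any Microsoft library.
public static class BingVisionAsync
{
    // Endpoint taken from this post; undocumented and subject to change.
    public const string Endpoint =
        "http://wp.bingvision.ar.glbdns.microsoft.com/ImageSearchV2.ashx";

    // Builds the POST request: JPEG bytes as the body, plus the
    // Content-Type and Pragma headers listed in the post.
    public static HttpRequestMessage BuildRequest(byte[] jpegBytes)
    {
        var content = new ByteArrayContent(jpegBytes);
        content.Headers.ContentType = new MediaTypeHeaderValue("image/jpeg");

        var request = new HttpRequestMessage(HttpMethod.Post, Endpoint)
        {
            Content = content
        };
        request.Headers.Pragma.Add(new NameValueHeaderValue("no-cache"));
        return request;
    }

    // Sends the request and returns the raw XML response body as a string.
    public static async Task<string> SearchAsync(HttpClient client, byte[] jpegBytes)
    {
        using (HttpRequestMessage request = BuildRequest(jpegBytes))
        using (HttpResponseMessage response = await client.SendAsync(request))
        {
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

A caller would create one HttpClient, read the image with File.ReadAllBytes, and await BingVisionAsync.SearchAsync(client, bytes). Keeping the request construction in its own method makes the headers easy to inspect without touching the network.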

The service accepts a barcode image and returns the barcode type and number to the caller, as shown below:

  
    
  

It can also accept a product's cover image. I get back the following XML when I post the cover of this game.


  
    
      EA FIFA Soccer 12
      FIFA Soccer 12 delivers a true soccer experience with authentic club and league licenses, and intelligent gameplay that mirrors real-world soccer. Compete as any one of over 500 officially licensed clubs and experience responsive, intelligent and realistic action. Enjoy turning defenders with sophisticated dribbling and ball control, snapping off precision shots and placing beautifully timed passes with pin point accuracy.
      EA FIFA Soccer 12
      http://bingvision.blob.core.windows.net/thumbnails/655f98db86ca372371078f0abc4016d6.jpg
      E0703D6431D7D0845005
    
    .
    .
    .
  

It is up to you to choose how you will parse the XML response. If you plan to explore this API further with C#, you might want to generate proxy classes (via xsd.exe) so that you can easily deserialize the XML and enumerate the objects and their properties. I will not be covering that in this post.
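
If you just want to look at what comes back before committing to proxy classes, a schema-agnostic pass with LINQ to XML is one option. The sketch below is an illustration rather than a recommended parser: it assumes nothing about the element names in the response and simply lists every leaf element with its text. The class name BingVisionXml and the method LeafValues are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper class for exploring the response; the real schema
// is undocumented, so no element names are assumed here.
public static class BingVisionXml
{
    // Returns (element name, trimmed text) pairs for every leaf element
    // that has non-whitespace text, in document order.
    public static List<KeyValuePair<string, string>> LeafValues(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        return doc.Descendants()
                  .Where(e => !e.HasElements && !string.IsNullOrWhiteSpace(e.Value))
                  .Select(e => new KeyValuePair<string, string>(
                      e.Name.LocalName, e.Value.Trim()))
                  .ToList();
    }
}
```

Passing the string printed by the C# sample to BingVisionXml.LeafValues and writing each pair to the console shows which elements carry the title, description, thumbnail URL and so on, which is a useful first step before generating typed classes.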