Tuesday, April 26, 2011

Pattern Recognition

One of the technical things I am interested in is called "dimensionality reduction." Some real-world problems involve so much information that it is difficult for computers to process them, either because of their computational complexity or because of the humongous amount of storage space required to examine the problem.

Hyperspectral imaging in remote sensing is one such problem. A satellite passing over an area of the earth's surface can take pictures and record the information that is in view. Because most of these images are composed of reflected sunlight, each picture element (pixel) can measure that reflection with sensors that respond to different wavelengths (colors) of light, including those visible to humans and those that are not, such as ultraviolet and infrared.

When you consider that each pixel represents only a small area of ground (how small depends on the optics of the camera), you can imagine that even a patch of the earth's surface as small as one square foot would take an array of hundreds, sometimes thousands, of elements to record the reflected wavelengths at that pixel. A similar array is needed for every pixel in the picture, so even a small picture of the earth's surface adds up to hundreds of millions, even billions, of data elements.
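To make that concrete, here is a rough back-of-the-envelope calculation in Python. The image size, band count, and sample size are made-up but typical numbers, used only to show how quickly the element count grows.

```python
# Rough size estimate for a single hyperspectral image (illustrative numbers).
rows, cols = 1024, 1024      # spatial size of the picture, in pixels
bands = 224                  # spectral samples recorded at each pixel
bytes_per_sample = 4         # e.g., 32-bit floating-point reflectance

elements = rows * cols * bands
size_gb = elements * bytes_per_sample / 1e9

print(f"{elements:,} data elements")        # ~235 million elements
print(f"about {size_gb:.1f} GB per image")  # ~0.9 GB
```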

You can browse Google Earth, for example, to get an idea of the amount of information, but remember that Google Earth shows only the visible spectrum. Hyperspectral images carry much more information: every point on the surface map becomes an array of values along another dimension, one for each wavelength in the spectrum.
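This is where the dimensionality reduction mentioned at the top comes in. Below is a minimal sketch of one common technique, principal component analysis, assuming a cube of shape (rows, cols, bands) and using NumPy's singular value decomposition; the cube here is random placeholder data, not real sensor output.

```python
import numpy as np

# Synthetic stand-in for a hyperspectral cube: rows x cols x spectral bands.
rng = np.random.default_rng(0)
cube = rng.random((100, 100, 224))

# Flatten the spatial dimensions: one row per pixel, one column per band.
pixels = cube.reshape(-1, 224)

# Principal component analysis via SVD of the mean-centered spectra.
centered = pixels - pixels.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)

# Keep only the first few components: each pixel is now described
# by 10 numbers instead of 224, with most of the variation preserved.
n_components = 10
reduced = centered @ vt[:n_components].T
print(pixels.shape, "->", reduced.shape)   # (10000, 224) -> (10000, 10)
```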

Now, consider the pattern recognition problem. How do we recognize what we are seeing? How do we know when something is a building, a tree, a truck, or a person? In military applications, how do we distinguish between a friend and a foe? And how do we get a computer to do the same thing a human would?

The current state of the art involves training the program with data whose labels are already known, and then testing it with data it has not seen before.
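As a sketch of what training on known data and testing on unknown data looks like in practice, here is a toy example using scikit-learn. The two "materials" and their spectra are invented for illustration; a real system would train on labeled pixels gathered from ground-truth surveys.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Invent two materials with slightly different average spectra (224 bands each).
rng = np.random.default_rng(1)
grass   = rng.normal(loc=0.3, scale=0.05, size=(500, 224))
asphalt = rng.normal(loc=0.6, scale=0.05, size=(500, 224))

X = np.vstack([grass, asphalt])          # pixel spectra
y = np.array([0] * 500 + [1] * 500)      # known labels: 0 = grass, 1 = asphalt

# Train on labeled pixels, holding some back as "unknown" data for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

clf = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
print("accuracy on unseen pixels:", accuracy_score(y_test, clf.predict(X_test)))
```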

Pattern recognition requires training. We cannot recognize things we have not already learned something about.

Now, why does this issue appear on a web page focused on epistemology?

The most notable form of pattern recognition for people is fault-finding. We recognize faults in others. Why? Because we are so intimate with those faults in ourselves. Unfortunately, we don't like to acknowledge them in ourselves. Something to think about.