
In The News

Data Classifiers: Learning to Deal with Irrelevant Data

Data classifiers are learning. Since their early use by the postal service to decipher handwritten zip codes, these technological marvels have made monotonous (some would say impossible) tasks much more manageable. Now, according to an article on techbriefs.com, a new algorithm is making data classifiers even more useful by helping them learn to dispense with irrelevant data.

The improved active learning algorithm lets users train data classifiers efficiently, with less dependence on humans to provide labeled examples. With this learning method, the time investment is minimized and the classifier becomes more accurate over time.

You might ask, "What's wrong with other, more established, active learning models?" After all, there are quite a few. According to the authors, Wagstaff of Caltech and Mazzoni of Google, Inc., other well-established algorithms have experienced problems with irrelevant data, especially in handwriting classification. The improved learning algorithm has a dual mechanism that helps the classifier actively learn while avoiding queries on irrelevant items. The result is an increased learning rate and superior efficiency.

To understand how the learning rate improves, it helps to understand the improved learning process. The algorithm gathers irrelevant data from rejected examples and uses them to train the classifier. Based on this training, probabilities are then applied, and irrelevant data is rejected on a more consistent basis.
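
The general idea can be illustrated with a small Python sketch. This is not the authors' implementation: the nearest-centroid models, the scoring rule (uncertainty multiplied by an estimated relevance probability), the toy data, and every name below (prob_closer, select_query, active_learn, oracle) are assumptions made up for illustration. It only shows how examples rejected by the human labeler might be reused to steer later queries away from irrelevant items.

```python
import math

def centroid(points):
    """Mean of a list of equal-length numeric tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def prob_closer(x, near, far):
    """Soft score in (0, 1): above 0.5 when x is closer to `near` than to `far`."""
    return 1.0 / (1.0 + math.exp(math.dist(x, near) - math.dist(x, far)))

def select_query(pool, labeled, rejected, use_relevance):
    """Pick the pool item with the highest uncertainty, optionally weighted by relevance.

    Assumes `labeled` already holds at least one "A" and one "B" example.
    """
    center_a = centroid([x for x, y in labeled if y == "A"])
    center_b = centroid([x for x, y in labeled if y == "B"])
    center_relevant = centroid([x for x, _ in labeled])
    best, best_score = None, -1.0
    for x in pool:
        p_a = prob_closer(x, center_a, center_b)
        uncertainty = 1.0 - abs(2.0 * p_a - 1.0)  # 1 at the boundary, near 0 when confident
        p_relevant = 1.0
        if use_relevance and rejected:
            # Rejected examples train a second (assumed) model of what is irrelevant.
            p_relevant = prob_closer(x, center_relevant, centroid(rejected))
        score = uncertainty * p_relevant
        if score > best_score:
            best, best_score = x, score
    return best

def active_learn(pool, oracle, seed, budget, use_relevance=True):
    """Query the oracle `budget` times; return (labeled examples, rejected items)."""
    pool, labeled, rejected = list(pool), list(seed), []
    for _ in range(budget):
        if not pool:
            break
        x = select_query(pool, labeled, rejected, use_relevance)
        pool.remove(x)
        y = oracle(x)
        if y is None:
            rejected.append(x)      # a wasted query, but kept as training signal
        else:
            labeled.append((x, y))
    return labeled, rejected

# Toy data (made up): classes "A" and "B" lie along y = 0; clutter sits near y = 10.
seed = [((0.0, 0.0), "A"), ((4.0, 0.0), "B")]
pool = [(2.0, 10.0), (2.0, 11.0), (2.0, 9.0), (2.1, 10.5),
        (1.5, 0.0), (2.5, 0.0), (1.0, 0.5), (3.0, -0.5)]

def oracle(x):
    """Stand-in for the human labeler: returns None to reject an irrelevant item."""
    if x[1] > 5.0:
        return None
    return "A" if x[0] < 2.0 else "B"

_, wasted_plain = active_learn(pool, oracle, seed, budget=4, use_relevance=False)
_, wasted_aware = active_learn(pool, oracle, seed, budget=4, use_relevance=True)
print(len(wasted_plain), len(wasted_aware))
```

On this toy data, the plain uncertainty strategy spends all four queries on clutter, because the clutter happens to sit on the boundary between the two classes. The relevance-aware variant wastes only its first query and then uses that rejection to avoid the rest.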

To read the article in full, click here.
This work was done by Kiri Wagstaff of Caltech and Dominic Mazzoni of Google, Inc. for NASA's Jet Propulsion Laboratory. For more information, contact [email protected]. NPO-44094