Another week, another record-breaking AI research study released by Google, this time with results that are a reminder of a crucial business dynamic of the current AI boom.

The ecosystem of tech companies that consumers and the economy increasingly depend on is traditionally said to be kept innovative and un-monopolistic by disruption, the process whereby smaller companies upend larger ones. But when competition in tech depends on machine learning systems powered by huge stockpiles of data, slaying a tech giant may be harder than ever.

Google’s new paper, released as a preprint Monday, describes an expensive collaboration with Carnegie Mellon University. Their experiments on image recognition tied up 50 powerful graphics processors for two solid months, and used an unprecedentedly huge collection of 300 million labeled images (much work in image recognition uses a standard collection of just 1 million images). The project was designed to test whether it’s possible to get more accurate image recognition not by tweaking the design of existing algorithms but just by feeding them much, much more data.

The answer was yes. After Google and CMU’s researchers trained a standard image processing system on their humongous new dataset, they say it produced new state-of-the-art results on several standard tests for how well software can interpret images, such as detecting objects in photos. There was a clear relationship between the volume of data they pumped in and the accuracy of the image recognition algorithms that came out. The findings go some way toward clearing up a question circulating in the AI research world about whether more could be squeezed from existing algorithms just by giving them more data to feed on.
