AI 101 Example - Google News


Using AI to Automatically Find and Group Like Articles

Well, if we are going to explain how Google is using AI to solve a business problem, we might first want to know how solving that business problem is going to make money for Google. That is a good question, and I’m not so sure the answer is very clear. Google News does not show ads, and it’s not clear that Google makes any money directly from the site.

However, other businesses can certainly make money by having their articles appear on Google News. If you’re a news outlet and your article rises to the top of Google News, you are more likely to achieve a greater volume of Web traffic. And following that Web traffic will be more revenue, via whatever means you make your revenue (ads, subscriptions, etc.). So, even though Google is apparently not making money directly from Google News, it’s still a great example of AI employed to solve a business problem.

What is the business problem?

At any given moment there are of course tons of current news articles on the Internet. Google is of course one place people go to find a variety of information. People go to Google to find standard, more-static Web sites, which don’t change all that often. But Google’s visitors also go to Google, and to Google News specifically, to find the most relevant news stories as of the moment. And the news is always changing. To complicate matters further, a given news story will have multiple articles written about it, maybe hundreds or thousands across several or even hundreds of news outlets.

Google, being one of the Internet’s largest repositories of Internet links, has an interest in creating a repository for news stories that does two things. First, Google wants to discover what the current set of unique news stories is. That is, there may be many thousands of news articles out there at any given moment, but together they cover only a limited set of current events or stories. Google wants to boil all of those news articles down to that limited set of topics or stories so it can display them in a reasonably short list at Google News.

Second, for a given news story or topic, Google wants to automatically group together all the news articles which fall under that given story or topic. So, rather than displaying a more or less random listing of news articles, each of which could be about any number of overlapping current events stories, Google can instead show you only one main link for each unique story or topic.

Depending upon which version of Google News you are looking at (the mobile version or desktop version), Google will in some way show you other links about the same topic or story. Those other links will appear less prominently than the primary article for the given topic. If Google didn’t do this automatic discovery of news topics and didn’t automatically classify articles under each of those topics, Google News would be of less use to you.

Google might then just show you a more random assortment of news articles, and you would have to do a lot of the weeding through articles yourself. You might then just skip Google for your news altogether and go directly to your favorite news publication, where you would see articles from only one news source. According to Andrew Ng of Stanford University, Google News uses a type of artificial intelligence called machine learning in order to group various news articles together.

How does Google achieve this artificially intelligent system?

Well, from what we learned above, they are certainly using computers. That goes without saying. And if you want to envision where these computers are, think big. Google is undoubtedly using a rather large set of geographically distributed servers (a server is just a higher-capacity computer) to achieve its combined system that makes Google News results available to us. The next component in Google’s AI system is of course the software component.

Exactly how Google’s software works is certainly a well-guarded secret. After all, if they told the world how their software works, it wouldn’t be long before someone simply copied what Google does. But we do have some insight into how Google uses software to group news articles together, and we will cover just the very basics of this. Firstly, software engineers at Google have written a lot of complicated computer programs that, at a very basic level, are not unlike the simple one-line computer program we examined above.

Google engineers will write code using some computer language, generate some executable computer program or programs from the code, and then some set of operating systems and computers will run those programs. Those running programs will be what discovers, groups, and publishes links to the rather well-curated news articles we see at Google News. Via their software, Google is using what are known as algorithms.

You’ve undoubtedly heard of algorithms. An algorithm is simply a step-by-step procedure, here written as a chunk of computer code, normally broken down to be as short and sweet as it can be. The purpose of an algorithm is to take some input, to perform some series of calculations or tasks against that input, and to produce a desired output. Generally speaking, you write an algorithm once and pass different bunches of data through the algorithm over and over again. So, Google has come up with some series of algorithms that are clearly very good at grouping news articles together under a given topic or story.
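To make the input-to-output idea concrete, here is a minimal sketch in Python. The function name word_counts and the sample sentences are made up for illustration and have nothing to do with Google’s actual code. It simply shows one algorithm, written once, being run against two different pieces of input.

```python
from collections import Counter

def word_counts(text):
    """Input: raw text. Output: how often each word appears in it."""
    # Split on whitespace, strip common punctuation, and lowercase each word.
    words = [w.strip(".,!?\"'").lower() for w in text.split()]
    # Count the words, skipping anything that was only punctuation.
    return Counter(w for w in words if w)

# The same algorithm, run against two different inputs (made-up sentences).
print(word_counts("Storm hits coast. Storm weakens."))
print(word_counts("Markets rally as markets reopen"))
```

The algorithm itself never changes; only the data passed through it does.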

So, let’s take a break to remind ourselves of why we consider these algorithms to comprise a system that is ultimately artificially intelligent. Google News is automatically discovering and grouping all the news articles together under unique topics or stories. If Google were to do this discovering and grouping of stories without computer programs and algorithms, the work could only be done in manual fashion by intelligent humans.

So, what Google is in fact doing with Google News is using computers and software to make an artificial version of something that historically would have needed to be done by humans. With its AI systems, Google has more or less made an artificially intelligent news desk. If Google were to rely solely on an actual news desk staffed by actual humans, lots of humans would be needed to do the job.

Imagine many, many people reading through many, many news articles. Imagine these people trying to boil all these articles down to a common set of unique stories and then properly grouping them together so that Google’s systems can make them available in an organized fashion at Google News. If Google’s systems can do all this work in an automated way, it makes no sense to employ a large team of humans to do the work. Artificial intelligence has more or less replaced the intelligent human in this scenario.

So, back to the algorithms for a moment. As we mentioned above, Google is using a type of AI called machine learning to discover and group the news articles together. One might think that Google has to tell its algorithms what different components or attributes to look for in given news stories. However, according to Mr. Ng, this is not the case. Google is using a particular type of machine learning called unsupervised learning. What this means is that Google’s algorithms automatically comb over a vast set of news articles. Given the stories that exist at any given point in time, the algorithms then automatically discover what the various unique elements are across all stories.

Then, based upon the unique elements that were discovered, the algorithms allow the articles to be automatically grouped together under given topics or stories. For example, if Google News is going to group together a whole bunch of articles about Hurricane Harvey, Google does not need to tell its algorithms to search all news articles for the topic of hurricanes.
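As a rough illustration of discovering topic words without being told what to look for, consider the Python sketch below. It is not Google’s method, which is not public. The headlines are invented, and the helper names (tokenize, shared_terms) and the small stop-word list are assumptions made only for this example. The code never mentions hurricanes; it simply reports which words recur across several headlines.

```python
from collections import Counter

# A tiny, hand-picked list of filler words to ignore (an assumption for this sketch).
STOP_WORDS = {"the", "a", "in", "on", "of", "as", "to", "and", "for"}

def tokenize(text):
    """Return the set of meaningful lowercase words in a headline."""
    words = (w.strip(".,!?\"'").lower() for w in text.split())
    return {w for w in words if w and w not in STOP_WORDS}

def shared_terms(headlines, min_articles=2):
    """Return words that appear in at least min_articles different headlines."""
    doc_freq = Counter()
    for headline in headlines:
        # Each headline contributes a word at most once.
        doc_freq.update(tokenize(headline))
    return {word: n for word, n in doc_freq.items() if n >= min_articles}

# Invented headlines, for illustration only.
headlines = [
    "Hurricane Harvey makes landfall in Texas",
    "Texas braces as Hurricane Harvey strengthens",
    "Hurricane Harvey flooding forces evacuations",
    "Tech stocks rally on strong earnings",
    "Stocks climb as earnings beat forecasts",
]

topics = shared_terms(headlines)
print(topics)
```

Words such as "hurricane" and "harvey" surface on their own because several headlines share them, which is the spirit of the unsupervised approach described above.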

Rather, Google’s algorithms will first determine automatically that hurricanes are a current topic, and will further determine that Hurricane Harvey is an even more specific topic. The machine learning technique that Google uses to do the grouping itself is called clustering. The various articles are more or less forced into “clusters” that each represent a given topic or story. Because Google does not have to tell its systems what the topics are, and because those topics are automatically discovered, the type of learning used gets the name unsupervised learning. So, this is where we will stop with illustrating this particular example of AI. We’ve shown that Google is able to use computers and software-based algorithms to achieve the artificially intelligent system that is Google News.
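For readers who would like to see clustering in miniature, here is a hedged Python sketch. It is a toy, not a description of how Google News works. The headlines are invented, and the similarity measure (word overlap, also known as Jaccard similarity), the 0.15 threshold, and the simple one-pass grouping strategy are all assumptions chosen to keep the example short. No topic names are supplied anywhere; the groups emerge from the text alone.

```python
# A tiny, hand-picked list of filler words to ignore (an assumption for this sketch).
STOP_WORDS = {"the", "a", "in", "on", "of", "as", "to", "and", "for"}

def tokenize(text):
    """Return the set of meaningful lowercase words in a headline."""
    words = (w.strip(".,!?\"'").lower() for w in text.split())
    return {w for w in words if w and w not in STOP_WORDS}

def jaccard(a, b):
    """Word overlap between two sets: shared words divided by all words."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster(headlines, threshold=0.15):
    """Group headlines in one pass: join the most similar cluster or start a new one."""
    clusters = []  # each entry: {"words": set of words, "items": list of headlines}
    for headline in headlines:
        words = tokenize(headline)
        best, best_score = None, 0.0
        for candidate in clusters:
            score = jaccard(words, candidate["words"])
            if score > best_score:
                best, best_score = candidate, score
        if best is not None and best_score >= threshold:
            best["items"].append(headline)
            best["words"] |= words  # the cluster's vocabulary grows as it absorbs articles
        else:
            clusters.append({"words": set(words), "items": [headline]})
    return [c["items"] for c in clusters]

# Invented headlines, for illustration only.
headlines = [
    "Hurricane Harvey makes landfall in Texas",
    "Texas braces as Hurricane Harvey strengthens",
    "Hurricane Harvey flooding forces evacuations",
    "Tech stocks rally on strong earnings",
    "Stocks climb as earnings beat forecasts",
]

groups = cluster(headlines)
for group in groups:
    print(group)
```

A production system would use far richer signals than word overlap, but the shape of the idea is the same: articles that look alike end up in the same cluster, and each cluster stands for one story.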

If you are interested in learning more about how Google is implementing AI for purposes of grouping articles on Google News, here are a couple of resources for you to start with:

Coursera course on Machine Learning

Quora Article - How does Google Cluster News stories
