It’s been described as Web 3.0, an evolution of the World Wide Web that will enable machines to infer meaning from the content that exists online. Some have even called it the beginning of ubiquitous artificial intelligence (I bet Bill Joy is stocking his pantry right now).
Not a lot of people understand the semantic web. I mean, the king of blogging himself, Robert Scoble, just figured it out last week. I feel pretty good about that. He’s only got a week’s head start on me.
Given the obvious importance of understanding how to prevent the world from being overrun by robots, I decided to educate myself about SW (yeah, I’m down with the acronyms). What I’ve learned is that Tim Berners-Lee is one smart dude (yeah, this SW thing was his master plan). And also this: while the web is really good at helping people interact with computers and with other people, it’s not very good at helping computers interact intelligently with each other… yet. That’s primarily because no intrinsic meaning is conveyed about the data being transported over the series of tubes.
As we all know, HTML is the markup language that structures and lays out content on a web page. Let’s say I type “Boar’s Head Inn Charlottesville” into Google. My search results will be based on Google’s servers going out to find instances where that text, or some variation thereof, appears on a web page. Google will then rank those pages based on how many other pages on the web link to them, weighting those links according to the popularity of the sites from which they originate (PageRank).
What Google’s servers don’t do is understand whether or not the text they’ve unearthed on those web pages definitively relates to the Boar’s Head Inn in Charlottesville. They use the hotel’s name and the links between pages as a proxy to determine relevance.
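For the curious, here’s roughly what that link-counting looks like. This is a toy Python sketch of the published PageRank idea, not Google’s actual code; the page names, the commonly cited 0.85 damping factor, and the iteration count are all my assumptions for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank by power iteration. `links` maps a page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over everyone.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank

# Hypothetical mini-web: two pages link to the inn, and the inn links back to one.
links = {"blog": ["inn"], "review": ["inn"], "inn": ["blog"]}
rank = pagerank(links)
best = max(rank, key=rank.get)
print(best, round(rank[best], 3))
```

Notice that nothing in there knows what an inn is. It only counts who links to whom, which is exactly the proxy described above.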
Well, what if we could skip that entire step? If the actual Boar’s Head Inn were assigned a unique identifier, then any time data related to it was stored in a database, that data could automatically be grouped with any other data on the web related to the Boar’s Head Inn. Machines would take care of determining absolute relevance because they would be able to communicate directly with each other about the meaning of the data they were comparing, not simply the content.
Machine #1: I’ve got data on the Boar’s Head Inn. See? It’s F5rD40FY586. My data relates to room rates and availability.
Machine #2: Oh yeah. F5rD40FY586, I’ve got that too. My data relates to user recommendations. Let’s put ‘em together.
Machine #1: Well that was easy. Wanna go grab a coffee?
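In code, that little exchange might look something like this. It’s a minimal Python sketch, assuming each machine tags its records with the shared identifier; the field names and values are invented for illustration, and F5rD40FY586 is the same made-up ID from the dialogue.

```python
from collections import defaultdict

# Same made-up identifier the two machines trade in the dialogue.
BOARS_HEAD_ID = "F5rD40FY586"

# Hypothetical records; field names and values are invented for illustration.
machine_1 = [{"id": BOARS_HEAD_ID, "room_rate_usd": 200, "rooms_available": 3}]
machine_2 = [
    {"id": BOARS_HEAD_ID, "recommendations": ["Great breakfast"]},
    {"id": "SOME_OTHER_ID", "recommendations": ["Too noisy"]},
]

def merge_by_id(*sources):
    """Group every record that shares an identifier into a single entry."""
    merged = defaultdict(dict)
    for source in sources:
        for record in source:
            fields = {key: value for key, value in record.items() if key != "id"}
            merged[record["id"]].update(fields)
    return dict(merged)

combined = merge_by_id(machine_1, machine_2)
print(combined[BOARS_HEAD_ID])
```

No text matching, no guessing: the two data sets line up because they agree on the identifier, and the record for the other hotel stays out of the way.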
As a markup language, HTML is great at helping organize the name, address, photo and description of the Boar’s Head Inn on a web page. Where it falls short is in giving computers any meaning related to that content. That is where RDF, or Resource Description Framework, comes in. First published as a W3C recommendation in 1999, RDF provides a universal standard for structuring information online. It facilitates the semantic web by moving stored information from a natural language format to a universal structured format that is easy for both people and computers to understand.
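To make that a little more concrete: RDF expresses information as statements with three parts (subject, predicate, object), usually called triples. Here’s a rough Python sketch of the idea using plain tuples. It is not real RDF syntax or any RDF library, and the example.org URIs and property names are hypothetical stand-ins I made up.

```python
# Hypothetical URIs on example.org stand in for real, globally unique identifiers.
INN = "http://example.org/id/F5rD40FY586"
EX = "http://example.org/terms/"

# Each statement is a (subject, predicate, object) triple, the shape RDF uses.
triples = [
    (INN, EX + "name", "Boar's Head Inn"),
    (INN, EX + "locatedIn", "Charlottesville"),
    (INN, EX + "type", EX + "Hotel"),
]

def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]

# Ask a question a machine can answer without reading any prose: what is a hotel?
hotels = match(triples, predicate=EX + "type", obj=EX + "Hotel")
hotel_ids = [s for (s, _, _) in hotels]
print(hotel_ids)
```

The point is that “is a hotel” and “is located in Charlottesville” are stated outright rather than buried in a sentence, so a program can look them up instead of inferring them.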
So now, back to our Boar’s Head Inn Charlottesville search. With SW, I don’t have to spend my time figuring out which search results are relevant to me. The computers do that for me. I can instead spend my time on more efficient tasks. Such as what, you ask?
John Markoff at The New York Times has done some solid reporting on the subject (behind a registration wall), and offers some insight into what you might be able to do with the semantic web:
Whereas today’s travel recommendation sites force people to weed through long lists of comments and observations left by others, the Web 3.0 system would weigh and rank all of the comments and find, by cognitive deduction, just the right hotel for a particular user.
Of course, you don’t have to wait for the semantic web to be able to access that functionality. You simply have to sign up to be a beta tester for VibeAgent.