where content, technology and people meet. (SM) Publishing and content technology executives use Shore to measure and understand their markets and competitors, define marketing strategies and implement successful content products and services using Shore's highly actionable insights into vendors, institutions, individuals and virtual communities.
ContentBlogger is the 2007 SIIA CODiE Award Winner for Best Media Blog
COMMENTARY:

Insights and headlines from Shore analysts on trends in enterprise and media content markets.
Thursday, June 26, 2008
It seems like only a few weeks ago that I was blogging about semantic search startup Powerset's soft-launch beta. In fact, it WAS only six weeks ago that we were covering Powerset's soft launch of new semantic search technology. But in those six weeks Barney Pell's crew got in a ton of good PR and a few meetings that have already resulted in a USD 100 million exit into the hands of Microsoft, according to VentureBeat. It wasn't so many years ago that Barney was a part of the bumpy exit of WhizBang Labs and its Web mining technologies. This time around his team was well ahead of the burn rate and blessed with both a good idea and good timing. With tons of cash on hand left over from its war chest for a Yahoo acquisition, Microsoft was ready to vent by spending some large (or, for them, small) at the deals mall to pump up its search for more advertising revenues.

Given Powerset's ability to parse natural language questions as well as to provide "Factz" topic clusters that could draw in related content, the target for Microsoft has to be the revived Ask.com portal as much as Google's leading search engine. Already Microsoft's Live.com search engine provides rich search results that emulate Ask's more user-friendly approach to search-driven content aggregation, but Ask still manages more meaningful responses based on natural language queries. Better front-end parsing and clustering of result terms from Powerset's technologies would certainly help Live to get more relevant and rich results that could help to build a larger audience, though how Powerset's technology will fare in absorbing Web content lacking the encyclopedic style of its trial Wikipedia content remains to be seen. On most test queries using natural language questions one finds Google to be at least as relevant in its results as other major search engines, if not more so, so even with new semantic technology Microsoft has its work cut out for it.

A better match for Powerset might be found on the enterprise side of Microsoft's offerings, where its recently acquired FAST enterprise search technology may benefit from some extra semantic search and clustering mojo - and find somewhat more structured content sources against which to apply semantic algorithms. That's not to say that Powerset won't succeed with open Web content, but in general semantic search technologies are most easily tuned when they're digesting documents with relatively similar styles. It would seem that this would be easier to tune to an individual enterprise's needs overall than to a world of Web content that could be in any shape at any time.

A better question might be why Microsoft hasn't considered purchasing Answers.com if it is so interested in natural language queries. With millions of pre-formed questions already in its WikiAnswers database, many natural language questions map very neatly to its answer sets. In other words, sometimes the best answer to a full-sentence question comes from a person who understood the question in all of its semantic details and has already provided the answer. This is far from a goof-proof solution to semantic search, but it's an approach worth considering as a valuable supplement to semantic document parsing.

In any event, the Powerset set now finds itself in the enviable position of having sold its ship before it ever went down the launching track into the waters. That's certainly more than a few publishing portals can say these days. Congratulations to Barney and all of the other rocket scientists at Powerset - it pays to have a technology that solves a problem that companies with deep pockets are ready to get their hands on.


By John Blossom - posted at 8:35 PM
Tuesday, May 13, 2008
There are rocket scientists, then there are rocket scientists - and then there's Barney Pell, long-time Silicon Valley startup maven and currently the Founder and Chief Technology Officer at Powerset. Barney is one of those rare people who has been a rocket scientist on both the NASA side of the term and the software industry side, an outlook that has helped him to assemble many teams through the years that have developed advanced search and language processing technologies. Powerset recently unveiled its first effort at a new technology to provide rich content from semantic searches, an interesting look at how one can completely reshape the face of a content product via enhanced search technologies.

Using Wikipedia as its primary target content, Powerset technology analyzes search phrases to come up with search results that match natural language phrases as well as keywords. This being a very early stage debut of the technology, some search targets work better than others, and overall I'd have to say that it's a technology that seems to do best with people and things as opposed to concepts. For example, if you type in "Who is Bill Gates?" you get a screen similar to the top of the above screen grab, which includes a top deck of biographical information from the Freebase reference database followed by Powerset's sets of semantic analysis called "Factz" that focus on what the Wikipedia article says about this prominent figure. One of these sets, for example, tells us that Gates gave testimony, a speech, an address, a demo, a presentation and a deposition. You can click on any of these terms to get more details from the underlying article.

Below the initial bio and Factz information is a set of search results for the initial query, including the best-match article on Microsoft founder Bill Gates. This is in essence the straight Wikipedia article with links mapped over to Powerset's version of this content, along with a handy visual presentation of the article's outline on the right or another listing of key Factz organized within the article outline. I like some of the inferences that it's come up with in the Wikipedia definition of Content that I contributed a while back: "information provides value; experiences provide value; content provides value." True enough.

I like how Powerset prefixes organic search results with federated content, taking a best stab at results on very focused topics that enable people to obtain knowledge more quickly and effectively. The automatically generated Factz, though, suffer from the same problem that most semantic tools experience when they examine a very small data set: spotty inferences. For example, in the Factz about Bill Gates Powerset inferred that he founded Cher, an inference drawn from the fact that biographer Howard Johns was known for revealing the addresses of these and other celebrities. Hmm. Don't think that I'd put that info down on my "Final Jeopardy" slate. I am also not so crazy about the organic search results, which tend to err on the side of word proximity. Again, with a relatively narrow data set such as Wikipedia it's not always easy to tune content analysis well to the capabilities of semantic text analysis in search engines.

The big picture for this early-days release of Powerset is that it is a great demonstration of how one particular source of content can be transformed through search and content federation technologies into an altogether different kind of publication. Oftentimes I talk these days about search technologies being similar to datafeed technologies, but in this instance it's important to recognize that search technologies are also end-publishing technologies in and of themselves that can aggregate, filter and organize content in altogether new ways that enhance the value of one or more core publications. Using free content from Wikipedia and Freebase, the Powerset technology does a good job of demonstrating this concept simply, albeit with some early growing pains. Publishers wanting to stay at the forefront of content markets are turning in droves to content federation technologies as a way to add value to existing product sets, so expect to hear more from technologies such as Powerset that help publishers to add value rapidly.


By John Blossom - posted at 11:53 AM

Copyright © 1997-2008 Shore Communications Inc. All Rights Reserved