With more publishers of scholarly and learned professional journal articles trying to build revenues through improved marketing, the search, display and sales tools being developed by DeepDyve are finding stronger interest than ever in 2010 among publishers. DeepDyve exposes free and premium scholarly content through its own search engine and through the search tools of partners and makes it available through its read-only viewing tool embedded in Web pages. This allows people finding articles to "rent" them on a once-off basis in read-only mode for as little as 99 cents. This can be particularly handy for people who would otherwise have little occasion to purchase a full subscription to a premium scholarly journal, thus opening up this premium content to markets that would otherwise not provide opportunities for new publishing revenues.
How much more revenue? In a recent discussion, DeepDyve CEO Bill Park indicated an estimate in the low billions USD for the total market available for "rental" pay-per-view style access to scholarly content. While this is certainly not enough to float the boats of scholarly publishers in general, it's largely found money that will increase their total revenues at a time when revenue growth is a challenge. That's a concept that attracts partners large, small, old and new to DeepDyve's services, including newly announced alliances with De Gruyter, one of the oldest and most respected scholarly publishers, and CiteULike, a Springer-sponsored social bookmarking service for scientific researchers.
For De Gruyter, an established brand still requires new marketing techniques to reach researchers who do not have access to paid collections in institutional libraries, while CiteULike, a venue that attracts researchers both in institutional and independent settings, provides a way for people in cross-disciplinary research to sample collections that may eventually be a part of their more permanent interests. In both instances the services of DeepDyve are well aligned with the needs of people involved in innovation management as they probe their own adjacent markets and test out new ideas that may be worth research and product investments.
Scholarly publishers are having to adapt to research markets that are increasingly moving beyond traditional academic boundaries, prompting both alliances with organizations such as DeepDyve and their own repackaging efforts to make topic-based slices of content available from a broad selection of their journals. While the topic-based repackaging has its merits, the DeepDyve approach to ad-hoc access on a read-only basis is an essential component of this repositioning of premium scholarly content, allowing publishers to test out what kinds of content are attracting premium access far more quickly than traditional marketing cycles are likely to capture.
So not only is "rental" content valuable in terms of its direct and ad-supported revenues, but also valuable because it is, in effect, "live" market research into "willingness to pay" habits in specific market sectors. It is then up to publishers, of course, to respond to the insight that they can gain from this sales data to consider new slices and titles that can respond to premium opportunities more rapidly. The more partners that a company such as DeepDyve gets, the more insight they are likely to have available to their partners via use and sales metadata to determine such trends. Should Google Scholar join the many established publishers already using DeepDyve, their metadata on content usage could become more interesting yet.
To some degree these concepts are "Publishing 101" ideas, but the speed with which research markets are shifting is changing the ways in which they need to be applied. With permanent collections of well-established journals constantly under the pressure of institutional budget cuts, the pressure is on scholarly publishers to define "must-have" collections that are really responsive to the needs of their customers. DeepDyve's content discovery and "rental" tools can help publishers to respond to both opportunities and threats to premium revenues more rapidly, even as they build premium revenues on an on-demand basis. Yes, this may seem like ancillary revenues to some publishers, but it is revenue that is both sorely needed and which can be a guide to where best to grow broader revenues that are more easily defended in challenging times.
In a typical game of chess, there are three distinct phases of play: the opening, in which a handful of chess pieces stake out strategic territory on the chessboard, the middle game, in which the positions of many pieces are used to jockey for control of the chessboard, and the endgame, in which the pieces are traded and moved rapidly into a reduced and final push for ultimate control of the board and the strategic goal of the game - capturing the king. It takes both logic and passion to excel at chess, but at the end of the day it's a well-executed plan that wins the day.
You might say that Google has been in the process of introducing its own endgame for online publishing, quietly moving dozens of initiatives into strategic positions which in and of themselves may seem inconsequential to the game as a whole - until its ultimate position begins to evolve rapidly. As in a chess endgame, Google's recent moves are swift, monumental in their impact and, potentially, decisive in determining the outcome of how content becomes valuable on the Web. Media critics like Ken Auletta have quipped that Google needs more "Kirks" and fewer "Spocks" to succeed, mistaking the crowded middle game of media posturing against Google for an ongoing battle, when in fact Google has been keeping its well-reasoned eye on the pieces that will be most important for the outcome of the game.
What's the king that needs to be captured in this endgame? The Moment. Media companies continue to churn out outdated moves such as media players serving up magazine-like renditions of their own content, thinking that quality that reflects the last game that they won is what will win the day. In the meantime, Google's intense concentration on processing power in cloud computing, Web-standardized applications and search dominance has revealed a strategy that is quickly eliminating viable moves for many B2B and consumer content and technology companies. After the September introduction of The Second Web via its Google Wave preview platform for real-time collaboration, Google has in recent days extended its dominance of The Moment via three new initiatives: expanded personalization of search results, real-time search results and voice, location and sight-activated mobile searches, including Google Goggles, a point-and-click camera-activated search feature.
Danny Sullivan at Search Engine Land has an excellent analysis of how Google's debut of personalized searching that doesn't require a Google login is introducing a "new normal" for its search environment, in which the content presented in search results will by default be different for different people based on their last 180 searches on Google. What is The Moment for these people? Where their interests have been most recently. Instead of waiting for editorial boards to decide what The Moment should be, Google is yet again trumping traditional editorial functions and allowing people's own behavior to have a seat at the editorial table automatically.
The introduction of content from real-time Web sources such as Twitter, Facebook and other status-oriented messaging services in Google search results extends The Moment into content sources that have split-second relevancy to online content seekers. Kipp Bodnar points out that this stream of tweets and postings means that B2B companies can no longer ignore real-time in favor of traditional SEO strategies if they're going to get people's attention. It's a broader scope than that, of course: nobody can afford to ignore real-time social media content generation now any more than a securities trader can ignore real-time stock tickers. All brands must enter the real-time conversation of The Moment to keep in touch with their markets and to define their markets.
Google's mobile search initiatives, introduced last week at the Computer History Museum, are perhaps the most profound in their potential impact, even if their ultimate powers are years away from being felt. Voice-activated and GPS-activated Web search is being perfected rapidly at Google and through other outlets, but the Google Goggles initiative, previewed in its development phases on MSNBC recently, brings a point-and-click element to The Moment that promises to give Google a real leg-up in mobile search markets. Using the camera in mobile phones, Goggles enables searches for information on things such as landmarks, stores, products and text simply by filling the camera's viewfinder with the item and clicking. Remember all of those fussy infra-red applications that were supposed to get us "beaming" business cards to one another? Now, just take a photo of someone's card and it will be uploaded into a contacts record. In just those few capabilities already targeted, whole content markets are about to develop as people capture content in The Moment.
And who will have all of the search data and metadata regarding all of these Moments? Yep. Yet again, Google is positioning itself to be the cloud-empowered master of what people are interested in right now, giving them the ability to bring people closer to their interests and passions simply by asking for them. And, yet again, by including as much content as possible in serving their customers, Google doesn't second-guess what people consider to be valuable in The Moment. If the stock and news tickers of the 20th century distributing content from central markets and publishers were the gold mines of Moments in that era, Google's absorption and distribution of content from anywhere to anywhere in The Moment has enabled it to enlarge its unique databases far more broadly and rapidly than any other publisher on earth. And, like a chess endgame, the speed with which other players are losing effective counter-moves against Google's strategic position in The Moment is only quickening.
No small wonder, then, that the U.S. Federal Trade Commission is scrutinizing Google's acquisition of AdMob, a leading mobile ad network. Markets thrive when there are still a good number of pieces on the board to keep competition high. But perhaps it's time for the FTC and companies in the content industry to look beyond this rapidly emptying game board and to consider what the next round of content industry chess is going to look like. If The Moment is the new center of the publishing industry, how does content become most valuable in this context? The answer to this question is, in part, to acknowledge that the companies who collect the most input about the world most rapidly become the most knowledgeable about what is happening in The Moment.
It's a phenomenon that I call "the Sensor Society," a world in which our corporate awareness and memory becomes valuable through common access in a way that reverses the "information is power" equation. Certainly having private information will continue to empower people and organizations in select circumstances, but for the average person or business having access to all information in the right context is becoming a more powerful resource for decision-making. To borrow a concept from my book Content Nation, some portion of the DNA of society is migrating into the Google-dominated cloud, with each of us feeding that part of our collective consciousness through our voices, our camera "eyes" and our fingers touching screens and keyboards. That may be a good thing for society as a whole, but it will be an enormous challenge for institutions that are not ready to accept that migration as a beneficial development.
What does this mean for publishers? It means good things for those that can manage to get their content into these personally defined Moments more effectively. But it also takes an acceptance that "the first draft of history" that many in the media business cherish as their mission is taking on a radically new form. Like the "playback" feature in Google Wave, everyone will have access to who did what where and when soon enough. The question is, who edited it the best? Google has staked its claim as the world's dominant editorial resource for displaying billions of histories a day, sweeping away front pages across the Web into a stream that assembles Moments that matter most to audiences.
We will spend time with content in any number of spaces thanks to this editorial resource, as we have on the Web for many years. But Google has accelerated the endgame radically in the past few months for those not tuned into The Moment. 2010 is going to be a year of momentous change in the content industry. Publishers that are tuned into The Moment will be in good shape to take on all of the inputs of The Sensor Society and to trigger astounding growth in cloud-based content markets. For those that aren't tuned in, well, you better get used to the idea that you're playing a two-dimensional game of chess against a 3-D chess master. Set up the chess pieces again, Spock. It's a whole new game.
A recent press release from Autonomy hailed an IDC report that gave them the leading market share for the search and discovery technology market. While congratulations are no doubt in order for Autonomy, which has thrived as other major competitors have struggled to gain momentum in general enterprise search markets, there's a wrinkle to this boast that should give one pause. Sue Feldman indicates in the report that Autonomy had a 14.4 percent share of the search and discovery market in 2008, which is certainly nothing to downplay but also not a crushing dominance of this market. In other words, even the world's dominant enterprise-oriented search technology provider is little more than a niche player.
This is in part because there really isn't "a" search technology marketplace in any strict sense of the term. That may sound strange at first, but it's certainly true that search as a content location tool can only measure its success against very specific needs. Each enterprise, each publisher and media outlet, each marketplace has specific needs for content that determine whether a particular technology has been well tuned to its needs. We can use tech terms such as precision and recall to define in general terms how effective a search technology may be in returning useful information, but if a technology can't deliver editorial value very specific to an enterprise, it's just a general tool that is rapidly and easily commoditized rather than a powerful content tool.
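For readers who want those two tech terms pinned down, here is a minimal sketch of how precision and recall are usually computed for a single query. The document IDs and the function name are made up for illustration and do not come from any vendor's product; the point is only that both numbers depend on someone having judged which documents are relevant for a specific need in the first place.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = share of retrieved documents that are relevant
    recall    = share of relevant documents that were retrieved
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: what a search tool returned for a query,
# and what a reviewer judged to be relevant to that query.
retrieved_docs = ["memo-1", "memo-2", "email-7", "email-9"]
relevant_docs = ["memo-1", "email-7", "contract-3"]

precision, recall = precision_recall(retrieved_docs, relevant_docs)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up case half of what came back was useful and two of the three useful documents were found. Whether that is good enough is exactly the kind of question that only a specific enterprise or market can answer.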
The importance of catering to very tailored content delivery needs was underscored in my mind by a recent chat with Craig Carpenter, Vice President of Marketing for Recommind, a company providing content categorization and discovery tools that are finding particular success in legal and corporate compliance markets. Recommind has focused its capabilities on supporting functions such as e-discovery processes that enable an organization to understand what documents relate to a particular legal matter in the early phases of assessing a case. Going through emails, word processing and other unstructured enterprise documents rapidly to determine which ones relate to key figures in a legal matter or compliance issue is a good stress test for any search technology. With recent U.S. government rules encouraging the use of electronic tools to accelerate content discovery, Recommind is one of a few companies that are well positioned to both accelerate compliance with those expectations and to eliminate legal expenses associated with the discovery process.
Certainly companies like Autonomy may be competitive in such situations, but when companies such as Recommind are focused more deeply on the needs of specific market sectors, they become, in effect, like subscription enterprise information services, delivering highly relevant content rapidly and reliably. There are, in truth, fairly few ways to attack search from a technology standpoint, so the most profitable victories in enterprise search and discovery technologies tend to go to the companies that have technology that is highly tuned to the very specific needs of a given market or client. That doesn't necessarily make one technology better than another in attacking those problems, but oftentimes only better tuned and one step ahead of other technology providers. So the fact that a company like Recommind is down in the depths of tuning their technologies to legal discovery and corporate compliance can offer them better margins for solving more focused, high-value enterprise problems - often the same kinds of problems that many enterprise publishers are trying to solve.
I do think that companies like Recommind that have done the heavy lifting on difficult enterprise search problems in specific sectors or problem sets can turn out to be double threats in enterprise content markets. Not only do they get to solve higher-value problems that are easier to measure for ROI, they also get to redefine market opportunities into other adjacent markets that may be difficult for others to attack. For example, when you look at the technology issues behind legal discovery, corporate compliance and more general high-value enterprise problems such as records management and knowledge management, there's a lot of overlap with a whole different range of technology services providers. On the other side of the spectrum, being able to categorize and organize content for the legal sector very effectively also begins to nibble at the opportunities for subscription enterprise services such as Thomson West and LexisNexis, which are also focusing more on semantic content organization but not necessarily with the deep technology focus of niche players such as Recommind.
Of course, the opposite forces of two-sided competition from large rivals can push back at niche-oriented technology players, but in general today's markets seem to be favoring specific solutions that make specific pains go away quickly in enterprises, with more general solutions with bigger tickets and fuzzier ROI being strung out on longer sales cycles. I don't think that we'll be seeing many new players like Recommind entering enterprise markets any time soon, but I do think that those that were able to get launched and cash-positive in the past few years are going to be tough competitors in the two-prong fight for content and technology dominance in the enterprise. Individually they may not take up anything like a 14 percent share of search and discovery markets, but when you look at their ability to respond to the best revenue opportunities within those markets, you can pretty much forget about the pie as a whole and start looking for the plums inside the pie that matter most.
If I had a dollar for every opportunity over the past few years to blog about the ins and outs of Yahoo's present and future, I could take you out for a pretty good dinner. The soap-operatic saga of how the leading but beleaguered Web portal lost many opportunities for greater industry dominance is well chronicled, but a deal now being completed for Yahoo to use Microsoft's new Bing search engine in exchange for Microsoft using Yahoo's ad network appears to set the stage for a new assessment of Yahoo's place in the online content industry that rises above the usual cult of obsession with Silicon Valley personalities. More importantly, this deal is not the only step that Yahoo is taking to strengthen its position as an online destination that solves problems for people with engaging content.
On at least one level the deal appears to be a no-brainer. Yahoo's search capabilities are quite good for consumer search, but they lack Microsoft's investments in the engineering mojo of its Powerset-enhanced Bing search engine to accelerate the maturing of search results into rich, contextual content. Yahoo has good ad technology and brand marketing, but needs both more inventory and more overall market share to get a more serious share of advertisers' budgets. Each organization will be able to take capital out of competing for their common but smaller pieces of the online search and ad pies and concentrate more on drawing market share away from Google and other sites using Google services. In doing so they will be able to build online and mobile revenues more effectively through their combined audiences.
This is all good, and probably much-needed competition for Google to strengthen the online breed. It also puts Yahoo's efforts to re-engineer its future as a direct competitor to Google comfortably in the past: Yahoo's greatest growth came during its earlier technology partnership with Google, which allowed Yahoo to concentrate on user experiences and content partnerships more effectively. Different partners, now, but similar opportunities await. So in spite of the "Yahoo has thrown in the towel" rhetoric floating around - or worse - there's reason to believe that this alliance is a good step towards Yahoo using its more limited assets to do what most successful Web companies do anyway: use alliances to do what you do best and to leave the rest to others. Bing will kill the Yahoo brand no more than Google's search and ad alliance killed the AOL brand; there's plenty of room for Yahoo to be a strong aggregator and services provider through and around Bing's capabilities. It may also, of course, be a way for Microsoft to absorb the benefits of Yahoo one step at a time while avoiding regulatory issues that an acquisition might raise, but given the iffy online future for both companies individually it's probable that a trial marriage through this deal that strengthens the assets of both companies is a more realistic step at this time than risking capital on a merger.
Yahoo is also not relying simply on Microsoft to reposition its strengths in the Web marketplace. In today's world of virtual aggregation, Yahoo's recent home page redesign beta, which includes links to major online Web sites such as Facebook and eBay, is an indication that they have finally accepted that Yahoo's strength as a brand can't grow exclusively on traditional content licensing deals. If Yahoo is to be the "starting point" of using the Web, as suggested by Jerry Yang, Yahoo’s co-founder and former chief executive, then it has to do as the Web itself does and become more adept at using links as a form of powerful brand endorsement. A media cynic may look at this and say, "Well, it's nothing more than a big Huffington Post with some extra ecommerce features," but if it does what people want it to do and they come back for more, then, well, who's going to laugh last? A successful product is first and foremost about meeting the needs of your markets cost-effectively, after all.
There are still many hurdles for Yahoo to overcome before it can be labeled a truly "hot property" again, but the new Microsoft alliance and the home page redesign are both key indicators that Yahoo is focusing increasingly on the things that will keep people coming back for more. The days of walled gardens filled with licensed content built one deal at a time are a waning phenomenon, but that leaves many hopeful days ahead for those who help people make the most of their online experience in whatever garden suits them best. Hopefully Yahoo will remain a key player in those efforts through their latest moves.
There was some scuttlebutt buzzing around last week's duel between the Google I/O developer's conference and the All Things D conference to the effect that perhaps Google had some intelligence about the Ballmer announcement of the Bing-flavored preview of Microsoft's new Live Search search engine that prompted their announcement of their Wave messaging and collaboration technology. Somehow that doesn't ring true, given the breadth of the Google Wave announcement, which is a pretty encompassing technology initiative. By contrast, Ballmer didn't have anything nearly as broad to offer the ATD crowd, but at least he had something to put up against I/O to keep people buzzing about Microsoft, most of which was catch-up to counter announcements at Google's earlier Searchology event.
If you can find any significant differences between Bing and the earlier Kumo-labeled version of Microsoft's Live Search preview, you have sharper eyes than I do. That's not necessarily a bad thing; there's a lot to be said for Microsoft's leveraging of their new Powerset technology that helps to dress up search engine results with related content and faceted navigation features. But in several forays into Bing searches, I cannot say that I am finding all that many melds of information that are truly impressive. Yes, it's nice to be able to have comparison shopping data, reviews and related links embedded in searches such as "Samsung LCD TVs," but that's not so different than, say, a search on Google for "JFK to SFO" with the "related searches" option turned on that has comparison flight shopping tools in the search results. Bing is good, perhaps even state-of-the-art, but hardly a game-changer for the state of search in general.
What the maturing Bing search results do seem to indicate is that the lines between destination sites and search engines will continue to blur as content providers and search engines both go in search of more valuable and engaging contexts for high-quality content. For search engine providers, being able to increase engagement time on a given page of search results is good for ad revenues and overall user satisfaction and brand value. For online publishers, the melded results offered in Bing, Google's Universal Search and other evolving search portals represent opportunities to engage audiences at the point of demand with solutions that enhance their own brand value while building revenues from advertising alliances with search engine portals. You might say, even, that the Bing/Google Universal Search approach is like dialing up a custom magazine/shopping guide/newspaper, with increasingly slick and well-organized content that begins to mimic the editorial capabilities of traditional specialty publications.
The parallel between traditional media and on-demand publications assembled by search engines is underscored in Bing by the rich and engaging photographs that appear on the home page of the Bing site. Squint a little bit and you can imagine the cover of a National Geographic magazine or other glossy high-quality publications. The visual promise of Bing's home page is that what you're about to experience is really, really good at a visceral level. The guts of this "magazine" don't yet match the cover, but you can tell that over time both Bing and other search engines are headed in the direction of getting search results to be as engaging and visually rewarding as traditional magazine publications, albeit with lots of the Web-savvy functionality that keeps people coming back.
With these evolutions in mind, publishers need to be prepared to make their content brands resonate in the online pages of whatever on-demand context appeals to their audiences - including increasingly sophisticated search engines that are aiming to keep people hanging around their pages as long as possible. Initiatives such as Journalism Online will help to make search engines more profitable aggregation venues for traditional publishers, but they need to be ready to accept more willingly the idea that search engines can be great publishing partners that help them to get their content to their audiences in the contexts that they value most. Certainly Bing will help to convince some publishers of this, but it's still early days for publishers recognizing that The New Aggregation is not a mere thought piece but instead a key component in the future of profitable publishing.
While the concept of the content organization features found in the Powerset search application was always compelling, the original content in the demo application set up for the early version of Powerset was not the most powerful presentation of its strengths. Now in the hands of its acquirer Microsoft, the Powerset features appear to be ready to take on a much-improved content set and interface in the guise of an internal project at Microsoft labeled "Kumo." As revealed by Kara Swisher at All Things Digital, an internal Microsoft memo is encouraging staff to play with the prototype search engine to get some initial feedback.
In spite of some scathing negative reviews from the search engine intelligentsia, the screen grabs provided by ATD of the Kumo interface look to be pretty competent. Gone is the over-busy Powerset interface, replaced by an interface that is at once Google-esque and yet unique. The top five web results are followed by results that match different facets of a search term. For example, results for the recording artist Taylor Swift return groupings of content available for her songs, her lyrics, her bio, her music downloads and her albums. On the left are possible searches by related artists and categories, as well as the ability to initiate new searches in video collections, bios and so on.
It's unclear at this point whether Kumo will be just a project name - it's apparently a word that means both "cloud" and "spider" in Japanese - or whether it's just an internal marker that may disappear as its features get absorbed into Microsoft's Live Search engine. For that matter, it's unclear that the features will make their way into production at all, though they are certainly useful enough. What is clear, though, is that Microsoft is going to continue to search for new ways to make alternatives to Google palatable in a way that might appeal to both enterprise and media audiences. I don't think that too many people harbor illusions about the ability to crack Google's dominant market share in search any time soon, but competition is good for the breed, they say.
I suppose the most intriguing aspect of Google's success that challenges the challengers such as Kumo is how Google has attained its success without explicit content categorization features. One can go to dozens of knowledge management and search conferences every year and hear about how important good content categorization features are for the success of search engines - and then look at the nearly naked search results on Google to contemplate just how true that may be. The assumption that categorization specialists have is that having categories makes it easier to browse content collections. Well, that may very well be true if you are in fact interested in browsing relatively finite and well-organized collections of content, but in general search engines have become less about browsing and more about delivering specific answers for most people. The average searcher seems to be trained now to refine their own searches via the "white box" rather than to traverse through browsing categories.
This isn't to say that content categorization isn't useful: it's more a matter of where it turns out to be most useful. Where it does seem to help most is in portal solutions where someone has come to a specific page of content and may want to explore that site or database from different facets. Where people understand that there's a finite, well-curated collection at their disposal, categorization seems to do quite well. Where it's a matter of sifting through billions of pages for the needle in the haystack, most folks are getting used to typing in the best search string that they can think of. With that said, the features in Kumo do provide an interesting and engaging alternative to Google search results, but they'd probably be better off either in specific content portals that need enrichment or in creating an on-demand portal from its results sets, so that it will be a more browsable set of content in its own right - and then, perhaps, attract a higher breed of advertising, if that's the goal. Instead of trying to out-Google Google, perhaps challengers such as Kumo need to think about how to out-aggregate the aggregators to build better revenue margins for smaller search operations. Something to wrestle with, perhaps.
The shameless self-promotion division of Shore is proud to announce that I'll be amongst the speakers at next week's SIIA Brown Bag Lunch panel presentation on Wednesday, 11 June, focusing on how to attract, monetize and retain audiences and clients through search technologies. The panel will be moderated by Leslie Kues, Senior Director at Microsoft's FAST, with my distinguished co-panelists Kate Noerr, Founder, Chairman & CEO of MuseGlobal, Stephen Baker, Chief Revenue Officer for EveryZing and Barbara Kroll, Director, Corporate Strategy for Wolters Kluwer. It promises to be a great panel, including both publishers using search in enterprise and media markets as well as two leading technology companies helping publishers and enterprises to get more value from search as a publishing platform. Registration information is here; the session will be available as a live event at the McGraw-Hill Building in New York as well as an online video event.
As for myself, I will be emphasizing how search is a publishing tool that is not just about the "white box" and a list of results but a technology that can enable content to be aggregated in a "just in time" publishing environment to support a wide variety of content applications for media and enterprise markets. If you're planning to come you may want to catch my earlier entry "Beyond Search Engines: The Database is Now" to get a feel as to how search engines are starting to replace databases as the primary content gathering mechanism for content applications and its implications for publishing. Long story short, the way that financial markets thought about stock tickers and trading room system middleware is how more advanced publishers are beginning to think about search engines.
Hope to see you at the brown bag - no food but plenty of beverages and great cookies - trust me.
There are rocket scientists, then there are rocket scientists - and then there's Barney Pell, long-time Silicon Valley startup maven and currently the Founder and Chief Technology Officer at Powerset. Barney is one of those rare people who has been a rocket scientist via both the NASA side of the term and the software industry side, an outlook that has helped him to assemble many teams through the years that have developed advanced search and language processing technologies. Powerset has unveiled its first effort recently at a new technology to provide rich content from semantic searches, an interesting look at how one can completely reshape the face of a content product via enhanced search technologies.
Using Wikipedia as its primary target content, Powerset technology analyzes search phrases to come up with search results that match natural language phrases as well as keywords. This being a very early stage debut of technology, some search targets work better than others, and overall I'd have to say that it's a technology that seems to do best with people and things as opposed to concepts. For example, if you type in "Who is Bill Gates?" you get a screen similar to the top of the above screen grab, which includes a top deck of biographical information from the Freebase reference database followed by Powerset's sets of semantic analysis called "Factz" that focus on what the Wikipedia article says about this prominent figure. One of these sets, for example, tells us that Gates gave testimony, a speech, an address, a demo, a presentation and a deposition. You can click on any of these terms to get more details from the underlying article.
Below the initial bio and Factz information is a set of search results for the initial query, including the best-match article on Microsoft founder Bill Gates. This is in essence the straight Wikipedia article with links mapped over to Powerset's version of this content, along with a handy visual presentation of the article's outline on the right or another listing of key Factz organized within the article outline. I like some of the inferences that it's come up with in the Wikipedia definition of Content that I contributed a while back: "information provides value; experiences provide value; content provides value." True enough.
I like how Powerset prefixes organic search results with federated content, taking a best stab at results on very focused topics that enable people to obtain knowledge more quickly and effectively. The automatically generated Factz, though, suffer from the same problem that most semantic tools experience when they examine a very small data set: spotty inferences. For example, in the Factz about Bill Gates Powerset inferred that he founded Cher, an inference drawn from the fact that biographer Howard Johns was known for revealing the addresses of these and other celebrities. Hmm. Don't think that I'd put that info down on my "Final Jeopardy" slate. I am also not so crazy about the organic search results, which tend to err on the side of word proximity. Again, with a relatively narrow data set such as Wikipedia it's not always easy to tune content analysis well to the capabilities of semantic text analysis in search engines.
The big picture for this early-days release of Powerset is that it is a great demonstration of how one particular source of content can be transformed through search and content federation technologies into an altogether different kind of publication. Oftentimes I talk these days about search technologies being similar to datafeed technologies, but in this instance it's important to recognize that search technologies are also end-publishing technologies in and of themselves that can aggregate, filter and organize content in altogether new ways that enhance the value of one or more core publications. Using free content from Wikipedia and Freebase the Powerset technology does a good job of demonstrating this concept simply, albeit with some early growing pains. Publishers wanting to stay in the forefront of content markets are turning in droves to content federation technologies as a solution to add value to existing product sets, so expect to hear more from technologies such as Powerset that help publishers to add value rapidly.
The announcement of Adhere Solutions' partnership with MuseGlobal to launch the "All Access Connector," a federated content integration solution for the Google Search Appliance, is one of those situations where an event is both obvious and profound in its potential impact on the marketplace. As enterprises today face an explosion of internal and external content sources that they need to integrate to create insightful content services, a huge gap has arisen between what most content platforms can do to unify that information and what enterprises really need. This is particularly true in enterprise search, where many search services fail to provide access to all of the sources that a person typically needs to access.
Federated search solutions have been one route to address this problem, querying interfaces to multiple searchable sources and assembling the results "on the fly" to yield a combined search result. Instead of trying to shoehorn all of the needed information into a single database or search index, federated search enables content to live wherever it has to and to come together when needed via multiple queries into integrated search results. Some do this better than others, and some have been at it for longer than others. MuseGlobal falls into both camps pretty handily, having provided federated content solutions for more than a decade, which has allowed them to hammer out an infrastructure that will pull together thousands of different types of content sources via federated queries.
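To make the "on the fly" idea concrete, here is a minimal sketch of a federated query in Python. It is not how MuseGlobal or any other vendor implements it: the three sources, their contents, and the relevance scores are all made up for illustration, and the sketch assumes that scores from different sources can be compared directly, which real federated engines have to work hard to achieve.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for independently hosted sources. In a real
# deployment each would be a connector to a search engine, database or feed.
SOURCES = {
    "intranet": [("Travel policy 2008", 0.9), ("Expense policy", 0.4)],
    "subscription_db": [("Travel industry outlook", 0.7)],
    "web": [("Travel policy templates", 0.6), ("Expense policy", 0.3)],
}

def query_source(name, query):
    """Return hits from one source whose title contains every query word."""
    words = query.lower().split()
    return [
        {"source": name, "title": title, "score": score}
        for title, score in SOURCES[name]
        if all(w in title.lower() for w in words)
    ]

def federated_search(query):
    """Query every source in parallel and merge the answers on the fly."""
    with ThreadPoolExecutor() as pool:
        batches = list(pool.map(lambda n: query_source(n, query), SOURCES))
    merged = {}
    for batch in batches:
        for hit in batch:
            # De-duplicate by title, keeping the highest-scoring copy.
            best = merged.get(hit["title"])
            if best is None or hit["score"] > best["score"]:
                merged[hit["title"]] = hit
    # Assumption: scores are comparable across sources.
    return sorted(merged.values(), key=lambda h: h["score"], reverse=True)

results = federated_search("policy")
for hit in results:
    print(hit["source"], hit["title"], hit["score"])
```

Nothing is copied into a central index here: each source stays where it is, and the combined result exists only for the moment of the query.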
All well and good, but the question is, how do you make this sing in the eyes of enterprise users? MuseGlobal's support of Adhere Solutions, a company that includes Googlephile Steven Arnold's son Erik Arnold as a Director, points towards a very powerful possible answer to that question: the Google Search Appliance. While the GSA is a popular search tool in many major enterprises, in many instances it's not been deemed the "go-to" search interface when it comes to getting all the right content from the right places all in one place. Federated content capabilities from MuseGlobal united with the GSA seem to fill that gap very handily. Capable of searching any number of search engines, internal and subscription databases and feeds as well as harvesting content via its own site crawlers, the MuseGlobal platform turns GSA into a clearing house for all of the content sources that an enterprise user might want - all delivered on the highly popular Google interface that provides access to Web content as well.
Combine this with both Google's programming interfaces for applications development and MuseGlobal's own extensive library of content integration tools and all of a sudden the GSA looks like a much beefier competitor for expanded use within the enterprise. And since the MuseGlobal library of source connectors includes many interfaces to subscription content services, it's a platform that can put subscription database providers on a new footing with their users as well. All of a sudden the GSA looks less like a user-friendly also-ran and a lot more like a growing hub for enterprise and online content resources.
We hear lots of talk about workflow as the key solution that's going to enable value-add enterprise content services to build new revenues, but pulling together a comprehensive set of the sources that their customers' users really need to do the job is oftentimes a slow and laborious process for many subscription database providers. At the same time enterprise portal providers are stymied oftentimes by users who refuse to use their solutions to any great degree because they're used to getting the answers they want from the search engines they rely upon as their real "go-to" workflow solutions. The All Access Connector solution offered by Adhere Solutions and MuseGlobal offers both camps a lot to think about as they ponder how best to ensure that they are delivering the content that their users want in the applications that drive their productivity the most. The era of The New Aggregation's ability to deliver more content value from more content sources more rapidly than ever is upon us in full, indeed.
I really love Rafael Sidi's Really Simple Sidi weblog. It's a great compilation of insights into sciences publishing that is easy to read and is in my daily bookmarks of news sources to monitor. Turns out that Rafael is a big fan of ContentBlogger also, so I was pleased to get a preview briefing from him on Elsevier's new Illumin8 product making its debut today. While it's hard to draw major conclusions on the significance of any product on Day One, it appears that Elsevier has enabled Rafael's team to come up with what promises to be a real breakthrough in STM workflow solutions focused on getting the right insights into emerging solutions to scientific problems effectively.
The problem in big-stakes scientific research and development fields is that most search tools are oriented towards topical approaches to research that don't necessarily focus on relating problems and the organizations and people focusing on them with the solutions and benefits that they provide. For example, if one were to look for research, news and Web content relating to the HIV virus, the typical search engine is going to look at a search centered on that term and come up with documents that relate to this topic - but not necessarily focus on the solutions and benefits being provided by specific research studies for available new products.
This is a critical factor when trying to select a new line of scientific research or to understand how to position a new product based on that research. How quickly can one define what solutions are in play for specific types of scientific problems by specific companies or universities? Who's delivering the most beneficial solutions? Illumin8 addresses these kinds of questions by adding an important semantic twist to search processing. Instead of focusing just on nouns to define how content relates to a topic, Illumin8 clusters results based on how they fall into verb categories that align topic groups such as organizations, products, experts and technology with problems and benefits associated with those topics. Using this tool one can very rapidly discover not just recent research, Web postings and news stories but also the real problems being addressed by that research and the real benefits being revealed.
Illumin8 has a very simple search interface thus far, a "white box" approach that will move from topics to problems and benefits mapping automatically, along with the ability to define more sophisticated queries using special keywords. You can choose from news, research and Web content or any combination of these via a checkbox interface and adjust your precision/recall balance for getting lots of results or just a few of the best matches with a slider bar. Search results come with graph bars and totals to make it easier to see which keywords and clusters of topics, problems and solutions are coming up most frequently in results.
While lacking some of the interface sophistication of a more mature product like Collexis that focuses deeply on helping people navigate expert network relationships and still needing to address some entity mapping issues the fundamental power of Illumin8 is quite evident even in its early introduced form. More sophisticated analysis of verbs as valuable tools in semantic processing is in part behind the proliferation of "sales triggers" intelligence products such as Generate and InsideView, which enable sales professionals to understand when news and other content sources are pointing towards companies involved in activities that impact their sales processes. Applying this type of processing to scientific studies and product development is likely to help scientific, medical and technical companies and organizations to get a similar leg up on understanding who's moving towards revenue-impacting insights more quickly.
It's an approach that can probably yield tangible benefits for many types of business information as well as consumer information. It would be nice, for example, to see a semantic engine such as Illumin8's applied to product and catalog sites. To some degree many existing search engines factor these kinds of semantic issues into their processing behind the scenes, but Illumin8 demonstrates that when one focuses on the problem-solution relationship from a product standpoint instead of a straight topic approach the benefits can be dramatic.
I am skeptical oftentimes when new products claim to be "workflow solutions," but Illumin8 seems to be pointing towards a pain point that people in R&D departments encounter often enough without real effective solutions being offered elsewhere that it probably qualifies as such a tool. It's another way of saying that there just might be some significant ROI in there if someone can do the research to tease it out from an early adopter community. Hats off to Rafael for a nifty product launch - helps to have that blog - and to the folks at Elsevier for giving Rafael a chance to strut his stuff. Hopefully Illumin8 continues to grow in scope, substance and quality.
Steven Arnold writes a thoughtful post on his Beyond Search blog about the inadequacy of traditional databases and search engines to deal with organizing and delivering content when the Web and many private content collections measure in petabytes and exabytes of information. Steve hints at a "next generation" database management system that can start to leapfrog over these problems, but the greater question is perhaps unasked in his article. Namely, as the problems that people need to solve with content technologies become increasingly complex and increasingly fleeting, why is it that we really need permanent unified databases to solve those problems? There is an important need for data normalization, but if normalization can be achieved "on the fly," as leading content federation services can provide, do people need a database or instead data objects that solve specific problems in the moment?
When data normalization was associated with creating massive databases that would be used for repeated functions such as payroll management or publishing functions such as newspapers or directories permanently structured databases made a lot of sense. But as market advantages gained through content publishing fall increasingly to those who can mine unstructured content, aggregate content from disparate sources and enable people normally confined to consuming content to create it and organize it, the traditional database is being relegated to one of many silos from which advanced content services can develop on-demand content solutions. Search engines, which rely on databases that can be queried in a standard format to provide standard answers, are beginning to fall into this same role of specialized answer tools. If you look at the typical search results page today from major providers you're looking at federated content from multiple sources, logically related to a greater whole but residing in separate storage environments and coming together in the moment as the answer to a specific question or need.
In short, what we have called a database is no longer a storage and indexing device. Rather, the database is now: the content sets that we assemble in a given moment to solve the moment's problem. Its structure is consistent thanks to XML standards, data dictionaries and data mining normalization tools; it can be stored as needed for time series analysis or corporate compliance; it can be shared with others to develop collaboration services or new forms of content and analysis. But in the next moment our needs may shift, sources may change structure or become unavailable or be replaced by different sources.
Market advantages tend to flow from institutions who can take advantage of content most effectively, and in the markets we can see how this concept already impacts business in a large way. In financial markets profits are shifting from public securities exchanges, whose transactions are built around highly normalized databases and data formats, to private transactions on highly complex financial instruments, whose underlying complex calculations on financial risk and return may apply to only a single transaction at a time. There is structure in such transactions, yes, and lots of normalized data, but the uniqueness of the content's structure at the moment that a deal is executed is far more important than its standard components.
Search engine providers such as Google understand this paradox explicitly and work hard to provide value-add interfaces that enable people to use search engine content as one of many feeds that can power "mashup" consumer and enterprise content applications. The Google search engine may be one of the world's largest databases but if other content in a form that's more usable in a specific context can come along and complement it in the moment, it becomes rather moot beyond a certain point whether or not it's in Google's index or another index. This federated approach to content value becomes at least as important as the quality of the individual sources. In a "the database is now" world, quality is as quality does - and it may mean something else a moment from now.
The implications of this concept for content publishers are enormous. Publishers have long been used to building their standardized databases, but the long-promised New Aggregation is on the verge of becoming the value leader for both enterprise and media publishers. Through the on-demand federation of content sources into aggregated content solutions the uniqueness of insights for small audiences is becoming a much more important method for creating value in aggregation than the pervasiveness of standardized insights.
Make no mistake, we'll be using today's search engines and databases for a long time as building blocks for federated content services, but we'll be less fixated on owning databases and more focused on owning the contexts in which they provide solutions. This is likely to change the pricing structure of content aggregation services significantly and to force traditional publishers into becoming on-the-fly aggregation services pulling in content agnostically from many sources that may not be under their direct control for more than a few moments. Subscription databases will yield, sometimes gradually and sometimes very rapidly, to subscription contexts, services that can assemble content from anywhere consistently and reliably for workflow and lifestyle applications. Yesterday's email inbox is becoming today's content inbox via feeds and social media: tomorrow's federated inboxes will be even more rich and complex through databases that live in the moment.
Social media and enterprise content federation services have already pressed many of these changes forward, but expect 2008 to be the year in which more than one company will begin to recognize the value of databases in the moment. The database is now - and so is the opportunity for publishers and enterprises to move beyond isolated content solutions.
Sometimes two distressful situations can combine to create relief, rare though that might be. Such seems to be the lucky break that both Microsoft and FAST Search and Transfer caught in the recent acquisition of FAST by Microsoft. FAST needed fast relief from crippling cash flow problems generated in part from a sales strategy that reached beyond their ability to deliver on ambitious promises. Microsoft on the other hand had failed to create any significant sales momentum behind its own enterprise search efforts, with players such as Google beginning to breathe down their necks more warmly with each passing day. So a mere USD 1.2 billion in cash works quite nicely to bring together two impressive partners that promise to dominate enterprise platforms for some time to come.
FAST's rapid growth over the past few years into an increasingly dominant position in enterprise search markets is just the ticket that Microsoft needs to position itself in increasingly competitive enterprise platform markets. With ever more content being consumed in enterprises via non-Microsoft platforms, domination requires a more agnostic approach to assembling on-demand content than Microsoft has been able to manage recently. FAST offers both solid enterprise search technology and an installed base of global corporate clients that Microsoft can leverage very effectively with the combination of FAST search capabilities to gather content and Microsoft's Sharepoint servers to store and aggregate content.
This last point is especially important for Microsoft's future revenues. With its Vista operating system rated ho-hum at best by most enterprise users and panned widely in consumer markets, Microsoft needs to shift the center of its profits to platforms such as search engines that are more central to what drives internal publishing in today's enterprises. Each page of search results can become in effect a purpose-built portal: in effect, the database is now, the content that's required to solve immediate business problems. Search technology such as that offered by FAST holds out the promise of search engines becoming the focal point for Microsoft's enterprise publishing strategy, offering Microsoft more opportunity to have offerings that scale effectively to both global and mid-sized corporations. That $1.2 billion may look like relative pocket change today, but in terms of the market share secured and the future market positioning that will be required to counter slowing sales on its aging operating systems it's a major investment in securing Microsoft's future cash flow.
Attendees voiced the value of the range of tracks from strategic management of knowledge to the practical aspects of selecting and living with search software and applications, down to the nitty-gritty of taxonomy implementations. Traffic was good in the vendor booths of the Expo area, as technologists and content managers mingled over receptions, meals and seminars.
The opening keynoter for ESS was Susan Feldman, Research Vice President, Content Technologies, IDC, who described a market in flux with many competing technologies. Search is the missing piece for enterprise software, and large software vendors are entering the market. SaaS options are good solutions due to the complexity of search technology and the need to have the latest version.
The keynote was a nice lead into the session that I chaired on "Solving the Multiple Search Engine Problem," addressing approaches to the proliferation of departmental search vendors within organizations. Rennie Walker, Wells Fargo, described "waking up one morning with the multi-search engine blues", resulting in the creation of a Search Center of Excellence (COE). Swetswise uses federating search software from MuseGlobal to deliver a subscription delivery product incorporating multiple search indexes. Miles Kehoe, New Idea Engineering, identified the challenges of maintaining distributed search engine indexes - a practicality not addressed by vendors.
Security, ediscovery and regulatory compliance were themes in other presentations. Search across multiple repositories brings the thorny problems of access control to the underlying content. Depending on the application, different levels of security may be necessary, down to the sub-document level. Choices include "early binding" vs. "late binding" options for access. Additional challenges include the changes in Federal Rules of Civil Procedure of 12/1/2006, making risk management of the enterprise search environment more critical.
Steve Arnold, highly regarded industry expert on search engines, chaired a keynote panel originally entitled "Giants Do Stumble: Are Google and Microsoft in Decline?" and modified in the final program to "What's Next for the Search Engine Giants", questioning product managers from Google and Microsoft, who provided little new insight. Both companies are relative newcomers to the enterprise search space, and had vendor booths in the expo, joining traditional vendors. Arnold, in a later session, homed in on Google and his analysis of their patents to predict new directions.
Findability is more than keyword search in full text documents, a message which came through in both the sessions and vendor presentations. Sessions on semantic search indicate progress in actual implementation, which is closely tied to classification and taxonomy systems. Improved navigation, particularly faceted search, is another approach to improving the user experience and findability.
Niche software vendors on the exhibit floor demonstrated other approaches to improving findability. Siderean uses a relationship approach which intuitively fits research and discovery processes. Cognition was demonstrating their linguistic search software, which shows great promise for in-depth research, particularly in scientific and technical literature with its plethora of potential search terms. Deep Web Technologies showed the power of federating search software, as implemented at science.gov and scitopia.org.
Enterprise search and management of organizational intellectual capital have become mission-critical. The challenge is finding the right approaches for the organization, then the technical tools for implementation. Increasingly, behavioral and linguistic aspects are being recognized as essential factors in the process of adding value to the organization. Search is not easy, and delivering answers to people is not straightforward. It's finding the right combination of solutions that challenges the attendees at these conferences: there is no one-size-fits-all!
I have enjoyed using the Compete.com traffic analysis service, which provides some useful data to compare Web site traffic performance more accurately and finely than the oft-bashed Alexa statistics. While Compete offers a more limited range of sites for analysis and only a year's worth of data to mull through it's able to track real visitors, audience engagement and growth with more meaningful data. On the Compete blog recently was a post that looked at how major search engines are performing in comparison to one another for both traffic and performance. While Google leads Yahoo and Microsoft with 67 percent of market share, the Compete stats claim that Yahoo comes out on top in terms of search fulfillment - the percentage of searches that actually result in someone clicking on a link in a search results page. Compete claims that Yahoo's search fulfillment rate is 75 percent, compared with Google's 64 percent and Microsoft's 61 percent.
Does this mean that Yahoo's search results are more "clickable" than Google's? Maybe so, but it's a rather ambiguous claim to make. One has to assume that with only 20 percent of people using Yahoo for searching to start with, only a minority find its search results to be more useful than Google's. So for that minority, Yahoo's results seem to be used more effectively. Overall, Yahoo searches are more optimized for people in a purchasing mode than Google search results, which tend to be optimized more for people seeking general information. With this in mind it could be that Yahoo tends to lead shoppers somewhat more specifically to product information that they're seeking - a factor that's likely to attract the brand advertisers that are at the core of Yahoo's marketing strategy.
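A quick back-of-the-envelope calculation shows why the fulfillment figure on its own is ambiguous. The sketch below multiplies each engine's share by its fulfillment rate, using only the Compete figures quoted here. It assumes that the share of people using an engine is a fair proxy for its share of searches, and that both figures were measured on the same basis, neither of which the Compete post confirms.

```python
# Figures quoted from the Compete post discussed above.
market_share = {"Google": 0.67, "Yahoo": 0.20}   # share of searching
fulfillment = {"Google": 0.64, "Yahoo": 0.75}    # searches ending in a click

def fulfilled_share(engine):
    """Fraction of all searches run on this engine that end in a click."""
    # Assumption: share of users approximates share of searches.
    return market_share[engine] * fulfillment[engine]

google = fulfilled_share("Google")
yahoo = fulfilled_share("Yahoo")
ratio = google / yahoo

print(f"Google: {google:.1%} of all searches end in a click on Google results")
print(f"Yahoo:  {yahoo:.1%} of all searches end in a click on Yahoo results")
print(f"Google delivers about {ratio:.1f}x as many fulfilled searches")
```

Under those assumptions, roughly 43 of every 100 searches end in a click on a Google result versus 15 on a Yahoo result, so Yahoo's higher rate still translates into far fewer fulfilled searches overall.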
Yahoo search benefits from doing fewer things better for fewer people, but Compete also shows that Yahoo as a whole performs far better than Google in the total attention that it gets from audiences:
While Yahoo's strong destination content helps to bolster its attention ratings it's losing ground to Microsoft in total page views as Microsoft bolsters its Live.com search engine:
In the middle of this is Google, still the overall search leader but beginning to stagnate as a destination as other search-oriented sites bolster content that transforms search portals more into destination content sites. Google has these abilities also but focuses more on solving a broader array of requirements for a broader search audience. Google also has more partners using its search technology as well as mashups and other API-based services so to some degree the Compete statistics are not revealing the full strength of Google's market presence. Google's growth as a destination search engine may have slowed, but its presence as a technology platform that influences where and how people find content in valuable contexts is growing in highly profitable directions.
All of this should serve to remind us that there is no longer one clear answer to how to create marketable value through search. You can focus on becoming more portal-like, you can focus on being more embeddable, you can focus more on a specific function such as ecommerce or you can focus on a range of functions - but regardless of the focus it's no longer a matter of just having great ranking algorithms or great server farms. Search has become just one of many tools for contextualizing Web content effectively on demand, one that will continue to grow in importance but just one tool in an arsenal of methods to be used for more effective audience engagement.
CNET News covers the first major ratings results from the revised audience ratings methodology at comScore's Media Metrix unit, and the results are not altogether rosy for major portal providers. According to CNET, under comScore's new qSearch 2.0 Yahoo lost market share from a year ago and is now at 23.5 percent for July, while Google gained share, reaching 55.2 percent market share. The New York Times also notes a fall in Forbes.com's audience measurement from 15.3 million in its original February data to a revised figure of 13.2 million.
One of the key factors aiding Google in the new measurement system is comScore's inclusion of search queries initiated via Google's infrastructure through search partners, as well as queries into "universal search" categories such as news or images from a search engine's home page initiated off of an initial query.
All of this builds audience share, which despite protests from other portal providers about quality audiences is still a major factor. The difference now, though, is that ratings companies are recognizing that in a world of embedded content, OEM relationships and mashups the "here" of content is less about who comes to your site and more about how your content gets in front of audiences in many venues. Jeff Jarvis notes in a "portals are past" rant that it doesn't matter if you get 10,000 impressions on a site with an audience of 100 million impressions or from multiple sites with smaller audiences, which is somewhat to the point but misleading.
With advertisers focusing increasingly on conversational marketing and contextual ad placement the new audience metrics are rewarding publishers whose content can engage those audiences in as many finely defined contexts as possible. The issue is less the total size of a portal's audience and more the ability of a portal to define the right audiences for advertisers. It isn't so much a matter of "big is bad and small is good" as it is getting the right context for your audience no matter where they congregate.
This is where Google has done itself an enormous favor over the past several years in encouraging the use of its content via mashups, Google Co-Op and other tools that make it easy for both professionals and amateurs to use Google content in so many different contexts. There is a lot to be said for the strategies of portals such as Yahoo! and Ask.com to engage audiences more deeply at their own destination sites to build quality audience engagement but they have lagged behind Google in defining unique contexts for content beyond their portals that may be less heavily branded but of equal value to advertisers. At publisher sites such as Forbes.com the problems are not so different, with a preponderance of traditionally syndicated content building up clicks but failing to produce enough unique content that can make a dent through their own syndication strategies to take advantage of new audience metrics.
In all of these instances Google gained an advantage by focusing on syndicating context rather than content, avoiding the expenses and lethargic pace of traditional content licensing deals in favor of making it easy for people to find anyone's content in the right context and to build additional and unique content around it. This can happen on large portals or small portals - it matters not to Google, as long as it keeps growing.
We've long held that portal strategies were topping out, so none of this comes as a terrible surprise, but it's interesting to see how advertisers in search of meaningful metrics are now one of the key drivers showing the way to online publishers who may have doubted the value of Google's strategies to advertisers. Traditional portals will continue to be important as branding mechanisms for content producers and marketers but the highly portable value of context is beginning to carve away at the bottom line of portal producers.
CNET interviews Jaideep Singh, the CEO and Co-Founder of the newly launched personal profile search engine Spock, and reveals insights into what is perhaps the hottest online content product launch this year. The Spock team has already assembled about 100 million tagged personal profiles of both living and historical people, including high profile people from the past like Diana, Princess of Wales and somewhat more mundane people from today like, well, me. Spock has carved out a very clever niche for itself, providing bone-simple search and navigation features like Google, personal profiling and networking as found in social media services such as LinkedIn and Facebook, content tagging, bookmarking and voting features like Digg and del.icio.us and content embedding features like PhotoBucket that enable a Spock profile to appear on Web pages beyond Spock.
There are all too many instances of features checklists like the above that could result in tragically bad content services but that's not the case with Spock. Through its system of content tagging and linking Spock winds up being a very powerful tool to research people who might have something to say on a given topic or to find out people who may have a connection to someone who you need to research. For example if you try a Spock search on "global warming" you get to no one's surprise a Spock profile of Al Gore as your first entry, but it's followed closely by Bill Clinton's profile (listed as "global warming advocate" [sic] as well as having a relationship link to Al Gore) and then by profiles of numerous global warming skeptics, including Rush Limbaugh. These are interesting and highly relevant search results that Google, as good as it may be from its own perspective, simply cannot duplicate.
Anyone can tag a person's profile returned on Spock with additional keywords that may be relevant to the person or add a vote for an existing tag. This is an exciting combination of content categorization and user feedback that provides the ability to create more relevance for a given person's relationship to a tagged topic without having to rely on evaluating external content sources. However Spock does quite a bit of external content evaluation as well, using patented algorithms to determine relevance, personal links and profile information. This information may be verified and edited by a person logging in to the Spock service and claiming their profile, much as in the Zoominfo online directory of professionals. In building a profile one can add links to existing personal profiles on social media services or links to relevant Web pages. Others may add links to your profile as well and vote on them, so there is a social media aspect to profile building also.
There's very little redundant information in Spock; it's mostly links to relevant information found elsewhere, as with other search engines. But the social media features, profile links, user tagging, bookmarking and personal profile validation features combine with straight search capabilities to create a truly unique experience with very useful information. Given that people have been "Googling" people for a long time you'd think that a major search engine like Google would have come up with Spock-like features to add value to personal searching, but Spock found that need and has filled it very nicely. While it may lack some of the strong business-oriented capabilities for finding professionals offered by services such as LinkedIn, Jigsaw or Zoominfo, the Spock method seems to try to be a Switzerland of sorts for social media profiles: have as many as you want wherever you want them and Spock will use them as useful input for building yourself an all-encompassing profile and content directory on their own service.
The mixture of solid results and fun exploration is sure to make Spock a very popular and useful service for people in both personal and professional roles, a factor that is likely to encourage people to build and maintain a high profile via Spock's search services. Spock helps to fill the gap left by purely automated searches that fail to incorporate personal wisdom on both people and topics, and does so in novel ways that challenge both conventional search engines and more traditional directory services to consider how people can be exposed most effectively to audiences searching for information about people and for personal and professional relationships with them. It's still early days for Spock, of course - performance is so-so at times and there are still some bugs to be found in basic features such as profile claiming - but as a tool to probe into people within the framework of key topics expect Spock to become a trend-setter for some time to come.
Read/Write Web notes the following comments by Tapan Bhat, Yahoo's vice president of Front Doors, at the recent NextWeb conference in Amsterdam. Tapan told attendees that search would not dominate the web in the future:
"The future of the web is about personalization. Where search was dominant, now the web is about 'me.' It's about weaving the web together in a way that is smart and personalized for the user."
Well, yes and no, Tapan. Yahoo's personalization plays are exploiting the trend towards audiences aggregating their own content from various sources, including feeds, widgets, bookmarking services and other social media tools. User-defined aggregation plays a key role in defining where and how people look for and find content. But where is most of that content coming from? Search engines power many of the mashup and widget-oriented aggregation plays that are touted as the leading edge of social media. Be it through Google or more enterprise- and media-oriented services such as MuseGlobal, Mark Logic, Nstein or Really Strategies, search services are evolving into the back ends for value-add content services that place valuable content in customized contexts well beyond traditional search results. So there's no escaping the importance of search and its ability to return the most relevant and useful content.
Where Tapan may have a point is that people aren't really looking towards new search engines to solve their problems. paidContent.org noted the arrival of Ask3D, a refreshed version of the Ask.com interface that, well, looks pretty much like the old interface but a little prettier. Ask.com is a good search engine, but I think that the personalization movement is a little bit off target. It's not so much about "let me personalize my search results" as it is "tell me what I want to know." If user-defined personalization accomplishes this, great, but Google's emphasis on anticipating what users need on a more personalized basis is probably closer to what will succeed for the 80-percent crowd. As noted by Information Today, the new "Universal Search" interface does a lot to customize search results to a specific context automatically, a concept that Google will expand upon as it integrates content from its wide array of search-based services even further over the next several months. For the 20 percent or less who will demand more control and features sooner there's now Google Experimental, which includes early-stage features that may make their way into the Universal toolkit soon enough.
So is search really "done" at this point? As the hottest problem to solve perhaps search is indeed past its peak, even though search engines will still continue to be refined. But the new generation of content services have search at their core and will add in feeds, Web mining and other capabilities to aggregate content on the fly far more effectively than information services have done to date. We all applaud Factiva's new integration of audio and video content into its search capability, for example, but the real proof of the pudding will be the applications that Factiva's clients choose to build off of such content. Consider search at this point the ad hoc database building tool of choice for millions of users that is only beginning to be used to its fullest extent to create highly valuable content services.
The native language of Hawaii has given us words like "aloha" that have slipped into general use as well as other terms like "wiki" that have been appropriated for new uses. Add to that list of appropriations the Hawaiian word "mahalo," which means "Thank you" in everyday conversations and now refers also to Mahalo, the new user-driven search portal under development by Jason Calacanis. "Mahalo's goal is to hand-write the top 10,000 search terms," goes the boilerplate on its page templates, an objective that's being led by ex-Anchors from Netscape and like-skilled guides. Visitors to Mahalo can suggest links for inclusion in the service. How does this all work? As an alpha-level product you have to give Jason some slack but in truth it's not something that you're going to figure out as a user in a few seconds. Thank goodness for the FAQ.
On one level Mahalo is quite simple: type in a search term, get either a page of information and links that's been largely edited by a Mahalo guide or something that's been generated automatically for terms that they haven't populated as of yet. Being day two there are lots more pages that are misses than hits, but a listing of the top 20 searches appears on each search results page to give Mahalo visitors a sense of who's looking at what. You can also enter questions in a natural language style, which will provide results that look a bit like an amateur's version of Answers.com (partnership, anyone?). An example of a topic page more fully populated by Mahalo guides is Apple, which lists a "Mahalo Top 7" links for the term, disambiguation (Did you mean: "Apple, the fruit? Apple, the Beatles' record label?"), financial information, products, news, blogs and fansites, information and reviews, upgrades and support, photos and videos, competitors, and "culture". Items that Mahalo guides really dig get a little icon. In theory users can make comments on Mahalo pages, but in my short tour I haven't seen any yet.
Well, this is certainly...innovative. Or utterly derivative, depending on your point of view. I know from personal experience that there is one huge brain between Jason's ears and it seems as if every idea he ever had or absorbed about the content industry exploded all at once from his noggin onto the pages of Mahalo. From one angle what we have here is About.com with user input: docents put together some light content that surrounds links. Okay, we know that works. Kind of. From another angle we have a dot-com era version of Hoovers, a light assemblage of business and product info to guide the initially curious. Interesting, but who is this aimed at? From yet another angle we have Wikipedia, a catch-all encyclopedia format that tries to catch a wide variety of facets about a given topic. Digg and other social bookmarking services enter into the picture with Mahalo Top 7 bookmarks, but there's not a strong sense of how useful the first seven results will be: social bookmarking services don't rank relevance all that well. And of course there's the analogy to Answers.com, one-stop answers to questions from the best sources available. Except we really have to trust someone called a "guide" as to his or her judgment on sources.
Finally, there's the question of when I will know when to go to Mahalo. Will it be when I have a question that's one of the top 10,000 search terms? Oooh, is what I want to find maybe number 15,000? I dunno. Try "most popular," Jason, people will be able to get their heads around that more easily. Do I go there to get the latest news? Hmm, they have news feeds from Fox and other partners but why would I get them here rather than other places - and why aren't the guides lending a hand with filtering and updating the news? While Wikipedia may be in the hands of "those darn users" I have a fairly high level of confidence that information on almost any popular topic will be updated within minutes, if not seconds, of something happening in the real world across a huge array of topics. I also know that Google will insert hot news at the top of my search results and that user-generated sites will help me to find the really cool news pretty quickly. I don't know how true that's going to be of any well-intended editorial staff covering tens of thousands of topics every day - even with help from users. Will I go there for shopping? Probably not, services like eBay and Google will scrape together the information that I need more effectively. Will I go there for reference information? Maybe, but with such a generic approach to content organization I'd probably prefer to type in a term on Google and branch off to Wikipedia, Answers.com, Hoovers or other key sources that it finds so easily. Will I go there to browse their taxonomy? Probably not, I've gotten too used to getting information on any topic level with one phrase and a click.
So, when DO I go to Mahalo? That's something that Jason needs to work on a little more. There are a lot of very interesting individual features and there's definitely a need out there for something between algorithmic search engines and the chaos of social bookmarking, but I am wondering whether this is more about a product vision or more about what to do with all of those ex-Netscapers who were inspired by Jason. If it's more the latter then it's not clear that a fairly limited and relatively anonymous editorial staff is going to have the horsepower or the respect within a given topic arena to build relevance creds. It gives Jason the control over writers that he desires, but in specific topic domains it may take more editorial talent to pull this off than he can afford.
There are so many ideas forming at once in Mahalo that it's far too early to write it off as a mish-mosh of interesting concepts - especially since people are growing tired of the "gaming" of search results. Calacanis could put initial feedback to good use, form more useful partnerships and come up with a tool that really stands out for an increasingly sophisticated online audience. But at this point my bet's against it. With Google's "Universal Search" capabilities beginning to phase in and more pure user-generated content plays becoming more disciplined and deep it's not clear that the features in Mahalo will ever mature to the point where they'll gel into a useful product in comparison to more established search and reference plays. At the same time there's far too little a sense of online community in Mahalo to make people passionate about online content feel that this product is really "theirs" in any strong way. In between these approaches there's probably room for a product that combines the best of search, editorial skills and user input to create marketable context for popular topics. But for now I don't think that people will be saying "thank you" to Mahalo for its attempts at filling that need.