TechFold - Bold tech & web commentary
TechFold is technology discussion, commentary, reviews, and opinions from well outside the valley. There's no koolaid to drink here, and TechFold is not in SL, or on Twitter.
Hakia vs. Google: the 5 Million Dollar Question
Hakia just loaded up with another $5M to keep working on their semantic search “web 3.0” product.
Out of interest, I compared search results for what I thought would be a good “semantic” search on both ol’ faithful Google and newfangled Hakia.
The query was: “How to protect my privacy on facebook.” Hakia’s results are here, Google’s here. Screenshots and commentary follow.
I figured it would be a good test of the two because there’s meaning to be extracted from the semantics of the sentence - “how to” implies that I’m seeking a tutorial, “protect” means that I have a specific action in mind, as opposed to just seeking editorial content on “facebook” and “privacy” and so on.
The verdict? Hakia’s second result was a splog trying to get me to sign up for a credit assessment. The rest of the Hakia results were a low quality mish-mash of blog posts and miscellaneous content which by and large were discussions about Facebook - for example, a bunch of HuffPo stuff.
Google, on the other hand, started with a PDF from Canada’s Privacy Commissioner on how to adjust privacy settings in your Facebook profile, and continued with several highly relevant tutorials (including Facebook’s own privacy pages), before breaking down into discussion posts.
The bottom line? Google’s result was massively superior. On the other hand, Hakia has built in social networking, offering to introduce me to others who have received similarly dissatisfying results for the same search. Super! Ok, fine. I’m not giving Hakia a fair shake with a sample size of one. But nor will anyone else, and first impressions count. At least Powerset keeps their weak beta results private; Hakia should have done the same, IMHO.
google, hakia, search
If you enjoyed this post, make sure you subscribe to my RSS feed!
Google Results Suck Hard… so does my SEO
I’ve always tried to stay out of SEO/SEM discussions as I’ve invested very little time into understanding the dynamics of search algorithms and the practicalities of SEO. As a content site owner, however, I’m starting to understand that it’s not just a traffic booster, but fundamental to the survival of my site (upcomingdiscs.com).
UpcomingDiscs has a great crew of reviewers cranking out quality DVD reviews - that Google can’t find. Here’s an example: take a look at the results for “planet terror dvd review” - seems like a natural enough search, right? UpcomingDiscs has a good review of “Planet Terror” - quality, original content.
- The #1 result from Google is a Netscape/Propeller page with one vote.
- The #6 result from Google is a Digg page with two votes.
- UpcomingDiscs doesn’t appear in the results. Or rather, I only looked as far as page 10, and we weren’t there.
- The first 10 pages are a mish-mash of review indexes, reviews, newspaper stories about the movie, and so on.
So - what conclusions can I draw from this little experiment?
1. Google’s algorithm is damaged. Propeller and Digg in first and sixth place are not right.
2. UpcomingDiscs needs some radical SEO.
The first of these I have little control over; the second, I have lots. I figured that producing an XML sitemap pointing to original content with clean URLs would suffice; apparently not.
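As a rough illustration of the kind of XML sitemap mentioned above, here is a minimal Python sketch. The helper name `build_sitemap` and the example.com URLs are placeholders invented for this example, not the actual UpcomingDiscs setup; the only thing taken as given is the standard sitemaps.org namespace.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per page URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical clean review URLs, for illustration only.
pages = [
    "http://example.com/reviews/planet-terror",
    "http://example.com/reviews/another-title",
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

A sitemap like this only tells a crawler where the content lives; it says nothing about how the pages will rank, which fits the experience described here.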
google, search, sem, seo
#2 Deconstructing the TC40: Faroo
#2 in my “deconstructing the TC40” series (see #1).
Faroo. P2P search.
Summary: DO NOT WANT.
The Proposition: No crawler, no centralized index. Instead, Faroo relies on users (presumably with a browser toolbar); each page a user visits is added to the Faroo distributed index. Visitor metrics gathered across the Faroo network drive rankings in search results.
Value Add: Better search. Advertising revenue shared with users.
Warning Bells: Revenue sharing is commonly a prop to support a core business that doesn’t have the strength to stand on its own. Faroo results will be based on popularity - a poor proxy for either quality or relevance. Similarly, Faroo’s results will be a thin slice of the net. Finally: Faroo’s website talks about Faroo as an alternative to [Google’s] “info monopoly” - mixing “information wants to be free!!1!” politics with business is a warning sign about the company’s readiness to compete on the main stage. Basically no barriers to entry.
Death Knell: Both del.icio.us and StumbleUpon could build out similar “social search” functionality around their indexes very quickly if the concept got any traction.
faroo, google, search, tc20, tc40, techcrunch
Recommendations & Discovery are the New Search
Recommendations and discovery are becoming more and more important - to the internet in general, but also specifically as a complement, supplement, and (gasp!) replacement to core search activities.
The big question I’m pondering is how long it will be before a “Search” box appears on the StumbleUpon home page - leveraging their human-derived index to allow for narrowly focused, highly relevant stumbling.
In one sense, the rise of “R&D” can be seen as a response to Google’s dominance - services like StumbleUpon, Medium, and so on offer entrepreneurs and investors a way around Google to influence people’s web-usage patterns.
In another sense, however, the evolution of R&D is a natural consequence of technology’s march. R&D services are fundamentally search engines: they just use a different class of algorithm (clickstream correlation), and accept queries in a less-structured fashion. Effectively, R&D services are another point on the spectrum of human vs. algorithmic search options: Google pins down one end, while Mahalo, ChaCha, and even Yahoo! Answers compete at the other. StumbleUpon and their ilk sit in the center, blending the human element (clickstreams) with the machine (automated aggregation and analysis).
calacanis, chacha, delicious, google, mahalo, medium, search, stumbleupon, yahoo
Corrected: DNHour.com - Digg for Domain Name News
DNHour.com is a community driven news site for the SEO domain name industry. IMHO this is a great vertical to target - the domain industry (and somewhat-related SEO industry) has a very high noise-to-signal ratio, and a good community (and algorithm) would go a long way to sorting some of the wheat from the chaff. Though it’s currently a little sparse on the community interaction side, it’s brand new, and I’m willing to give it a chance. I’m adding the DNHour feed to my reader. [found via Press Release]

From the press release:
DNHour.com is founded by a Malaysian-based domainer and serial entrepreneur, Koay Al Vin. After missing out on some big domain name purchases, he decided to keep in the know by tediously scouring domain forums and listing sites for what is available. He founded DNHour.com to ease the process and today, all domainers can help each other by sharing those important news and events at DNHour.com.
Other Coverage:
Net Monetization says get into DNHour early.
CORRECTION: Koay Al Vin contacted me to indicate that the site is focused specifically on Domain Name news & finds - not SEO as I assumed. Thanks for the correction!
digg, dnhour, domain+names, domaining, domains, marketing, optimization, search, sem, seo
FoxMarks to Power Search Engine, kill Mahalo
TechCrunch reports that Mitch Kapor (of Lotus 1-2-3 fame) is building out his bookmark synchronization service Foxmarks into a search engine, using meta-data entered by users to power search results. Foxmarks counts 20 million unique URLs in its human-powered index.
Del.icio.us should have been used by Yahoo! to do this the day after they were acquired. I suggested something similar in an earlier discussion about how Google’s AdSense could better serve ads. Point being: Yahoo is again failing to innovate or capitalize on their assets. They truly are the death by shareholder standard bearer.
Is Foxmarks a Google killer? Probably not. But: they have a growing asset with their human powered index, and the core bookmark-syncing system provides a value-added vector for spreading the tool and the brand. A flurry of press coverage could provide it with the impetus it needs to grab a piece of the search pie.
Is Foxmarks a Mahalo killer? Yes. Search engines fall on a continuum from totally algorithmic (Google) to totally human (DMOZ, Mahalo). Foxmarks sits in the middle, occupying what I think is the sweet spot - it blends the volume handling of the algorithm with the data categorization of humans, adds in aggregation to increase credibility and smooth out outlier results, and bundles it all together.
Of course, all of this discussion is academic. I haven’t seen Foxmarks in action, as I wasn’t at Foo Camp.
algorithm, calacanis, foxmarks, google, mahalo, search, seo
Mahalo’s launched a closed DMOZ

Warning: this is a lousy post. I’m exhausted after consecutive 14 hour days at the office.
Back when the Internet was new and fresh, Yahoo changed the world with its human powered search index. Time marched on, Yahoo went algorithmic, and the human-index torch was taken up by DMOZ: The Open Directory Project. DMOZ was, and continues to be, a human categorized index of websites: anyone can be an editor, and DMOZ claims 4,830,584 sites indexed by 75,151 editors.
Of course, DMOZ has fallen on hard times. It was originally sponsored by Netscape; AOL and Netscape still claim ownership but do little to throw attention its way. So, the DMOZ human-indexed directory of the web languishes in obscurity while projects like Mahalo by and large re-invent the wheel.
It’s weird. Some minor functionality tweaks, and DMOZ - categorizing the internet since 1998 - could have been Wikipedia. A little SEO applied to the DMOZ site templates and it could be huge again. Why are AOL and Netscape sitting on a historical gold-mine of data and a once-viable user community and doing nothing with it?
Let’s get back to Mahalo and re-inventing the wheel. TC has a good summary, and there’s the Calacanis post and press release too. In a nutshell: Mahalo offers human indexed search results to top queries, building a growing library of indexed-queries over time, falling back to Google results when no human indexed results are present.
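To make the "human index with a fallback" idea concrete, here is a minimal Python sketch. It is not Mahalo's actual implementation; the function names, the dictionary-based curated index, and the stand-in for the algorithmic engine are all assumptions made up for illustration.

```python
def normalize(query):
    """Lower-case the query and collapse runs of whitespace."""
    return " ".join(query.lower().split())


def search(query, curated_index, fallback):
    """Return (source, results): curated results if an editor has covered
    the query, otherwise whatever the fallback engine returns."""
    key = normalize(query)
    if key in curated_index:
        return "human", curated_index[key]
    return "fallback", fallback(key)


# Hypothetical hand-built index: query -> editor-selected links.
curated = {
    "planet terror": ["http://example.com/planet-terror-guide"],
}


def fake_algorithmic_search(query):
    """Stand-in for an algorithmic engine such as Google (assumption)."""
    return ["http://example.com/search?q=" + query.replace(" ", "+")]


print(search("Planet  Terror", curated, fake_algorithmic_search))
print(search("gps", curated, fake_algorithmic_search))
```

The sketch makes the coverage problem easy to see: every query outside the curated dictionary goes straight to the fallback, so the value of the service depends on how much of the query space editors manage to cover.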
My Random thoughts…
- So - essentially, they’re rebuilding DMOZ - associating websites with keywords manually - but doing so in a closed system dependent on the preferences and goodwill of the editors.
- Do you want your results selected by an individual? Personally, I’m more comfortable with a community (DMOZ), and most comfortable with an algorithm (Google!) that is at least impartial, if not always perfect.
- Does the human index solve the problem of poorly constructed queries?
- So… what does Mahalo do better or different? What sets it apart from the always-in-the-margins ChaCha? Hasn’t About.com been doing this for years too?
- Personally, I think DMOZ has been long under-utilized by both the searching public, and companies that could tap its open database of indexed links.
Prognostication
Mahalo will get a small dedicated following and grow very slowly before topping out at a marginal but respectable market share of a few percent. It’s not enough of anything new to have a big impact. It’s not “better enough” than anything out there to have a major impact. The main thing that will kickstart its popularity will be the savvy marketing of Calacanis. Expect it to fade to obscurity quickly when Calacanis leaves.
More Coverage
- CenterNetworks has a great roundup of some of the issues around Mahalo. Is this just a big link farm SEO play?
- Rex Dixon agrees with Allen.
- Redeye VC compares Mahalo to ancient pre-bubble1.0 human index Magellan.
- Webware relates Mahalo’s strategy to target the short tail.
Retrevo - Gadget Search Vertical Nirvana
Consumer electronics - aka “gadgets” - is a huge enterprise: just check the cross talk that dominates Digg about PS3’s, Wii’s, and Apple, or the burgeoning reader populations of Gizmodo, Engadget, and Crunchgear.
The existence of Retrevo, therefore, comes as little surprise: it’s a search aggregator that pulls together information in a targeted, gadget-focused vertical - only specific gadget-focused content sources are indexed.
To illustrate the notion of a vertical search engine, I put together my first ever Dion Hinchcliffe style graphic.
- The blue bars indicate the mix of formats and topic areas that define content found on the web: video, blogs, gadget blogs, adult video - everything gets a blue bar.
- The yellow “horizontals” denote the major search engines and their attempt to index *everything* on the net - a very wide reach.
- The relative size of each yellow bar illustrates their different depths - Google has a bigger index than MSN or Yahoo, and they all overlap.
- Retrevo is the red/pink bar: this is the essence of a search vertical: trying to own a single (or small group of related) blue bar(s), cover it very deeply (note the relative coverage of that particular bar compared to Google/MSN/Yahoo), and index it thoroughly.

So - that’s Retrevo in a nutshell: they’ve pegged “gadgets” as a vertical blue bar, and built an application around indexing it and serving focused results.
Content and links found via Retrevo are nicely categorized:

- Product Documents: Manuals! A spectacular feature, Retrevo indexes manuals - perfect for figuring out how to work items whose packaging is long since lost. Only shows up for searches containing a specific product name that Retrevo recognizes.
- Manufacturer Info: Official site and product pages for searches that include a prominent brand.
- Reviews and Articles: Wikipedia content where applicable, reviews from sites like dpreview.com.
- Forums and Blogs: Retrevo indexes specific topical forums, as well as the usual suspect gadget blogs.
- Shopping: Retrevo serves up results from Amazon, eBay, and other recognizable A-list retailers.
I particularly love the inclusion of forums: they are a huge source of user-contributed content, conveniently sorted by taxonomy (depending on how different forums are organized) and keywords (thread titles). Forums seem to have gotten the short end of the stick in the wider internet world, having been passed hype-wise by blogs, Twitter, and so on - but Retrevo has correctly identified them as a massive repository of contributed knowledge.
I found Retrevo search results to be consistently good: my search for GPS turned up good blog entries, product sites, and shopping options, as well as topic-area knowledge such as the Wikipedia article on GPS technology. My search for “Casio Exilim EX-S3” turned up lots on my trusty old pocket shooter - including the instruction manual - that’s just awesome.
A Few Suggestions
Of course, I would be remiss if I didn’t include a few thoughts as to how Retrevo could be better.
- Ditch the Preview: Retrevo devotes half of its screen real estate to a preview pane that loads a page whose “preview” button you click on in search results. Thoughts: this had more relevance before browsers were tabbed. Set every link to “target=_new” and call it a day. Reclaim that real estate for nicer results. If you absolutely need previews, use Snap Shots. I suppose the preview pane may increase “stickiness” - but the utility of a search tool is judged on task-completion-effectiveness: deliver on that value point well and stickiness shouldn’t be a problem.
- Build a Co-brandable Gadget Search Widget: Build a Retrevo search widget that can be co-branded, and then cut deals to get it on Engadget and Gizmodo, and of course any other gadget blog that wants it. The more people to whom your functionality is accessible, the better - let the network work for you.
That’s it - Retrevo is a great vertical application. It’s a Google-killer - in its narrow scope. If it can gain distribution and top-of-mind awareness in gadget freaks’ heads, then I’m sure it will continue to grow and flourish.
electronics, engadget, gadgets, gizmodo, google, retrevo, search
Tabbed Meta-Searching with Twerq: An IP ace up the sleeve?
Twerq wraps a rich AJAX UI around the search engine of your choice (Google, Yahoo, or Live), letting you conduct and maintain searches in parallel on separate results tabs. It’s a nice implementation and works very smoothly, and with saved searches and the “QuickTags” feature, it provides an easy way to get back to searches you run often.
Nice implementation aside, I don’t expect commercial success for Twerq, or any meta-search service, for that matter. Just to be clear, “meta-search” sites are those that use Google/Yahoo/??? APIs or page scraping for search results, and attempt to wrap value added features around those results - in the case of Twerq, tabbed results; in the case of others, consolidated results, results from multiple sources on the same page, and so on.
Anyway - the point is, no meta-search engine has ever taken off. The most compelling part of a search engine’s value equation is relevant results, and meta-engines don’t add anything to this other than volume, or at best, some amount of marginal convenience. In the case of Twerq, convenience might be a sellable factor, if (a) browsers didn’t have tabs, and (b) Google History didn’t exist. The marginal convenience value of Twerq is very low given what’s already available with no switching-effort.
Then, of course, there’s the fact that meta-searching is against at least Google’s Terms of Service.
So - the good people at Twerq have an uphill battle. A product segment without any successes, an apathetic response among the techy-public, and questionable legal status. So what’s their angle? Looks like patents are in the works. Hit the “about” link at the very bottom of the page, and the “Development Team”, and you’ll notice the line:
Jeff M. Furr - (Legal Council, Patent Application)
So - is Twerq patenting AJAX-driven tabbed navigation for searching? Or tabbed navigation in general? Given the dismal state of patent law in the US, you never know what Twerq may be patenting, and what might be granted. Anyway, given the low likelihood of organic success from the site itself, perhaps Twerq is hoping that a patent will provide a sellable asset, or a litigation windfall should some other hapless search engine go the tabbed route.
google, intellectual property, ip, msn, patents, search, tabs, twerq, yahoo
Too Bad
…that AmericasSearchEngine.com doesn’t exist. That would have nicely validated all of my ill-conceived stereotypical notions about the USA.
[from VALLEYWAG who has a great riff on how to compete with Google]
business, google, patriots, search
