Australian entrepreneur with FanFooty (alive) and Tinfinger (dead) on his CV. Working on new projects, podcasting weekly at the Coaches Box, and trying not to let microblogging take over this blog.

Monday, March 26, 2007

Citizendium launches, gets ignored

According to Google Blog Search and Technorati, I'm the first blog to link to Larry Sanger's announcement that Citizendium has launched. (The blogosphere seems too preoccupied with yet another iteration of Why Haven't Newspapers Died Yet, Would You Hurry Up Please FFS.)

Browsing through Citizendium is kind of weird at this early stage. Take the Computer page, for example. Its CSS, layout and page furniture are exactly the same as Wikipedia's, but the vast majority of the internal hyperlinks are coloured red instead of blue, because the links refer to pages which have not been created yet. It feels like, as someone once said of the alpha version of Tinfinger, a town that no one has come to live in yet. The Wikipedia version of the Computer page is superior in content and has only one red link among hundreds of blues, but it is locked for editing. The Talk:Computer page reveals that Wikipedia's editors have spent a lot of time perfecting this page via peer review, to the extent that the community (or whoever represents it) now considers the page to be "A-Class". The Citizendium version contains a lot of passages copied verbatim from the Wikipedia article, and many more which are changed only slightly by the addition or subtraction of a clause here or there.

The overall impression I get is... probably unfair. As Larry points out in his essay Why the Citizendium Will (Probably) Succeed, Wikipedia itself sucked horribly at this stage in its development, so it would be uncharitable to pick on CZ for its early flaws. I am still concerned about Larry's attitude though:

A good number of disaffected Wikipedians have joined us. Our increasing activity will bring over even more. These are frequently the sort of people we want. After all, our natural contributors like the idea of Wikipedia. They love the ease of contribution, the instant visibility of their work, the sense of shared purpose inherent in strong collaboration, the gradually improving quality, and so on. They love working with Wikipedia's many excellent contributors. Despite all that, they even more strongly dislike having to deal with its many problem users--disrespectful, immature, ideologically driven, or unstable people, that administrators are unable to rein in. Indeed, if the many complaints are to be believed, such people are to be found among Wikipedia's administrators.

Such people will also be found amongst people who leave the big W and go to CZ. Troublemakers. People who have been proven not to work in a team environment. Trolls. Setting yourself up as I Can't Believe It's Not Wikipedia only works if Wikipedia is not working: if it's actually the good people who stay at Wikipedia then CZ will become a renegade hideout of sorts, a hive of scum and villainy where all those disaffected with the Wikipedia culture or power structure go to bitch about the other site.

Friday, March 23, 2007

Ideas I would have blogged if I wasn't Twittering

Too busy to flesh these out into separate blog posts, so here they are in abbreviated form:

I see Nik Cubrilovic summed up a thought that I had too: that Open ID is not living up to the hype because everyone wants to be a provider but very few want to accept Open IDs from elsewhere. I learned very early on in the publishing business that publishing, be it offline or on, is all about your customer database, particularly contact details (i.e. email addresses). If you have little or no control over that, what is your business? Probably not a business. I don't blame AOL, Digg or the like for not accepting external Open IDs, but I do blame those who hype the technology without acknowledging the point that Nik makes so well. Unlike Nik, I don't want to support Open ID in my application, even though it comes standard in the People Aggregator code which we're still in the midst of integrating with Tinfinger. I'll be turning that feature off for the moment. Perhaps Marc Canter will want to make the business case for Open ID for me, but I don't see a compelling one.

I am on board the Twitter bandwagon, which stops for 10-minute outages about as often as a suburban train stops at peak hour. It's a fine success story at this stage of its launch cycle, though there are certainly things it could have done better. A default friend, for example, would be invaluable, especially if it was a bot that included 10 tutorial tweets about how to use the service. A way to differentiate humans from data feed bots would be very handy, so you don't get friendly banter interspersed with the latest TechMeme headlines or Woot items... maybe even a separate tab for those insanely spammy location posts? Twitter with a NetVibes interface would, as the kiddies say, r0xx0r my b0xx0r. Though that's not surprising, since they're both based on RSS, which as far as I'm concerned is the real star of Twitter.

I am skeptical about Google's PPA scheme. The simple fact of the matter is that PPA (pay-per-action), or CPA (cost-per-action), or whatever the relevant acronym is, just doesn't pay as well. The Google ads on FanFooty do about four to six times better than our Commission Junction ads, which admittedly are all for eBay. If Google can find a way to keep publishers onside through some technological breakthrough then they will have something, but otherwise it just sounds like the Goog is mopping up the small fry of unexhausted inventory, and this is no big deal.

And finally... FanFooty is doing 10 times the traffic and revenue that it did on year-ago figures. Tai and I had a nice meeting with a bloke from Champion Data earlier in the week, it's all good. This week marks the site's second birthday, so it's appropriate that it's only now getting past the crawling stage and is starting to make some noise...

Citizendium, Wikipedia and Tinfinger

I've set up a CompInt tab in my NetVibes to track blogs of those founders who I see as having similar goals to me. Pride of place goes to Larry Sanger's blog about Citizendium. The more I work on Tinfinger the more I am appreciating how the 5-second elevator pitch I have been using - "Tinfinger will do to the Who's Who what Wikipedia did to the Encyclopedia Britannica" - means that we will have to learn a lot from the Wikipedia experience.

Citizendium has already made many mistakes in trying to break free from the Wikipedia model while still retaining many of its advantages. Larry's latest post, entitled We Aren't Wikipedia, is the starkest illustration yet of what I think is his biggest error: he is continually defining Citizendium in terms of comparisons to Wikipedia. In January he and the CZ community made a fine decision to unfork, which should have been the impetus to delete all references to Wikipedia not only from the content pages, but also from the minds of the contributors. Yet Larry spent many post-unfork blog posts detailing his criticisms of the Essjay scandal, and he never lets an opportunity pass to bag the big W for some ethical slight or other, as if he thinks that is what motivates the CZ base. It may well be what motivates them, but if so, then that's not the basis for a healthy community IMO. You want people who are in your community for the community's sake, not just to spite some other project.

As for Tinfinger, we only share Citizendium's similarities with Wikipedia on items 5, 7 and 8 in Larry's list. We do share more differences, though: all but item 7 fit us to some extent. More details on how all of this works closer to the full launch.

The reason I have got so far into this debate is that Tinfinger is going to adopt Wikipedia's data structure: unique identifier strings for pages, which are cross-linked via category, type and property metatags for which pages are automatically generated. We have downloaded the relevant data from dbpedia.org of the roughly 60,000 profiles of people in Wikipedia. Instead of being stored in text-based infoboxes, however, we will store the tags as relational data in rows of MySQL tables to allow greater granularity in searches. However, like Citizendium, we will not republish any Wikipedia articles.
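A minimal sketch of that relational tag storage, using SQLite in place of MySQL so the example is self-contained (the table and column names here are my own invention for illustration, not Tinfinger's actual schema):

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the idea is the same:
# one row per (person, tag) pair instead of a text-based infobox blob.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL          -- unique identifier string for the page
);
CREATE TABLE person_tag (
    person_id INTEGER REFERENCES person(id),
    kind      TEXT NOT NULL,    -- 'category', 'type' or 'property'
    key       TEXT NOT NULL,
    value     TEXT NOT NULL
);
""")

conn.execute("INSERT INTO person VALUES (1, 'Alan_Turing')")
conn.executemany(
    "INSERT INTO person_tag VALUES (?, ?, ?, ?)",
    [(1, "category", "occupation", "mathematician"),
     (1, "property", "birth_year", "1912"),
     (1, "type",     "class",      "person")],
)

# The granular search a flat infobox can't answer directly:
rows = conn.execute(
    """SELECT p.name FROM person p
       JOIN person_tag t ON t.person_id = p.id
       WHERE t.key = 'occupation' AND t.value = 'mathematician'"""
).fetchall()
print(rows)  # [('Alan_Turing',)]
```

The payoff is that every tag becomes an indexable row, so arbitrary combinations of metatags can be queried with plain SQL joins.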

I don't know what that means in terms of licensing - the Wikipedia:Copyrights page doesn't talk about this issue - which is part of why I'm stating this in public. Perhaps there are lawyers well-versed in the GFDL (GNU Free Documentation License) who can tell me whether taking just the titles and metadata of these profile articles, not any of the prose text, means that the GFDL still applies. I want to act in good faith here, so I need some help.

Wednesday, March 14, 2007

Inform in limbo, News 2.0 deflates

Um... when did Inform.com turn off its public news aggregation service? I didn't notice that when it happened. It now appears to have turned into a private consultancy for mainstream media companies, with the Washington Post, the New York Sun, NewsOK.com and VIBE.com as its reference clients. Its public service copped a fair hammering when it launched, and I suppose it makes sense to concentrate on where the money is. If you're going to act as outsourced product development for MSM companies, you might as well get paid by the hour while you're doing it, rather than wait for one of the giants to buy you out.

Looking around, a number of the companies listed in my Feature lists for News 2.0 post at the start of last year have changed their business models on the run, and some of them have stopped running entirely. Topix, which has been in the news recently, has shifted away from GooNews vanilla towards a hyperlocal strategy, not surprising because it was bought out by a consortium of local newspaper chains. NowPublic used to be all about video, but its front page is now a generic textual news aggregator. Daylife started out in January by looking schmick with eye candy everywhere, but their design seems to have been beaten down by user demands to fit the text-stuffed GN norm (also noteworthy: even after Mike Arrington dinged them for lacking RSS feeds at launch, they took eight weeks to add them). Findory and Bayosphere bit the dust, of course. Gabbr appears to be broken: none of its news links work. Backfence looks pretty rundown, like a small town where the train don't stop no more.

Paradoxically, it's the sites which haven't changed which look the best. Newsvine and Techmeme haven't iterated their design at all and they look fine to my eye, though it's a bit worrying that I still can't see any ads on Newsvine. I guess when you have funding you don't need to worry about revenue. Gather actually looks okay and has some ads, although whether it's worth the millions in VC is something only the investors will know.

News 2.0 has not had quite the crash that the naysayers predicted - more of a slow deflation. Topix and Reddit are the current winners, I suppose, because each had a successful exit, plus NowPublic is pretty cosy with AP so they're on the right track [note: EDIT]. There is still time for some more victories, I bet.

The continuing saga of the Britney/suicide problem with AJAX

Via PaidContent, it was interesting to read Avenue A/Razorfish’s 2007 Digital Outlook Report which, as with most analyst reports, tells us things we already know. In this case, it tells us that page views are becoming obsolete in this increasingly AJAXified world, and some other way of monetising dynamically updated Web content has to be found:

AJAX Metrics and Time-Based Ad Serving. Analytic tools are already capable of measuring AJAX interactions. They simply monitor the number of requests to a server AJAX makes (known as an “event”), which allows us to infer interactivity. However, that’s still not a replacement for the page-view metric. Instead, look for this AJAX measurement to trigger time-based ad serving (e.g., the serving of ads, which refresh, over a given span of time). This seems like a much more appropriate tactic given the sheer amount of time users are spending while consuming audio and video online today.

This little pony would be nice to ride, but I'm not holding my breath. Apart from the technical, political and economic problems of getting the ad providers to engage with this or some similar new model, there is also the Britney/suicide problem - where contextual ad content fed into an AJAX-flavoured page may be catastrophically unrelated to the updated content which has appeared since the initial load - which I blogged about almost a year ago and was talking about on the WebMasterWorld forums 18 months ago. There has been no indication that the GEMAYA ad network players have even spent any time on this dilemma, busy as they are working on audio/video content, defending lawsuits and other stepping stones on the path of global domination.

It's a sticky situation. Get it? Sticky! Aaahahaha, I kill me sometimes.
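To make the Britney/suicide problem concrete, here's a toy sketch (the function and the keyword heuristic are my own illustration, not any ad network's actual algorithm) of how ad context chosen at page load can end up disjoint from the content an AJAX update swaps in:

```python
# Contextual ads are keyed off the page at initial load, but an
# AJAX update can replace the content without re-keying the ads.

def extract_keywords(text):
    """Toy contextual matcher: just the longer words."""
    words = {w.strip(".,").lower() for w in text.split()}
    return {w for w in words if len(w) > 6}

initial = "Britney Spears announces comeback tour dates"
update  = "Helpline reports rise in suicide calls this winter"

ad_keywords = extract_keywords(initial)      # chosen once, at page load
current_keywords = extract_keywords(update)  # what the reader now sees

# No overlap: the ad context is now catastrophically unrelated.
stale = ad_keywords.isdisjoint(current_keywords)
print(stale)  # True
```

Any real fix has to live on the ad-serving side: the page would need to re-request ads whenever the visible content changes, which is exactly the time-based serving the report gestures at.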

Saturday, March 10, 2007

Megatriples are a thousand times better than "Web 3.0"

Mummy and daddy are fighting and I don't like it. Dave Winer thinks Freebase is just the latest half-baked idea but Rich Skrenta says holy smokes, this is cool. Crazy uncle Nick Carr thinks it's the first major Web 3.0 application.

My position is firmly on Dave's side. Like Dave, I think Freebase and Pipes are just overhyped clones of Google Base, and we all know how that turned out. It seems like Silicon Valley is striving ever harder to build the biggest, emptiest vessel ever. I have little time for empty vessel startups. Give me content, give me value that I don't have to contribute myself. Give me something to latch on to. It's no use banging on about how you're building the Semantic Web if all that you contribute are some architectural drawings, and you expect someone else to do the heavy lifting.

As Kingsley Idehen points out, a far better example of the Semantic Web is dbpedia, a rendering of Wikipedia as searchable, downloadable RDF files. dbpedia holds around 25 million RDF triples - a triple being RDF's basic unit of data, a single subject-predicate-object statement in W3C-approved syntax. The Persons dataset alone weighs in at just under half a megatriple, and when combined with the 8 Mtriple Infoboxes dataset it will provide metadata for around 59,000 people stored on Wikipedia. That is the sort of thing that makes my mouth water, because it can be added so easily to Tinfinger's 350,000-strong person database.
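For a feel of what those triples look like on the wire, here's a minimal sketch that parses one dbpedia-style N-Triples line; the regex is a deliberately simplified illustration, handling only URI terms and plain quoted literals:

```python
import re

# A "triple" is RDF's atomic statement: subject, predicate, object.
# N-Triples puts one such statement per line, terminated by a dot.
TRIPLE = re.compile(
    r'<([^>]+)>\s+<([^>]+)>\s+(?:<([^>]+)>|"([^"]*)"[^.]*)\s*\.'
)

line = ('<http://dbpedia.org/resource/Alan_Turing> '
        '<http://xmlns.com/foaf/0.1/name> '
        '"Alan Turing" .')

m = TRIPLE.match(line)
subject, predicate = m.group(1), m.group(2)
obj = m.group(3) or m.group(4)  # URI object or literal object
print((subject, predicate, obj))
```

Multiply that by 25 million lines and you have dbpedia; the subject URIs are what would line up against rows in a person database.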

And while we're mentioning Web 3.0, here's something that has bugged me for quite a while now. Many bloggers pilloried Tim O'Reilly for coining and then trying to cash in on the term Web 2.0, and there was a noticeable push about six months ago to deprecate use of the term by the Arringtons, McManuses and Cashmores of this world. Now the people who want to be hip and cool talk about Web 3.0, as if 2.0 is all done and dusted and it would be totally gauche not to listen to the new stars of 3.0. I call bullshit. No, Tim O'Reilly does not own Web 2.0 and has no right to restrict usage of the term, but by the same token no one has the right to say that Web 2.0 is now useless and obsolete either. Nick Carr and John Markoff are guilty of the same elitist claptrap as Tim O'Reilly, and the surest evidence of this is that they're all on the same hypewagon on this one.

Sunday, March 04, 2007

Podshow metrics showdown: Adam Curry says 52,000,000/month, KATG says 12,000/day

Following on from the claim by Keith Malley of the Keith and The Girl podcast that Podshow had only 12,000 downloads a day of its podcasts (which I blogged about last week), Adam Curry has hit back with a number of his own:

For the record, In December 2006 the network produced 52 million download requests.

At first glance this seems to be a huge discrepancy, but as with most metrics discussions, we're talking apples versus oranges here. First, Podshow hosts podcasts but users can also search for podcasts outside the Podshow network through podshow.com. Presumably, the phrase "download requests" would also apply to searches for podcasts which are not hosted on its network... in other words, that 52 million includes people who listened to Diggnation or Morning Coffee Notes or even Keith and the Girl itself through Podshow searches. This is obviously a ridiculous metric.

Even if we're just talking about shows hosted on Podshow's servers, there are more than 25,000 podcast shows hosted on Podshow according to the comments in the Curry.com thread, but so-called "Podshow shows", i.e. shows that are wholly owned and produced by Podshow, are far less numerous. Wikipedia only lists five flagship shows: "The Dawn and Drew Show, Curry's own Daily Source Code, Madge Weinstein's Yeast Radio, CC Chapman's Accident Hash, and tech vidcast GeekBrief.TV". It is far more believable that Keith's 12,000 figure was relating to those flagship shows, not the third-party hosted shows and definitely not the podcast searches.

Adam Curry is being fast and loose with the facts, but that is to be expected in an environment where credible metrics are non-existent. Along those lines, I would be interested to hear the opinions of podcasters on this comment on the Curry story by Homer the Great (who also appeared in the comments of my last post):

I as a listener have donated money, bought products, supported movements and entered contests. The only thing that I can say is that is the only way that any shit talk could be put to rest is by everyone voluntarily agreeing to a common measuring stick. This said, I think it is a lousy idea. As soon as this is done, then podcasting becomes another form of media publishing that has no clue or commitment to the listeners. People who enjoy podcasts because they don’t behave like traditional media.

Having a Nielsen or Hitwise for podcasting is a double-edged sword. It puts a stop to pissing fights like this one and gives advertisers a reason to invest, but it does smack of the dreaded professionalism that Dave Winer so mistrusts. Just as AdSense-driven Web sites have an advantage over newspapers in that their metrics are completely granular down to the individual reader, podcasters can find out where each and every one of their listeners is coming from. Instead of tailoring content for pre-skewed focus groups and abstracted demographics, the Internet is supposed to herald a new age of perfect market knowledge.
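As a hypothetical illustration of that granularity (the log format and client strings are invented for the example), even a few lines of a raw download log can be rolled up per client in a way no radio ratings panel allows:

```python
from collections import Counter

# A podcast's raw download log already identifies each fetch;
# aggregating by client gives per-listener numbers directly.
log_lines = [
    '1.2.3.4 "GET /ep454.mp3" "iTunes/7.0"',
    '5.6.7.8 "GET /ep454.mp3" "Juice/2.2"',
    '1.2.3.4 "GET /ep455.mp3" "iTunes/7.0"',
]

# The user-agent string is the second-to-last quoted field.
clients = Counter(line.split('"')[-2] for line in log_lines)
print(clients.most_common())  # [('iTunes/7.0', 2), ('Juice/2.2', 1)]
```

The point is not the three lines of Python; it's that the raw material for honest podcast metrics already sits on every host's server, which makes the absence of a common measuring stick a choice, not a technical limit.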

Adam Curry is not helping matters by fudging his own figures. Adam, please break out that 52 million into external shows, hosted shows, and flagship Podshow shows. If you don't, your number has no credibility.

UPDATE: Keith Malley confirmed in KATG episode #454 that his 12,000/day figure was referring only to Podshow contracted shows. Adam Curry has still not coughed up an answer to this specific number.

Thursday, March 01, 2007

KATG claims Podshow has only 12,000 downloads a day

Keith and the Girl, arguably the biggest podcast in the world, has continued its occasional series of attacks on Podshow in its latest episode, Prove Me Wrong, by spending around the first 15 minutes of the episode claiming that the entirety of the Podshow network has only 12,000 downloads per day. This is less than the KATG show on its own, according to co-host Keith Malley. No word on where the figures came from as yet, but KATG do have a history of being leaked sensitive Podshow information from disgruntled insiders as in episode #247, The Podshow Contract, where they read out a leaked version of the contract offered to new recruits which had all sorts of restrictive music-label-like clauses.

The publication of such figures, if true, would be a further blow to the new audio media industry, which has already seen consolidation in the satellite radio industry this year. I wonder how those numbers would stack up against The Podcast Network's. Podshow has to date received US$15 million in venture capital, while I don't remember hearing about TPN getting any funding apart from out of Cameron Reilly's own pocket, yet it seems that TPN doesn't have much to beat.