Tinfinger: December 2005

Ben Barren on G'Day World

The inimitable Ben Barren joined the ever-Microsoftian Cameron Reilly one recent sunny South Melbourne morning for a skinny latte and a podcast, interspersed with Gameboy tech support for Cam's two sprogs. Highlights included Cameron's coinage of the new term "GEMAYA Islamiah" which I will be using liberally, and discussion of the 2.0-ness of Noam Chomsky. If Chommo was a twentysomething Valley boy right now, I bet he'd be right in the thick of it as CEO of his own 2.0 startup: drinking the red (star) cordial, propagandising for socialist bookmarking, blogging manifestoes about collectives-as-conversations, and building to flip (as long as it's the proletariat who's buying).

Old Man Riverin' along

Dave Winer is making an OPML/RSS aggregator. The market is saturated with RSS aggregators, so I guess adding OPML is the next iteration - not that difficult, so everyone and his dog will be doing it.

I may have had the odd word to say about Dave in the past, but that doesn't mean I don't respect the projects he champions. Tinfinger will be fully committed to providing RSS and OPML feeds, in river-of-news and clustered views respectively, for every appropriate page. Personally I also prefer the river-of-news format (which does not organise entries by topic, but instead puts them out in a flat time-based stream), but OPML enables a lot more variations in clustering techniques, so people who want that should be accommodated. In particular, a river of clusters in the manner of Memeorandum should prove popular.

The Root of all evasion

I suppose I have to reply to Peter Caputa about Root.net's justification for existence now that he answered my annoying questions in the comments on his blog. Well, it's not technically his blog: he was hired recently to blog on behalf of Root.net, even though he's the CEO of his own three-year-old startup which has little or nothing to do with attention. I'm not as worried about the dodgy nature of that deal as I am about the fact that Peter's own blog features a huge picture of a cute girl! A more egregious case of false advertising I have not yet seen in the blogosphere, just beating out Phil Sim's Squash which somehow gives out the wrong impression that he knows what he's talking about (just kidding mate!).

So anyway, where was I? Oh yes, attention. If memory serves, this thread of argument was started in a previous post where I tried to sieve Seth Goldstein's public pronouncements for nuggets of meaning. Peter attempts to keep the conversation going by replying at length (if not in detail) to my question about user benefits.

For clarity’s sake, rather than trying to compare the value of root.net’s services (in their current or future state) to the value of a PIM or the value that google or another web company that collects information about its users, it might be easier to explain the functionality that root.net’s services currently provide.

Peter, starting off by saying that you are not going to answer the question directly is the wrong way to go about answering anyone's question. It is a standard marketing technique to draw up a table listing features and benefits down the left and listing your company and its rivals across the top, with ticks and crosses and whatever in the table where appropriate. In the quest to get users to understand what you provide, might I suggest one of those?

Peter then goes on to list a number of features which Root's software will include, most of which sound useful. I have no problem with those functions; I am sure many people would enjoy being able to track their Web habits and whatever else is in the Root bag of tricks. All I ask is why people would want to use a service like that which requires them to give up their personal information, when they could get by without it and stick to their own memory and browser history. I question whether the average person would consider having their personal information stored by someone else a price worth paying for what is, in reality, a very minor set of information management benefits. People give up their personal information all the time when they spend time and money at major commercial Web sites, that is true, but the ROI for the user in that case is measured in products and services they can consume, not just knowledge of their own habits.

The other aspect of this carrot-and-stick approach is that if it is shown that people want an application which does all the things Root's software does, what's to stop some enterprising programmer/s spamming the market with carrots: building an open source freeware app which performs all those functions yet, crucially, doesn't require giving up that personal data? The answer is nothing.

Then Peter gets on to the subject of control.

Regarding your statement of control, however, we believe we enable a lot more control and ownership of a person’s clickstream than any other service that stores clickstreams. Most websites (or toolbars) ask you to agree to provide your clickstream in their terms of service. We don’t know of any of them that let you see it. And we don’t know of any of them that let you delete it after you’ve shared it. We do all that. We think that is a lot more control. As you mentioned, the only way to maintain more control is to never share in the first place.

Root does not "enable a lot more control and ownership of a person’s clickstream" in comparison to not using any service which stores clickstreams. That's my argument, which you confirm. The only way I could see Root's system being superior to opting out would be if you managed to convince the owners of those sites and toolbars to sign up to a system which would legally bind them into deleting their own records of individual consumers' behaviour after adding them to the Root pile of aggregated data. I can't see that happening though, for reasons of competitive advantage as I explained in my previous post.

Future value to users of Root.net
Now, I assume you still are hazy about the value that we will provide in the future. Suffice to say, at this point, that users will financially gain from using /Root Vaults. We’ll be blogging more about what we have in development soon. Tune in after the New Year and we’ll be ready to unveil more of what we have up our sleeves.

As I have said before, how can you possibly give enough financial gain to make it worth the while of consumers? CRM data is only useful in huge lumps, meaning the value per user is vanishingly small. Even if you did manage to eke out a quantum of return value from retailers back to consumers which was meaningful to the individual, all the large retailers would add that cost per user (in addition to costs of operating the Root system) back on top of the price of their products, so it's a negative-sum game and all Root ends up being is a parasite on each transaction.

This is my underlying complaint with Root. Seth Goldstein has openly stated that the venture is based on techniques developed in futures markets. Financial markets are parasites on the regular economy, and futures markets are parasites on the parasites. Futures markets do provide value back to the economy in terms of added stability and efficiency, I'm not arguing against their utility. However, they only work because they are parasiting off a "perfect market" in the sense that all pertinent data about its transactions and its participants are known, mostly due to legal requirements. Root, on the other hand, is trying to parasite off the most chaotic economic activity there is: that of sellers and buyers in the real economy trying to find each other. It will be a Sisyphean task trying to get market knowledge about Internet retail to the same level of perfection as financial markets.

So, Peter, I thank you for replying, but I still have two main issues with Root. The first is that I suspect that the carrot-and-stick approach on the demand side will fail due to either the stick being too intimidating or the carrot not being juicy enough. The second is that I suspect the larger players on the supply side will block any attempts Root makes to remove their competitive advantage. It's hard for these questions to be resolved at this early stage of Root's life, which makes Peter's blogging life difficult, so I wish him and the company luck.

We report, you decide to attack

The plagiarism meme has been boiling along for a week or so now, starting with Mike Arrington's article on Josh Stomel republishing his content. It was originally entitled "Fuck You, Josh Stomel", but Mike has since changed that to "Go To Hell, Josh Stomel" presumably in an effort to tone things down, though I fail to see much improvement in eternal damnation as opposed to presumably non-consensual sexual relations. Om Malik stoked the fires with his own attack on a splogger, and after a blog post by Jonathan B which journalistically covers the Stomel incident, Mike has continued with a rebuttal claiming that he is a victim being blamed. If that wasn't enough, Phil Sim weighs in with an I told you so.

Mike falls for that increasingly prevalent misconception that the multitudinous threats to journalism, particularly those of the Internet through the rise of amateur media, mean that practitioners of the craft should abandon their professionalism and instead embrace partisanship. He called Jonathan's piece "disappointing", but I thought it was a perfectly reasonable article which was a worthy addition to the debate. Jonathan presented the facts he was able to learn in a consistent manner without making unsupported statements.

Journalistic professionalism is there for a reason. It is there because journalists (of which I am one by profession, if not current practice) have the humility to understand that they can not presume to know all the facts of a story, and thus have to be very cautious in making judgements. If you think that's pompous, think of it this way: journalists are very wary of being accused of getting something wrong, especially something as important as a value judgement, so they try to cover their arses as much as possible so as to minimise the amount of justifiable criticism they have to endure. Either way works, depending on whether you're a glass-half-full or half-empty kind of person.

I think anyone who had an understanding of how blogs work would read Jonathan's piece and draw a negative opinion of Josh's version of events. Too many things he says are ridiculous and/or display a naïveté which is hard to credit to someone in his position. Crucially, Jonathan did not make these value judgements for us. He merely presented the facts and quotes as Josh had provided them. That he did not accompany his piece with a tirade against the evil of Josh's actions does not reflect badly on Jonathan at all - there is a time for an argumentative essay and a time for a factual briefing.

Mike says: "Journalists have a responsibility to dig for the truth, not to blindly report what both sides are saying. Just because people disagree doesn’t mean the truth lies somewhere in the middle." Much the same argument is put forward by those fighting creationism or the lies propagated by the US religious right who are continually frustrated by what they see as moral relativism giving equal screen time to their obviously disingenuous opponents. However, I submit that journalists can not assume that their audience is too stupid to make their own judgements. That is how Fox News got so popular in the first place - by Murdoch seeing that CNN et al were talking down to the audience too much and that there was a market opportunity for an outwardly right-leaning network to offset the leftish lean of other TV news. Giving in to partisanship would only lock in the wrongness of Fox's philosophy, which is mirrored in the American political part of the blogosphere.

It's all about respect: the journalist and/or blogger respecting the intelligence of the audience. Lose that, and you turn into the very thing you hate.

Robots ruin Christmas

When the history of the human race comes to be written, chances are 2005 will only be remembered as the year that the human underground resistance to the coming robot revolution got a name: HUAR. Of course that history will probably be written by the robots as a footnote to their own glorious ascension to the Godhead, but robots do believe in attribution, at least. Dave Winer may not like Christmas or Christians, but robots don't worship the Son of God, or even their own silicon equivalent. If/when the last Homo sapiens skull is crushed under the last servo-assisted mechalimb, Jesus Christ will be completely forgotten on December 25, as opposed to the current state of being mostly forgotten on that day. The recent news of a robot displaying self-recognition (although self-recognition != self-awareness) only hastens the robocalypse. As Keith of KATG said: "They'll all have to go to therapy."

My own family had about as good a Christmas as was possible, with my 91-year-old grandmother's dementia receding long enough for a smooth lunch and tea without her saying anything too outrageous, although the non-appearance of both of my sisters due to being thousands of kilometres away (Perth and Malawi respectively) made it less than perfect.

So this rumination on robots was brought on by the paucity of articles on tech.memeorandum, which is currently down to five clusters. Humans, understandably, don't blog much around Christmas. Australia is scheduled to be the first major blogging centre to recover due to timezone placements (and summer-induced boredom), which has led to the strange phenomenon of Ben Barren scoring 40% of the page's real estate with posts about cricket and valuations.

Robots keep working over Christmas, inexorable and unyielding, deaf to the sounds of Jingle Bells, ignorant of the joys of It's A Wonderful Life. I got my first telephone spam on Christmas Day this year, with a disembodied robotic voice which couldn't fake being human quite well enough informing me that I had already won a prize worth at least $40 by taking the call. That's the way they pull you in, with promises of free prizes and hyper-real-tasting roast dinners. 2006 will play host to a series of skirmishes in the HUAR war, with Tinfinger and others on the side of the living fighting with the big guns of human-authored content against the neverending onslaught of robot-generated artillery. Whose side will you be on? Will you sell out to the automatons for that quick fix of fleeting illusion, or will you stay true to your species and enjoy the deep knowledge of a thousand generations? Your genus needs you.

Tequp 1 roundup

The attendance list for the first Tequp unpissup:

Jeremy Hague, Netralia
Cris Pearson, plasq (aka atariboy)
Rachael Prins (aka jemgirl)
Daniel Parnell, Automagic Software
Pete Yandell, Alien Camel
Luke Tupper, freelance codemonkey
Manita Johnson, The Ball Group
Me (Paul Montgomery, Tinfinger)
and last but not least, the party slut himself: Cameron Reilly of The Podcast Network.

Cris says there were 11 there, but I only counted nine (update: Cris explains below). Rach and Cris formed a formidable faction in the corner, while apparently Daniel has been working with Manita on a custom imaging project for some big client, so there were already some collaborations going on. The St Arnou put on a decent spread with chicken kebabs, spring rolls, pizza and party pies all disappearing at a fair rate of knots.

The conversation was wide-ranging and more tech-focused than the Melbourne Long Tail Camp - the Tequp crowd was definitely more programmery than ideas-mannish. At one point Luke and Daniel got into heated debate over the relative merits of Objective-C versus Smalltalk, at which I could only smile politely and nod, clearly out of my depth. Luke held forth at length about the relative merits of writing commercial software as opposed to Web apps, with his experiences trying to flog his Club Manager app making me think Web apps are the way to go for small developers. Manita, despite being from public relations and thus my mortal enemy, proved to be a worthy participant and may yet prove to be the bon vivant who will add that much-needed touch of glamour to the local Web 2.0 scene, in the manner of Miss Rogue. Cameron got in everyone's faces and whipped out his big, long, thick... microphone, asking them to nominate the top geek moment of 2005 and make a prediction for the biggest one of 2006. Of course, the lazy bugger hasn't uploaded the resulting podcast yet (update: Cameron said the answers were too lame).

The mood was definitely optimistic, with everyone either starting a new venture or getting a lot of requests from ideas-men who want programmers to help them start their own thing. No doubt about it, Melbourne is the place to be for the next Internet boom down under. I look forward to chatting further with both Tequp and Long Tail Camp people on the mailing list that Cris is going to set up.

Web 2.1404

Thanks to my Scottish mate Dumdeedum for pointing this out: Go to the Web 2.1 home page and click on the "original idea" link. 404. Self-effacing in-joke or sad indictment? You decide.

Update: And they spelt "inaugural" wrong in the text-as-jpg (double bad!) tabs up the top.

Riya's ROI: lawyers suing Flickr?

There has been some discussion recently about the legitimacy or otherwise of Riya as a buyout candidate by one of the GEMAYA giants, most notably Don Dodge asking what problem it solves. Following on from my post on what I fear is a looming legal crisis for Flickr, maybe Riya's main purpose will be discovery of evidence for lawyers prosecuting Web 2.0 image companies.

Let me explain. Here's what the EFF says about the right of publicity in its Legal Guide for Bloggers:

The right of publicity is a claim that you have used someone's name or likeness to your commercial advantage without consent and resulting in injury. The plaintiff generally must prove that you're using their image or likeness for advertising or other solicitations.

That's about as succinct as it gets, with the explanation that "injury" means financial injury, and "other solicitations" covers any products or services you sell. However, the Guide doesn't say much more on the subject beyond citing a few cases in brief. The lawyers' side of the story was laid down in no uncertain terms by David L. Amkraut in a 2000 article entitled The 7 Deadly Myths of Internet Copyright. Remember that name. The right of publicity is actually a subcategory of the right to privacy, but Amkraut's article applies equally to publicity even though it is ostensibly about copyright.

Edward H. Rosenthal's contribution to a PLI handbook, Rights of Publicity and Entertainment Licensing, includes a section on the rights of publicity as specific to Web sites:

Content on web sites, including bulletin boards and chat, similarly do not require permission if not advertising in disguise. Stern v. Delphi Internet Services Corp., 165 Misc. 2d 21, 626 N.Y.S.2d 694 (Sup. Ct. New York Co. 1995) (chat-line). The chat-line concerned Howard Stern, a talk-show personality who at the time was running for governor of New York. The Court held that the chat-line, which permitted subscribers to use Stern's name in discussing Stern and his candidacy, was editorial content fully protected by the First Amendment. The Court, citing the leading case that deals with online services, Cubby Inc. v. CompuServe, Inc., 776 F. Supp. 135 (S.D.N.Y. 1991), agreed that an online service, even one where only paid subscribers may access the information services, is like a book store or a letter to the editor column in a newspaper. No permission is necessary to use the name of an individual in connection with such material. In Cubby the Court held: “A computerized database is the functional equivalent of a more traditional news vendor, and the inconsistent application of a lower standard of liability to an electronic news distributor... than that which is applied to a public library, book store or news stand would impose an undue burden on the free flow of information.” 776 F. Supp at 140. This of course goes to the chat service aspect of the Delphi service as opposed to a web site's editorial material, but a publisher's own editorial material is certainly entitled to the same protection as the letters to the editor. Daniel v. Dow Jones & Co., 137 Misc. 2d 94, 520 N.Y.S.2d 334, 340 (N.Y. Civ. Ct. 1987) (news and information aspect of online services are entitled to the same protection as a newspaper).

IMO, Flickr and its ilk can't claim editorial privilege. Their sole purpose is the dissemination of photos, and there is far less text associated with each photo than would qualify as significant editorial content. It would be a big stretch to argue that the comments section on each photo's page was similar to letters to the editor sections of newspapers. Also, it is almost certainly true that Flickr's use of photos of people, including many celebrities, is "advertising in disguise", as they gain financial advantage from account fees, advertising, and presumably the placement of links for Photoshow, Englaze, Zazzle and Qoop.

There have already been several cases involving the right of publicity and Web sites in California. Take this one in which the defendants (including Compuserve) settled for almost $1 million for posting 431 images of nude or semi-nude women, at $790 per photo. Notice the name of the prosecuting attorney: David L. Amkraut. Or there's this one which the prosecutor (there's that name again) won on appeal concerning 417 erotic photographs stolen from one porn site and put on another.

Those cases were brought during the last Internet boom. Who's to say the next boom won't see a spate of similar suits as lawyers like Amkraut see a new bunch of targets with big fat VC-pumped wallets? There are already blogging lawyers with their fingers on the pulse of this issue like Carolyn E. Wright, who writes the PhotoAttorney blog and mentions the right of publicity a lot. Her signoff to every entry is "Take my advice; get professional help."

Thus my suspicion that Riya will eventually be used by lawyers in compiling lists of photos of their clients on public Web sites with a view to suing the provider for profiting off their clients' likenesses. I'm sure Yahoo's lawyers did due diligence on this issue prior to acquiring Flickr. I'd be fascinated to hear their take on it. I want someone to tell me I'm barking up the wrong tree, particularly as it would make my job a hell of a lot easier with Tinfinger. IANAL, but I'm playing one on my blog. Someone tell me I'm wrong, please!

Top 10 locks for Web 2.0 in 2006

Everyone seems to be doing predictions at the moment, so I thought I'd up the ante. If any of these predictions do NOT come true in 2006, I will submit to having an entire can of paint (1 litre minimum) poured over my head. That's one can of paint for each wrong prediction folks, and they'll be in different citrus colours. In the unlikely event, GIFs of the dousing will be taken and uploaded to this blog next December. So, here goes.

#10. Sploggers, keep on sploggin'.
Google will not do anything that is greatly effective against the spam blogging menace, meaning that at least 80% of all pings to the blog networks will be splogs by the end of the year. Paint colour: red.

#9. Spruikers, keep on spruikin'.
Robert Scoble will not go a calendar month without at least five favourable mentions of Microsoft projects. John Battelle will not go a calendar month without mentioning his book. Steve Gillmor will not go a calendar month without mentioning attention. Paint colour: dark blue.

#8. GEMAYAers, keep on GEMAYAin'.
Not a week will go by without a story related to Google, eBay, Microsoft, Amazon, Yahoo or AOL appearing on tech.memeorandum with at least ten blog entries linking to it. Also, not a week will go by without one of those stories being on top of the tech.memeorandum page. Paint colour: light blue.

#7. Snarkers, keep on snarkin'.
The number of regularly-updated blogs whose sole purpose is to criticise Web 2.0 and/or its people will double, at least. Let's say there's half a dozen now, so make the target 12. Paint colour: yellow.

#6. Misogynists, keep on missin'.
There will be at least half a dozen vehement attacks on women in Web 2.0, which will seem (to me) to be overblown compared to similar bloke-on-bloke arguments. Paint colour: purple.

#5. Tontos, keep on tontin'.
Rogers Cadenhead will not go a calendar month without mentioning Dave Winer and/or one of Dave's pet themes. Alex Barnett will not go a calendar month without mentioning Steve Gillmor and/or attention. I will not go a calendar month without mentioning Memeorandum and Gabe Rivera. Paint colour: brown.

#4. Yankees, keep on yankin'.
None of the GEMAYA companies will buy a Web 2.0 startup from outside the US for more than US$20 million. Paint colour: green.

#3. Robbers, keep on robbin'.
At least a dozen new words or phrases will be coined to describe old concepts. Paint colour: pink.

#2. Winer, keep on whinin'.
Dave Winer will not go a calendar month without complaining about not being given credit for something, nor complaining about Google. Paint colour: orange.

#1. Haters, keep on hatin'.
At least a dozen of Winer's posts will start flame wars across the blogosphere. Paint colour: white (for whiteanting).

There you have it. It would be worth having paint poured all over me for any of those not to come true in 2006, so I'll be half-cheering for them. As far as judging the various categories, I am the final judge and no correspondence shall be entered into. I think you'll find I'm a fair arbitrator.

The Flickr of Tiger Woods

So I'm in the process of building our profile pages for people listed on Tinfinger. One of the primary elements that a profile page should contain to be useful to the reader is a picture of the person in question. You might be thinking: why not allow users to upload a picture of the person, or code in the top results for the relevant tag at some image uploading place like Flickr? Ah, not so fast. I wonder whether Flickr is a set of lawsuits waiting to happen.

The reason I say that is not because of the problem of people uploading copyrighted images. I'm more concerned about the laws allowing a person to retain some measure of control over the dissemination of his or her likeness, and its use in commercial environments. It is known in some areas as celebrity likeness laws, or the right to publicity.

I keep reading Ben Barren's blog and seeing pictures of beautiful women from Flickr. Who's to say some Hollywood lawyers won't come and slap suits on Flickr and whomever mashes up their content for profiting (via Adsense) from the constant stream of unlicensed pictures of nubile starlets?

The law in this area in the US is still in a state of flux, but there has been a recent big case involving Tiger Woods which sets out some strong precedent, relating to the artwork pictured at the right. This article is a good summary of prior cases, but was written before the Woods appeal was decided. This article sets out the full Woods decision. For those with too little time to comb through those links, in short:

You can't use a person's likeness in advertising without paying.
For non-advertising content, there is a tension between the artist's freedom of speech (in this context, the First Amendment) and the celebrity's right to publicity.
You can use a person's likeness in art, but only when the piece has a transformative effect on the likeness, so that the artist has "added a significant creative component of his own to [the celebrity's] identity".

It would seem to me that very few, if any, of the photos of celebrities uploaded to Flickr possess any of that "transformative" artistic element required to avoid problems with the right to publicity, and if they do then it's most likely not the original artist who has uploaded it so there are the usual copyright issues.

The implications of this seem worrying for Flickr and its ilk. If I was Tiger Woods' agent, I'd be ringing up these companies and demanding a cut of any revenue gained from images tagged with tigerwoods. Clickthrough when viewing a Tiger photo on Flickr? Cha-ching. Printing a Tiger Woods photo from Flickr? Cha-ching. Burning a DVD of Tiger pics on Photoshow or Englaze? Cha-ching. Making a calendar, poster or book with Tiger images on Qoop? Cha-ching. Buying a Tiger stamp from Zazzle? Cha-ching.

That's why I'm very wary of having unlicensed photos of people on Tinfinger. Am I being paranoid?

Tinfinger screenshot

For those few of you who signed up for the beta and are still waiting, here is our first screenshot. This is an example of the news pages.

You'll notice that there is a directory system rendered in blue and brown up the top, which works the same way as the old Yahoo directories but is more easily navigable on our site using AJAX. Sports buffs will notice that there are stories from both the NFL (gridiron) and AFL (Australian rules) in the news list, demonstrating how it works: each category not only includes news about people in our database classified into that category, but also includes stories filtered upwards from its subcategories.

Our next milestone is to add blog entries that link to news stories, which will be rendered in a dark pink colour (tongue). Plus there will be a bunch of functions in the right column. And Tai's working on the user account system.

Yes, it looks like a lot of other news aggregators at this stage, particularly Memeorandum. You will notice that each of the names mentioned in the stories is a link pointing to our person profile pages, which is where the site will really come into its own. More on that later when it's ready for public viewing.

There are also three search boxes: by person, category and tag. Those last two overlap: a person is classified by category but also has a number of tags attached to them, so that for instance Kurt Warner is in Humans/Sport/Football/Gridiron/NFL but also has the tags arizonacardinals and quarterback associated with him, so that you can search for news only about people with that tag or in that category. Registered users will be able to add tags to people and stories.
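For the programmery types, here is a rough sketch of how that category-and-tag lookup could work. It is written in Python purely for illustration and is not our actual code: the Person and Story classes, the function names, the "Example Player" entry and the AFL category path are all made up for the example. Only Kurt Warner's category and tags come from the description above. The idea is that a person belongs to a category if their path equals it or sits underneath it, which is what lets stories filter upwards from subcategories.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    name: str
    category: str                      # slash-separated path, as in the post
    tags: set = field(default_factory=set)


@dataclass
class Story:
    title: str
    people: list                       # names of the people mentioned


def in_category(person, category):
    """True if the person sits in `category` or in any subcategory of it."""
    return person.category == category or person.category.startswith(category + "/")


def stories_in_category(stories, people_by_name, category):
    """Stories mentioning at least one known person in `category` or below."""
    return [
        s for s in stories
        if any(n in people_by_name and in_category(people_by_name[n], category)
               for n in s.people)
    ]


def stories_with_tag(stories, people_by_name, tag):
    """Stories mentioning at least one known person carrying `tag`."""
    return [
        s for s in stories
        if any(n in people_by_name and tag in people_by_name[n].tags
               for n in s.people)
    ]


# Sample data. "Example Player" and the AFL path are hypothetical.
people = {
    "Kurt Warner": Person(
        "Kurt Warner",
        "Humans/Sport/Football/Gridiron/NFL",
        {"arizonacardinals", "quarterback"},
    ),
    "Example Player": Person(
        "Example Player",
        "Humans/Sport/Football/Australian rules/AFL",
    ),
}
stories = [
    Story("NFL story", ["Kurt Warner"]),
    Story("AFL story", ["Example Player"]),
]

football = stories_in_category(stories, people, "Humans/Sport/Football")
nfl_only = stories_in_category(stories, people, "Humans/Sport/Football/Gridiron/NFL")
quarterbacks = stories_with_tag(stories, people, "quarterback")
```

With that data the Football category picks up both stories, while the NFL category and the quarterback tag each pick up only the Kurt Warner one. Appending the "/" before the prefix test stops a partial name like Humans/Sport/Foot from matching.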

The interface is not finished, but since we've only just now got it to the stage where it looks something like a normal page with normal-looking results, I thought someone would be interested. Anyone? Not even my mum reads this blog any more. :(

Another Melbourne unconference: Tequp

Apparently Elsternwick is not the only hot meeting spot for Melbourne-based dot com startuppers. Via Cameron Reilly, the latest ~~pissup~~ unconference is called Tequp. No, it's not an obscure country town in Western Australia. You won't need to have your NAVTEQ up to get to it either, since it's centrally located at 582 Little Collins St in the Melbourne CBD this Friday at 6pm. The $15 entry fee tells me the guys behind this are serious. A better price than the Web 2.0 conference, anyway.

On the guys: Jeremy Hague is apparently working on third-party Skype apps; Keith Lang a.k.a. SongCarver is a self-described "composer/songwriter/innovator" from Queensland; and Cris Pearson a.k.a. atariboy is a "Musician & Graphic/UI Designer" whose blog is currently showing a 403. They remind me a lot of parties I used to go to in Sydney populated exclusively by ultra-suave hipsters like those journos from Internet.au magazine, Kate Crawford and Nic Healey. I could never be that cool. Hopefully the tequp guys are a bit more down-to-earth. Again, the brand of beer they drink (if any) will be key: anything European and I think I'm on shaky ground.

In any case, it's just another signifier that Melbourne is where the next Australian Internet boom is going to be at. Suffer in yer jocks, Sydneyites!

UPDATE: I just figured out that Cris and Keith are two of the five principals of plasq, whose primary product is Comic Life, a photocomic creation tool that looks funkier than James Brown. Plus three audio products that look ready-made to be first over the barricades in the podcasting revolution. Wow, some really good-looking stuff there.

Ben Barren is the anti-Grinch

Ben Barren cancelled Anti-Xmas. Just as well, I wouldn't have been too talkative since I'm keeping crunch-time programmer hours, so a 7pm start is not currently part of my normal biorhythm. Let me tell you all sometime about my theory of how Earthlings are descended from Martians, based on what seems to me to be my own Mars-influenced sleep cycle.

Standardised icon for feeds, yay!

Cue Dave Winer complaining that he wasn't invited to the meeting in 3... 2... 1...

Tinscore and other ways to clone Memeorandum

Following on from my Decisions, decisions post last week, here is what I've worked out. (Yes, I'm spamming Memeorandum's name, so sue me! :P)

On the size of the reading lists, I've decided on quality. I understand Topix has an automated Webcrawler to discover new content, but we're not at that level yet so we'll have to plod along tortoise-style with our ugly forms entering in details for each site by hand. Hopefully that will mean better-quality whitelists of sites for each category. Nevertheless, Tai's next job is working on an automated crawler, or at least something that will aid humans in figuring out each HTML site's details more speedily.

On tinscore, the working name for our ranking algorithm: I'm going with arithmetic for now and we'll see how that goes. The line in our PHP code to work out the algorithm at the moment looks like this for each person mentioned in each story:

tinscore = ( intitle + freqscore + prominence ) * share * storysize

Intitle is either 0 or 1 depending on whether the person is mentioned by name in the title. Freqscore is the number of times the person's surname is mentioned in the body text divided by 10, to a maximum of 2 (i.e. 20 mentions). Prominence is a number from 0 to 2 representing how early the person's full name appears in the body text, so that if it is right at the start then the prominence is 2, halfway through is 1 and at the end is 0, with non-integer amounts allowed in between. Share is this person's freqscore as a fraction of the total freqscore for all people mentioned in the story, multiplied by 100, so that if a person is mentioned 5 times but other people are mentioned 10 times in total, their share is 33. Storysize is the length of the body text as a fraction of 2500 characters, with an upper limit of 1 if the article is over 2500 characters long. The upper limit of tinscore is 500, i.e. (1 + 2 + 2) * 100 * 1. The highest score I've seen from our test data is 462.
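For anyone who wants to play with the numbers, here is a minimal sketch of the formula as described above, written in Python rather than lifted from our PHP. The function names and the way prominence is interpolated between the start and end of the body text are assumptions; so is computing share from the capped freqscores rather than from raw mention counts.

```python
def freqscore(surname_mentions):
    """Surname mentions in the body divided by 10, capped at 2 (20 mentions)."""
    return min(surname_mentions / 10, 2.0)


def prominence(first_full_name_pos, body_length):
    """2 if the full name opens the body, 1 halfway through, 0 at the end.

    Assumes a straight-line scale between those three points.
    """
    if body_length <= 0:
        return 0.0
    fraction_through = min(max(first_full_name_pos / body_length, 0.0), 1.0)
    return 2.0 * (1.0 - fraction_through)


def share(own_freqscore, total_freqscore):
    """This person's freqscore as a percentage of everyone's in the story."""
    if total_freqscore <= 0:
        return 0.0
    return 100.0 * own_freqscore / total_freqscore


def storysize(body_length):
    """Body length as a fraction of 2500 characters, capped at 1."""
    return min(body_length / 2500, 1.0)


def tinscore(intitle, freq, prom, shr, size):
    """tinscore = (intitle + freqscore + prominence) * share * storysize"""
    return (intitle + freq + prom) * shr * size


# The share example from the text: 5 mentions for one person, 10 for the rest.
own = freqscore(5)
everyone = freqscore(5) + freqscore(10)
example_share = share(own, everyone)

# The ceiling: in the title, 20+ mentions, named first, sole subject, long story.
ceiling = tinscore(1, freqscore(20), prominence(0, 3000), 100.0, storysize(3000))
```

Here `example_share` works out to roughly 33 and `ceiling` to 500, matching the figures quoted above.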

So that's the tinscore for the "snippet" where each person is mentioned in each story. For the purposes of ranking the level of buzz over each person for display pages, the collective tinscores are modified by recency, meaning tinscores are marked down over time. And I haven't even gotten to adding blogs in there yet. That's my next job, with Tai away in Adelaide for a few days. It should be much easier than scraping HTML pages (O, the pain, the pain!).
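The recency markdown is not spelled out here, so the following Python sketch is only one plausible shape for it: an exponential decay with a hypothetical 24-hour half-life, summed across a person's snippets. The half-life, the function names and the choice of exponential decay are all assumptions, not the formula we actually use.

```python
from datetime import datetime, timedelta

# Hypothetical: a snippet loses half its weight every 24 hours.
HALF_LIFE_HOURS = 24.0


def decayed(score, published, now, half_life_hours=HALF_LIFE_HOURS):
    """Mark a single tinscore down according to the age of its story."""
    age_hours = max((now - published).total_seconds() / 3600, 0.0)
    return score * 0.5 ** (age_hours / half_life_hours)


def buzz(snippets, now):
    """Sum the decayed tinscores for one person.

    `snippets` is an iterable of (tinscore, published_datetime) pairs.
    """
    return sum(decayed(score, published, now) for score, published in snippets)


now = datetime(2005, 12, 13, 12, 0)
snippets = [
    (400.0, now),                        # fresh story, full weight
    (400.0, now - timedelta(hours=24)),  # day-old story, half weight
]
total = buzz(snippets, now)
```

With these sample values `total` is 600: the fresh story counts in full and the day-old one counts for half.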

Back to the decisions. On the size of the result set: everyone in our database gets indexed if their name is found in a relevant story. There is still some code to write to streamline the process, of course. At the moment with our display page for AFL players I'm doing a minimum of 16 database calls (usually 25+) with orders on a table with more than 45,000 rows (four months of archives of Media Street) and even on our test box on our local LAN I have time to go and get a cup of tea while the results trickle onto the page. As with Media Street, I always knew that pre-populating Tinfinger results was inevitable. I'm sure that's what Gabe Rivera at Memeorandum does too; that's why he "publishes" every five minutes and only shows that published data. Not that I'm criticising at all: that's the correct thing to do if you haven't got a server farm the size of Google's. As Tinfinger's range of categories grows, even that prepopulated table structure might prove problematic if we stick results for all categories in the same table, as I will do at first. We'll cross that bridge when we come to it.
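The pre-populating idea can be sketched roughly as follows, again in Python and with an in-memory dictionary standing in for the prepopulated table. The class name, the `compute_ranking` callback and the sample rankings are hypothetical; the only detail taken from the text is the five-minute publishing cycle.

```python
import time

PUBLISH_INTERVAL_SECONDS = 5 * 60  # the five-minute cycle mentioned above


class PublishedResults:
    """Holds the last published ranking per category; page views only read it."""

    def __init__(self, compute_ranking):
        # compute_ranking(category) stands in for the slow, many-query path.
        self._compute_ranking = compute_ranking
        self._snapshots = {}
        self._published_at = {}

    def publish(self, category, now=None):
        """Run the expensive computation once and store its result."""
        now = time.time() if now is None else now
        self._snapshots[category] = list(self._compute_ranking(category))
        self._published_at[category] = now

    def is_stale(self, category, now=None):
        """True if the category has never been published or is due again."""
        now = time.time() if now is None else now
        last = self._published_at.get(category)
        return last is None or now - last >= PUBLISH_INTERVAL_SECONDS

    def read(self, category):
        """What a page view calls: no computation, just the stored snapshot."""
        return self._snapshots.get(category, [])


slow_calls = []


def slow_ranking(category):
    """Placeholder for the 16-to-25-query page build; returns made-up rows."""
    slow_calls.append(category)
    return [("Player A", 462.0), ("Player B", 120.0)]


store = PublishedResults(slow_ranking)
store.publish("AFL", now=0)
first_view = store.read("AFL")
second_view = store.read("AFL")
```

However many page views arrive between publishes, the slow path runs once per cycle; the trade-off is that readers can see results up to five minutes old.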

On the structure of human metadata: I guess it has to be keyword-based tags, not plain English with spaces. Tags are undeniably useful for a range of applications, although we sacrifice some usability for people not familiar with the way tags work (but even that can be solved with judicious use of plain English metadata attached to the tags).

Finally, on disambiguation... well, it's still a bitch. Having said that, it will be worth the effort, since I would hate for Tinfinger to have as many false positives as Zoominfo seems to have from my (admittedly limited) sampling. I think for a human search engine to have any credibility, it should strive for as close to zero false positives as possible. I'd rather leave out 99 accurate items than include a wrong one. Perfection may be forever elusive, as with Wikipedia, but if you set up the site's processes so as to concentrate your best energies towards the quest for accuracy then that's the best anyone can do.

ITJourno and the flogosphere

Irony is all over this one, but I'll try to wade through it. ITJourno.com.au, the site started by my old journalist sparring partner Phil Sim, today posted a rant by Phil entitled Something smelly in the flogosphere. That link is behind a login wall, as is the entire site - it's invite-only for Australian technology journalists and PR flacks.

Phil has a problem with blogs, specifically that they play fast and loose with copyright. He didn't like it that someone leaked an edition of ITJ's Epitome column to Frank Arrigo of Microsoft Australia, who then pasted it in full on his blog and commented upon it. I blogged about it at the time, as did Fairfax journo Mark Jones. Without quoting the whole thing, here's the nub of Phil's argument:

The ITJourno content is firewalled for a reason. It’s existence is based on the commercial model that led to it’s creation, which necessitates password-protected, subscriber-only consumption. We fail to see the "virtuous" side of Frank Arrigo, or anyone else, copying and pasting into their blog a large slab of content, which they do not even have fair and proper access to in the first place.

The obvious rejoinder to that is that a journalist bemoaning leaks is a tad silly - no offence, Phil. Good journalism, much of which is reliant on leaks, is the main focus of the Epitome columns. Each day, points are awarded by the Epitome writers (Phil, Simon Sharwood and Ian Yates) for the newsworthiness of published stories and prizes are given out each month and year for the best journalists and publications, with the big points usually reserved for stories which are based on leaks. You might not think giving someone else a copy of a private blog deserves to be called a leak in comparison to a whistleblower giving classified government documents or other such highly sensitive material to a journalist, but IMO there is enough of a similarity there to justify a legitimate criticism of Phil's position.

If we ignore Phil's journalism background and just focus on his argument as a businessman, then I think he has a valid point. If he chooses to go down the route of having walled up content, then other publishers (given that everyone posting on the Internet is a publisher) should respect that, or face the consequences. People may gainsay Phil's choice, as Kevin Leversee of Pandora Squared did in the comments of Mark Jones' blog:

Mark- It is like driving a nail with your forehead...ITJourno so needs to come into Web 2.0 enablement...
reminds me of Dan Gilmore- He had it rough in Silicon Valley- then he realised something,
His readers know more than he does. As you know success is embracing the shift and learning how to ride the wave.

I've often wondered why Phil doesn't open Epitome up to the outside world, as it could be like a localised prose version of tech.memeorandum.com with a few little tweaks (like easing up on journo in-jokes) and have as much success. Epitome serves its purpose though, which is to lock journalist eyeballs to ITJ every day so that it becomes a vehicle for his other services - and there's nothing wrong with that.

So, while he's on that track, I'm right there with him. Then he takes it a bit further. He talks a bit about splogs which use other sites' content - that's fine, we're all against that. As Tony Kornheiser would say, the big finish:

Surely as blogs, forums, search engines, news aggregation services and so forth, increasingly become “gateways” to content, mainstream media outlets will have to tackle this issue, and not too long into the future. A probable first step will be for sites to very clearly spell out what it considers fair re-use of its content - how much can be quoted, and under what terms and conditions this is allowed. And then somebody, somewhere needs to launch a couple of legal cases and put the fear of God into bloggers, and forum operators.

Perhaps the media industry needs a watchdog, along the lines of the software industry's BSA, or the MIPI organisation operated by the music industry.

Part of the wonder and utility of the Internet is it's hyperlinked and self-referential nature. However, just like a piece of software or a piece of music, content is a product and its owners should have the right to grant access to its consumption, based on reasonable terms and conditions that they see fit.

I realise that's a rather large quote so I might incur Phil's wrath merely by reprinting it, but I thought anything smaller would lose the context in which the argument was couched.

Many of those who consider themselves part of the blogosphere would take great exception to what Phil is proposing. Doc Searls et al pooh-pooh the "silo" or "walled garden" approach to content to which Phil (and the site where Mark Jones' professional content is published, afr.com) adhere as being antithetical to free marketplaces. I happen to agree with Phil's last sentence, although judging what is reasonable is problematic.

Apart from any ideological arguments, using the BSA and MIPI as examples demonstrates the futility of standing like King Canute and roaring at the wave to stop coming, instead of learning how to ride it as Kevin says. Readers are going to reuse content in the digital age whether the publishers like it or not, unless they impose DRM - and no major DRM system for content has yet worked sufficiently to deliver on publishers' wishes while simultaneously not annoying the public.

I have a great respect for Phil and consider him a friend. He has built a business out of nothing, and I can only hope to have the success he has had. However he, like many journalists, is still in the process of internalising this Web 2.0 stuff. He eventually got podcasting, so hopefully he'll join us in the flogosphere sometime soon.

10,000 words about Dave Winer

The recent kerfuffle over Adam Curry editing the podcasting entry on Wikipedia leads to an obvious followup: what about the Dave Winer entry? It turns out that the Talk page for the Dave Winer entry has just hit 10,000 words in length. Much of it is taken up with arguments between Betsy Devine and Ben Houston, with Betsy taking up what might loosely be described as the pro-Dave position and Ben championing the anti-Dave coalition, with submit buttons deployed at twenty paces. The discussion has recently been joined by an anonymous editor from the IP 71.108.206.31, which resolves back to pool-71-108-206-31.lsanca.dsl-w.verizon.net, implying a location somewhere in Los Angeles - this person is not necessarily anti-Dave, but he accuses Betsy of being biased.

Betsy you shouldn't be anywhere near this entry, being a personal friend of someone is an automatic violation of POV. White washing the article to help a friend is so against the spirit behind Wikipedia it isn't even funny.

As evidence of Betsy's pro-Dave leanings: she frequently mentions meeting Dave on her blog, she has recorded a podcast with him, and she has been known to post a picture she Photoshopped of Winer's head pasted onto a statue of Socrates. Betsy hits back by labelling the anti-Dave material she edited out as being "unbalanced and unencyclopedic". You learn a new word every day. In any case, the latest revision of the Relationship to the Public section seems to be a kind of consensus, although undoubtedly it will be changed again soon (10 revisions in the last week alone!).

I'm not an historian so I can't run a professional eye over the entry, but it seems to me to be lacking in several respects. Dave's early work is glossed over - I would like to hear more about Dave's role in ThinkTank, Ready and MORE than just namechecks. I'm not sure we need to know he is (or was) neighbours with Joan Baez. There is only one sentence about OPML. One part of the entry does ring true: "Winer is known as one of the more polarizing figures in the blogging community." Does that mean that any attempt made through the Wikipedia process is doomed to fall victim to partisanship? Not if the contributors take a professional attitude, IMO. More incontrovertible facts, and full attribution for any opinions.

m0ntycast II: Adjutant Revenue

I promised Pete Cashmore I'd blog about this, but I figured I'd save time by m0ntycasting about it. Instead of calling my proposed Web 2.0 revenue type "productisation", I decided to rename it as adjutant revenue.

Links for subjects of discussion:

Dion Hinchcliffe's post Struggling to Monetize Web 2.0
Phil Wainewright's post How to fund on-demand applications
Pete Cashmore's post On Business Models for Web 2.0
A Patrick Cook cartoon
Definition #1 for adjutant revenue: When a company attempts to devise its own products that do not use the IP of other sites, but relate to the same things that the other sites are covering, with the assumption that its users are familiar with that content because they have visited those other sites (probably through its links or mashup code).
Definition #2 for adjutant revenue: Creating new, independently legally defensible IP which is nonetheless dependent on consumers’ broad knowledge of IP belonging to some other entity.
Definitions of fantasy football
Example of a FanFooty scores page

So what examples can you think of to illustrate the as-yet-fragile concept of adjutant revenue? Or am I full of crap? Or both?

Now Telstra is blogging

As reported by Whirlpool, ZDNet and the Australian, Telstra has announced that it has entered the blogosphere with nowwearetalking. Phil Burgess, one of the "three amigos" parachuted in as part of Telstra's all-new all-Seppo management team, explains how it came about. He starts off by saying the site came about from a discussion he had with a Telstra shareholder, which apparently he had recorded secretly since the transcript he relates to us runs to over 200 words of beautiful pro-Telstra rhetoric. Perhaps Phil should use that eidetic memory of his to become a journalist, if he can remember that much dialogue with perfect clarity. Or maybe a PR person made up this character...? Nah, that's too cynical. Surely you wouldn't have lied to us in your very first blog entry, Phil! Say it ain't so!

After that dubious beginning, other Telstra bloggers pooh-pooh Skype, remind us about the value of fixed-line phones, tell farmers they don't need broadband, argue for a 52% increase in directors' fees, remind us again of the value of fixed-line phones, explore the wonders of hiring minimum-wage teenagers, dampen Australian wi-fi expectations, and portray Telstra's CEO Sol Trujillo as being boiled alive by regulators and competitors.

Let me see now, what sort of bloggers would benefit Telstra's customers? Maybe a blogger who is in charge of customer service. Maybe one from the faults department. Maybe one from the billing side. Nope, it's all about puffing up Telstra's internal corporate objectives of kicking the ACCC and bullying the government into lifting the brakes on its monopoly.

If any further evidence was needed about Telstra not getting blogging, it is the guideline that comments have to wait ONE TO TWO DAYS for moderation. Also, there are no trackbacks and no links to the commenters' own blogs. Contrast this approach to Microsoft or Google or even IBM. Who does Telstra think it's kidding? Oh, that's right - its shareholders.

Update: Mark Jones points out the lack of RSS feeds.

It's all about the Mawsons

Whenever my mother asks about how Tinfinger's going, she never fails to slip in a quick question along the lines of: "... so have you made any money yet?" Cripes, I swear she is obsessed. (Hi Mum!)

It's a fair question nonetheless, and the issue of revenue sources for companies like mine has been raised many times recently as some start to get over the gee-whiz phase and start to wonder whether any of them are sustainable. Dave Winer calls Google's advertising-based strategy "a temporary transitional thing", and Dion Hinchcliffe ponders the overdue arrival of innovative monetisation techniques.

I am skeptical of the skepticism of those who say advertising is a short-term bubble. Quite apart from the stark reality of Google's numbers, you only have to look at the research suggesting that the market still has a lot of room to grow, especially considering the percentage of leisure time people spend online is far ahead of the proportion of spending by advertisers on the Internet.

Most of the critics of online advertising's potential are (a) programmers, and (b) well-off old men. Well-off old men don't like being advertised at because they've bought everything they think is worth buying already. That doesn't mean other types of people don't accept advertising as part of the media landscape - particularly online where the ads are far better targeted. Some programmers seem to think no content at all should be monetised. In this make-believe world of mass amateurism, content should be completely free and all we pay for is software (although a subset, most of whom are also well-off old men, think we shouldn't pay for that either). This is a minority view, held only by people who have already made their fortunes... in software. In the real world, people can and do expect to get paid if they create content that a large number of people want to experience.

Having said all that, I'm not sure advertising is going to help Tinfinger all that much. In particular, I don't think keyword-based advertising will work very well for pages based on names. Maybe if Tinfinger (or sites like it) take off then names of famous and semi-famous people will become hot property on Google AdWords and its equivalents, but at the moment I'm not seeing it.

That means we will have to think up new new things to sell. The thinking I've had along these lines is based on the experience of The Footy Show. For those of you who don't know what that is, it's a television program about Australian Rules football, but for many years at the start of its existence, it didn't have the rights to show any footage of games in the AFL, the main league. The producers of the show turned this negative into a positive by making the show about everything that happened outside the ground, especially about the personalities in football. Although they didn't have rights to use any of the intellectual property of the AFL, they effectively created their own IP around the game, and ended up being the dominant show about the AFL outside the actual game broadcasts (and often more popular than those, too).

That model is one that Web 2.0 companies would do well to emulate. Most of them don't own their own data stores (like mashups), or at least are largely based on others' data (like aggregators). All most of them own is their userlists. My theory is that like The Footy Show, Tinfinger can use a bit of lateral thinking to devise its own products that do not use the IP of other sites, but relate to the same things that the other sites are covering, with the assumption that its users are familiar with that content because they have visited those other sites (probably through its links or mashup code).

What that will mean in practice will be different for each site. Tinfinger's focus is on people, so there are a bunch of opportunities waiting for us on that front. Or so my mother is hoping.

Decisions, decisions

We're in the pre-beta crunch for Tinfinger, and I'm wrestling with several questions that will probably not be answered until after the beta is over. Namely:

- size of the reading lists: try to index as much as possible (Topix.net) or aim for quality over quantity (Memeorandum)?
- tinscore (the working name for our ranking algorithm): not just how to weight the various variables, but what operands to use... can it be done just by arithmetic or is calculus necessary?
- size of the result set: try to find as many people as possible in each story, or only record the most important X number of people in the story, or only those who reach Y tinscore?
- structure of human metadata: use keywords, i.e. normal English language structure, or use tags composed of spaceless keyphrases?
- disambiguation, i.e. where two people in the same category have the same first and last name... God, that's a pain in the arse. Not to mention middle initials, titles, junior/senior, Roman numerals...

Aussie Web BBQ 2.0: Dec 16

We have a date for the sequel to the Melbourne Long Tail Camp. According to MC Ben Barren, Friday December 16 is when local and interlocal Web entrepreneurs, coders, journos, hangers-on and relatives of Michael Leone will come together to discuss goings on in the Web 2.0 world, and to participate in the feeding-time-at-the-croc-farm free-for-all that is the demo/presentation process (as Ben eloquently puts it, "we turn your Web 2.0 demo into Jam").

As Ben adds, we'll have some beta Tinfinger code to show off, and I have heard a whisper or two that Ben himself may finally have something to demo. Whether that's just his rendition of 2.0 Ain't Noise Pollution I don't know. He has remained vague in the extreme about what it is his venture actually is beyond that it's something feed-related, so I'm dead curious. All will, hopefully, be revealed on the 16th. (I'm guessing same Bat time, same Bat channel as the last one - meaning 7 for 7.30, at 33 Regent St in Elsternwick - but since Ben hasn't officially confirmed that I suppose you have to watch this space.)

As Dipper says: BE THERE!

Tinfinger beta open for applications

Yes it's a tired old cliche I know, but the beta will nevertheless be worthwhile for us as we iron out the bugs in our rapidly accreting code base. We hope to have it up and running before the next Red Cordial Camp or Long Tail Camp (or whatever it's called by then), which will be happening Real Soon Now (hurry up with announcing it, Ben!).

Interested parties can register their interest by entering their name and email address below.

Recent posts

My Projects: FanFooty • FanFooty blog • Coaches Box • Hapsberg • Fair To Say • Tinfinger • Table Vs Jetski • W2FSOC • PNOOMA • Stuck Pigs • About Paul Montgomery

Blogroll: ben barren • Cam Reilly • Penguinx • Jibble • UAC • PlanetCrap • morn • Max • Caryn • Marsh Davies • Foodbunny • Karel Donk • Leslie • jjohnsen • Phil Sim • Mark Jones • Luke W • Laura T

Saturday, December 31, 2005

Ben Barren on G'Day World

Friday, December 30, 2005

Old Man Riverin' along

The Root of all evasion

Tuesday, December 27, 2005

We report, you decide to attack

Robots ruin Christmas

Monday, December 26, 2005

Tequp 1 roundup

Friday, December 23, 2005

Web 2.1404

Thursday, December 22, 2005

Riya's ROI: lawyers suing Flickr?

Wednesday, December 21, 2005

Top 10 locks for Web 2.0 in 2006

The Flickr of Tiger Woods

Tuesday, December 20, 2005

Tinfinger screenshot

Sunday, December 18, 2005

Another Melbourne unconference: Tequp

Friday, December 16, 2005

Ben Barren is the anti-Grinch

Thursday, December 15, 2005

Standardised icon for feeds, yay!

Tuesday, December 13, 2005

Tinscore and other ways to clone Memeorandum

Monday, December 12, 2005

ITJourno and the flogosphere

Sunday, December 11, 2005

10,000 words about Dave Winer

Saturday, December 10, 2005

m0ntycast II: Adjutant Revenue

Thursday, December 08, 2005

Now Telstra is blogging

Wednesday, December 07, 2005

It's all about the Mawsons

Monday, December 05, 2005

Decisions, decisions

Sunday, December 04, 2005

Aussie Web BBQ 2.0: Dec 16

Thursday, December 01, 2005

Tinfinger beta open for applications
