Tuesday, May 23, 2006

Topix stands up for the little guy in News 2.0

The San Jose Mercury News has apparently broken the embargo on a story about Topix doing a deal with the Associated Press, with Topix's marketing dude Chris Tolles adding the company's spin on its blog. While we'll have to wait until tomorrow (US time) for the official press release with all the details, the Merc portrays the deal as a way of rewarding smaller US newspapers for contributing stories to the AP's wire feed: Topix will link back to the source instead of, as every search engine does, spreading the links across every other site that buys the AP copy and shovels it into its newshole.

My immediate reaction was to wonder what this would do to Newsvine, which buys its own AP feed to fuel its social news features, but I don't think it affects them all that much. Maybe if Topix whitelabelled its forum features for a large number of those local sites, then it might become a problem. I suspect that Topix's long-term strategy might be to integrate more closely with those multitudinous local newspaper sites to undermine Newsvine at the grassroots, which I think is a valid strategy that would nevertheless take a long time to take root. But then again, what do I know about American newspapers? Nothing.

The deal appears to address an issue I've mentioned before: how news aggregators choose which link to use among the hundreds of similar sites that buy the same wire feed. Topix are doing the right thing by the newspapers, which is not surprising since they are 75% owned by newspaper chains Knight-Ridder, Tribune and Gannett (though the first of these is in the late stages of being sold to McClatchy). As I've said before, only ethics would prevent other aggregators from auctioning off preferred link status to wire stories.

As the Merc piece points out, Topix can only do so much in this field, as it is dominated by GEMAYA and old media giants. Would the algorithms at MSN Newsbot reward the newspaper journalist who wrote a story if they could instead link to an MSNBC page containing the feed copy of the same story? I was going to say "bollocks" to that, but now that I look at the Newsbot again I notice that they do show stories labelled as "AP via San Francisco Chronicle" and "AP via Daily Press" which link back to non-MSNBC sites. Hmm.

Anyway, I guess the point of Topix's deal is to reward the smaller newspapers, which might not benefit from linkage from the major news aggregators for local stories. I'd be interested to hear from newspaper people for whom this scratches an itch... was it that big of a problem in the first place? Which aggregators are the poorest at giving linkage props where they are due?


Mike D. said...

Yeah, I'm not really sure why this even qualifies as a "deal", but it's a good thing so hey, I'm all for it. The same thing could be accomplished if the AP simply added a field to their XML which listed the originating newspaper and associated URL. Heck, we'd be happy to use that field and link back to the papers *anyway*! Attribution is a good thing!

Perhaps that's really all this is. Just a modification of the AP feed and Topix has agreed to use the new data. If so, that's great. Kudos Topix. We're happy to use the new data as well, if it's part of the feed.

2:54 am, May 24, 2006  
Rich Skrenta said...

Well actually, even if the AP supplied the originating source in their feed to you, you *wouldn't* be able to match it up with downstream copies of the story -- ones that had been edited by the AP bureau, and then potentially further edited by the local member that puts up the story.

Not without clustering technology that can tell the difference between edited versions of the same story, and differently authored stories about the same event.
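
To make that distinction concrete, here is a minimal sketch of one common near-duplicate test: compare the sets of overlapping three-word "shingles" in two texts and call them versions of the same story when the overlap is high. This is an illustration only, not Topix's technology; the function names and the 0.5 threshold are assumptions picked for the example.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles in text (lowercased, punctuation dropped)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Very short texts become a single shingle (or nothing if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def same_wire_story(text_a, text_b, threshold=0.5):
    """Guess whether two texts are edited versions of one story.

    The threshold is an arbitrary assumption; a real system would tune it
    and would likely combine several signals.
    """
    return jaccard(shingles(text_a), shingles(text_b)) >= threshold
```

An edited copy of a wire story keeps most of its word sequences, so it scores high. A separately written story about the same event shares topic words but few exact three-word runs, so it scores low.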

Sure, anyone can license the AP's off-the-shelf US & World news product. But if you want to create a more comprehensive product and link to other sources, determining the propagation of a story becomes a real issue. Topix.net is licensing back to the AP the result of our algorithmic determination of the full AP feed's propagation through the news sphere.

Downstream licensees of the AP without the benefit of their own advanced clustering technology may be able to license our news map from the AP...you may want to inquire. :-)

4:35 am, May 24, 2006  
Mike D. said...

Ok, well that's potentially interesting I guess. Will have to delve further into the details.

We actually can match similar-but-different versions of the same AP stories already though. Knowing about them in the first place is a bit tricky of course, but once they are in the system (whether by full-text or by remote URL), we perform semantic analyses on the content.

The ironic thing about this is that you actually *need* to build this technology yourself to deal with the AP's own method of updating stories. For instance, the AP will often send an incomplete story down the wire, then update it, and then put out another story which is similar but different. You need to know when to update the content in place and when to create a new article entirely. It's kind of bizarre. Not based on IDs at all. This becomes especially important when dealing with in-progress sporting events as well. The "recap" will actually sometimes have a "date created" before the event is even finished because that's when the reports started to come in.
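
The update-or-create decision described above can be sketched roughly as follows, assuming plain-text articles and a word-level similarity score from Python's standard `difflib`. This is not the actual implementation; `file_incoming` and the 0.6 threshold are hypothetical names and values chosen for illustration.

```python
from difflib import SequenceMatcher


def word_similarity(a, b):
    """Similarity in [0, 1] between two texts, compared word by word."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


def file_incoming(articles, incoming, threshold=0.6):
    """Update the closest stored article in place, or append a new one.

    articles is a list of article texts and is modified in place.
    Returns ("update", index) or ("new", index).
    The threshold is an arbitrary assumption, not a tuned value.
    """
    best_index, best_score = None, 0.0
    for i, existing in enumerate(articles):
        score = word_similarity(existing, incoming)
        if score > best_score:
            best_index, best_score = i, score
    if best_index is not None and best_score >= threshold:
        # Close enough to an existing article: treat it as a revision.
        articles[best_index] = incoming
        return ("update", best_index)
    # Nothing similar enough is stored: treat it as a new article.
    articles.append(incoming)
    return ("new", len(articles) - 1)
```

Because nothing here relies on story IDs or timestamps, a revised game report replaces the earlier one while an unrelated story is filed separately. A linear scan like this would not scale to a full wire feed, so treat it purely as a statement of the decision rule.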

In any case, I'm happy to see any progress towards offering this sort of thing automatically, so well done!

8:56 am, May 24, 2006  
