TailRank Personalization / Gregarius sortbyurl plugin

I don’t know if I’ll ever finish my sort-by-url plugin for Gregarius. Two issues remain: it is very CPU-intensive during an update, especially on its first run, and items with ads are artificially ranked higher.

Meanwhile, Alex linked to a cool alternative being considered by Kevin Burton at Tailrank. This is really awesome!

[WOW! WordPress demolished my list. I updated the post to fix the layout.]

FYI, my sortbyurl plugin for the Gregarius RSS Aggregator works like this:

Data Gathering:

* On install, process every item.

* When updating feeds, process items which have not been processed yet.

* When processing an item:

  * Create a list of the links referred to in the item content.

  * Add the item’s permalink to that list.

  * Use two db tables (urls, itemtourlmap) to connect each url in that list to the item itself.
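
The processing step above can be sketched roughly like this. It is a minimal illustration in Python with SQLite, not the plugin's actual code: the table names `urls` and `itemtourlmap` come from the list above, but the column names, the `LinkCollector` class, and the `create_schema` and `process_item` functions are assumptions made up for the example.

```python
import sqlite3
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags in an item's HTML content."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def create_schema(conn):
    # Column names are hypothetical; only the table names come from the post.
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS urls (
            id  INTEGER PRIMARY KEY,
            url TEXT UNIQUE NOT NULL
        );
        CREATE TABLE IF NOT EXISTS itemtourlmap (
            item_id INTEGER NOT NULL,
            url_id  INTEGER NOT NULL,
            PRIMARY KEY (item_id, url_id)
        );
    """)


def process_item(conn, item_id, permalink, content_html):
    """Record every link in the item, plus its permalink, against the item."""
    collector = LinkCollector()
    collector.feed(content_html)
    # dict.fromkeys keeps first-seen order while dropping duplicate urls.
    urls = list(dict.fromkeys(collector.links + [permalink]))
    for url in urls:
        conn.execute("INSERT OR IGNORE INTO urls (url) VALUES (?)", (url,))
        url_id = conn.execute(
            "SELECT id FROM urls WHERE url = ?", (url,)
        ).fetchone()[0]
        conn.execute(
            "INSERT OR IGNORE INTO itemtourlmap (item_id, url_id) VALUES (?, ?)",
            (item_id, url_id),
        )
    conn.commit()
    return urls
```

Because both inserts use `INSERT OR IGNORE`, running `process_item` twice on the same item leaves the tables unchanged, which fits the "process items which have not been processed yet" rule even if an item slips through twice.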

Views:

* At the top of every page, put 2 links: “Most popular urls” and “Most popular urls among unread posts”

* “Most popular urls” lists entries in the urls table in order of popularity (most popular first), along with the feed items associated with those urls.

* “Most popular urls among unread posts” does the same, but only considering popularity in the context of unread posts.

* Under each feed item throughout the app, there is a list of urls that have a popularity score greater than 1, each with a button to see more items that are associated with that url.
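
The two popularity views could be served by a single query along these lines. Again, this is only a sketch under assumptions: it treats a url's popularity score as the number of items mapped to it, and the `items` table with its `unread` flag, the column names, and the `popular_urls` function are hypothetical stand-ins rather than Gregarius's real schema.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical schema: only the urls and itemtourlmap table names
    # come from the post; the items table stands in for the feed items.
    conn.executescript("""
        CREATE TABLE items (id INTEGER PRIMARY KEY, unread INTEGER NOT NULL);
        CREATE TABLE urls (id INTEGER PRIMARY KEY, url TEXT UNIQUE NOT NULL);
        CREATE TABLE itemtourlmap (
            item_id INTEGER NOT NULL,
            url_id  INTEGER NOT NULL,
            PRIMARY KEY (item_id, url_id)
        );
    """)


def popular_urls(conn, unread_only=False, min_score=1):
    """Return (url, score) rows, most popular first.

    score is the number of items linked to the url; with unread_only=True
    only unread items are counted.
    """
    sql = """
        SELECT u.url, COUNT(*) AS score
        FROM urls u
        JOIN itemtourlmap m ON m.url_id = u.id
        JOIN items i ON i.id = m.item_id
        WHERE (? = 0 OR i.unread = 1)
        GROUP BY u.id, u.url
        HAVING COUNT(*) >= ?
        ORDER BY score DESC, u.url ASC
    """
    return conn.execute(sql, (1 if unread_only else 0, min_score)).fetchall()
```

Calling `popular_urls(conn)` gives the “Most popular urls” view, `popular_urls(conn, unread_only=True)` gives the unread variant, and `min_score=2` matches the "popularity score greater than 1" rule used under each feed item.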

Bugs:

* I need to automatically ignore as many ad urls and ‘category’ urls as possible.

* I also need to make the “ignore this url” button work.

* I need to add warnings around the initial data mining process, which can take a long time.

* I need to make the process much less CPU-intensive on the server.
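
For the first bug, one possible starting point is a simple filter applied before a url is stored. This is a sketch only: the host names and path segments below are placeholder assumptions, not a real ad list, and `should_ignore` is a hypothetical helper, not something the plugin has today.

```python
from urllib.parse import urlsplit

# Placeholder values for illustration; a real list would need curating.
IGNORED_HOSTS = {"ads.example.com"}
IGNORED_PATH_SEGMENTS = {"category", "tag"}


def should_ignore(url):
    """Return True for urls that look like ads or 'category' pages."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host in IGNORED_HOSTS:
        return True
    # Compare whole path segments so "/category/php/" matches
    # but "/categorytheory-intro" does not.
    segments = [s.lower() for s in parts.path.split("/") if s]
    return any(s in IGNORED_PATH_SEGMENTS for s in segments)
```

The same check could back the “ignore this url” button by adding the clicked url's host or the url itself to an ignore list.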
