Find Similar Users on del.icio.us

Download: delicious_mates.py

On the social bookmarking site del.icio.us, you can add other users to your network to see their recent bookmarks aggregated on one page. There are two kinds of people in my network: 1. Friends and 2. Users I don’t know personally, but who regularly post interesting links. People who bookmark the same things that I bookmark are likely to have similar interests and are thus likely to continue bookmarking interesting things in the future. Since the information on who bookmarked what URL is public, the process of finding people with similar interests can be automated.

Urban Hafner brought up this idea three years ago, but I could not find an implementation. Yesterday, I wrote a short Python (2.5) script that implements the following ideas:

  1. Look at every link in your bookmarks: Who bookmarked the same page? Add these users to a list of people possibly similar to you.
  2. The more bookmarks another user has in common with you, the higher your similarity.
  3. The smaller the number of people who bookmarked a page, the more significant the fact that another user has this bookmark in common with you.
  4. If a user has lots of bookmarks, common bookmarks are less remarkable. The percentage of common links counts.
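
The four ideas above can be sketched in a few lines. This is an illustration only, not the code in delicious_mates.py: the function name rank_mates and both input dictionaries are hypothetical, the 1/log(x+1) term follows the description given in the comments below, and multiplying by the share of common links is merely one assumed way of combining the ideas. The sketch is written for a current Python 3 rather than the Python 2.5 the script targets.

```python
import math
from collections import defaultdict


def rank_mates(bookmark_users, total_bookmarks, me):
    """Rank other users by similarity to `me`.

    bookmark_users:  {url: set of usernames who bookmarked url}, one entry
                     for each of my bookmarks (hypothetical input format).
    total_bookmarks: {username: total number of bookmarks of that user}.
    Returns a list of (weight, username, common) tuples, best match first.
    """
    weight = defaultdict(float)
    common = defaultdict(int)

    for url, users in bookmark_users.items():
        x = len(users)  # how many people bookmarked this page
        for user in users:
            if user == me:
                continue
            common[user] += 1                      # ideas 1 and 2
            weight[user] += 1.0 / math.log(x + 1)  # idea 3: rare pages count more

    ranking = []
    for user, w in weight.items():
        # Idea 4 (assumed combination): scale by the share of the other
        # user's bookmarks that we have in common.
        share = common[user] / total_bookmarks[user]
        ranking.append((w * share, user, common[user]))
    return sorted(ranking, reverse=True)
```

With such a scheme, a user with few bookmarks who shares several obscure links with you ends up ahead of a prolific user who shares the same number of popular links.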

Here is what you need to do to find people whose interests are similar to yours:

  1. Download delicious_mates.py
  2. Run python ./delicious_mates.py
  3. Wait — this takes some time.

The output will look something like this:

andreas> python ./delicious_mates.py
Your del.icio.us username? andreas.s
Your del.icio.us password?

Fetching list of bookmarks ... (485)

Fetching list of users for each bookmark ...
    1. http://en.wikipedia.org/wiki/Langton's_ant (7)
    2. http://www.nytimes.com/2008/05/13/science/13coat.html?_r=2&partner=rssnyt&emc=rss&oref=slogin&oref=login (0)
    3. http://sifter.org/~simon/journal/20080509.2.html (2)
    4. http://www.intercult.su.se/cultaptation/tournament.php (16)
    5. http://www.pnas.org/cgi/content/abstract/0801268105v1 (49)
    6. http://atlas-conferences.com/c/a/n/i/15.htm (0)
	[..]
    480. http://lifeboat.com/ex/main (129)
    481. http://prize.hutter1.net/ (137)
    482. http://www.psg.com/~dlamkins/sl/cover.html (194)
    483. http://www.idsia.ch/~juergen/ (157)
    484. http://sl4.org/wiki/ShannonInformation (1)
    485. http://www.scottaaronson.com/writings/ (13)

Finding 50 candidates from list of 49937 users ...
    rainer (42/9367) ok
    siggiB (36/1006) ok
    fhtagn (35/2471) ok
    anissimov (35/6281) ok
    jbone (41/19982) ok
    invisibleandpink (16/203) ok
    irchans (16/610) ok
    ferrouswheel (14/946) ok
	[..]
    eggywat (14/4290) ok
    Cunya (8/3133) ok
    getpost (13/9787) ok
    hannu (12/3915) ok
    lispmeister (11/2967) ok
    dean.vanniekerk (12/5873) ok
    rgrant (9/1769) ok

Top 50 del.icio.us mates:
username             weight               # common bookmarks   # total bookmarks    % common
--------------------------------------------------------------------------------------------
siggiB               54.92937             36                   1022                 3.52250
invisibleandpink     53.05635             16                   203                  7.88177
fhtagn               20.61878             36                   2474                 1.45513
irchans              15.61606             16                   611                  2.61866
fogeli               12.85109             16                   628                  2.54777
ferrouswheel         7.99177              14                   946                  1.47992
rainer               7.60971              43                   9423                 0.45633
anissimov            6.46154              34                   6309                 0.53891
pdorrell             4.10251              19                   3041                 0.62479
jefallbright         3.87392              12                   1193                 1.00587
ladro                3.76458              12                   1526                 0.78637
miguel1626           3.04347              11                   1039                 1.05871
jbone                2.62672              41                   20039                0.20460
hartmut              2.15612              9                    1316                 0.68389
asciilifeform        2.04304              10                   1581                 0.63251
tmalin               1.89627              12                   2181                 0.55021
herrmann             1.84991              21                   5410                 0.38817
jas0nm               1.82646              20                   3808                 0.52521
[..]

If you look at the script, you will find a few settings you might want to change. For each of them, the higher the value, the longer the script takes to finish.

  • MAX_MATES

    is the maximum number of similar users the script suggests.

  • MAX_BOOKMARKS

    defines how many of your bookmarks the script will look at.

  • BOOKMARK_FILTER

    defines which types of bookmarks are analyzed. Remove , "no" from {"shared" : [None, "yes", "no"]} to exclude private bookmarks.

  • MATE_MIN_BOOKMARKS

    sets the minimum number of bookmarks a del.icio.us user must have before being considered similar to you.

  • MATE_MIN_COMMON

    sets the minimum number of bookmarks a user must have in common with you to be included in the list of similar users.
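
How a filter such as BOOKMARK_FILTER might be applied can be sketched as follows. This is a guess at the semantics, not the script's actual code: the function matches_filter and the bookmark dictionaries are hypothetical, and it is assumed that each rule is either a list of allowed values or a function (the comments below show both forms), with None standing for a missing attribute. Written for a current Python 3.

```python
def matches_filter(bookmark, bookmark_filter):
    """Return True if a bookmark passes every rule in bookmark_filter.

    bookmark:        dict of attributes, e.g. {"href": ..., "shared": "no"}.
    bookmark_filter: maps an attribute name either to a list of allowed
                     values or to a function that returns True or False.
    A missing attribute is passed to the rule as None.
    """
    for attribute, rule in bookmark_filter.items():
        value = bookmark.get(attribute)
        if callable(rule):
            if not rule(value):
                return False
        elif value not in rule:
            return False
    return True


# Public bookmarks only: "no" has been removed from the allowed values.
PUBLIC_ONLY = {"shared": [None, "yes"]}

# Made-up example data; the second bookmark is marked as private.
bookmarks = [
    {"href": "http://example.org/public"},
    {"href": "http://example.org/private", "shared": "no"},
]
public = [b for b in bookmarks if matches_filter(b, PUBLIC_ONLY)]
```

Under these assumptions, only the first bookmark remains in public.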

The script needs two Python modules, the parser BeautifulSoup and Michael Noll’s del.icio.us Python API. If the script does not find one of the modules, it will download the missing module to the current directory and import it from there. If you don’t like this because you believe this is a security nightmare (which it is), don’t run delicious_mates.py without installing the two modules beforehand.
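
A more cautious alternative to downloading code at run time is to check for the modules and stop with a clear message. The following is a minimal sketch of that idea and not what delicious_mates.py does; the helper names are hypothetical, the module names are taken from the tracebacks in the comments below, and the code assumes a current Python 3.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def require(names):
    """Exit with a readable message if any of the modules is not installed."""
    missing = missing_modules(names)
    if missing:
        raise SystemExit("Please install first: " + ", ".join(missing))
```

A script could call require(["BeautifulSoup", "deliciousapi"]) before doing anything else.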

Do you know some JavaScript and have spare time? I would love to see the script converted into a direc.tor-like bookmarklet. The need to download and run a Python script makes finding similar users more complicated than it should be.

The feature I like best about the online bookshelf LibraryThing is its Unsuggester: Name a book you have read and it suggests those books that are least likely to be on your bookshelf. I like it because it is a means to counteract the temptation to adjust your sources of information such that whatever you read reinforces your point of view. Seeing how easy it is to give in to this temptation, is a script that makes it easier to surround yourself with like-minded people just one more sign of a general trend towards biased, largely isolated online communities?

72 Comments

  1. nice job.
    is there a way to avoid private bookmarks?

    Michael Noll writes:
    “Only public del.icio.us data will be mined. This means that this API does not (yet) provide means to access your private bookmarking data.”

    but yours do Òó

  2. Thanks.

    My script retrieves your bookmarks from https://api.del.icio.us/v1/posts/all. According to the del.icio.us API docs, there is no parameter you can pass to the URL to exclude private bookmarks. However, private bookmarks in the XML results do have a shared="no" attribute, therefore it’s just a matter of extending the regular expression that extracts the links (or, which is probably a better idea, use BeautifulSoup to extract the links). I’ll send you a note once I have added such an option.

  3. patiently waiting :)

  4. New option: BOOKMARK_FILTER defines which types of bookmarks are analyzed. Remove , "no" from {"shared" : [None, "yes", "no"]} to exclude private bookmarks.

  5. I’m in the process of writing something that uses urllib2 and BeautifulSoup and I’m a heavy user of del.icio.us, so thank you very much.

    I have 1000 bookmarks in del.icio.us (http://del.icio.us/dev_eddie) and delicious_mates.py runs slow as hell. It really needs threads/processes

  6. thank you very much indeed!
    now its perfect :)

  7. dev_eddie: The limiting factor is the del.icio.us API guidelines, which require a pause of at least one second between requests. Somehow the del.icio.us people managed to make me care.
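
Enforcing such a pause between requests can be sketched as below. The class RequestPacer is a hypothetical helper, not part of delicious_mates.py or deliciousapi; the clock and sleep parameters exist only so the behaviour can be checked without actually waiting. Written for a current Python 3.

```python
import time


class RequestPacer:
    """Ensure at least `min_interval` seconds pass between two requests."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Block until the next request is allowed, then record its time."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling pacer.wait() immediately before every request keeps consecutive requests at least one second apart, which also explains why threads or processes would not speed things up without violating the guideline.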

  8. Your script uses up tons of memory. For me, it had hit nearly 500 Mb by the time it had downloaded 150 bookmarks. With a total of over 600, this is obviously not going to work (especially as I only have 500 Mb physical memory).

    I can’t see what is wrong, perhaps deliciousapi is caching data or something. But I cannot use it in its present form.

    Also, automatically downloading and running code from a script like this (ximport) is bad practice, IMO. I checked the script for malicious code, and that was the first thing to go.

  9. Luke: Thanks for your comment. If deliciousapi’s caching is responsible for the high memory usage, I can easily imagine a version of delicious_mates that does not rely on the API. Using the current version, setting MAX_BOOKMARKS to a lower value will usually give useful results, too.

    “Also, automatically downloading and running code from a script like this (ximport) is bad practice, IMO.”

    Yes, that’s why I pointed it out in the posting above. If both modules required for delicious_mates.py are installed, no additional code will be downloaded. If you care about your machine, install both modules by hand. I can see, however, that the argument that one shouldn’t even offer the option to automatically download and run code makes some sense, too, especially if it’s the default option.

  10. [...] 16:35: How to find Del.Icio.Us users who share the same TAGS (links) as you. [...]

  11. Good idea.

    Which software have you used for generating the graph ? It doesn’t look like GraphViz as it’s better ;-)

  12. adulau: MindNode, but “generating” is probably not the best term to describe my manual work ;). I’d love to hear about a way to generate similarly good looking graphs automatically.

  13. Alexander on 21 May 2008, 0:35

    Uses a lot of memory and takes a long time, but it’s worth it.
    Thanks for making this! :)

  14. Thanks for using it. If anybody wants to create an improved version that uses less memory, is faster, is web-based or has other advantages, feel free to use my code any way you like. Drop me a note and I will link to your page.

  15. 595. http://pkg-exppsy.alioth.debian.org/pymvpa/ (10)
    596. http://blog.doloreslabs.com/?p=11 (305)

    Traceback (most recent call last):
      File "delicious_mates.py", line 160, in <module>
        main()
      File "delicious_mates.py", line 123, in main
        usernames = get_users_for_bookmark(bookmark)
      File "delicious_mates.py", line 110, in get_users_for_bookmark
        url_metadata = d.get_url(url)
      File "/Users/pskomoroch/delicious_mates/deliciousapi.py", line 267, in get_url
        data = self._query(path)
      File "/Users/pskomoroch/delicious_mates/deliciousapi.py", line 237, in _query
        raise DeliciousThrottleError, "del.icio.us error %s - unable to process request (your IP address has been throttled/blocked)" % response.status
    deliciousapi.DeliciousThrottleError: del.icio.us error 999 - unable to process request (your IP address has been throttled/blocked)

  16. Pete: It looks like del.icio.us did not like that the script requested ~600 pages and (auto-)blocked your IP address for a while. Seeing that you already had so much data, it’s annoying that delicious_mates.py does not handle DeliciousThrottleError exceptions by either waiting for a while and then trying again or by finishing up the analysis, using only the data it already has. I’ll see what I can do (but for now, I’m off to bed).
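
Both strategies mentioned here, waiting and retrying or finishing with the data already collected, can be combined. The following is a sketch under stated assumptions, not a patch for delicious_mates.py: ThrottleError stands in for deliciousapi.DeliciousThrottleError, fetch_all and its parameters are hypothetical, and the waiting times are arbitrary. Written for a current Python 3.

```python
import time


class ThrottleError(Exception):
    """Stand-in for deliciousapi.DeliciousThrottleError."""


def fetch_all(urls, fetch, retries=3, wait=60.0, sleep=time.sleep):
    """Call fetch(url) for every URL and return (results, completed).

    On ThrottleError, wait and retry with a doubled delay. Once a single
    URL has failed `retries` + 1 times, stop and return what was collected
    so far, so the analysis can still run on partial data.
    """
    results = {}
    for url in urls:
        delay = wait
        for attempt in range(retries + 1):
            try:
                results[url] = fetch(url)
                break
            except ThrottleError:
                if attempt == retries:
                    return results, False  # give up, keep partial results
                sleep(delay)
                delay *= 2
    return results, True
```

The second return value tells the caller whether all bookmarks were processed or whether the ranking is based on a subset.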

  17. mullingitover on 21 May 2008, 1:48

    lol this is going to blow up when it tries to index my 4,558 bookmarks

  18. [...] Find Similar Users on del.icio.us (tags: delicious social algorithms python) [...]

  19. dev_eddie on 21 May 2008, 3:24

    by the time it has reached 345/1003 processed bookmarks, it is taking 1100MB of physical ram and I have 260 MB swapped to disk. I don’t really mind it taking forever, but i would like it working without stalling before processing 1/3 of my bookmarks and I believe that lowering the MAX_BOOKMARKS to something like 150 or 200 statistically misrepresents my “similar users”, hence making the analysis flawed.

    Can it be fixed?

  20. John Blumberg on 21 May 2008, 5:27

    Blue Bell PA USA
    Hi Andreas,
    I am JGBMLG on del.icio.us.
    Just a quick note to say thank you for this script. I installed Python just so I could run it.
    Like most del.icio.us users, I enjoy exploring the bookmarks of other del.icio.us users who have bookmarks in common with me.
    A small suggestion: If you happen to return to it, it would be interesting to accord special relevance or a different kind of weighting to bookmarks in common where there are relatively fewer total users bookmarking the site at all. When two users bookmark a relatively obscure site, they are apt to have more mental similarity to one another than two users who happen to bookmark a popular site.
    Regards,
    John

  21. think will try it.

  22. it works…

  23. Hi,

    I get an error :(
    The last error message you see, is MemoryError but I don’t think that my RAM is full because I have 4GB and only an eclipse running.

    285. http://gui.picresize.com/picresize2/ (320)
    286. http://www.onemanga.com/ (1191)
    Traceback (most recent call last):
      File "delicious_mates.py", line 160, in <module>
        main()
      File "delicious_mates.py", line 123, in main
        usernames = get_users_for_bookmark(bookmark)
      File "delicious_mates.py", line 110, in get_users_for_bookmark
        url_metadata = d.get_url(url)
      File "C:\Dokumente und Einstellungen\Hoschi\Eigene Dateien\delicios mates\deliciousapi.py", line 267, in get_url
        data = self._query(path)
      File "C:\Dokumente und Einstellungen\Hoschi\Eigene Dateien\delicios mates\deliciousapi.py", line 209, in _query
        data = response.read()
      File "C:\Python25\lib\httplib.py", line 509, in read
        return self._read_chunked(amt)
      File "C:\Python25\lib\httplib.py", line 552, in _read_chunked
        value = self._safe_read(chunk_left)
      File "C:\Python25\lib\httplib.py", line 602, in _safe_read
        chunk = self.fp.read(min(amt, MAXAMOUNT))
      File "C:\Python25\lib\socket.py", line 309, in read
        data = self._sock.recv(recv_size)
    MemoryError

  24. [...] I found a script written in Python (2.5) that compares the bookmarks you have saved with those of all other Del.icio.us users. With it, the last 1000 bookmarks of my account are traced, [...]

  25. Stefan: The problem seems to be known, but your httplib does already contain the line chunk = self.fp.read(min(amt, MAXAMOUNT)) that is suggested as a solution. The right thing to do is probably to report the problem to the python/httplib developers.

  26. dev_eddie: I don’t see what causes the memory issue and I don’t want to spend too much time looking for the cause, but if anyone suggests a solution, I will try to incorporate it into my script.

    By the way: The following three lines of Python code …

    >>> x = {}
    >>> for n in xrange(1500000):
    ...     x[str(n)] = (float(n), float(n + 1))
    

    … take 220 MB of RAM on my machine. It seems that large dictionaries may take surprisingly large amounts of memory (and the dictionary that stores all potential del.icio.us mates is large).

  27. John Blumberg: Correct me if I have misunderstood your idea, but it seems to me that the script already implements it! Point three of the feature list posted above says: The smaller the number of people who bookmarked a page, the more significant the fact that another user has this bookmark in common with you. More concretely, if you and Stefan have a bookmark in common and x is the total number of people who bookmarked that page, then 1/log(x+1) is added to Stefan’s “mateability” weight.

  28. [...] Find Similar Users on del.icio.us (tags: python ai web2.0 programming) [...]

  29. [...] Find Similar Users on del.icio.us (tags: *delicious) [...]

  30. [...] Find Similar Users on del.icio.us [...]

  31. [...] Find Similar Users on del.icio.us - [...]

  32. Interesting utility. Thanks for making this available!

    One thing: I frequently don’t care if the candidate friends cover all my URLs. Simplifying slightly, my delicious account has links for me in my various roles as professional (mostly MS with Oracle), hobbyist (python, scheme and linux with MySql) or human (various political, philosophical and arts tags).

    It could be interesting to specify a subset of URLs (perhaps by a collection of tags) and discover similar users based on the subset.

    Ideally I’d be able to set up different subscriptions for the different roles. Lacking that, it might be nice to watch different communities through an automated RSS feed of my own?

    Also, I wonder if using sqlite to hold and manipulate some of the data could make it more scaleable (in space if not in time).

  33. Russell: Using BOOKMARK_FILTER you can specify which tags you would like to include (among other things — the option is powerful). Replace the line BOOKMARK_FILTER = {"shared" : [None, "yes", "no"]} with the following code:

    MY_TAGS = ["python", "scheme", "linux", "mysql"]
    tag_filter = lambda tags: any(tag in MY_TAGS for tag in tags.split(" "))
    BOOKMARK_FILTER = {"shared" : [None, "yes", "no"], "tag" : tag_filter}
    

    You could use a service like feedrinse to set up a channel that joins together feeds for individual users.

  34. [...] Find Similar Users on del.icio.us [...]

  35. Andreas: Neat script, still waiting on my results. :)

    Russell: I’m actually mostly interested in those who share the multiplicity of facets that encompass me. “I am more than a sum of my parts” or something. :)

  36. [...] I found an interesting article on this topic. Look here - link [...]

  37. The joys of an evolving language.
    Death to software evolution

    ; python del*.py
    Your del.icio.us username? maht0x0r
    Your del.icio.us password?

    Fetching list of bookmarks ... (1000)

    Fetching list of users for each bookmark ...
    Traceback (most recent call last):
      File "delicious_mates.py", line 160, in ?
        main()
      File "delicious_mates.py", line 123, in main
        usernames = get_users_for_bookmark(bookmark)
      File "delicious_mates.py", line 110, in get_users_for_bookmark
        url_metadata = d.get_url(url)
      File "/usr/home/maht/deliciousapi.py", line 269, in get_url
        document.bookmarks = self._extract_bookmarks_from_url_history(data)
      File "/usr/home/maht/deliciousapi.py", line 297, in _extract_bookmarks_from_url_history
        timestamp = datetime.datetime.strptime(month_string, '%b ‘%y')
    AttributeError: type object 'datetime.datetime' has no attribute 'strptime'
    ;

  38. maht: I guess it would not take great effort to rewrite the deliciousapi Python module such that it does not require Python 2.5.

  39. Nice initiative! The problem is I have more than a thousand bookmarks in my profile. So after doing a test run with a thousand bookmarks I changed MAX_BOOKMARKS to 5000. First I tried running the script on my Macbook Pro (Python 2.5.1, 2 GB ram).

    1167. https://www.beatport.com/ (1568)
    Traceback (most recent call last):
      File "./delicious_mates.py", line 160, in <module>
        main()
      File "./delicious_mates.py", line 123, in main
        usernames = get_users_for_bookmark(bookmark)
      File "./delicious_mates.py", line 110, in get_users_for_bookmark
        url_metadata = d.get_url(url)
      File "/Users/stefan/Desktop/deliciousapi.py", line 267, in get_url
        data = self._query(path)
      File "/Users/stefan/Desktop/deliciousapi.py", line 237, in _query
        raise DeliciousThrottleError, "del.icio.us error %s - unable to process request (your IP address has been throttled/blocked)" % response.status
    deliciousapi.DeliciousThrottleError: del.icio.us error 999 - unable to process request (your IP address has been throttled/blocked)

    Okay, what does this want to tell me? No more than a thousand requests allowed per day? I rebooted my router to get a new IP address and tried again. Now the script actually gets to around 1400 bookmarks. Then something different happens. No error message, but my Macbook starts getting slower and slower until it is completely unusable. Memory usage peaks at about 1.7GB!

    Since this script blocks my Macbook when processing more than a thousand bookmarks I thought I’ll just put it on my idle Linux box and check a couple days later for the results (Python 2.5.2, 512 MB ram)

    755. http://apple.slashdot.org/ (488)
    Traceback (most recent call last):
      File "delicious_mates.py", line 179, in <module>
        main()
      File "delicious_mates.py", line 129, in main
        usernames = get_users_for_bookmark(bookmark)
      File "delicious_mates.py", line 110, in get_users_for_bookmark
        url_metadata = d.get_url(url)
      File "/home/stefan/apps/del.icio.us/deliciousapi.py", line 270, in get_url
        document.common_tags = self._extract_common_tags_from_url_history(data)
      File "/home/stefan/apps/del.icio.us/deliciousapi.py", line 321, in _extract_common_tags_from_url_history
        soup = BeautifulSoup(data)
      File "/home/stefan/apps/del.icio.us/BeautifulSoup.py", line 1447, in __init__
        BeautifulStoneSoup.__init__(self, *args, **kwargs)
      File "/home/stefan/apps/del.icio.us/BeautifulSoup.py", line 1070, in __init__
        self._feed()
      File "/home/stefan/apps/del.icio.us/BeautifulSoup.py", line 1111, in _feed
        SGMLParser.feed(self, markup)
      File "/usr/lib/python2.5/sgmllib.py", line 98, in feed
        self.rawdata = self.rawdata + data
    TypeError: cannot concatenate 'str' and 'NoneType' objects

    Okay, there is a type mismatch, so I added a type check in sgmllib.py. Next test run.

    756. http://radar.oreilly.com/ (3381)
    Traceback (most recent call last):
      File "delicious_mates.py", line 179, in <module>
        main()
      File "delicious_mates.py", line 129, in main
        usernames = get_users_for_bookmark(bookmark)
      File "delicious_mates.py", line 110, in get_users_for_bookmark
        url_metadata = d.get_url(url)
      File "/home/stefan/apps/del.icio.us/deliciousapi.py", line 270, in get_url
        document.common_tags = self._extract_common_tags_from_url_history(data)
      File "/home/stefan/apps/del.icio.us/deliciousapi.py", line 321, in _extract_common_tags_from_url_history
        soup = BeautifulSoup(data)
      File "/home/stefan/apps/del.icio.us/BeautifulSoup.py", line 1447, in __init__
        BeautifulStoneSoup.__init__(self, *args, **kwargs)
      File "/home/stefan/apps/del.icio.us/BeautifulSoup.py", line 1070, in __init__
        self._feed()
      File "/home/stefan/apps/del.icio.us/BeautifulSoup.py", line 1102, in _feed
        markup = fix.sub(m, markup)
    MemoryError

    Now I’m kinda clueless. What can I do to get this to work?

  40. great idea, ändy, as usual ;)
    roughly one year ago, I also thought about exploiting the connection between people (with their special interests) and their bookmarks. the idea was similar to yours, just the other way around: your tool uses bookmarks to find people with similar interests, I planned to found a social bookmarking service that would use information about people’s interests (given e.g. by a facebook-like profile) to find similar people’s bookmarks. unfortunately, as usual, I had neither the time nor the knowledge to implement this - and I still don’t. So I just wanted to share that with you, just in case you are interested.

  41. Stefan: To me, the MemoryError makes it look like BeautifulSoup can’t parse arbitrarily large XML strings, but I might be wrong.

    Michael: Sounds interesting! Given Facebook’s platform for developers and OpenSocial, this should be easier than it would have been a year ago.

  42. Well the same bookmark has been successfully parsed before, but that was on a machine with four times the ram so maybe the script simply needs more than 512 mb ram to function properly. What really irritates me is that in my case the script actually gets drastically slower the more bookmarks it processes and it also renders the machine it’s running on unusable.

    For a third try I have it now running on my old laptop (1 gb ram) which I’m using for nothing else at the moment, so it doesn’t matter if the script blocks the machine. I added some exception and error checking to prevent crashes. I’m curious to see how far it gets this time. Extrapolating the current processing speed it should take another three days to finish with my ~5000 bookmarks.

  43. [...] Find Similar Users on del.icio.us Must study this for a Django project (tags: del.icio.us python delicious social web2.0 socialnetwork socialnetworking programming) [...]

  44. [...] 32- Find Similar Users on del.icio.us [...]

  45. [...] 32- Find Similar Users on del.icio.us [...]

  46. [...] Find Similar Users on del.icio.us (tags: delicious social) [...]

  47. Much thanks, very useful!

    As this ran, I couldn’t help contemplating what doppelganger it might find. The lengthy run time taken just builds suspense.

    Oh yeah, 8.2GB memory! Wow. Only 479 entries & 236k users….

    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME COMMAND
    18864 pteluser 25 0 8454m 8.2g 3284 R 100 34.8 152:32.55 python

  48. [...] Find Similar Users on del.icio.us - I’ve really gotten into del.icio.us lately and I think it’s kind of cool to see who has similar interests as you.  I’m not one of those social network stalkers though. [...]

  49. Hi Andreas,

    in Ubuntu (and, I suspect, Debian) the feedparser module for python is not installed by default. Users need to ‘apt-get install python-feedparser’ for your script to work.

    If the problem everyone has is memory, due to a mega-dictionary holding matching names, I guess a simple solution is to use a file-based database like Berkeley-db or similar. Perhaps I will have some time to add that and I will return a patch to you…

    Viele Gruesse
    pbts

  50. Yes, I believe that would be a solution to the memory problem and I would be happy to apply your patch.

    By the way: Since this post was first published, I have made a few small changes to the script. The memory usage should be slightly better, since the dictionaries now contain only strings without parsing information from BeautifulSoup.
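
Moving the large dictionary of candidates out of memory, as suggested in the comments above, could look roughly like this. The sketch uses the sqlite3 module from the standard library rather than Berkeley DB; the table layout and the three function names are hypothetical, and nothing here is taken from delicious_mates.py. Written for a current Python 3.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the database; pass a file name to keep data on disk."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS mates ("
        "username TEXT PRIMARY KEY, weight REAL NOT NULL, common INTEGER NOT NULL)"
    )
    return db


def add_common_bookmark(db, username, weight):
    """Record one more common bookmark for `username`, adding `weight`."""
    cursor = db.execute(
        "UPDATE mates SET weight = weight + ?, common = common + 1 "
        "WHERE username = ?",
        (weight, username),
    )
    if cursor.rowcount == 0:  # first time we see this user
        db.execute(
            "INSERT INTO mates (username, weight, common) VALUES (?, ?, 1)",
            (username, weight),
        )


def top_mates(db, limit):
    """Return the `limit` users with the highest accumulated weight."""
    return db.execute(
        "SELECT username, weight, common FROM mates "
        "ORDER BY weight DESC LIMIT ?",
        (limit,),
    ).fetchall()
```

The default ":memory:" path is only convenient for trying the functions out; with a file name such as "mates.db" the data lives on disk, and calling db.commit() from time to time would also let an interrupted run keep what it has collected.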

  51. Great to see some del.icio.us hacking going on. This was a feature I was always hoping to see integrated into del.icio.us, but innovation seems to have died there after the yahoo acquisition. It seems digg has added a similar feature. Another cool del.icio.us tool is http://www.delifeeds.com which scans your del.icio.us bookmarks and turns them into a blogroll of stories tailored to your interests.

  52. [...] 32- Find Similar Users on del.icio.us [...]

  53. [...] 32- Find Similar Users on del.icio.us [...]

  54. [...] Find Similar Users on del.icio.us On the social bookmarking site del.icio.us, you can add other users to your network to see their recent bookmarks aggregated on one page. (tags: http://www.aiplayground.org 2008 mes6 dia12 at_home del.icio.us code) [...]

  55. [...] 32- Find Similar Users on del.icio.us [...]

  56. Hey I did not know about your implementation, so I wrote my own script http://leo.freeflux.net/blog/archive/2008/07/28/delicious-friends.html. But to me yours looks a bit more mature.

    I still think there might be some more ideas to calculate smart rankings. I tried to rank URLs higher that have fewer users, but the results weren’t too good. I think there’s also a high influence if somebody added a URL before or after you (follower vs. innovator).

  57. I am so surprised someone hasn’t turned this into a working website (or something) for the fainthearted among us. It seems like a simple and obvious idea. An easy way to find similar delicious users has got to be a much-desired feature. I still do it by clicking on a random website link and seeing whether a resulting user has similar tastes to me - far more often it is miss rather than hit. Painfully laborious and disappointing.

    Your answer seemed scarily impossible to me - someone who knows nothing about Python.

    I went ahead and installed Python on my Vista laptop anyway; configured it by following instructions somewhere so that the Command Prompt responded favourably to a request to run Python. Then I downloaded all the *.py files you suggested in this post - and finally realised I didn’t have a clue what to do with any of it. I flew blind. I put the *.py files into a lib folder in the Python directory, studiously copied your instructions to run your delicious_mates.py - without success (obviously), and then was reduced to double-clicking the *.py files from within the Python “lib” directory - to which Python responded by flashing open in the Command Prompt window and then disappearing.

    I think this must be the computing equivalent of stabbing a laptop with a screwdriver to get the disc drive to open.

    So, there it is.

  58. My apologies for the self-chastisement in my previous comment.

    I ran your script on my linux installation and (with a little bit of jiggery-pokery) it began to work. Thank you so much.

    The bad news is - like a couple of your commentors - I received a “throttle” notification after about 250 of my 500 bookmarks. Is there a solution to this?

  59. can anybody help on the error below - after thousands of tries to fix i finally gave up - 10x:

    Fetching list of bookmarks … (5)

    Fetching list of users for each bookmark …

    Finding 50 candidates from list of 223 users …
    Traceback (most recent call last):
      File "delicious_mates.py", line 160, in <module>
        main()
      File "delicious_mates.py", line 137, in main
        num_bookmarks = float(re.findall("items\s+\((\d+)\)", user_page)[0])
    IndexError: list index out of range

  60. Now, how weird is this.

    I found this post from a google search “delicious people with similar”. This page was top-ranked.

    The first thing I see clicking through is a network with Andreas.S in the centre. And I thought ‘Wow, that’s good, it’s been straight to del.icio.us and looked up my network, drawn a diagram, that was quick’

    For I only have three del.icio.us users in my network and you (andreas.s) are one of them. I added you ages ago when I noticed we had two or three links in common.

    Spooky eh?

  61. botogol: I wish the script worked the way you describe it :). Thanks for sharing this!

  62. denkisa, I’m having the same problem.

    Andreas, do you have any helpful suggestions?

  63. [...] 32- Find Similar Users on del.icio.us [...]

  64. I have developed a Java application to find similar users on delicious.com just like this one at:
    http://code.google.com/p/sbff/

    Social Bookmarks Friend Finder is a GUI application that allows you to find users with similar bookmarks to the user specified.

    It is similar to: http://www.aiplayground.org/artikel/delicious-mates/ but:

    * it is coded in Java (instead of Python)
    * does not require login as it does not use the public delicious API (it scrapes web pages since there is no unauthenticated API)
    * it uses a MySQL database to keep the large amount of data needed (delicious-mates keeps all the data in memory and that is too much for many users)
    * it can be shut down and restarted at any time to continue gathering data
    * it uses a Swing GUI

  65. This is cool! Do you have plans to provide a zip download and “how to run this” instructions for people who are not used to svn & java?

  66. I had the same/similar error to denkisa:

    Finding 10 candidates from list of 5264 users …
    Traceback (most recent call last):
      File "delicious_mates.py", line 160, in <module>
        main()
      File "delicious_mates.py", line 137, in main
        num_bookmarks = float(re.findall("items\s+\((\d+)\)", user_page)[0])
    IndexError: list index out of range

    Any ideas?

  67. I have made an easy to use prerelease of sbff at:
    http://code.google.com/p/sbff/

    Please test it and report feedback!

  68. I have just made a 1.0 release that includes the http://www.h2database.com/ embedded database. So, using sbff could not be easier!
    http://code.google.com/p/sbff/

  69. @Eduardo

    Andreas asked, “Do you have plans to provide a zip download and “how to run this” instructions for people who are not used to svn & java?”

    I’m one of those people who are not used to svn & java. I have no clue how to implement your java application - and I would love to be able to do so. Please could you include a step-by-step guide on how to run your Social Bookmarks Friend Finder GUI.

    Thank you.

  70. I have updated the home page:
    http://code.google.com/p/sbff/
    But it is really easy. You just have to download the jar somewhere and type the command line you see in that page. Let me know if you have any other problem.

  71. Works for me, thanks!

  72. can someone post users list here after search please?
