Monday, January 08, 2007

Books with similar library subjects and classifications

I've added a new and often powerful recommendation engine. It has a long and awkward name: Books with similar library subjects and classifications.* So far, I've only got it on Suggester pages.

It feeds off three pieces of "traditional" library data:
  • Subjects (mostly Library of Congress Subject Headings),
  • Library of Congress Classifications (LCC), and
  • Dewey Decimal Classifications (DDC)
The recommendations are special in a few ways:
  • They can be very "targeted"
  • There is no "popularity" threshold; books with just one copy in the system often have recommendations**, and it will recommend obscure stuff too
  • It works better for non-fiction than for fiction
  • It fails in interesting ways
At its core, the system looks for shared library data. So if book B has subject S, all the other books with subject S get a "vote"; the winners are the books that share the most subjects with the suggesting book. The algorithm goes beyond this by leveraging the inherent hierarchy of the three systems, apportioning successively "smaller" votes to ascending levels of the hierarchy. Popularity is also taken into consideration, but as little more than a tie-breaker.
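To make the voting idea concrete, here is a rough sketch of how such a scheme could work. It is not the actual code behind the feature. The book names, the classification paths, the popularity counts and the halving weight (LEVEL_DECAY) are all invented for illustration, and real subject headings and call numbers are far messier than the tidy tuples used here.

```python
from collections import defaultdict

# Assumption: each step up the hierarchy is worth half the vote below it.
LEVEL_DECAY = 0.5


def ancestors(path):
    """Yield (prefix, weight) pairs for a path, most specific level first."""
    weight = 1.0
    for depth in range(len(path), 0, -1):
        yield path[:depth], weight
        weight *= LEVEL_DECAY


def recommend(source, books, popularity, limit=5):
    """Rank other books by weighted votes for shared library data."""
    # Index every book under each hierarchy level it belongs to.
    index = defaultdict(set)
    for title, paths in books.items():
        for path in paths:
            for prefix, _ in ancestors(path):
                index[prefix].add(title)

    # Collect the source book's levels, keeping the largest weight per level.
    source_levels = {}
    for path in books[source]:
        for prefix, weight in ancestors(path):
            source_levels[prefix] = max(weight, source_levels.get(prefix, 0.0))

    # Every book sharing a level with the source gets that level's vote.
    votes = defaultdict(float)
    for prefix, weight in source_levels.items():
        for other in index[prefix]:
            if other != source:
                votes[other] += weight

    # Most votes first; popularity only breaks ties.
    ranked = sorted(votes, key=lambda t: (-votes[t], -popularity.get(t, 0), t))
    return ranked[:limit]


# Made-up catalog: each path runs from the broadest level to the narrowest.
BOOKS = {
    "everest_memoir": [("subj:Mountaineering", "Everest"), ("lcc:G", "GV", "GV199")],
    "everest_medicine": [("subj:Mountaineering", "Everest"), ("lcc:G", "GV", "GV199")],
    "climbing_guide": [("subj:Mountaineering",), ("lcc:G", "GV")],
    "world_atlas": [("lcc:G",)],
    "travel_atlas": [("lcc:G",)],
}
POPULARITY = {"everest_medicine": 5, "climbing_guide": 40,
              "world_atlas": 10, "travel_atlas": 50}

print(recommend("everest_memoir", BOOKS, POPULARITY))
```

In this toy catalog the obscure but closely matching book comes first despite having the fewest copies, and the two atlases, tied on votes, are ordered by popularity alone.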

At its best, the system is spooky. Into Thin Air's other recommendations are spread over Everest, general mountaineering and adventure books. But the "Similar subjects and classifications" recommendations lead with Kenneth Kamler's Doctor on Everest : emergency medicine at the top of the world : a personal account including the 1996 disaster, a reasonably obscure (5 members) personal account of the same 1996 expedition. Other times the results are mixed or even odd. Kant's Critique of Pure Reason pulls up commentaries on itself, but also the acclaimed but seemingly unrelated seminal work on the anthropology of magic, E. E. Evans-Pritchard's Witchcraft, oracles, and magic among the Azande. Why? Because both receive the same Library of Congress Subject Headings. Strange bedfellows, perhaps.

*Got a better name? Let us know, seriously.
**Ironically, nearly twice as many works have recommendations (219,000 vs. 120,000 for "people who have X also have Y"), but because they are more evenly distributed by work popularity, fewer than half as many books have recommendations (2.6 million vs. 5.9 million).

12 Comments:

Anonymous Anonymous said...

Where is it?

1/08/2007 1:33 AM  
Blogger Tim said...

Ah. Sorry. I added a note. I put it on the Suggester pages, but not yet on work pages. With five different types of recommendations, things are getting confusing. We need to figure out how to deploy so everything is visible somewhere, but the main view is streamlined.

1/08/2007 1:49 AM  
Blogger gritmonkey said...

Suggestions for this book are not built yet.

http://www.librarything.com/suggester/84177

The other books I checked out had them. When will the outliers get recommendations? :)

1/08/2007 4:21 AM  
Anonymous Anonymous said...

"Books similarly classified" ?

Fastred

1/08/2007 5:18 AM  
Anonymous Anonymous said...

Kindred Works ?

1/08/2007 7:58 AM  
Anonymous Anonymous said...

I like the Kindred Works suggestion.
Along the same vein:
Subject Siblings, Surface Works, SameThing Works

1/08/2007 1:15 PM  
Anonymous Anonymous said...

Tim, now I know why I work in publicity and you work with computers. You lost me at popularity. Sounds like a cool feature. I'll have to check it out.

1/08/2007 10:17 PM  
Anonymous Anonymous said...

How about searching for books that are "Subjectively synonymous"?

Nice double meaning there, I think.

1/09/2007 1:33 AM  
Anonymous Anonymous said...

How about SubjectThing?

1/09/2007 9:35 AM  
Anonymous Anonymous said...

"SimilarThing"
Yes, I like that

1/10/2007 3:10 AM  
Anonymous Anonymous said...

*Got a better name? Let us know, seriously.

Books of a Feather? :)

KayDekker@LT

1/20/2007 11:57 AM  
Blogger Tim said...

I like that!

It's wrong here, where I'm trying to suggest a particular use of library data—to signal that it's a different sort of algorithm, one based on NON-user-generated data. But the "Books of a feather" name is a good one!

1/20/2007 12:12 PM  
