Tuesday, January 24, 2006

New feature: Fun statistics

As some of you have noticed, I've added a "Fun statistics" link on every profile page. Here's mine. It presents statistics like:
  • Library obscurity (the average number of copies of the books held by others)
  • Total tags; tagged books; tags per book
  • Cataloging sources
  • A histogram of ratings
  • A histogram of publication dates
There's a lot more I'd like to do. Right now these numbers aren't contextualized. Is my "library obscurity," 12, high or low? (It's quite low.) A percentile, and maybe something on the Zeitgeist page, would be a good idea.
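A percentile of this kind could be computed roughly as follows. This is only a sketch: the function name `percentile_rank` and the sample scores are made up for illustration and are not site data.

```python
def percentile_rank(score, all_scores):
    """Percentage of libraries whose score is strictly below `score`."""
    below = sum(1 for s in all_scores if s < score)
    return 100 * below / len(all_scores)

# Hypothetical obscurity scores for five libraries (invented for this example).
sample_scores = [2, 12, 16, 23, 94]

# A score of 12 has exactly one of the five sample scores below it.
print(percentile_rank(12, sample_scores))
```

Since a lower score means a more obscure library, a low percentile here would read as "more obscure than most."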

I'm very open to other ideas. I can add read-date statistics, for those of you using those fields. Ditto BCID fields. I could do tag-obscurity, for what that's worth. I'd like to work with page counts; I have the data for some books and can get it for others. In theory I should be able to extract publishers from ISBNs (I can do it from the publishers field too, but that's too messy). But extracting publishers from ISBNs requires a big database, and I think it's one I'd need to pay for, probably from Bowker.

Some other ideas would require a more serious integration with library data: language, date of original publication, etc. Nicole on the LibraryThing discussion group suggested "some kind of histogram of the distribution of your books on the LoC classification system." That's a good idea too.

20 Comments:

Anonymous Anonymous said...

I'm fascinated by the "obscurity" factor (probably due to my having a lot of obscure books). I was wondering about that back when you had the "weighted" lists of other users' libraries, and would be interested to know how that number is determined. Right now I have a 23 ... you say your 12 is quite low ... what is the scale on that, and about where would the mean be?

1/25/2006 12:41 AM  
Anonymous Anonymous said...

While on the subject of "fun statistics" ... how about adding a ranking for things like "You have the ###th largest library" and "You are the ###th most prolific reviewer"?

As I've slowly logged my library in I will admit that I've kept an eye on the listings of those ... and while I've not cracked the top 100 libraries (yet), I am somewhere in the "90's" for reviews (with 37 so far).

Having those on the "fun statistics" page would be mighty handy!

1/25/2006 1:18 AM  
Anonymous Anonymous said...

Maybe I'm perverse, but I would like statistics like:
Largest (public) library you share no books with.
Libraries with which you share only one book. (The largest I know of has 1900+ books, while I have 3300+.)
Shared books weighted for obscurity.

1/25/2006 5:02 AM  
Anonymous Anonymous said...

And your date information reminds me I would like a field for date of original work / copyright date. Is there any support for this from other users? If not, it's more for the comment field.

1/25/2006 6:32 AM  
Blogger AbbotOfUnreason said...

I had the same thought as ringman. I'm less interested in the physical age of my library book than in the range of original dates.

1/25/2006 9:01 AM  
Anonymous Anonymous said...

I do the same as our friend languagehat in regard to book dates. When I have the physical book in front of me I add the earliest date and the latest date in the front matter, separated by a comma. I too was puzzled by the comment "Date is the edition's publication date, not date of original work." How does the software know which date belongs to the edition, or how people use the date field?

Personally I would like more date fields. I want to add at least the original date of writing/publication, the date this translation was first written/published, and the date my edition was first published or my copy was printed. You can see some of these are a little vague, but I'm not yet an expert on this stuff.

1/25/2006 11:09 AM  
Anonymous Anonymous said...

I also change the date to be the copyright date, for some categories of my books. Is this going to mess up the identification of editions later?

For obscurity, I think that median might be better than average. The average is pulled up unduly by a few very-popular books. Or perhaps a histogram of some sort.

1/25/2006 11:35 AM  
Anonymous Anonymous said...

I've got to agree with Rich on the calculation of the library obscurity. Median would make for a more realistic reflection of obscurity than the mean does. If you use the mean, own one Harry Potter and the whole result is skewed.
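To make the Harry Potter effect concrete, here is a minimal sketch using Python's standard `statistics` module. The copy counts are invented for illustration (four obscure titles plus one blockbuster) and are not LibraryThing data.

```python
from statistics import mean, median

# Hypothetical number of copies held across the site for each book
# in a five-book library. These figures are made up for the example.
copies_per_book = [3, 5, 8, 12, 1601]

library_mean = mean(copies_per_book)      # pulled far upward by the blockbuster
library_median = median(copies_per_book)  # middle value, ignores the outlier

print(library_mean, library_median)
```

With these numbers the mean comes out at 325.8 while the median is 8, so a single very popular book dominates the mean and leaves the median untouched.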

But I do love the fun statistics - thanks for adding them.

1/25/2006 1:33 PM  
Anonymous Anonymous said...

I LOVE the library obscurity stat - though that may be because I have a whopping 94. ;)

I found myself wondering, as I had to manually add some of my books, if there was a way to determine how many new editions/books I was adding to the LT catalog - so it was a pleasant surprise to notice the "distinct work" statistics.

Great job!

1/25/2006 2:24 PM  
Blogger chamekke said...

My library obscurity number is 16. Since the "fun stats" page now says this indicates the average number of copies held by others, I'm thinking that a low number = Obscure and a high number = umm ... Popular, I suppose.

I know that I won't top the obscurity charts overall, but I do think it's hard to be much more obscure than this volume. Maybe we should have an obscure-off?

1/25/2006 11:14 PM  
Anonymous Anonymous said...

On the calculation of obscurity: the current averaging method is not very good.
If you have 99 books in your library, all unique, you score 1. Buy HP and the Half-Blood Prince and it goes up to 17. This is equivalent to 16 people buying your entire library.

I suggest averaging the reciprocals of the number of copies, and multiplying by, say, 1000. This gives a number up to 1000, with higher numbers rather than lower being more obscure.

The above purchase would reduce a score of 1000 to 990 while a single person entering your entire library would reduce it to 500.
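The arithmetic in this suggestion can be checked with a short sketch. The function names are illustrative only, and the figure of 1,601 copies for the Harry Potter title is an assumption, back-solved so that the plain mean lands on 17 as described above.

```python
def mean_copies(copies):
    """Current-style score: plain average of copies per book."""
    return sum(copies) / len(copies)

def reciprocal_score(copies, scale=1000):
    """Suggested score: average of 1/copies, scaled so a fully unique library gets `scale`."""
    return scale * sum(1 / c for c in copies) / len(copies)

unique_library = [1] * 99            # 99 books, each the only copy on the site
with_hp = unique_library + [1601]    # assumed copy count for the blockbuster
fully_duplicated = [2] * 99          # one other person has entered every book

print(mean_copies(unique_library), mean_copies(with_hp))
print(reciprocal_score(unique_library), reciprocal_score(with_hp))
print(reciprocal_score(fully_duplicated))
```

Under that assumption the plain mean jumps from 1 to 17, while the reciprocal score only drops from 1000 to about 990; a full duplicate of the library halves it to 500.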

The most obscure library I have found is rated at 2: user patf444 from Brazil.

1/26/2006 6:21 AM  
Anonymous Anonymous said...

An obscure-off, you say.

Here's my entry

http://www.librarything.com/author/usarmy

1/26/2006 10:02 AM  
Anonymous Anonymous said...

I had two suggestions for additional functionality:

1) Allow a means of adding a field for who you loaned the book to and the date you did so, in order to keep track of who has your books.

2) A means of adding books from the same author without need of reselecting their name and tags. Then going to the page in the right hand view you were last on.

Great site!

1/26/2006 1:39 PM  
Blogger Tim said...

Ringman: I certainly see the problem. I think the answer, however, is to provide both mean and median, and maybe a histogram for good measure.

I also think it should give you a percentile. I'd like to do that for all numbers. It could never be perfectly up to date—the amount of calculation is staggering—but it could be pretty good. I think I'm going to make statistics only apply to people with over 100 books cataloged.

It's too bad that foreign-language libraries end up being the most obscure. This will change once it handles works better—my current project.

Anonymous: "A means of adding books from the same author without need of reselecting their name and tags. Then going to the page in the right hand view you were last on." I'm a little thick. Do you mean from an author page? Spell this out a little for me.

Tim

1/26/2006 2:27 PM  
Blogger Ed said...

Obscure-off?? I knew a Stanley Obscureoff who spent a lot of time updating downtime of various websites . . .


Please, no booing or hissing!

1/26/2006 3:37 PM  
Anonymous Anonymous said...

Any chance the "libraries like yours" feature will be returning? I liked that one.

I like the obscurity stat. It will be nice when the same book translated can be collapsed together.

I'm also interested in dates - date of original pub, date of translation (for translated works), date of current edition, date of purchase, date started, date finished. I've changed the dates of my books to reflect their original pub dates.

1/27/2006 2:18 PM  
Anonymous Anonymous said...

Another cool feature - average time it takes to read a book. I love that idea!

1/27/2006 2:34 PM  
Anonymous Anonymous said...

AND, I like anon's idea of books lent out, date, etc.

1/27/2006 2:35 PM  
Anonymous Anonymous said...

Tim,

I noticed that when I was, say, adding a bunch of Hunter S. Thompson's books, I needed to grab the list over and over, instead of maybe having check boxes to select several of his books to add at once. Like a batch add function. Or am I missing something? I can be a little thick myself. ;)

1/27/2006 5:56 PM  
Anonymous Anonymous said...

I vote for ringman’s comment requesting a copyright date field too - I find it more useful knowing when a book was first written, rather than that I happen to have the 5th reprint on my shelf....
Another 'want' is for joint writers to go into the same author field rather than a single name. I know the 2nd author can go in the alternate author field but it seems a tad unfair that one author gets all the stats or ‘combine with’ and the other doesn't.
However if I try joining the names in the author field, I can lose 'sharedness' with other identical copies which kind of degrades the data on distinct works, author count etc anyway..
Enjoying the work & improvements supplied so far though, thanks hugely for this site.

1/30/2006 8:25 AM  
