Monday, February 09, 2009

One million facts

"Now, what I want is, Facts. Teach these boys and girls nothing but Facts. Facts alone are wanted in life. Plant nothing else, and root out everything else. You can only form the minds of reasoning animals upon Facts: nothing else will ever be of any service to them. This is the principle on which I bring up my own children, and this is the principle on which I bring up these children. Stick to Facts, sir!" — first paragraph of Dickens' Hard Times
Three cheers for LibraryThing's diligent members. Our Common Knowledge system has hit 1,000,000 member contributions.

Common Knowledge is an innovative "fielded wiki" for book information—collaborative, piecemeal "cataloging" of information about books and authors. We created it back in October 2007—Chris did most of the coding—and it has exceeded our expectations.

The focus is on things not found anywhere else: information not cataloged by librarians or publishers. The system's biggest strength is probably its series coverage, 26,890 series and counting. More comprehensive than paid series data, it is also often of higher quality. There is surely no library in the world that accounts for the Star Wars series (plural) better than what LibraryThing members have assembled! Common Knowledge also tracks some 8,860 awards, from the Wolfson History Prize to the Nestlé Smarties Book Prize.

Fun, if not quite as full, are lists of 78 books with Lincoln in them, and 23 with Emma Goldman and Puck. Almost 1,700 books take place in New York, 90 on Mars and 49 in Hell. Some 626 authors went to Harvard, three were gas station attendants and four were buried in Uppsala Cathedral. No doubt there are more of all of these, but the data is starting to really pile up, a confirmation that Social Cataloging is no joke.

Wherever Common Knowledge goes, it will not be locked up. All Common Knowledge data is free for reuse outside the site, with a handy API as well.

Picking up. The one-millionth entry came early. Edits picked up dramatically when, ten days ago, I introduced a Dead or Alive? page for every member, allowing you to find out how your authors break down on the living/dead scale. They went through the roof when I introduced a similar Male or Female? page. CK also attracted some interest from the initial release of distinct authors—a method for distinguishing between distinct, homonymous authors. (It was a busy weekend.)

The one-millionth Common Knowledge entry was added at 6:47pm (EST) by ladybug1983, who assigned the contemporary romance Taking the Heat as the third book in the series O'Neil Family.

Hey LadyBug, want a t-shirt?



Anonymous Anonymous said...

Three cheers for the CN-feature. And Dead or Alive/Male or Female are great incentives for digging in and adding those data.


I'm getting totally fed up with CN data not being replicated across language versions. An obvious example: If I look up Charlotte Brontë in, CN comes up absolutely blank. Really, I can't see any reason whatsoever that these data should NOT be shared.

2/09/2009 3:58 AM  
Anonymous Anonymous said...

Hi Tim,

your recent initiatives make me think that LibraryThing may benefit from connecting with other resources in the Linked Data cloud/community.
The simple idea is that if your application and my application both add information about X, the information about X I add in my application is available to yours, and vice versa.
I think you should be particularly interested in what this community has to offer on identity management, as this would be a mechanism for solving the Authority Control problem.

with gratitude for LibraryThing from one of your earliest customers,


2/09/2009 4:56 AM  
Blogger Christer "Mort" Boräng said...

I think that straight up replication of CK across languages is a bad idea, at least for all fields. After all, if I'm using the Swedish site, I would want CK to be in Swedish.

However, something that works like the Subject field might work.

If a field is empty in a language, but it exists in another, show it, but in a different colour. That way, you can see the information, and it's easy to see which entries need editing.

Oh, and fields that are language neutral or have just a few possible values (gender, dates and such) can be replicated like billygoatbeard, sorry, geitebukkeskjegg, suggested.

2/09/2009 6:36 AM  
Anonymous Anonymous said...

I'll second most of what Christer said above.

But looking at the Bronte example above, the only field that really needs translation is "Occupation". Names, dates, places... none of this need be translated. Also, upon inspecting random examples in several non-English LT sites, my impression is that many prefer to enter data in English, even when not required to. Perhaps under the misconception that the data will be shared.

2/09/2009 7:06 AM  
Blogger Katya said...

With due respect, names and places do sometimes need translation. (The English "London" is the French "Londres" and the Russian "Лондон," for example.)

2/09/2009 8:11 AM  
Anonymous Anonymous said...

Under Picking Up:

I introduced a Dead or Alive? page for every member, allowing you to find out how your authors bread down on the living/dead scale.

Can you explain this ?

2/09/2009 9:11 AM  
Blogger Tim said...

Some thoughts:

1. Some CK fields should be language-neutral, at least by default. (Even dates can include language-specific features, like "1920 (rumored)".) Others can't. The solution is to adopt different levels of sharing for the various elements. It's unfortunate we didn't solve this up front, for sure.

2. I am basically hostile to the Semantic Web community and their approach to problems like this. Also, nobody outside of libraries and us has any data worth a damn here. Could go on at great length; will spare you.

3. Changed. It's break.

2/09/2009 11:27 AM  
Anonymous Anonymous said...

I love CN.

But what I'm really waiting for is the ability to sort my catalog by (for example) a field like Original Publication Date. Will it ever be possible...?

2/09/2009 2:18 PM  
Anonymous Anonymous said...

In the same vein as wwidsith, one of the things I would really wish updated by more people is the original publication date. A way to sort by the original publication date would be very nice, as well.

2/10/2009 12:35 AM  
Blogger Unknown said...

@Tim "I am basically hostile to the Semantic Web community and their approach to problems like this".

If you are willing to spare the time I am really interested in your thoughts about this (whatever their length).

If you don't wish to do this on this blog (I understand and respect that this discussion may not be everybody's cup of tea) I can be reached at

2/11/2009 2:49 AM  
