Tuesday, May 29, 2007

LibraryThing/Random House Early Reviewers

This is a done-up re-announcement of Early Reviewers. We blogged it last week, but tentatively. Since then we've refined it a bit, particularly on "our end" (i.e., the stuff you don't see). I also want to explain what's nifty about it, for members and particularly for publishers.

The idea is simple. Basically, LibraryThing and Random House will be giving out free pre-publication books in exchange for reviews.

The first batch includes:
What's cool here? Many publishers distribute "Advance Readers Editions" (AREs) to booksellers, journalists and, increasingly, bloggers. A few have formalized programs, like HarperCollins' First Look program: register and from time to time you'll get a book. LibraryThing builds on these, but takes them a whole new step.

AREs are a tricky business. It's hard to get them into the hands of the right people, and harder to make those hands open them. Most are simply wasted. And they're not cheap. Although usually pretty flimsy, they're made in small batches, so they generally cost more to produce than the final hardcover.

LibraryThing Early Reviewers solves this problem. Books aren't distributed randomly, but to the right people. The algorithm we're using has a bunch of factors, including plain luck. The core, however, is what LibraryThing knows that nobody else knows: the books in people's libraries.


If you saw this list clearly, we'd have to kill you.
To kick things off, Random House gave us a list of "similar books" for each title. We then washed these through a new recommendations algorithm, "sorting" the books in the LibraryThing library according to their statistical proximity to these titles. We ended up with 200 "similars" for each book. All things being equal, the more of these you have (and the higher they sit on the list), the better your chances of getting a book.*
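
The actual algorithm isn't published, so the following is only a minimal sketch of the general idea, under stated assumptions: each book has a ranked list of "similars," and a member earns more credit for owning titles that sit higher on that list. The function name `affinity_score` and the linear weighting are hypothetical, chosen purely for illustration.

```python
def affinity_score(library, similars):
    """Score a member's library against a ranked list of similar titles.

    `similars` is ordered from most to least similar. A match earns more
    credit the higher it sits on the list. The linear weight used here
    is an assumption, not the real formula.
    """
    n = len(similars)
    owned = set(library)
    return sum(n - rank for rank, title in enumerate(similars) if title in owned)


similars = ["Book A", "Book B", "Book C", "Book D"]
print(affinity_score(["Book A", "Book D", "Unrelated"], similars))  # 4 + 1 = 5
print(affinity_score(["Unrelated"], similars))                      # 0
```

A member with a huge library but no overlap scores zero, which matches the "wrong people" case described in the post.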

It turns out, this is a pretty powerful thing to do. Some reviewers pop right out—the ones reading lots of similar books. They're not guaranteed to like the book—nothing ever could—but they're the right people to review it. At the other end, it found members with hundreds or thousands of books, none of which are in the 1,000 similar titles. I'm not actually worried about bad reviews—bad reviews are fun!—but nobody is happy when books go to the wrong people. For starters, unlike professional reviewers, "regular" people don't usually finish and review books they aren't enjoying.

We thought hard about "exposing" the similarity information to users. But we decided against it. The lists are uncannily good, but they're ultimately subjective. I don't want to argue that X is more like Y than Z. And I don't want users to despair that they're never going to get books. If this thing works, they will. We'll get more books. Every book its reader, as they say.

I did, however, calculate every member's "affinity" to the books on offer, and send invitations to the most eligible 0.5% who aren't already signed up for Early Reviewers.

Anyway, we think we've added a new twist to ARC distribution. We think this is going to become something really big, big and "not evil" (in the Google pre-Chinese censorship sense).

Early Reviewers does some other new things:
  • We promise not to let the *content* of a review affect your chances of getting subsequent books. I suspect this isn't always true when publishers send bloggers books. Why keep sending someone books when they keep trashing them? LibraryThing is different here. First, we hope to match books and reviewers better in the first place. Second, our reviewers are our members, and LibraryThing stands or falls based on them, not on anyone else. If we started blacklisting members, we'd fall apart.**
  • We're starting with two batches of books from Random House. Starting in October, the program will be open to other publishers.
Anyway, check it out here: http://www.librarything.com/er/list .

*By the way, we only consider books added before Early Reviewers was announced. So, you can't spam this.
**Indeed, my greatest fear is that pure randomness makes a few people feel blacklisted, and they raise hell about it. Anyway, you have our word on this. Besides, I've always felt that the best reviews were negative ones. It is, after all, much harder to be creative in "I love yous" than in "your mothers" and other put-downs.

Update: As explained before we have to stick with US members for now. But I've opened up registration to everyone. When we get a batch that can be distributed more widely, we'll let you know.

Second update: My friend, author Kevin Shay, linked to a great blog post of his, The ARC of the Covenant.

Third update: Jessica Mulley shot me this link to an article she wrote about collecting galleys, proofs and advance copies. Actually, I'd already seen it; it ranks high in Google, but a read-through convinced me of its value. I'm glad, however, that I'm not a book collector per se. It would get exhausting.


Friday, May 25, 2007

See all a work's tags

LibraryThing members have added well over 18 million tags. Of course, they aren't equally distributed. Popular books now sport thousands or even tens of thousands of tags. Work pages have a small tag cloud for each book, but it only shows the most popular thirty or so.

So I added a link to show all tags for a work. It shows the whole "long tail." It's very long indeed. It's stunning.

Here's Freakonomics with the standard tag cloud:



Click "show all tags" and you get around five pages of tags. Here's a piece of that:

If you want to see the actual numbers, you can click the "show numbers" link.

Thursday, May 24, 2007

Cooking bookpile winners

We had a lot of fantastic and inventive entries for the Cooking bookpile contest—thank you all!

Our first prize winner, who gets $100 to use on AbeBooks.com, is Featherbooks, with "Cookbooks on Ice". Did you have no food, or did you clean out your fridge especially to take a picture? Impressive either way...



Second prize, and recipient of a free lifetime membership to LibraryThing, is
"I heart cooking & Librarything!" (Who are you? Email abby@librarything.com so I can send along your lifetime membership!)




And of course, a few more notable entries for your viewing pleasure... (All of them can be seen here).


"Eating History", by Selkie—oh, the wit :)



"Which book doesn't fit?" Um, can I guess "Game is Good Eating"?



"I'm an eclecticook." (A dinner party I'd certainly like an invitation to).



Sing it with me: "I'm gonna wash those books right out of my shelf"



A dream game of Clue: "Elmo in the dining room with a Sweet Potato". Fantastic. In oh so so many ways. (There were two other ways that Elmo kicked it, also worth checking out).



The oh so elegant "Afternoon Tea Cookbooks 2" by staffordcastle



Cookbooks from around the world, by MMcM



And last, but certainly not least... "Cook(ed) books" by tanja


Wednesday, May 23, 2007

New Feature Tip-Toe: "Early Reviewers"

We're introducing something new, called LibraryThing Early Reviewers. It's coming out officially on Tuesday, but assiduous blog readers get to start early.

The text at the top of the page sums it up:
"Random House has given us some advance copies of books soon to be published. We're sharing these with you to read and review. You get free books, and share your opinions with a wide audience. LibraryThing makes everyone happy and keeps everything free and fair."
So far, like much of what we do around here, this is something of a test. Kudos to Random House for being up to that.

Random House has signed up for two batches of books. The first batch includes:
Eventually, Early Reviewers will be open to other publishers.

Members should understand what this is, and what it isn't. We're going to talk about LibraryThing Early Reviewers, but won't be pushing Random House's or anyone else's books at you. Similarly, getting a free advance reader's copy comes with NO obligation. Under no circumstances will a bad review change your chance of getting another.

If more people want the books than we have copies, we'll have to ration them. The basic algorithm is randomness, but other factors come into play. We're going to try to spread the wealth around. And if you complete a review—good or bad!—you're more likely to get another. Finally, LibraryThing's matching algorithm will try to match up books with readers, based on the rest of your LibraryThing catalog. For publishers, that's the interesting part; we're anxious to see how it turns out.
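
As a rough illustration of rationing by weighted randomness, here is a minimal sketch. It assumes each interested member has already been given a positive weight that folds in the factors above (catalog match, completed reviews, spreading the wealth). The name `draw_winners` and the example weights are hypothetical, and nothing here reflects the real formula.

```python
import random


def draw_winners(weights, copies, rng=None):
    """Pick up to `copies` distinct members, favoring higher weights.

    `weights` maps member name -> positive weight. Anyone with a
    positive weight can still win, so luck always plays a part.
    """
    rng = rng or random.Random()
    pool = dict(weights)
    winners = []
    while pool and len(winners) < copies:
        names = list(pool)
        pick = rng.choices(names, weights=[pool[n] for n in names], k=1)[0]
        winners.append(pick)
        del pool[pick]  # at most one copy per member
    return winners


weights = {"alice": 3.0, "bob": 1.0, "carol": 1.5, "dave": 0.5}
print(draw_winners(weights, copies=2))
```

Drawing one winner at a time and removing them from the pool keeps the draw random while guaranteeing nobody gets two copies of the same book.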

I've set up an Early Reviewers group, to talk about Early Reviewers and Early Reviewer books. Let us know what you think!


Monday, May 14, 2007

LibraryThing for Libraries in Danbury

"You got your chocolate in my peanut butter!"
Over in Thingology I've announced the first library to use LibraryThing for Libraries: the Danbury Library in Danbury, Connecticut. They've got it all: works, recommendations, tags.

I've said I wouldn't do as much cross-posting, now that we have a combined blog feed (see over on the right). But I thought I'd mention it here, and explain a bit about what it means for LibraryThing.

First, as members of LibraryThing, you should feel proud that your data—anonymous and aggregate, as the Terms of Use say—is helping library patrons to find books. Your passions—the books on your shelves—beat statistical "paths" through books that others can follow. Your tags--the way you think about your stuff--will help people find subjects not covered by traditional subject classification.

For those concerned about development time, I want to emphasize that LibraryThing for Libraries is good for LibraryThing. On the most basic level, it's going to help our bottom line. That means more programmers making features and fixing bugs. Conceivably, it could mean cheaper accounts.

It also deepens our relationship with libraries, and returns a favor. LibraryThing was built on library data, and we've been graciously invited into the library conversation. We are charging for LibraryThing for Libraries, but our prices are in an entirely different league from what libraries are accustomed to paying for their online catalog software. And as these catalogs add "social" features, LibraryThing for Libraries will exert powerful downward pressure on prices. Ultimately, the industry needs a newcomer to take a huge slice of a smaller market. We're not going to be that company, but we can push the trend along.

LibraryThing for Libraries has also taught us a lot about library catalogs. These are some thorny, mysterious systems! Until now, we've relied exclusively on the simplicity of Z39.50 connections, which most libraries don't have. But we can do more. With our newfound experience, we can start connecting to the remaining 95%. If nothing else, this should help our language reach.

Wednesday, May 09, 2007

A very short introduction

At long last, the often requested quickstart guide—A very short introduction to LibraryThing.

It's intended as a quick overview of LibraryThing's features, to help new members get started, all the way from signing up to creating a blog widget. It's hard to come up with the balance of enough information to help without overwhelming, so I'm looking for your feedback. What should be added, changed, deleted, clarified...?

Discussion in this talk post.

Stars in reviews

Here's a low-hanging fruit. We finally put the review's star rating in the reviews. I think I'll call it a "mashup."

From The Da Vinci Code:


New search, now with "working-ness!"

I've changed how the "all fields" search for your library works. It's new and still being worked on—you can discuss problems and requests here on Talk. But it's faster, solves most character set issues and allows "fielded" queries.

Example queries:
greek history
"greek history"
greek history -war -"peloponnesian war"
gree* history
*disestablishmentarianism
tag: greek author: homer
title: finger* subject: pick-pockets
source: amazon all: history

Update: It supports "all," "tag," "title," "author," "ISBN," "subject," "dewey," "LCCN," "source," "date," "review" and "comment." (You can use plural for all names too.) By default, it now uses the field "most," which is "all" minus subjects, reviews and comments.
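
To make the query syntax concrete, here is a minimal sketch of how such queries could be broken into terms. It is not the site's actual parser. The field list and the default field "most" come from the post, while `parse_query`, `Term`, and the regular expression are hypothetical names and choices made for illustration.

```python
import re
from typing import NamedTuple

# Field names listed in the post; "most" is the default when none is given.
FIELDS = {"all", "tag", "title", "author", "isbn", "subject",
          "dewey", "lccn", "source", "date", "review", "comment"}
DEFAULT_FIELD = "most"

# Optional leading "-", optional "field:", then a quoted phrase or a bare word.
TOKEN = re.compile(r'(-?)(?:(\w+):\s*)?("[^"]*"|\S+)')


class Term(NamedTuple):
    field: str
    text: str
    negated: bool
    phrase: bool
    wildcard: bool


def parse_query(query):
    """Split a search string into fielded, possibly negated terms."""
    terms = []
    for minus, field, text in TOKEN.findall(query):
        name = field.lower()
        if name.endswith("s") and name[:-1] in FIELDS:
            name = name[:-1]  # plural field names are accepted too
        if field and name not in FIELDS:
            text = f"{field}:{text}"  # not a known field: treat as plain text
            name = ""
        phrase = len(text) >= 2 and text.startswith('"') and text.endswith('"')
        if phrase:
            text = text[1:-1]
        terms.append(Term(name or DEFAULT_FIELD, text, minus == "-", phrase, "*" in text))
    return terms


for q in ['greek history -war -"peloponnesian war"', "tag: greek author: homer"]:
    print(parse_query(q))
```

A leading hyphen only negates at the start of a term, so a word like "pick-pockets" stays whole.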


Monday, May 07, 2007

Going to Book Expo America in New York

Book Expo America, ABA's annual book industry trade convention, is in New York City this year, and I (Abby) am going to be there. I'll be speaking on Thursday, May 31st (from 1-2pm; mark your calendars!) on a panel called "Using Social Networking to Build Author Brands."

We just found out that our competitor, Shelfari, is also going to be at BEA this year, and is apparently using some of their Amazon funding to co-sponsor an event. Hey! Well, not only does LibraryThing appear to have sixty-five times as many book lovers as they do, but we think we have a lot more to offer authors, booksellers and publishers, and we're going to prove it.*

Authors. It was at last year's BEA that we launched the LT Author program. After Tim and I spent a day walking around trying to describe LT in a nutshell**, we realized we had been telling people, "it's like MySpace, but for booklovers." Well, MySpace is all about bands and musicians promoting their music. Wouldn't LibraryThing be a good place for authors to do the same? What better place to promote your new book than a website full of avid bibliophiles?

And so was born the LT Author button, a shiny yellow badge that connects an author's "author page" with their profile page. So far LibraryThing has snagged 395 authors. (See the complete list.)

Best of all, they're not just authors who clicked a box. To be part of the program, you have to have a LibraryThing account and put in at least 50 books. What is your favorite author reading? Find out.

Neil Gaiman's author photo.

Members have added over 15,000 pictures and photos of authors (see recently added ones), with alibrarian and leebot leading the pack. They deserve some kudos: it's actually a pretty intensive process, often involving writing authors, publishers, or photographers for permission, so the sheer number of photos is all the more impressive. Plus, it makes for a nice gallery. :)
LibraryThing members have also added over 92,000 links to author pages—links to author home pages, blogs, publisher pages, Wikipedia pages, interviews, articles, fan sites. That's a lot of links.

Booksellers. We'd love to add more bookstores to our "bookstores that integrate"—adding availability and pricing information on every work page. We've got only three so far, but we'll be adding two major "chunks" of them in the next few months—to at least 100 total. It's a great way for people to be able to see at a glance if a book is at their local bookstore.

Publishers. So far, we're not doing anything for publishers! But there's a big announcement coming soon. Be on the edge of your seats!

So what can we do to make LibraryThing big at BEA this year?

Our big idea so far is a par-tay. Of course, anyone and everyone can find some time to talk to me during BEA, but I'd like to have a big meet-up. Authors, publishers, booksellers, and hey, readers too. Anyone in NYC who's around is invited, not just the book-industry professionals allowed to go to BEA (they have to restrict it, because there's so much free merchandise on offer).

I made a BEA 2007 group; post there with ideas of where we should meet (I'm thinking maybe a restaurant near the convention center?). New Yorkers, I call on you for suggestions!

We're also thinking about bringing a bunch of CueCats, and giving them out to authors, to entice them into becoming LT Authors... What else?

*[Written by Tim] Shelfari doesn't release any statistics. But they do release the top 20 bookshelves. The 20th bookshelf on Shelfari has 1,360 books. LibraryThing has 1,378 members with that many. Hence 1,378/20 = 68.9 times as large. You will note that we do not abuse our other competitors, just Shelfari. Some of them are quite good! There's a good thread going about them. We want people to check them out, and come back to tell us how to improve LibraryThing!
**"This is me in a nutshell: HELP! I'm in a nutshell!"

(photo by Rick Dikeman on Wikipedia, under GNU Free Documentation License)

Subjects get faster; the rest will follow

Everything on the web is better if it's faster. Slow pages are a silent killer.

So we're working to speed things up. We've long done "situational" caching. But our growth is relentless (we'll hit 200,000 registered members today) and we've had no good, generalized solution. We've recently been working on two solutions, for database and page-level caching. Together they should speed up certain cacheable pages, like works, authors and tags. The more resources we can free, the faster the uncacheable pages, like Talk, will become as well.

So far, only subject pages are being cached, e.g.,
Subject pages were a big problem. The worst took a minute to load. When Google's "spider" program went at them, with one request/second, the servers would sweat. Subject pages are now cached whenever someone hits a page, and stay cached for at least a week.

Subjects are a test. There are some kinks to work out. (For example, changing the non-English translations doesn't immediately clear all affected pages.) Once we get where we want, we'll roll out page caching wherever we can use it. Query caching will follow.
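
For readers curious what "cache on first hit, keep for a week" looks like in practice, here is a minimal in-memory sketch. It is not the site's code: `PageCache`, its methods, and the one-week constant are illustrative assumptions, and a production setup would more likely keep pages on disk or in a shared cache than in a Python dictionary.

```python
import time

WEEK_SECONDS = 7 * 24 * 60 * 60


class PageCache:
    """Cache rendered pages on first hit and reuse them until they expire."""

    def __init__(self, ttl=WEEK_SECONDS, clock=time.time):
        self.ttl = ttl
        self.clock = clock
        self._store = {}  # key -> (expires_at, html)

    def get(self, key, render):
        """Return the cached page for `key`, calling `render()` only on a miss."""
        now = self.clock()
        entry = self._store.get(key)
        if entry and entry[0] > now:
            return entry[1]
        html = render()
        self._store[key] = (now + self.ttl, html)
        return html

    def invalidate(self, key):
        """Drop a page early, e.g. after an edit that changes what it shows."""
        self._store.pop(key, None)


cache = PageCache()
page = cache.get("subject/Greek history", lambda: "<html>slow page</html>")
print(page)
```

The translation kink mentioned above is an invalidation problem: one edit can touch many cached pages, and every one of them needs an `invalidate` call.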

Saturday, May 05, 2007

Conversation = Excellence

LibraryThing has always depended on members to set development goals and refine (or ditch) features. But it's amazing how well it's worked with the new "affinities"* feature. We simply could not have anticipated how members would shape our thinking. (I will never ever develop another project in a small, closed group, with occasional trips to watch a "focus group" from behind smoked glass.) We're still watching reactions on the blog, and on a Talk topic now 130+ messages long, but we have some good ideas. When Altay returns from Boston, we'll hammer out changes, including customization of the look, and the ability to turn it off.

I started another thread I want to highlight, about LibraryThing's strategy and a hiring decision for the non-English LibraryThings. Do we hire someone, and what can they do? I'm hoping the thread gets some traction, at least among the users of our dozen-plus non-English sites. We need a non-English plan.

Part of the problem is technical, starting with better character support. But there's a feedback loop. Right now, the non-English sites can't be the coding priority because they're not contributing as much to our growth, or to our finances. (Not that they're small. Our non-English sites appear to have more action than our largest English-language competitor.) If we hired someone—and had something for that person to do—we'd have a stronger incentive to work on it.

*We called them "affinity percentiles," but it got chipped down nicely by SilentInaWay. Case in point.


Friday, May 04, 2007

Affinity percentiles and Altay

Altay (middle), John (sweatshirt), Tim (right), Abby (encased in her spherical "soul cage")

We're introducing an important new feature, but only just. The feature is called "affinity percentiles." Basically, we show numbers next to other users' names. These represent how "similar" your library is to theirs.

We've started it off on just one area of the site, the message pages in Talk (example). We plan to roll it out across the site, but not until we get a lot of feedback. I have a feeling some members will love it, but some won't. This isn't something we want to do lightly.

The number needs some explaining. (It may be too subtle, and we should fall back to a more straightforward "books shared.") Basically, the higher the better. The person who shares the most books with you will have a 99%; the person who shares the least gets a 1%.

The percentage isn't the number shared—65% does not mean a user shares 65% of their books; it means that the user shares more books than 65% of users. Two other factors come into play:
  • a member has to share five books to get an affinity percentile
  • "sharing" is weighted by book obscurity and library size. A user with 100 books who shares 20 obscure books with you ranks much higher than a user with 10,000 books who shares some very popular novels.
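
A minimal sketch of how a percentile like this might be computed follows. The five-book minimum and the 1-to-99 range come from the description above. Everything else is an assumption for illustration only: the rarity and library-size formulas and the names `weighted_overlap` and `affinity_percentiles` are hypothetical, not the real weighting.

```python
import math

MIN_SHARED = 5  # fewer than five shared books means no percentile


def weighted_overlap(mine, theirs, owners):
    """Score the books two libraries share, or None if there are too few."""
    shared = mine & theirs
    if len(shared) < MIN_SHARED:
        return None
    # Assumption: obscure books (few owners) count for more than popular ones.
    rarity = sum(1.0 / math.log(2 + owners.get(book, 1)) for book in shared)
    # Assumption: big libraries overlap by chance, so damp the score by size.
    return rarity / math.sqrt(len(theirs))


def affinity_percentiles(me, libraries, owners):
    """Map each eligible member to a percentile between 1 and 99."""
    scores = {}
    for name, books in libraries.items():
        if name == me:
            continue
        score = weighted_overlap(libraries[me], books, owners)
        if score is not None:
            scores[name] = score
    ranked = sorted(scores, key=scores.get)  # least similar first
    if len(ranked) == 1:
        return {ranked[0]: 99}
    return {name: 1 + 98 * i // (len(ranked) - 1) for i, name in enumerate(ranked)}


libraries = {
    "me": set(range(20)),
    "small": set(range(5)) | set(range(100, 105)),
    "big": set(range(10, 15)) | set(range(1000, 2000)),
    "stranger": {500, 501},
}
owners = {book: 2 for book in range(5)}
owners.update({book: 50000 for book in range(10, 15)})
print(affinity_percentiles("me", libraries, owners))
```

Here "small" and "big" each share five books with "me", but the small library sharing obscure titles ranks at the top, and "stranger" gets no percentile at all.
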
Other features:
  • If you hover over the percentile, you'll get the shared books. We've thought of having it actually show the books.
  • The percentile box is colored in line with the number—the hotter the higher.
Some questions:
  • Are the percentiles too hard to understand? Would shared numbers be better?
  • Is the weighting confusing?
  • What should happen when you hover over it? When you click on it?
  • Where should it go? Where shouldn't it go?
How? I've wanted to do something like this for months. It's a surprisingly difficult technical problem. You can't calculate it on the fly every time; that would be insane. But caching the data gets big quick. Imagine a "Battleship" grid of users, 190,000 by 190,000. If you stored a single byte for each connection (the number of shared books), it would amount to at least 16 gigabytes of data (190,000 squared/2). The solution I came up with involves efficient short-term caching, and ignoring members with fewer than five shared books. We've actually been running it on the Talk pages since last night, waiting to make it visible until we knew it wouldn't melt our servers. (So far no melt!)
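
The back-of-the-envelope size of the full grid is easy to check. This sketch takes the post's own inputs (190,000 members, one byte per pair, each pair stored once); the function name `pair_table_bytes` is just an illustrative choice.

```python
def pair_table_bytes(members, bytes_per_pair=1):
    """Bytes needed to store one value per unordered pair of members."""
    return members * members // 2 * bytes_per_pair


size = pair_table_bytes(190_000)
print(size)                    # 18050000000
print(round(size / 2**30, 1))  # 16.8 (GiB)
```

That is about 18 billion bytes for a table that would have to be rebuilt as libraries change, and it grows with the square of the membership.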

You'll notice the numbers aren't there when you first hit the page. They come in a second or two later. This is "Ajax" at work, and was done to prevent the new feature from slowing Talk down.

The real benefits will come when the feature is distributed across the site. I'm particularly interested in seeing affinity percentiles on reviews, and sorting by them. Ultimately, I don't care what 300 people think about the Da Vinci Code. I want to know what Tim-ish people think of it.

Why?
The crux of the idea is to highlight what makes LibraryThing's social system work, so-called "social cataloging." Vanilla social networking is structured around "friends." That's a powerful idea, but it has limits. It can be too "binary"; and the dynamics of "friending" a stranger miss many of us. At its best, social cataloging gets at something more nuanced. If I share 50 books about ancient history with you, there's a degree, a nuance and a semantics to the connection that opens up a world of possibilities. Some are social and some aren't. I might want to chat with you about the books we've read, or I might not. Either way, I benefit. The rest of your library is probably interesting to me. And your opinions have a claim on my attention no anonymous guy on Amazon gets.

This post also introduces Altay Guvench (username: Altay), who did the JavaScript work behind affinity percentiles. This was actually a toss-off, but Altay was the force behind the much more amazing JavaScript in LibraryThing for Libraries. That stuff is a work of art: JavaScript inserting JavaScript. It might actually be self-aware! Altay will be working on the site generally, with a tilt toward things that JavaScript can improve, like the widgets.

Altay in a nutshell: Portland native. Harvard undergrad. Bassist for the alt-country band Great Unknowns (toured with the Indigo Girls! Reviewed ecstatically. Listen to a free song!). Co-founder of Y-Combinator-funded startup AudioBeta. One of only three members on LibraryThing with Optical holography : principles, techniques, and applications. Scheme hacker. Nerd, but a nerd who rocks out.


Thursday, May 03, 2007

Combined blog feed available

I used Yahoo Pipes to make a combined feed for this blog and our Thingology blog. It was easy to do, and the result is pretty useful. The three feeds are as follows:
I also edited the employee list on the right, to add Altay. He is the magic behind the LibraryThing for Libraries JavaScript, but almost nobody's seen that yet, so we're waiting for his first user feature to give him a proper introduction.

Wednesday, May 02, 2007

Many more Wikipedia citations

You'll notice many more Wikipedia links from work pages. The total has increased by about 200%, and the coverage by at least that.

This improves what I did in February. That worked by looking for ISBN patterns. Of course, not all books cited in Wikipedia have ISBNs. And even when there is one, many Wikipedia contributors omit it. (As far as I'm concerned, ISBNs look chintzy in a bibliography anyway.)

I've redone it, this time also looking for telltale title/author patterns, and running the matches against LibraryThing's vast and usefully messy dataset. The logic is somewhat fuzzy and therefore imperfect. But I haven't noticed any problems.
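
To give a flavor of what "somewhat fuzzy" matching can mean, here is a minimal sketch. It is not the matcher described above. It assumes titles show up in wiki italics (two apostrophes on each side), that a work is known by a title and an author surname, and that a match needs a near-identical title plus the surname somewhere in the article. The names `find_citations` and `normalize` and the 0.9 threshold are hypothetical.

```python
import re
from difflib import SequenceMatcher

# Wiki markup wraps italic text, often a book title, in pairs of apostrophes.
ITALIC_RE = re.compile(r"''([^']+)''")


def normalize(text):
    """Lowercase and strip punctuation so small differences don't block a match."""
    return re.sub(r"[^a-z0-9 ]", "", text.lower()).strip()


def find_citations(wikitext, works, threshold=0.9):
    """Return ids of works whose title and author surname both appear.

    `works` is a list of (work_id, title, author_surname) tuples.
    """
    hits = set()
    article = normalize(wikitext)
    for candidate in ITALIC_RE.findall(wikitext):
        cand = normalize(candidate)
        for work_id, title, surname in works:
            close = SequenceMatcher(None, cand, normalize(title)).ratio() >= threshold
            if close and normalize(surname) in article:
                hits.add(work_id)
    return hits


works = [(1, "The Structure of Scientific Revolutions", "Kuhn"),
         (2, "The Da Vinci Code", "Brown")]
text = "Kuhn's ''The Structure of Scientific Revolutions'' reshaped the debate."
print(find_citations(text, works))  # {1}
```

Requiring the author's name alongside the title is one cheap way to cut false positives for short or common titles, at the cost of missing citations that never name the author.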

The number of citations expanded a lot.* Some entries exploded. Take Thomas Kuhn's The Structure of Scientific Revolutions:



Notably, it caught casual references to books, not just structured ones. For example, the article on Science wars mentions Kuhn's work in running prose, not in the bibliography or footnotes.

I haven't updated our free Wikipedia citation feed. That maps articles to ISBNs, but the new data is work-based. If anyone wants to use the new data, let me know and I'll tackle the problem. Cool as I think it would be, I haven't seen any libraries adding Wikipedia links to their catalogs yet.

*The fact that it's a new feed, and the somewhat fluid interactions between ISBN-based and work-based matching, make it tricky to estimate, but it looks like a 200% increase.