Wednesday, June 24, 2009

Reviews in many languages

I've added a bunch of features around the language that members write reviews in.

Reviews by language. The result is to make LibraryThing more attractive for non-English users—they now get reviews in their own language by default. A few languages, especially our Dutch, French and German sites, already have a decent number of reviews, and this should make it more fun for all non-English users to review books.

For the English-only members, the feature is mostly negative—it's now easy to screen out the clutter of reviews in languages you don't understand.



Most popular works have reviews in other languages. Something like the Da Vinci Code has reviews in thirteen languages, including twelve in Dutch, three in Swedish, two in Catalan and one in Greek! ("Un dels millors llibres que he llegit mai", "Το λάτρεψα"—maybe it's better in translation!)

Reviews uClassified. Most reviews have already been assigned to a language. Rather than use the default language in LibraryThing profiles, which turns out to be very, very weakly related to the language members write their reviews in, I took advantage of the excellent language classification service offered by uClassify (uClassify.com). uClassify runs a Bayesian filter on a piece of text and sends back a list of languages with confidence scores.

It isn't perfect, but it's pretty good. Only very high scores were accepted as definitive, and short reviews weren't sent for classification at all. As a result, about 1/8 of LibraryThing's 730,267 reviews remain as "not set."
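
For the curious, the acceptance rule described above can be sketched roughly like this. It is only an illustration, not the code LibraryThing runs: the cutoffs (MIN_CONFIDENCE, MIN_WORDS), the function name and the shape of the classifier's result are all assumptions, and the classifier is passed in as a plain callable rather than a real uClassify request.

```python
from typing import Callable, Dict, Optional

# Hypothetical cutoffs: the post does not give the real values.
MIN_CONFIDENCE = 0.95  # "only very high scores were accepted"
MIN_WORDS = 10         # "short reviews weren't sent"


def assign_language(
    text: str,
    classify: Callable[[str], Dict[str, float]],
) -> Optional[str]:
    """Return a language name, or None to leave the review "not set".

    `classify` is a stand-in for a language classifier; it is assumed
    to return a mapping of language name to a confidence in [0, 1].
    """
    # Short reviews are never sent to the classifier.
    if len(text.split()) < MIN_WORDS:
        return None

    scores = classify(text)
    if not scores:
        return None

    # Take the top-scoring language and accept it only if the score
    # clears the (assumed) cutoff.
    language, confidence = max(scores.items(), key=lambda item: item[1])
    if confidence < MIN_CONFIDENCE:
        return None
    return language
```

Anything that comes back as None stays "not set" until the member picks a language by hand.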

Feature changes. A bunch.
  • You can now edit your review's language everywhere you can edit or enter a review.
  • Your library statistics page (link) now shows how many reviews you've written in every language. Most importantly, this shows the number of reviews that haven't been assigned to a language.
  • For reviews going forward, your default language is set on your account page.
  • The catalog now has a "Reviews language" field and a special search for all your reviews in a given language (e.g., reviews in English, language not set). These links are available from your stats page.
  • You can Power Edit review languages, and when you're looking at all your reviews in a language, if it differs from your default language, you will get a link to make all unset reviews be in your default language. For example, here are all your unset reviews (link).


Statistics. The numbers turned out something like this.

English/Unset: 650,988
Dutch: 8,636
French: 4,666
German: 4,651
Spanish: 4,463
Italian: 2,876
Swedish: 2,329
Danish: 1,587
Norwegian: 1,231
Portuguese: 1,098
Finnish: 662
Catalan: 443
Etc.

To be done, talked about. As usual, there's more to do. So far, there's no good list of recent or top reviews by language. Come discuss it on Talk and suggest other improvements.


8 Comments:

Blogger Pollux said...

Really cool new feature!

Thanks a lot!

6/24/2009 7:59 AM  
Anonymous Anonymous said...

Excellent! Many, many thanks!

6/24/2009 10:59 AM  
Anonymous Circeus said...

Please, PLEASE do the same for Common Knowledge material. It is currently not possible to enter material in French on librarything.com (the system refuses to accept that it might not be in English), and I'm forced to assume either the reverse occurs on librarything.fr (and you cannot enter CommonKnowledge for books in English there) or worse yet it will STILL assume the language is English!

Unless I'm mistaken, this means it is not even possible to enter CommonKnowledge if there is no translation of the site in that language!

6/24/2009 5:42 PM  
Anonymous Anonymous said...

Good work, thank you.

Still, even as a non-English speaker I've never really understood why LT has to be fragmented into several languages. Most users want access to the entire data-pool, and we do want our own input accessible to the entire membership. How many of us master more than two or three languages? Not me certainly!

I'd really, really prefer it all to be in English!

6/25/2009 7:05 PM  
Anonymous Circeus said...

Anonymous, have you considered it would make sense for, say, the various language-specific details (i.e. first/last words, epigraphs, dedications...) of a book and its translation to be stored as different languages?

6/25/2009 9:03 PM  
Anonymous Christian Renner said...

Not sure whether entries in different languages are a good thing really.
Of course Americans are the new Romans :-) and we non-Americans (I am German) should preserve our identities, but apart from being the language of the masters of the world :-), English has become a kind of Esperanto, a lingua franca, that so many people all over the world understand - it is connecting all of us. I would not have been able to use Librarything if its language had been Icelandic or - even - French in the first place and I fear that all of us will lose those entries and those contacts that will be written in languages we do not speak or understand. Anyway, of course it is cool for Librarything to be so global - again my congratulations for such a great site (my favourite place on the web).
Kind regards Christian

6/28/2009 2:33 PM  
Anonymous Osbaldistone said...

Just for fun, I ran the text of this blog entry through the uClassify 'mood' filter, and got the following results:

upset (66.9 %)
happy (33.1 %)

Meaning, uClassify is 66.9% confident that the writer is upset!

6/29/2009 8:15 AM  
Anonymous Grimm said...

This is a really great feature and cleverly implemented! Most users probably wanted to be able to filter the reviews but keep the ability to display the ones in another language, so this is great. Probably a step forward to be able to leverage the mass of reviews, for example for offering LT for libraries outside the US.
Common Knowledge would definitely benefit from better language management because, if some fields definitely need to be language dependent, some others, like dates, should be language INdependent.

6/30/2009 6:45 AM  
