Tuesday, January 10, 2006

Performance / Two new features

1. Performance update. Performance has been spotty recently. Accelerating growth—Sunday saw 16,000 books cataloged!—has strained current resources. Improving speed and reliability is now my number one goal. Occasional slowness and even downtime will happen, but things are heading in the right direction.

I'm getting some part-time assistance. (But I'm still looking for a good LAMP hacker; email or send resumes!) So far I've re-jiggered settings and optimized requests. I am looking very carefully at the path up. I will probably be setting up a dedicated database machine and/or a dedicated "thinking" machine. I have a new book-suggestion algorithm that is to book suggestions what Deep Blue was to chess—extremely good and extremely processor-intensive. Bringing it live would crash the present site. It might also take over NORAD computers and threaten nuclear war until forced to learn the meaning of checkers.

I've temporarily disabled all-catalog searching while I work on database speed. These searches were taking up a large percentage of database power.

2. Two new features. Before I turned my attention to speed I did complete two new author suggestion features. Author pages now include a "Similarly tagged" box; tag pages sport a "Related authors" box.

They are hit-or-miss. A decent example would be Ilaria Gozzini Giacosa, author of the cookbook A Taste of Ancient Rome. With only a few tags the system managed to match her up with the author of a Medieval cookbook and with Apicius, author of De re coquinaria, the only extant example of a Roman cookbook. Apicius is in the unique position of being "similarly tagged" with both Marcus Tullius Cicero and Betty Crocker. Both fusty and overrated? At the other end, take Nuala O'Faolain, author of Are You Somebody: The Accidental Memoir of a Dublin Woman. She is paired with Barack Obama. Both wrote memoirs, I guess. As more tags enter the system, this sort of single-tag effect should lessen. Here is the "similarly tagged" for David McCullough and the related authors for World War II.

25 Comments:

Blogger Anon said...

nice work man, we appreciate the dedication. Another suggestion I am sure will be ripped like my pricing theory but I will make it anyways. I'd like to see a recommendation on which book to read next in my catalog based upon the highest weighted ratings that other people have given the book in their reviews. This implies you add a "read" field but I think you have that on the list already.

1/10/2006 3:23 AM  
Anonymous Anonymous said...

Deep Blue, I think you mean.

1/10/2006 4:37 AM  
Anonymous Anonymous said...

6:00 AM EST, and MySQL errors all over the place! Hope it isn't too hard to fix. I'm looking forward to seeing the changes.

1/10/2006 6:07 AM  
Blogger FlyingSinger said...

I appreciate that you are experiencing faster than expected growth and having to work on "infrastructure" (i.e., database performance issues, server issues, etc.) as well as develop cool features. I realize this will probably take some time to straighten out, so users should cut you some slack - we're getting a cool capability that didn't even exist a few months ago, and growing pains are to be expected.

Nevertheless, at the moment, outages seem to be the norm, so I will lay off trying to catalog etc. for a while. I also had to remove the widget from my blog for now as it was taking a long, long time to load or even timing out. That's OK, it's not the main thing.

I don't know much about how that sort of script thing works, but if it could have a timer that says "if I can't display the real thing in 10 seconds (or something), then skip it," that would be cool. Maybe it already does but it's not getting to run because it's waiting for database access or something.

Anyway, thanks for providing this and working to keep up with the demand. Maybe you can get some sort of sponsor lined up to help with server capacity? With the features you have and are adding, plus a "critical mass" of catalogued books, I think you start to fall into the category of a valuable web resource a la Wikipedia. Just a thought, though it's also important to retain independence.

-Bruce

1/10/2006 9:58 AM  
Blogger Dennis said...

It might be a good idea to take the site offline so people don't keep trying to do things that just aren't working.

I can't access my profile, view my catalogue, or log in from a browser that hasn't already been automatically logged in. This has been the case for me for at least 5 hours.

1/10/2006 5:12 PM  
Blogger Tim said...

For five hours? How many times did you check? The site has been up most of that time. It went down once at 12 and just went down again at 5:20. I'm working on it.

1/10/2006 5:19 PM  
Blogger Dennis said...

Sorry, Tim! That was a careless statement on my part. I checked before going to the library, and after coming back, and it happens that those were just the times it went down.

1/10/2006 5:39 PM  
Blogger Tim said...

Back up (and backed-up). Let me know if there are any problems.

1/10/2006 5:41 PM  
Blogger Dennis said...

I just added two new books with tags, and so far it's smooth sailing.

Thanks.

1/10/2006 5:54 PM  
Anonymous Anonymous said...

Norman, couldn't a weighted recommendation just use the rating field that we have already? I don't see why we'd need a "read" field for such a thing.

I dug the prices idea, too. Somehow, though, I view that as the purview of an API-using script (I've thought of several handy scripts, e.g. "total number of books I read last year" or "percentage of my library that is science fiction"), rather than a problem for Tim to solve.

1/10/2006 7:44 PM  
Anonymous Anonymous said...

Dear Tim,

I don't want to look a gift horse in the mouth. The charge for all you have done is minuscule. I think I have said I am not unaware of the ultimately angelic nature of this enterprise.

Still ...

Surely it's more important to make the search function work. The other things you talk about are in comparison icing on the cake. They are extras (my husband called them fluff).

I don't understand much about software so cannot figure out why it is that when I attempt to search for a book in the most simple ways, the search function goes awry or doesn't help.

But so it is.

It may be this is a problem that is hard to fix.

But it is important -- at least to me. I have lots of books and one hope I had was that this library thing would help me manage my library -- find things, know where they are with us.

And that's the job of a working search function.

Pray pardon me for complaining about what I don't understand.

Chava

1/10/2006 8:07 PM  
Blogger Tim said...

Maybe there's some confusion over the term "search." I think you might be talking about the add books feature, "searching" for books on Amazon, the Library of Congress, etc. That is the center of LibraryThing, and in no way impaired.

It is also possible to search your library—the books you have already entered—and do it in various ways. And when you have a book it will tell you who else has it. Ditto authors and tags, which are also searches. And if you are looking at someone else's library you can also search that library.

What you can't now do is to use the search tab to search for all books in all libraries in the system (1.4 million books) for a given word in the title. I agree that that can be interesting, but I don't think it's central to the experience. In any case, I will bring that back as soon as possible.

Ultimately, the deal is this: Today's numbers put LibraryThing as the 5,700th most popular site on the web. According to Alexa, it now gets more traffic than BookFinder.com, Biblio and BookSense, established book businesses with employees and multiple servers. (I also note that Alexa numbers don't count widgets, now running on over 800 blogs.) LibraryThing is currently running on one server.

I am working to transition to a much more elastic structure, so LibraryThing is zippy at 5,000 and good enough at 1,000. This transition will take a little time and not a little money. I am investing the time and the money. I promise you I am doing the best I can to solve the issues.

Tim

1/10/2006 9:47 PM  
Anonymous Anonymous said...

Dear Tim,

I was referring simply to searching in my own library. A selfish individual desire.

Say I want to find a book by John Smith. I can type John Smith into the library search and get far too many books, some with Smith and the others with John. Ditto on titles.

If a few hundred books turn up, that's not useful.

Similarly, when the words are alphabetized, they are alphabetized by really the very first letter in a word and then the second, not by the first word after "A" or "The." So that makes clicking on this or that column to put words in alphabetical order and search that way frustrating.

I don't quite understand why it is I am so often frustrated when I try to search my own library. Perhaps it would be helpful if the search function included comments. Apparently it only includes titles or authors' names.

Jim (my husband) could explain it better. He has the same problem. Perhaps it's a matter of splitting and lumping more precisely or along particular kinds of linguistic or content specific lines. I don't know.

Chava
the book.

1/10/2006 10:57 PM  
Blogger Tim said...

Chava. Ah, okay. I see where you're coming from.

Okay, first, I will be changing how the search works. Right now it goes by a very fuzzy definition of "relevance." If you type "John Smith" it delivers a long list organized by "relevance." So, books by John Smith should come up first, but down the line it will even show books that have just one of the words or even a word that's not quite the same (e.g., Smyth). This is "helpful unhelpfulness," and something of a throwback to the early web, not how search engines work today.

I will be changing this shortly. The change will actually make it faster too.
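
As a rough illustration of the stricter behavior described above, here is a minimal sketch in which a book matches only if every word typed appears as a whole word in its title or author. The function name `strict_search` and the dictionary fields are hypothetical, chosen for the example; this is not LibraryThing's actual code or schema, and the planned change may work quite differently.

```python
def strict_search(books, query):
    """Return only books whose title or author contains every query word.

    Hypothetical sketch: `books` is a list of dicts with "title" and
    "author" keys, an assumed shape rather than a real schema.
    """
    terms = query.lower().split()
    results = []
    for book in books:
        # Compare against whole lowercase words, so "smith" does not
        # match "smyth". Punctuation is not stripped in this sketch.
        words = set((book["title"] + " " + book["author"]).lower().split())
        if all(term in words for term in terms):
            results.append(book)
    return results


# Made-up sample data for illustration only.
library = [
    {"title": "Example Book One", "author": "John Smith"},
    {"title": "Example Book Two", "author": "Jane Smyth"},
    {"title": "Example Book Three", "author": "Adam Smith"},
]
matches = strict_search(library, "John Smith")
```

With this sample data, only the John Smith entry survives: the "Smyth" near-miss and the Smith-without-John entry are both dropped instead of padding out the list.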

As for the "The," "A," etc. issue, I have it on my list. It's a little more complex than you might think. Library catalog records have a little digit that tells systems when to start sorting, in case the "the" is in Urdu or Tagalog. Amazon records aren't quite as helpful. I can, however, fake it and get it right most of the time, at least for English.
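
The English-only "fake it" approach might look something like the following sketch: build a sort key that drops one leading article before comparing titles. The name `title_sort_key` and the article list are assumptions made for the example, not the site's real code.

```python
# English articles only; other languages would need their own lists.
ARTICLES = ("a", "an", "the")


def title_sort_key(title):
    """Build a sort key that ignores one leading English article."""
    words = title.lower().split()
    # Drop the leading article, but keep it if it is the entire title.
    if len(words) > 1 and words[0] in ARTICLES:
        words = words[1:]
    return " ".join(words)


titles = ["The Odyssey", "A Taste of Ancient Rome", "Are You Somebody", "Beowulf"]
sorted_titles = sorted(titles, key=title_sort_key)
```

Here "The Odyssey" files under O and "A Taste of Ancient Rome" under T. A heuristic like this gets English right most of the time but will misfile titles where the first word only looks like an article, which is where the catalog record's sorting digit does better.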

1/10/2006 11:07 PM  
Blogger Tim said...

I should add that I really do appreciate these comments. They give me a sense of what to work on and in what order. At the moment, I need to work on anything that helps the server load.

1/10/2006 11:14 PM  
Anonymous Anonymous said...

"Similarly tagged" is not working for me. When I click on "calculate this", nothing happens.

1/10/2006 11:33 PM  
Blogger Tim said...

Give it a second, depending on the book. It doesn't calculate it until you need it.

1/10/2006 11:49 PM  
Anonymous Anonymous said...

As you have probably figured out by now, optimizing is quite complex. It helps a lot if you can prioritize things and maybe analyze which features are most used. The gain/effort ratio is much better that way.

1/11/2006 8:30 AM  
Anonymous Anonymous said...

Tim,

A small suggestion -- until you figure out what you want to do about a users' forum where people can post questions and help each other out with answers -- it might be useful to do an "open thread" thing here now and then.

This is also my vote for that forum. There are a million places on the web to discuss books, but no place to put our heads together about using the great tools you've put together.

1/11/2006 10:26 AM  
Anonymous Anonymous said...

On searching:

As long as you're planning to work on this Tim, I really feel that all fields should be searchable. I have had cause to look for publishers and translators and comments and it's just not possible right now.

In fact I think many of these fields should actually work at least partly like tags but that's something else entirely.

1/11/2006 12:17 PM  
Blogger Anon said...

Angharad, saw your note regarding my suggestion for a suggested next book to read. I might be the odd man out but I buy a lot more books than I'm able to keep up with. So I was looking for a suggestion of the books that were already in my catalog but not yet read. The suggestion would look at the ratings of other reviewers (weighted by library similarity which is already calculated) to tell me you really want to read this one next. I think I would use this especially the night before a long trip. Right now I stare at the shelf and try bitterly to make a decision. Love your use of tags and need to look into the barcoding. Any suggestion in either regard would be appreciated.
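
To make the idea concrete, here is a minimal sketch of a similarity-weighted pick among unread books. Everything in it is hypothetical: the function name `suggest_next`, the data shapes, and the sample numbers are made up for illustration, and the similarity weights are simply assumed to be available.

```python
def suggest_next(unread_titles, other_ratings, similarity):
    """Pick the unread title with the highest similarity-weighted mean rating.

    other_ratings: {user: {title: rating}} (assumed shape)
    similarity:    {user: weight}, higher means a more similar library
    Returns None if nobody with a positive weight has rated any unread title.
    """
    best_title, best_score = None, float("-inf")
    for title in unread_titles:
        weighted_sum = 0.0
        weight_total = 0.0
        for user, ratings in other_ratings.items():
            if title in ratings:
                weight = similarity.get(user, 0.0)
                weighted_sum += weight * ratings[title]
                weight_total += weight
        if weight_total == 0:
            continue  # no usable ratings for this title
        score = weighted_sum / weight_total
        if score > best_score:
            best_title, best_score = title, score
    return best_title


# Made-up sample data for illustration only.
unread = ["Book A", "Book B"]
ratings = {
    "reader1": {"Book A": 5, "Book B": 2},
    "reader2": {"Book A": 2, "Book B": 5},
}
similarity = {"reader1": 0.9, "reader2": 0.1}
choice = suggest_next(unread, ratings, similarity)
```

Because reader1's library is assumed to be far more similar, that reader's high rating for "Book A" outweighs reader2's preference for "Book B".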

1/11/2006 2:19 PM  
Blogger Anon said...

Tim, I know you are booked solid with work so I took the liberty of creating a discussion group for Librarything at:

http://groups.google.com/group/Librarything

I hope you are ok with that and if you want me to turn this over to you at any point I would be happy to do so. Hopefully this would be a logical extension to your blogging.

1/11/2006 2:32 PM  
Anonymous Anonymous said...

Give it a second, depending on the book. It doesn't calculate it until you need it.

No, it doesn't seem to calculate at all. I've tried a few times, waiting several minutes, not seconds. And nothing happens. (I don't mean that it begins to load and takes a long time. I mean, quite literally, nothing happens.)

1/11/2006 5:45 PM  
Blogger chamekke said...

(Repeating this from another thread, since I just realized that it may not have been posted to the ideal spot.)

Tim, are you planning to put together all these instructions and tips on feature usage (such as the rules for improved tag searching) and provide a beefed-up FAQ one of these days?

I for one am beginning to have difficulty remembering all the bits and bobs that are available on LibraryThing, and I doubt that I'm alone. At the moment most of these tips can only be found by searching blog entries, which is not the most efficient way to do it. And, brand new users may not even realize that the information is there to be found.

If you'd like to do it, but simply don't have time, perhaps you could ask for volunteers ;-) I wouldn't mind writing up a subsection or two.

1/11/2006 7:13 PM  
Blogger Anon said...

I've started by creating a discussion group but can expand it to faqs as well if it helps. I'm not sure I am the best to write one yet but can certainly moderate or list appropriately within the discussion group or elsewhere it makes sense.

1/11/2006 8:16 PM  
