On Thu, Jul 01, 2010 at 04:07:38PM -0400, Richard Pieri wrote:
> On Jul 1, 2010, at 3:45 PM, Derek Martin wrote:
> >
> > But if it does indeed keep databases about its mail stores, there's no
> > reason for that to be true. It can do any number of things, including
> > maintain a header cache in its database (and I would imagine it
> > probably does exactly that). In which case, the performance between
> > the two should probably be about the same, given the same sized mail
> > store. Even if it didn't, there are potentially other ways to prevent
> > timeouts.
>
> Nope. Even with the database, header cache, whatever, manipulating
> multi-GB mbox files is *slow*, slow enough that it can cause the
> server or client to time out the connections.

Not buying it; even if you don't cache the entire headers (which would
eliminate reading the mail store at all except for new messages),
reading through the whole file would be faster than the
open->read->close->disk head seek (not seek()) loop that's inherent to
maildir. If the IMAP server times out on these, and *doesn't* time out
on equivalent maildir folders, it's doing something braindead.

I've been involved with Mutt development for quite a long time, and
before Thomas Glanzmann added header caching for maildir, scanning the
headers of large mbox files was *orders of magnitude* faster than
scanning the exact same maildir folders. With header caching (which
unfortunately does not support mbox at this time), mbox is only
slightly slower, or even still slightly faster than maildir, depending
on hardware, filesystems, etc. If mbox were cached, it would still be
faster than maildir for most operations that matter to my usage
patterns.

If the programs were smarter, they also could do things like expunge in
place, expunge in the background, etc. which would (at least appear to)
make mbox take much less time to do those operations (and also use a
lot less temporary disk space).
Most implementations don't do that because the code is more complicated
and the risk of data loss is higher if something goes wrong. But it's
totally achievable; I prototyped code that did some of this a few years
ago.

> > Nor do you with mbox, unless the IMAP server is stupid. Barring
> > hardware failure, the IMAP server barfing on a huge file should only
> > ever happen if it's broken (i.e. stupid).
>
> "Should" and "reality" often don't coincide, and I've seen users do
> some pretty brain-damaged things -- and had to clean up the messes
> afterwards.

The point is, you can't blame the file format for that. You can blame
the code, or the users... But it's not an inherent problem with the
format. Practically speaking the risk of catastrophic data loss is
higher with mbox, but we're still dealing with very low probabilities,
and after all we're talking about a home user's e-mail here. I'm sure
gaf is smart enough to back up anything he cares about, and I'm willing
to bet that even if he lost *all* his e-mail in some catastrophic
meltdown of his environment, he wouldn't cry (too much =8^) about it.

> mbox is a pain to fix by hand and that pain is compounded when
> having to deal with multi-GB files. Similar explosions with Maildir
> -- when they happen -- are more files to fix but they're
> individually smaller, and then it's delete the cache and let the
> IMAP server rebuild it.

I've managed both in corporate mail environments for years, and I've
never found a practical difference.

Don't get me wrong, I think maildir is fine, and I use it (and I use
mbox too, side by side). But I think mbox often gets a bad rap for no
good reason. It's true that maildir has advantages over mbox, but it's
equally true that mbox has advantages over maildir, and as a practical
matter I think the choice of which to use mostly doesn't matter, though
how you use e-mail may tip the scales one direction or the other.

-- 
Derek D. Martin
http://www.pizzashack.org/
GPG Key ID: 0xDFBEAD02

-=-=-=-=-
This message is posted from an invalid address. Replying to it will
result in undeliverable mail due to spam prevention. Sorry for the
inconvenience.
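
To make the access-pattern argument above concrete, here is a minimal sketch of the two header scans being compared. It is not Mutt's code or any IMAP server's; the function names are hypothetical, it only pulls out Subject lines, it ignores folded headers, and it assumes "From " lines inside message bodies are escaped (as in the usual mbox variants). The mbox scan is one open() and one sequential pass over the file; the maildir scan is one open/read/close per message.

```python
import os

BLANK = (b"\n", b"\r\n")


def _subject(header_lines):
    """Return the Subject value from raw header lines, or '' if absent."""
    for line in header_lines:
        if line.lower().startswith(b"subject:"):
            return line[len(b"subject:"):].strip().decode("utf-8", "replace")
    return ""


def scan_mbox_subjects(path):
    """Hypothetical helper: one open(), one sequential read of the mbox."""
    subjects = []
    header_lines = None   # None means we are not inside a header block
    prev_blank = True     # start of file counts as "after a blank line"
    with open(path, "rb") as f:
        for line in f:
            if prev_blank and line.startswith(b"From "):
                header_lines = []              # separator: new message starts
            elif header_lines is not None:
                if line in BLANK:              # blank line ends the headers
                    subjects.append(_subject(header_lines))
                    header_lines = None
                else:
                    header_lines.append(line)
            prev_blank = line in BLANK
    if header_lines is not None:               # last message had no body
        subjects.append(_subject(header_lines))
    return subjects


def scan_maildir_subjects(root):
    """Hypothetical helper: one open/read/close per message file."""
    subjects = []
    for sub in ("cur", "new"):
        folder = os.path.join(root, sub)
        if not os.path.isdir(folder):
            continue
        for name in sorted(os.listdir(folder)):
            header_lines = []
            with open(os.path.join(folder, name), "rb") as f:
                for line in f:
                    if line in BLANK:          # stop at end of headers
                        break
                    header_lines.append(line)
            subjects.append(_subject(header_lines))
    return subjects
```

Both functions return the same list for equivalent folders; the difference is only in how many files get opened along the way. How that translates into wall-clock time depends on hardware, filesystem, and whether a header cache sits in front of either scan.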