Thursday, June 19, 2008

Web Browsers and Memory Fragmentation

I've been following Stuart Parmenter's blog posts about improving memory usage in Firefox 3. The most interesting part has been his excellent work on reducing memory fragmentation. By making pretty pictures of memory fragmentation under various malloc implementations, he was able to identify jemalloc as the best option for Firefox, and the improvement made by switching was pretty impressive.

But I suspect that ultimately it won't be enough. The advent of Ajax is dramatically changing the way browsers use memory. Rather than loading a fairly static DOM and occasionally doing a few small manipulations in Javascript as the site is used, modern web apps mutate the DOM much more extensively, adding and removing sections repeatedly as new data is received from the server. And they're going to be doing that kind of thing more and more.

There's an interesting problem here which is unique to browsers. Javascript doesn't give programmers direct access to pointers, so an implementation is free to move objects around in memory however it pleases as it executes code. Modern Javascript engines take advantage of that abstraction and manage memory with generational garbage collectors that compact the heap by relocating live objects as they run. But the DOM, which also has to be manipulated by browser code written in C or C++, is reference-counted. Reference counting can't migrate objects between regions of memory, so it depends on the underlying malloc implementation to keep fragmentation under control, and that's not something even a very smart malloc can always accomplish. DOMs should be particularly challenging, because they contain large numbers of strings of varying sizes.

Reference counting introduces memory fragmentation problems in any language, but the situation is particularly bad in browsers. Javascript programs and DOMs on different web pages share the same process memory space. The fragmentation caused by one page can interact with the memory used by other pages, and even after a problematic page is closed, the fragmented memory often can't be reclaimed. With ordinary applications, fragmented memory is all neatly reclaimed when the process terminates. Right now, that doesn't work for web apps.

But this is a problem for browsers, not for web apps. Individual web app authors won't be motivated to care if your browser uses a lot of memory and causes your machine to start swapping once or twice a day, especially if the blame is shared among many popular sites. Unless one site is much worse than its peers, high memory usage makes the browser look bad, not the app.

So I think browsers will eventually experience pressure to fix this problem. As the correct metaphor for a web page moves farther from "document" and closer to "application", maybe it makes sense for browsers to act more like operating systems. Local memory for a web page could be allocated in dedicated regions, and could all be bulk-reclaimed when the page is closed.

That model is simple, but it still allows individual pages to fragment their own memory and consume more and more over time. Maybe a better answer is to find a way to use a generational garbage collector on the DOM. The familiar OS model of per-process virtual memory management was designed for running programs written in languages like C and C++, and that's why it's appealing for handling browser DOMs. But Javascript is a high-level, functional, garbage-collected language. If browsers are little operating systems that run Javascript applications, maybe they should operate more like Lisp machine operating systems than like Unix.
