If you’re wondering why I’ve been quiet the past few weeks, it’s because I’ve been devoting most of my free time to finishing off a new benchmark I’m releasing today called GUIMark. GUIMark is kinda like an Acid3 test on speed that’s geared towards RIA technologies. The goal was to figure out how to implement a reference design in different runtimes and then benchmark how smoothly that design could be animated. So far I have implementations in DHTML, Flex, Java, Silverlight 1 and Silverlight 2. All the results and implementation details can be found on the GUIMark page.
GUIMark shares a lot in common with another RIA benchmark, Bubblemark. I’ve written a bit about Bubblemark and why I think an alternative is necessary, but I do believe Bubblemark and GUIMark can coexist while serving two different purposes. Alexey Gavrilov stated it best: he sees Bubblemark as a sort of ‘Hello World’ launchpad into comparing different environments, and I agree with him. Bubblemark is a *very* accessible test suite and it’s easy for any kind of developer to jump in and play around with performance techniques. GUIMark takes a different approach by trying to benchmark the types of UI elements common in our Web 2.0 world. This includes things like vector redraws, alpha transparencies, text reflow, bitmap motion, and 9-slice scaling rules. From there I just fill up the render pipeline until it becomes so saturated that it’s easy to visually distinguish which rendering engines are more efficient than others. As a result, the benchmark is more complicated on a visual level and requires a bit more time than Bubblemark to understand the implementation rules. Lastly, with GUIMark I’ve tried to get into some of the lower-level details behind how rendering engines work and how they affected the creation of this project.
I’m hoping that developers and designers will be able to use this test suite to identify any pros or cons to choosing a particular environment when visual transitions are a key element of the experience. I’m also hoping these benchmarks provide a spotlight for the community that we can turn toward the runtime engineers inside Sun or Adobe or Mozilla to demand better performance.
Go to GUIMark home page
A few months ago someone on the Adobe boards asked why the Flex test case in Bubblemark seemed to act so differently in AIR versus in the browser. Yesterday I saw the same question come up again and figured I’d finally weigh in on the topic. The simple answer is that the test was created improperly; the complex answer has to do with the inherent limitations of the test itself.
First off, for those who don’t know what the Bubblemark test is, it’s a simple animation test case implemented in different GUI frameworks; it’s kinda like an Acid2 test for rendering speed. The charts should ideally give you a base number to understand how well one technology compares against another for rendering. As a GUI developer I’ve been a bit underwhelmed with the whole thing, and here’s why:
- The author doesn’t understand Flash’s rendering engine. The easiest way to illustrate how incorrectly the Flash test was designed is to download the source and change the compiled framerate to 1 fps. Recompile and run the test and you’ll notice the benchmark reporting ~50 fps. You can clearly see the balls only moving once per second, yet the test thinks it’s flying along. This is because the test case makes the incorrect assumption that changing the properties of a DisplayObject causes it to render right away. The reality is, Flash holds on to all display updates until the next render pass and applies the latest changes all at once. Changing the position of an object every 5 milliseconds is meaningless when Flash is bound by a 33 millisecond render pass (or whatever 1000 divided by your framerate happens to be). A correct test case would rely on an ENTER_FRAME handler to change x and y values and get rid of any Timer calls.
- Framerate tests above 60 fps are meaningless. Seriously, any GUI benchmark designed to test above 60 fps is bogus. In fact, a pretty simple optimization technique for Adobe or Sun would be to cap the paint requests that get forwarded to OS X or Windows, simply because the majority of computer users these days are on LCD panels, which natively run at 60 fps. Some operating systems even go a step further and limit the effective framerate of the paint requests they send to the video card (see Beam Sync on Mac). So when you see the Java test case fly up to 120 fps on Bubblemark, you can realistically only see 60 of those frames, and there’s a chance the other 60 are never even calculated by Java’s layout engine.
- The test just moves balls around! This is my biggest beef with the benchmark, because it only tests one simple aspect of the rendering engine in these technologies: bitmap translation. How do bitmaps moving around the screen tell you anything about the capabilities of the respective technologies? Do the JavaFX guys really think optimizing this use case will make their technology relevant? The only thing Bubblemark will tell you is which runtimes might best handle bitmap particle emitters... that’s about it. There’s a lot more that goes into both the layout engine and the rendering pipeline of these different technologies, and it’s a shame that only the most basic aspect is being tested. The funny thing is, if you open up your task manager while running the tests, you’ll notice that several of them don’t even try to run at full speed; my CPU is sitting as low as 20% in some cases. This means the runtimes don’t even consider the test difficult enough to give it full attention and have opted for using less power over faster motion.
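
The first two points can be illustrated with a little arithmetic. What follows is a minimal sketch in Python, not the Bubblemark source and not how Flash actually schedules work: the names `simulate` and `visible_fps` are made up for illustration, and the model simply assumes that property changes only reach the screen at the next render pass and that the display cannot show more frames than its refresh rate.

```python
def simulate(duration_ms: int, timer_interval_ms: int, compiled_fps: int) -> dict:
    """Compare what a timer-based counter reports with what gets rendered.

    Simplified model (an assumption, not Flash's real scheduler): a timer
    moves the balls every timer_interval_ms, but the runtime only paints
    once per frame, drawing whatever the latest positions are.
    """
    timer_ticks = duration_ms // timer_interval_ms       # position updates
    render_passes = duration_ms * compiled_fps // 1000   # actual paints
    seconds = duration_ms / 1000
    return {
        "reported_fps": timer_ticks / seconds,    # what the benchmark claims
        "rendered_fps": render_passes / seconds,  # what reaches the screen
    }


def visible_fps(reported_fps: float, refresh_hz: float = 60) -> float:
    """Frames per second a viewer can actually see on a given display."""
    return min(reported_fps, refresh_hz)


# A 20 ms timer on a movie compiled at 1 fps, measured over 10 seconds.
result = simulate(duration_ms=10_000, timer_interval_ms=20, compiled_fps=1)
print(result)            # {'reported_fps': 50.0, 'rendered_fps': 1.0}
print(visible_fps(120))  # 60
```

The 20 ms timer interval is just a value picked to reproduce the ~50 fps reading described above; the point is that a counter driven by timer ticks measures how often positions change, not how often anything is drawn.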
I don’t mean to cut down the developers responsible for Bubblemark, because at least they came up with a simple way to help us all compare these different technologies. I just think it’s a bit misguided to put any meaning behind these numbers. When evaluating your options for a GUI framework in our flashy Web 2.0 world, you need to consider how well a technology can handle object scaling, alpha transparencies, rotations, and text reflow, along with basic x and y translation and dynamic redraws. Even more realistically, developers need to be aware of the limits in the 25-45 fps region, since this is where you can efficiently balance render complexity with smooth animation. I’ve uploaded a couple of quick test cases in Flash, HTML, and Silverlight that I think provide a good foundation for stressing a rendering engine, and hopefully I’ll get a chance to expand them into a full test suite.
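
To put rough numbers on that 25-45 fps region, here is a small sketch of the time budget each frame gets at a given framerate. The helper name `frame_budget_ms` is hypothetical, and the calculation is nothing more than 1000 divided by the framerate; it ignores vsync, timer granularity, and any overhead in the runtime itself.

```python
def frame_budget_ms(fps: float) -> float:
    """Milliseconds available for layout and rendering in each frame."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


for fps in (25, 30, 45, 60):
    print(f"{fps} fps -> {frame_budget_ms(fps):.1f} ms per frame")
```

Under this simple model, a scene that holds 25 fps has about 40 ms to spend on each frame, while one that holds 45 fps has a little over 22 ms, which is the trade-off between render complexity and smoothness.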