It’s been exactly one year since GUIMark 2 was created and it seems the natives are growing restless. Over the past few years I have spoken with and worked with a few vendors about performance issues in web technologies. Most of this stuff usually stays pretty internal, but this time I’ve gotten a new request straight from Adobe’s QA team: build a new version of GUIMark that’s more comprehensive, focused on mobile, and remains open to the community. With 200 test results this is definitely the biggest GUIMark yet.
The philosophy for this benchmark is the same as before. Each test has been designed to find the breaking point of your phone or tablet. By forcing the devices to render at less than 60 frames per second we can ensure that stated performance matches actual throughput on the device. In fact most of these tests were designed to average around 30fps so there would be plenty of room for future growth. Similar to previous tests, I’ve only had the time to build these in Flash and HTML5; however, it may be interesting to test native apps at a later date.
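To make the frame rate numbers concrete, here is a minimal TypeScript sketch of how an average fps figure can be derived from per-frame timestamps. It is not taken from the GUIMark source; the function name `averageFps` and the idea of collecting timestamps in an array are assumptions made for illustration.

```typescript
// Hypothetical helper (not from the GUIMark source): average frames per
// second computed from a list of frame timestamps in milliseconds.
function averageFps(frameTimesMs: number[]): number {
  // At least two timestamps are needed to measure one frame interval.
  if (frameTimesMs.length < 2) {
    return 0;
  }
  const first = frameTimesMs[0];
  const last = frameTimesMs[frameTimesMs.length - 1];
  const elapsedMs = last - first;
  if (elapsedMs <= 0) {
    return 0;
  }
  // N timestamps bound N - 1 frame intervals.
  const intervals = frameTimesMs.length - 1;
  return (intervals * 1000) / elapsedMs;
}

// Example: four frames drawn over 100 ms.
console.log(averageFps([0, 33, 66, 100]));
```

A test that averages around 30fps leaves headroom below the 60fps cap, so a faster device can still post a higher number instead of pinning at the maximum.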
For the sake of disclosure, I should state that Adobe funded time through my employer EffectiveUI to enable me to write these tests. While the ideas, code, results, and analysis were conducted entirely by me, I understand that people may read in some bias as a result. Please keep in mind that this was designed to be a 3rd party analysis, not a pat on the back, and I think the results reflect this. Also, if you look at the source code for the two platforms, you’ll see that in most cases the code is line for line identical, only diverging when it comes to platform specific APIs.
Nine devices across a range of hardware and firmware were used to run the tests. While the devices provided a good sampling of what’s available on the market, they are by no means a definitive list. Each device was running the latest software available to it, and in the case of Android represents a moderate amount of fragmentation. While it would be ideal to only test devices with the latest version of Android, the reality of mobile right now demands that we all work against a wide variety of firmwares. Tests on the Honeycomb platform were originally done with the 3.0 version, but with the release of 3.1 at Google I/O I wanted to see how this affected performance. I’ve only listed 3.1 results in the main article below, but the 3.0 results are also preserved in the Google spreadsheet linked at the end.
Lastly, the tests have been designed to force the device into a 480 pixel wide viewport which I feel is a good median resolution for interactive content like a game or chart. They have also been designed to run in portrait mode. The source code for all the tests is contained here, and the directory on my webserver containing all the runnable tests is also available to browse through. Keep in mind that these tests were designed to run on a mobile device, and if you view them in your desktop browser you will likely see them all running at the maximum framerate of 60 fps.
First up is a bitmap drawing test that was designed to simulate a scrolling shooter similar to the old Raiden arcade game. The game logic is minimal so this is all about pushing pixels around. Unlike the GUIMark 2 bitmap test, this new version doesn’t include scaling, anti-aliasing, rotation or half pixel compositing. This is just a straight up blitting test, which is more in line with how game developers would optimize their drawing code for handheld games. Like previous tests I’ve done, this one runs on absolute timing for the position of elements in the game. This means slower devices don’t run the test slower; instead, the rendering just looks choppier.
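Absolute timing can be sketched roughly as follows. This is an illustration in TypeScript, not the code from the tests; `Sprite`, `positionAt`, and `samplePositions` are hypothetical names, and the speed and intervals are made-up values.

```typescript
// Hypothetical sprite description (names are assumptions for illustration).
interface Sprite {
  startY: number;        // starting position in pixels
  speedPxPerSec: number; // scroll speed in pixels per second
}

// Position depends only on elapsed wall-clock time, never on how many
// frames have been rendered so far.
function positionAt(sprite: Sprite, elapsedMs: number): number {
  return sprite.startY + sprite.speedPxPerSec * (elapsedMs / 1000);
}

// Simulate a device that manages one frame every frameIntervalMs and
// record the position it would draw on each frame.
function samplePositions(
  sprite: Sprite,
  durationMs: number,
  frameIntervalMs: number
): number[] {
  if (frameIntervalMs <= 0) {
    throw new Error("frameIntervalMs must be positive");
  }
  const positions: number[] = [];
  for (let t = 0; t <= durationMs; t += frameIntervalMs) {
    positions.push(positionAt(sprite, t));
  }
  return positions;
}

const enemy: Sprite = { startY: 0, speedPxPerSec: 120 };
const fastDevice = samplePositions(enemy, 1000, 20);  // 50 fps
const slowDevice = samplePositions(enemy, 1000, 100); // 10 fps
console.log(fastDevice.length, slowDevice.length);
console.log(fastDevice[fastDevice.length - 1], slowDevice[slowDevice.length - 1]);
```

Both simulated devices reach the same position after one second; the slower one simply gets there in fewer, larger steps, which is what shows up as choppiness.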
Also, one of the comments on GUIMark 2 was that HTML5 should draw faster when the source image was cached to a separate canvas first, so I’ve included 2 versions of the HTML5 test to investigate this theory.
|Numbers In Frames Per Second|
I actually expected HTML5 to do much better on this test. This is blitting 101 stuff here, no fancy transforms or anti-aliasing, just straight up compositing. Flash on the other hand chews through it without a problem. For the most part HTML does seem to benefit from caching image data to a canvas first and copying pixel data from there to output to the final canvas, although the benefits weren’t universal. The asterisks in the results for the 3 tablets are explained further below.
I think that in terms of ‘real world’ tests the original GUIMark 2 vector test better represents the type of things people will use the vector APIs for, so this time I wanted to do something more fun. This new test is more akin to a Processing demo, something I imagine accompanied by a cool audio tone generator and posted on a site like chromeexperiments.com. It also gives us the chance to compare complex vector fills and gradients that were left off the GUIMark 2 vector test. This test runs off absolute timing just like the bitmap test.
|Numbers In Frames Per Second|
Even with only a handful of shapes on screen at a time, this test is pretty devastating to the drawing APIs on both platforms. You can barely even detect the complexity of the gradient in the HTML5 version on mobile. Without having a desktop browser to validate it with, I would have thought the gradients were completely missing when testing on the phones. Flash manages to keep a pretty sizable lead over HTML on this test. Honestly I’m not surprised by this fact since vector drawing has been the keystone of the Flash runtime since its inception.
|HTML5||HTML5 Disabled||Flash||Flash Disabled|
|Numbers In Frames Per Second|
Last time I gave up on my attempt to compare video performance because there was no way to retrieve frame rate information from the system. This time around I decided the only way to make this work was with a high speed camera. By encoding the frame data directly into the video and putting it under a high speed camera, we can objectively record how often frame data is being dropped from the render queue. The better the decoding engine, the fewer dropped frames we should see.
This test is a bit different from the others for a couple reasons. Video tends to follow standards and decoding chips are designed around those standards. Performance doesn’t scale linearly like standard CPU bound benchmarks, and you’ll reach a point where the decoder hits a brick wall. It’s more important to test those standards than to compare everything against a single heavy stream (which would be more akin to the tests above). With that in mind I’ve created four tests that stick close to YouTube encoding standards, using the following video profiles.
|Video profile|Notes|
|---|---|
|H.264 Base Profile 360p video at 768 kbps|Level 1.3 video, good for ‘lowest common denominator’ testing|
|H.264 Base Profile 480p video at 1250 kbps|Maximum video size you’d expect to see delivered to a phone, between 1.3 and 2.0 Level|
|H.264 Base Profile 720p video at 2000 kbps|Large size Base Profile encoding, appropriate for tablet devices but good for stressing phones.|
|H.264 High Profile 720p video at 2000 kbps|High Profile video at 2 Mbps, maximum detail that you can expect to see on a mobile device for years to come|
The chart below lists the percentage of video frames that are displayed by each platform. If you want to look at the high speed results for each device, you can view them all in the results directory.
|360p||480p||720p||720p High||360p||480p||720p||720p High|
|Nexus One||100%||100%||99%||wont play||100%||100%||7%||11%|
|Numbers In Percentage of Frames Played|
Please note: the Gingerbread release for the Galaxy Tab enabled hardware acceleration for Flash video. While the numbers have been near 100% since the update on 5/16, I didn’t have time to rerecord the test and parse the results.
Before you ask, yes I actually sat through all of these high speed videos and counted individual frame skips, and it was thoroughly painful. Maybe next time I’ll wise up and write an image analysis program to do it for me. Subjectively, I would argue that video that stayed above 70% looked good during playback. Anything below that mark will have too much stutter and really starts looking like crap.
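The counting step itself is simple once the frame numbers have been read off the footage. The sketch below is a hypothetical TypeScript helper, not the process used for the results above (those were counted by hand). It assumes each video frame carries an integer index starting at 0, and that the high speed footage has already been turned into a list of the indices that appeared on screen; neither detail is specified in the article.

```typescript
// Hypothetical helper: percentage of distinct video frames that actually
// reached the screen. observedFrameIds holds the frame index read from each
// high speed camera capture (assumed 0-based integers); the same index
// normally appears several times because the camera is faster than the video.
function percentFramesDisplayed(
  observedFrameIds: number[],
  totalFrames: number
): number {
  if (totalFrames <= 0) {
    return 0;
  }
  const seen: boolean[] = [];
  let distinct = 0;
  for (const id of observedFrameIds) {
    // Ignore anything that is not a valid frame index for this video.
    const valid = Math.floor(id) === id && id >= 0 && id < totalFrames;
    if (valid && !seen[id]) {
      seen[id] = true;
      distinct++;
    }
  }
  return (distinct * 100) / totalFrames;
}

// Example: a 5-frame clip where frame 2 never appeared on screen.
console.log(percentFramesDisplayed([0, 0, 1, 1, 3, 3, 4], 5));
```

Counting distinct indices rather than camera captures means a frame that lingers on screen for several captures is still counted once, and a skipped frame is simply one that never shows up in the list.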
Flash really takes a beating in this category as many of these devices only allow software decoding for Flash video. You can clearly see which devices are enabled for hardware decoding like the Playbook, Xoom, and Atrix. Adobe has informed me that exposing hardware requires Google and the manufacturer to deliver the appropriate drivers, which becomes evident when viewing the performance differences between Xoom 3.0 and 3.1. HTML5 video on the other hand seems to be fully hardware accelerated on all of the phones, although interestingly HTML5 won’t fall back to a software renderer for certain files, and simply refuses to play the video.
What’s wrong with the tablet results?
Every time I build these tests there’s always some hidden problem that I stumble across that I didn’t expect, and this time is no different. You will notice in the HTML bitmap tests I had to place an asterisk next to the frame rate numbers for three of the tablets. The reason is that the frame rate reported by the device is extremely inaccurate. With the high speed camera we can see just how far off the numbers really are.
*Note that the Xoom in the video is running 3.0, and while this affects 3.1 as well, I didn’t have time to recapture it on video.
While the tablets showed the most dramatic problems here, I’m pretty sure I saw it manifest on the Atrix as well, just to a lesser degree. The behavior doesn’t seem to exhibit itself on either the vector or compute tests, and none of the Flash tests show this problem either. The Playbook also doesn’t seem to have this problem. My best guess is we’re seeing a problem with WebKit image rendering, with the browser run loop falling out of sync with the GPU somehow. Hopefully someone out there can shed some light on this problem.
This test was much bigger than anything I’ve done before, and we’re not done yet. I went back to my old GUIMark 2 tests and ran them as well to provide even more numbers to slice and dice. I think those old tests are still perfectly valid and even show how a couple of the devices have improved since they were first tested.
The results of all the tests are broken down on this Google spreadsheet. GM3 refers to the current tests and GM2 refers to the original GUIMark 2 mobile tests.
The Motorola Atrix clearly stands out for overall performance among the phones. On the tablet side the PlayBook took the lead for Flash performance, and while the Xoom posted the highest numbers for HTML5, the truth is that a few of those tests should have their numbers halved since the device isn’t rendering to the screen at the same rate as the listed fps. In terms of interactive content overall, it’s safe to say that Flash maintains a 2x performance lead over HTML5 on average.
The video side tells a different story. All of the devices are able to chew through the full suite of HTML5 videos with only a few exceptions. Flash however is riding out a transition period in which some devices offer hardware acceleration while others fall back to software decoding.
There’s a lot of information to absorb here, and hopefully some of the finer points will be fleshed out in the comments, but here’s a quick summary of my thoughts after working on this test.
1. The Flash VM performs really well on mobile chipsets and I don’t see any evidence here to support the idea that Flash is slow on smartphones and tablets. High end videos are below par at the moment, but the 3.1 release of Honeycomb illustrates that firmware updates are the key to solving this issue.
2. I have a sinking feeling that browser vendors are happy enough with current Canvas 2D performance. The performance deltas between Flash and Canvas are nearly the same as they were a year ago when I released GUIMark 2. Maybe I’m wrong but all I hear about in tech circles is improvements in CSS and SunSpider performance.
4. I wanted to include a Windows Phone 7 device in this review but the browser couldn’t handle any of these tests. If anyone has access to Blackberry or Palm phones I’d be happy to include them in the spreadsheet as well; just add them to the comments below.
57 thoughts on “GUIMark 3 – Mobile Showdown”
I would be grateful if you would kindly compile the Flash benchmarks to iOS apps using the latest version of the AIR packager so we can see them running on iPhone and iPad. Perhaps for fairness you would also need to use PhoneGap or Appcelerator Titanium to compile the HTML5 games into an app using an embedded WebView control. If the SWF performance of the latest version of Flex on iOS is as good as anecdotal reports have been recently, the reaction from the game dev community would explode.
Keep up the wonderful work – and THANKS for your effort and time!
Running native comparisons gets tricky. Should the HTML5 stuff be running in PhoneGap, or should I write a native iOS version? If native, should I use CoreGraphics or OpenGL for the drawing? I’m not sure anyone would be satisfied unless every angle was covered there, and it could get quite complex.
So exactly what are you comparing? HTML5 running on Browser vs Flash (Installed App)?? Hopefully this is NOT the case, if it is.. then this is just dumb!.
Here are the results with a Nexus S 2.3.4 stock browser with Flash 10.3.
I didn’t bother running the GM2 tests as the GM3 results are indicative of the performance (less than an Atrix, about on par with a Desire HD).
KP, all the tests are visible above to run yourself, everything is running in the browser.
Thanks Nick, I’ll add these results in a bit
Do you have some numbers with Firefox 4 on mobile?
Cedric, does Firefox 4 on mobile support Flash yet? It didn’t last time I checked.
This is awesome. Great job! Thank you for creating this resource.
Please fix the Flash embed so that it can be tested using Flash Player 10.3 on mobile.
Jenia, I’ve been told this problem will be resolved shortly and only affects a couple of devices. There is nothing I can currently do to fix this, as I have no Flash detection running.
Great stuff – thanks for taking the time to run these tests, and write up such a detailed report.
Great evaluation! I’ve just done something quite similar: I tested spritesheet-based animation techniques. Here are my findings: http://blog.krozalski.com/?p=1.
As soon as I have some time I’ll try to port your “bitmap” test to Haxe/NME to see what could be achieved.
Can you add the Asus Eee Pad Transformer as well? It’s the best selling Honeycomb tablet so far…and Xoom is just lame…
Mobile Safari with GPU hardware acceleration will annihilate Flash again in performance when iOS 5 is released. Just because Adobe paid you to make Flash look good, doesn’t mean it’s actually good.
Here are some results for the HTC Incredible S (2):
I also ran into some problems with flash not embedding properly, but running the SWFs by themselves didn’t appear to cause any relevant performance disparities.
@HTML5: Clearly you don’t have sufficient technical background to understand what I’m about to tell you, but I will anyways.
There’s also the shitty garbage collector, and the lack of fast lists (vectors), but I’m not going there.
Video and bitmap rendering speed is irrelevant, both Flash and HTML5 are hardware accelerated, there is no major difference there. The difference is in the CPU usage, and in this case, Flash is clearly the winner. Apple’s just being lame, upsetting their customers and betting on a technology that is clearly inferior and will remain to be so in the foreseeable future.
Sorry for the rant. Peace!
I was not going to say anything, but the last comment pushed me to do so (specifically, the comment that these results imply that “Apple’s just being lame”):
How about power consumption/battery life? After all, this was one of the main reasons Apple made its choice, and these tests do not address that issue.
Kudos to Sean for disclosing Adobe’s funding up front; I would not expect Adobe to fund a benchmark that fully accounted for battery life.
Take these results as an important indicator of performance, but keep in mind that they do not try to address an issue that is more important for many users.
Battery consumption can be optimized. Flash is like DirectX: a company-maintained system with high value to the company. HTML5 is like OpenGL, with many extensions, diverse implementations, and many interests.
One company is very agile; consortiums are usually not agile.
Flash will always beat HTML5 unless Adobe loses interest in Flash.
@Christian: Could you delete all these spam posts? Is there any way to make the comment box require a captcha?
@paul: A real-time application is achieved by processing all data required to display the current state of the game, aka 1 frame, and doing this repeatedly, as fast as possible, thus giving the illusion that the player is watching a movie they can control. This is true for most games (except some turn based games), which means the CPU of your mobile or your PC is CONSTANTLY executing code. When you’re playing a game on your mobile, whether it’s on Flash, HTML5, or an app, your CPU is likely running at 95 – 100% of its potential output.
Conclusion: The battery usage is pretty much THE SAME! (except Flash runs 2x as fast)
Wicked post, but all these need to be re-run now that iOS5 has full hardware accelerated canvas.
And if anyone is wondering, Flash still comes out on top:
We’ve been using the GUIMark tests for our performance benchmark testing on our Storyboard UI Suite. They were very easy to port since it was just moving JS to Lua, and they are really nice examples that have helped us tune our performance. We just posted a video showing them running on an Android platform and thought I’d share.
Thanks for all your work,
“We will no longer continue to develop Flash Player in the browser to work with new mobile device configurations (chipset, browser, OS version, etc.)”
Flash is dead. How does it feel to eat crow now, Flash apologists?
This is a fantastic website and I can not recommend you guys enough.
Can you please retest and post results against PlayBook 2.0? I’m seeing 40fps, which is a large improvement versus prior tests.
It seems the results for the PlayBook are outdated. With OS 2.0, the PlayBook is so much smoother when running those tests.
Is this test dependent on network and its speed?
On desktop, Java was miles ahead in the second version, which was adapted to Java (http://www.jesperjuul.net/ludologist/guimark2-some-html-flash-java-benchmarks). I wonder if a similar result would occur here? On Windows I had to run in classic to enable Java acceleration. If a similar result were shown, HTML5 and Flash would both be very slow. But last I checked HTML5 wasn’t GPU accelerated, and Flash has some kind of vsync enabled, so 60 fps is the max you get (for testing purposes, I mean). Also, HTML5 is very new and everybody wants to adopt it, so I’m sure they’ll tweak it to be on par with past solutions over time. ECMAScript isn’t going anywhere anytime soon though; a lot of people use it and will use it in the future.