Google Native Client – Overview and Performance Experiment

The use of native instructions to supplement a web application is nothing new, but engineers at Google are attempting to bring a fresh, and secure, perspective to the concept. With a novel architecture and security model, Native Client makes for an interesting project with promising results.


Here we go again with another foray into the world of native instructions within a web browser. This is nothing new: we have long dealt with the ubiquitous Netscape Plug-in API (NPAPI) and the much derided ActiveX technology from our friends at Microsoft (among others, of course). Developed in nice, quaint, simple times, these architectures did very little in the way of security. They operated largely on the concept of trust and were given the full run of the PC: its filesystem, networking interfaces, etc. Internet Explorer has come a long way since those days by offering annoying little notification bars at the top of the page alerting you that it needs to run an ActiveX component. I almost never see that (and if I do, I realize I’m in IE and immediately switch back to Chrome or Firefox). Both NPAPI and ActiveX still leave room for social engineering attacks.

We also have the not-so-native technologies for webpages in Java, Microsoft Silverlight, and Adobe AIR (Flash/Flex), all of which isolate the untrusted code they contain from the host operating system’s interfaces. These generally make the lives of developers easier with their “write once, run anywhere” approach and are generally safer, but they still have their fair share of security flaws pop up every now and again.



Google Native Client

NaCl Architecture Diagram

Enter: Google Native Client (NaCl). At a high level, it’s a module of compiled C or C++ code that is executed in a sandbox and provides a computation facility to a standard, JavaScript-based, web application.



What we are getting (at a slightly lower level) is the execution of this compiled code in an x86 sandbox, which provides memory segmentation and isolation from the browser and the host operating system. The sandboxing of the code provides the user protection from unintended instructions as well as protecting the module itself from potential operating system defects (if you believe that any operating system is defect free, then, boy, do I have a bridge to sell you!).

NaCl Module Design



We also see that direct access to the filesystem and operating system resources has been blocked. This is done within the NaCl service runtime, which disassembles the binary module (thus no tricky compilation schemes allowed) and checks the instructions against a white-list; if any illegal instructions are found, the module is rejected. Aside from providing the primitives for memory allocation and deallocation, the service runtime also explicitly prevents networking calls such as connect() and accept().
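To make the white-list idea concrete, here is a toy sketch of the accept/reject decision. It is not how the NaCl validator is actually written: the real check operates on disassembled x86 machine code, while this sketch assumes the module has already been reduced to a list of instruction mnemonics. The `ALLOWED_INSTRUCTIONS` set and the `validateModule` function are hypothetical names made up for illustration.

```javascript
// Hypothetical allow-list of instruction mnemonics (illustrative only, NOT NaCl's real list).
const ALLOWED_INSTRUCTIONS = new Set([
  "mov", "add", "sub", "cmp", "and", "jmp", "call", "push", "pop",
]);

// Check every instruction against the allow-list.
// A single instruction outside the list is enough to reject the whole module.
function validateModule(instructions) {
  const rejected = instructions.filter(
    (mnemonic) => !ALLOWED_INSTRUCTIONS.has(mnemonic)
  );
  return { accepted: rejected.length === 0, rejected };
}

// One module that passes and one that tries to slip in a raw system call.
const cleanModule = validateModule(["mov", "add", "cmp", "jmp"]);
const shadyModule = validateModule(["mov", "syscall", "add"]);
console.log(cleanModule.accepted, shadyModule.accepted, shadyModule.rejected);
```

The point of the allow-list design is that anything not explicitly known to be safe is refused, rather than trying to enumerate everything dangerous.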
Additional features in NaCl include a subset of POSIX threads and SSE instructions for parallel computing. The Pepper interface, also included, provides functions such as compute, audio, native 2D, and other plug-in accessibility features. Common POSIX file I/O is available, but limited to communication channels and web-based read-only content.


Performance Experiment

If you told me that native code would execute faster than JavaScript (and were actually serious about it), I’d probably respond with a dry “Ok, and?” That said, it is still interesting to see how much faster it is, also considering that many typical system calls are provided by the service runtime, rather than the operating system itself.
So what I went and did was create a small NaCl module project for myself. It’s nothing more than a quick and dirty implementation of a merge sort algorithm. NaCl’s strong suit is pure computation, and that’s what I decided to throw at it. I also implemented a merge sort algorithm in JavaScript, and these two implementations would be pitted against one another.
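For reference, a merge sort in JavaScript can be as short as the sketch below. This is not the code from the test page, just a minimal top-down version written for illustration; the function names `mergeSort` and `merge` are my own.

```javascript
// Top-down merge sort: returns a new sorted array and leaves the input untouched.
function mergeSort(values) {
  if (values.length <= 1) {
    return values.slice();
  }
  const middle = Math.floor(values.length / 2);
  const left = mergeSort(values.slice(0, middle));
  const right = mergeSort(values.slice(middle));
  return merge(left, right);
}

// Merge two already-sorted arrays into one sorted array.
function merge(left, right) {
  const merged = [];
  let i = 0;
  let j = 0;
  while (i < left.length && j < right.length) {
    // "<=" takes from the left half on ties, which keeps the sort stable.
    if (left[i] <= right[j]) {
      merged.push(left[i++]);
    } else {
      merged.push(right[j++]);
    }
  }
  // At most one of these loops does any work: copy whatever is left over.
  while (i < left.length) merged.push(left[i++]);
  while (j < right.length) merged.push(right[j++]);
  return merged;
}

console.log(mergeSort([5, 3, 8, 1, 9, 2]));
```

Allocating new arrays at every level is wasteful but keeps the code easy to follow, which is about right for a quick and dirty benchmark subject.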

It’s also kind of boring limiting the results to just Google Chrome (the only browser thus far to implement NaCl), so I crafted the test so that the JavaScript half can run on its own in other browsers.

To get a nice trend of performance given different memory requirements, the test would start at a lower bound for list size and iterate up to the upper bound. I would also allow for multiple cycles of the test and average out the results at each iteration level in an effort to reduce the effects of external processes stealing CPU time. Each run of the sort algorithm would time just the sort, not the creation of the list. I targeted whichever browsers I had on my PC at the time in both Linux and Windows 7 (and thus on the same hardware). For the native call, I also added an option to include the time it took to make the call (sampling the time from JavaScript) or to time just the sort algorithm within the module (sampling the time within C++).
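The test loop described above can be sketched roughly as follows. This is a reconstruction under assumptions, not the harness from the test page: `listSizes`, `randomList`, and `runBenchmark` are hypothetical names, the list sizes are assumed to be evenly spaced between the two bounds, and the sort under test is passed in as a function so that either implementation could be plugged in.

```javascript
// Evenly spaced list sizes from lowBound to highBound (assumed spacing).
function listSizes(lowBound, highBound, iterations) {
  const step = (highBound - lowBound) / iterations;
  const sizes = [];
  for (let i = 0; i <= iterations; i++) {
    sizes.push(Math.round(lowBound + i * step));
  }
  return sizes;
}

// Build a list of random integers; this part is deliberately NOT timed.
function randomList(size) {
  const list = new Array(size);
  for (let i = 0; i < size; i++) {
    list[i] = Math.floor(Math.random() * size);
  }
  return list;
}

// Time sortFn at every list size, repeat for each cycle, and average per size.
function runBenchmark(sortFn, options) {
  const sizes = listSizes(options.lowBound, options.highBound, options.iterations);
  const totals = new Array(sizes.length).fill(0);
  for (let cycle = 0; cycle < options.cycles; cycle++) {
    sizes.forEach((size, index) => {
      const list = randomList(size);
      const start = Date.now(); // the clock starts after the list exists
      sortFn(list);
      totals[index] += Date.now() - start;
    });
  }
  return sizes.map((size, index) => ({
    size: size,
    averageMs: totals[index] / options.cycles,
  }));
}

// Small demo run using the built-in sort as a stand-in for the sort under test.
const demoResults = runBenchmark(
  (list) => list.slice().sort((a, b) => a - b),
  { cycles: 2, lowBound: 100, highBound: 1000, iterations: 3 }
);
console.log(demoResults.map((entry) => entry.size));
```

With a lower bound of 100, an upper bound of 100,000, and 50 iterations, `listSizes` steps by 1,998 elements each time, assuming the spacing really is even.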

Also included in the test are the results of the same C++ algorithm when executed on the command line in Linux. The executable was compiled and linked by the GCC compiler provided with Linux, not with the NaCl toolchain, and thus does not contain any of the NaCl libraries or optimizations.


For the test, I used the following parameters:
Test Cycles: 2
Low Bound: 100
Upper Bound: 100,000
Iterations: 50
Include Native Call in Time: Yes




Average Performance Rate of Increase

The results show pretty much what one would expect, with the native code executing significantly faster than the JavaScript. The only results not included are for Firefox 3.6.15 in Linux (an outlier, too slow) and Internet Explorer 8.0 on Windows XP (way too slow; I lost patience and killed it mid-test).

Ultimately, what I found is that with each test iteration (which for this test meant an increase of 2% of the range, or 1,998 elements in the list), JavaScript would take ~22% longer, NaCl would take ~6% longer, and the command-line execution would take ~10% longer. I did not expect NaCl to outperform the command-line executable. NaCl also showed a negligible 1-2 millisecond overhead for each call to the native module.
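One plausible way to arrive at an "average rate of increase" figure like the ones above is to take the percentage change between each pair of consecutive iterations and average those. I am assuming that definition here, since the exact formula is not spelled out, and `averageGrowthRate` is a hypothetical helper name. The timings in the example are invented to show the arithmetic and are not measurements from the test.

```javascript
// Average relative increase between consecutive timings.
// times[i] is the average sort time (in ms) at iteration i.
function averageGrowthRate(times) {
  let total = 0;
  for (let i = 1; i < times.length; i++) {
    total += (times[i] - times[i - 1]) / times[i - 1];
  }
  return total / (times.length - 1);
}

// Invented timings: 10 ms, then 15 ms (+50%), then 30 ms (+100%).
const exampleRate = averageGrowthRate([10, 15, 30]);
console.log((exampleRate * 100).toFixed(0) + "% longer per iteration on average");
```

A figure computed this way weights every step equally, so a large relative jump at the small end of the range counts as much as one at the large end.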



Compute Time Performance Trend

Compute Time Performance @ 100,000 Elements

Performance Time Trend Among Native Executions

Performance Time Trend Among JavaScript



Native Client Performance Test Page:

Native Client Performance Test C++ Source Code:

NOTE: This code is not production ready and makes a ton of assumptions. I include it here primarily for reference.

Test Results Data:





