Beware of the Natives

Posted by mlam on December 7, 2006 at 2:11 PM PST
There are a lot of not so nice things about using native methods. Here are some:
But if these reasons aren't enough to deter you from using native methods, try this on for size:
This seems to go against most people's expectations, but it is the truth. First of all, there is what goes on in the runtime stacks when you invoke native methods. I've talked about that in my previous articles (here, here, and here). There, I showed that using native methods incurs bootstrapping and extra frame pushing/popping overhead, which degrades performance. But there are also many other reasons besides this.

To be fair, native code can help improve performance when used in the right places. I will explain those cases as well. The key is to use native code "carefully". OK, let's go bust the "native" myth ...

Anything you can do, I can do better!

The phoneME Advanced (CDC) VM, CVM, has a JIT that is far more sophisticated than that. The CVM JIT can yield gains from 4x to 14x (over interpreted bytecodes) depending on the application. The high-end numbers are for very simple applications like the small naive loop. Note: these numbers are very rough estimates that I made using an educated guess based on internal measurement results. Actual performance numbers may vary (lower or higher) depending on the device architecture and benchmarks used.

But is this enough? Can JIT compiled code perform as well as or better than native code? Well, first we need to understand how the JIT gives us some of these huge gains. I'll highlight 2 aspects: inlining and runtime profiling.

In CVM, each method has an invocation counter that tracks its "hotness". The JIT inliner factors the hotness of callee methods into its decision on whether to inline or not. Hence, hotter callee methods are more likely to be inlined, and less hot methods are less likely to be inlined. This means that inlining is selective of the hot code paths that the application actually takes. For example, mA calls mB1 and mB2. mB1 calls mC1. mB2 calls mC2. From profiling, we know that mA calls up to mC2 a lot via mB2, but not so much to mC1.
When compiling mA, this allows us to choose to inline mB1, mB2, and mC2 into mA, but not mC1. This lets us invest the cost of inlining where it will yield the most performance gain, while not incurring as much of that cost in lower-yield areas.

No, you can't!

There are also profilers for C code. But applying these to Java native methods is something else.

Yes, I can!

The problem with this is that it assumes that the captured profile is representative of the actual execution profile of the application at runtime under real usage. A lot of the time, this static approach does yield fairly good results. But it is no substitute for what a JIT can do with a runtime profiler if the application is very dynamic in nature.

Now, factor in that the Java platform is a dynamic environment where new code can be downloaded at runtime. In other words, the callee method may not even be known or available to the C compiler and/or profiler at development time. There is no option to tweak the optimizations for executing code that doesn't exist yet.

Another issue is the Java language's support for virtual methods. With virtual methods, the C compiler won't be able to know which callee method to inline. Most modern JITs (including CVM's) employ a technique called speculative inlining to solve the problem of inlining virtual methods. The JIT inlines an expected callee method based on its hotness. When the compiled method is executed, it first checks whether the method to be called is the expected one that was inlined. If so, it proceeds with executing the inlined code. Otherwise, it does a regular virtual invocation. There are also many other variations on this scheme, but the basic idea is the same. This speculative inlining, together with the runtime profiling, allows virtual methods to be inlined efficiently by JITs.
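The guard-then-fallback shape of speculative inlining can be sketched by hand in Java source. This is only an illustration of the idea, not CVM's actual generated code: the Shape, Square, and Circle types and both method names are hypothetical, and the guard compares the receiver's class as a stand-in for checking which method is about to be called.

```java
// Hypothetical types for illustration only; none of these are CVM or JDK APIs.
interface Shape {
    double area();
}

final class Square implements Shape {
    final double side;
    Square(double side) { this.side = side; }
    public double area() { return side * side; }
}

final class Circle implements Shape {
    final double radius;
    Circle(double radius) { this.radius = radius; }
    public double area() { return Math.PI * radius * radius; }
}

class SpeculativeInliningSketch {
    // What the programmer writes: an ordinary virtual call.
    static double areaVirtual(Shape s) {
        return s.area();
    }

    // A hand-written picture of what a JIT might produce, ASSUMING the
    // runtime profile showed Square.area() to be the hot callee here.
    static double areaSpeculative(Shape s) {
        if (s.getClass() == Square.class) {   // guard: is it the expected callee?
            Square sq = (Square) s;
            return sq.side * sq.side;         // inlined body of Square.area()
        }
        return s.area();                      // otherwise: regular virtual invocation
    }

    public static void main(String[] args) {
        Shape[] shapes = { new Square(2.0), new Circle(1.0) };
        for (Shape s : shapes) {
            // Both paths must agree, whichever branch the guard takes.
            System.out.println(areaVirtual(s) == areaSpeculative(s));
        }
    }
}
```

The point of the sketch is that the fast path costs one comparison, while correctness never depends on the guess being right: an unexpected receiver simply takes the slower virtual call.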
It would be difficult for a C (or C++) compiler to implement speculative inlining efficiently without the help of a runtime profiler. The possible approaches are to inline everything blindly or to use some heuristics. The blind approach results in too much code bloat without necessarily yielding results. Code bloat also hurts cache locality and, therefore, performance. The heuristics approach is a blind guess. You may have a winner or a loser, and you won't know for sure until after your code has been deployed.

Anything I can do, I do better than you!

Secondly, in order to be at least somewhat portable, native methods need to be written using an API like the Java Native Interface (JNI). If you look at the JNI specification, you'll see that every access to Java data structures like methods and fields is through an indirect function call through the JNI environment interface. Hence, if your method needs to manipulate Java object fields on a regular basis, the field accesses will incur a lot of overhead in the form of multiple function calls. In contrast, JIT compiled code can access the same object field just like C accesses a struct field, i.e. in a few machine instructions or even a single one. The JNI overhead is on the order of something like 20x to 100x. That's a stiff penalty.

To be proper and correct, a lot of JNI API calls need to be followed by an explicit exception check using JNI's ExceptionCheck() API. For example, this needs to be done after calls to all method invocation and allocation APIs. This is because the VM has the right to throw an OutOfMemoryError at any of these junctures. Without the check, the native method may be proceeding in an unstable environment, or using memory where none is available. That would simply be bad programming. These checks add overhead. In JIT compiled code, these checks are not explicit, i.e. no time is spent on executing any such checks.
The VM simply handles the exception automatically. The only case where the VM can't handle it automatically is when it encounters a native method in the call chain. Go figure! That's when it has to hand control over to the native method, and let the native method do the check explicitly.

Apart from these, there are other costs to using native methods that hurt performance. These include the marshalling costs of method arguments and return values. Remember that CVM uses 2 stacks: a native stack and a Java stack. Arguments need to be marshalled across these stacks. Even JavaSE's VM, which uses only one stack, incurs additional marshalling costs when crossing from Java code to native, and vice versa. However, its cost is less than CVM's.

Another cost is that of VM state changes. In CVM, a thread is always in one of 2 states: GC safe or GC unsafe. I'll leave the meaning of these states for a later day when I discuss the GC (garbage collector). But for reasons that are necessary but not explained here, native methods need to operate in a mostly GC safe state, while for performance reasons, compiled code operates in a mostly GC unsafe state. Going across that boundary from compiled Java code to a native method incurs some additional overhead in terms of these state changes. The JavaSE VM also has similar (but different) VM states, and the issue exists there as well.

So, if you don't have a good reason for using a native method, then don't. Chances are your native code will be slower, and you may also get sloppy and introduce bugs (such as forgetting to do that exception check at the proper place). Not only can native methods not be optimized by the JIT, they also prevent the JIT from doing as good a job as it can when there are no native methods in the call chain.

Why Bother?

For the application developer, it is usually best to just write in the Java language. There is one exception to this rule of thumb.
If the type of computation that needs to be done by the native method does not involve a lot of access to Java objects and does not call Java methods frequently, then you may have a candidate for a native method. First, you should make sure that there is actually a benefit in writing that method as a native. An example of such a method would be a ZIP library method that decompresses a compressed file. The decompression works purely on a C array and does not return to the Java domain until its job is done. In other words, in order to do its job, it does not need to cross the boundary between the C and Java domains frequently. Generally, one big source of the performance cost of native methods comes from the crossing of that domain boundary.

Use the Standard Java Libraries

There is another incentive to use the standard libraries instead of rolling your own native. I explained that native methods are not only themselves not optimal, but also stop the JIT from doing its work optimally. This is because the JIT has no knowledge of what's in the native methods. However, for native methods in the standard class libraries which need to be native, there is a solution that allows the JIT to still perform optimally. This solution is the use of intrinsic methods (commonly abbreviated as intrinsics). Support for intrinsics is available in both CVM and the JavaSE VM.

Intrinsics are methods whose semantics are already known by the VM and its JIT at VM development time. Hence, the only native methods that can be intrinsics are the ones in the standard libraries. Amongst other benefits, intrinsics effectively allow JITs to inline native methods (in some form) where it couldn't be done before. Hence, intrinsics allow native (OS and hardware) resources to be accessed without necessarily incurring all the costs of a native method. JITs will still be able to generate optimal code around calls to these intrinsic native methods.
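To make the "use the standard libraries" advice concrete, here is a minimal sketch that goes back to the ZIP example and uses the standard java.util.zip classes instead of a hand-rolled native method. The class and method names (ZipRoundTripSketch, compress, decompress) are hypothetical. It assumes a VM whose library provides Deflater and Inflater, and it makes no claim about whether any of these calls is treated as an intrinsic on a given VM. What it shows is the coarse-grained pattern described above: each call hands over a whole buffer, so the code crosses into the library only a handful of times.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper class for illustration only.
class ZipRoundTripSketch {
    // Compress a whole buffer, one chunk-sized library call at a time.
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(chunk);
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release the compressor's resources
        }
    }

    // Decompress a whole buffer; bad input is reported as an unchecked exception.
    static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(chunk);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or dictionary-dependent input");
                }
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not valid compressed data", e);
        } finally {
            inflater.end(); // release the decompressor's resources
        }
    }

    public static void main(String[] args) {
        byte[] original = "beware of natives, beware of natives, beware of natives"
                .getBytes(StandardCharsets.UTF_8);
        byte[] restored = decompress(compress(original));
        System.out.println(Arrays.equals(original, restored));
    }
}
```

An application written this way contains no native code of its own, so how the library implements the work underneath is left to the platform developers.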
You won't get the benefit of intrinsics if you roll your own native methods to access those resources.

Having said this, not every native resource can be accessed through intrinsics. The choice of which methods to make intrinsics depends on how the VM and library developers choose to optimize for the device. There are tradeoffs involved and a cost to having intrinsic methods. Therefore, it is neither possible nor wise to make every method in the standard library an intrinsic. However, this choice is usually only made by platform developers. Application developers have no direct control over this, nor any awareness of which methods are intrinsics.

Final Word

So, beware of natives. :-)