


Kelly O'Hair

Kelly O'Hair's Blog

Bytecode Instrumentation (BCI)

Posted by kellyohair on May 13, 2005 at 12:09 PM | Comments (13)

When I was given the assignment of converting the old JDK 1.4.2 HPROF agent library from the experimental JVMPI to the new "JVM TI" in JDK 5.0, it was with the understanding that this new HPROF profiler needed to do bytecode instrumentation (sometimes called "bytecode injection" or "bytecode insertion") to capture method entry, method exit, object allocation, and object free events. I remember this being a bit of a scary thought: taking the classfile images and recreating them with new bytecodes inserted to track events while the class was running. I also had no idea how to estimate how long this would take. I did have a good understanding of how classfiles were laid out, having some history with fastjavac and the old Java Workshop project, but it still seemed a bit like brain surgery, not necessarily because of the complexity of the code but because of the potential for complete meltdown if the least little thing was done incorrectly.

Well, it turned out to not be as bad as I thought, and with a little help and donated code from Robert Field, I was able to create a small native library that does some basic BCI called java_crw_demo. This library is used by HPROF in JDK 5.0+ when doing BCI.

Turns out that this little library was very handy when it came to writing some demos of JVM TI and using BCI. The source to this library is provided in the demo/jvmti directory, so people are free to browse this C source. The java_crw_demo native library is a primitive classfile transformation library that will insert bytecodes at selected and limited locations in methods, returning a new classfile image. It is important that you understand the classfile layout as described in the Java Virtual Machine Specification.
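
To make the layout a little more concrete, here is a minimal sketch in Java (not taken from java_crw_demo, which is C) that walks just enough of a classfile image to recover the class name: the magic number, the version, the constant pool, the access flags, and the this_class index. The class and method names are made up for illustration, and the entry sizes are my reading of the constant pool formats in the Java Virtual Machine Specification, including tags that were added after JDK 5.0.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class ClassNameReader {
    /** Returns the internal class name (for example "java/lang/Object") of a classfile image. */
    public static String className(byte[] image) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(image));
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a classfile: bad magic number");
            }
            in.readUnsignedShort();                    // minor_version
            in.readUnsignedShort();                    // major_version
            int count = in.readUnsignedShort();        // constant_pool_count (entries + 1)
            String[] utf8 = new String[count];         // CONSTANT_Utf8 values by index
            int[] classNameIndex = new int[count];     // CONSTANT_Class name_index by index
            for (int i = 1; i < count; i++) {          // index 0 is never used
                int tag = in.readUnsignedByte();
                switch (tag) {
                    case 1:                            // Utf8: u2 length + modified UTF-8 bytes
                        utf8[i] = in.readUTF();
                        break;
                    case 7:                            // Class: u2 name_index
                        classNameIndex[i] = in.readUnsignedShort();
                        break;
                    case 8: case 16: case 19: case 20: // String, MethodType, Module, Package
                        in.skipBytes(2);
                        break;
                    case 15:                           // MethodHandle
                        in.skipBytes(3);
                        break;
                    case 3: case 4: case 9: case 10:   // Integer, Float, Fieldref, Methodref,
                    case 11: case 12: case 17: case 18: // InterfaceMethodref, NameAndType, (Invoke)Dynamic
                        in.skipBytes(4);
                        break;
                    case 5: case 6:                    // Long, Double: 8 bytes and TWO pool slots
                        in.skipBytes(8);
                        i++;
                        break;
                    default:
                        throw new IllegalArgumentException("unknown constant pool tag " + tag);
                }
            }
            in.readUnsignedShort();                    // access_flags
            int thisClass = in.readUnsignedShort();    // index of a CONSTANT_Class entry
            return utf8[classNameIndex[thisClass]];
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Note the two-slot rule for long and double constants; forgetting it is an easy way to misread everything that follows in the pool.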

WARNING: Do not take on a task like BCI lightly. I wrote java_crw_demo because I couldn't find a C version of a BCI library that met my needs. I would highly recommend you investigate the freely available BCI libraries out there before taking on the task of writing a new one. Having said that, I know a few people like myself suffer from insanity at times and will attempt this effort, so I've accumulated a few tips for the beginner.

I haven't mentioned how or when you get the classfile images. There are a variety of ways, including:

  • Just modify the class images on disk and change the classpath setting.
  • Capture the class image in memory with the JVM TI ClassFileLoadHook event and return back a modified class image. This is what HPROF and the demo agents in JDK 5.0+ do.
  • Redefine the class on the fly with JVM TI RedefineClasses.
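
For readers who would rather stay in Java, the java.lang.instrument package offers a Java-level counterpart to the load-hook approach. The sketch below is not what HPROF or the JDK demo agents do (they are native JVM TI agents); it only shows the general shape of "receive the class image, hand back a new one, or decline". The class name SelectiveTransformer and the placeholder instrument method are hypothetical, and the placeholder returns an unmodified copy rather than doing any real BCI.

```java
import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;

public class SelectiveTransformer implements ClassFileTransformer {
    @Override
    public byte[] transform(ClassLoader loader, String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        // The name can be missing, and this sketch chooses to leave core classes alone.
        if (className == null || className.startsWith("java/")) {
            return null; // null means "no transformation" for this class
        }
        return instrument(classfileBuffer);
    }

    // Hypothetical placeholder: a real agent would rewrite the image here.
    static byte[] instrument(byte[] image) {
        return image.clone();
    }
}
```

Such a transformer would normally be registered from an agent's premain method with Instrumentation.addTransformer.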

Some of the common issues you may encounter doing classfile transformations are:

  • Additions to the constant pool will be needed. It is easiest if these are added at the end of the constant pool. Don't forget to set the constant pool count correctly in the classfile header. If you do change the constant pool order or if you exceed 256 constant pool entries, watch out for the ldc bytecodes; some may need to be changed to ldc_w bytecodes. I highly recommend just adding to the end of the constant pool. If you need to push a constant larger than 16 bits onto the stack, you will need a constant pool entry for it.
  • Adding bytecodes can also cause bytecodes that precede the instrumented location to need to change. In addition, things like "ifeq" may need to be changed into an "ifne" plus a "goto_w" if the addition of code pushes the target more than a 16-bit offset away.
  • Adding bytecodes can cause some of the bytecodes that follow to need different bytecodes to deal with changing ranges, e.g., a jsr vs. a jsr_w instruction. Special care needs to be taken when re-constructing the new Code attribute that all the bytecodes, both inserted and original, are using the correct wide and '*_w' bytecodes.
  • Changes to the bytecodes mean that the offsets in the exception table of the "Code" attribute and in the "LineNumberTable", "LocalVariableTable", and (new for JDK 5.0) "LocalVariableTypeTable" attributes will be invalid. You will need to adjust all these offsets. Be careful with offset 0: you may want 0 to remain 0 for the local variables or the line table, but maybe not for the exception table.
  • If the inserted bytecode causes or could cause the maximum stack size to increase, the "Code" attribute will need the max_stack field adjusted along with the code_length. If you add local variables, you will also need to adjust max_locals.
  • Insertion of bytecodes at the very beginning needs to be done carefully; consider a jump to offset 0 in the original bytecodes. You need to decide if the inserted bytecodes at 0 will be executed once on entry, or also when the method does a jump to offset 0. So inserting bytecodes for method entry is not the same as inserting bytecodes at offset 0.
  • Insertion before return bytecodes can also be tricky. If in the original bytecodes this return bytecode is a target of a jump, do you want the inserted bytecodes to be executed? (Hint: Answer is yes. :^). You need to make sure that the old jumps to the return bytecode now jump to the inserted bytecodes.
  • The special case of the new bytecode is a less-than-obvious problem. Objects that have not been initialized cannot be passed to ANY Java methods, so doing the obvious injection of a dup and an invokestatic after the new bytecode will not work when bytecode verification is on. This object must be initialized first. So if you wish to capture newly-allocated objects, the best place to catch them is in the java.lang.Object.<init> method. Once the object makes it here, you can pass the object around (or of course you could run Java with the verifier off, but that generally isn't a good idea). An alternative is to try and find the appropriate <init>. The newarray bytecode doesn't have this problem, and in fact the only way to capture these objects is by inserting bytecodes immediately after the newarray bytecode (don't forget the anewarray and multianewarray bytecodes).
  • It is best to insert the fewest bytecodes possible, and in java_crw_demo, the insertion is limited to pushing a few items on the stack, and making a static method call to a class found in the boot classpath. This so-called tracker class and its tracker methods contain Java code that, in our demos, grabs the current Thread with a call to java.lang.Thread.currentThread, and passes all the arguments, plus the current thread, to a native method belonging to the class. The agent will have registered the natives for this class and those native functions are actually static functions inside the agent library.
  • Any bytecode insertion needs to be careful of the state of the VM when it is called, e.g., has the VM started, or has the VM been initialized. The downside to the JNI call in the Tracker class is that you can't make the JNI call unless the VM is started and the natives have been registered, and the downside to calling currentThread is that before the VM is initialized, this thread reference could be null.
  • Once past the VM initialization, any JNI usage and all the native code in the agent library need to watch out for JVMTI_EVENT_VM_DEATH, making sure the code can recover cleanly.
  • You may wish to be selective in what classes or methods you apply BCI to, depending on what information you are after. It is always best to limit the intrusion of BCI on the application.
  • There is the idea of a system class in java_crw_demo, probably badly named. These system classes are treated specially: <init> methods of length 1, finalize methods of length 1, <clinit> methods, and the java.lang.Thread.currentThread() methods will not have BCI applied to them. This may not be necessary, and the <clinit> part may be masking allocations. Fortunately, HPROF tells java_crw_demo that a class is a system class for only a few classes, fewer than a dozen, and only when the class is being loaded during or before VM initialization.
  • Creating too much new bytecode can cause stack overflow errors in the VM prior to VM initialization, so for the early loaded classes you need to take special care about what inserted bytecode is executed and what that bytecode is doing when the VM is not fully initialized yet.
  • It's possible at ClassFileLoadHook time that you won't have a classname. This is a pain if you need to track what has been BCI'd. You need to dig the name out first using java_crw_demo_classname(), which parses the classname out of the classfile. This does not happen often in the real world.
  • As new classfile versions are created by newer JDKs, they may contain attributes that may need to be adjusted due to bytecode offset changes. I haven't figured out a clean way to handle this, and in fact this java_crw_demo library currently ignores the classfile version number, but that is actually a bug that should be filed. Any BCI processor should probably be well aware of what classfile versions it can work on.
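
The offset bookkeeping described in several of the items above can be sketched in Java as follows. This is only an illustration under simplifying assumptions, not the java_crw_demo code: the names OffsetMap, build, relocateBranch, and needsWide are hypothetical, and the sketch assumes the original instructions keep their sizes. The idea is to build a table from old offsets to new offsets once, then use it to fix up branch deltas (and, in the same way, exception table, line number, and local variable offsets).

```java
public class OffsetMap {
    /**
     * insertedBefore[i] is the number of bytes injected immediately before the
     * original instruction at offset i (0 where nothing is injected).
     * The result maps each old offset to the new offset of that same original
     * instruction; the extra last entry maps the old code length to the new one.
     */
    public static int[] build(int[] insertedBefore) {
        int n = insertedBefore.length;
        int[] newOffset = new int[n + 1];
        int shift = 0;
        for (int i = 0; i < n; i++) {
            shift += insertedBefore[i];
            newOffset[i] = i + shift;
        }
        newOffset[n] = n + shift;
        return newOffset;
    }

    /**
     * Recomputes a relative branch delta. 'from' is the old offset of the branch
     * instruction and 'delta' its old relative target. If runInjected is true the
     * branch lands on the start of any code injected before the target (what you
     * want for code placed before a return); if false it lands on the original
     * instruction (one possible choice for method-entry code at offset 0).
     */
    public static int relocateBranch(int[] newOffset, int[] insertedBefore,
                                     int from, int delta, boolean runInjected) {
        int target = from + delta;
        int newTarget = newOffset[target];
        if (runInjected) {
            newTarget -= insertedBefore[target];
        }
        return newTarget - newOffset[from];
    }

    /** True if the delta no longer fits the signed 16-bit operand of a normal branch. */
    public static boolean needsWide(int delta) {
        return delta < Short.MIN_VALUE || delta > Short.MAX_VALUE;
    }
}
```

Keep in mind that switching a branch to its wide form changes the size of that instruction, which shifts everything after it, so a real implementation has to repeat the layout until no more branches need widening.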

I'll add to this entry as new issues come up.


Comments
Comments are listed in date ascending order (oldest first)

  • With respect to that last point about attributes, J2ME and Mustang stack map attributes should (presumably) be discarded. I don't think it is safe to permit any unknown, non-standard attribute past.

    Posted by: tackline on May 13, 2005 at 03:16 PM

  • The java_crw_demo library currently nulls out the body (number_of_entries==0) of any StackMap attribute it sees, and according to the CLDC spec that I read this means the same thing as no StackMap attribute at all (it was easier to null out than remove the attribute). This creates a supposedly 'implicit stack map', but recently I've discovered that this is probably wrong for any methods with more than one basic block. I should probably go through the StackMap attribute and just fix up the offsets; I don't think discarding or nulling out the attribute works. Unless you have some more detailed information on this?

    The Mustang release (JDK 6.0) may be introducing a new variation on the StackMap attribute (with a different name and slightly different format, but the same basic purpose). When those details are more concrete I will post that information here.

    Thanks for the comment.

    Posted by: kellyohair on May 13, 2005 at 03:40 PM

  • Why would you discard standard attributes? I guess StackMap must be recalculated.

    By the way, is StackMap actually going to be used in Mustang?

    Posted by: euxx on May 13, 2005 at 06:20 PM

  • Yes, StackMap needs to have its offsets adjusted, and no, Mustang won't be using the same "StackMap" attribute but a new attribute with a new name, "StackMapTable", and a slightly different but more compact format.

    There has always been a statement in the Java Virtual Machine Specification on the classfile 'Code Attribute' that its attributes could all be silently ignored, things like the LineNumberTable and the LocalVariableTable. That statement made me think that the StackMap had to be optional too. I've since been told that the StackMap attributes in all cases must be present for the verifier to work.

    Posted by: kellyohair on May 13, 2005 at 06:51 PM

  • Kelly, is there any information about the StackMapTable attribute format already? Or should I dig into the Mustang sources for that?

    Posted by: euxx on May 15, 2005 at 08:05 AM

  • CLDC's StackMap and verifier documentation are available at: http://jcp.org/aboutJava/communityprocess/final/jsr139/index.html

    The Appendix1-verifier.pdf in the downloaded zip should provide the basics on the StackMap concept. The Mustang StackMapTable is a more compact format but provides the same basic information. As soon as I have something I can provide I will add to this posting.

    Posted by: kellyohair on May 26, 2005 at 11:28 AM

  • For those who are interested, I've completed a StackMapTable implementation for the ASM bytecode toolkit. The code is in CVS at asm.objectweb.org. Unfortunately, at the moment there is no facility to recalculate the content of this attribute from scratch.

    Posted by: euxx on July 01, 2005 at 08:44 PM

  • When the latest B45 JDK source snapshot is available, you can look at the file j2se/src/share/demo/jvmti/java_crw_demo/java_crw_demo.c (search for StackMapTable) for how this library dealt with the StackMapTable attribute.

    Posted by: kellyohair on July 26, 2005 at 03:21 PM

  • I assume you looked at ASM and/or BCEL and decided they were not suitable?

    Posted by: haruki_zaemon on September 29, 2005 at 05:01 PM

  • Why does java_crw_demo use malloc & friends instead of JVMTI Allocate/Deallocate? Seems like this invites problems, e.g., the one reported in http://forum.java.sun.com/thread.jspa?threadID=573903&messageID=3724822 or possible blocking by halted JNI code.

    Posted by: bobfoster on December 04, 2005 at 11:47 AM

  • The java_crw_demo library is independent of JVMTI and JNI. It's a general purpose class file instrumentation library. The problem of always making sure that memory allocated by a particular heap allocation implementation is freed by that same implementation is a long-standing one. The Windows runtime libraries just make the problem worse, especially when the allocations and frees happen in different DLLs.

    The HPROF agent also uses malloc/free, but that was mostly due to the need for a realloc(), which JVMTI did not have. Keeping these different memory allocations separate is an annoying issue; it makes you appreciate Java and garbage collectors. ;^)

    Posted by: kellyohair on December 05, 2005 at 09:01 AM

  • Your code is very interesting for educational purposes. But it isn't complete:
    1. The method ReturnSite() is never called for methods throwing exceptions (in this case, there is no return instruction, only an athrow instruction).
    2. The method ReturnSite() is never called if a method doesn't handle an exception. (The solution is to modify the exception table and to add finally code.)
    Note that I've written similar code using BCEL to program a javaagent profiler. As I found this bug in my code, I tried your code. If this remark can help someone ...

    Posted by: sylvainmarechal on January 13, 2006 at 08:09 AM

  • The HPROF library used the JVM TI exception events to deal with exceptions, so it wasn't an issue for the original consumer of java_crw_demo. You are correct that the abnormal returns are not handled, and that was by design. What you suggest is possible, but any changes to the exception table may require more serious adjustments to the JDK 6 StackMapTable attribute, just more code, but be careful on that when JDK 6 is released.

    Thanks for the comment.

    -kto

    Posted by: kellyohair on January 13, 2006 at 11:58 AM




