BufferedImage as Good as Butter, Part II
Posted by chet on August 21, 2003 at 01:36 PM | Comments (20)
Part II, in which we discuss the internal performance implications
of said image type
Let's dive briefly into some of the performance implications with
BufferedImages (because I'm writing this and I tend to wind up in the performance
arena no matter what I'm talking about).
There is currently only one image type that we guarantee acceleration on
(if possible on the runtime platform): VolatileImage. When you create one
of these images, we try to stash it in accelerated memory (such as VRAM
on Windows, or an accelerated pixmap on Unix) and then perform rendering
operations to and from that image using any available hardware acceleration.
VolatileImages work well for things like back buffers (ala the Swing back buffer,
which is now a VolatileImage), where you obviously want to render to
them frequently and copy from them as fast as possible. But for your average
image, managing a VolatileImage can be tiresome (you have to make sure it's
there before and after you use it), and you can't get all the flavors
of images you want (currently only opaque volatiles exist).
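To make the "before and after" bookkeeping concrete, here is a minimal sketch of the usual validate/contentsLost loop. The class name VolatileBackBuffer and its paintScene placeholder are made up for illustration; only the VolatileImage and GraphicsConfiguration calls are real API. It assumes you repaint the whole buffer every frame, so a restored buffer needs no special handling.

```java
import java.awt.Color;
import java.awt.Graphics;
import java.awt.Graphics2D;
import java.awt.GraphicsConfiguration;
import java.awt.image.VolatileImage;

// Hypothetical helper (not part of the JDK): keeps one VolatileImage back buffer alive.
final class VolatileBackBuffer {
    private VolatileImage buffer;

    /** Draws one frame into the buffer, then copies the buffer to the target. */
    void renderFrame(Graphics target, GraphicsConfiguration gc, int width, int height) {
        do {
            // Check BEFORE use: (re)create the buffer if it is missing, has the
            // wrong size, or is no longer compatible with the device.
            if (buffer == null
                    || buffer.getWidth() != width
                    || buffer.getHeight() != height
                    || buffer.validate(gc) == VolatileImage.IMAGE_INCOMPATIBLE) {
                if (buffer != null) {
                    buffer.flush();
                }
                buffer = gc.createCompatibleVolatileImage(width, height);
            }
            // Everything is repainted on each pass, so IMAGE_RESTORED needs no extra work.
            Graphics2D g = buffer.createGraphics();
            try {
                paintScene(g, width, height);
            } finally {
                g.dispose();
            }
            target.drawImage(buffer, 0, 0, null);
            // Check AFTER use: if the contents were lost mid-frame, go around again.
        } while (buffer.contentsLost());
    }

    /** Placeholder scene (an assumption for this sketch): a solid blue fill. */
    private void paintScene(Graphics2D g, int width, int height) {
        g.setColor(Color.BLUE);
        g.fillRect(0, 0, width, height);
    }

    /** Returns the current buffer, or null if no frame has been rendered yet. */
    VolatileImage current() {
        return buffer;
    }
}
```

In a real component you would typically pass getGraphicsConfiguration() as gc and the Graphics handed to your paint method as target.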
But think about your average application: there's the back buffer and
screen that you write to often, so you want those rendering-to operations to
go really really fast. But then there's a bunch of other images
like icons, sprites, whatever which you really only write to
once or occasionally, but from which you would like to copy often.
This is where Managed Images come in (our new catch-phrase which,
roughly translated,
means "we will try our darnedest to accelerate this for you"). Here, you
create an image however you need to, start working with it, and
internally we will recognize that these copying operations can
go much faster using an accelerated version, so we will just create that
cached version for you. You don't need to manage the image and you don't
need to know how these operations are happening; you just keep calling
your rendering operations and let us take care of the pesky details.
Now, the fine print: as of 1.4.*, we hooked out only certain parts of
the API to create these managed images. Specifically, you need to
create an image either by calling one of the create*Image methods:
GraphicsConfiguration.createCompatibleImage(w, h)
GraphicsConfiguration.createCompatibleImage(w, h, transparency)
Component.createImage(w, h)
or by loading the image, ala:
Toolkit.getImage(...)
You can also use
new ImageIcon(...).getImage()
(because ImageIcon currently uses Toolkit.getImage() under the hood, with
the old MediaTracker functionality thrown in for free). Images that you get from
other means, such as ImageIO-created images or any image created explicitly by calling new BufferedImage(), are not managed, and thus will not benefit from under-the-hood acceleration possibilities.
So in the current implementation of Java2D, the advice posted by ajsutton in response to Part I of this BufferedImage article is well-taken: if you want to take advantage of possible acceleration for your image, use a compatible image (or one of the other means above). Then we will attempt to accelerate this for you. Another benefit of using a compatible image is that you will get an image that is "compatible" (thus the name of the method above) with the display device you are rendering to, which saves pixel format conversion during the copy loops.
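If you already have an image from one of the unmanaged routes (ImageIO, new BufferedImage()), one way to follow this advice is to copy it once into a compatible image and use the copy from then on. The sketch below is only an illustration: CompatibleImages and toCompatible are made-up names, and whether the result actually gets accelerated still depends on the caveats in this article.

```java
import java.awt.Graphics2D;
import java.awt.GraphicsConfiguration;
import java.awt.image.BufferedImage;

// Hypothetical helper (not part of the JDK).
final class CompatibleImages {
    private CompatibleImages() {
    }

    /** Copies src into a new image created by the given GraphicsConfiguration. */
    static BufferedImage toCompatible(BufferedImage src, GraphicsConfiguration gc) {
        // Ask the device configuration for an image in its own pixel format,
        // keeping the source's transparency mode.
        BufferedImage dst = gc.createCompatibleImage(
                src.getWidth(), src.getHeight(), src.getTransparency());
        Graphics2D g = dst.createGraphics();
        try {
            // One-time copy; any pixel format conversion happens here, not on every blit.
            g.drawImage(src, 0, 0, null);
        } finally {
            g.dispose();
        }
        return dst;
    }
}
```

The gc would usually come from Component.getGraphicsConfiguration() or from GraphicsEnvironment.getLocalGraphicsEnvironment().getDefaultScreenDevice().getDefaultConfiguration().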
(A further caveat is that not all image types that you get from the above methods are acceleratable. For example, if you create an image with the flag Transparency.TRANSLUCENT then we do not currently accelerate that image and you end up going through software rendering loops regardless. Look for this to change as the library evolves and we try to accelerate more and more standard yet nifty features of the API).
Okay, so that's the state of things now: use BufferedImage for all of your condiment image needs, and if performance is particularly important to you, then using one of the variants mentioned above is the way to go. But what about the future?
Gosh, I'm glad you asked that. What a great question.
You may already be wondering, in reading the above explanations and caveats: "Why can't they just accelerate everything? Why are only portions of the API managed?" In fact, this is totally correct; there is nothing preventing this from happening (other than the most obvious of reasons: time to implement and lots of other stuff that we've been working on in the meantime). For example, let's say you have a 16-bit BufferedImage you created from scratch and you want to copy it to a 32-bit display. This means that we have to do a pixel-format conversion, so we can't cache the 16-bit version, right? But we can cache a new 32-bit version; we just copy the 16-bit version to our new 32-bit cached version, and then simply use the cached version thereafter.
Starting in jdk1.5 (currently in the oven, baking for a while, available at some unspecified (by me) date in the future), we will manage a much wider array of images. In fact, most of the images you can create or load will be managed for you. The code is in there, I've seen it working with my own eyes: BufferedImage objects running as fast as compatible images. It's pretty sweet...
That's all for now. I've glossed over many of the details of images and acceleration, but hopefully I've given a taste of how accelerated images work today and in the future. And hopefully you will be able to use this information to get the fastest and tastiest BufferedImage applications possible.
Comments
Comments are listed in date ascending order (oldest first)
-
What a great entry
Chet,
I love reading about your adventures in performance and design. Keep up the good work!
Posted by: ocean on August 21, 2003 at 03:57 PM
-
Excellent explanation
Thanks for this article. It really helps to explain some of the performance characteristics that I've seen in more detail.
One question though: you mention that in 1.5 new BufferedImage() comes out as fast as createCompatibleImage(). I would assume this is only in the (typical) case where you create the image once and reuse it a lot. My app happens to create images to be used in a JTextPane, and the loading of content into the JTextPane (when the images are created) is the part we really want to optimize. The image is also recreated and changed quite often as the user edits things in the JTextPane.
In such a situation, would you expect new BufferedImage() to still be comparable speed-wise to createCompatibleImage()?
Not that I plan to go back and change that code anyway, but it would be interesting to know.
Posted by: ajsutton on August 21, 2003 at 05:13 PM
-
Excellent explanation
There are a couple of things (well, okay: more than a couple) that can affect the performance of a BufferedImage. You might be running into something like these:
- If you ever grab a Raster from a BufferedImage, you've pretty much punted on any possible acceleration of that image. This is because, in order to accelerate that image, we need to know when the image has been updated (so that we can update our accelerated copy of the image). If you grab the Raster object, then you can get a handle to the Java array that holds the data of the image, and we don't have any way of finding out when you've touched that array. So at the point where you grab the Raster or DataBuffer, we throw up our collective algorithmic hands and just do operations to/from that image in our software rendering loops from then on. (Note: we may find some ways to mitigate this situation in future releases, but that's the situation as it stands now).
- If you are updating an image often (even without grabbing said Raster), then we are constantly having to update the cached version of the image. In the extreme example, if you update the image once per time that you copy from it, then we never actually cache it because it's not worth the effort/time to store and use the cached version. But even if you're not doing it quite that often, you may see some performance problems if we only get to use the accelerated version some small number of times for every time we have to update that cache.
- Acceleration depends on your platform. For example, if you are on Windows, we are probably caching the images in VRAM and using DirectX to accelerate the copy operations to Swing's back buffer. However, since DirectX images are volatile (can go away at any time), the original copy of the image is stored in system memory, thus all rendering operations TO that image happen through software routines. So you could be experiencing less performance than you might expect simply because frequent renderings to the images are unaccelerated, thus the overall performance of the images may be adversely affected.
Make sense? Hopefully you can find something in here that applies to your situation. Or at least some food for thought.
Chet.
Posted by: chet on August 21, 2003 at 06:56 PM
-
re: "managed image"
Chet,
Given what you've written about "managed images", I don't see any reason to prefer to use VolatileImage in its place. Do you agree?
Posted by: drlaszlojamf on August 29, 2003 at 10:05 AM
-
re: "managed image"
Absolutely correct ... mostly.
There are at least 3 situations in which you might want to use a VolatileImage versus a Managed Image:
- you want to change the contents often (this behavior on a Managed Image could cause poor performance overall due to either our inability to accelerate this dynamic image or our having to constantly copy down new versions of it to the accelerated memory).
- you want to take advantage of hw acceleration for rendering operations TO the image (Managed Images usually have their primary copy in system memory because their accelerated memory version can become "lost" at any time (especially on Windows). Therefore, we have to store the primary in a lossless area and thus use software rendering loops when rendering to the image).
- you want more control over how and when the image gets accelerated (with Managed Images, we do as much as we can behind the scenes. This works out great if you don't care about the details, but if you are trying to manage your performance very carefully, you may want the acceleration details to be more explicit).
Otherwise, if the above situations do not apply to your situation, Managed Images are your friends and I would agree that you probably do not want or need the extra management hassles of VolatileImage.
Actually, I was wondering what I should ramble on about in my next blog; this sounds like a pretty good candidate...
Chet.
Posted by: chet on August 29, 2003 at 12:47 PM
-
re: "managed image"
Thanks for an enlightening blog, Chet.
I'm really looking forward to that "rambling", as well as to improved control over what is actually taking up the accelerated memory. I'm working with accelerated alpha-blending using images and sun.java2d.translaccel=true. It works like a beauty at 60 f/s with full-screen images. But when I introduce a smaller image and rescale it to full screen, it is obviously not accelerated (less than 10 f/s).
Control over this type of acceleration would be really nice (to put it mildly), but for now I understand HW accelerated alpha blending is only available with managed images...
PerA
Posted by: pera on September 01, 2003 at 06:13 AM
-
re: "managed image"
PerA,
You're probably running into the fact that transform operations with images are not accelerated, either with managed images or Volatile images. And if you're doing this with translucent images, that's going to require readback from VRAM (for the buffer in VRAM that you are copying into), which means that either that buffer will be punted in system memory (we detect the horrible slow-vram-readback situation and punt in some situations) so you suffer poorer performance on other operations that _could_ be hw accelerated, or the buffer is staying in VRAM, but you are suffering the slow readback problem on those specific image operations.
(Side note: to make sure the back buffer stays in VRAM, you might try the -Dsun.java2d.ddforcevram=true flag for Windows).
Think about pre-scaling to the size you need and using those images instead. If this size changes, you could just recreate the pre-scaled image whenever you need it. But as long as you use these images at least a few times for any given scaled size, you'll probably get way better performance by avoiding the transform operation during the image copy.
Also, note that you can use the flush() method in Image to force a cached version of an image to be released. This won't release the actual Image bits, but it will release the VRAM version (if one exists). This isn't as good as direct control over the memory in managed images, but it allows at least high-level management of the bits.
More in future blogs...
Chet.
Posted by: chet on September 01, 2003 at 10:55 AM
-
re: "managed image"
Thanks, Chet. I'll try out the flush(). The images are already prescaled (read-scale-copy to (hopefully) accelerated images); I suspect the VRAM fills up. Looking forward to the next blog!
PerA
Posted by: pera on September 01, 2003 at 11:54 PM
-
Great info
I've got some real insight from these entries, cheers. In the current dev I'm working on, though, there's a whole load of per-pixel "blending" operations required, currently done by grabbing the DataBuffer from the Raster. Using getRGB and setRGB always seemed way too slow, but it would appear that by grabbing the DataBuffer we forfeit any hw accel available. Am I right in thinking that there is no simple way round this?
Posted by: acourtenay on September 02, 2003 at 12:32 AM
-
Great info
acourtenay,
Actually, the get/set approach to pixel-twiddling should not defeat hw acceleration in general. The big punt in our code comes when you request the DataBuffer object from the Raster; this gives you access to the pixel array directly, so we just give up at that point. But as long as you stick to the get/set methods in BufferedImage and Raster, then the worst you should suffer is our having to update the cached version of the image every time you change it.
There are probably ways that we can mitigate even this punt in the future, but for the current releases that's just the way it goes.
Note, however, that if you are changing the contents frequently, then we probably won't even bother to cache a hw-accelerated version, or if we do we won't use it very often because it's simply not worth the cost to update the cache so often.
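As an illustration of staying on the get/set side of that line, here is a rough sketch that does a per-pixel blend using only the bulk getRGB/setRGB methods on BufferedImage and never asks the Raster for its DataBuffer. PixelTint, tint, and mix are made-up names for this example, and whether the bulk calls are fast enough for your case is something you would have to measure.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper (not part of the JDK).
final class PixelTint {
    private PixelTint() {
    }

    /** Blends every pixel of img toward rgb by amount (0 = unchanged, 1 = fully rgb). */
    static void tint(BufferedImage img, int rgb, float amount) {
        float t = Math.max(0f, Math.min(1f, amount));
        int w = img.getWidth();
        int h = img.getHeight();
        // Bulk read into a private copy; this is NOT the image's backing array.
        int[] pixels = img.getRGB(0, 0, w, h, null, 0, w);
        for (int i = 0; i < pixels.length; i++) {
            pixels[i] = mix(pixels[i], rgb, t);
        }
        // Bulk write back through the image API, so the image knows it changed.
        img.setRGB(0, 0, w, h, pixels, 0, w);
    }

    /** Mixes the color channels of argb toward rgb; alpha is left untouched. */
    static int mix(int argb, int rgb, float amount) {
        int a = argb >>> 24;
        int r = mixChannel((argb >> 16) & 0xFF, (rgb >> 16) & 0xFF, amount);
        int g = mixChannel((argb >> 8) & 0xFF, (rgb >> 8) & 0xFF, amount);
        int b = mixChannel(argb & 0xFF, rgb & 0xFF, amount);
        return (a << 24) | (r << 16) | (g << 8) | b;
    }

    private static int mixChannel(int from, int to, float amount) {
        return from + Math.round((to - from) * amount);
    }
}
```

Each call to tint still counts as a change to the image, so the caveat above about frequent updates applies here too.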
Posted by: chet on September 03, 2003 at 11:56 AM
-
re: "managed image"
Sad, but true; we're not accelerating transform operations. You would think that it would be a simple Blt operation, and we actually implemented that capability in the original DirectDraw driver (on Windows, obviously). But the problem was that you have no way to control the filtering on that scale operation. Some cards use NEAREST_NEIGHBOR, some use BILINEAR, .... This may not be a big deal to some apps, but it's a huge deal to others; the inability to control (and make consistent across all scaling operations) these transforms was a feature-killer.
Direct3D, on the other hand, does provide the control we need; we just haven't had time to implement d3d-transforms yet; I've got my hopes on the next release (we're workin' on it...).
We do have an OpenGL pipeline on all Sun platforms (Windows, Linux, and Solaris) in 1.5 and that _does_ support hw transforms. This pipeline is not enabled by default (driver and hardware support is intermittent, so we cannot count on it by default). Try it out with -Dsun.java2d.opengl=true to see if it works for you.
Meanwhile, if you want to work around the problem in your current code, you may look into using intermediate images. That is, if you draw your images at the same scale several times, you can create an intermediate image that pre-scales the image and then drawImage() from that intermediate image. We will manage the intermediate image and hopefully you will get the hw acceleration you need. The scale is still not accelerated, but if you only scale once per n copies, then maybe that is lost in the noise.
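A minimal sketch of that intermediate-image idea might look like the following. PrescaledImage is a made-up name, and the sketch assumes a compatible image is the right kind of intermediate for your display; it pays for the scale only when the requested size changes and hands back the same pre-scaled image otherwise.

```java
import java.awt.Graphics2D;
import java.awt.GraphicsConfiguration;
import java.awt.image.BufferedImage;

// Hypothetical helper (not part of the JDK): caches one pre-scaled copy of a source image.
final class PrescaledImage {
    private final BufferedImage source;
    private final GraphicsConfiguration gc;
    private BufferedImage scaled;

    PrescaledImage(BufferedImage source, GraphicsConfiguration gc) {
        this.source = source;
        this.gc = gc;
    }

    /** Returns the source scaled to width x height, rescaling only when the size changes. */
    BufferedImage get(int width, int height) {
        if (scaled == null || scaled.getWidth() != width || scaled.getHeight() != height) {
            if (scaled != null) {
                scaled.flush(); // release any cached resources of the old size
            }
            BufferedImage next =
                    gc.createCompatibleImage(width, height, source.getTransparency());
            Graphics2D g = next.createGraphics();
            try {
                // The (unaccelerated) scale happens once, here.
                g.drawImage(source, 0, 0, width, height, null);
            } finally {
                g.dispose();
            }
            scaled = next;
        }
        return scaled;
    }
}
```

In your paint code you would then call something like g.drawImage(prescaled.get(w, h), x, y, null), which is a plain unscaled copy.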
Posted by: chet on July 20, 2004 at 12:34 AM
-
re: "managed image"
Chet,
I'm currently developing an application which uses LOTS of small images which OFTEN have to be drawn and rescaled, and all that has to happen FAST. When I read about accelerated images, I got a bit euphoric---could this be the solution to my performance problems?
Well, after reading your posting above, I don't think so any more... "transform operations with images are not accelerated, either with managed images or Volatile images". That's bad... Any suggestions how I could still make my image rendering system faster with accelerated images? After all, rescaling is just a simple blt operation on graphics cards, so I don't get it why it should not work that way with Java images...
thx, eb
Posted by: el_barto on July 20, 2004 at 04:23 AM
-
I also care about the performance of BufferedImage objects. We have Java code that uses JNI to read images in the MrSID format. The C interface for reading pixels from those MrSID images returns the pixel data in three sample buffers: one byte buffer stores all red samples, one stores all green samples, and one stores all blue samples. On the Java side we try to create a BufferedImage from those three sample buffers (using Raster.createBandedRaster). We obviously avoid copying the pixel samples into an intermediate format. The BufferedImage is created without problem; we can save it to disk using ImageIO, for example. But when we try to render the BufferedImage to a Graphics object, we get an ImagingOpException. This is (I think) a duplicate of bug 4723021. Any idea when this bug will be fixed? It is kinda strange to have a valid BufferedImage object and not be able to draw it.
Posted by: lerzeel on February 17, 2005 at 05:42 AM
-
ICONIFIED Problem
Chet, thank you for these very helpful articles. I am writing a game. I use a VolatileImage as the background: every frame, I render the bg (the VolatileImage) to the JPanel (my canvas), then render the sprites (they are managed images), the Strings (status words)...
Posted by: iiley on February 27, 2005 at 03:13 AM
-
Because of the VolatileImage's hardware acceleration, it can usually be rendered very fast. But sometimes when I minimize the window (iconify it, in fact) and then restore it, the bg's rendering becomes very slow. It happens only sometimes, not every time: when the original size of the window is small, it happens frequently; when the window is maximized, it hardly ever happens. I think I have fixed it now by using -Dsun.java2d.ddforcevram=true, which I learned about on this page. My guess is that the VolatileImage released its memory from VRAM when the window was iconified, and when the window was restored, the image could not return to VRAM and went to system memory instead. That's my thought; am I right? And why does it happen only sometimes? (My PC has 128M VRAM and runs Windows 2000 SP4 with DirectX 9; while testing my game I only had Eclipse 3.0 open.) Thanks a lot.
Posted by: iiley on February 27, 2005 at 03:15 AM
-
re: ImagingOpException
Some problems like this were already addressed in 6.0 beta. You may want to
try one of the latest builds posted on java.net. Your particular case
is likely to be fixed. However, in some cases it is probably still possible
to get an ImagingOpException; we are working on getting it fixed. If you still
experience a problem like this with 6.0 beta, please file a new bug and
we will investigate.
If you want to use 5.0, then you may try to work around it by drawing the custom
image to an image of one of the standard types (e.g. INT_ARGB) and performing
transformations on that image. Note that while this workaround should work
smoothly, performance will not be as good as with 6.0 beta.
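A rough sketch of that workaround, assuming the plain (untransformed) copy of the custom image succeeds on your release. StandardTypeWorkaround, toIntArgb, and scale are made-up names for illustration, and the bilinear scale simply stands in for whatever transformation you need.

```java
import java.awt.Graphics2D;
import java.awt.geom.AffineTransform;
import java.awt.image.AffineTransformOp;
import java.awt.image.BufferedImage;

// Hypothetical helper (not part of the JDK).
final class StandardTypeWorkaround {
    private StandardTypeWorkaround() {
    }

    /** Copies an image of any type into a new TYPE_INT_ARGB image. */
    static BufferedImage toIntArgb(BufferedImage custom) {
        BufferedImage standard = new BufferedImage(
                custom.getWidth(), custom.getHeight(), BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = standard.createGraphics();
        try {
            g.drawImage(custom, 0, 0, null); // plain copy, no transform involved
        } finally {
            g.dispose();
        }
        return standard;
    }

    /** Scales the image by going through the standard-type copy first. */
    static BufferedImage scale(BufferedImage custom, double sx, double sy) {
        BufferedImage standard = toIntArgb(custom);
        AffineTransformOp op = new AffineTransformOp(
                AffineTransform.getScaleInstance(sx, sy), AffineTransformOp.TYPE_BILINEAR);
        // Passing null lets the op allocate a destination of the right size.
        return op.filter(standard, null);
    }
}
```

The extra copy is where the performance cost mentioned above comes from, so it is worth keeping the INT_ARGB copy around if you transform the same image more than once.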
Andrew Brygin
(Java2D team)
Posted by: bae on March 02, 2005 at 08:07 AM
-
re: re: ImagingOpException
The Java2D team said: Some problems like this were already addressed in 6.0 beta.
I have tried the 6.0 beta and the problem has disappeared. Thanks for the feedback!
Posted by: lerzeel on March 07, 2005 at 07:34 AM
-
Great entry, Chet. BufferedImage is good. I'm actually building an imaging application, but I haven't yet found a way to use ConvolveOp correctly. Here's my code:
BufferedImage image = ImageIO.read(new File("path for file as String"));
/* horizontal gradient for roberts edge detector */
float[] robertsH = new float[] {
0.0f, 0.0f, -1.0f,
0.0f, 1.0f, 0.0f,
0.0f, 0.0f, 0.0f
};
Kernel kernel = new Kernel(3, 3, robertsH);
ConvolveOp op = new ConvolveOp(kernel);
BufferedImage filtered = new BufferedImage(image.getWidth(), image.getHeight(), BufferedImage.TYPE_INT_RGB);
op.filter(image, filtered);
return filtered;
Actually it doesn't work as it should: it's detecting the left border and the bottom border, but it should detect the bottom and top borders.
Posted by: javierdemexico on October 28, 2005 at 10:46 PM
-
Great entry, Chet. BufferedImage is good. I'm actually building an imaging application, but I haven't yet found a way to use ConvolveOp correctly. Here's my code:
BufferedImage image = ImageIO.read(new File("path for file as String"));
/* horizontal gradient for roberts edge detector */
float[] robertsH = new float[] {
0.0f, 0.0f, -1.0f,
0.0f, 1.0f, 0.0f,
0.0f, 0.0f, 0.0f
};
Kernel kernel = new Kernel(3, 3, robertsH);
ConvolveOp op = new ConvolveOp(kernel);
BufferedImage filtered = new BufferedImage(image.getWidth(), image.getHeight(), BufferedImage.TYPE_INT_RGB);
op.filter(image, filtered);
return filtered;
It doesn't work! It's producing a bad convolution.
Please post a corrected version of this snippet.
Or email me at javierdemexic@hotmail.com
Posted by: javierdemexico on November 25, 2005 at 09:10 PM
-
I understand this is an old article - where's the new one?! :-)
One thing that bothers me with all this magic, is that in some situations I believe I would do a better job than you deciding which type of memory I'd like to use for which images.
I have this application I try to develop (Stay tuned for "Picorg"!) where I try to forward preload as many images as possible into memory, so that the user will perceive a very fast performance when browsing his images. (you know, those 8 to 20 megapixel kind of images that people tend to take thousands of from every weekend at the cottage or at the fair?)
If you constantly try to accelerate these gigabytes of images that I'm loading by trying to put them into VRAM, then you'll eat up all the VRAM on stash that don't need to, and really just shouldn't, be in VRAM.
It is exactly the same annoyance as when e.g. a full text disk search, or a virus scanner, scans through the whole bleedin' harddisk, and the operating system dutifully caches everything, all the time, except that it is plainly the wrong thing to do: the files and data structures in question will 99.99% probably not be used again in the near future, and now the good data that was already in the IO cache is flushed, and the net result is that I perceive an amazingly slow computer.
The feature many people thus ask for, is some way for a process to explicitly say to the OS: "I need this piece of data, BUT DO NOT CACHE IT!" .. because it is a sequential read through the whole harddisk, and it will be an utter waste to cache, and even more waste to trash the good data that's already in the cache.
The same goes for this BufferedImage magic: if you waste all the user's VRAM on "magic", on images which I know for sure won't be used very often, then this magic will just make everything slower, not faster.
(What I'd do was that when the user actually browses to within the vicinity of the image, I'd get an actually "VRAMImage" (or reuse one of the ones that no longer are in the vicinity), copy the cache-copy into it, and then when the user browses to this image, it would be blazingly fast when displayed)
I'd love if there was some way to say that _do not_ stash this into VRAM, it has nothing to do there. (Should I then piece together a BufferedImage myself, with my own Raster and such?!)
Actually, I would love if I could turn off all magic, and be explicit about everything: this thing should go in VRAM (and if it won't fit, then don't do any silly magic - throw me an exception), and this thing is just plain java, thank you. I tend to feel that a coding framework shouldn't have too much hidden strange-ness in it: all such things should really be pretty explicit, or at least be enabled in some way. Often when using Swing, BufferedImages and so on, I feel alienated: what's going on here, really?! I'd love to actually see the actual code that copies my pixels from here to there - so that the magic disappears.
Posted by: stolsvik on August 22, 2007 at 03:20 AM