
Removing elements from Swing HTML in JDK6

Posted by g_s_m on August 1, 2007 at 10:12 AM PDT

After I published the entry on removing elements from Swing
HTMLDocument in JDK7
(http://weblogs.java.net/blog/g_s_m/archive/2007/06/removing_elemen_1.html),
I got a question from a reader: how do you actually remove elements in
JDK6? There's no easy way, but what is the hard way?

Well, there are a number of methods, of varying degrees of
ugliness. Let's look at one of them. It's neither pretty nor
efficient, but it has the advantage of being simple enough.

The method is to serialize all the siblings of the element being
removed using the HTMLWriter class, concatenate the serialized
strings, and re-parse the resulting text
using the setInnerHTML method with the parent element as the
target.

The following code illustrates this approach.

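// Removes e by re-parsing its parent's content without it.
// Needs javax.swing.text.Element and javax.swing.text.html.HTMLDocument.
void removeElement(HTMLDocument d, Element e) throws Exception {
    Element p = e.getParentElement();
    int n = p.getElementCount();
    StringBuilder s = new StringBuilder();
    for (int i = 0; i < n; i++) {
        Element c = p.getElement(i);
        if (c != e) {
            // Keep every sibling; skip only the element being removed.
            s.append(getElementText(d, c));
        }
    }
    // Replace the parent's children with the re-parsed siblings.
    d.setInnerHTML(p, s.toString());
}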

The tricky part is how to serialize a single element rather than
the complete document. There is a constructor in
the HTMLWriter class that lets us specify a
fragment of the document to serialize, and we could use the element's
start and end offsets as the fragment boundaries. But by default the
writer also outputs all ancestors of the element up to the root, so
that the serialized form is a complete HTML document.
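
To see the default behavior for yourself, here is a minimal sketch that serializes just one paragraph's offset range with the stock HTMLWriter. The class name RangeWriterDemo, the helper writeRange and the sample markup are my own illustrations, not part of any API; the document is assumed to be built with HTMLEditorKit, and the element is looked up by its id attribute.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.swing.text.Element;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.HTMLWriter;

// Hypothetical demo class; the name and the sample markup are illustrative only.
class RangeWriterDemo {
    // Serializes only the offset range of the element with the given id,
    // using the stock HTMLWriter with no overrides.
    static String writeRange(String html, String id) {
        try {
            HTMLEditorKit kit = new HTMLEditorKit();
            HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
            kit.read(new StringReader(html), doc, 0);
            Element e = doc.getElement(id);
            int p0 = e.getStartOffset();
            int p1 = e.getEndOffset();
            StringWriter sw = new StringWriter();
            // The last two arguments are the start position and the length.
            new HTMLWriter(sw, doc, p0, p1 - p0).write();
            return sw.toString();
        } catch (Exception ex) {
            throw new RuntimeException(ex);
        }
    }

    public static void main(String[] args) {
        String html = "<html><body><div><p id=\"a\">first</p>"
                + "<p id=\"b\">second</p></div></body></html>";
        // The paragraph should come out wrapped in its ancestor tags.
        System.out.println(writeRange(html, "a"));
    }
}
```

The printed text should contain the html, body and div tags around the paragraph, which is exactly what we don't want to feed back into setInnerHTML.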

To overcome this, we need to override
the getElementIterator method, which returns the iterator over
the element hierarchy to serialize. In the HTMLWriter
implementation the iterator starts at the document root element. We'll
implement it to start at the element we want to serialize.

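// Serializes a single element, without its ancestors, to HTML text.
// Needs java.io.StringWriter, javax.swing.text.Element,
// javax.swing.text.ElementIterator and javax.swing.text.html.HTMLWriter.
String getElementText(HTMLDocument d, final Element e) throws Exception {
    StringWriter sw = new StringWriter();
    int p0 = e.getStartOffset();
    int p1 = e.getEndOffset();
    // Limit the writer to the element's range (start position, length).
    new HTMLWriter(sw, d, p0, p1 - p0) {
        // Start at e instead of the document root, so no ancestors are written.
        @Override
        protected ElementIterator getElementIterator() {
            return new ElementIterator(e);
        }
    }.write();
    return sw.toString();
}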

Note that if we try to remove a sole child element, the behavior
of this method differs from removeElement in JDK7. In
JDK7 the parent element is removed as well, recursively. In the above
example nothing is removed (to keep the example simple). To get the
JDK7 behavior, we need to walk up to the nearest ancestor that
is not a sole child and remove it instead of the original
element.
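
The walk up the tree could look roughly like this. It is only a sketch: the class SoleChildWalk and the methods nearestRemovable and parse are names I made up for illustration, and the walk does not stop at the body element, so a real version would probably want an extra check to avoid climbing all the way to the root.

```java
import java.io.StringReader;
import javax.swing.text.Element;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

// Hypothetical helper class; all names here are illustrative only.
class SoleChildWalk {
    // Climbs from e while the current element is the only child of its
    // parent, and returns the first element that has siblings
    // (or the topmost element reached if there is none).
    static Element nearestRemovable(Element e) {
        Element cur = e;
        Element parent = cur.getParentElement();
        while (parent != null && parent.getElementCount() == 1) {
            cur = parent;
            parent = cur.getParentElement();
        }
        return cur;
    }

    // Parses markup into an HTMLDocument so the walk can be tried out.
    static HTMLDocument parse(String html) {
        try {
            HTMLEditorKit kit = new HTMLEditorKit();
            HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
            kit.read(new StringReader(html), doc, 0);
            return doc;
        } catch (Exception ex) {
            throw new RuntimeException(ex);
        }
    }

    public static void main(String[] args) {
        HTMLDocument doc = parse("<html><body><div id=\"d\"><p id=\"a\">only</p></div>"
                + "<p id=\"b\">other</p></body></html>");
        // The paragraph "a" is the sole child of its div, so the walk moves up.
        System.out.println(nearestRemovable(doc.getElement("a")).getName());
    }
}
```

The element returned by nearestRemovable can then be passed to removeElement in place of the original one.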
