Kyle Grucci

Kyle Grucci's Blog

Compatibility Assertion Tagging Within Specifications

Posted by kgrucci on September 07, 2006 at 10:03 AM | Comments (9)


One of the most time-consuming aspects of creating Technology Compatibility Kits (TCKs) for the various JSRs that come out of the Java Community Process is the creation and maintenance of assertion lists. An assertion is a behavioral requirement placed on all implementations by the specification. The granularity of a single assertion is debatable and may differ from one list to another. The assertion list serves multiple purposes:


  1. serves as a plan for what must be tested in the TCK

  2. provides a way to map tests to the assertion(s) they verify

  3. allows one to measure overall assertion (and ultimately specification) coverage.


The more assertions the TCK tests, the more likely it is that an implementation which passes the TCK will be compatible, and ultimately that applications written to the specification will be portable.
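As a rough illustration of the second and third purposes, the sketch below maps tests to the assertions they claim to verify and derives a coverage figure from that mapping. The class name, method names, test names, and the placeholder ids (A1, A2, ...) are all hypothetical and are not taken from any existing TCK tool.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: measures how much of an assertion list is referenced by tests. */
public final class AssertionCoverage {

    /** Returns the assertion ids from the list that no test claims to verify. */
    public static Set<String> untested(Set<String> assertionIds,
                                       Map<String, ? extends Collection<String>> testToIds) {
        Set<String> remaining = new LinkedHashSet<>(assertionIds);
        for (Collection<String> ids : testToIds.values()) {
            remaining.removeAll(ids);
        }
        return remaining;
    }

    /** Returns the fraction (0.0 to 1.0) of listed assertions referenced by at least one test. */
    public static double coverage(Set<String> assertionIds,
                                  Map<String, ? extends Collection<String>> testToIds) {
        if (assertionIds.isEmpty()) {
            return 0.0;
        }
        int tested = assertionIds.size() - untested(assertionIds, testToIds).size();
        return (double) tested / assertionIds.size();
    }

    public static void main(String[] args) {
        // Placeholder assertion ids and test names, for illustration only.
        Set<String> list = new LinkedHashSet<>(List.of("A1", "A2", "A3", "A4"));
        Map<String, List<String>> tests = Map.of(
                "SpeedCountTest", List.of("A1", "A2"),
                "LowSpeedTest", List.of("A2"));

        System.out.println("Untested: " + untested(list, tests));
        System.out.println("Coverage: " + coverage(list, tests));
    }
}
```

A figure like this only says that a test references an assertion, not that the test verifies it well, so it is best read as an upper bound on real coverage.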


Over the years, while working on many TCKs, I have become increasingly aware of how difficult it is to create and maintain separate assertion lists that correspond to fast-moving specifications. It is a very time-consuming activity which involves the initial (manual) creation and usually several revisions of a list while attempting to keep it in sync with the evolving requirements of the specification. Then there is the not-so-small matter of keeping any tests that have been written in sync with the list. In the end, there are essentially two versions of the same specification: one in the original API and/or text form, and a second in the form of the assertion list. You can only hope that the final list accurately mirrors the specification. Doesn’t it seem like the list of assertions should be part of the specification, with any lists then generated automatically? There are a few tools available to generate assertion lists from API-based specifications, but human interaction is really necessary to create a complete list. For text-based specifications, the only way to extract the assertions is to walk through the spec manually. If the assertions were embedded (and labeled) in the spec, there would be one less document to maintain, less room for error, and more time for writing tests. TCK engineers would then work on the specification along with the spec lead(s), and the result would be a more complete and testable specification.


Now that we’ve discussed the “why” of tagging specifications, let’s talk about the “how”. My colleagues and I have had several discussions about this in the past, but we had always come to the conclusion that written specifications (in paragraph text form) were too difficult to tag given the many relationships and interdependencies that exist within and external to a spec. However, what if we started out simple? Give the specification authors two options: 1. tag assertions in place as they occur within the written spec or 2. if a certain assertion is currently derived from a table, graph, or multiple disconnected sections within the document, then that assertion must be spelled out explicitly in the spec as well. Some specifications have already gone part of the way by explicitly calling out assertions and labeling them as such. While this is a step in the right direction, attaching unique identification (at the very least) to each assertion would allow tests to directly reference the assertion in the spec. By defining a simple XML-based tag to be used within the text of the document, different types of assertion lists can be generated based on attributes of the tags. If necessary, tools can be used to strip out the tags to generate a tag-free version of the specification (if readability becomes an issue). What might the simple tag look like, you ask? The <assertion> tag could contain one or more of the following attributes...


| Attribute | Description | Required |
|-----------|-------------|----------|
| id | A colon-delimited string in the format "spec name/acronym:spec version number:assertion number". | Yes |
| version | Version of the spec where this assertion was first introduced. | No. Defaults to the version number specified in the id attribute. |
| keywords | User-defined meta-data attached to individual assertions. One use case would be to have group identifiers for types of assertions within a given spec if it defines different roles or profiles that a user might implement. Role- or profile-based assertion lists are then easy to generate. For example, some assertions may be requirements on an application server while others are requirements on applications or components running in the server. | No |
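To make the attribute rules concrete, here is a minimal sketch of how a tool might hold one tag in memory. It assumes that the spec name and version contain no colons, that the assertion number is an integer, and that the keywords attribute is a comma-separated list; the post does not specify any of these details, and the class and field names are hypothetical.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical in-memory form of one <assertion> tag; not part of any existing tool. */
public final class TaggedAssertion {
    public final String specName;
    public final String specVersion;
    public final int number;
    public final String introducedIn;
    public final Set<String> keywords;

    private TaggedAssertion(String specName, String specVersion, int number,
                            String introducedIn, Set<String> keywords) {
        this.specName = specName;
        this.specVersion = specVersion;
        this.number = number;
        this.introducedIn = introducedIn;
        this.keywords = Collections.unmodifiableSet(keywords);
    }

    /**
     * Builds an assertion from raw attribute values.
     *
     * @param id       required, in the form "spec:version:number"
     * @param version  optional; null or empty falls back to the version inside the id
     * @param keywords optional; assumed here to be comma separated
     */
    public static TaggedAssertion of(String id, String version, String keywords) {
        String[] parts = id.split(":", -1);
        if (parts.length != 3 || parts[0].isEmpty() || parts[1].isEmpty()) {
            throw new IllegalArgumentException("expected spec:version:number but got: " + id);
        }
        int number;
        try {
            number = Integer.parseInt(parts[2]);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("assertion number is not an integer: " + id, e);
        }
        Set<String> tags = new LinkedHashSet<>();
        if (keywords != null) {
            for (String keyword : keywords.split(",")) {
                if (!keyword.trim().isEmpty()) {
                    tags.add(keyword.trim());
                }
            }
        }
        String since = (version == null || version.isEmpty()) ? parts[1] : version;
        return new TaggedAssertion(parts[0], parts[1], number, since, tags);
    }

    /** True if this assertion belongs to the given role or profile keyword. */
    public boolean hasKeyword(String keyword) {
        return keywords.contains(keyword);
    }

    public static void main(String[] args) {
        // "server" and "component" are made-up keywords for illustration.
        TaggedAssertion a = TaggedAssertion.of("fp:1.2:4", null, "server, component");
        System.out.println(a.specName + " " + a.specVersion + " #" + a.number
                + " introduced in " + a.introducedIn + " keywords " + a.keywords);
    }
}
```

Filtering a collection of such objects with hasKeyword is one way the role- or profile-based lists described in the table could be produced.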


Here is what an example paragraph might look like with embedded tags...


“This specification lays out the requirements for a food processor. <assertion id="fp:1.2:4">A compatible processor must have 3 speeds: low, medium, and high.</assertion> These speeds are available to allow a user to choose the resulting texture of a given mixture. <assertion id="fp:1.2:5">The low speed must spin the blade at less than 20 revolutions per second.</assertion> More text... and so on...”
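A small sketch of the two tool behaviors mentioned earlier, generating an assertion list from tagged text and stripping the tags for a clean reading copy, might look like the following. It uses regular expressions rather than an XML parser because the surrounding spec text is not assumed to be well-formed XML. It also assumes straight double quotes around attribute values and no nested <assertion> tags. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical extractor for <assertion> tags embedded in specification prose. */
public final class AssertionExtractor {
    // Group 1: raw attribute text, group 2: the assertion sentence(s).
    private static final Pattern ASSERTION =
            Pattern.compile("<assertion\\b([^>]*)>(.*?)</assertion>", Pattern.DOTALL);
    private static final Pattern ID_ATTRIBUTE =
            Pattern.compile("\\bid\\s*=\\s*\"([^\"]*)\"");
    private static final Pattern ANY_ASSERTION_TAG =
            Pattern.compile("</?assertion\\b[^>]*>");

    /** Returns assertion id mapped to assertion text, in document order. */
    public static Map<String, String> extract(String specText) {
        Map<String, String> list = new LinkedHashMap<>();
        Matcher tag = ASSERTION.matcher(specText);
        while (tag.find()) {
            Matcher id = ID_ATTRIBUTE.matcher(tag.group(1));
            if (!id.find()) {
                throw new IllegalArgumentException("assertion without an id: " + tag.group());
            }
            if (list.put(id.group(1), tag.group(2).trim()) != null) {
                throw new IllegalArgumentException("duplicate assertion id: " + id.group(1));
            }
        }
        return list;
    }

    /** Returns the specification text with all assertion tags removed. */
    public static String stripTags(String specText) {
        return ANY_ASSERTION_TAG.matcher(specText).replaceAll("");
    }

    public static void main(String[] args) {
        String spec = "This specification lays out the requirements for a food processor. "
                + "<assertion id=\"fp:1.2:4\">A compatible processor must have 3 speeds: "
                + "low, medium, and high.</assertion> These speeds are available to allow a "
                + "user to choose the resulting texture of a given mixture. "
                + "<assertion id=\"fp:1.2:5\">The low speed must spin the blade at less than "
                + "20 revolutions per second.</assertion>";

        extract(spec).forEach((id, text) -> System.out.println(id + " -> " + text));
        System.out.println(stripTags(spec));
    }
}
```

Rejecting duplicate ids at extraction time is a design choice in this sketch: it catches copy-and-paste mistakes in the spec before any test references an ambiguous id.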


The idea here is to keep the tags as simple as possible so as not to intrude too much on the readability of the specification. To simplify the TCK development and maintenance process, I’d really like to see the aforementioned recommendations be required in some form in the next revision of the JCP process.



Comments
Comments are listed in date ascending order (oldest first)

  • Sounds like a good idea. Ideally, the assertions would be available out of band as well, i.e. be available for reuse in other tools, like IDEs. I guess with a bit of standardization of expressions, one could have something like JML's assertions expressed in natural language, and automatically converted to different forms. I thought Sun had some tools for extracting assertions from specs, actually. Or is my memory of the spec tool descriptions wrong?

    Posted by: robilad on September 08, 2006 at 04:19 PM

  • I could definitely see working with assertions in an IDE. Once the specification and/or generated assertion list is opened and a test framework is chosen, skeleton source files could be automatically generated with links (in javadoc or annotation form) pointing back to all the assertions in the list. All that would be left to do is fill in the test logic!

    With regards to the current spec tools, there are some available but the heuristic used for automatically generating assertion lists is very simple and usually inaccurate (i.e. each sentence is an assertion). The tools also provide ways to manually step through a document and mark assertions, but this is still a tedious process and one which is often repeated as the spec changes. This step would be eliminated if the assertions were embedded in the specifications.

    Posted by: kgrucci on September 08, 2006 at 07:11 PM

  • Right. I am interested in this, since I'm aware of the work going on in spec# and JML worlds, and using JML in specs, where appropriate could be attractive. See http://www.cs.iastate.edu/~leavens/JML/prelimdesign/prelimdesign_1.html#SEC1 for an overview. I assume that you're semantically one layer above that, though, as you are dealing with natural language expressions that translate to assertions. I'd be also interested in adding some formalization in the specs, where appropriate. Is there some ongoing work at Sun in this direction?

    Posted by: robilad on September 09, 2006 at 03:24 AM

  • I took a brief look at the JML link you referred to. It looks like it has promise for describing behavioral requirements for API specifications. While my blog entry does focus more on labeling assertions within natural language specs, there is a need to formalize assertions in API specs as well. We currently look at not only the components of a method declaration in an API but also at any corresponding javadoc to determine any testable assertions. This would probably result in API assertions being described in natural language as well as JML. The end goals with either type of specification are to automate the assertion generation process without impacting specification readability, eliminate duplication of effort and throwaway while the spec is evolving, and ultimately be able to spend more time focusing on test development.
    With regards to work at Sun, there are existing tools which aid the spec markup process, but no work that I know of which formalizes the labeling of assertions within written specifications. This entry is an attempt to create some awareness and start some discussion so that we might be able to move some of the formalization into the JCP.

    Posted by: kgrucci on September 09, 2006 at 09:26 AM

  • Hi Kyle,

    We've had many discussions on spec markup before, and what you describe here seems to be a quite refined version of previous (more elaborate) markup schemes. I am interested in the assertion naming/ID scheme. On first appearance, this simple naming scheme would seem to be adequate for simple specs (typically unrelated to other specs), that might linearly evolve into subsequent revisions (eg. v 1.2, 1.3, ...)

    Having worked on some of the tools you mention at Sun :^), I can say that there are all sorts of assertion naming schemes, and many of them become hobbled due to the evolution of specs.

    For example, one scheme that my group has previously tried was creating unique assertion IDs through a combination of 'context' (eg Spec Foo, Chapter 4, Par 3) and 'some form of hashing on the assertion text itself' to produce an Assertion ID unique to a given assertion. The problem with this approach is that over time, assertions change text (hashes change), change location (contexts change), and change strength (meanings change). All of these factors contribute to an assertion evolving and requiring new IDs. This problem can become even more difficult when normative prose defining assertions moves between different specifications.

    The point I wish to make here is that we have some experience that shows that for tracking assertions, it is often best to create assertion IDs that are unique both within the spec in which they exist and within the context of the multiple specs against which a system might be implemented. This helps in an assertion tracking process as normative text moves and evolves.

    Also, while it is important to annotate where an assertion resides (eg spec name, version, location) - and track how this changes over time, it is much simpler to do so with an independent invariant assertion ID (rather than encoding the context in the ID, and tracking aliases of IDs for the same assertion as it moves).

    Posted by: ktlooney on December 07, 2006 at 03:57 PM

  • Hi Kevin,

    I totally agree with your comments. Through our experiences, we have come to the same conclusions about unique ids for assertions. This is the reason that we are proposing a simple id which only includes the spec, the version of the spec, and a unique number. The simpler the better; especially if we can end up having assertions pre-labeled in specifications.

    Posted by: kgrucci on December 13, 2006 at 10:32 AM

  • Hi Kyle.

    I think we agree about unique naming - but I'm not sure we agree about naming standards that are independent of the context of the assertion. Particularly, I think the IDs that you specify in your table codify the context (eg. spec name / version) of the assertion within its ID. Is this necessary? Would it be better to have some string that does not specify 'spec name' or 'spec version' at all?


    It seems that other attributes/markup could be used to 'track' where the assertion has originated (or been moved - or modified) independently. Taking the context 'out of the name' reduces naming confusion, as assertions change and move around.

    Posted by: ktlooney on January 02, 2007 at 03:09 PM

  • While it is possible to remove all context from the assertion ids, my feeling was that spec name and version are part of the "fixed" context of the assertion - i.e. an assertion is usually tied to a particular version of a spec (unlike other meta-data like chapter and section which change). My assumption is that there should be a single assertion list for each version of a given specification - preferably generated as I described above. Also, if we are able to get to a point where assertions are tagged within specifications when they are written, the less meta-data the better. Incorporating the spec name and version into the id also makes the id globally unique even if placed next to ids from other specifications.

    Posted by: kgrucci on January 10, 2007 at 03:23 PM

  • Hi Kyle,

    You may find it interesting to check the blog of Victor Rudometov,
    where he writes about the JCK Lang team's experience of working with the spec.
    Regards
    Alexey

    Posted by: alexeyp on July 16, 2007 at 06:57 AM


