Source code isn't text

Posted by robogeek on August 8, 2007 at 5:08 PM PDT

This is a thought that popped out of my mouth yesterday, and the more I think about it the truer it seems: "Source code isn't text, despite how much it looks like text." Simple text editors (/bin/ed) can edit source programs, so source code must be text, right? Er...

What I'm thinking is that the form in which we are accustomed to writing programs is simply a textual representation. Just as XML isn't really text but a textual representation of a data structure, so too is source code a textual representation of program instructions, and so too is SQL a textual representation of set theory operations.
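A small sketch of that idea, using Python's standard `ast` module purely as an illustration (the two snippets and the variable names are made up for this example): two pieces of source that differ as text parse into the same underlying structure.

```python
import ast

# Two spellings of the same program: different text, same instructions.
text_a = "total = (1 + 2) * 3"
text_b = "total=(1+2)   *   3"

tree_a = ast.parse(text_a)
tree_b = ast.parse(text_b)

# ast.dump() leaves out line and column positions by default,
# so comparing the dumps compares structure rather than layout.
same_structure = ast.dump(tree_a) == ast.dump(tree_b)

print(text_a == text_b)   # False: as text, they differ
print(same_structure)     # True: as a parsed structure, they match
```

The text editor sees two different files; the parser sees one program.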

One byproduct of this line of thinking is that it explains a fundamental difficulty with writing software: translating from an abstract model into the textual representation. Writing software requires cognitively grasping the data structures and the operations to be done, and understanding how a particular language represents those as text.

Another byproduct is that it explains the time a newcomer to a project needs before becoming productive. Since the source is just a textual representation, the newcomer has to spend time reading it and translating it back into the abstract model.

Another byproduct is that it explains some sorts of bugs. Since the textual representation is not the program's native format, everybody who deals with it has to translate from the text into their own mental model. Anybody with experience talking with people who are speaking a second language knows that non-fluent speakers have a difficult time with word choice, grammar and pronunciation, and it just makes communication "interesting". Especially in a conversation among multiple people, all of whom are speaking a second (non-fluent) language.

I've been seeing discussion of "Domain Specific Languages", and I'm sure these are a help. I'd think SQL is an example of a DSL, right? It's sure a lot easier to describe set theory operations in SQL than in a general programming language. But these languages are all text based, and hence will all ultimately leave the programmer frustrated, because textual representation isn't the best way to model software.
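To make the SQL comparison concrete, here is a rough sketch using Python's built-in `sqlite3` module. The tables `a` and `b` and their contents are invented for illustration; the point is only to put a set intersection written in SQL next to the same thing spelled out step by step in a general-purpose language.

```python
import sqlite3

a_rows = [1, 2, 3]
b_rows = [2, 3, 4]

# Hypothetical tables, held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (n INTEGER)")
conn.execute("CREATE TABLE b (n INTEGER)")
conn.executemany("INSERT INTO a VALUES (?)", [(n,) for n in a_rows])
conn.executemany("INSERT INTO b VALUES (?)", [(n,) for n in b_rows])

# In SQL the set operation is stated directly.
sql_result = [row[0] for row in conn.execute(
    "SELECT n FROM a INTERSECT SELECT n FROM b ORDER BY n")]

# The same intersection written out by hand with loops.
loop_result = []
for x in a_rows:
    for y in b_rows:
        if x == y and x not in loop_result:
            loop_result.append(x)
loop_result.sort()

print(sql_result)   # [2, 3]
print(loop_result)  # [2, 3]
conn.close()
```

Both are still text, of course, which is the complaint: the SQL is closer to the model, but it has not stopped being a textual representation of it.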

Most of my programming experience is with textual representation, but I can think of some examples where a non-textual representation is used.

WYSIWYG-like HTML editors sure make it a lot easier to write HTML code. I much prefer them over writing straight HTML, even when the editor produces mangled-looking HTML. (e.g. N|VU is a decent editor, but the HTML it generates looks not-so-good.)

GUI builders such as the one in NetBeans make it easier to do GUI design. I've written GUIs both in textual representation and in GUI builders, and doing it in a GUI builder is much easier and more straightforward.

Some drawing programs (e.g. Inkscape) produce SVG. It's sure a lot easier to make pretty pictures in Inkscape than it is by editing the SVG directly. Ditto with Flash and other similar graphics editors.

Music editors (of which I only know Deluxe Music Construction Set from the Amiga) show staffs, notes, rests, clefs and whatnot. They're sure a lot easier than, e.g., writing out MIDI instructions by hand.

There are no doubt more examples; these are the ones that come to mind. I don't know how true any of what I've just written is, but these ideas seem true to me.
