
Are we there yet?

Posted by johnreynolds on October 24, 2003 at 6:47 AM PDT

Periodically, I like to sit back and take stock of how closely “computers” match my expectations of what they ought to be, and starting this blog seems as good an excuse as any to see how they’re doing. My expectations for computers are pretty easy to sum up: I want computers to function the way they did on Star Trek back in the mid-’60s.

When I bought my first computer in 1979, it had a whopping 4K of RAM and BASIC in ROM. I could only store programs on cassette tape, and it was always an iffy proposition whether or not I would be able to reload them. That computer brought me quite a bit of joy by letting me tinker with programming. Compared to the cryptic world of punch cards and climate-controlled mainframe vaults, that little microcomputer was a hoot to play with, but it wasn’t at all what I “expected” a computer to be.

I expected computers to “know things”.

For years I had seen Captain Kirk and Mr. Spock turn to their computer whenever they had a question, and seldom were their queries left unanswered. While it’s true that Star Trek was set in the future, the “real” computers that I worked on at school, and the one that I brought home to my dorm room, were not even philosophically similar to the devices my science fiction heroes relied on. My computer knew nothing; it simply executed the buggy programs that I tediously constructed. Not at all what I expected.

Fast forward to today. Whenever my wife and I want to settle an argument about something important (like “How old is Antonio Banderas?”), we inevitably “ask the Internet”… or, more specifically, we ask Google. Unlike Kirk and Spock, we have to type in a search phrase, and the results are not always as germane as the answers heard on the bridge of the Enterprise, but you have to admit it’s getting close to the same experience.

Computers are closer to meeting my expectations with respect to answering questions, but they still fall far short in their ability to be taught new tasks. When Spock needed help from the computer, he would sit down and patiently explain what he wanted, and the computer would then draw on its resources to perform the task. For me, sitting here in the 21st century, it’s still pretty tedious to construct even relatively trivial programs, even in domains where sophisticated functionality has already been developed.

For example, assume that some thugs have been vandalizing the cars in your building’s parking lot, and you’d like to catch the cretins (this recently happened to us). The building has video surveillance cameras, but the recorded video doesn’t have sufficient resolution to identify a culprit, so you can’t review the tapes to track down suspects after the fact. If someone watched the monitors and alerted the cops when they spotted the thugs, you could catch the rats, but there are a lot of hours in the day, and it’s just not likely that a guard will be able to stay alert.

The solution should be simple.

Me: “Computer, monitor the video camera and alert me whenever you detect any suspicious people in the parking lot.”

Computer: “I don’t know how to monitor video.”

Me: “No problem. I’ll install this video capture card that I bought at Fry’s, and then splice in the video feed from the camera.”

Computer: “I don’t know how to detect suspicious people.”

Me: “No problem. The folks at Carnegie Mellon University developed some software for DARPA that analyzes video to detect and classify moving objects and their trajectories. Just download that software, and let me know when any of the objects classified as ‘people’ are moving along an unusual trajectory.”

Computer: “I don’t know how to alert you.”

Me: “Now you’re just being difficult. Prompt me at this terminal first. If I don’t respond, buzz my pager. If I still don’t respond, call my cell phone.”
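That last request is the only one I could come close to coding up myself today. A minimal sketch of the escalation chain in Java might look like this; the Notifier implementations for the terminal, the pager, and the cell phone are left entirely hypothetical:

    // A minimal sketch of the escalation chain: terminal first, then pager,
    // then cell phone, stopping as soon as the alert is acknowledged.
    public class AlertEscalator {

        // Hypothetical notification channel; returns true if I acknowledged
        // the alert within the timeout.
        public interface Notifier {
            boolean notifyAndWait(String message, long timeoutMillis);
        }

        private final Notifier[] channels;

        // Channels are tried in the order given: terminal, pager, cell phone.
        public AlertEscalator(Notifier terminal, Notifier pager, Notifier cellPhone) {
            this.channels = new Notifier[] { terminal, pager, cellPhone };
        }

        public void raise(String message) {
            for (Notifier channel : channels) {
                if (channel.notifyAndWait(message, 60 * 1000)) {
                    return; // acknowledged, no need to escalate further
                }
            }
            System.err.println("Nobody acknowledged the alert: " + message);
        }
    }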

The status quo is a bit less satisfying.

Carnegie Mellon University really has developed software that can analyze video and differentiate between the “normal” and “abnormal” movements of people and vehicles. They developed some pretty impressive stuff, and published their results a couple of years ago on their web site: http://www.cs.cmu.edu/~vsam/. There is no longer any need to suffer through hours of grainy videotape to catch a glimpse of the thugs who vandalized your car in the parking lot last night. You really can construct a system that quickly flips to the scenes where “something unusual” is going on. That’s pretty cool stuff.

Conceptually, all you need to do is get the software from CMU, establish a feed from your video cameras to the processing units, and add hooks to alert you or your guard. Practically, it’s not a simple task. The software is not componentized, and it is not packaged for reuse. You’ll need an extensive knowledge base and many weeks of tinkering to duplicate the base functionality and adapt this technology to catch the bums who are vandalizing your cars.
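To make the gap concrete, here is a rough sketch in Java of the glue you would have to write yourself. Everything in it is hypothetical: FrameGrabber, TrajectoryClassifier, and AlertHook are stand-ins for the capture card driver, the CMU analysis code, and the alerting hooks (perhaps the escalation chain sketched above), none of which actually expose interfaces like these today:

    // Hypothetical glue code: capture frames, hand each one to the analysis
    // software, and fire an alert when a person moves along an unusual trajectory.
    public class ParkingLotWatcher {

        // Stand-ins for pieces that don't exist in this packaged form today.
        public interface FrameGrabber {
            java.awt.image.BufferedImage nextFrame() throws java.io.IOException;
        }

        public interface TrajectoryClassifier {
            // True when the frame contains a "person" object whose
            // trajectory is classified as unusual.
            boolean unusualPersonMovement(java.awt.image.BufferedImage frame);
        }

        public interface AlertHook {
            void raise(String message);
        }

        private final FrameGrabber camera;
        private final TrajectoryClassifier classifier;
        private final AlertHook alerts;

        public ParkingLotWatcher(FrameGrabber camera,
                                 TrajectoryClassifier classifier,
                                 AlertHook alerts) {
            this.camera = camera;
            this.classifier = classifier;
            this.alerts = alerts;
        }

        // Loop forever: grab a frame, classify it, alert if it looks suspicious.
        public void watch() throws java.io.IOException {
            while (true) {
                java.awt.image.BufferedImage frame = camera.nextFrame();
                if (classifier.unusualPersonMovement(frame)) {
                    alerts.raise("Suspicious movement in the parking lot");
                }
            }
        }
    }

Writing those thirty-odd lines is the easy part; filling in the three interfaces is where the weeks of tinkering go.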

For this specific example there’s lots of money to be made, so there will no doubt be a turnkey system available soon, but think of all the fringe cases where you want something done that isn’t common enough to warrant a dedicated product.

I long for the day when it is the norm for sophisticated systems to provide interfaces that allow them to be incorporated as components of other systems. Perhaps Web Services is the beginning of this trend, but it’s a bit too early to tell.
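If that trend does take hold, the “suspicious movement” detector above might one day be published behind a simple service interface that any application could call, rather than as source code to be rebuilt. Purely as speculation (no such service exists, and every name here is invented), the interface might be no more than:

    // Speculative sketch of what a packaged surveillance component's
    // interface might look like; nothing like this is actually published.
    public interface SurveillanceService {

        // Register a video source (say, the URL of a capture-card feed)
        // and get back an identifier for that camera.
        String registerCamera(String videoSourceUrl);

        // Ask the service to report unusual "person" trajectories from that
        // camera to a callback URL; returns a subscription id.
        String subscribeToAlerts(String cameraId, String callbackUrl);

        void cancelSubscription(String subscriptionId);
    }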

Can you imagine a world where all of the functionality that exists on SourceForge and Jakarta today is componentized and packaged for reuse? Can you imagine development environments that can locate and assemble functionality to fulfill stated desires? We’re not there yet, but we’re getting there.


(Cross-posted at The Thoughtful Programmer)