
rantings and ravings (long and almost certainly boring)



On a less practical note...a question I've had for a while:

The first time that I saw a description of self, I was very excited  
by the possibilities of the language.  (I still am.)  Here is a  
language which has succeeded in factoring the problems facing  
programmers in a way that satisfies both their desire for an  
interactive world containing concrete things (prototypes) and  
their desire to control how things work and relate to each other  
(methods).  [The recurring lisp-is-beauty arguments on this mailing  
list point out this dichotomy -- data is code, code is data...what a  
beautiful life...]

Self also has tremendous potential for allowing the legal hoarding of  
prototypes.  Any object that I polish off today might come in useful  
in the future; objects in self are dynamically typed, so if they turn  
out to be useful in some future world, they should be able to conform  
to unanticipated constraints automagically (through the magic of type  
analysis).

OK, so here's my question, one which has been bubbled up recently by  
threads on this mailing list: why not exploit this factoring by  
leveraging the simplicity of its model and already existing database  
mechanisms?  Here's what I mean by this...

Suspend for the moment, if you will, your total programmer's distrust  
of the word "database" and assume that there are flexible, fast, and  
mathematically rigorous techniques that will allow people (someday)  
to use databases of some flavor to usefully store things.  Things  
like prototypes.  In fact, such a database might look a lot like the  
vm system of self...   But it would also support the notion of query  
more explicitly, and might do more to protect the integrity of  
prototypes.  Such a "database" must be an explicit service, not  
merely a language feature, for the same reasons that vm needs to be  
an explicit assumption in the self world.  In fact, let's plug in  
this "database" and make it function as vm.  There is an extensional  
piece which looks like the object space, and there is an intensional  
piece, which looks like methods.  Maps are represented by immutable  
tables...you get the idea.  And of course, since vm is transparent,  
anyone writing self code just knows about a few more options at their  
disposal -- there is no necessary impact at the language level  
(although some features might be useful to pass through).
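
To make the extensional/intensional split concrete, here is a minimal
Python sketch (not Self, and not how the real vm works).  It assumes
one table row per slot and a single parent per object, which is
simpler than Self's actual parent slots; the names SLOTS, PARENTS and
lookup are all hypothetical.

```python
# Extensional piece: one row per slot, keyed by (object id, slot name).
# A method is just a slot whose value is callable (the intensional piece).
SLOTS = {
    ("point_traits", "describe"):
        lambda self_id: "point at x=%d" % lookup(self_id, "x"),
    ("point_proto", "x"): 0,
    ("my_point", "x"): 3,
}

# Inheritance links, also stored as plain data (assumption: one parent each).
PARENTS = {"my_point": "point_proto", "point_proto": "point_traits"}


def lookup(obj_id, slot):
    """Slot lookup phrased as a query: walk the parent chain until a row matches."""
    current = obj_id
    while current is not None:
        if (current, slot) in SLOTS:
            return SLOTS[(current, slot)]
        current = PARENTS.get(current)
    raise KeyError("no slot %r reachable from %r" % (slot, obj_id))
```

Code sending a message never sees the tables; it just calls
lookup("my_point", "describe") and invokes the result, which is the
sense in which the database stays below the language level.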

Now, with this database vm in place (or, for argument, let's say that  
vm has expanded to include the notions of query and transaction), the  
explicit "type analysis phase" in the compiler might be embellished  
at will; new techniques could be installed and removed interactively,  
since they could be expressed as optimization techniques within the  
query engine (I'm assuming that a pluggable optimizer would be a very  
useful object to have in your lobby if you were a self developer,  
let's say...)  Because it supports transactions, many active worlds  
could co-exist in which parts are shared; those parts might take on  
radically different looks in alternative worlds, since the objects  
which define a world implicitly define a data model: the query context  
of this database is the context of object slots in a given world.   
Really useful transformation techniques such as "magic sets" could be  
easily exploited within a given context by self's dispatching and  
compiling facilities.  And the world would be a better place...

Here are the most recent threads on this mailing list in this light:

The **multi-methods vs. single receiver methods** question becomes  
whether to use a generalized "select on these attributes" vs. a basic  
"lookup-on-this-key".  A very interesting possibility that emerges  
would be that not only could a multi-method be a "call to action" for  
a certain cooperating group of objects, but that that group of  
objects might be partially defined...for instance, send a prototype a  
message that dispatches on an entire set of prototypes, without  
having to physically construct the set before sending the message.   
If you think about this one in the self context (a world where a  
finite number of objects are sitting around doing things for you),  
this could be both simple and incredibly powerful.  Of course, you  
still have to come up with the good language syntax...
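
A rough Python sketch of the "select on these attributes" reading of
dispatch, under assumptions of my own: methods live as rows in a
table, each argument carries a "kind" tag, "*" is a wildcard, and the
most specific row (fewest wildcards) wins.  MethodRow, METHODS,
select and send are hypothetical names, not anything in Self.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class MethodRow:
    selector: str
    arg_kinds: Tuple[str, ...]  # "*" matches any kind
    body: Callable[..., str]


# The method table: dispatch is a selection over these rows.
METHODS = [
    MethodRow("collide", ("ship", "asteroid"), lambda a, b: "ship destroyed"),
    MethodRow("collide", ("ship", "*"), lambda a, b: "ship bounces"),
    MethodRow("collide", ("*", "*"), lambda a, b: "nothing happens"),
]


def select(selector, kinds):
    """Return the most specific row matching the selector and every argument kind."""
    matches = [
        row for row in METHODS
        if row.selector == selector
        and len(row.arg_kinds) == len(kinds)
        and all(pattern in ("*", kind)
                for pattern, kind in zip(row.arg_kinds, kinds))
    ]
    if not matches:
        raise LookupError("no method %r for %r" % (selector, kinds))
    # Assumption: specificity is measured by counting wildcards.
    return min(matches, key=lambda row: row.arg_kinds.count("*"))


def send(selector, *args):
    """Dispatch on all arguments at once, then run the chosen body."""
    kinds = tuple(arg["kind"] for arg in args)
    return select(selector, kinds).body(*args)
```

Single-receiver dispatch is the special case where only the first
column is ever constrained, so the selection collapses to a
lookup-on-this-key.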

**Object persistence** is clipped into the vm -- it is important to  
realize that it is NOT integral to the language -- it is a separate,  
invokable, service.  You might not even use it, if you're doing  
something lightweight.  Different vms might use different  
instantiation techniques, which could help to solve the:

**prototype corruption problem** -- prototypes are "selected" into a  
current world, which means that if they become trashed, the world can  
be "rolled back" to a consistent state.  When the desired changes  
have been made, the new image of the object can be committed to the  
database...using a time-based transaction mechanism.  In other words,  
old objects never disappear, they just become "shadowed".  This kind  
of technique is becoming very common, and is very manageable in this  
context, since the commit granularity is usually on a per-session  
basis (maybe once every few hours...)  So, I could reference "the  
object that I used last week that I redefined black in.." if I wanted  
to break my system, for instance.  The debugging possibilities are  
really great, as well as the large-project, object-sharing aspects of  
the system.
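
The select / roll back / commit cycle above can be sketched in a few
lines of Python.  This is only an illustration under stated
assumptions: VersionedStore is a hypothetical class, a logical
counter stands in for real timestamps, and object state is reduced to
a flat dictionary of slots.

```python
class VersionedStore:
    """Append-only store: commits shadow old versions, never overwrite them."""

    def __init__(self):
        self._history = {}  # name -> list of (commit_time, state), oldest first
        self._clock = 0     # logical clock (assumption: stands in for wall time)

    def commit(self, name, state):
        """Record a new image of the object and return its commit time."""
        self._clock += 1
        self._history.setdefault(name, []).append((self._clock, dict(state)))
        return self._clock

    def select(self, name, as_of=None):
        """Copy an object into the current world, optionally as of a past time."""
        versions = self._history[name]
        if as_of is None:
            return dict(versions[-1][1])
        for commit_time, state in reversed(versions):
            if commit_time <= as_of:
                return dict(state)
        raise KeyError("%r did not exist at time %r" % (name, as_of))


store = VersionedStore()
t1 = store.commit("palette", {"black": "#000000"})

world = store.select("palette")   # working copy in the current world
world["black"] = "#ffffff"        # oops, the prototype is trashed
world = store.select("palette")   # roll back: re-select the committed image

t2 = store.commit("palette", {"black": "#101010"})  # a deliberate redefinition
old = store.select("palette", as_of=t1)  # "the object that I used last week"
```

Because select always hands back a copy, trashing the working object
never touches committed history; rolling back is just selecting again.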

The key point here is that the way self is structured seems to allow  
for an entirely different, and more sensible, treatment of  
persistence and evolution.  Prototypes are concrete, so identity is  
not a big problem.  The query model is simple, because it lives below  
user space.  (Programmers never even see it.)  Many, many good type  
inference algorithms already exist and could be utilized to make the  
compiler blazingly fast.  And this would all be pretty simple (in a  
relative sense), again, because it is not intended to be general  
purpose.

The performance and consistency techniques that have been developed  
in the database world would complement self's expressive  
computational model beautifully.  If I'm not mistaken, object  
identity -- the big problem facing most people trying to do this kind  
of thing in Smalltalk, let's say -- doesn't rear its head at all,  
since there is simply no way that data *can't* have an identity in  
self.  The data *is* the object...the only real difference that would  
be reflected by this type of scheme would be the notion of an  
explicit "query".  Data (that is, objects) comes from some point in  
time, which can be reasonably defaulted to "the past", but might also  
include such interesting generic possibilities as "the future".

Any thoughts, anyone?  (If you made it this far...)
David Stutz