"(C)1984-1985 Jecel Assumpcao Jr"

"$Id: st84.txt,v 1.1.1.1 1995/06/29 00:14:36 root Exp $"

"This is an implementation of Smalltalk described
in Smalltalk itself. This was designed in late 1984, but
what follows is a translation into English of the
February 26, 1985 version. It is only a paper design
as there was no way to test it then and represents
the first real object oriented design for the author.
A confusing aspect of this design is the *maxLex*
thing that keeps popping up - the author had assumed
that blocks were real closures and compiled temporary
variable references within blocks as (lexical level,
variable index) pairs with the associated complications
in the runtime access. The design is for a sequencial
interpreter, but a dataflow-like parallel implementation
is possible with the same object code format ( which
was one of the reasons why the format was chosen ).
--------------------------------------------------------
The program below is a discription of part of the Smalltalk
system that will be eventually written in machine language
and that interprets compiled methods. It is interesting to
note the resemblence between the static data structures
( templates and blocks ) and the dynamic structures used
by the interpreter. These structures take up more memory
than the bytecodes used in other Smalltalk systems, but
achieve a greater uniformity, which we hope will result
in a higher performance.

Only the methods that help to make clear how the interepreter
works will be shown."

Object subclass: #Template
       instanceVariableNames: 'expressions'
       classVariableNames: ''
       poolDictionaries: ''
       category: ''

"This is the basic unit of Smalltalk object code. It
represents a single message as an Array of expressions
containing the selector, the receiver and the arguments
of the message. While the selector is a string, the
receiver and the arguments are themselves Templates,
forming a tree." !

! Template methodsFor: '' !

sender: context1 index: anInteger maxLex: context2
    "makes an ( almost ) clone of itself and
     saves the specified dynamic environment"
    | tmp |
    tmp <- MethodContext new.
    tmp sender: context1 index: anInteger maxLex: context2.
    tmp expressions: expressions shallowCopy.
    ^ tmp !

saveContext
    "the current environment is stored away for later"
    | |
    ^ self sender: CurrentProcess currentContext
           index: CurrentProcess index
           maxLex: CurrentProcess maxLex !

eval
    "creates a new context to determine the
     value of this code in the current environment"
    | |
    CurrentProcess dispatch: self saveContext !!


Template subclass: #ReplyTemplate
       instanceVariableNames: ''
       classVariableNames: ''
       poolDictionaries: ''
       category: '' 
       
"This class is for code which starts with the $^
character" !

! ReplyTemplate methodsFor: '' !

sender: context1 index: anInteger maxLex: context2
    "remember the home context, rather than
     the current one"
    | |
    ^ super sender: context2 sender
            index: context2 index
            maxLex: context2 !!


Template subclass: #SuperTemplate
       instanceVariableNames: ''
       classVariableNames: ''
       poolDictionaries: ''
       category: '' 
       
"This class represents code generated from
messages sent to super" !

! SuperTemplate methodsFor: '' !

sender: context1 index: anInteger maxLex: context2
    "the clone is a SuperContext"
    | tmp e |
    tmp <- SuperContext new.
    tmp sender: context1 index: anInteger maxLex: context2.
    e <- expressions shallowCopy.
    e at: 2 put: context2 receiver.
    tmp expressions: e.
    ^ tmp !!


Template subclass: #SelfTemplate
       instanceVariableNames: ''
       classVariableNames: ''
       poolDictionaries: ''
       category: '' 
       
"This class represents code generated from
messages sent to self" !

! SelfTemplate methodsFor: '' !

sender: context1 index: anInteger maxLex: context2
    "the receiver is the same as context2 receiver"
    | tmp e |
    tmp <- MethodContext new.
    tmp sender: context1 index: anInteger maxLex: context2.
    e <- expressions shallowCopy.
    e at: 2 put: context2 receiver.
    tmp expressions: e.
    ^ tmp !!


Template subclass: #HCTemplate
       instanceVariableNames: ''
       classVariableNames: ''
       poolDictionaries: ''
       category: '' 
       
"This class represents code generated from
messages sent to the home context" !

! HCTemplate methodsFor: '' !

sender: context1 index: anInteger maxLex: context2
    "the receiver is the sending context"
    | tmp e |
    tmp <- MethodContext new.
    tmp sender: context1 index: anInteger maxLex: context2.
    e <- expressions shallowCopy.
    e at: 2 put: context1.
    tmp expressions: e.
    ^ tmp !!


Template subclass: #LiteralTemplate
       instanceVariableNames: ''
       classVariableNames: ''
       poolDictionaries: ''
       category: '' 
       
"This class represents literal objects
in the source code. The expressions
instance variable contains a single
object ( of any kind ) rather than an
Array of Templates" !

! LiteralTemplate methodsFor: '' !

eval
    "this is easy: we already know the answer"
    | |
    CurrentProcess reply: expressions 
                   to: CurrentProcess currentContext
                   index: CurrentProcess index
                   maxLex: CurrentProcess maxLex !!


Template subclass: #LiteralReply
       instanceVariableNames: ''
       classVariableNames: ''
       poolDictionaries: ''
       category: '' 
       
"This class represents code generated from
a literal object in the source code after
the $^ character" !

! LiteralReply methodsFor: '' !

eval
    | |
    "the answer goes to the home context sender"
    CurrentProcess reply: expressions 
                   to: CurrentProcess maxLex currentContext
                   index: CurrentProcess maxLex index
                   maxLex: CurrentProcess maxLex maxLex !!


Template subclass: #Block
       instanceVariableNames: 'numTemporaries'
       classVariableNames: ''
       poolDictionaries: ''
       category: '' 
       
"This class represents blocks in the source
code. numTemporaries tells us how much space
to add when cloning. expressions is an Array
of Templates which are evaluated in order and
have all ( but the last ) values thrown away.

Source code methods are also compiled to
blocks." !

! Block methodsFor: '' !

sender: context1 index: anInteger maxLex: context2
    "cloning much reserve space for the temporaries -
    this could be avoided by merging the temporaries
    and expressions Arrays"
    | tmp |
    tmp <- BlockContext new.
    tmp nextLex: context1.
    tmp expressions: expressions. "no need to copy
                                   them as they cannot
                                   be changed ( the
                                   returned values are
                                   thrown away"
    tmp temporaries: ( Array new: numTemporaries ).
    ^ tmp !

eval
    "evaluating a block results in a BlockContext"
    | |
    CurrentProcess reply: self saveContext 
                   to: CurrentProcess currentContext
                   index: CurrentProcess index
                   maxLex: CurrentProcess maxLex !!


Object subclass: #Context
       instanceVariableNames: 'sender index maxLex expressions'
       classVariableNames: ''
       poolDictionaries: ''
       category: '' 
       
"This represents the current state of a computation" !

! Context methodsFor: '' !

sender: context1 index: anInteger maxLex: context2
    "initialize"
    | |
    sender <- context1.
    index  <- anInteger.
    maxLex <- context2. !

expressions: anArray
    | |
    expressions <- anArray !

at: int put: value
    | |
    expressions at: int put: value !

lexicalLevel: anInt
    "normal Contexts may nest in the same
     lexical level"
    | |
    ^ sender lexicalLevel: anInt !!


Context subclass: #BlockContext
       instanceVariableNames: 'temporaries nextLex'
       classVariableNames: ''
       poolDictionaries: ''
       category: '' 
       
"This class represents the dynamic environment
of the evalutation of a block" !

! BlockContext methodsFor: '' !

temporaries: anArray
    | |
    temporaries <- anArray !

nextLex: context
    | |
    nextLex <- context !

value
    "starts the evaluation of the block"
    | |
    sender <- CurrentProcess currentContext.
    index <- CurrentProcess index.
    maxLex <- CurrentProcess maxLex.
    CurrentProcess dispatch: self !

value: arg
    | |
    temporaries at: 1 put: arg.
    self value !

value: arg1 with: arg2
    | |
    temporaries at: 2 put: arg2.
    self value: arg1 !

evalAt: anInt
    "evaluate one of the block's expressions"
    | |
    ( anInt = expressions size )
        ifTrue: [ CurentProcess return ] "replies
                             to whom I should reply"
        ifFalse: [ ( expressions at: anInt ) eval ] !

temporary: anInt
    | |
    ^ temporaries at: anInt !

temporary: anInt put: obj
    | |
    temporaries at: anInt put: obj !

at: anInt put: obj
    | |
    ^ self "just ignore it - blocks throw aways results" !

lexicalLevel: anInt
    "a block always starts a new lexical level"
    | |
    ( anInt = 1 ) ifTrue: [ ^ self ]
                  ifFalse: [ ^ nextLex lexicalLevel: anInt - 1 ] !!


Context subclass: #MethodContext
       instanceVariableNames: ''
       classVariableNames: ''
       poolDictionaries: ''
       category: '' 
       
"stores the dynamic environment of evaluating a message.
There are two stages:
  1) the receiver and arguments must be evaluated
  2) the message must be sent ( the associated
     method is found and this becomes the home
     context of that method as it is evaluated )" !

! MethodContext methodsFor: '' !

evalAt: anInt
    | |
    ( anInt > expressions size )
        ifTrue: [ CurentProcess send ] 
        ifFalse: [ ( expressions at: anInt ) eval ] !

receiverClass
    | |
    ^ self receiver class !

receiver
    | |
    ^ expressions at: 2 !

selector
    | |
    ^ expressions at: 1 !

argument: anInt
    | |
    ^ expressions at: anInt + 2 !!


MethodContext subclass: #SuperContext
       instanceVariableNames: ''
       classVariableNames: ''
       poolDictionaries: ''
       category: '' 
       
"Kludge to implement super" !

! SuperContext methodsFor: '' !

receiverClass
    "this dynamic lookup is very inefficient and only
     must be done because a messages holder is not
     stored anywhere"
    | |
    ^ ( maxLex receiver class superclassWhichHas: maxLex selector ) 
                  superclass !!


Object subclass: #Interpreter
       instanceVariableNames: 'currentContext index maxLex'
       classVariableNames: ''
       poolDictionaries: ''
       category: '' 
       
"This represent a single thread of execution. The global
variable CurrentProcess is always an object of this
class" !

! Interpreter methodsFor: '' !

currentContext
    | |
    ^ currentContext !

index
    | |
    ^ index !

maxLex
    | |
    ^ maxLex !

step
    "evaluate the next thing"
    | |
    index <- index + 1.
    currentContext evalAt: index !

dispatch: newContext
    | |
    "starts an additional level - like the
     colon in FORTH"
    currentContext <- newContext.
    index <- 0.
    self step !

return
    "ends a level of evaluation - like the
     semicolon in FORTH"
    | |
    index <- currentContext index.
    maxLex <- currentContext maxLex.
    currentContext <- currentContext sender !

reply: value to: context1 index: anInt maxLex: context2
    "replaces a context subtree with the computed
     result"
    | |
    index <- anInt.
    maxLex <- context2.
    currentContext <- context1.
    currentContext at: index put: value.
    self step !

send
    "starts the second phase of MethodContext evaluation"
    | method value |
    method <- self lookFor: currentContext selector
                   in: currentContext receiverClass.
    method isNil ifTrue: [ self sendMessageNotUnderstood ]
                 ifFalse: [ value <- PrimitiveFailed.
                            method primitive isNil
                               ifFalse: [ value <- self doPrimitive:
                                                  method primitive ].
                            ( value = PrimitiveFailed )
                               ifFalse: [
                                   self reply: value
                                        to: currentContext sender
                                        index: currentContext index
                                        maxLex: currentContext maxLex
                               ]
                               ifTrue: [
                                    maxLex <- currentContext.
                                    method compiled eval
                               ] ] !

lookFor: selector in: class
    "not really needed - just to speed up things"
    | value |
    Cache includes: selector and: class
        ifTrue: [ value <- Cache at: selector and: class ]
        ifFalse: [ value <- class lookup: selector.
                   value isNil
                       ifFalse: [ Cache at: selector and: class
                                             put: value ] ]
    ^ value !!
"-----------------------------------------------
To see how this might work, it is important to know
what the compiler output looks like. On page 546 of
the Blue Book ( Smalltalk-80: The Language and Its
Implementation ) we find this example:

 merge: aRectangle
     | minPoint maxPoint |
     minPoint <- origin min: aRectangle origin.
     maxPoint <- corner max: aRectangle corner.
     ^ Rectangle origin: minPoint
                 corner: maxPoint

which gives as the bytecodes: 

#( 0 16 209 224 105 1 16 211 226 106 69 17 18 224 124 )

and the literal frame:

#( min: origin max: corner origin:corner: Association(Rectangle,...) )

For our interpreter we would have:

Block(2,#(HCTemplate(#(LiteralTemplate('temporary:put:')
                      nil
                      LiteralTemplate(1)
                      Template(#(LiteralTemplate('min:')
                                 SelfTemplate(#(LiteralTemplate('basicAt:')
                                            nil
                                            LiteralTemplate(1)
                                           ))
                                 Template(#(LiteralTemplate('origin')
                                          HCTemplate(#(LiteralTemplate
                                                            ('argument:')
                                                       LiteralTemplate(1)
                                         ))
                               ))
                   ))
          HCTemplate(#(LiteralTemplate('temporary:put:')
                      nil
                      LiteralTemplate(2)
                      Template(#(LiteralTemplate('max:')
                                 SelfTemplate(#(LiteralTemplate('basicAt:')
                                            nil
                                            LiteralTemplate(2)
                                           ))
                                 Template(#(LiteralTemplate('corner')
                                          HCTemplate(#(LiteralTemplate
                                                            ('argument:')
                                                       LiteralTemplate(1)
                                         ))
                               ))
                   ))
          ReplyTemplate(
                   #(LiteralTemplate('origin:corner:')
                     Template(#(LiteralTemplate('value')
                                LiteralTemplate(Association(Rectangle,...)
                              ))
                     HCTemplate(#(LiteralTemplate('temporary:')
                                  LiteralTemplate(1)
                               ))
                     HCTemplate(#(LiteralTemplate('temporary:')
                                  LiteralTemplate(2)
                               ))
                   ))
         )
     )
 
Now this includes 38 Template objects, which would consume a
huge amount of memory ( plus the 14 Array objects ). Of these,
14 LiteralTemplate objects are there only to quote the selector
name - if the interpreter were modified to skip selector evaluation
they could be eliminated. This would still make the memory requirements
of this implementation several times that of a bytecoded one. If
the compiler cached objects like:

HCTemplate(#(LiteralTemplate('temporary:')
             LiteralTemplate(2)
          ))

which occur repeatedly, the actual code expansion could be less than
100%. Also, the interpreter could short circuit evaluation when
encountering such common templates.

The reason for all this mess is that a bytecoded implementation of
Smalltalk didn't seem object-oriented enough. This proposal has
message passing at the bottom ( to use a Self term ) and the compiled
code itself is made up of objects. It was hoped that such an
implementation would be easier to render into hardware."