"(C)1984-1985 Jecel Assumpcao Jr" "$Id: st84.txt,v 1.1.1.1 1995/06/29 00:14:36 root Exp $" "This is an implementation of Smalltalk described in Smalltalk itself. This was designed in late 1984, but what follows is a translation into English of the February 26, 1985 version. It is only a paper design as there was no way to test it then and represents the first real object oriented design for the author. A confusing aspect of this design is the *maxLex* thing that keeps popping up - the author had assumed that blocks were real closures and compiled temporary variable references within blocks as (lexical level, variable index) pairs with the associated complications in the runtime access. The design is for a sequencial interpreter, but a dataflow-like parallel implementation is possible with the same object code format ( which was one of the reasons why the format was chosen ). -------------------------------------------------------- The program below is a discription of part of the Smalltalk system that will be eventually written in machine language and that interprets compiled methods. It is interesting to note the resemblence between the static data structures ( templates and blocks ) and the dynamic structures used by the interpreter. These structures take up more memory than the bytecodes used in other Smalltalk systems, but achieve a greater uniformity, which we hope will result in a higher performance. Only the methods that help to make clear how the interepreter works will be shown." Object subclass: #Template instanceVariableNames: 'expressions' classVariableNames: '' poolDictionaries: '' category: '' "This is the basic unit of Smalltalk object code. It represents a single message as an Array of expressions containing the selector, the receiver and the arguments of the message. While the selector is a string, the receiver and the arguments are themselves Templates, forming a tree." ! ! Template methodsFor: '' ! sender: context1 index: anInteger maxLex: context2 "makes an ( almost ) clone of itself and saves the specified dynamic environment" | tmp | tmp <- MethodContext new. tmp sender: context1 index: anInteger maxLex: context2. tmp expressions: expressions shallowCopy. ^ tmp ! saveContext "the current environment is stored away for later" | | ^ self sender: CurrentProcess currentContext index: CurrentProcess index maxLex: CurrentProcess maxLex ! eval "creates a new context to determine the value of this code in the current environment" | | CurrentProcess dispatch: self saveContext !! Template subclass: #ReplyTemplate instanceVariableNames: '' classVariableNames: '' poolDictionaries: '' category: '' "This class is for code which starts with the $^ character" ! ! ReplyTemplate methodsFor: '' ! sender: context1 index: anInteger maxLex: context2 "remember the home context, rather than the current one" | | ^ super sender: context2 sender index: context2 index maxLex: context2 !! Template subclass: #SuperTemplate instanceVariableNames: '' classVariableNames: '' poolDictionaries: '' category: '' "This class represents code generated from messages sent to super" ! ! SuperTemplate methodsFor: '' ! sender: context1 index: anInteger maxLex: context2 "the clone is a SuperContext" | tmp e | tmp <- SuperContext new. tmp sender: context1 index: anInteger maxLex: context2. e <- expressions shallowCopy. e at: 2 put: context2 receiver. tmp expressions: e. ^ tmp !! Template subclass: #SelfTemplate instanceVariableNames: '' classVariableNames: '' poolDictionaries: '' category: '' "This class represents code generated from messages sent to self" ! ! SelfTemplate methodsFor: '' ! sender: context1 index: anInteger maxLex: context2 "the receiver is the same as context2 receiver" | tmp e | tmp <- MethodContext new. tmp sender: context1 index: anInteger maxLex: context2. e <- expressions shallowCopy. e at: 2 put: context2 receiver. tmp expressions: e. ^ tmp !! Template subclass: #HCTemplate instanceVariableNames: '' classVariableNames: '' poolDictionaries: '' category: '' "This class represents code generated from messages sent to the home context" ! ! HCTemplate methodsFor: '' ! sender: context1 index: anInteger maxLex: context2 "the receiver is the sending context" | tmp e | tmp <- MethodContext new. tmp sender: context1 index: anInteger maxLex: context2. e <- expressions shallowCopy. e at: 2 put: context1. tmp expressions: e. ^ tmp !! Template subclass: #LiteralTemplate instanceVariableNames: '' classVariableNames: '' poolDictionaries: '' category: '' "This class represents literal objects in the source code. The expressions instance variable contains a single object ( of any kind ) rather than an Array of Templates" ! ! LiteralTemplate methodsFor: '' ! eval "this is easy: we already know the answer" | | CurrentProcess reply: expressions to: CurrentProcess currentContext index: CurrentProcess index maxLex: CurrentProcess maxLex !! Template subclass: #LiteralReply instanceVariableNames: '' classVariableNames: '' poolDictionaries: '' category: '' "This class represents code generated from a literal object in the source code after the $^ character" ! ! LiteralReply methodsFor: '' ! eval | | "the answer goes to the home context sender" CurrentProcess reply: expressions to: CurrentProcess maxLex currentContext index: CurrentProcess maxLex index maxLex: CurrentProcess maxLex maxLex !! Template subclass: #Block instanceVariableNames: 'numTemporaries' classVariableNames: '' poolDictionaries: '' category: '' "This class represents blocks in the source code. numTemporaries tells us how much space to add when cloning. expressions is an Array of Templates which are evaluated in order and have all ( but the last ) values thrown away. Source code methods are also compiled to blocks." ! ! Block methodsFor: '' ! sender: context1 index: anInteger maxLex: context2 "cloning much reserve space for the temporaries - this could be avoided by merging the temporaries and expressions Arrays" | tmp | tmp <- BlockContext new. tmp nextLex: context1. tmp expressions: expressions. "no need to copy them as they cannot be changed ( the returned values are thrown away" tmp temporaries: ( Array new: numTemporaries ). ^ tmp ! eval "evaluating a block results in a BlockContext" | | CurrentProcess reply: self saveContext to: CurrentProcess currentContext index: CurrentProcess index maxLex: CurrentProcess maxLex !! Object subclass: #Context instanceVariableNames: 'sender index maxLex expressions' classVariableNames: '' poolDictionaries: '' category: '' "This represents the current state of a computation" ! ! Context methodsFor: '' ! sender: context1 index: anInteger maxLex: context2 "initialize" | | sender <- context1. index <- anInteger. maxLex <- context2. ! expressions: anArray | | expressions <- anArray ! at: int put: value | | expressions at: int put: value ! lexicalLevel: anInt "normal Contexts may nest in the same lexical level" | | ^ sender lexicalLevel: anInt !! Context subclass: #BlockContext instanceVariableNames: 'temporaries nextLex' classVariableNames: '' poolDictionaries: '' category: '' "This class represents the dynamic environment of the evalutation of a block" ! ! BlockContext methodsFor: '' ! temporaries: anArray | | temporaries <- anArray ! nextLex: context | | nextLex <- context ! value "starts the evaluation of the block" | | sender <- CurrentProcess currentContext. index <- CurrentProcess index. maxLex <- CurrentProcess maxLex. CurrentProcess dispatch: self ! value: arg | | temporaries at: 1 put: arg. self value ! value: arg1 with: arg2 | | temporaries at: 2 put: arg2. self value: arg1 ! evalAt: anInt "evaluate one of the block's expressions" | | ( anInt = expressions size ) ifTrue: [ CurentProcess return ] "replies to whom I should reply" ifFalse: [ ( expressions at: anInt ) eval ] ! temporary: anInt | | ^ temporaries at: anInt ! temporary: anInt put: obj | | temporaries at: anInt put: obj ! at: anInt put: obj | | ^ self "just ignore it - blocks throw aways results" ! lexicalLevel: anInt "a block always starts a new lexical level" | | ( anInt = 1 ) ifTrue: [ ^ self ] ifFalse: [ ^ nextLex lexicalLevel: anInt - 1 ] !! Context subclass: #MethodContext instanceVariableNames: '' classVariableNames: '' poolDictionaries: '' category: '' "stores the dynamic environment of evaluating a message. There are two stages: 1) the receiver and arguments must be evaluated 2) the message must be sent ( the associated method is found and this becomes the home context of that method as it is evaluated )" ! ! MethodContext methodsFor: '' ! evalAt: anInt | | ( anInt > expressions size ) ifTrue: [ CurentProcess send ] ifFalse: [ ( expressions at: anInt ) eval ] ! receiverClass | | ^ self receiver class ! receiver | | ^ expressions at: 2 ! selector | | ^ expressions at: 1 ! argument: anInt | | ^ expressions at: anInt + 2 !! MethodContext subclass: #SuperContext instanceVariableNames: '' classVariableNames: '' poolDictionaries: '' category: '' "Kludge to implement super" ! ! SuperContext methodsFor: '' ! receiverClass "this dynamic lookup is very inefficient and only must be done because a messages holder is not stored anywhere" | | ^ ( maxLex receiver class superclassWhichHas: maxLex selector ) superclass !! Object subclass: #Interpreter instanceVariableNames: 'currentContext index maxLex' classVariableNames: '' poolDictionaries: '' category: '' "This represent a single thread of execution. The global variable CurrentProcess is always an object of this class" ! ! Interpreter methodsFor: '' ! currentContext | | ^ currentContext ! index | | ^ index ! maxLex | | ^ maxLex ! step "evaluate the next thing" | | index <- index + 1. currentContext evalAt: index ! dispatch: newContext | | "starts an additional level - like the colon in FORTH" currentContext <- newContext. index <- 0. self step ! return "ends a level of evaluation - like the semicolon in FORTH" | | index <- currentContext index. maxLex <- currentContext maxLex. currentContext <- currentContext sender ! reply: value to: context1 index: anInt maxLex: context2 "replaces a context subtree with the computed result" | | index <- anInt. maxLex <- context2. currentContext <- context1. currentContext at: index put: value. self step ! send "starts the second phase of MethodContext evaluation" | method value | method <- self lookFor: currentContext selector in: currentContext receiverClass. method isNil ifTrue: [ self sendMessageNotUnderstood ] ifFalse: [ value <- PrimitiveFailed. method primitive isNil ifFalse: [ value <- self doPrimitive: method primitive ]. ( value = PrimitiveFailed ) ifFalse: [ self reply: value to: currentContext sender index: currentContext index maxLex: currentContext maxLex ] ifTrue: [ maxLex <- currentContext. method compiled eval ] ] ! lookFor: selector in: class "not really needed - just to speed up things" | value | Cache includes: selector and: class ifTrue: [ value <- Cache at: selector and: class ] ifFalse: [ value <- class lookup: selector. value isNil ifFalse: [ Cache at: selector and: class put: value ] ] ^ value !! "----------------------------------------------- To see how this might work, it is important to know what the compiler output looks like. On page 546 of the Blue Book ( Smalltalk-80: The Language and Its Implementation ) we find this example: merge: aRectangle | minPoint maxPoint | minPoint <- origin min: aRectangle origin. maxPoint <- corner max: aRectangle corner. ^ Rectangle origin: minPoint corner: maxPoint which gives as the bytecodes: #( 0 16 209 224 105 1 16 211 226 106 69 17 18 224 124 ) and the literal frame: #( min: origin max: corner origin:corner: Association(Rectangle,...) ) For our interpreter we would have: Block(2,#(HCTemplate(#(LiteralTemplate('temporary:put:') nil LiteralTemplate(1) Template(#(LiteralTemplate('min:') SelfTemplate(#(LiteralTemplate('basicAt:') nil LiteralTemplate(1) )) Template(#(LiteralTemplate('origin') HCTemplate(#(LiteralTemplate ('argument:') LiteralTemplate(1) )) )) )) HCTemplate(#(LiteralTemplate('temporary:put:') nil LiteralTemplate(2) Template(#(LiteralTemplate('max:') SelfTemplate(#(LiteralTemplate('basicAt:') nil LiteralTemplate(2) )) Template(#(LiteralTemplate('corner') HCTemplate(#(LiteralTemplate ('argument:') LiteralTemplate(1) )) )) )) ReplyTemplate( #(LiteralTemplate('origin:corner:') Template(#(LiteralTemplate('value') LiteralTemplate(Association(Rectangle,...) )) HCTemplate(#(LiteralTemplate('temporary:') LiteralTemplate(1) )) HCTemplate(#(LiteralTemplate('temporary:') LiteralTemplate(2) )) )) ) ) Now this includes 38 Template objects, which would consume a huge amount of memory ( plus the 14 Array objects ). Of these, 14 LiteralTemplate objects are there only to quote the selector name - if the interpreter were modified to skip selector evaluation they could be eliminated. This would still make the memory requirements of this implementation several times that of a bytecoded one. If the compiler cached objects like: HCTemplate(#(LiteralTemplate('temporary:') LiteralTemplate(2) )) which occur repeatedly, the actual code expansion could be less than 100%. Also, the interpreter could short circuit evaluation when encountering such common templates. The reason for all this mess is that a bytecoded implementation of Smalltalk didn't seem object-oriented enough. This proposal has message passing at the bottom ( to use a Self term ) and the compiled code itself is made up of objects. It was hoped that such an implementation would be easier to render into hardware."