Pete's Log: I like reading! part 3

Entry #1148, (Coding, Hacking, & CS stuff)
(posted when I was 23 years old.)

MASA: A Multithreaded Processor Architecture for Parallel Symbolic Computing
by Halstead & Fujita, 15th ISCA, 1988

I found this paper more useful than I anticipated. The architecture it describes is optimized for Multilisp, so I did not expect to find much in it that applied to my interests. Additionally, the architecture has been developed "on paper" only; there is no simulator or any other validation of the ideas described.

However, the authors did put significant thought into context allocation, which is of interest to me because it is similar to frame allocation. The paper also describes runtime environment issues such as procedure linkage, and the ideas presented are probably worth further study. It also discusses the need to swap out frames if the system becomes overloaded.

On the other hand, I skimmed over some other parts of this paper. Architectures for functional languages of this sort are not of great interest to me right now. I also felt that the instruction set proposed was somewhat ... silly. But it had some features that I should not immediately discount.


Monsoon: an Explicit Token-Store Architecture
by Papadopoulos & Culler, 17th ISCA, 1990

This paper presents Monsoon, an architecture that attempts to simplify the hardware required to support dataflow programming models.

Dataflow is cool. It tolerates delays easily, keeps synchronization simple, and exposes plenty of parallelism. Unfortunately, in a "pure" dataflow machine, every incoming tag must be compared to all the tags in the token store, requiring complex matching hardware. The authors suggest the Explicit Token Store (ETS) architecture, which allows dataflow programs to be run without that hardware. Instead of comparing an incoming tag to all stored tokens, each incoming token indicates the location to check. A token consists of a value, a frame pointer (FP), and an instruction pointer (IP). The instruction at the IP indicates an offset from the FP. The location at that offset is inspected to see if the presence bit is set. If it isn't, the value from the token is stored at the location. If it is, the value is read from the location and, along with the value from the token, serves as input to the instruction at the IP. The result of that instruction will then be new tokens, consisting of values, FPs and IPs.
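To make the matching rule above concrete, here is a minimal Python sketch of one ETS step. It is my own illustration, not code from the paper: the names Token, Instruction, and EtsProcessor are hypothetical, every instruction is assumed to take exactly two operands, and I assume the presence bit is cleared when the instruction fires so the slot can be reused. Real tokens would also need to say which operand they carry; the sketch ignores that and uses a commutative operation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Token:
    value: int
    fp: int  # frame pointer: base address of the activation frame
    ip: int  # instruction pointer: which instruction this token targets


@dataclass(frozen=True)
class Instruction:
    op: Callable[[int, int], int]  # two-operand operation (assumption)
    offset: int                    # matching slot, relative to the FP
    dests: Tuple[int, ...]         # IPs that receive the result


class EtsProcessor:
    """Hypothetical single-PE model of explicit token-store matching."""

    def __init__(self, program: Dict[int, Instruction], memory_size: int):
        self.program = program
        self.values = [0] * memory_size
        self.present = [False] * memory_size  # one presence bit per slot

    def step(self, token: Token) -> List[Token]:
        instr = self.program[token.ip]
        slot = token.fp + instr.offset
        if not self.present[slot]:
            # First operand to arrive: park it and wait for its partner.
            self.values[slot] = token.value
            self.present[slot] = True
            return []
        # Second operand: read the partner, clear the bit (assumption), fire.
        stored = self.values[slot]
        self.present[slot] = False
        result = instr.op(stored, token.value)
        return [Token(result, token.fp, dest) for dest in instr.dests]


# One add instruction at IP 0 that matches in frame slot 2 and feeds IP 1.
program = {0: Instruction(op=lambda a, b: a + b, offset=2, dests=(1,))}
pe = EtsProcessor(program, memory_size=16)
first = pe.step(Token(3, fp=8, ip=0))   # parks 3 at address 10
second = pe.step(Token(4, fp=8, ip=0))  # finds 3, fires, emits a new token
```

The point of the sketch is that matching needs only one indexed memory access at FP plus offset, with no associative search over stored tags.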

Monsoon is an implementation of the ETS architecture. It is fairly simple, but seems to work well.

I've not yet decided what I think of this. I think it's a cool idea. But I don't know exactly how feasible it really is. The paper does describe a running prototype system, so this is more than just a paper architecture. I'm not really sure how this applies to me though. I think mainly it's good to have read this so that I can better understand dataflow and its implementations.

One really cool idea in Monsoon is that the tokens stored as execution units on a processor are the same tokens that are sent across the network to invoke remote execution. So inter- and intra-processor communication is achieved by the same means.

Multithreading: A Revisionist View of Dataflow Architectures
by Papadopoulos & Traub, 18th ISCA, 1991

This paper builds on the previous one, detailing additions made to the Monsoon architecture that gave it characteristics of a traditional von Neumann architecture while still leaving the general dataflow infrastructure of the machine in place.

Added to the architecture were fork and join capabilities. The fork is very similar to the fork in P-RISC, while the join is not a separate instruction, but instead an extension to existing instructions. While this feels somewhat burdensome in a CISCy manner, it is an idea not without merit, as it reduces the instruction count. However, I don't know how well this join implementation could be adapted to an architecture that is less dataflow-oriented ...

The section on procedure linkage (3.4) is really cool. I don't think it presents any radically new ideas, but it explains very well how procedure calls might work in a multithreaded continuation-based architecture. It is definitely worth considering when designing a PIM runtime model.

Another interesting topic in this paper is the discussion of critical sections. Instructions that do not generate new threads are able to take a "short circuit" path directly back to the first pipeline stage when they have completed. This allows a thread to run "uninterrupted" (other than by threads in other pipeline stages, which by design would have to refer to other frames). While the concept of a continuation being able to enter a critical section so easily is appealing, I am somewhat worried by the fact that any thread can apparently enter such a critical section and not worry about being preempted. No mention was made of any means to prevent threads from taking over (long-term) exclusive use of the processor in this manner.

The main argument the authors provide for supporting such critical sections is that system code such as resource managers would require this capability. I agree that this is true, but facilities must exist to ensure that not just anyone can gain exclusive use of the processor. The authors also mention that the dataflow literature poorly addresses support for system services such as resource management.

One issue I have with the design of Monsoon is that all addresses consist of a PE address and an offset into that PE. So if a continuation is referenced on a remote node, a PE knows the offset of the code and frame on that node, which smells of poor abstraction to me.
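To illustrate the concern, here is a small Python sketch of an address made of a PE number and an offset into that PE. The field widths and the names GlobalAddress, pack, and unpack are hypothetical choices for illustration, not taken from the paper.

```python
from typing import NamedTuple

# Hypothetical field widths, chosen only for illustration.
PE_BITS = 10
OFFSET_BITS = 22


class GlobalAddress(NamedTuple):
    pe: int      # which processing element owns the location
    offset: int  # raw offset into that PE's local memory


def pack(addr: GlobalAddress) -> int:
    """Encode a (PE, offset) pair into one machine word."""
    if not (0 <= addr.pe < (1 << PE_BITS)):
        raise ValueError("PE number out of range")
    if not (0 <= addr.offset < (1 << OFFSET_BITS)):
        raise ValueError("offset out of range")
    return (addr.pe << OFFSET_BITS) | addr.offset


def unpack(word: int) -> GlobalAddress:
    """Decode a machine word back into its (PE, offset) pair."""
    return GlobalAddress(word >> OFFSET_BITS, word & ((1 << OFFSET_BITS) - 1))


# A remote frame reference as seen by some other PE.
word = pack(GlobalAddress(pe=3, offset=0x1234))
remote = unpack(word)
```

Anyone holding the word can recover the raw offset on PE 3, so the remote node cannot relocate the frame without invalidating every outstanding reference. That is the leak in the abstraction that bothers me.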

But beyond that, Monsoon, and especially the modifications to it described in this paper, has proven to be an interesting architecture for me to consider.