The award will run from July 1st, 2007 to June 30th, 2010.
Research Abstract - In the 1990s, the theme of speculation drove innovation in all aspects of processor design and led to significant cumulative performance gains. Today, single-core performance is not scaling as impressively, owing to technology issues and the lack of a compelling theme to drive a new generation of microarchitecture innovation. While the now-popular multi-core theme is important, it poses more of a challenge than a solution because much software remains non-parallel. Thus, the multi-core theme must be combined with a new sequential-program-centric thrust. This project puts forward a new microarchitecture theme: a paradigm in which the processor has an unprecedented view of the structure of a running program. Like speculation in the past, this paradigm will enable a new generation of powerful performance optimizations.
We propose that current processors are fundamentally performance-limited because they focus narrowly on fine-grained program behavior. In particular, they treat individual memory accesses (loads and stores) to program data separately. The novelty of our proposal is recognizing that program data is naturally organized into objects, and that an object is accessed as a whole via clusters of instructions we term object phases. The additional novelty of this project lies in developing powerful optimizations enabled by object and object phase identification. Our centerpiece, and truly novel, strategy follows from object phases. When the processor detects the end of a phase operating on an object, signaling an end to modifications (stores) to the object, the next phase can be anticipated and the corresponding code can be specialized according to the new data stored within the object. This represents an unprecedented execution model: when the data in one or more objects changes, the program changes with it in real time, continuously compressing the future dynamic instruction stream in reaction to object modifications. We observe millions of instructions between successive object phases on a given object, providing ample time for specialization before the next phase. Furthermore, this execution model presents a unique form of parallelization: the program under observation is not itself parallelized but rather compressed, and the specialization process is highly parallel because responsibility for different objects can be assigned to different processors in a multi-core or many-core platform. Thus the proposal unifies the prevailing multi-core theme with our new sequential-program-centric theme.
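The execution model described above can be illustrated with a small software analogy. The Python sketch below is not the proposed hardware mechanism, which the abstract does not detail. It only shows the idea of re-specializing code for an object's data once a phase of stores to that object has ended. All names here (`TrackedObject`, `end_of_write_phase`, `specialize`, `generic_eval`) are hypothetical and chosen for illustration, and the end of a phase is signaled by an explicit call rather than detected automatically as the proposal envisions.

```python
from typing import Callable, List, Optional


def generic_eval(coeffs: List[float], x: float) -> float:
    """Generic phase code: visits every coefficient on every call."""
    total = 0.0
    for power, c in enumerate(coeffs):
        total += c * x ** power
    return total


def specialize(coeffs: List[float]) -> Callable[[float], float]:
    """Return phase code specialized to the object's current data.

    Zero coefficients are dropped once, at specialization time, so later
    calls execute fewer operations (a "compressed" instruction stream).
    """
    terms = [(power, c) for power, c in enumerate(coeffs) if c != 0.0]

    def specialized(x: float) -> float:
        total = 0.0
        for power, c in terms:
            total += c * x ** power
        return total

    return specialized


class TrackedObject:
    """Hypothetical object whose stores and phase boundaries are observed."""

    def __init__(self, coeffs: List[float]) -> None:
        self.coeffs = list(coeffs)
        self.fast_path: Optional[Callable[[float], float]] = None

    def store(self, index: int, value: float) -> None:
        # A store changes the object's data, so any earlier
        # specialization is stale and must be discarded.
        self.coeffs[index] = value
        self.fast_path = None

    def end_of_write_phase(self) -> None:
        # Assumption: the phase boundary is signaled explicitly here;
        # the proposal has the processor detect it.
        self.fast_path = specialize(self.coeffs)

    def evaluate(self, x: float) -> float:
        # Use the specialized code when it is valid, else the generic code.
        if self.fast_path is not None:
            return self.fast_path(x)
        return generic_eval(self.coeffs, x)


obj = TrackedObject([1.0, 0.0, 0.0, 2.0])  # represents 1 + 2x^3
before = obj.evaluate(2.0)                 # generic path
obj.end_of_write_phase()                   # specialize to current data
after = obj.evaluate(2.0)                  # specialized path, same result
obj.store(1, 3.0)                          # new store invalidates the fast path
changed = obj.evaluate(2.0)                # generic path again: 1 + 3x + 2x^3
```

In this sketch the specialized and generic paths return the same value for the same data; only the amount of work per call differs. Because each object carries its own specialized code, the specialization of different objects is independent, which is the property the abstract relies on when it assigns different objects to different processors.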