Register Pressure Reduction Techniques

A Very Long Instruction Word (VLIW) processor performs multiple operations per clock cycle according to a fixed schedule generated by the compiler, so its runtime performance depends heavily on that compiler. A typical VLIW compiler includes an instruction scheduling phase, which maximizes instruction-level parallelism (ILP), followed by a register allocation phase, which minimizes data spills to external memory. A limitation of this flow is that if ILP is maximized without considering register requirements, a VLIW program may need more registers than are available, leading to increased spills and reduced runtime performance. To address this issue, we have developed a new compiler technique that is applied prior to instruction scheduling and register allocation. By modifying the ordering of operations in a program, this technique controls the number of registers the program requires and reduces spills. To illustrate its benefit, we have implemented the technique in Trimaran, an academic VLIW compiler. Our experiments show that it effectively reduces spills and improves runtime performance for VLIW processors with a limited number of registers.
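The link between operation ordering and register requirement can be seen in a small sketch. This is a toy illustration only, not the Tetris algorithm: it assumes a single-issue linear order rather than a VLIW schedule, assumes each operation defines exactly one value that stays live until its last use, and the names `max_live`, `uses`, `loads_first`, and `interleaved` are hypothetical. It computes the peak number of simultaneously live values for two orderings of the expression (l1 + l2) + (l3 + l4).

```python
def max_live(order, uses):
    """Peak number of simultaneously live values for a linear operation order.

    order: operation names in execution order (each defines one value).
    uses:  maps an operation to the operations whose results it reads.
    A value is counted as live from its definition up to its last use;
    values that are never read are not counted.
    """
    position = {op: i for i, op in enumerate(order)}

    # Record the position of the last reader of every value.
    last_use = {}
    for op in order:
        for src in uses.get(op, ()):
            if position[src] >= position[op]:
                raise ValueError(f"{op} reads {src} before it is defined")
            last_use[src] = max(last_use.get(src, -1), position[op])

    # Count live values at the program point after each operation.
    peak = 0
    for point in range(len(order)):
        live = sum(
            1 for op in order
            if position[op] <= point < last_use.get(op, -1)
        )
        peak = max(peak, live)
    return peak


# Data dependences for r = (l1 + l2) + (l3 + l4); l1..l4 are loads.
uses = {"a1": ("l1", "l2"), "a2": ("l3", "l4"), "r": ("a1", "a2")}

# All loads issued first: exposes the most independent work early.
loads_first = ["l1", "l2", "l3", "l4", "a1", "a2", "r"]
# Each add placed right after its operands: shorter live ranges.
interleaved = ["l1", "l2", "a1", "l3", "l4", "a2", "r"]

print(max_live(loads_first, uses))
print(max_live(interleaved, uses))
```

Under these assumptions the loads-first order keeps four values live at its peak, while the interleaved order needs only three. Both orders respect the same data dependences; only the placement of operations differs, which is the degree of freedom a reordering pass applied before scheduling can exploit.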

W. Xu and R. Tessier, "Tetris-XL: A Performance-Driven Spill Technique for Embedded VLIW Processors," in ACM Transactions on Architecture and Code Optimization, vol. 6, no. 3, September 2009, pp. 1-40.
W. Xu and R. Tessier, "Tetris: A New Register Pressure Control Technique for VLIW Processors," in Proceedings of the ACM/SIGPLAN Conference on Languages, Compilers, and Tools for Embedded Systems, San Diego, CA, June 2007.