Architecture and Real-Time Systems (ARTS) Laboratory


Thermal Aware Management

The need to develop thermal-aware management specifically for cyber-physical systems (CPSs) is growing in significance for several reasons:

First, the high cost of designing and manufacturing microprocessors deters the design of special-purpose controllers for most CPSs. As demanding CPS applications migrate from cost-insensitive domains such as aerospace to highly cost-sensitive domains such as automobiles, the pressure grows for inexpensive solutions. Thus, it has become common practice to use high-volume general-purpose microprocessors as commercial off-the-shelf components, since they are considerably cheaper. The emergence of low-cost multicore chips further reinforces this trend, as complex CPSs often require multiple processors.

However, these commodity processors are not designed for real-time applications, which often have stringent timing constraints, and their only thermal-related concern is to ensure that cores do not suffer a thermal breakdown. To this end, almost all contemporary processors have built-in protection against overheating, either by reducing the voltage/frequency or, in more severe cases, by idling the overheated processor. In general-purpose computing systems, reducing the speed of computation, inserting a brief idle period on a single core, or even idling the entire multicore to let it cool down is acceptable. In a CPS, however, such steps, if not anticipated, may degrade performance and may, in certain situations, cause the mission or application to fail.
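To make the timing risk concrete, the following is a minimal sketch of how an unanticipated throttling step could push a task past its deadline. It assumes, for illustration only, that execution time scales inversely with clock speed (which ignores memory-bound behavior); the function names and the numbers in the usage note are hypothetical and do not come from any particular processor or from our implementation.

```python
def throttled_exec_time(wcet_nominal_ms: float, speed_fraction: float) -> float:
    """Execution time when the core runs at a fraction of its nominal speed.

    Assumption: execution time is inversely proportional to clock speed.
    """
    if not 0.0 < speed_fraction <= 1.0:
        raise ValueError("speed_fraction must be in (0, 1]")
    return wcet_nominal_ms / speed_fraction


def meets_deadline(wcet_nominal_ms: float, deadline_ms: float,
                   speed_fraction: float = 1.0, idle_ms: float = 0.0) -> bool:
    """Check whether a task still finishes by its deadline.

    idle_ms models a forced cool-down pause inserted by the hardware
    (a hypothetical parameter used only for this illustration).
    """
    finish_ms = throttled_exec_time(wcet_nominal_ms, speed_fraction) + idle_ms
    return finish_ms <= deadline_ms
```

Under these assumptions, a task with a 6 ms worst-case execution time and a 10 ms deadline is safe at full speed, but misses its deadline if the core is throttled to half speed (12 ms) or if a 5 ms forced idle period is inserted (11 ms).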

Thermal-aware management is a must for CPSs that operate in hot environments, such as engine compartments, or robots working in harsh search-and-rescue conditions. A higher ambient temperature reduces the temperature gradient between the chip and its environment, slowing the rate at which heat can be dissipated. This results in a higher steady-state chip temperature for the same amount of internal heat dissipation. Off-the-shelf multicores are not hardened for use in high-temperature regimes, unlike some specially designed electronic components for automotive control systems (prototypical CPSs), which are often rated up to $140^\circ$C.
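The effect of ambient temperature can be illustrated with a first-order lumped thermal model, a common textbook approximation rather than a model of any specific chip. In this sketch the chip is a single thermal capacitance connected to the environment through a single thermal resistance, so the steady-state temperature is the ambient temperature plus resistance times dissipated power. All parameter names and values are assumptions chosen for illustration.

```python
def steady_state_temp(t_ambient_c: float, power_w: float,
                      r_th_c_per_w: float) -> float:
    """Steady-state chip temperature: T_ss = T_amb + R_th * P."""
    return t_ambient_c + r_th_c_per_w * power_w


def simulate_temp(t_start_c: float, t_ambient_c: float, power_w: float,
                  r_th_c_per_w: float, c_th_j_per_c: float,
                  dt_s: float, steps: int) -> float:
    """Forward-Euler integration of C * dT/dt = P - (T - T_amb) / R.

    Returns the chip temperature after `steps` time steps of length dt_s.
    """
    temp = t_start_c
    for _ in range(steps):
        heat_out_w = (temp - t_ambient_c) / r_th_c_per_w  # heat leaving the chip
        temp += dt_s * (power_w - heat_out_w) / c_th_j_per_c
    return temp
```

With an assumed thermal resistance of 2 °C/W and 20 W of dissipation, the model settles at 65 °C in a 25 °C room but at 125 °C in an 85 °C enclosure: the same workload runs 60 °C hotter solely because the ambient temperature is higher.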

Even for other CPSs that normally are not expected to experience high ambient temperatures, efficient thermal-aware management can be beneficial. These include CPSs for life-critical applications (e.g., commercial airliners' fly-by-wire systems) where the reduced reliability that comes with continuous excessive heating is often unacceptable.

We are developing algorithms and software implementations for CPS thermal-aware management. Integrated, holistic CPS management requires co-regulation of task processing and physical actions: an optimal and robust CPS must adjust both to meet task-level objectives. Doing so requires models of the temporal, energy, and thermal dynamics of both the cyber side and the controlled plant, together with the relevant costs and constraints. We are devising a highly adaptive, reliability-aware, and application-driven approach to thermal-aware management of CPSs.