Our design goal is simple: save energy without sacrificing performance. A designer can look for energy savings in many parts of a modern computer. Because most computers do not use all of their resources all of the time, this slack can be exploited either by slowing down computation or by shutting off components while they are not in use. We are investigating whether slack can be exploited to save energy on CPUs that support dynamic voltage/frequency scaling.
Power is proportional to clock frequency and to the square of the operating voltage, so the energy spent per cycle is proportional to the square of the voltage. Modern CPUs have a sliding scale of operating voltage vs. frequency: a higher frequency requires a higher voltage, and a lower frequency permits a lower voltage. Therefore, if we can lower the clock frequency (and lower the operating voltage to match), we can save energy even for operations that require a fixed number of cycles.
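The reasoning above can be written out using the standard first-order model of dynamic CMOS power. The symbols are our own shorthand for this sketch: C is the effective switched capacitance, V the operating voltage, f the clock frequency, and N the number of cycles a task needs. Static (leakage) power is ignored.

```latex
\begin{align*}
P &= C V^{2} f            && \text{dynamic power} \\
T &= \frac{N}{f}          && \text{time to run } N \text{ cycles} \\
E &= P\,T = C V^{2} N     && \text{energy for the task}
\end{align*}
```

For a fixed N, the frequency cancels out of the energy, so under this model the saving comes entirely from the lower voltage that a lower frequency permits.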
For example, an MPEG frame might take 50,000 cycles to decode. The frame must be displayed by a certain deadline, set by the frame rate. If I decode the frame at a lower speed so that it finishes just in time to be displayed, I save energy compared with decoding the frame at full CPU speed and then sleeping for the remainder of the interval until display.
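A rough sketch of this comparison in Python follows. Only the 50,000-cycle figure comes from the example; the two operating points, the frame period, the normalized capacitance, and the assumption that an idle CPU draws no power are all hypothetical values chosen for illustration, not measurements.

```python
# Hypothetical operating points: (frequency in Hz, core voltage in V).
FULL_SPEED = (200e6, 1.5)
HALF_SPEED = (100e6, 1.0)

CYCLES_PER_FRAME = 50_000   # cycle count from the example above
FRAME_PERIOD_S = 0.001      # hypothetical time available per frame


def decode_time_s(point, cycles=CYCLES_PER_FRAME):
    """Seconds needed to execute `cycles` at this operating point."""
    freq_hz, _volts = point
    return cycles / freq_hz


def decode_energy(point, cycles=CYCLES_PER_FRAME, capacitance=1.0):
    """Dynamic energy E = C * V^2 * N, in arbitrary units.

    Assumes idle time after decoding costs nothing (an idealization).
    """
    _freq_hz, volts = point
    return capacitance * volts ** 2 * cycles


def meets_deadline(point, period_s=FRAME_PERIOD_S):
    """True if the frame is decoded before it must be displayed."""
    return decode_time_s(point) <= period_s


for name, point in (("full speed", FULL_SPEED), ("half speed", HALF_SPEED)):
    print(name, meets_deadline(point), decode_energy(point))
```

With these made-up numbers both points meet the deadline, and the slower point uses about 44% of the energy of the faster one (1.0 squared over 1.5 squared), even though it executes exactly the same number of cycles.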
Another example is a user interface. If human perception cannot tell the difference between a 5 ms response time and a 50 ms response time, I should run the UI task more slowly, completing it in 50 ms at a slower CPU frequency/voltage combination and thereby saving energy.
So we know we want to scale the CPU dynamically, and we want the CPU frequency/voltage to be set automatically. There are still several hurdles on the way to energy/performance nirvana. First, is there sufficient slack in the performance of tasks to scale the CPU? Second, can the CPU be scaled quickly enough to save that energy without too much switching overhead? Finally, and only if the answers to the first two questions are positive, what or who controls the frequency/voltage scaling? We are exploring the answers to these questions.
We spent last year reviewing the current state of the art in operating-system-level power savings. We looked at work from Govil, Weiser, and Pering to determine whether there was slack to exploit, and whether we could use information inferred from past process performance to determine future CPU frequency. Initially, we were very hopeful that, using only statistics about processes, we could scale for energy savings without losing performance. Using a specially modified Itsy Pocket Computer from Compaq Western Research Labs, we implemented these statistical mechanisms in a 2.0.30 Linux kernel and then evaluated their effectiveness. We did not find good energy savings. The reason was not a lack of slack in the system, nor an inability to scale the voltage/frequency rapidly enough, but simply that past performance alone is not enough information to allow aggressive scaling while still maintaining performance. We presented this work at OSDI 2000. It is available for download from this site as well.
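To give a feel for what scaling from past performance means, here is a deliberately simplified sketch of an interval-based policy. It is not the specific algorithms of Govil, Weiser, or Pering, nor our kernel implementation; the function name, the thresholds, and the relative-speed scale (1.0 means full speed) are all assumptions made for illustration.

```python
def next_speed(prev_speed, busy_fraction, step=0.25,
               min_speed=0.25, max_speed=1.0):
    """Choose the next interval's relative CPU speed from the last interval.

    prev_speed    -- relative speed used during the last interval (0..1]
    busy_fraction -- fraction of the last interval the CPU was not idle
    """
    if busy_fraction >= 0.9:
        # CPU was saturated, so true demand is unknown: step the speed up.
        target = prev_speed + step
    else:
        # Assume the next interval needs the same amount of work as the
        # last one, and run just fast enough to finish it.
        target = prev_speed * busy_fraction
    return min(max_speed, max(min_speed, target))


# A quiet interval followed by a sudden burst of work.
speed = 1.0
for busy in (0.1, 1.0, 1.0, 1.0):
    speed = next_speed(speed, busy)
    print(speed)
```

In this toy trace the policy drops to its lowest speed after the quiet interval and then needs several saturated intervals to climb back to full speed. During that climb the work is running late, which is the kind of performance loss that makes purely history-based scaling hard to do aggressively.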
As a result, we have been looking into additional mechanisms for providing the scaling algorithm in the kernel with information upon which to make scaling decisions.
In a recent paper, Jacob Lorch explored using information about user interface deadlines to scale the processor to save energy. We believe this approach has merit for this category of application, especially if the mechanism for communicating information to the scaling algorithm can be implemented in the user interface library code.
Another approach that we are exploring is using userland information to aid scaling. If an application is rate based (e.g., an MP3 decoder or DVD playback), there is significant internal knowledge about how long a frame of data takes to decode versus how much time exists between frames. If this information can be passed to the scaling algorithm, the process can be run to meet its deadlines in a "just in time" fashion instead of decoding ahead of time and then sitting idle. Recall that running continuously at one slower speed saves more energy than running at full speed and then idling, because energy consumption varies with the square of the processor voltage, and scaling the frequency down allows running at a lower voltage.
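One way such a hint could be used is sketched below: the application reports its cycles per frame and its frame period, and the scaler picks the slowest operating point that still meets the deadline. This is an illustrative sketch only. The table of operating points, the function name, and the idea of passing the hint as two numbers are assumptions, not an existing kernel interface.

```python
# Hypothetical table of operating points: (frequency in Hz, voltage in V).
OPERATING_POINTS = [
    (50e6, 0.9),
    (100e6, 1.1),
    (150e6, 1.3),
    (200e6, 1.5),
]


def pick_operating_point(cycles_per_frame, frame_period_s,
                         points=OPERATING_POINTS):
    """Return the slowest (frequency, voltage) pair that decodes a frame
    within its period, or the fastest pair if none is fast enough."""
    required_hz = cycles_per_frame / frame_period_s
    for freq_hz, volts in sorted(points):
        if freq_hz >= required_hz:
            return freq_hz, volts
    # The deadline cannot be met at any speed; run flat out.
    return max(points)


# A decoder that needs 3 million cycles per frame at 30 frames per second
# requires roughly 90 MHz, so the 100 MHz point is selected.
print(pick_operating_point(3_000_000, 1 / 30))
```

Choosing the slowest sufficient point reflects the just-in-time idea: the frame finishes close to its deadline at the lowest voltage the table allows, rather than finishing early at a high voltage and idling.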