What load average means in the top command

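I've recently been looking into what the information shown by the top command means, mainly the 1, 5, and 15 minute load averages. The term "load average" is easy enough to read literally, but what does it actually represent, and how should the numbers be interpreted?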

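After searching for a while, I was surprised not to find a clear, easy-to-understand explanation. Then I remembered that the book Building Scalable Web Sites (published in Taiwan as 聚沙成塔~建置逐層擴充的Web2.0 服務) covers CPU in its Bottlenecks chapter. I flipped through it and, sure enough, it has an explanation. I'm excerpting it here for the record: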

The load average statistic is a very good quick indicator of the general state of the machine. The three figures shown represent the load average over the last 1, 5, and 15 minutes. The load average is calculated by counting the number of threads in the running or runnable states at any one time. The running state means that the process is currently executing on a processor, while the runnable state means the process is ready to run and is waiting for a processor time slice. The load average is averaged out using samples of the queue length taken every five seconds (on Linux), averaged, and damped over the three time periods. The load average is therefore an indication of how many processes are trying to run at any one moment.
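The "sampled every five seconds, averaged, and damped" step can be sketched roughly as follows. This is only an illustrative sketch: it assumes the damping is an exponential moving average with decay factor exp(-5/window), which is how the Linux calculation is commonly described. The names `update_load` and `simulate` are made up for this example and do not come from the book.

```python
import math

SAMPLE_INTERVAL = 5.0            # seconds between samples (Linux, per the excerpt)
WINDOWS = (60.0, 300.0, 900.0)   # the 1, 5, and 15 minute periods, in seconds


def update_load(load, runnable, window, interval=SAMPLE_INTERVAL):
    """Fold one sample of the run-queue length into a damped average.

    Assumption: exponential damping; older samples fade by exp(-interval/window).
    """
    decay = math.exp(-interval / window)
    return load * decay + runnable * (1.0 - decay)


def simulate(samples):
    """Feed a sequence of run-queue lengths in, starting from an idle machine."""
    loads = [0.0, 0.0, 0.0]
    for runnable in samples:
        loads = [update_load(l, runnable, w) for l, w in zip(loads, WINDOWS)]
    return tuple(loads)


if __name__ == "__main__":
    # Two runnable tasks sustained for one minute (12 samples of 5 seconds).
    one, five, fifteen = simulate([2] * 12)
    print(f"1 min: {one:.2f}, 5 min: {five:.2f}, 15 min: {fifteen:.2f}")
```

Under this assumption the 1 minute figure reacts quickly to a burst, while the 15 minute figure barely moves. That is why comparing the three numbers hints at whether load is rising or falling.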
A load average of zero indicates that no processes are trying to run. When the load average exceeds the number of processors in a box, there are processes in the queue waiting to run. In this way, the load average is less an indicator of current load and more of an indicator of how much work is queued up, waiting to run. A high-load average will make a box appear unresponsive or slow, as each request has to wait in the queue to get serviced. When the load average for a box is less than its number of processors, there are free CPU cycles to go around.
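To make the rule of thumb from the excerpt concrete, here is a small sketch that compares a load average against the number of processors. `describe_load` and its return strings are hypothetical names for this example. The "at capacity" case, where load equals the processor count, is my own addition because the excerpt does not cover it. `os.getloadavg()` is only available on Unix-like systems, so that call is guarded.

```python
import os


def describe_load(load, cpus):
    """Classify a load average relative to the processor count."""
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    if load == 0:
        return "idle"            # no processes are trying to run
    if load > cpus:
        return "queued"          # some processes are waiting for a processor
    if load < cpus:
        return "spare"           # free CPU cycles to go around
    return "at capacity"         # not covered by the excerpt; my assumption


if __name__ == "__main__":
    cpus = os.cpu_count() or 1   # cpu_count() may return None
    try:
        averages = os.getloadavg()   # (1 min, 5 min, 15 min)
    except (AttributeError, OSError):
        averages = ()                # not available on this platform
    for minutes, load in zip((1, 5, 15), averages):
        print(f"{minutes:>2} min: {load:.2f} on {cpus} CPUs -> {describe_load(load, cpus)}")
```

For example, a load of 5.2 on a 4-processor box is reported as "queued", while 1.5 on the same box is reported as "spare".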

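However, after the passage excerpted above, the author goes on to describe situations (for example, when different priorities are assigned to different jobs) in which load average does not fully reflect execution speed. If you're interested, see the book. (The Load (computing) entry in the references below also describes this.)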

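Building Scalable Web Sites actually uses flickr as its running example to explain the architecture of a large website from many angles. It touches on a lot of areas and is quite complete. That said, each area it touches on is a discipline in its own right, so you'll need to dig into them yourself~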

While I'm at it, here is a slide deck on flickr's architecture (the material is a bit dated :P )....

References
