# Pipelining And Non-Pipelining

There are two common ways to process instructions: pipelining and non-pipelining. Non-pipelining has largely been replaced by pipelining because of its lower efficiency and throughput. Both terms are explained below.

## 1. Non-pipelining

In non-pipelined execution, each instruction passes through the following stages to complete its execution:

• Fetch
• Decode
• Execute
• Memory
• Write back

The second instruction has to wait until the first instruction has completed.

So each instruction passes through five stages, and each stage completes in one clock cycle. Therefore, 4 instructions take 5 × 4 = 20 clock cycles to complete without pipelining.

### Numerical problems on non-pipelining

Q1: Each clock cycle takes 15 milliseconds, and 4 instructions each pass through 5 stages to complete execution without pipelining.

1. How much time is required to complete the execution of all instructions?
2. Calculate the efficiency of the system.

Solution: A

Total clock cycles = K × N (K is the number of stages, so K = 5; N is the number of instructions, so N = 4)

= 5 × 4

= 20

Time for one clock cycle = 15 ms

Time for 20 clock cycles = 15 × 20 ms = 300 ms
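The same arithmetic can be written as a small Python sketch. The function names `non_pipelined_cycles` and `total_time_ms` are illustrative choices, not part of the original problem, and the sketch assumes every stage takes exactly one clock cycle.

```python
def non_pipelined_cycles(stages: int, instructions: int) -> int:
    """Clock cycles needed when each instruction must finish before the next starts."""
    return stages * instructions


def total_time_ms(cycles: int, cycle_time_ms: float) -> float:
    """Total execution time, given the duration of one clock cycle."""
    return cycles * cycle_time_ms


cycles = non_pipelined_cycles(stages=5, instructions=4)
print(cycles)                     # 20
print(total_time_ms(cycles, 15))  # 300
```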

Solution: B

Efficiency (utilization) = number of used boxes in the non-pipelining diagram / total number of boxes

In the diagram for 4 instructions with 5 stages, there are 80 boxes in total.

4 instructions each use 5 stages, so 20 boxes are used in total.

Efficiency (utilization) = 20/80 = ¼
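As a quick check of the ratio, here is a minimal Python sketch. The helper name `utilization` is hypothetical; the box counts (20 used out of 80) are taken directly from the solution above.

```python
from fractions import Fraction


def utilization(used_boxes: int, total_boxes: int) -> Fraction:
    """Fraction of boxes in the space-time diagram that actually do work."""
    return Fraction(used_boxes, total_boxes)


print(utilization(20, 80))  # 1/4
```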

## 2. Pipelining

Pipelining is the overlapped execution of more than one instruction. It increases CPU performance.

With pipelining, each instruction still passes through five stages to complete:

• Fetch
• Decode
• Execute
• Memory
• Write back

However, the 2nd instruction is fetched while the first instruction is in the decode stage, the 3rd instruction is fetched while the 2nd instruction is in the decode stage, and so on.

In this way, 4 instructions complete in just 8 clock cycles, compared with 20 without pipelining.
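The overlap described above can be sketched in Python by listing which instruction occupies which stage in every clock cycle. This is an idealised model that assumes no stalls and one cycle per stage; `pipeline_schedule` is a name chosen here for illustration.

```python
STAGES = ["Fetch", "Decode", "Execute", "Memory", "Write back"]


def pipeline_schedule(num_instructions, stages=STAGES):
    """Return {cycle: {instruction: stage}} for an ideal pipeline with no stalls."""
    total_cycles = len(stages) + num_instructions - 1
    schedule = {}
    for cycle in range(1, total_cycles + 1):
        active = {}
        for instr in range(1, num_instructions + 1):
            # Instruction i is fetched in cycle i, then moves one stage per cycle.
            stage_index = cycle - instr
            if 0 <= stage_index < len(stages):
                active[instr] = stages[stage_index]
        schedule[cycle] = active
    return schedule


for cycle, active in pipeline_schedule(4).items():
    print(cycle, active)
```

In cycle 2 the output shows instruction 1 in Decode and instruction 2 in Fetch, and the last instruction reaches Write back in cycle 8.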

## Need of Pipelining

Purpose: to bring the CPI (cycles per instruction) as close to ONE as possible. The higher the number of instructions, the closer the pipeline gets to the CPI target of one.

Note: only the first instruction takes 5 clock cycles to complete; after that, one instruction completes in every clock cycle. So the first instruction takes as many clock cycles as there are stages, and each later instruction finishes one clock cycle after the previous one.
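To see how the CPI moves toward one as the instruction count grows, the following Python sketch divides the pipelined cycle count, K + (N − 1), by the number of instructions. It assumes an ideal pipeline without stalls; `pipelined_cpi` is an illustrative name.

```python
def pipelined_cpi(stages: int, instructions: int) -> float:
    """Average cycles per instruction in an ideal pipeline with no stalls."""
    total_cycles = stages + (instructions - 1)
    return total_cycles / instructions


for n in (1, 4, 100, 10_000):
    print(n, round(pipelined_cpi(5, n), 4))
# 1 5.0
# 4 2.0
# 100 1.04
# 10000 1.0004
```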

Note: one clock cycle is the time in which one stage of an instruction is executed.

In every clock cycle of a pipeline, all stages (fetch, decode, execute, memory, write back) can be active, but each instruction occupies only one stage at a time. That is, instruction-1 cannot be fetched and decoded at the same time, but instruction-2 can be fetched while instruction-1 is being decoded.

## Numerical problems on pipelining

Q1: Each clock cycle takes 15 milliseconds, and 4 instructions each pass through 5 stages to complete execution with pipelining.

1. How much time is required to complete the execution of all instructions?
2. Calculate the efficiency of the system.

Solution: A

Total clock cycles = K + (N − 1) (K is the number of stages, so K = 5; N is the number of instructions, so N = 4)

= 5 + (4 − 1)

= 8

Time for one clock cycle = 15 ms

Time for 8 clock cycles = 15 × 8 ms = 120 ms
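A minimal Python sketch of this calculation follows, assuming an ideal pipeline with no stalls. The function name `pipelined_cycles` is a hypothetical helper introduced for illustration.

```python
def pipelined_cycles(stages: int, instructions: int) -> int:
    """The first instruction needs `stages` cycles; each later one adds one cycle."""
    return stages + (instructions - 1)


CYCLE_TIME_MS = 15  # given in the question

cycles = pipelined_cycles(stages=5, instructions=4)
print(cycles)                  # 8
print(cycles * CYCLE_TIME_MS)  # 120
```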

Solution: B

Efficiency (utilization) = number of used boxes in the pipelining diagram / total number of boxes

In the diagram for 4 instructions with 5 stages, there are 40 boxes in total (5 stages × 8 clock cycles).

4 instructions each use 5 stages, so 20 boxes are used in total.

Efficiency (utilization) = 20/40 = ½
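The box counting can also be done programmatically. The sketch below walks over a grid of stages × clock cycles and counts the boxes that hold an instruction; it assumes an ideal pipeline with no stalls, and `pipeline_utilization` is an illustrative name.

```python
from fractions import Fraction


def pipeline_utilization(stages: int, instructions: int):
    """Count used and total boxes in the stage-by-cycle diagram of an ideal pipeline."""
    total_cycles = stages + instructions - 1
    total_boxes = stages * total_cycles
    used_boxes = 0
    for cycle in range(total_cycles):
        for stage in range(stages):
            # The instruction sitting in this stage entered the pipeline `stage` cycles ago.
            instr = cycle - stage
            if 0 <= instr < instructions:
                used_boxes += 1
    return used_boxes, total_boxes, Fraction(used_boxes, total_boxes)


print(pipeline_utilization(5, 4))  # (20, 40, Fraction(1, 2))
```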

Conclusion: with pipelining, the CPI approaches one, and efficiency and throughput are higher than without pipelining.

## Important points about Pipelining and non-pipelining

### SpeedUp Formula

Speedup is the ratio of the non-pipelined execution time to the pipelined execution time. For example, 8 instructions complete in 12 clock cycles in a five-stage pipeline, but require 40 clock cycles without pipelining. So the speedup is

Speedup = NP/P = 40/12 ≈ 3.33, so the pipeline is about 3.33 times faster. NP is the non-pipelined time and P is the pipelined time.
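The speedup can be sketched in Python from the two cycle-count formulas used earlier, K × N and K + (N − 1). This assumes the same clock cycle time in both cases and no stalls; `speedup` is a hypothetical helper name.

```python
def speedup(stages: int, instructions: int) -> float:
    """Non-pipelined cycles divided by pipelined cycles (same clock, no stalls)."""
    non_pipelined = stages * instructions
    pipelined = stages + (instructions - 1)
    return non_pipelined / pipelined


print(round(speedup(5, 8), 2))  # 3.33
print(speedup(5, 4))            # 2.5
```

Under these assumptions the speedup grows toward the number of stages as the instruction count increases.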

### Stage Delay

Every stage contains circuits that process data, so every stage needs some time to do its work. This time is called the stage delay.

### Register Delay

Registers between stages are used to store intermediate results. Each register holds the output of the previous stage as the input for the next stage. If the stage delay is uniform, the result does not have to wait in the register and can be passed directly to the next stage.

But if the processing speeds of two stages do not match (for example, stage 1 completes in 5 ns but stage 2 is still processing because its delay is 8 ns), then the intermediate result has to be held in the register for some time until the next stage (stage 2) is ready.

Stage delays and register delays are shown in the diagram below.

### Numerical Problems

Question 1: A 4-stage pipeline has stage delays of 150, 120, 160 and 140 ns. Registers are used between the stages and have a delay of 5 ns each. Assuming a constant clock rate, what is the total time taken to process 1000 data items on this pipeline?

Solution:

Take the maximum stage delay as the clock cycle time, so that every stage has enough time to finish. The maximum is in stage 3: 160 ns stage delay + 5 ns register delay = 165 ns.

The first data item passes through all four stages, and the remaining items follow it through the pipeline, with one item completing in every cycle. So the formula is:

(first item × stages × cycle time) + (remaining items × 1 cycle × cycle time)

= 1 × 4 × 165 ns + 999 × 1 × 165 ns = 165,495 ns ≈ 165.5 µs.
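The solution can be reproduced with a short Python sketch. It assumes, as the solution does, that the clock cycle equals the slowest stage delay plus one register delay; `pipeline_time_ns` is an illustrative name.

```python
def pipeline_time_ns(stage_delays_ns, register_delay_ns, items):
    """Total time for `items` data items on a pipeline clocked by its slowest stage."""
    cycle_ns = max(stage_delays_ns) + register_delay_ns
    # The first item needs one cycle per stage; every later item adds one cycle.
    cycles = len(stage_delays_ns) + (items - 1)
    return cycles * cycle_ns


total_ns = pipeline_time_ns([150, 120, 160, 140], register_delay_ns=5, items=1000)
print(total_ns)         # 165495
print(total_ns / 1000)  # 165.495 (microseconds)
```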

Question 2: Consider a non-pipelined processor with a clock rate of 2.5 GHz and an average of four cycles per instruction. The same processor is upgraded to a pipelined processor with five stages, but due to internal pipeline delay the clock speed is reduced to 2 GHz. Assume that there are no stalls in the pipeline (ideal condition). What is the speedup achieved by the pipelined processor?

Speedup = TNP/TP (“NP” is non-pipelining and “P” is pipelining)

As T = 1/F, the time per instruction is:

TNP = 4 × 1/(2.5 × 10⁹) s = 1.6 ns

TP = 1 × 1/(2 × 10⁹) s = 0.5 ns

Speedup = (4 × 1/(2.5 × 10⁹) s) / (1 × 1/(2 × 10⁹) s) = 1.6/0.5 = 3.2

Note: Time for one instruction = cycles per instruction × clock cycle time, where clock cycle time = 1/clock rate.
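As a numerical check, the following Python sketch evaluates both times from the figures in the question. It assumes an ideal pipelined CPI of 1, as stated in the question; `time_per_instruction_s` is a hypothetical helper name.

```python
def time_per_instruction_s(cpi: float, clock_hz: float) -> float:
    """Average time per instruction = cycles per instruction x clock cycle time."""
    return cpi * (1.0 / clock_hz)


t_np = time_per_instruction_s(cpi=4, clock_hz=2.5e9)  # non-pipelined, 2.5 GHz
t_p = time_per_instruction_s(cpi=1, clock_hz=2.0e9)   # pipelined, 2 GHz, no stalls

print(round(t_np / t_p, 2))  # 3.2
```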
