0
1
00:00:02,880 --> 00:00:08,550
The loop unrolling technique cannot be applied to every loop statement in a program. A specific coding
1

2
00:00:08,550 --> 00:00:13,980
style is required to fully unroll a loop and synthesise it into a combinational circuit.
2

3
00:00:15,640 --> 00:00:19,450
In this lecture, I will explain some of the related coding styles.
3

4
00:00:22,390 --> 00:00:27,880
When we add the unroll pragma to a loop statement, the synthesis tool unrolls the code at compile time,
4

5
00:00:28,150 --> 00:00:31,120
so it should know the exact number of loop iterations. 
5

6
00:00:31,480 --> 00:00:38,440
In other words, only loops with static bounds, which are fixed and known at compile time, can be
6

7
00:00:38,440 --> 00:00:39,860
fully unrolled by the HLS tool. 
7

8
00:00:42,910 --> 00:00:49,180
The number of loop iterations can be defined through a macro, as in this code, in which N is defined
8

9
00:00:49,330 --> 00:00:50,530
by a #define macro.
9

10
00:00:52,290 --> 00:00:57,270
Alternatively, a const variable can be used, as shown in this code example.
10
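
The two static-bound styles just described can be sketched as follows. This is a minimal illustration, not the lecture's slide code: the names N, M, sum_macro and sum_const are hypothetical, and the loop body is an arbitrary accumulation chosen only to make the example complete.

```cpp
// Sketch only: N, M, sum_macro and sum_const are hypothetical names.

#define N 8        // loop bound fixed by a macro

const int M = 8;   // loop bound fixed by a const variable

// Bound given by the macro N: the trip count is known at compile time.
int sum_macro(const int a[N]) {
    int acc = 0;
    for (int i = 0; i < N; i++) {
#pragma HLS unroll
        acc += a[i];
    }
    return acc;
}

// Bound given by the const variable M: also known at compile time.
int sum_const(const int a[M]) {
    int acc = 0;
    for (int i = 0; i < M; i++) {
#pragma HLS unroll
        acc += a[i];
    }
    return acc;
}
```

In both functions the number of iterations is fixed before synthesis, which is the condition the lecture gives for full unrolling. An ordinary C++ compiler ignores the HLS pragma, so the code can still be checked in software.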

11
00:00:58,800 --> 00:01:04,500
However, the for-loop in the third code cannot be fully unrolled, as the number of loop iterations depends
11

12
00:01:04,500 --> 00:01:08,910
on n, which is a function input argument and is unknown at compile time.
12

13
00:01:10,270 --> 00:01:16,780
Notice that in this case the HLS tool gives a warning message, keeps the loop rolled, and generates
13

14
00:01:16,780 --> 00:01:19,120
a sequential circuit instead of a combinational one.
14
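
For contrast, a loop whose bound is a function argument might look like the sketch below. The name sum_variable is hypothetical, and the body is again an arbitrary accumulation. Because n is only known at run time, this is the case in which, as described above, the tool warns and keeps the loop rolled.

```cpp
// Sketch only: sum_variable is a hypothetical name.

// The trip count depends on the input argument n, so it is unknown
// at compile time and the loop cannot be fully unrolled.
int sum_variable(const int a[], int n) {
    int acc = 0;
    for (int i = 0; i < n; i++) {   // n is a run-time value
#pragma HLS unroll
        acc += a[i];
    }
    return acc;
}
```

The function is still valid C++ and behaves the same in software; only the hardware that can be generated from it differs.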

15
00:01:20,800 --> 00:01:26,050
Fully unrolling a loop structure results in a separate hardware circuit for each iteration. 
15

16
00:01:26,470 --> 00:01:32,470
Therefore, there should be enough hardware resources on the target FPGA to realise all the instances; 
16

17
00:01:33,010 --> 00:01:36,460
otherwise, the logic synthesis phase will issue an error.
17

18
00:01:37,120 --> 00:01:42,760
Xilinx Vivado HLS unrolls the loop and performs the high-level synthesis regardless of the
18

19
00:01:42,760 --> 00:01:47,380
available hardware resources and only gives warning messages in its report. 
19

20
00:01:48,280 --> 00:01:56,020
However, if we don’t pay attention to the warnings, the Xilinx Vivado logic synthesis tool later in
20

21
00:01:56,020 --> 00:02:01,810
the design flow will complain about insufficient resources and won’t generate the FPGA bitstream.
21

22
00:02:03,980 --> 00:02:10,370
As an example, suppose that our target FPGA has 20800 LUTs.
22

23
00:02:12,180 --> 00:02:18,420
Let’s assume that one instance of the task function in this loop requires 100 LUTs.
23

24
00:02:20,240 --> 00:02:27,320
If N=10, synthesising this code requires about 1000 LUTs. As this is less than
24

25
00:02:27,320 --> 00:02:31,350
20800, the for-loop can be implemented later in the design flow.
25

26
00:02:32,150 --> 00:02:40,340
However, if N=1000, then 100000 LUTs are required. As this is greater than
26

27
00:02:40,340 --> 00:02:41,080
20800,
27

28
00:02:41,600 --> 00:02:43,700
it cannot fit in the target FPGA.
28
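
The arithmetic in this example can be written as a small check. This is only a rough sketch under the lecture's simplifying assumption that every unrolled instance costs the same 100 LUTs and nothing else uses the device; the names kFpgaLuts, kLutsPerInstance, luts_needed and fits are hypothetical.

```cpp
// Sketch only: all names here are hypothetical.

constexpr long kFpgaLuts = 20800;        // LUTs on the example target FPGA
constexpr long kLutsPerInstance = 100;   // assumed cost of one task instance

// LUTs needed when a loop of n iterations is fully unrolled.
constexpr long luts_needed(long n) {
    return n * kLutsPerInstance;
}

// True if the fully unrolled loop fits within the available LUTs.
constexpr bool fits(long n) {
    return luts_needed(n) <= kFpgaLuts;
}
```

With these numbers, N=10 needs 1000 LUTs and fits, while N=1000 needs 100000 LUTs and does not.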

29
00:02:45,650 --> 00:02:51,530
In the next lecture, I will use a real example to explain how to apply the unrolling technique in practice.
29

30
00:02:54,090 --> 00:03:00,630
The takeaway message from this lecture is: to fully unroll a loop in HLS, the loop bounds must be known at
30

31
00:03:00,630 --> 00:03:06,690
compile time, and there should be enough resources on the FPGA, e.g., enough LUTs.
31

32
00:03:09,110 --> 00:03:13,480
Now the quiz question. Which code can be synthesised into a combinational circuit?
