Vivado synthesis and implementation strategies

(Last Updated On: March 10, 2018)

There are many options for synthesis and implementation in Vivado. Which one should I use? It depends on what you are looking for.

In this multivariable test, I tried all the predefined possibilities to find out which one suits my application best. In this case, the more positive slack, the better.

You may have other requirements. Basically, it is a trade-off between synthesis (and implementation) runtime and performance. Early in the development phase, fast synthesis matters most; later, for the final design, you may want better performance at the cost of longer synthesis.

In summary, the FPGA design flow consists of:

  • RTL design
  • Behavioral simulation
  • Synthesis
  • Implementation
  • Bitstream generation
  • Download to FPGA

Two steps are especially important for high-speed performance: synthesis and implementation. Synthesis is the process of transforming an RTL-specified design into a gate-level representation. Implementation covers the steps needed to place and route the netlist onto the FPGA device resources.
One crucial factor in pushing the FPGA to the fastest possible speed is Vivado's synthesis and implementation strategy. A good strategy can gain up to 200 picoseconds of slack. At the clock speed used here, that is an improvement of around 10%.

Figure 1: Diagram chosen to make the synthesis and implementation test

The tool used, Vivado 2016.2, offers 8 synthesis and 26 implementation strategies, grouped into categories. Each strategy optimizes for one category at the expense of less important ones. For research purposes, a multivariable study was made to discover which combination of synthesis and implementation strategy best matches our system. The variables for the test are the 8 synthesis and the 26 implementation strategies.
In total, there are 208 combinations. Normally it is not necessary to test all of them. For this project, for example, a fast PWM is desired, so all synthesis strategies related to chip area, power consumption or runtime are not relevant and could be ignored. Likewise, only the performance, flow and congestion implementation strategies are relevant for this purpose.
The PWM10M block was used as the application example to drive the multivariable test, which should determine which combination suits best out of all the possible combinations.

The goal of the test is speed performance on the system, which is measured as a positive timing slack value. To make it easier to create all the synthesis and implementation combinations, a Tcl command file was created.
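The Tcl command file itself is not reproduced here, but a file of that kind could be produced along these lines. This is a minimal Python sketch that only emits Tcl text, not the script used for the test. The run names (`synth_s0`, `impl_s0_i0`, ...) are hypothetical, and the strategy identifiers and the `-flow` values are my assumption of how the strategies appear on the Tcl side, so check them against the names your Vivado version actually lists.

```python
# Sketch: generate Tcl lines that create one synthesis run per synthesis
# strategy and one implementation run per (synthesis, implementation) pair.

# ASSUMPTION: these identifiers are guesses at the Tcl-side spelling of the
# strategy names; verify them in your Vivado installation.
SYNTH_STRATEGIES = ["Flow_AlternateRoutability", "Flow_PerfThresholdCarry"]
IMPL_STRATEGIES = [
    "Performance_ExtraTimingOpt",
    "Flow_RunPhysOpt",
    "Flow_RunPostRoutePhysOpt",
    "Performance_NetDelay_low",
]


def build_tcl(synth_strategies, impl_strategies, jobs=2, flow_year="2016"):
    """Return a list of Tcl command strings covering every combination."""
    lines = []
    impl_runs = []
    for s_idx, synth in enumerate(synth_strategies):
        synth_run = f"synth_s{s_idx}"  # hypothetical run name
        lines.append(
            f"create_run {synth_run} "
            f"-flow {{Vivado Synthesis {flow_year}}} -strategy {synth}"
        )
        for i_idx, impl in enumerate(impl_strategies):
            impl_run = f"impl_s{s_idx}_i{i_idx}"  # hypothetical run name
            lines.append(
                f"create_run {impl_run} -parent_run {synth_run} "
                f"-flow {{Vivado Implementation {flow_year}}} -strategy {impl}"
            )
            impl_runs.append(impl_run)
    # Launch all implementation runs with a limited number of parallel jobs.
    lines.append(f"launch_runs {' '.join(impl_runs)} -jobs {jobs}")
    return lines


if __name__ == "__main__":
    print("\n".join(build_tcl(SYNTH_STRATEGIES, IMPL_STRATEGIES)))
```

With the full lists of 8 synthesis and 26 implementation strategies, the same function would emit 8 synthesis runs and 208 implementation runs.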

To estimate the time needed to complete this test, an average of 2 minutes per implementation was assumed, giving a total of 416 minutes, or around 7 hours. The complete run actually took 7 hours and 22 minutes. The implementation work was done with two jobs in parallel; it could run faster with 4 jobs if the computer's processor has that many cores available.
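The arithmetic behind that estimate is simple enough to check in a few lines. This is only a sketch of the calculation using the numbers quoted above; the function and variable names are mine.

```python
def estimate_minutes(n_synth, n_impl, minutes_per_run):
    """Total estimated time if every combination takes the same average time."""
    return n_synth * n_impl * minutes_per_run


combinations = 8 * 26                     # 208 synthesis/implementation pairs
estimated = estimate_minutes(8, 26, 2)    # 416 minutes, a little under 7 hours
measured = 7 * 60 + 22                    # 442 minutes, as measured

overrun_minutes = measured - estimated
overrun_percent = 100.0 * overrun_minutes / estimated
average_per_run = measured / combinations

print(f"estimated: {estimated} min, measured: {measured} min")
print(f"overrun: {overrun_minutes} min ({overrun_percent:.2f} %)")
print(f"measured average per combination: {average_per_run:.3f} min")
```

So the measured run was 26 minutes longer than estimated, which means the real average per combination was slightly above the assumed 2 minutes.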

The results are shown in the following tables. The first one shows the worst negative slack (WNS) value in nanoseconds. This value must not be negative for the design to meet timing at the target clock. All results with a negative WNS are considered failed implementations and cannot be used. The higher the WNS, the more margin is left at the end of the clock period. A high positive slack means that the clock frequency can be increased, as long as the WNS stays positive or until the maximum clock frequency of 464 MHz is reached.
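As a rough illustration of how slack translates into clock headroom, the shortest usable period is approximately the constrained period minus the WNS. The sketch below assumes that first-order relation and uses a hypothetical 2.5 ns clock constraint, which is not a value from this test. Only the 464 MHz ceiling comes from the text above. In practice the slack will change once the design is re-implemented with a tighter constraint, so treat the result as an estimate.

```python
def max_clock_mhz(period_ns, wns_ns, device_cap_mhz=464.0):
    """First-order estimate of the highest clock frequency in MHz.

    Returns None when WNS is negative, because such a run is treated as a
    failed implementation at the requested period.
    """
    if wns_ns < 0:
        return None
    min_period_ns = period_ns - wns_ns  # positive slack can shorten the period
    if min_period_ns <= 0:
        raise ValueError("slack must be smaller than the clock period")
    # 1000 / period in ns gives MHz; never exceed the stated ceiling.
    return min(1000.0 / min_period_ns, device_cap_mhz)


if __name__ == "__main__":
    # Hypothetical values for illustration only, not measured results.
    for wns in (-0.1, 0.0, 0.2, 0.5):
        print(f"WNS = {wns:+.1f} ns -> {max_clock_mhz(2.5, wns)}")
```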
Analyzing the 208 synthesis-implementation results from Table 4:


• Only two out of eight synthesis strategies produced positive results. Synthesis is therefore the determining factor: in some cases, using a different synthesis strategy alone changes the slack by more than 500 picoseconds.

✓ Flow Alternate Routability

✓ Flow Performance Threshold Carry

• Four implementation strategies:

✓ Performance Extra Timing Optimized

✓ Flow Run Phys Optimized

✓ Flow Run Post Route Physically Optimized

✓ Performance Net Delay low

 
