Pushing the Limits of the ZYBO to create the fastest PWM possible in VHDL

(Last Updated On: March 10, 2018)

The aim of this project is to develop the fastest possible PWM generator block using the Zynq FPGA and the VHDL programming language. To that end, the constraints are studied to find out what limits the speed. Several versions are developed with different features and configurable parameters.

Maximal theoretical speeds

The ZYBO board incorporates the Zynq XC7Z010 chip, which can run at up to 464 MHz (speed grade -1) according to the datasheet of the XC7Z010 device. Other devices of the Zynq-7000 family can run faster, at up to 628 MHz (speed grade -3) [1]. Taking this into account, a maximum theoretical speed can be calculated as a function of the resolution of the PWM, that is, the number of bits of the counter. The number of resolution bits is the most limiting factor. Table 1 shows the maximum theoretical frequency achievable by the PWM module.

Resolution | XC7Z010 (464 MHz, speed grade -1) | XC7Z020 (628 MHz, speed grade -3)
10 bit     | 464/1024 = 0.4531 MHz             | 628/1024 = 0.6133 MHz
8 bit      | 464/255 = 1.8196 MHz              | 628/255 = 2.4627 MHz
6 bit      | 464/64 = 7.25 MHz                 | 628/64 = 9.8125 MHz
4 bit      | 464/16 = 29 MHz                   | 628/16 = 39.25 MHz

Table 1: Maximum theoretical speed for the Zynq-7000 family
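
The entries of Table 1 follow from a single relation. As a sketch, assuming a free-running N-bit counter that needs 2^N clock cycles per PWM period (the symbols f_PWM, f_clk and N are introduced here for illustration only):

```latex
f_{\mathrm{PWM}} = \frac{f_{\mathrm{clk}}}{2^{N}}
\qquad\text{e.g.}\qquad
\frac{464\ \mathrm{MHz}}{2^{10}} \approx 0.4531\ \mathrm{MHz},
\qquad
\frac{464\ \mathrm{MHz}}{2^{4}} = 29\ \mathrm{MHz}
```

Note that the 8-bit rows divide by 255 instead of 2^8 = 256, so they correspond to a period of 255 clock cycles.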

A lower clock frequency is used during development. For simplicity, 400 MHz (600 MHz for speed grade -3) is taken as the standard clock speed for all tests in this document unless a clock frequency is stated explicitly. The reason is that not all versions of the PWM can run at the highest speed, so a common lower clock frequency is used for comparison purposes.

Resolution | Speed grade -1 (400 MHz test clock) | Speed grade -3 (600 MHz test clock)
10 bit     | 400/1024 = 0.3906 MHz               | 600/1024 = 0.5859 MHz
8 bit      | 400/255 = 1.5686 MHz                | 600/255 = 2.3529 MHz
6 bit      | 400/64 = 6.25 MHz                   | 600/64 = 9.375 MHz
4 bit      | 400/16 = 25 MHz                     | 600/16 = 37.5 MHz

Table 2: Maximum speed at the test clock frequencies of 400 MHz and 600 MHz

Design diagram

The Vivado tool is based on block diagrams, in which the system is prepared to be loaded onto the target board. For synthesis, implementation and bitstream generation, the block diagram of Figure 2 was used. This diagram is kept as simple as possible in order to get the greatest performance out of the chip. The Zynq processing system core only provides the clock signal to the FPGA, and all other peripherals are deactivated. This core can supply a maximum of 250 MHz; therefore the clock output is set to 200 MHz and the clocking wizard is in charge of generating the faster clock speeds.


Figure 2: Synthesis and implementation block diagram

The simulation test bench

Simulations are performed to test how the modules behave and to debug them using the internal signal waveforms. For this, a separate simulation test bench diagram was used, in which all the modules can be tested. Here a 200 MHz clock generator source simulates the clock coming from the PS core. The desired speed is obtained with the clock wizard block, following the same procedure as in the FPGA implementation design described above.


Figure 3: Simulation block diagram
An external RTL module was used to generate the stimuli, simulating several cases for the input values in order to evaluate the response of the PWM module. The stimuli block generates two signals: the duty cycle as an 8-bit output and a digital enable signal. The full VHDL code of the stimuli block can be found in the appendix on page 50, and the output signals can be seen in Figure 4.

Module versions

The idea behind making different versions is to create versatile PWM modules that let the designer choose a trade-off between features and speed. For example, a particular project may need only speed, while the resolution or the interlock delay time are less decisive factors, or vice versa. This FAST PWM project aims to create fully configurable high-speed PWM FPGA modules. In total, around 10 versions with distinct features were developed to obtain the best trade-off between speed and features.

Common basic features that every PWM module should fulfill:

  • Double output: normal and inverted.
  • Outputs should be driven to the pins.
  • Running in the programmable logic (PL) part of the Zynq device.

The variable features are:

  • Number of bits of resolution: 4, 6 or 8 bits
  • Fixed or variable duty cycle as an input
  • Output enable as an input with configurable idle state
  • Configurable interlock delay time
  • Maximum and minimum saturation limits for the duty cycle
  • Synchronization output

The features that each version should fulfill are summarized in Table 5:

Version | Module | Main features | Characteristics
V1 | PWM01 | Fixed duty PWM, @fmax | The duty is fixed via a generic parameter and the output is both an inverted and a non-inverted PWM signal (for driving a full bridge transistor stage).
V2 | PWM02 | Variable duty PWM | Variable via input register or pin. The duty change only takes effect at the next PWM cycle.
V3 | PWM03 | V2 + an enable input | If disabled, the outputs of the PWM should be switched to their idle state. The idle state should be definable by a generic.
V4 | PWM04B | V3 + fixed interlock delay | First useful PWM for application.
V5 | PWM06 | V4 + variable interlock delay | If this design is not slower than the previous one, it makes no real difference whether some parameter is initialized via a generic or fed into the circuit via a register.
V6 | PWM07 | V5 + variable saturation limits for the duty cycle | This is a protection for certain power electronics that do not allow 0% or 100% duty, because it would damage the circuit. If the PWM module handles this, the controller does not need to do so. Variable implementation preferred, dependent on the results of V4.
V7 | PWM08 | V6 + variable starting phase | Makes the starting value of the counter adjustable. In this way, a multi-phase PWM can be built from several of these PWM blocks.
V8 | PWM09 Master | V7 + sync input | If multiple PWM modules operate on separate boards, their core clock frequencies will deviate from each other.
V9 | PWM010s Master | V8 + variable frequency | Methods considered for varying the frequency: changing the core clock and changing the bit resolution.
Table 5: Version summary

How are the PWM modules tested and compared?

Performance

All the modules were analyzed in terms of timing inside a test bench diagram created for loading onto the FPGA. This diagram has a minimalist design containing only the basic elements needed to run the application modules.

The clock speed entering the module was standardized to 400 MHz. Afterwards, based on the results of the 400 MHz test, each module was implemented at its maximum runnable speed, which can be higher or lower.

The standard synthesis and implementation strategies for the test bench were determined from the previously mentioned multivariable synthesis and implementation test. The best strategies for this case are:

  • Synthesis: Flow Performance Threshold Carry
  • Implementation: Flow Run Post Route Physically Optimized

Versions

Version 1: PWM01

This is the minimal version of a PWM module. Its main and only characteristic is:

  • Fixed duty cycle by a generic parameter

This code generates a simple PWM signal, as can be seen in Figure 7. The 8-bit counter counts up to 255, while the duty cycle is fixed at 76 clock cycles (29.8%).
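
The original listing is not reproduced here. As a minimal sketch of a fixed-duty PWM of this kind, and not the PWM01 source itself, the entity name `pwm_fixed_sketch`, the generic `DUTY` and the port names below are assumptions chosen for illustration. The counter in this sketch wraps after 256 clock cycles.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical fixed-duty PWM with normal and inverted outputs.
entity pwm_fixed_sketch is
  generic (
    DUTY : natural range 0 to 255 := 76  -- on-time in clock cycles (assumed name)
  );
  port (
    clk   : in  std_logic;
    pwm   : out std_logic;  -- normal output
    pwm_n : out std_logic   -- inverted output
  );
end entity pwm_fixed_sketch;

architecture rtl of pwm_fixed_sketch is
  -- Free-running 8-bit counter; wraps from 255 back to 0.
  signal cnt : unsigned(7 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      cnt <= cnt + 1;
      -- Registered compare: the outputs lag the counter by one clock cycle.
      if cnt < DUTY then
        pwm   <= '1';
        pwm_n <= '0';
      else
        pwm   <= '0';
        pwm_n <= '1';
      end if;
    end if;
  end process;
end architecture rtl;
```

Registering the outputs keeps the comparator result off the output path, which is one common way to ease timing at high clock rates.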


Figure 7: PWM01Be module simulation
To compile this VHDL block and create a functional bitstream to load onto the board, the FPGA block diagram of Figure 2 was used.
The results after running the synthesis and implementation are favorable: 678 picoseconds of positive slack.


Figure 8: Timing summary after synthesis and implementation @400MHz

This positive result makes it possible to increase the clock speed up to the maximum supported by the device. Therefore, a new synthesis and implementation was performed with the clock source at 464 MHz, obtaining 235 picoseconds of positive slack (Figure 9). This would generate a 1.8196 MHz PWM signal with 8-bit resolution.


Figure 9: Timing summary after synthesis and implementation @464MHz

This positive slack at the maximum FPGA clock speed shows that the limiting factor for this module is the FPGA itself. If another FPGA with a higher speed grade is used, a higher speed can be achieved.

A synthesis and implementation was also done for the Zynq XC7Z020 chip. This makes it possible to test how fast speed grade -3, the fastest available in this chip family, can run. The result was a positive slack with a 628 MHz clock, which would theoretically generate a 2.4627 MHz PWM signal.


Figure 10: Timing summary after synthesis and implementation with XC7Z020 @628MHz

Version 2: PWM02

This PWM module adds new features:

  • (new) Duty cycle as input
  • (new) The duty cycle only changes at the end of the cycle


Figure 11: Block of PWM02

The duty cycle for the PWM signal comes from an external pin. This approach is based on the previous version, with the duty input added. The duty cycle value should only be refreshed at the beginning of each cycle. The following VHDL code was written:

The PWM02 block was simulated by applying the stimuli signal described before (Figure 4); the resulting output signal is shown in Figure 12. The gold duty signal is the input read from the stimuli. The green duty_sig wave is the updated value. It can be seen that this actual duty cycle value (green duty_sig) is only refreshed at the end of every counter cycle. For example, in Figure 12 at time ~1.1 ns, the value duty = 0xC8 is not taken into account, because 0x60 was the value present at the end of the cycle.


Figure 12: PWM02 Module simulation
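
The refresh behaviour described above can be sketched as follows. This is an illustration under assumptions, not the PWM02 source: the entity name is hypothetical, the signal names `duty` and `duty_sig` are borrowed from the simulation description, and the input is assumed to be already synchronous to `clk`.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical PWM whose duty input is only taken over at the end of a period.
entity pwm_latched_duty_sketch is
  port (
    clk   : in  std_logic;
    duty  : in  std_logic_vector(7 downto 0);  -- requested duty, in clock cycles
    pwm   : out std_logic;
    pwm_n : out std_logic
  );
end entity pwm_latched_duty_sketch;

architecture rtl of pwm_latched_duty_sketch is
  signal cnt      : unsigned(7 downto 0) := (others => '0');
  signal duty_sig : unsigned(7 downto 0) := (others => '0');  -- duty in effect
begin
  process (clk)
  begin
    if rising_edge(clk) then
      cnt <= cnt + 1;
      -- Take over the input only on the last count of the period, so a
      -- change in the middle of a period has no effect on that period.
      if cnt = 255 then
        duty_sig <= unsigned(duty);
      end if;
      if cnt < duty_sig then
        pwm   <= '1';
        pwm_n <= '0';
      else
        pwm   <= '0';
        pwm_n <= '1';
      end if;
    end if;
  end process;
end architecture rtl;
```

Any value that appears on `duty` and disappears again before the counter reaches its last count is simply never sampled, which matches the 0xC8 case described for Figure 12.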

After the synthesis and implementation of this PWM module, a positive slack of 753 picoseconds is obtained.


Figure 13: Timing summary after synthesis and implementation @400MHz

This module has enough positive slack to run at the maximum speed (464 MHz). This synthesis and implementation also yields a positive slack, as can be seen in Figure 14.


Figure 14: Design Timing Summary for PWM02 @464MHz

Version 3: PWM03

This module inherits some features from the previous one and adds an enable input. The features for the third version are:

  • Duty cycle as input.
  • The duty cycle only changes at the end of the cycle.
  • (new) Enable input. If disabled, the outputs of the PWM should be switched to their idle state.

For this module a function was created to convert the Boolean generic parameter into std_logic format.
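
Such a conversion function can be sketched as below. The package name `pwm_util_sketch` and the function name `to_std_logic` are assumptions for illustration; the names used in the original module are not known.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper package: converts a Boolean generic to std_logic.
package pwm_util_sketch is
  function to_std_logic(b : boolean) return std_logic;
end package pwm_util_sketch;

package body pwm_util_sketch is
  function to_std_logic(b : boolean) return std_logic is
  begin
    if b then
      return '1';
    else
      return '0';
    end if;
  end function to_std_logic;
end package body pwm_util_sketch;
```

With a Boolean generic (for example a hypothetical `IDLE_STATE`), the disabled output could then be driven with `to_std_logic(IDLE_STATE)`. A Boolean generic is convenient because the Vivado block customization dialog presents it as a check box.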

The functionality of this enable input is shown in Figure 16. The idle state for the simulation was set to zero for both signals. This can be changed by double-clicking on the instantiated block and ticking the check box shown in Figure 15.


Figure 15: Check box for changing the idle state


Figure 16: PWM03 Module simulation

After synthesis and implementation, this module has a positive slack of 445 picoseconds at a 400 MHz clock speed.


Figure 17: Timing summary after synthesis and implementation @400MHz

Version 4: PWM04

This fourth version is the first functionally stable module that is usable with a power electronics bridge. Its features are the following:

  • Duty cycle as input.
  • The duty cycle only changes at the end of the cycle.
  • Enable input. If disabled, the outputs of the PWM should be switched to their idle state.
  • 8 bits resolution.
  • (new) Fixed interlock by generic parameter.

Code 3: Extract from PWM 04 module

The simulation waveform for this module can be seen in Figure 18, where the interlock delay of 4 clock cycles can be appreciated. The interlock delay time is divided equally between the normal and the inverted signal.

The duty cycle can be calculated as follows:

Where duty and interlock are measured in clock cycles.


Figure 18: PWM04 one PWM cycle simulation

The duty cycle can be calculated from the simulation signals:


Figure 19: PWM04 module. detail on the interlock delay time with 4 cycles and 16 cycles of duty
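
One possible way to realize a fixed interlock delay is sketched below. It is not the PWM04 extract: it assumes that each output gives up half of the interlock time at each of its edges, so that both outputs are low for `INTERLOCK` clock cycles around every transition. The entity, generic and signal names are hypothetical, and the comparisons are written for readability rather than for timing.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical complementary PWM with a fixed interlock (dead time).
entity pwm_interlock_sketch is
  generic (
    INTERLOCK : natural range 0 to 254 := 4  -- dead time per edge, assumed even
  );
  port (
    clk   : in  std_logic;
    duty  : in  std_logic_vector(7 downto 0);
    pwm   : out std_logic;
    pwm_n : out std_logic
  );
end entity pwm_interlock_sketch;

architecture rtl of pwm_interlock_sketch is
  constant HALF : natural := INTERLOCK / 2;
  signal cnt      : unsigned(7 downto 0) := (others => '0');
  signal duty_sig : unsigned(7 downto 0) := (others => '0');
begin
  process (clk)
    variable c : natural range 0 to 255;
    variable d : natural range 0 to 255;
  begin
    if rising_edge(clk) then
      c := to_integer(cnt);
      d := to_integer(duty_sig);
      cnt <= cnt + 1;
      if c = 255 then
        duty_sig <= unsigned(duty);  -- refresh duty at the end of the period
      end if;
      -- Normal output is on for counts in [HALF, d - HALF).
      if (c >= HALF) and (c + HALF < d) then
        pwm <= '1';
      else
        pwm <= '0';
      end if;
      -- Inverted output is on for counts in [d + HALF, 256 - HALF).
      if (c >= d + HALF) and (c + HALF < 256) then
        pwm_n <= '1';
      else
        pwm_n <= '0';
      end if;
    end if;
  end process;
end architecture rtl;
```

With these windows the two outputs can never be high at the same time, and a duty value smaller than the interlock simply leaves the normal output low for the whole period.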

After synthesis and implementation at 400 MHz, this PWM module returns a positive slack of 120 picoseconds (Figure 20).


Figure 20: Design Timing Summary for PWM04 @400MHz

This module can run at a 450 MHz clock speed, creating a PWM signal of approximately 1.76 MHz.


Figure 21: Design Timing Summary for PWM04 @450MHz

The output signal of this module on the oscilloscope can be seen in Figure 22:


Figure 22: PWM04 signal on the oscilloscope

The signal shown above cannot be displayed well because the measurement equipment is not fast enough. The bandwidth of the oscilloscope used is 100 MHz. This leads to poorly defined edges at the transitions.

Alternative implementation

For this functional fourth version, another coding option with identical functionality was tested. In this module, PWM05, an external counter block is used, as can be seen in Figure 23. This could be beneficial when several modules in parallel share a common external counter.


Figure 23: Block PWM05 with an external counter

After the synthesis and implementation, it was concluded that the use of an external counter is slower, in terms of maximum clock frequency, than using a user defined VHDL counter with a std_logic_vector signal. The synthesis and implementation finished with only 48 picoseconds of slack (Figure 24).


Figure 24: Timing Summary for PWM05 block @400MHz

Version 5: PWM06

Features of this module:

  • Duty cycle as input.
  • The duty cycle only changes at the end of the cycle.
  • Enable input. If disabled, the outputs of the PWM should be switched to their idle state.
  • 8-bit resolution.
  • (new) Interlock as input.


Figure 25: Block diagram for synthesis and implementation of version 5

After the synthesis and implementation at 400 MHz, a positive slack was obtained, as can be seen in Figure 26, but compared with the previous version 4 it is slower.


Figure 26: Design timing summary of PWM06 @400MHz

This feature is not really needed in real power electronics applications. The interlock delay time varies between different systems, so it is better fixed by a generic parameter for each system.

Version 6: PWM07

This new module is based on the PWM04. Features of this module:

  • Duty cycle as input.
  • The duty cycle only changes at the end of the cycle.
  • Enable input. If disabled, the outputs of the PWM should be switched to their idle state.
  • 8-bits resolution.
  • (new) Fixed interlock by generic parameter.
  • (new) Variable saturation limits for the duty cycle by generic parameter.

The reason for this limitation is that many power devices cannot handle a 100% duty cycle and therefore have some safety control implemented. When the controller provides this feature, the power electronics device can dispense with it. For this function, the following Code 4 was added to the module:

Code 4: Extract PWM 07

Simulating the PWM block with the following parameters:


Figure 27: Configuration parameters of the module PWM07


Figure 28: PWM07 test of the duty saturation limits

As can be seen in Figure 28, the duty input signal (in purple) is limited by the minimum limit at time 3 µs: it takes the minimum value 0x14 instead of the given value 0x10. On the other hand, at time 5 µs it saturates to 0xF4 instead of the given duty signal of 0xFF.
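
The saturation itself can be sketched as a small clamp stage. This is not the Code 4 extract: the entity name, the generics `DUTY_MIN` and `DUTY_MAX` and the port `duty_lim` are assumptions, with default limits taken from the simulation values quoted above.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical duty limiter: clamps the requested duty to [DUTY_MIN, DUTY_MAX].
entity duty_limiter_sketch is
  generic (
    DUTY_MIN : natural range 0 to 255 := 16#14#;
    DUTY_MAX : natural range 0 to 255 := 16#F4#
  );
  port (
    clk      : in  std_logic;
    duty     : in  std_logic_vector(7 downto 0);  -- requested duty
    duty_lim : out std_logic_vector(7 downto 0)   -- saturated duty
  );
end entity duty_limiter_sketch;

architecture rtl of duty_limiter_sketch is
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if unsigned(duty) < DUTY_MIN then
        duty_lim <= std_logic_vector(to_unsigned(DUTY_MIN, 8));
      elsif unsigned(duty) > DUTY_MAX then
        duty_lim <= std_logic_vector(to_unsigned(DUTY_MAX, 8));
      else
        duty_lim <= duty;
      end if;
    end if;
  end process;
end architecture rtl;
```

With the default generics, an input of 0x10 produces 0x14 and an input of 0xFF produces 0xF4. Registering the clamped value keeps the two comparisons out of the PWM compare path, at the cost of one clock cycle of latency.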

The slack for this module is positive, 251 picoseconds, as can be seen in Figure 29.


Figure 29: Design timing summary of PWM07 @400MHz

Version 7: PWM08

Features of this module:

  • Duty cycle as input.
  • The duty cycle only changes at the end of the cycle.
  • Enable input. If disabled, the outputs of the PWM should be switched to their idle state.
  • Fixed interlock by generic parameter
  • Variable saturation limits for the duty cycle by generic parameter
  • (new) Starting value of the counter adjustable by a generic parameter.


Figure 30: Block for PWM08


Figure 31: Block diagram for PWM08

This implementation makes it possible to create many interleaved PWM modules with different phases. Figure 32 shows that the counter starts at 127 after the enable signal is set. The saturation limits for the duty cycle are also active in this module, as in the previous one.


Figure 32: PWM08 starting 180° phase shifted
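
The starting value for a given phase shift can be estimated with a simple relation. This is a sketch under the assumptions that the PWM period is P clock cycles (P = 255 for the 8-bit case, as in Table 1) and that the start value is rounded down; the symbols are illustrative and not taken from the module's code.

```latex
\mathrm{start}(\varphi) = \left\lfloor \frac{\varphi}{360^{\circ}} \, P \right\rfloor ,
\qquad
\mathrm{start}(180^{\circ}) = \left\lfloor 0.5 \cdot 255 \right\rfloor = 127
```

Under the same assumptions, three modules interleaved at 120° would start at 0, 85 and 170.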

The synthesis and implementation of this module results in a slack of 127 picoseconds (Figure 33).


Figure 33: Timing summary of PWM08 @400MHz

Version 8: PWM09

For this module, a synchronization signal is added to manage the use of multiple PWM modules on separate boards. This avoids the possible clock deviation when multiple boards are being used.

Features of this module:

  • Duty cycle as input.
  • The duty cycle only changes at the end of the cycle.
  • Enable input. If disabled, the outputs of the PWM should be switched to their idle state.
  • Fixed interlock by generic parameter
  • Variable saturation limits for the duty cycle by generic parameter
  • Starting value of the counter adjustable by a generic parameter
  • (new) Synchronization signal

For this synchronized design, a single master is needed to provide the tick to the slave modules. Under normal conditions, slaves running on the same board as the master should never lose synchronization with it. However, to be on the safe side and to run all slaves under the same conditions, the master's output synchronization signal is also fed back as an input to the on-board slaves. Two different modules were created for this version: master and slave.


Figure 34: Block for the PWM09 modules. From left to right: Master and Slave

In the simulation environment, two clock sources are used to test the synchronization of modules running on separate boards: one for the master at 400 MHz and another, slightly faster, for some slaves at 401 MHz. This should lead to asynchronous PWM signals among the three interleaved PWM modules. For the test, two slaves and one master are instantiated. After 70,000 ns the signals are no longer synchronized. The simulation diagram can be seen in Figure 35.


Figure 35: Asynchronous PWM signal

In slave 0 of Figure 36 it can be seen that the phase is completely off after 80 µs of simulation.


Figure 36: Wave signals for asynchronous PWM.

To avoid possible loss of synchronization between different boards, the sync signal provides a high pulse when the period starts. This makes the system synchronize dynamically while the PWM is being generated. The diagram with the sync signal in green can be seen in Figure 37. Figure 38 shows that the system is not affected by the clock deviation of the slave modules.


Figure 37: Asynchronous PWM modules with synchronization signal


Figure 38: Synchronized PWM modules
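
The resynchronization of a slave counter can be sketched as follows. This is an illustration under assumptions rather than the PWM09 slave source: the entity name, the generic `START` and the port names are hypothetical, and a flip-flop synchronizer with edge detection is added here because the sync pulse is assumed to come from a different clock domain.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical slave counter that is reloaded on every master sync pulse.
entity pwm_sync_slave_sketch is
  generic (
    START : natural range 0 to 255 := 0  -- counter value at the sync tick
  );
  port (
    clk     : in  std_logic;
    sync    : in  std_logic;  -- pulse from the master at the start of its period
    cnt_out : out std_logic_vector(7 downto 0)
  );
end entity pwm_sync_slave_sketch;

architecture rtl of pwm_sync_slave_sketch is
  -- Bits 0 and 1 form a two-stage synchronizer; bit 2 holds the previous
  -- synchronized value for rising-edge detection.
  signal sync_ff : std_logic_vector(2 downto 0) := (others => '0');
  signal cnt     : unsigned(7 downto 0) := to_unsigned(START, 8);
begin
  process (clk)
  begin
    if rising_edge(clk) then
      sync_ff <= sync_ff(1 downto 0) & sync;
      if sync_ff(2) = '0' and sync_ff(1) = '1' then
        cnt <= to_unsigned(START, 8);  -- realign to the master period
      else
        cnt <= cnt + 1;                -- free-running between ticks
      end if;
    end if;
  end process;

  cnt_out <= std_logic_vector(cnt);
end architecture rtl;
```

In this sketch the sync pulse has to stay high for longer than one slave clock period to be sampled reliably, and the synchronizer adds a fixed delay of a few clock cycles, which could be compensated through the `START` value.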

To test how fast the module can be synthesized and implemented, the following block diagram (Figure 39) was built.


Figure 39: Block diagram for PWM09 in slave mode


Figure 40: Timing summary of PWM9S @400MHz with one slave

Version 9: PWM10

This is the last and most complete version. The PWM10 module is identical to the previous module, PWM09, but it allows the number of resolution bits to be varied. The features of the module:

  • Duty cycle as input.
  • The duty cycle only changes at the end of the cycle.
  • Enable input. If disabled, the outputs of the PWM should be switched to their idle state.
  • Fixed interlock by generic parameter
  • Variable saturation limits for the duty cycle by generic parameter
  • Starting value of the counter adjustable by a generic parameter
  • Synchronization signal
  • (new) Generic number of bits of resolution

This version follows the same approach as the previous module, PWM09: one master provides the synchronization signal and the slaves generate the PWM. The master block remains the same as before, but the slave module incorporates a generic parameter for the number of bits. This generic parameter changes the length of the counter and therefore the resolution of the PWM signal.

This module was tested for maximum performance with a single module at 400 MHz. The diagram used for the test and the resulting timing summary can be seen below (Figures 43 and 44).


Figure 43: Diagram block for PWM10 Master module


Figure 44: Timing summary of PWM10 @400MHz

A small multiphase PWM application is built here with three modules interleaved at 120°, resulting in a positive slack of 55 picoseconds:


Figure 45: Timing summary of PWM10 with 3 multiphase modules@400MHz

The special feature of this module is that the resolution can easily be varied from the generic block properties. The PWM frequency can be changed without compromising the slack time. The Worst Negative Slack depends on how many modules are placed in parallel.

PWM module (clock freq. = 400 MHz, WNSlack 161 ps, 1 master module)

Configuration          | Module | PWM frequency
Master 4 bits module   | PWM10M | 25 MHz
Master 5 bits module   | PWM10M | 12.5 MHz
Master 6 bits module   | PWM10M | 6.25 MHz
Master 7 bits module   | PWM10M | 3.125 MHz
Master 8 bits module   | PWM10M | 1.5 MHz
Master 9 bits module   | PWM10M | 780 kHz
Master 10 bits module  | PWM10M | 390 kHz
...                    | ...    | ...
Master 20 bits module  | PWM10M | 381 Hz

Table 7: PWM speed depending on the bit of resolution

To choose a certain PWM frequency, the most suitable procedure without losing resolution is to first set the number of resolution bits according to the maximum frequency, and then lower the clock speed.

For example, if a 2 MHz PWM frequency is required:

According to Table 7, 2 MHz is only possible with a maximum resolution of 7 bits. The new clock speed should then be calculated.
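
The clock calculation for this example can be sketched as follows, assuming the relation between clock and PWM frequency that the values in Table 7 follow (PWM frequency equals clock frequency divided by 2^N for N resolution bits); the symbols are introduced here for illustration only.

```latex
2^{N} \le \frac{400\ \mathrm{MHz}}{2\ \mathrm{MHz}} = 200
\;\Rightarrow\; N = 7 ,
\qquad
f_{\mathrm{clk}} = f_{\mathrm{PWM}} \cdot 2^{N} = 2\ \mathrm{MHz} \cdot 128 = 256\ \mathrm{MHz}
```

Under this assumption, lowering the clock from 400 MHz to 256 MHz would give exactly 2 MHz while keeping the full 7-bit resolution.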

 
