Verilog Pro: Verilog and SystemVerilog Resources for Design and Verification
https://www.verilogpro.com/

How Chiplets Assemble Into the Most Advanced SoCs
https://www.verilogpro.com/how-chiplets-assemble-into-the-most-advanced-socs/
Wed, 28 Sep 2022

Chiplet-based SoC designs have become mainstream in high-end SoCs. This article explores technologies used to build chiplet designs.

For this article, I decided to take a quick pause from Verilog to write about a topic that I have been researching for the past few months. You have probably come across the term “chiplet”, and may be wondering what this latest trend in SoC design is about. In this article, I will explore some of the background topics and technologies around chiplet-based designs, and give you many links to follow to find out more. I hope you find this topic as interesting as Verilog coding. Here we go!

Why Chiplet Designs

Today’s complex SoCs are approaching (and in some cases have already exceeded) the physical limit on how large a single silicon die can be manufactured. This limit is called the reticle limit. According to an article in Protocol:

But big die sizes create big problems. One fundamental issue is that it’s currently impossible to print a chip larger than the blueprint used in the photolithography stage of chip manufacturing, called a photomask. Because of technical limits, the beam of light shining through the photomask to reproduce the blueprint onto the silicon wafer cannot print chips larger than about 850 square millimeters.

Source: “Chiplets helped save AMD. They might also help save Moore’s law and head off an energy crisis.” Protocol, July 20, 2022

Therefore, without breaking up a design into multiple dies (or multiple chiplets), engineers simply will not be able to build some cutting-edge SoCs.

The economics of producing monolithic SoCs on a single large die, at cutting-edge process nodes, are becoming (or may have already become) prohibitive. A large monolithic die is more likely to contain an unrecoverable defect than a smaller die, which lowers yield and raises overall cost. In a chiplet-based design, each chiplet is smaller, so less silicon needs to be thrown away when a die has an unrecoverable defect, leading to lower cost. The problem is exacerbated by the increasing cost of the latest lithography nodes. AMD estimates that using a chiplet-based design in its EPYC processor led to a >40% reduction in cost (AMD on Why Chiplets—And Why Now – The Next Platform).
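
To put rough numbers on the yield argument, here is a small worked example using the simple Poisson yield model. The model choice, the defect density, and the die areas are all assumptions picked for illustration; they are not AMD's figures, and the calculation ignores packaging cost and die-to-die interface area.

```latex
% Poisson yield model (assumed): A = die area, D_0 = defect density
\[ Y = e^{-A D_0} \]

% Illustrative numbers only: D_0 = 0.1 defects/cm^2
\begin{align*}
\text{Monolithic die } (A = 8\,\text{cm}^2):\quad
  & Y = e^{-0.8} \approx 0.45,
  & \text{silicon per good SoC} &\approx \frac{8}{0.45} \approx 17.8\,\text{cm}^2 \\
\text{Four chiplets } (A = 2\,\text{cm}^2 \text{ each}):\quad
  & Y = e^{-0.2} \approx 0.82,
  & \text{silicon per good SoC} &\approx 4 \times \frac{2}{0.82} \approx 9.8\,\text{cm}^2
\end{align*}
```

Under these assumed numbers, the chiplet version needs roughly 45% less wafer area per good SoC, even though the total functional area is the same.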

When an SoC is broken up into chiplets, the design becomes more modular. The first advantage is that different chiplets can each be manufactured in the process technology best suited to their purpose. For example, many radio frequency (RF) circuits do not perform well in cutting-edge logic process technologies. These circuits can potentially be designed on a chiplet that uses a less dense, but more suitable, process technology. As another example, separating large SRAM memories (like large system caches) from compute logic onto a different die allows the process technology to be optimized for each die, leading to better overall metrics like power and operating frequency. One can even imagine building SoCs using process technologies from different foundries, and stitching the chiplets together into a single SoC.

Having a more modular design also facilitates reuse. A “holy grail” of the industry is to source entire chiplets from different IP/chiplet providers and stitch them together to create a custom SoC, much like how SoC designers source IP from different vendors today. There are numerous challenges to achieving that vision, the first of which is that, until recently, there was no standard describing how different chiplets communicate with each other. In 2022, Universal Chiplet Interconnect Express (UCIe) 1.0 started the effort of standardizing die-to-die interconnects in the industry. It has gathered much early interest, but it is going to be a long journey.

One final advantage of chiplet-based design applies in particular to 3D integrated SoCs. With 3D chiplet integration, there is now a 3rd dimension to exploit and “route” signals, leading to potentially smaller distances between logic. 3D integration can obviously also achieve higher area density. With the pace of lithography advances slowing in the latest process technology generations, 3D integration can become an important dimension to continue the pace of integrating more transistors into a given area (Moore’s Law). The potential smaller distances between logic, when integrated 3D and routed using the 3rd dimension, can also translate into multiple advantages, like reducing latency and power (less capacitance).

Challenges of Chiplet Designs

While chiplet based designs have many advantages, there are also many challenges in building these complex designs.

Firstly, for each additional chiplet that is integrated into a single package, there is a risk of causing a defect in the packaging step, leading to a non-functional package and yield loss at the package level. The entire package, containing multiple chiplets, may then need to be thrown away altogether. Therefore while the cost of manufacturing the individual chiplet decreases with a smaller chiplet, the cost of packaging them together increases. For low cost product segments, single monolithic die based designs may continue to be the most economical.

Today’s die-to-die interfaces, which are the workhorse for die-to-die communication in multi-chiplet designs, occupy a larger area on silicon compared to a standard wire on a die, leading to an “area tax”. Die-to-die interfaces today simply are not as dense as regular wires on a single die. For a signal to cross to a different chiplet, it needs to first be routed to a die-to-die interface PHY, driven off-die, over to the die-to-die interface PHY of the other die, and finally routed back to on-die wires. There are limitations in how densely wires can be printed on the material that connects between dies (substrate material, or silicon “bridge”), how small and dense the solder bumps can be made (in technologies where solder is used to make die-to-die connections, like Intel Foveros), and just the area of having additional die-to-die PHY logic.

Designing and optimizing across multiple chiplets is obviously more difficult and complex than building a single monolithic die. Electronic Design Automation (EDA) vendors are actively working on tools to help partition, design, and analyze multi-chiplet designs that are integrated in 2D and 3D. Some tools have already been released for these design flows, such as Cadence 3D-IC and Synopsys 3DIC Compiler.

Finally, as alluded to earlier, there is not yet a mature standard for die-to-die communication. Multi-chiplet design efforts have so far been limited to companies that are vertically integrated from IP to SoC product (and even to manufacturing, like Intel). Universal Chiplet Interconnect Express (UCIe) 1.0 defines a common PHY layer, and a protocol layer to carry Peripheral Component Interconnect Express (PCIe) and Compute Express Link (CXL) protocols over a die-to-die interface. However, if you need to carry other protocols, the specification essentially leaves the definition to the implementer. For a chiplet ecosystem to fully interoperate, more standardization is needed to carry protocols other than PCIe and CXL (such as AMBA protocols).

Chiplet (Die-to-Die) Interfaces

There are already many existing, and competing, die-to-die interfaces in the industry. The following table shows some die-to-die interfaces currently in the market.

| Standard | Promoter | Description |
| --- | --- | --- |
| AIB (Advanced Interface Bus) | Intel, CHIPS Alliance | Parallel interface. Used for example in Intel Stratix 10 FPGA. Latest spec v2.0 (June 2021) |
| BoW (Bunch of Wires) | Open Compute Project (OCP) subgroup Open Domain Specific Architecture (ODSA) | Parallel interface. Championed by the Open Compute Project (OCP) |
| OpenHBI 1.0/2.0 | Xilinx, OCP ODSA | Parallel interface inspired by JEDEC HBM. PHY can also support JEDEC HBM devices |
| PCIe | Intel, PCI-SIG | Serial interface. Uses PCIe as a short-range die-to-die interface. Not ideal (very high power). Used in Intel Kaby Lake-G CPU |
| UCIe (Universal Chiplet Interconnect Express) | Intel, industry consortium | Parallel interface. Defines how to carry PCIe and CXL protocols over a die-to-die interface, and a raw/streaming mode. 1.0 spec released March 2022 |
| XSR (Extra Short Reach), USR (Ultra Short Reach), VSR (Very Short Reach) | OIF | Serial interface. Championed by the Optical Internetworking Forum (OIF) |

Selection of die-to-die interfaces in the market today

When comparing die-to-die interfaces, designers use several common metrics:

  • Data rate – the data rate of a single data I/O
  • Bump space (pitch) – spacing between adjacent data I/Os of the die-to-die PHY, on the die
  • Power efficiency – power to transmit a bit to the other die. A common metric is pJ/bit
  • Edge density – combined metric of data rate and bump pitch. A common metric is Tbps/mm, which means for 1mm of die edge, how many I/Os (and at what data rate) can be packed
  • Area density – combined metric of data rate and bump pitch. A common metric is Tbps/mm², which means for 1 mm² of die area, how many I/Os (and at what data rate) can be packed
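
As a rough sketch of how the two density metrics follow from data rate and bump pitch, the estimate below assumes a square bump grid in which every bump is a data I/O. Real PHYs also spend bumps on power, ground, and clocks, so actual numbers are lower. The symbols R (data rate per I/O), p (bump pitch), and N_rows (number of bump rows behind the die edge) and the example values are illustrative assumptions, not figures from any specification.

```latex
% R = data rate per I/O, p = bump pitch, N_rows = bump rows along the die edge (all assumed)
\[ \text{edge density} \approx \frac{N_{rows} \cdot R}{p}, \qquad
   \text{area density} \approx \frac{R}{p^2} \]

% Example: R = 4 Gbps, p = 50 um = 0.05 mm, N_rows = 4
\begin{align*}
\text{edge density} &\approx \frac{4 \times 4\,\text{Gbps}}{0.05\,\text{mm}} = 320\,\text{Gbps/mm} = 0.32\,\text{Tbps/mm} \\
\text{area density} &\approx \frac{4\,\text{Gbps}}{(0.05\,\text{mm})^2} = 1600\,\text{Gbps/mm}^2 = 1.6\,\text{Tbps/mm}^2
\end{align*}
```

The quadratic dependence of area density on pitch is why a smaller bump pitch matters so much for 3D-stacked interfaces.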

A presentation from OCP Tech Week in November 2020, Die-to-Die Interface Comparison, compares several die-to-die interfaces along these metrics, so I will not repeat the comparison here.

Chiplet Integration Methods

Chiplets can be integrated using a variety of methods. Intel and TSMC offer similar, competing technologies to address different integration requirements. Some major categories are:

  1. Standard / Multi-Chip Package
  2. 2.5D Silicon interposer
  3. 2.5D Silicon “bridge”
  4. 3D Solder Bonding
  5. 3D Hybrid Bonding

The following image, that Intel presented at its 2017 Technology and Manufacturing Day, shows a good comparison between methods 1 to 3.

Comparing chiplet integration methods. Intel Technology and Manufacturing Day 2017

The following image, presented at Intel Architecture Day 2020, shows a comparison between methods 4 and 5.

Comparing solder and hybrid bonding chiplet integration methods. Intel Architecture Day 2020

Each chiplet integration method uses different technologies and manufacturing flows, leading to different properties. However, one key comparison is the bump pitch of the die-to-die interface, which dictates how densely the wires in a die-to-die interface can be packed, and indirectly the bandwidth of the die-to-die interface.

For reference, the bump pitches of some die-to-die technologies are on the order of:

  • Standard package ~100um
  • 2.5D advanced package (e.g. Intel EMIB) ~50um
  • 3D (solder) bonding (e.g. Intel Foveros) ~50um
  • 3D hybrid bonding (e.g. Intel Foveros Direct) <=10um

Standard / Multi-Chip Package

Standard package is the simplest integration method to understand. The multiple dies are simply placed on a single package substrate, and connected together via traces in the substrate. This is also sometimes called multichip modules (MCMs). There is no advanced packaging technology involved.

2.5D Integration Methods

2.5D integration methods still integrate dies in two dimensions. The name “2.5D” is intended to convey that these advanced packaging methods can achieve a much higher signal density compared to traditional 2D integration methods (standard / multi-chip package).

A silicon interposer is a piece of silicon that sits between the die and the package substrate. Since this layer is made from silicon, it can be manufactured using advanced silicon manufacturing processes, and achieve a dense bump pitch. However, as with any other silicon die, passing a signal from the top side to the bottom side of the interposer requires through-silicon vias (TSVs) to be manufactured into the silicon. Both this extra piece of silicon (the interposer) and the TSVs increase the cost of the solution.

2.5D silicon “bridge” packaging refers to integration methods that embed a smaller piece of “bridge” silicon, within either a (non-silicon) interposer or substrate, to act as die-to-die interconnect. An example of this method is Intel Embedded Multi-die Interconnect Bridge (EMIB) technology.

3D Solder Bonding

3D integration methods truly stack dies on top of each other in three dimensions. More traditional 3D integration methods use solder bumps to bond two dies together. Solder bumps can only be fabricated so densely, which limits the bump pitch and signal density that can be achieved in the 3D die-to-die interconnect. Intel Foveros is an example of 3D solder bonding technology.

3D Hybrid Bonding (Interconnect; HBI)

3D hybrid bonding is also true 3D integration. It does not use solder microbumps to bond two dies. Instead, the interconnect of each die is exposed on the surface of the die, and directly bonded together (without solder). This technique removes the solder bumps, which is one of the limiters of bump pitch and interconnect density. As a result, hybrid bonding can achieve much smaller bump pitches. Intel Foveros Direct is an example of 3D hybrid bonding. It supports bump pitch of <10um.

Examples of Chiplets SoCs

Both Intel and AMD have utilized chiplet-based designs in their latest server and client CPUs. On the server CPU front, AMD’s 3rd generation EPYC (Milan) processor is composed of 8 CPU chiplets and 1 I/O chiplet, integrated in 2D. The AMD Milan-X processor with 3D V-Cache additionally has an SRAM cache die integrated in 3D on top of each of the compute dies. Intel’s upcoming Sapphire Rapids Xeon server CPU is also a chiplet-based design, and comprises 4 chiplets connected in 2.5D using EMIB.

Intel Lakefield hybrid CPU is the first Intel SoC to use Intel Foveros 3D integration technology. I already mentioned AMD Milan-X processor with 3D V-Cache as another example of 3D integration (manufactured by TSMC).

Intel’s Ponte Vecchio High Performance Compute GPU is one of the extreme examples of chiplet-based design, comprising a whopping 40+ tiles per SoC. See YouTube: Xe-HPC and Ponte Vecchio – Architecture Day 2021 | Intel Technology.

You can find some nice figures and animations of these chiplet SoC examples by following the links.

Conclusion

Chiplet-based SoC designs are already mainstream in many high-end markets like server CPUs, HPC GPUs, and high-end client CPUs. This trend will likely continue, as more and more SoCs become complex enough to realize the benefits of disaggregating from a single monolithic die to multiple smaller chiplets. Chiplet-based designs offer many advantages, like potentially lower cost, overcoming the maximum die size (reticle size), and being more dense, more modular, and more reusable. But they also come with many challenges, like higher design complexity, yield loss from packaging defects, and many proprietary protocols and competing standards. The Universal Chiplet Interconnect Express (UCIe) is one standard that is gathering momentum in the industry, but it will be a long journey to arrive at the holy grail of being able to build SoCs by mixing and matching chiplets from different vendors and foundries.

This has been a fun topic to research (and a long topic to write about). Everything about chiplets is so new and fast-developing that anything written is likely to become outdated as soon as it is published. I’m excited to follow these ongoing developments, to see where this emerging field goes.

Feel free to leave me some comments below if you have feedback about this article!

Verilog Module for Design and Testbench
https://www.verilogpro.com/verilog-module-for-design-and-testbench/
Sun, 19 Jun 2022

A Verilog module is a building block that defines a design or testbench component, by defining the building block’s ports and internal behaviour. Higher-level modules can embed lower-level modules to create hierarchical designs. Different Verilog modules communicate with each other through Verilog ports. Together, the many Verilog modules communicate and model the dataflow of a larger, hierarchical design.

Verilog has a simple organization. All data, functions, and tasks are in modules, except for system tasks and functions, which are global. Any uninstantiated module is at the top level. A model must contain at least one top-level module.

Defining a Verilog Module

A Verilog module is enclosed between the keywords module and endmodule. It has the following components:

  • Keyword module to begin the definition
  • Identifier that is the name of the module
  • Optional list of parameters
  • Optional list of ports (to be addressed more deeply in a future article)
  • Module item
  • Keyword endmodule to end the definition

Let’s address each of these components one by one.

Verilog Parameter

Verilog parameters allow a single piece of Verilog module code to be more extensible and reusable. The module parameter port list (the “#( parameter ... )” list in the module header) was introduced in Verilog-2001; in the original Verilog-1995, parameters could only be declared inside the module body. Each instantiation of a Verilog module can supply different values to the parameters, creating different variations of the same base Verilog module. For example, a FIFO Verilog module may have a Verilog parameter to adjust its data width (or even data type, in SystemVerilog). They are not strictly necessary for a design, so I will defer discussing the topic further to a future article.
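
As a brief preview, here is a minimal sketch of one module instantiated twice with different parameter values. The module names (param_reg, param_reg_top) and signal names are hypothetical and chosen only for this illustration.

```verilog
// Hypothetical register module with a configurable data width
module param_reg
#(
    parameter WIDTH = 8              // default width if not overridden
) (
    input  wire             clk,
    input  wire [WIDTH-1:0] d,
    output reg  [WIDTH-1:0] q
);
    always @(posedge clk)
        q <= d;
endmodule

// Hypothetical wrapper that creates a 4-bit and a 16-bit variant
module param_reg_top
(
    input  wire        clk,
    input  wire [3:0]  d4,
    input  wire [15:0] d16,
    output wire [3:0]  q4,
    output wire [15:0] q16
);
    // Parameter override by name (Verilog-2001 syntax)
    param_reg #(.WIDTH(4))  i_reg4  (.clk(clk), .d(d4),  .q(q4));
    param_reg #(.WIDTH(16)) i_reg16 (.clk(clk), .d(d16), .q(q16));
endmodule
```

The same source code produces two registers of different widths, which is the kind of reuse that parameters are meant to enable.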

Verilog Port

A list of Verilog ports (a port list) is optional in a Verilog module. For example, a top level testbench may not have any Verilog ports at all. Verilog ports allow different modules of a design to communicate with each other. There are other (more backdoor) ways that Verilog modules can communicate. But for a design that is intended to be synthesized, Verilog ports are the standard method. There are many ways to code port connections. This will be discussed in more detail in another future article.

Module Item

Module item is essentially the “code that is inside the module” (after the port declaration). It defines what constitutes the module, and can include many different types of declarations and definitions (net and variable declarations, always blocks and initial blocks, etc.)

Putting it Together

Here is a very simple example of a Verilog module definition that puts together all the pieces above.

module my_module
#(
    parameter WIDTH = 1
) (
    input wire              clk,
    input wire              rst_n,
    input wire [WIDTH-1:0]  in_a, in_b,
    output reg [WIDTH-1:0]  out_c
);

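// Registered AND of the two inputs, with active-low asynchronous reset
always @(posedge clk or negedge rst_n)
    if (!rst_n)
        out_c <= {WIDTH{1'b0}};
    else
        out_c <= in_a & in_b;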

endmodule

Instantiating a Verilog Module

A Verilog module can instantiate other Verilog modules, creating a hierarchy of modules to form the full design and testbench. Any uninstantiated module is at the top level.

Instantiation Statement

The Verilog module instantiation statement creates one or more named instances of a defined module. Multiple instances (identical copies of the Verilog module) can be created in a single statement. This coding style is obviously easier with simple modules that have few (or no) ports. An instance name can even be followed by a range specification, which creates an array of instances.
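
The two forms described above can be sketched as follows. The module names (and2, inst_examples) and signal names are hypothetical and used only for this illustration.

```verilog
// Hypothetical 2-input AND gate used as the instantiated module
module and2
(
    input  wire a,
    input  wire b,
    output wire y
);
    assign y = a & b;
endmodule

module inst_examples
(
    input  wire [3:0] a, b,
    output wire [3:0] y_arr,
    output wire       y0, y1
);
    // Two named instances created in a single instantiation statement
    and2 u0 (a[0], b[0], y0),
         u1 (a[1], b[1], y1);

    // An array of four instances (Verilog-2001). Each 4-bit connection is
    // split so that every instance receives one bit of a, b and y_arr.
    and2 u_arr [3:0] (.a(a), .b(b), .y(y_arr));
endmodule
```

The individual instances of the array can be referred to as u_arr[0] through u_arr[3].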

Connecting the Ports

For a Verilog module that does have ports, Verilog defines two styles to connect the ports:

  • By position – specify the connection to each port in the same order as the ports were listed in the module declaration
  • By name – specify each port by the name used in the (sub) module declaration, followed by the name used in the instantiating module

When connecting by name, an unconnected port can be indicated by either omitting it from the port list, or by providing no expression in the parentheses ( .name () ). The two types of port connections shall not be mixed (in Verilog) in a single instantiation.

For a Verilog module that does not have any port, you still need to write the parentheses when instantiating it.

As to what can be connected to a port: in Verilog, it can be a register or net identifier, an expression, or a blank (to indicate no connection to that port). An unconnected port may also be simply omitted from the port list, but only when connecting by name.

Here are some examples to illustrate the concepts above.

wire clk, rst_n;
wire a, b, c1, c2, c3, c4, d;

// Instantiating a module and connecting ports by position
my_module mod_b (clk, rst_n, a, b, c1);

// Instantiating a module and connecting ports by name
my_module mod_a
(
    .clk   (clk),
    .rst_n (rst_n),
    .in_a  (a),
    .in_b  (b),
    .out_c (c2)
);

// Instantiating a module, but leaving a port unconnected (a bug in this case!)
my_module mod_c
(
    .clk   (), // A bug! But this will compile
    .rst_n (rst_n),
    .in_a  (a),
    .in_b  (b),
    .out_c (c3)
);

// Instantiating a module with no ports
my_module_with_no_ports mod_d();

// Connecting an expression to a port
my_module mod_e
(
    .clk   (clk),
    .rst_n (rst_n),
    .in_a  (a & d), // an expression as a port connection
    .in_b  (b),
    .out_c (c4)
);

Verilog Module Hierarchy

When instantiating and connecting Verilog modules and ports, a hierarchical design is created. Every identifier (for example every module instance) has a unique hierarchical path name. This is generally useful in testbench coding, where you sometimes need to reference a particular signal, in a somewhat backdoor way, in a different module within your testbench. It is generally not used in design coding, where you always want to use Verilog ports to make explicit connections and model dataflow (with the possible exception of Verilog defparam, a topic for a future article).

The complete hierarchy of names can be viewed as a tree structure, with the root being the top level module. Each module, generate block instance, task, function, named begin-end or fork-join block defines a new hierarchical level (also called a scope), in a particular branch of the tree. A design description contains one or more top-level modules. Each such module forms the root of a name hierarchy. The following figure shows an example Verilog module hierarchy.

Verilog module hierarchical name example

A Verilog generate block that does not have a name label creates a hierarchy that is only visible within the block, and within the sub-tree formed by this block, and nowhere else. Therefore it’s good practice to always name Verilog generate blocks so all identifiers can be referenced throughout your environment. See my article Verilog Generate Configurable RTL Designs.

Now that we have defined a hierarchy, we can reference any named Verilog object or hierarchical name reference, by concatenating the names of the modules, module instance names, generate blocks, tasks, functions, or named blocks that contain it. Each of the names in the hierarchy is separated by a period.

You can reference the complete path name, starting from the top-level (root) module, or you can reference “downwards”, starting from the level where the path is being used. You can also reference a particular instance of an array (or a generate loop) by a constant expression within square brackets. The expression shall evaluate to one of the legal index values of the array.

The following code matches the example design hierarchy above. We’ll use it to illustrate some examples of referencing.

// A trivial AND module
module my_and
(
  input wire in1,
  input wire in2,
  output wire out
);
  assign out = in1 & in2;

endmodule

// A design that instantiates the AND module
module dut
(
  input wire clk,
  input wire rst_n,
  input wire [3:0] a, b,
  output wire [3:0] c // must be a net: it is driven by module instance outputs
);
  my_and i_my_and_3 (a[3], b[3], c[3]);

  generate
    genvar gi;
    for (gi=0; gi<3; gi=gi+1) begin : gen_and
      my_and i_my_and (a[gi], b[gi], c[gi]);
    end // gen_and
  endgenerate
endmodule

// Testbench that instantiates the design under test (DUT)
module testbench;
  wire clk, rst_n;
  wire [3:0] a, b, c;

  dut i_dut
  (
    .clk   (clk),
    .rst_n (rst_n),
    .a     (a),
    .b     (b),
    .c     (c)
  );

  // Some examples of hierarchical references
  // ($display is used because only one $monitor can be active at a time)
  initial begin
    $display(a); // reference to testbench hierarchy
    $display(i_dut.a); // reference to dut hierarchy
    $display(i_dut.i_my_and_3.out); // reference to dut sub-module hierarchy
    $display(i_dut.gen_and[0].i_my_and.out); // reference to generated sub-module
  end
endmodule

Some final notes on Verilog module hierarchy:

  • Each node in the hierarchical name tree creates a new scope for identifiers. Within each scope, the same identifier (e.g. net name, variable name) can be declared only once. Conversely, it is permitted to use the same identifier in different scopes, or different Verilog modules.
  • Objects declared in automatic tasks and functions are one exception that cannot be referenced by hierarchical names.
  • It is also possible, from Verilog LRM perspective, to reference “upwards” in the hierarchy. For example, it is possible for a lower level module to reference a variable in the module that instantiates the lower level module. However, the usage is tricky, and application is dubious (certainly not recommended for design code), so I will not go into it here.

Verilog Module Summary

The Verilog module is one of the fundamental hierarchical constructs in Verilog. It encapsulates code and functionality, allowing a larger design to be built from lower level components, enhancing modularity and reuse. This article described the basic syntax of a Verilog module, how to define a module, how to connect multiple modules together, and how the interconnection creates a design hierarchy.

The next article in this fundamentals series will dive deeper into Verilog ports, exploring how the syntax evolved from the original Verilog to the latest SystemVerilog language manuals. Stay tuned!

Verilog Always Block for RTL Modeling
https://www.verilogpro.com/verilog-always-block/
Wed, 13 Apr 2022

The Verilog always block is a procedural statement that starts an activity flow. It is essentially an infinite loop. However, when combined with a Verilog event expression, it can be used to model combinational and sequential logic.

This article is going to introduce the Verilog always block—one of the most basic constructs that has existed since the very beginning of Verilog (IEEE standard 1364-1995)—relate it to some other introductory constructs, and use them to write some simple hardware logic.

After a long hiatus, I’m picking up the proverbial pen again and writing some Verilog articles! I have a new goal to create a series of articles to help new engineers transition from “textbook knowledge” to real world knowledge needed to become a digital design engineer. You’ll see some new articles on basic concepts, as well as intermediate level concepts similar to my previous articles. Hope you find these new articles useful for your career! Let’s get started!

Verilog Always Block In a Nutshell

Verilog behavioural models (an RTL design/model is one class of behavioural model) contain procedural statements that control the simulation, and manipulate variables to model hardware circuitry and data flow. The Verilog always block is a procedural statement that starts an activity flow. Each Verilog always block starts a separate activity flow. All of the activity flows are concurrent, to model the inherent concurrency of hardware. Each Verilog always block repeats continuously throughout the duration of the simulation, executing the statements defined in its procedure. Its activity ceases only when the simulation is terminated.

An Infinite Loop?

The Verilog always block essentially describes an infinite loop. Without some form of timing control to avoid a zero-delay infinite loop, and allow simulation time to advance, a simulation deadlock condition can be created (read: a simulation hang). The following code, for example, creates such a zero-delay infinite loop.

always areg = ~areg;

If there is a Verilog always block with no timing control, your simulation will look as though it has hung, and will not advance in time. So let’s add a timing control to make the code more useful:

always #half_period clk = ~clk;

Now this becomes a potentially useful statement. It causes the clk signal to toggle once every half_period time units. As you may have noticed, this can be used to create a continuously toggling clock stimulus in a simulation.

The “#” is formally called a delay control. It causes the simulation to insert the specified delay (to that procedural block) where the “#” is written.
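
Here is a minimal, self-contained sketch of such a clock generator in a testbench. The module name clk_gen_tb, the HALF_PERIOD value, and the timescale are assumptions made for this illustration.

```verilog
`timescale 1ns/1ps

// Hypothetical testbench that only generates and prints a clock
module clk_gen_tb;
    localparam HALF_PERIOD = 5;   // assumed: 5 ns half period (100 MHz clock)

    reg clk;

    // Give clk a known starting value; otherwise it stays x, since ~x is x
    initial clk = 1'b0;

    // The delay control lets simulation time advance on every loop iteration
    always #HALF_PERIOD clk = ~clk;

    initial begin
        $monitor("time=%0d clk=%b", $time, clk);
        #(4*HALF_PERIOD) $finish;  // stop after two clock periods
    end
endmodule
```

Note the separate initial block that initializes clk. It is easy to forget, and without it the clock never leaves the unknown (x) state.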

Timing Control with Event Expression (Sensitivity List)

Another way to provide a timing control is in the form of a Verilog event expression. The syntax of Verilog event expression is “@(event_expression)“. For the procedural block that contains the Verilog event expression, it causes the simulator to wait until the event_expression has occurred before continuing execution of that procedural block. When used together with a Verilog always block, it adds the necessary timing control to make the Verilog always block useful to model hardware. The SystemVerilog standards (IEEE 1800-2005 onwards) also describe the event expression, when used to trigger a Verilog always block, as a “sensitivity list”.

The event expression (or sensitivity list) can contain multiple signals. Each signal is separated by the keyword “or” (or, from Verilog-2001 onwards, by a comma). The following Verilog always block describes a simple OR gate with inputs A and B, and output C. This code tells the simulator to re-evaluate the value of output C whenever the signal A or B changes. It is not entirely straightforward, but you can see how this essentially describes the behaviour of an OR gate in simulation.

always @(A or B)
    C = A | B;

Modelling Hardware Logic with Verilog Always Block

Combining these ideas brings us to the more common usage of Verilog always block—together with an event expression.

always @(event_expression)
    single_statement;

always @(event_expression)
begin
    multiple_statements;
end

For hardware modeling, the Verilog event expression is typically specified in one of two forms:

  • Sequential logic: execution triggered based on a clock event (and frequently a reset event)
  • Combinational logic: execution triggered based on the inputs to the logic (i.e. nets and variables on the right hand side of an assignment statement)

Modeling Sequential Logic

A typical event expression for a flip-flop based design is “@(posedge clock_name)”. This tells the simulator to only evaluate the Verilog always block when the clock_name signal makes a positive edge transition (0->1), which is how a flip-flop based design is constructed. Without the “posedge” keyword, the Verilog always block will trigger on any transition of the clock_name signal, both positive and negative.

The following code describes a flip-flop with input d and output q, clocked by positive edge of clk (with no reset):

always @(posedge clk)
    q <= d;

Let’s make it a more complete flip-flop design by also adding a reset, namely an asynchronous reset. An asynchronous reset means the reset will occur even without the presence of a clock. An asynchronous reset is modeled by adding the reset signal also to the sensitivity list. This will cause the simulator to re-evaluate this Verilog always block when the reset signal transitions, irrespective of whether the clk signal transitions (which is what an asynchronous reset means). The Verilog event expression “@(negedge rst_n)” makes the reset active low (trigger an evaluation when rst_n transitions 1->0), and the “if” statement specifies a reset value for the flip-flop. The following code describes the same flip-flop, now with an active-low asynchronous reset (rst_n) that will reset the flip-flop output q to 0 when the reset is asserted (when rst_n transitions 1->0).

always @(posedge clk or negedge rst_n)
    if (!rst_n)
        q <= 1'b0;
    else
        q <= d;

To describe two flip flops, you can write them as two separate Verilog always blocks. Let’s make it more interesting and put them in series into a two-stage pipeline. First the diagram, then the code.

Two stage pipelined D flip flops
always @(posedge clk or negedge rst_n)
    if (!rst_n)
        q1 <= 1'b0;
    else
        q1 <= d1;

always @(posedge clk or negedge rst_n)
    if (!rst_n)
        q2 <= 1'b0;
    else
        q2 <= q1;

Or you can also write it as one Verilog always block.

always @(posedge clk or negedge rst_n)
    if (!rst_n) begin
        q1 <= 1'b0;
        q2 <= 1'b0;
    end
    else begin
        q1 <= d1;
        q2 <= q1;
    end

When there are multiple Verilog always blocks, there is no implied order of execution between them. There is also no limit to the number of always constructs that can be defined in a module.
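
The two-block pipeline above still behaves predictably without an execution order because it uses nonblocking assignments (<=): each right hand side is sampled before any left hand side is updated, so q2 always receives the value q1 had before the clock edge. The sketch below wraps the same two blocks in a module and adds a small testbench; the names pipe2 and tb_pipe2 and the stimulus timing are assumptions for illustration only.

```verilog
module pipe2 (
  input      clk,
  input      rst_n,
  input      d1,
  output reg q1,
  output reg q2
);
  // Stage 1
  always @(posedge clk or negedge rst_n)
    if (!rst_n) q1 <= 1'b0;
    else        q1 <= d1;

  // Stage 2: sees the value of q1 from before the clock edge,
  // regardless of which always block the simulator runs first
  always @(posedge clk or negedge rst_n)
    if (!rst_n) q2 <= 1'b0;
    else        q2 <= q1;
endmodule

module tb_pipe2;
  reg  clk = 1'b0;
  reg  rst_n, d1;
  wire q1, q2;

  pipe2 dut (.clk(clk), .rst_n(rst_n), .d1(d1), .q1(q1), .q2(q2));

  always #5 clk = ~clk;

  initial begin
    $monitor("t=%0t d1=%b q1=%b q2=%b", $time, d1, q1, q2);
    rst_n = 1'b0; d1 = 1'b0;
    #12 rst_n = 1'b1; d1 = 1'b1; // release reset between clock edges
    #40 $finish;
  end
endmodule
```

In the monitor output, q2 should follow q1 one clock edge later, which is the pipeline behaviour the diagram describes.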

Modeling Combinational Logic

The Verilog always block can also model combinational logic, but it is a bit less straightforward to understand. A physical implementation of a combinational circuit operates continuously, sampling the inputs and calculating the resulting outputs. A simulator, however, cannot execute a logical statement “continuously” without causing the zero-delay infinite loop described at the beginning of the article. Therefore, to simulate a combinational circuit, we have to define specific events at which the simulator should execute procedures, while maintaining correct behaviour.

The simple answer is that, to model a combinational circuit, the sensitivity list needs to contain all the inputs to the circuit (all the variables on the right hand side of the assignment). The following code describes a combinational AND gate using a Verilog always block.

always @(A or B)
    C = A & B;

More on Event Expression with Verilog Always Block

Here are a few more tips with using Verilog event expression with Verilog always block.

Using Comma in Event Expression

The Verilog-2001 standard introduced the use of the comma “,” to separate items in the event expression. Before that, in the original Verilog-1995 standard, items in the event expression had to be separated by the keyword “or”. I have used the Verilog-1995 syntax in all the code examples so far. Here is the same flip-flop code example written with the Verilog-2001 syntax.

always @(posedge clk, negedge rst_n)
    if (!rst_n)
        q <= 1'b0;
    else
        q <= d;

Personally I prefer the Verilog-2001 syntax. That is how I write my code.

Implicit Event Expression @* or @(*)

The sensitivity list is a frequent source of bugs in a Verilog design/model. A deeper discussion of common pitfalls will be the subject of a future article. The standard writers wanted to simplify the usage of the sensitivity list in Verilog-2001, so they added the “implicit event_expression list” syntax, at least for describing combinational logic. Using an implicit event expression list, the AND gate combinational logic can be rewritten as follows.

always @*
    C = A & B;

always @(*)
    C = A & B;

More precisely, the “@*” and “@(*)” syntax will add all nets and variables that appear in the (right hand side of a) statement to the event expression, with some exceptions.
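
The benefit is easier to see with more than two inputs. Below is a small sketch of a 2-to-1 multiplexer; the module and signal names (mux2, a, b, sel, y) are made up for illustration. With an explicit list you would have to write @(a or b or sel), and forgetting sel is a classic simulation bug. With @* the list is inferred from the signals the block reads.

```verilog
module mux2 (
  input      a,
  input      b,
  input      sel,
  output reg y
);
  // @* is inferred as @(a or b or sel): every signal read in the block
  always @*
    if (sel)
      y = b;
    else
      y = a;
endmodule
```

Because y is assigned in both branches, the block describes pure combinational logic and no latch is implied.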

Verilog Always Block, Evolution to SystemVerilog always_comb and always_ff

SystemVerilog adds several new constructs in addition to the Verilog always block, primarily to address the exceptions noted above. You can read more about these constructs in my article SystemVerilog always_comb, always_ff.

Conclusion

Verilog always block is one of the four procedural statements in the original Verilog language. It can be used to model testbench stimulus as well as hardware design. The Verilog always block is essentially an infinite loop. However, when combined with a Verilog event expression, it can be used to model combinational and sequential logic. SystemVerilog adds several new versions of “always”, in addition to Verilog always block, to address some limitations and pitfalls of the original Verilog syntax.

Verilog Generate Configurable RTL Designs https://www.verilogpro.com/verilog-generate-configurable-rtl/ Thu, 04 Jan 2018 18:00:09 +0000

The post Verilog Generate Configurable RTL Designs appeared first on Verilog Pro.

Verilog generate statement is a powerful construct for writing configurable, synthesizable RTL. It can be used to create multiple instantiations of modules and code, or conditionally instantiate blocks of code. However, many Verilog programmers often have questions about how to use Verilog generate effectively. In this article, I will review the usage of three forms of Verilog generate—generate loop, if-generate, and case-generate.

Types of Verilog Generate Constructs

There are two kinds of Verilog generate constructs. Generate loop constructs allow a block of code to be instantiated multiple times, controlled by a variable index. Conditional generate constructs select at most one block of code between multiple blocks. Conditional generate constructs include if-generate and case-generate forms.

Verilog generate constructs are evaluated at elaboration, which occurs after parsing the HDL (and preprocessor), but before simulation begins. Therefore all expressions within generate constructs must be constant expressions, deterministic at elaboration time. For example, generate constructs can be affected by values from parameters, but not by dynamic variables.

A Verilog generate block creates a new scope and a new level of hierarchy, almost like instantiating a module. This sometimes causes confusion when trying to write a hierarchical reference to signals or modules within a generate block, so it is something to keep in mind.

Use of the keywords generate and endgenerate (and begin/end) is actually optional. If they are used, then they define a generate region. Generate regions can only occur directly within a module, and they cannot nest. For readability, I like to use the generate and endgenerate keywords.

Verilog Generate Loop

The syntax for a generate loop is similar to that of a for loop statement. The loop index variable must first be declared in a genvar declaration before it can be used. The genvar is used as an integer to evaluate the generate loop during elaboration. The genvar declaration can be inside or outside the generate region, and the same loop index variable can be used in multiple generate loops, as long as the loops don’t nest.

Within each instance of the “unrolled” generate loop, an implicit localparam is created with the same name and type as the loop index variable. Its value is the “index” of the particular instance of the “unrolled” loop. This localparam can be referenced from RTL to control the generated code, and even referenced by a hierarchical reference.

A generate block in a Verilog generate loop can be named or unnamed. If it is named, then an array of generate block instances is created. Some tools warn you about unnamed generate loops, so it is good practice to always name them.

The following example shows a gray to binary code converter written using a Verilog generate loop.

Example of parameterized gray to binary code converter

module gray2bin
#(parameter SIZE = 8)
(
  input  [SIZE-1:0] gray,
  output [SIZE-1:0] bin
);

genvar gi;
// The generate and endgenerate keywords are optional
// generate (optional)
  for (gi=0; gi<SIZE; gi=gi+1) begin : genbit
    // bin[gi] is the XOR reduction of gray bits SIZE-1 down to gi
    assign bin[gi] = ^gray[SIZE-1:gi]; // Thanks Dhruvkumar!
  end
// endgenerate (optional)
endmodule

Another example from the Verilog-2005 LRM illustrates how each iteration of the Verilog generate loop creates a new scope. Notice wire t1, t2, t3 are declared within the generate loop. Each loop iteration creates a new t1, t2, t3 that do not conflict, and they are used to wire one generated instance of the adder to the next. Also note the naming of the hierarchical reference to reference an instance within the generate loop.

module addergen1
#(parameter SIZE = 4)
(
  input  logic [SIZE-1:0] a, b,
  input  logic            ci,
  output logic            co,
  output logic [SIZE-1:0] sum
);

wire [SIZE :0] c;
genvar i;

assign c[0] = ci;

// Hierarchical gate instance names are:
// xor gates: bitnum[0].g1 bitnum[1].g1 bitnum[2].g1 bitnum[3].g1
// bitnum[0].g2 bitnum[1].g2 bitnum[2].g2 bitnum[3].g2
// and gates: bitnum[0].g3 bitnum[1].g3 bitnum[2].g3 bitnum[3].g3
// bitnum[0].g4 bitnum[1].g4 bitnum[2].g4 bitnum[3].g4
// or gates: bitnum[0].g5 bitnum[1].g5 bitnum[2].g5 bitnum[3].g5
// Gate instances are connected with nets named:
// bitnum[0].t1 bitnum[1].t1 bitnum[2].t1 bitnum[3].t1
// bitnum[0].t2 bitnum[1].t2 bitnum[2].t2 bitnum[3].t2
// bitnum[0].t3 bitnum[1].t3 bitnum[2].t3 bitnum[3].t3

for(i=0; i<SIZE; i=i+1) begin:bitnum
  wire t1, t2, t3;
  xor g1 ( t1, a[i], b[i]);
  xor g2 ( sum[i], t1, c[i]);
  and g3 ( t2, a[i], b[i]);
  and g4 ( t3, t1, c[i]);
  or g5 ( c[i+1], t2, t3);
end

assign co = c[SIZE];

endmodule

Generate loops can also nest. Only a single generate/endgenerate is needed (or none, since it’s optional) to encompass the nested generate loops. Remember each generate loop creates a new scope. Therefore the hierarchical reference to the inner loop needs to include the label of the outer loop.
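
As a rough sketch of nested generate loops, the module below builds a ROWS by COLS grid of AND gates over flattened buses. The module name nested_gen, the parameters, and the block labels row and col are all assumptions chosen for illustration.

```verilog
module nested_gen
#(parameter ROWS = 2, COLS = 3)
(
  input  [ROWS*COLS-1:0] a,
  input  [ROWS*COLS-1:0] b,
  output [ROWS*COLS-1:0] y
);
  genvar r, c;

  // A single generate/endgenerate pair surrounds both loops
  generate
    for (r = 0; r < ROWS; r = r + 1) begin : row
      for (c = 0; c < COLS; c = c + 1) begin : col
        // Hierarchical name of each gate: row[r].col[c].g
        and g (y[r*COLS + c], a[r*COLS + c], b[r*COLS + c]);
      end
    end
  endgenerate
endmodule
```

With the default parameters, the last gate would be referenced as row[1].col[2].g: the outer loop label comes first, then the inner one.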

Conditional If-Generate

Conditional if-generate selects at most one generate block from a set of alternative generate blocks. Note I say at most, because it may also select none of the blocks. The condition must again be a constant expression during elaboration.

Generate blocks in a conditional if-generate may be named or unnamed, and may or may not have begin/end. Without begin/end, a block can contain only one item. A conditional generate block also creates a separate scope and level of hierarchy, like a generate loop. Since conditional generate selects at most one block of code, it is legal to give the alternative blocks of code within a single if-generate the same name. That helps to keep hierarchical references to the code common regardless of which block of code is selected. Different generate constructs, however, must have different names.
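
Here is a minimal sketch of an if-generate that uses the same block name in both branches. The module name sync_or_pass, the parameter USE_FLOP, and the label g_out are hypothetical names picked for this example.

```verilog
module sync_or_pass
#(parameter USE_FLOP = 1)
(
  input  clk,
  input  d,
  output q
);
  generate
    if (USE_FLOP) begin : g_out
      // Registered path: q is d delayed by one clock
      reg q_r;
      always @(posedge clk)
        q_r <= d;
      assign q = q_r;
    end
    else begin : g_out
      // Same label is legal because only one branch is elaborated
      assign q = d;
    end
  endgenerate
endmodule
```

Whichever branch is selected, anything declared inside it lives under the scope g_out, so external references do not have to change with the parameter (although q_r only exists when USE_FLOP is nonzero).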

Conditional Case-Generate

Similar to if-generate, case-generate can also be used to conditionally select one block of code from several blocks. Its usage is similar to the basic case statement, and all rules from if-generate also apply to case-generate.
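
A case-generate might look like the sketch below, which picks one gate primitive based on a parameter. The module name gate_sel, the parameter KIND and its encoding, and the label g_gate are assumptions for illustration.

```verilog
module gate_sel
#(parameter KIND = 0) // hypothetical encoding: 0 = AND, 1 = OR, otherwise XOR
(
  input  a,
  input  b,
  output y
);
  generate
    case (KIND)
      0: begin : g_gate
        and g (y, a, b);
      end
      1: begin : g_gate
        or g (y, a, b);
      end
      default: begin : g_gate
        xor g (y, a, b);
      end
    endcase
  endgenerate
endmodule
```

As with if-generate, the alternatives can share one name, so the selected gate is always g_gate.g within the module.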

Direct Nesting of Conditional Generate

There is a special case where nested conditional generate blocks that are not surrounded by begin/end can consolidate into a single scope/hierarchy. This avoids creating unnecessary scope/hierarchy within the module to complicate the hierarchical reference. This special case does not apply at all to loop generate.

The example below shows how this special rule can be used to construct complex if-else if conditional generate statements that belong to the same hierarchy.

module test;
  parameter p = 0, q = 0;
  wire a, b, c;

  //---------------------------------------------------------
  // Code to either generate a u1.g1 instance or no instance.
  // The u1.g1 instance of one of the following gates:
  // (and, or, xor, xnor) is generated if
  // {p,q} == {1,0}, {1,2}, {2,0}, {2,1}, {2,2}, {2, default}
  //---------------------------------------------------------

  if (p == 1)
    if (q == 0) begin : u1 // If p==1 and q==0, then instantiate
      and g1(a, b, c); // AND with hierarchical name test.u1.g1
    end
    else if (q == 2) begin : u1 // If p==1 and q==2, then instantiate
      or g1(a, b, c); // OR with hierarchical name test.u1.g1
    end
    // "else" added to end "if (q == 2)" statement
    else ; // If p==1 and q!=0 or 2, then no instantiation
  else if (p == 2)
    case (q)
      0, 1, 2:
        begin : u1 // If p==2 and q==0,1, or 2, then instantiate
          xor g1(a, b, c); // XOR with hierarchical name test.u1.g1
        end
      default:
        begin : u1 // If p==2 and q!=0,1, or 2, then instantiate
          xnor g1(a, b, c); // XNOR with hierarchical name test.u1.g1
        end
    endcase

endmodule

This generate construct will select at most one of the generate blocks named u1. The hierarchical name of the gate instantiation in that block would be test.u1.g1. When nesting if-generate constructs, the else always belongs to the nearest if construct. Note the careful placement of begin/end within the code. Any additional begin/end will violate the direct nesting requirements and cause an additional level of hierarchy to be created.

Named vs Unnamed Generate Blocks

It is recommended to always name generate blocks to simplify hierarchical reference. Moreover, various tools often complain about anonymous generate blocks. However, if a generate block is unnamed, the LRM does describe a fixed rule for how tools shall name an anonymous generate block based on the text of the RTL code.

First, each generate construct in a scope is assigned a number, starting from 1 for the generate construct that appears first in the RTL code within that scope, and increasing by 1 for each subsequent generate construct in that scope. The number is assigned to both named and unnamed generate constructs. All unnamed generate blocks will then be given the name genblk<n>, where <n> is the number assigned to its enclosing generate construct (for example genblk1, genblk2).

It is apparent from the rule that RTL code changes will cause the unnamed generate construct name to change. That in turn makes it difficult to maintain hierarchical references in RTL and scripts. Therefore, it is recommended to always name generate blocks.
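
To make the numbering rule concrete, here is a small sketch with two unnamed conditional generate constructs in one module. The module name unnamed_gen, the parameter EN, and the signal names are made up for illustration; the genblk names in the comments follow the rule described above.

```verilog
module unnamed_gen
#(parameter EN = 1)
(
  input  a,
  input  b,
  output y,
  output z
);
  // First generate construct in this scope: its block is named genblk1
  if (EN) begin
    wire t; // referenced as genblk1.t within this module
    assign t = a & b;
    assign y = t;
  end
  else begin
    assign y = 1'b0;
  end

  // Second generate construct in this scope: its block is named genblk2
  if (EN) begin
    wire t; // referenced as genblk2.t within this module
    assign t = a | b;
    assign z = t;
  end
  else begin
    assign z = 1'b0;
  end
endmodule
```

If a new generate construct were inserted above the first one, both existing blocks would be renumbered, which is exactly the maintenance problem that explicit names avoid.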

Conclusion

Verilog generate constructs are powerful ways to create configurable RTL that can have different behaviours depending on parameterization. Generate loop allows code to be instantiated multiple times, controlled by an index. Conditional generate, if-generate and case-generate, can conditionally instantiate code. The most important recommendation regarding generate constructs is to always name them, which helps simplify hierarchical references and code maintenance.

SystemVerilog Arrays, Flexible and Synthesizable https://www.verilogpro.com/systemverilog-arrays-synthesizable/ Tue, 10 Oct 2017 17:00:44 +0000

The post SystemVerilog Arrays, Flexible and Synthesizable appeared first on Verilog Pro.

In my last article on plain old Verilog Arrays, I discussed their very limited feature set. In comparison, SystemVerilog arrays have greatly expanded capabilities both for writing synthesizable RTL, and for writing non-synthesizable test benches. In this article, we’ll take a look at the synthesizable features of SystemVerilog Arrays we can use when writing design RTL.

Packed vs Unpacked SystemVerilog Arrays

Verilog had only one type of array. SystemVerilog arrays can be either packed or unpacked. Packed array refers to dimensions declared after the type and before the data identifier name. Unpacked array refers to the dimensions declared after the data identifier name.

bit [7:0] c1;         // packed array of scalar bit
real      u [7:0];    // unpacked array of real

int Array[0:7][0:31]; // unpacked array declaration using ranges
int Array[8][32];     // unpacked array declaration using sizes

Packed Arrays

A one-dimensional packed array is also called a vector. Packed array divides a vector into subfields, which can be accessed as array elements. A packed array is guaranteed to be represented as a contiguous set of bits in simulation and synthesis.

Packed arrays can be made of only the single bit data types (bit, logic, reg), enumerated types, and other packed arrays and packed structures. This also means you cannot have packed arrays of integer types with predefined widths (e.g. a packed array of byte).

Unpacked arrays

Unpacked arrays can be made of any data type. Each fixed-size dimension is represented by an address range, such as [0:1023], or a single positive number to specify the size of a fixed-size unpacked array, such as [1024]. The notation [size] is equivalent to [0:size-1].

Indexing and Slicing SystemVerilog Arrays

Verilog arrays could only be accessed one element at a time. In SystemVerilog arrays, you can also select one or more contiguous elements of an array. This is called a slice. An array slice can only apply to one dimension; other dimensions must have single index values in an expression.
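
A short sketch of slicing, assuming made-up array names (mem, lo, grid, row_part) and sizes. The first assignment slices a one-dimensional unpacked array; the second slices only the last dimension of a two-dimensional array while the first dimension gets a single index.

```systemverilog
module slice_demo;
  logic [7:0] mem      [0:7];       // unpacked array of eight bytes
  logic [7:0] lo       [0:3];       // destination for a four-element slice
  logic [7:0] grid     [0:3][0:7];  // two unpacked dimensions
  logic [7:0] row_part [0:1];       // destination for a two-element slice

  initial begin
    // Fill mem and grid with recognizable values
    for (int i = 0; i < 8; i++) begin
      mem[i] = 8'(i);
      for (int j = 0; j < 4; j++)
        grid[j][i] = 8'(16*j + i);
    end

    lo = mem[0:3];            // slice: elements 0 through 3 of mem

    // The slice applies to one dimension only; the other dimension
    // is selected with a single index
    row_part = grid[2][4:5];

    $display("lo[3]=%h row_part[0]=%h", lo[3], row_part[0]);
  end
endmodule
```

Both sides of each assignment have the same element type and the same number of elements, which is what makes the array assignments legal.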

Multidimensional Arrays

Multidimensional arrays can be declared with both packed and unpacked dimensions. Creating a multidimensional packed array is analogous to slicing up a continuous vector into multiple dimensions.

When an array has multiple dimensions that can be logically grouped, it is a good idea to use typedef to define the multidimensional array in stages to enhance readability. But notice that the order of the dimensions becomes a little confusing.

bit [3:0] [7:0] joe [0:9]; // 10 elements of 4 8-bit bytes
                          // (each element packed into 32 bits)

typedef bit [4:0] bsix;   // multiple packed dimensions with typedef
bsix [9:0] v5;            // equivalent to bit[9:0][4:0] v5 - thanks Yunsung!

typedef bsix mem_type [0:3]; // array of four unpacked 'bsix' elements
mem_type ba [0:7];           // array of eight unpacked 'mem_type' elements
                             // equivalent to bit[4:0] ba [0:7][0:3] - thanks Yunsung!

SystemVerilog Array Operations

SystemVerilog arrays support many more operations than their traditional Verilog counterparts.

+: and -: Notation

When accessing a range of indices (a slice) of a SystemVerilog array, you can specify a variable slice by using the [start+:increment width] and [start-:decrement width] notations. They are simpler than needing to calculate the exact start and end indices when selecting a variable slice. The increment/decrement width must be a constant.

bit signed [31:0] busA [7:0]; // unpacked array of 8 32-bit vectors
int busB [1:0];               // unpacked array of 2 integers
busB = busA[7:6];             // select a 2-vector slice from busA
busB = busA[6+:2];            // equivalent to busA[7:6]; typo fixed, thanks Tomer!

Assignments, Copying, and other Operations

SystemVerilog arrays support many more operations than Verilog arrays. The following operations can be performed on both packed and unpacked arrays.

A       = B;       // reading and writing the array
A[i:j]  = B[i:j];  // reading and writing a slice of the array
A[x+:c] = B[y+:d]; // reading and writing a variable slice of the array
A[i]    = B[i];    // accessing an element of the array
A       == B;      // equality operations on the array
A[i:j]  != B[i:j]; // equality operations on slice of the array

Packed Array Assignment

A SystemVerilog packed array can be assigned at once like a multi-bit vector, or also as an individual element or slice, and more.

logic [1:0][1:0][7:0] packed_3d_array;

always_ff @(posedge clk, negedge rst_n)
  if (!rst_n) begin
    packed_3d_array <= '0;      // assign 0 to all elements of array
  end
  else begin
    packed_3d_array[0][0][0]   <= 1'b0;         // assign one bit
    packed_3d_array[0][0]      <= 8'h0a;        // assign one element
    packed_3d_array[0][0][3:0] <= 4'ha;         // assign part select
    packed_3d_array[0]         <= 16'habcd;     // assign slice
    packed_3d_array            <= 32'h01234567; // assign entire array as vector
  end

Unpacked Array Assignment

All or multiple elements of a SystemVerilog unpacked array can be assigned at once to a list of values. The list can contain values for individual array elements, or a default value for the entire array.

logic [7:0] a, b, c;
logic [7:0] d_array[0:3];
logic [7:0] e_array[3:0]; // note index of unpacked dimension is reversed
                          // personally, I prefer this form
logic [7:0] mult_array_a[3:0][3:0];
logic [7:0] mult_array_b[3:0][3:0];

always_ff @(posedge clk, negedge rst_n)
  if (!rst_n) begin
    d_array <= '{default:0};      // assign 0 to all elements of array
  end
  else begin
    d_array        <= '{8'h00, c, b, a}; // d_array[0]=8'h00, d_array[1]=c, d_array[2]=b, d_array[3]=a
    e_array        <= '{8'h00, c, b, a}; // e_array[3]=8'h00, e_array[2]=c, e_array[1]=b, e_array[0]=a
    mult_array_a   <= '{'{8'h00, 8'h01, 8'h02, 8'h03},
                        '{8'h04, 8'h05, 8'h06, 8'h07},
                        '{8'h08, 8'h09, 8'h0a, 8'h0b},
                        '{8'h0c, 8'h0d, 8'h0e, 8'h0f}}; // assign to full array
    mult_array_b[3] <= '{8'h00, 8'h01, 8'h02, 8'h03}; // assign to slice of array
  end

Conclusion

This article described the two new types of SystemVerilog arrays—packed and unpacked—as well as the many new features that can be used to manipulate SystemVerilog arrays. The features described in this article are all synthesizable, so you can safely use them in SystemVerilog based RTL designs to simplify coding. In the next part of the SystemVerilog arrays article, I will discuss more usages of SystemVerilog arrays that can make your SystemVerilog design code even more efficient. Stay tuned!

Resources

Sample Source Code

The accompanying source code for this article is a toy example module and testbench that illustrates SystemVerilog array capabilities, including using an array as a port, assigning multi-dimensional arrays, and assigning slices of arrays. Download and run it to see how it works!

Verilog Arrays Plain and Simple https://www.verilogpro.com/verilog-arrays-plain-simple/ Tue, 25 Jul 2017 17:00:47 +0000

The post Verilog Arrays Plain and Simple appeared first on Verilog Pro.

Arrays are an integral part of many modern programming languages. Verilog arrays are quite simple; the Verilog-2005 standard has only 2 pages describing arrays, a stark contrast from SystemVerilog-2012 which has 20+ pages on arrays. Having a good understanding of what array features are available in plain Verilog will help understand the motivation and improvements introduced in SystemVerilog. In this article I will restrict the discussion to plain Verilog arrays, and discuss SystemVerilog arrays in an upcoming post.

Verilog Arrays

Verilog arrays can be used to group elements into multidimensional objects to be manipulated more easily. Since Verilog does not have user-defined types, we are restricted to arrays of built-in Verilog types like nets, regs, and other Verilog variable types.

Each array dimension is declared by having the min and max indices in square brackets. Array indices can be written in either direction:

array_name[least_significant_index:most_significant_index], e.g. array1[0:7]
array_name[most_significant_index:least_significant_index], e.g. array2[7:0]

Personally I prefer the array2 form for consistency, since I also write vector indices (square brackets before the array name) in [most_significant:least_significant] form. However, this is only a preference not a requirement.

A multi-dimensional array can be declared by having multiple dimensions after the array identifier. Any square brackets before the array identifier are part of the data type that is being replicated in the array.

The Verilog-2005 specification also calls a one-dimensional array with elements of type reg a memory. It is useful for modeling memory elements like read-only memory (ROM), and random access memory (RAM).

Verilog arrays are synthesizable, so you can use them in synthesizable RTL code.

reg [31:0] x[127:0];          // 128-element array of 32-bit wide reg
wire[15:0] y[  7:0], z[7:0];  // 2 arrays of 16-bit wide wires indexed from 7 to 0
reg [ 7:0] mema   [255:0];    // 256-entry memory mema of 8-bit registers
reg        arrayb [  7:0][255:0]; // two-dimensional array of one bit registers

Assigning and Copying Verilog Arrays

Verilog arrays can only be referenced one element at a time. Therefore, an array has to be copied a single element at a time. Array initialization has to happen a single element at a time. It is possible, however, to loop through array elements with a generate or similar loop construct. Elements of a memory must also be referenced one element at a time.

initial begin
  mema             = 0; // Illegal syntax - Attempt to write to entire array
  arrayb[1]        = 0; // Illegal syntax - Attempt to write to elements [1][255]...[1][0]
  arrayb[1][31:12] = 0; // Illegal syntax - Attempt to write to multiple elements
  mema[1]          = 0; // Assigns 0 to the second element of mema
  arrayb[1][0]     = 0; // Assigns 0 to the bit referenced by indices [1][0]
end

// Generate loop with arrays of wires
generate
genvar gi;
  for (gi=0; gi<8; gi=gi+1) begin : gen_array_transform
    my_example_16_bit_transform_module u_mod (
      .in  (y[gi]),
      .out (z[gi])
    );
  end
endgenerate

// For loop with arrays
integer index;
always @(posedge clk, negedge rst_n) begin
  if (!rst_n) begin
    // reset mema, one element at a time
    for (index=0; index<256; index=index+1) begin
      mema[index] <= 8'h00;
    end
  end
  else begin
    // out of reset functional code
  end
end

Conclusion

Verilog arrays are plain, simple, but quite limited. They really do not have many features beyond the basics of grouping signals together into a multidimensional structure. SystemVerilog arrays, on the other hand, are much more flexible and have a wide range of new features and uses. In the next article—SystemVerilog arrays, Synthesizable and Flexible—I will discuss the new features that have been added to SystemVerilog arrays and how to use them.

References

Sample Source Code

The accompanying source code for this article is a toy example module and testbench that illustrates SystemVerilog array capabilities, including using an array as a port, assigning multi-dimensional arrays, and assigning slices of arrays. Download and run it to see how it works!

Verilog reg, Verilog wire, SystemVerilog logic. What’s the difference? https://www.verilogpro.com/verilog-reg-verilog-wire-systemverilog-logic/ Tue, 02 May 2017 17:00:40 +0000

The post Verilog reg, Verilog wire, SystemVerilog logic. What’s the difference? appeared first on Verilog Pro.

The difference between Verilog reg and Verilog wire frequently confuses many programmers just starting with the language (certainly confused me!). As a beginner, I was told to follow these guidelines, which seemed to generally work:

  • Use Verilog reg for left hand side (LHS) of signals assigned inside always blocks
  • Use Verilog wire for LHS of signals assigned outside always blocks

Then when I adopted SystemVerilog for writing RTL designs, I was told everything can now be “type logic”. That again generally worked, but every now and then I would run into a cryptic error message about variables, nets, and assignment.

So I decided to find out exactly how these data types work and write this article. I dug into the language reference manual, searched for the now-defunct Verilog-2005 standard document, and got a bit of a history lesson. Read on for my discovery of the differences between Verilog reg, Verilog wire, and SystemVerilog logic.

Verilog data types, Verilog reg, Verilog wire

Verilog data types are divided into two main groups: nets and variables. The distinction comes from how they are intended to represent different hardware structures.

A net data type represents a physical connection between structural entities (think a plain wire), such as between gates or between modules. It does not store any value. Its value is derived from what is being driven from its driver(s). Verilog wire is probably the most common net data type, although there are many other net data types such as tri, wand, supply0.

A variable data type generally represents a piece of storage. It holds a value assigned to it until the next assignment. Verilog reg is probably the most common variable data type. Verilog reg is generally used to model hardware registers (although it can also represent combinatorial logic, like inside an always@(*) block). Other variable data types include integer, time, real, realtime.

Almost all Verilog data types are 4-state, which means they can take on 4 values:

  • 0 represents a logic zero, or a false condition
  • 1 represents a logic one, or a true condition
  • X represents an unknown logic value
  • Z represents a high-impedance state
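As a small illustration of these four values, here is a minimal simulation sketch. The module and signal names are made up for this example. It relies on two standard simulation behaviours: a reg that has never been assigned holds X, and a wire with no active driver floats at Z.

```systemverilog
// Hypothetical demo module; all names are illustrative only.
module four_state_demo;
  reg  r;                      // variable that is never assigned: stays at X
  wire w;                      // net with no driver: resolves to Z
  reg  en;
  wire t = en ? 1'b1 : 1'bz;   // tri-state style driver: 1 when enabled, Z otherwise

  initial begin
    en = 1'b0;
    #1 $display("r=%b w=%b t=%b", r, w, t);  // prints "r=x w=z t=z"
    en = 1'b1;
    #1 $display("t=%b", t);                  // prints "t=1"
  end
endmodule
```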

Verilog rule of thumb 1: use Verilog reg when you want to represent a piece of storage, and use Verilog wire when you want to represent a physical connection.

Assigning values to Verilog reg, Verilog wire

Verilog net data types can only be assigned values by continuous assignments. This means using a construct like the continuous assignment statement (assign statement), or driving the net from an output port. A continuous assignment drives a net similar to how a gate drives a net. The expression on the right hand side can be thought of as a combinatorial circuit that drives the net continuously.

Verilog variable data types can only be assigned values using procedural assignments. This means inside an always block, an initial block, a task, a function. The assignment occurs on some kind of trigger (like the posedge of a clock), after which the variable retains its value until the next assignment (at the next trigger). This makes variables ideal for modeling storage elements like flip-flops.

Verilog rule of thumb 2: drive a Verilog wire with an assign statement or a port output, and drive a Verilog reg from an always block. If you want to drive a physical connection with combinatorial logic inside an always@(*) block, then you have to declare the physical connection as Verilog reg.
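To make rule of thumb 2 concrete, here is a minimal sketch in plain Verilog-2001 style. The module, port, and signal names are made up for illustration. It shows a wire driven by an assign statement, a reg driven by combinatorial logic in an always@(*) block, and a reg that models flip-flops.

```systemverilog
// Hypothetical example module; port and signal names are illustrative only.
module reg_wire_demo (
  input  wire       clk,
  input  wire       sel,
  input  wire [7:0] a,
  input  wire [7:0] b,
  output wire [7:0] sum,      // net: driven by a continuous assignment
  output reg  [7:0] mux_out,  // variable: combinatorial logic in always @(*)
  output reg  [7:0] mux_q     // variable: flip-flops in a clocked always block
);

  // Continuous assignment drives a net
  assign sum = a + b;

  // Procedural assignment; despite its name, this "reg" is not a hardware register
  always @(*) begin
    if (sel) mux_out = a;
    else     mux_out = b;
  end

  // Procedural assignment on a clock edge; this "reg" does model flip-flops
  always @(posedge clk)
    mux_q <= mux_out;

endmodule
```

Declaring mux_out as a wire here would be rejected, because a net cannot be the target of a procedural assignment.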

SystemVerilog logic, data types, and data objects

SystemVerilog introduces new 2-state data types (where only logic 0 and logic 1 are allowed, not X or Z) for testbench modeling. To distinguish these from the old Verilog 4-state behaviour, a new SystemVerilog logic data type is added to describe a generic 4-state data type.

What used to be data types in Verilog, like wire, reg, wand, are now called data objects in SystemVerilog. Wire, reg, wand (and almost all previous Verilog data types) are 4-state data objects. Bit, byte, shortint, int, longint are the new SystemVerilog 2-state data objects.
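A quick way to see the 2-state versus 4-state distinction is to look at initial values and at what happens when an X is stored. The following is a minimal simulation sketch with made-up names: 2-state variables start at 0 and convert X or Z to 0 on assignment, while 4-state variables start at X and keep X and Z.

```systemverilog
// Hypothetical demo module; all names are illustrative only.
module two_vs_four_state_demo;
  bit   b2;   // 2-state, 1 bit: initial value 0
  logic l4;   // 4-state, 1 bit: initial value X
  int   i2;   // 2-state, 32-bit signed: initial value 0

  initial begin
    $display("b2=%b l4=%b i2=%0d", b2, l4, i2);  // prints "b2=0 l4=x i2=0"
    b2 = 1'bx;   // X cannot be stored in a 2-state variable; it becomes 0
    l4 = 1'bz;   // a 4-state variable stores Z as-is
    $display("b2=%b l4=%b", b2, l4);             // prints "b2=0 l4=z"
  end
endmodule
```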

There are still the two main groups of data objects: nets and variables. All the Verilog data types (now data objects) that we are familiar with, since they are 4-state, should now properly also contain the SystemVerilog logic keyword.

wire my_wire;                       // implicitly means "wire logic my_wire" 
wire logic my_logic_wire;           // you can also declare it this way
wire [7:0] my_wire_bus;             // implicitly means "wire logic[7:0] my_wire_bus" 
wire logic [7:0] my_wire_logic_bus; // you can also declare it this way
reg [15:0] my_reg_bus;              // implicitly means "reg logic[15:0] my_reg_bus" 
//reg logic [15:0] my_reg_bus;        // but if you declare it fully, VCS 2014.10 doesn't like it

There is a new way to declare variables, beginning with the keyword var. If the data type (2-state or 4-state) is not specified, then it is implicitly declared as logic. Below are some variable declaration examples, although some of them don’t seem to be fully supported by tools.

  // From the SV-2012 LRM Section 6.8
  var byte my_byte;    // byte is 2-state, so this is a variable
  //  var v;           // implicitly means "var logic v;", but VCS 2014.10 doesn't like this
  var logic v;         // this is okay
  //  var [15:0] vw;   // implicitly means "var logic [15:0] vw;", but VCS 2014.10 doesn't like this
  var logic [15:0] vw; // this is okay
  var enum bit {clear, error} status; // variable of enumerated type
  var reg r;                          // variable reg

Don’t worry too much about the var keyword. It was added for language precision (it’s what happens as a language evolves and language gurus strive to maintain backward-compatibility), and you’ll likely not see it in an RTL design.

I’m confused… Just tell me how I should use SystemVerilog logic!

After all that technical specification gobbledygook, I have good news if you’re using SystemVerilog for RTL design. For everyday usage in RTL design, you can pretty much forget all of that!

The SystemVerilog logic keyword used standalone will declare a variable, but the rules have been rewritten such that you can pretty much use a variable everywhere in RTL design. Hence, in my example code from other articles, you will see that I use SystemVerilog logic to declare variables and ports.

module my_systemverilog_module
(
  input  logic       clk,
  input  logic       rst_n,
  input  logic       data_in_valid,
  input  logic [7:0] data_in_bus,
  output logic       data_out_valid, // driven by always_ff, it is a variable
  output logic [7:0] data_out_bus,   // driven by always_comb, it is a variable
  output logic       data_out_err    // also a variable, driven by continuous assignment (allowed in SV)
);

  assign data_out_err = 1'b1; // continuous assignment to a variable (allowed in SV)
//  always_comb data_out_err = 1'b0; // multiple drivers to variable not allowed, get compile time error

  always_comb data_out_bus = data_in_bus; // placeholder for the data_out_bus logic expression
  always_ff @(posedge clk, negedge rst_n)
    if (!rst_n)
      data_out_valid <= 1'b0;
    else
      data_out_valid <= data_in_valid; // placeholder for the data_out_valid logic expression
  
endmodule

When you use SystemVerilog logic standalone this way, you also get improved checking for unintended multiple drivers. Multiple continuous assignments to a SystemVerilog variable, or a mix of continuous and procedural (always block) assignments, is an error, which means you will most likely see a compile time error. Multiple drivers are allowed for a net. So if you really want a multiply-driven net, you will need to declare it as a wire.
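The difference in driver checking can be sketched as follows. The module and signal names are made up for illustration, and the illegal lines are left commented out so that the example compiles. The exact error message you would see if you uncommented them depends on your tool.

```systemverilog
// Hypothetical example module; names are illustrative only.
module multi_driver_demo (
  input  logic a,
  input  logic b,
  output logic y_var,  // variable: only one continuous driver is allowed
  output wire  y_net   // net: several drivers are legal and get resolved
);

  assign y_var = a;
  // assign y_var = b;        // illegal: second continuous driver on a variable
  // always_comb y_var = b;   // illegal: mixing continuous and procedural assignment

  // Two drivers on a net are legal. Each one releases the net (drives Z)
  // when it is not selected, like a pair of tri-state buffers.
  assign y_net = a ? 1'b1 : 1'bz;
  assign y_net = b ? 1'b0 : 1'bz;

endmodule
```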

In Verilog it was legal to assign to a module output port (declared as Verilog wire or Verilog reg) from outside the module, or to assign inside the module to a net declared as an input port. Both are frequently unintended wiring mistakes that cause contention. With SystemVerilog, an output port declared as a SystemVerilog logic variable prohibits multiple drivers, and an assignment to an input port declared as a variable is also illegal. (One caveat from the LRM's port defaults: a port written as input logic is still a net unless you add the var keyword, as in input var logic, whereas output logic defaults to a variable.) So if you make this kind of wiring mistake, you will likely again get a compile time error.

SystemVerilog rule of thumb 1: if using SystemVerilog for RTL design, use SystemVerilog logic to declare:

  • All point-to-point nets. If you specifically need a multi-driver net, then use one of the traditional net types like wire
  • All variables (logic driven by always blocks)
  • All input ports
  • All output ports

If you follow this rule, you can pretty much forget about the differences between Verilog reg and Verilog wire! (well, most of the time)

Conclusion

When I first wondered why it was possible to always write RTL using SystemVerilog logic keyword, I never expected it to become a major undertaking that involved reading and interpreting two different specifications, understanding complex language rules, and figuring out their nuances. At least I can say that the recommendations are easy to remember.

I hope this article gives you a good summary of Verilog reg, Verilog wire, SystemVerilog logic, their history, and a useful set of recommendations for RTL coding. I do not claim to be a Verilog or SystemVerilog language expert, so please do correct me if you feel I misinterpreted anything in the specifications.

Sample Source Code

The accompanying source code for this article is a SystemVerilog design and testbench toy example that demonstrates the difference between using Verilog reg, Verilog wire, and SystemVerilog logic to code design modules. Download the code to see how it works!

]]>
https://www.verilogpro.com/verilog-reg-verilog-wire-systemverilog-logic/feed/ 16
SystemVerilog Struct and Union – for Designers too https://www.verilogpro.com/systemverilog-structures-unions-design/ https://www.verilogpro.com/systemverilog-structures-unions-design/#comments Tue, 17 Jan 2017 18:00:28 +0000 http://www.verilogpro.com/?p=455 SystemVerilog struct (ure) and union are very similar to their C programming counterparts, so you may already have a good idea of how they work. But have you tried using them in your RTL design? When used effectively, they can simplify your code and save a lot of typing. Recently, I tried incorporating SystemVerilog struct ... Read more

The post SystemVerilog Struct and Union – for Designers too appeared first on Verilog Pro.

]]>
SystemVerilog struct (ure) and union are very similar to their C programming counterparts, so you may already have a good idea of how they work. But have you tried using them in your RTL design? When used effectively, they can simplify your code and save a lot of typing. Recently, I tried incorporating SystemVerilog struct and union in new ways that I had not done before with surprisingly (or not surprisingly?) good effect. In this post I would like to share with you some tips on how you can also use them in your RTL design.

What is a SystemVerilog Struct (ure)?

A SystemVerilog struct is a way to group several data types. The entire group can be referenced as a whole, or the individual data type can be referenced by name. It is handy in RTL coding when you have a collection of signals you need to pass around the design together, but want to retain the readability and accessibility of each separate signal.

When used in RTL code, a packed SystemVerilog struct is the most useful. A packed struct is treated as a single vector, and each data type in the structure is represented as a bit field. The entire structure is then packed together in memory without gaps. Only packed data types and integer data types are allowed in a packed struct. Because it is defined as a vector, the entire structure can also be used as a whole with arithmetic and logical operators.

An unpacked SystemVerilog struct, on the other hand, does not define a packing of the data types. It is tool-dependent how the structure is packed in memory. An unpacked struct will probably not be synthesized by your synthesis tool, so I would avoid it in RTL code. It is, however, the default mode of a structure if the packed keyword is not used when defining the structure.

SystemVerilog struct is often defined with the typedef keyword to give the structure type a name so it can be more easily reused across multiple files. Here is an example:

typedef enum logic[15:0]
{
  ADD = 16'h0000,
  SUB = 16'h0001
} my_opcode_t;

typedef enum logic[15:0]
{
  REG = 16'h0000,
  MEM = 16'h0001
} my_dest_t;

typedef struct packed
{
  my_opcode_t  opcode; // 16-bit opcode, enumerated type
  my_dest_t    dest; // 16-bit destination, enumerated type
  logic [15:0] opA;
  logic [15:0] opB;
} my_opcode_struct_t;

my_opcode_struct_t cmd1;

initial begin
  // Access fields by name
  cmd1.opcode <= ADD;
  cmd1.dest <= REG;
  cmd1.opA <= 16'h0001;
  cmd1.opB <= 16'h0002;

  // Access fields by bit position
  cmd1[63:48] <= 16'h0000;
  cmd1[47:32] <= 16'h0000;
  cmd1[31:16] <= 16'h0003;
  cmd1[15: 0] <= 16'h0004;

  // Assign fields at once
  cmd1 <= '{SUB, REG, 16'h0005, 16'h0006};
end
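Because a packed struct is treated as a single vector, it can also be measured, copied, and operated on as a whole. The sketch below uses a smaller, made-up two-field struct (not the opcode struct above) to show $bits, whole-struct assignment to a plain vector, and arithmetic on the struct. The first member declared occupies the most significant bits.

```systemverilog
// Hypothetical demo module; the type and signal names are illustrative only.
module packed_struct_vector_demo;
  typedef struct packed {
    logic [7:0] hi;  // first member: occupies bits [15:8]
    logic [7:0] lo;  // last member: occupies bits [7:0]
  } pair_t;

  pair_t       p;
  logic [15:0] raw;

  initial begin
    p   = '{hi: 8'h12, lo: 8'h34};
    raw = p;                                          // whole struct used as a 16-bit vector
    $display("bits=%0d raw=%h", $bits(pair_t), raw);  // prints "bits=16 raw=1234"

    p = p + 16'h0001;                                 // arithmetic on the struct as a whole
    $display("hi=%h lo=%h", p.hi, p.lo);              // prints "hi=12 lo=35"
  end
endmodule
```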

What is a SystemVerilog Union?

A SystemVerilog union allows a single piece of storage to be represented in different ways using different named member types. Because there is only a single storage, only one of the data types can be used at a time. Unions can also be packed and unpacked similarly to structures. Only packed data types and integer data types can be used in a packed union. All members of a packed (and untagged, which I’ll get to later) union must be the same size. Like packed structures, a packed union can be used as a whole with arithmetic and logical operators, and bit fields can be extracted like from a packed array.

A tagged union is a type-checked union. That means you can no longer write to the union using one member type, and read it back using another. Tagged union enforces type checking by inserting additional bits into the union to store how the union was initially accessed. Due to the added bits, and inability to freely refer to the same storage using different union members, I think this makes it less useful in RTL coding.

Take a look at the following example, where I expand the earlier SystemVerilog struct into a union to provide a different way to access that same piece of data.

typedef union packed
{
  my_opcode_struct_t opcode_s; // "fields view" to the struct
  logic[1:0][31:0] dword; // "dword view" to the struct
} my_opcode_union_t;

my_opcode_union_t cmd2;

initial begin
  // Access opcode_s struct fields within the union
  cmd2.opcode_s.opcode = ADD;
  cmd2.opcode_s.dest = REG;
  cmd2.opcode_s.opA = 16'h0001;
  cmd2.opcode_s.opB = 16'h0002;

  // Access the "dword view" of the same storage within the union
  cmd2.dword[1] = 32'h0001_0001; // opcode=SUB, dest=MEM
  cmd2.dword[0] = 32'h0007_0008; // opA=7, opB=8
end
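The point of a packed union is that a write through one member is visible through the other. The following self-contained sketch redefines simplified versions of the types (plain 16-bit fields instead of the enumerated types above, with made-up names), writes through the fields view, and reads the result back through the dword view.

```systemverilog
// Hypothetical demo module; the type and signal names are illustrative only.
module union_view_demo;
  typedef struct packed {
    logic [15:0] opcode;  // bits [63:48]
    logic [15:0] dest;    // bits [47:32]
    logic [15:0] opA;     // bits [31:16]
    logic [15:0] opB;     // bits [15:0]
  } cmd_fields_t;

  typedef union packed {
    cmd_fields_t      fields;  // "fields view", 64 bits
    logic [1:0][31:0] dword;   // "dword view", also 64 bits
  } cmd_union_t;

  cmd_union_t cmd;

  initial begin
    // Write through the fields view
    cmd.fields.opcode = 16'h0001;
    cmd.fields.dest   = 16'h0000;
    cmd.fields.opA    = 16'h0007;
    cmd.fields.opB    = 16'h0008;

    // Read the same storage back through the dword view
    $display("dword[1]=%h dword[0]=%h", cmd.dword[1], cmd.dword[0]);
    // prints "dword[1]=00010000 dword[0]=00070008"
  end
endmodule
```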

Ways to Use SystemVerilog Struct in a Design

There are many ways to incorporate SystemVerilog struct into your RTL code. Here are some common usages.

Encapsulate Fields of a Complex Type

One of the simplest uses of a structure is to encapsulate signals that are commonly used together into a single unit that can be passed around the design more easily, like the opcode structure example above. It both simplifies the RTL code and makes it more readable. Simulators like Synopsys VCS will display the fields of a structure separately on a waveform, making the structure easily readable.

If you need to use the same structure in multiple modules, a tip is to put the definition of the structure (defined using typedef) into a SystemVerilog package, then import the package into each RTL module that requires the definition. This way you will only need to define the structure once.
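A minimal sketch of this tip might look like the following. The package name, type name, and module are assumptions made up for illustration, not taken from the article's source code. The typedef lives in the package, and the module imports it in its header so the type can be used in the port list.

```systemverilog
// Hypothetical package holding shared type definitions
package my_types_pkg;
  typedef struct packed {
    logic        valid;
    logic [15:0] data;
  } my_bus_t;
endpackage

// Hypothetical module that uses the shared struct type
module counter_source
  import my_types_pkg::*;   // makes my_bus_t visible in the port list and body
(
  input  logic    clk,
  input  logic    rst_n,
  output my_bus_t bus_out
);

  always_ff @(posedge clk, negedge rst_n)
    if (!rst_n) begin
      bus_out <= '0;                         // clear every field of the struct
    end else begin
      bus_out.valid <= 1'b1;
      bus_out.data  <= bus_out.data + 16'd1; // simple incrementing payload
    end

endmodule
```

The package has to be compiled before any module that imports it.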

SystemVerilog Struct as a Module Port

A module port can have a SystemVerilog struct type, which makes it easy to pass the same bundle of signals into and out of multiple modules, and keep the same encapsulation throughout a design. For example a wide command bus between two modules with multiple fields can be grouped into a structure to simplify the RTL code, and to avoid having to manually decode the bits of the command bus when viewing it on a waveform (a major frustration!).
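Here is a minimal sketch of a struct-typed port connecting two modules. All names are hypothetical. Instead of a wildcard import, this version refers to the type with the package scope operator, which is another common style.

```systemverilog
// Hypothetical package and modules; all names are illustrative only.
package cmd_pkg;
  typedef struct packed {
    logic        valid;
    logic [3:0]  opcode;
    logic [15:0] operand;
  } cmd_t;
endpackage

module cmd_source (
  input  logic          clk,
  input  logic          rst_n,
  output cmd_pkg::cmd_t cmd
);
  always_ff @(posedge clk, negedge rst_n)
    if (!rst_n) cmd <= '0;
    else        cmd <= '{valid: 1'b1, opcode: 4'h2, operand: 16'h00FF};
endmodule

module cmd_sink (
  input  cmd_pkg::cmd_t cmd,
  output logic          is_op2
);
  // Fields are read by name; no manual bit slicing of a wide bus
  assign is_op2 = cmd.valid && (cmd.opcode == 4'h2);
endmodule

module cmd_top (
  input  logic clk,
  input  logic rst_n,
  output logic is_op2
);
  cmd_pkg::cmd_t cmd;       // one signal carries the whole bundle

  cmd_source u_src  (.*);
  cmd_sink   u_sink (.*);
endmodule
```

If a field is later added to cmd_t, only the package and the logic that uses the new field need to change; the port lists and the connection in cmd_top stay the same.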

Using SystemVerilog Struct with Parameterized Data Type

A structure can be used effectively with modules that support parameterized data type. For example if a FIFO module supports parameterized data type, the entire structure can be passed into the FIFO with no further modification to the FIFO code.

module simple_fifo
#(
  parameter type DTYPE = logic[7:0],
  parameter      DEPTH = 4
)
(
  input  logic                      clk,
  input  logic                      rst_n,
  input  logic                      push,
  input  logic                      pop,
  input  DTYPE                      data_in,
  output logic[$clog2(DEPTH+1)-1:0] count,
  output DTYPE                      data_out
);
  // rest of FIFO design
endmodule

module testbench;
  parameter MY_DEPTH = 4;

  logic clk, rst_n, push, pop, full, empty;
  logic [$clog2(MY_DEPTH+1)-1:0] count;
  my_opcode_struct_t data_in, data_out;

  simple_fifo
  #(
    .DTYPE (my_opcode_struct_t),
    .DEPTH (MY_DEPTH)
  )
  my_simple_fifo (.*);
endmodule
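The FIFO body is elided above ("rest of FIFO design"). Purely as an illustration of why no FIFO code needs to change when DTYPE is a struct, here is one way such a body could be sketched. This is my own assumption of a simple pointer-based design under a different module name, not the actual FIFO from the article. Pushes when full and pops when empty are simply ignored.

```systemverilog
// Hypothetical sketch; not the article's actual FIFO implementation.
module simple_fifo_sketch
#(
  parameter type DTYPE = logic [7:0],
  parameter int  DEPTH = 4
)
(
  input  logic                       clk,
  input  logic                       rst_n,
  input  logic                       push,
  input  logic                       pop,
  input  DTYPE                       data_in,
  output logic [$clog2(DEPTH+1)-1:0] count,
  output DTYPE                       data_out
);
  localparam int PTR_W = (DEPTH > 1) ? $clog2(DEPTH) : 1;

  DTYPE             mem [DEPTH];      // storage is typed by DTYPE, whatever it is
  logic [PTR_W-1:0] wr_ptr, rd_ptr;
  logic             do_push, do_pop;

  assign do_push  = push && (count != DEPTH);  // ignore push when full
  assign do_pop   = pop  && (count != 0);      // ignore pop when empty
  assign data_out = mem[rd_ptr];               // oldest entry is always visible

  // Storage write (no reset needed for the memory contents)
  always_ff @(posedge clk)
    if (do_push) mem[wr_ptr] <= data_in;

  // Pointers and occupancy counter
  always_ff @(posedge clk, negedge rst_n)
    if (!rst_n) begin
      wr_ptr <= '0;
      rd_ptr <= '0;
      count  <= '0;
    end else begin
      if (do_push) wr_ptr <= (wr_ptr == DEPTH-1) ? '0 : wr_ptr + 1'b1;
      if (do_pop)  rd_ptr <= (rd_ptr == DEPTH-1) ? '0 : rd_ptr + 1'b1;
      count <= count + do_push - do_pop;       // unchanged on simultaneous push and pop
    end
endmodule
```

Nothing in the body refers to the fields or the width of DTYPE, which is why a struct can be passed in without touching the FIFO code.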

Ways to Use SystemVerilog Union in a Design

Until very recently, I had not found a useful way to use a SystemVerilog union in RTL code. But I finally did in my last project! The best way to think about a SystemVerilog union is that it can give you alternative views of a common data structure. The packed union opcode example above has a “fields view” and a “dword view”, which can be referred to in different parts of a design depending on which is more convenient. For example, if the opcode needs to be buffered in a 64-bit buffer made up of two 32-bit wide memories, then you can assign one dword from the “dword view” as the input to each memory, like this:

my_opcode_union_t my_opcode_in, my_opcode_out;

// Toy code to assign some values into the union
always_comb begin
  my_opcode_in.opcode_s.opcode = ADD;
  my_opcode_in.opcode_s.dest   = REG;
  my_opcode_in.opcode_s.opA    = 16'h0001;
  my_opcode_in.opcode_s.opB    = 16'h0002;
end

// Use the "dword view" of the union in a generate loop
generate
  genvar gi;
  for (gi=0; gi<2; gi=gi+1) begin : gen_mem
    // instantiate a 32-bit memory
    mem_32 u_mem
    (
      .D (my_opcode_in.dword[gi]),
      .Q (my_opcode_out.dword[gi]),
      .*
    );
  end // gen_mem
endgenerate

In my last project, I used a union this way to store a wide SystemVerilog struct into multiple 39-bit memories in parallel (32-bit data plus 7-bit SECDED encoding). The memories were divided this way so that each 32-bit dword is individually protected by SECDED encoding and is therefore individually accessible by a CPU. I used a “dword view” of the union in a generate loop to feed the data into the SECDED encoders and memories. It eliminated a lot of copying and pasting, and made the code much more concise!

Conclusion

SystemVerilog struct and union are handy constructs that can encapsulate data types and simplify your RTL code. They are most effective when the structure or union types can be used throughout a design, including as module ports, and with modules that support parameterized data types.

Do you have another novel way of using SystemVerilog struct and union? Leave a comment below!

]]>
https://www.verilogpro.com/systemverilog-structures-unions-design/feed/ 4
Clock Domain Crossing Design – Part 3 https://www.verilogpro.com/clock-domain-crossing-design-part-3/ https://www.verilogpro.com/clock-domain-crossing-design-part-3/#comments Tue, 17 May 2016 17:00:20 +0000 http://www.verilogpro.com/?p=328 In Clock Domain Crossing (CDC) Design – Part 2, I discussed potential problems with passing multiple signals across a clock domain, and one effective and safe way to do so. That circuit, however, does not handle the case when the destination side logic cannot accept data and needs to back-pressure the source side. The two ... Read more

The post Clock Domain Crossing Design – Part 3 appeared first on Verilog Pro.

]]>