VHDL Coding Tips For Excellent Design

Here are the Ten Commandments of Excellent Design:
  1. All state machine outputs shall always be registered
  2. Thou shalt use registers, never latches
  3. Thy state machine inputs, including resets, shall be synchronous
  4. Beware fast paths lest they bite thine ankles
  5. Minimize skew of thine clocks
  6. Cross clock domains with the greatest of caution. Synchronize thy signals!
  7. Have no dead states in thy state machines
  8. Have no logic with unbroken asynchronous feedback lest the fleas of myriad Test Engineers infest thee
  9. All decode logic must be crafted carefully—eschew asynchronicity
  10. Trust not thy simulator—it may beguile thee when thy design is junk
How to Write Ten-Commandment Code

Conforming to the Ten Commandments is not difficult. In this section you’ll see how to write VHDL (your author doesn’t do Verilog, but the translation is easy) that complies with the rules. Robust design and first-silicon success are the goals!

The philosophy behind Ten-Commandment code is that synthesizers are not to be trusted too much. Most of the code you will see is close to the structural level; some more overtly than others. Most of the code is self-explanatory. It is assumed that the reader is familiar with VHDL. Signal names are also obvious to anyone “skilled in the art.”

How to Create a Flip-Flop
One of the basic primitives that we need to create robust synchronous designs is the D-type flip-flop. Look at the code in Code Sample 1: A D-type Flip Flop

-- VHDL Code for a D-Type Flip-Flop with an
-- Asynchronous Clear
D_Type_Flip_Flop: process(Reset_n, Clock_In)
  begin
    if (Reset_n = '0') then
      Q_Output <= '0' after 1 ns;
    elsif (Clock_In'event and Clock_In = '1') then
      Q_Output <= D_Input after 1 ns;
    end if;
end process D_Type_Flip_Flop;

This flip-flop has the following properties:
  • An asynchronous active-low clear input sets the Q output to zero.
  • It is triggered on the rising edge of the clock.

How to Create a Latch
While the Ten Commandments specifically forbid the use of latches, there are still those heretics who will insist on the use of latches. The code to instantiate a transparent latch is shown in Code Sample 2: A Transparent Latch

-- VHDL Code for a Transparent Latch
Latch_Data: process(Latch_Open, D_Input)
  begin
    if (Latch_Open = '1') then
      Latched_Data <= D_Input;
    -- If Latch_Open = 0, then Latched_Data keeps its old value,
    -- i.e. the latch is closed.
    end if;
end process Latch_Data;

This latch has the following properties:
  • A latch control that opens the latch when high (the latch is transparent).
How to Create a Metastable-Hardened Flip-Flop

Using a metastable-hardened flip-flop requires nothing more than the direct instantiation of a suitable library element; in this case, a “dfntns” flip-flop. This is pure structural VHDL. The component declaration is shown in Code Sample 3: A Metastable-Hardened Flip-Flop, Component Declaration

-- VHDL Code for a Nice Metastable-Hardened Flip-Flop
component dfntns
  Port (
    CP : In std_logic;
    D : In std_logic;
    Q : Out std_logic
  );
end component;

To use the flip-flop in your circuit, instantiate it as shown in Code Sample 4: A Metastable-Hardened Flip-Flop, Instantiation

-- VHDL Code to Instantiate the Metastable-Hardened Flip-Flop
Metastable_Hardened_Flip_Flop_Please: dfntns port map (
  D => D_Input,
  CP => Clock_In,
  Q => Q_Output
  );

This flip-flop has the following properties:
  • A specified maximum clock-to-out time, even under worst-case setup and hold time violations. This time is available in the library element specifications.
The Care and Feeding of Toggle Signals

Receiving a Toggle Signal
The Ten Commandments paper suggested that one method for exchanging single-point information across clock domains is the use of toggle signals. Here, it is assumed that the toggle event should generate an active-high pulse to pass to a state machine. Every toggle (rising edge and falling edge) must create the pulse. In addition, the pulse must be synchronized correctly to the receiver’s clock. The code to accomplish this is shown in Code Sample 5: Receiving a Toggle Signal

-- VHDL Code to Create a Pulse from an Asynchronous
-- Toggle Signal
-- First, use a metastable-hardened flip-flop to synchronize the
-- toggle input
Metastable_Hardened_Flip_Flop_Please: dfntns port map (
  D => Handshake_T,
  CP => Clock_In,
  Q => Sync_Handshake_T
  );

-- Now pass the synchronized toggle through another flip-flop
Toggle_Reg_Proc: process(Clock_In)
  begin
    if (Clock_In'event and Clock_In = '1') then
      Reg_Handshake_T <= Sync_Handshake_T after 1 ns;
    end if;
  end process Toggle_Reg_Proc;

-- Finally XOR the two synchronized signals to create a pulse
Toggle_Pulse <= Reg_Handshake_T xor Sync_Handshake_T;

When synthesizing this code, remember to use the “fix hold” option so a fast path doesn't occur between the two flip-flops in this circuit.

Generating a Toggle Signal
Recall that a toggle signal is generated by simply inverting a level to pass the information. The trivial code to do this is shown in Code Sample 6:  Generating a Toggle Signal

The suffix “_T” is used to denote a toggle signal.

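-- VHDL Code to Create a Toggle Signal
-- (Place this assignment inside a clocked process, executed once per event;
-- as a free-running concurrent statement it would simply oscillate.)
Handshake_T <= not (Handshake_T) after 1 ns;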

Coding State Machines
The creation of state machines is a mixture of art and science. A well-crafted state machine will possess a sense of elegance; it will be appealing, both functionally and visually.
Here, a very simple example is presented as an illustration of state machine design. The state diagram for this example, popularly known as the Flintstones State Machine, is shown in the figure below.


The Flintstones State Machine operates as follows:
  1. The State Machine has two states, State Bed and State Rock.
  2. There is one output, Fred, which takes the value 0 in State Bed and 1 in State Rock.
  3. A reset, caused by a low level on Reset_n, puts the State Machine into State Bed.
  4. The State Machine waits in State Bed while Barney is low, and enters State Rock when Barney goes high.
  5. The State Machine then waits in State Rock while Wilma is low, and returns to State Bed when Wilma goes high.
Implementing the Flintstones State Machine

An example implementation of the Flintstones State Machine is shown in Code Sample 7 below:

-- VHDL Code to Implement the Flintstones State Machine

Flintstones_SM_Proc: process(Sync_Reset_n, Clock_In)
  -- Enumerate state types:
  type Flintstones_Statetype is (
    Bed, Rock
  );

  -- Define the state variable:
  variable Flint_State: Flintstones_Statetype;

  -- Here's the state machine:
  begin
    -- Define the asynchronously set reset states...
    if (Sync_Reset_n = '0') then
      Fred <= '0' after 1 ns;
      Flint_State := Bed;
    -- Default conditions for each output, in this case identical to the
    -- reset state:
    elsif (Clock_In'event and Clock_In = '1') then
      Fred <= '0' after 1 ns;

      -- Here are the state transitions:
      case Flint_State is
        when Bed =>
          -- Transition from Bed to Rock:
          if (Barney = '1') then
            Fred <= '1' after 1 ns;
            Flint_State := Rock;
          -- Holding term in Bed:
          else
            Flint_State := Bed;
          end if;
        when Rock =>
          -- Transition from Rock to Bed:
          if (Wilma = '1') then
            Fred <= '0' after 1 ns;
            Flint_State := Bed;
          -- Holding term in Rock:
          else
            Fred <= '1' after 1 ns;
            Flint_State := Rock;
          end if;
        -- Default term for dead states:
        when others =>
          Flint_State := Bed;
      end case;
    end if;
end process Flintstones_SM_Proc;


Notes on the State Machine Implementation
For the most part, the Flintstones State Machine’s operation should be clear. A few points are worth noting, however:
  1. The reset signal (Sync_Reset_n) is synchronized with Clock_In before being sent to the State Machine.
  2. Barney and Wilma must also be synchronous to Clock_In; at the very least, there must be an assurance that the setup and hold times of the State Machine’s state and output registers are not violated.
  3. This design assigns a default value to each output before entering the case statement. This ensures that only those signals that are not taking default (usually inactive) values need be listed in the case statement. This is optional; it is entirely reasonable to list every signal under each transition term, including inactive signals.
  4. Note that the output signal Fred comes directly from a D-type flip-flop: it is not a decode of the state variable. This ensures Fred’s cleanliness (so to speak).
  5. The “when others” in the case statement handles the possibility that the State Machine might end up in a dead state.
The code examples in this tutorial should be considered as examples only. There are many ways to code excellent VHDL; this code is a place to start. If you have a neat snippet of VHDL to add to the list, then please post your valuable comments below.

Calculations using command-line

If you do most of your work from the command line (using vim to edit files, mutt for e-mail, and cd/ls/mv/find/etc. instead of a file manager), you may get annoyed by having to fire up a GUI calculator to make what may sometimes be a single calculation.

One useful feature of calculating on the command line is that you can see what you've typed. For instance, when I'm entering a long, complex calculation on a calculator (either the GUI or the solid, hold-in-your-hand type), I sometimes forget whether I've actually typed in all those numbers or made the calculations in the right order. Maybe it's just me ...

This article shows how to quickly perform standard calculations on the command line including addition, subtraction, multiplication, division, square root, powers, conversion from decimal to hex, decimal to binary, hex to decimal, and binary to decimal. It also briefly introduces using bc in interactive mode and how to write files for use with bc for frequently repeated operations. There is a mention of using Google for performing calculations. It finishes with a little challenge to test the power of your CPU.

Other advantages of using bc include:

  • bc is included with (almost?) all Linux distributions as standard, as well as (again, almost?) all Unix.
  • Results from calculations in some proprietary flavours of bc have up to 99 decimal digits before and after the decimal point. This limit has been greatly surpassed in GNU bc. I don't know what that limit is, but it's at least many, many tens of thousands. Certainly it's more than any GUI-based calculators (I've used) could accommodate.
  • You may also find yourself working in an environment where you simply don't have access to a GUI.
  • The syntax for basic sums is almost identical to Google's calculator function, so you can learn how to use two utilities in one go!

Traditionally, bc was implemented as a pre-processor for dc, although GNU bc is a standalone program. The useful thing about bc is that it accepts input from files and from standard input. This allows us to pipe data to it for quick calculations.

  • addition
  • subtraction
  • multiplication
  • scale
  • division
  • square root
  • power
  • parentheses
  • obase and ibase
  • convert from decimal to hexadecimal
  • convert from decimal to binary
  • convert from binary to decimal
  • convert from hexadecimal to decimal
  • a brief introduction to interactive mode
  • using bc with shell scripts
  • a brief introduction to using bc with files
  • a quick challenge for your PC (GNU bc only)

Most of these examples follow a simple formula.

addition

$ echo '57+43' | bc
100

subtraction

$ echo '57-43' | bc
14

multiplication

$ echo '57*43' | bc
2451

scale
The scale variable determines the number of digits which follow the decimal point in your result. By default, the value of the scale variable is zero (unless you use the -l option, in which case it defaults to 20 decimal places; more about -l later). It can be set by declaring scale before your calculation, as in the following division example:

division

$ echo 'scale=25;57/43' | bc
1.3255813953488372093023255

square root

$ echo 'scale=30;sqrt(2)' | bc
1.414213562373095048801688724209
This beats Google's calculator function which only calculates the result to 8 decimal places! ;-) Although Google's calculator function has this 8 decimal places limitation, it will allow imaginary numbers as answers.

power

$ echo '6^6' | bc
46656

parentheses
If you have read Robert Heinlein's The Number of the Beast, you may recall that the number of parallel universes in the story equals (six to the power of six) to the power of six. If you should try to calculate that like this:

$ echo '6^6^6' | bc
You will get a screen full of numbers (more than 36,000 digits), not the
10314424798490535546171949056

that you might expect.

If you're running a non-GNU version of bc, you'll most likely get something like:

exp too big
empty stack
save:args

The Google Calculator balks at '6^6^6' as well. Good ol' GNU.

You got the unexpected result because you typed the wrong question. You need to type:

$ echo '(6^6)^6' | bc

Whereas what you did type was interpreted as:

$ echo '6^(6^6)' | bc

which is an entirely different number altogether. So the positioning of parentheses (brackets to you and me!) is very important. I use brackets to separate the different components of my sums whenever possible, just to eliminate any possible doubt that I may get the wrong answer. Consider the following calculations:

$ echo '7+(6*5)' | bc

$ echo '7+6*5' | bc

$ echo '6*5+7' | bc

They all give the same answer, 37, but I would have typed the first calculation, unless of course, I meant:

$ echo '(7+6)*5' | bc

Or to put it another way:

$ echo '13*5' | bc

which is 65.

obase and ibase

obase and ibase are special variables which define output and input base.

Legitimate obase values range from 2 to 999, although anything beyond 16 is wasted on me!

Legitimate ibase values range from 2 to 16.

Some examples will explain all this better.

convert from decimal to hexadecimal

Here we're converting 255 from base 10 to base 16:

$ echo 'obase=16;255' | bc
FF

convert from decimal to binary

And here we're converting the number 12 from base 10 to base 2:

$ echo 'obase=2;12' | bc
1100

Which reminds me of the old joke:

There are only 10 types of people in the world -- those who understand binary, and those who don't.

Which leads us neatly onto the next example:

convert from binary to decimal

Here we're converting the binary number 10 to a base 10 (decimal) number.

$ echo 'ibase=2;obase=A;10' | bc
2

Note that the obase is "A" and not "10". Sorry, you've got to learn some hex. The reason for this is that you've set the ibase to "2", so if you had then tried to use "10" as the value for the obase, it would stay as "2", because "10" in base 2 is "2". So you need to use hex to "break out" of binary mode.

Well, that was just to explain the joke; now something a bit more challenging:

$ echo 'ibase=2;obase=A;10000001' | bc
129

convert from hexadecimal to decimal

$ echo 'ibase=16;obase=A;FF' | bc
255

Again, note the use of "A" to denote base 10. That is because "10" in hex (base 16 - the ibase value) is 16.

a brief introduction to interactive mode
You can also run bc in interactive mode:

$ bc

If you're running GNU bc, you should get the following notice:

bc 1.06
Copyright 1991-1994, 1997, 1998, 2000 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
Followed by an uninviting blank prompt. Otherwise you'll just get an uninviting blank prompt.
If you wish to get straight to the uninviting blank prompt, use the -q option, which runs bc in quiet mode, preventing the normal GNU bc welcome from being printed:
$ bc -q

Using the basics we've been through from the examples above, enter a calculation:

scale=5
57/43
1.32558


Type quit to exit bc interactive mode.


using bc with shell scripts
You can use shell variables with bc, which is very useful in shell scripts:


$ FIVE=5 ; echo "$FIVE^2" | bc
25


Note the use of double quotes, which let the shell expand $FIVE to its value before bc sees the expression. Inside single quotes, bc would receive the literal text $FIVE.

a brief introduction to using bc with files

Using bc with files allows complex calculations to be repeated, again and again, a bit like using a spreadsheet to run the same calculations on changing figures ... but faster.


Here is a simple example:

scale=2

/* C-style comments
are allowed, as are spaces */

print "\nConvert Fahrenheit to Celsius\n\n"
print "Temperature in Fahrenheit: " ; fah = read()
print "\n"
print "Equivalent Temperature in Celsius is: "
(fah - 32.0) * 5.0 / 9.0
quit

Create and save the file, then run it like this:

$ bc -q filename
Convert Fahrenheit to Celsius
Temperature in Fahrenheit: 61
Equivalent Temperature in Celsius is: 16.11


Note that this example has only been tested with GNU bc. Other (proprietary) versions of bc may have more stringent syntax requirements. Some bcs don't allow the use of print or read, for example, so you have to edit your file before each calculation. Not very useful.


a quick challenge for your PC (GNU bc only)
If you wish to test the comparative speed of your PC, try this challenge: use bc to calculate Pi to 5000 decimal places. The idea for this challenge came from a great article at Geekronomicon.
If you really want to tie up your machine for an hour (or more), you could try the "Pi to 25000 decimal places" challenge from the aforementioned Geekronomicon.

First, to put things in perspective, here is some information about my CPU:

$ cat /proc/cpuinfo | egrep "model name|MHz"
model name      : AMD Athlon(tm) 64 Processor 3500+
cpu MHz         : 2211.346


Note the use (below) of the command bc -l -q.
-l loads the math library, which is required for the use of the "a" (arctangent) function in the calculation for Pi. You can learn more about the math library functions in the bc command manual.
I'm not sure what effect the -q option (quiet, no welcome message printed) has on our test, but I guess it can't harm.

$ time echo "scale=5000; 4*a(1)" | bc -l -q
3.141592653589793238462643383279502884197169399375105820974944592307\
...
...
...
73774418426312986080998886874132604720


real    0m44.164s
user    0m44.099s
sys     0m0.008s
44.099 seconds! Not bad. :-) I imagine that some Gentoo folks may be interested to see what difference their compile-time optimisations make to the speed of bc. FWIW, my distro of choice is Arch Linux.
