Hints and HELP with VHDL and Programmable Logic

**************************************NOTE*************************************

    This document is by no means held to be the word of God. We cannot guarantee that the information below will work in all circumstances.  The actual use of the different CAD tools is not discussed in as much detail as others have done; we felt there was already enough exceptional documentation available to the student on the nitty-gritty use of the tools.  Rather, this document focuses on design methods and practices which should save future students some of the time this class lost to the problems described here.
    We assume the reader has a solid understanding of VHDL and exposure to Mentor Graphics, Actmap and Designer.

********************************************************************************

1) Compiler Implementation Differences:

    Similar to the differences between Borland and Microsoft C++ compilers, Mentor Graphics and Actmap differ in what they produce and how they produce it.  To begin with, Mentor Graphics produces behaviorally functional compilations while Actmap produces physically implemented compilations.  While this may seem to be common sense, one should understand its implications before starting a design.

    Most groups, ours included, began the project by creating all their VHDL code under Mentor Graphics and simulating it directly in quicksim. Once the simulations showed that our design operated as we desired, we moved to Actmap.

    The first problem is that Actmap is sometimes unable to produce a realization for some of the statements you may have used in the design. This is not necessarily due to any specific statement, but rather to how you use the statement. Similarly, Mentor may not be able to decide what to do with statements which compiled without error in Actmap. In the small experience we had, the usual solution was to re-code the block so that the process was implemented in a slightly different way: latching signals, changing your functional criteria (e.g. if <criteria> then), or some other similar tactic. This problem can also happen without warning. We had a couple of cases where Actmap did not implement one of our functions but reported no errors.

    Secondly, when Actmap creates an EDN file from your VHDL code, the actual functionality of the block may be altered as well. Mentor analyzes what the code is supposed to do, then processes these functions when you simulate it. This operation takes no account of physical constraints such as gates, delay, etc. When Actmap compiles code it starts with a description of operation similar to what Mentor has, then implements it with actual modules as defined in the ACT1 libraries. Along with these physical implementations comes some basic delay information, as well as the possibility that some function may be altered due to physical constraints. Let's say you were adding 2 numbers of 8 or more bits each. The original simulator would say, "OK, we have to add these two numbers and I am told to do it on the rising edge of clock." But when Actmap creates it, the adder may require more than 2 modules and may even be more than one level deep. In this case it may take 2 or more clock cycles to produce an answer, and waiting for this extra time may produce results totally unlike your earlier simulations.

The best way to get around this problem? Well, I'm not sure this is the best way, but it is reliable.

 

2) Modular vs. Flat:

    If you are the kind of person who takes delight in staring at a computer screen for about 2 weeks straight trying to determine why signal X is suddenly unknown, then flat VHDL design methods are just for you. If, however, you would rather be able to find and fix a problem rapidly, then modular VHDL methods are definitely the way to go. Flat VHDL design involves few files, with a great many processes and therefore a great deal of code in each file. Modular VHDL design involves several different files, each with as small a function as reasonably possible. These files are hierarchically organized, with the smaller blocks being used as components and forming the basis of the function of the higher blocks. For example, to implement an alarm clock you could use 3 main modules: the real-time clock, the alarm-time storage registers, and the alarm interface. The real-time clock could in turn be subdivided into several counters to count seconds, minutes and so on.

    Implementing this code in a modular style allows you to alter the operation of one piece without altering the operation of all the others. So if the clock is fast, you could alter only the portion that counts seconds to slow it down. If the design were flat, you might have had a single process which kept track of time, and altering the seconds portion might affect how the minutes portion works.
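    As a rough illustration of the modular style (not code from our project), here is a minimal sketch of the real-time clock built from a small reusable counter. The names mod_counter, real_time_clock and tick_1hz are made up for this example, and we assume a tick_1hz input that pulses for one clock period once per second. It uses ieee.numeric_std and VHDL-93 direct entity instantiation; if your tools want component declarations instead, the structure stays the same.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical leaf block: counts 0 .. MAX_COUNT, then wraps to 0.
-- The 6-bit count limits MAX_COUNT to 63.
entity mod_counter is
    generic (MAX_COUNT : natural := 59);
    port (
        clk    : in  std_logic;
        reset  : in  std_logic;              -- synchronous, active high
        enable : in  std_logic;
        count  : out unsigned(5 downto 0);
        carry  : out std_logic               -- high when the next enabled edge wraps
    );
end entity mod_counter;

architecture rtl of mod_counter is
    signal count_reg : unsigned(5 downto 0) := (others => '0');
begin
    process (clk) is
    begin
        if rising_edge(clk) then
            if reset = '1' then
                count_reg <= (others => '0');
            elsif enable = '1' then
                if count_reg = MAX_COUNT then
                    count_reg <= (others => '0');
                else
                    count_reg <= count_reg + 1;
                end if;
            end if;
        end if;
    end process;

    count <= count_reg;
    carry <= '1' when (enable = '1' and count_reg = MAX_COUNT) else '0';
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical higher block: the real-time clock is just two counters wired together.
entity real_time_clock is
    port (
        clk, reset       : in  std_logic;
        tick_1hz         : in  std_logic;    -- assumed one-clock-wide pulse each second
        seconds, minutes : out unsigned(5 downto 0)
    );
end entity real_time_clock;

architecture structural of real_time_clock is
    signal sec_carry : std_logic;
begin
    sec_counter : entity work.mod_counter
        generic map (MAX_COUNT => 59)
        port map (clk => clk, reset => reset, enable => tick_1hz,
                  count => seconds, carry => sec_carry);

    -- The minutes counter advances only on the edge where seconds wraps.
    min_counter : entity work.mod_counter
        generic map (MAX_COUNT => 59)
        port map (clk => clk, reset => reset, enable => sec_carry,
                  count => minutes, carry => open);
end architecture structural;
```

    If the clock runs fast, only whatever generates tick_1hz (or the seconds counter) needs to change; the minutes counter is untouched because all it ever sees is the carry signal.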

3) To Optimize or not to Optimize:

    The function of the optimizer is to parse and alter your netlist files to optimize speed, area or fanout while leaving the functionality of the system unaltered. We did not find this to be the case with this version of Actmap's Netlist Optimizer. As an example, we compiled the top level of our design with no problem, simulated the structural code, and it behaved perfectly. When we went to fit it to the FPGA for back annotation it was a couple of modules too large, so logically we optimized it for area, got our timing info and simulated it. When we looked at the simulation we thought we had large timing problems, because it was totally different from what we had before. After several attempts to rectify the problem we seemed to be getting nowhere. Finally I changed the way one of our LUTs worked (I removed some values), which shrank the system enough to fit on the FPGA without optimization. The change I made did not affect operational conditions at all. When we back-annotated this new un-optimized code it worked fine. As an experiment I optimized this working code and back-annotated again; as before, the simulation was completely different.

    I would suggest that people not use the optimizer unless it is actually necessary for their project. I feel this is a problem that should be brought to Actel for resolution.

4) Reducing Module Counts:

    For our project in particular, module counts were a big issue. We found three methods which were very useful for lowering the number of modules required by some function. They are:

  1. Removing open If statements. A statement similar to the following produces very large module counts.
    if (Temp_in = "10001000") then
        Display_out <= "1111";
    else
        Display_out <= "0000";
    end if;

    The reason it uses a lot of modules is that the compiler interprets this as 256 different if statements, each covering one possible value of Temp_in. You may get some reduction due to redundant states, but 256 possibilities is still large. A piece of code like the following reduces the size while still retaining proper operation.
    if (Temp_in = "10001000") then
        Display_out <= "1111";
    end if;
    if (Temp_in /= "10001000") then
        Display_out <= "0000";
    end if;

    There are 2 possibilities, and one or the other will be used in all cases where the signal is not X or U. You have reduced 256 possibilities down to 2.
  2. Using LUTs in place of finite-valued calculations. Take for instance one of our original pieces of code. (Other than the fact that implementing a while loop is very difficult, this code works fine.)

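    ConvTEMP: process(RLTMP) is -- 8 bit number to 4 BCD digits.
        -- Variables update immediately, so each pass of the loop sees the
        -- result of the previous subtraction (a signal would not change
        -- until the process suspends).
        variable TMPWRK : std_logic_vector(7 downto 0);
        variable BCD1, BCD2, BCD3, BCD4 : std_logic_vector(3 downto 0);
    begin
        TMPWRK := RLTMP;
        BCD1 := "0000";
        BCD2 := "0000";
        BCD3 := "0000";
        BCD4 := "0000";
        -- "00000000" and "10000000" both leave every digit at 0.
        if (TMPWRK /= "00000000") and (TMPWRK /= "10000000") then
            while TMPWRK > "00000000" loop
                if TMPWRK > "10000000" then         -- top bit set (treated as a sign flag)
                    BCD1 := "1111";
                    TMPWRK := TMPWRK - "10000000";  -- strip the top bit
                elsif TMPWRK >= "01100100" then     -- 100 or more
                    BCD2 := BCD2 + "0001";
                    TMPWRK := TMPWRK - "01100100";
                elsif TMPWRK >= "00001010" then     -- 10 or more
                    BCD3 := BCD3 + "0001";
                    TMPWRK := TMPWRK - "00001010";
                else                                -- ones
                    BCD4 := BCD4 + "0001";
                    TMPWRK := TMPWRK - "00000001";
                end if;
            end loop;
        end if;
        TMPBCD1 <= BCD1;
        TMPBCD2 <= BCD2;
        TMPBCD3 <= BCD3;
        TMPBCD4 <= BCD4;
    end process;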

        This is a semi-nice, straightforward piece of code which is not too difficult to understand. It repeatedly subtracts 100 from TMPWRK until it is < 100, then subtracts 10 until it is < 10, then does the same for ones. Each time it subtracts 100, 10 or 1 it increments the 100's, 10's or 1's display digit as appropriate. When the calculation reaches 0 we have converted the 8-bit number to BCD. The problem, however, is that this method uses over 200 modules.

        Since there is a limited number of combinations for the 8-bit number (256), we tried implementing the above conversion using a LUT implemented by a CASE statement. Basically there were 256 WHEN statements in the CASE and each WHEN set a certain pattern on the outputs, i.e. TMPWRK = "00000000" produced BCD1="0000", BCD2="0000", BCD3="0000", and BCD4="0000". By now you should realize the sheer size of the file needed to do this conversion, and yes, it was a lot of work. The result of this painstaking work, though, was a converter which fit into 48 modules. I think this clearly illustrates some of the tradeoffs: more work for less space. We also noticed that the LUT method produces a HIGH complexity when you compile (10892), and the block took an Ultrasparc at least 35 minutes to compile.
  3. Intelligent design choices. Under this category all I will say is this. Latches do not take up a lot of space, nor do flip-flops. Comparators take a LOT of space, followed by adders which take a little less space, and then subtractors. From the above example, the first code used a lot of comparators, adders, and subtractors while the LUT used no comparators, adders, or subtractors.
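    To make the LUT idea from item 2 concrete, here is a minimal sketch of the same technique cut down to a 4-bit input, so that the whole table fits on the page. This is not our project file: the entity name bin4_to_bcd and its ports are made up for illustration, and the real converter needed 256 WHEN branches and four digit outputs.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 4-bit binary to 2-digit BCD converter written as a lookup table.
entity bin4_to_bcd is
    port (
        bin_in : in  std_logic_vector(3 downto 0);
        tens   : out std_logic_vector(3 downto 0);
        ones   : out std_logic_vector(3 downto 0)
    );
end entity bin4_to_bcd;

architecture lut of bin4_to_bcd is
begin
    convert : process (bin_in) is
    begin
        -- One WHEN per input value: no comparators, adders or subtractors.
        case bin_in is
            when "0000" => tens <= "0000"; ones <= "0000";  --  0
            when "0001" => tens <= "0000"; ones <= "0001";  --  1
            when "0010" => tens <= "0000"; ones <= "0010";  --  2
            when "0011" => tens <= "0000"; ones <= "0011";  --  3
            when "0100" => tens <= "0000"; ones <= "0100";  --  4
            when "0101" => tens <= "0000"; ones <= "0101";  --  5
            when "0110" => tens <= "0000"; ones <= "0110";  --  6
            when "0111" => tens <= "0000"; ones <= "0111";  --  7
            when "1000" => tens <= "0000"; ones <= "1000";  --  8
            when "1001" => tens <= "0000"; ones <= "1001";  --  9
            when "1010" => tens <= "0001"; ones <= "0000";  -- 10
            when "1011" => tens <= "0001"; ones <= "0001";  -- 11
            when "1100" => tens <= "0001"; ones <= "0010";  -- 12
            when "1101" => tens <= "0001"; ones <= "0011";  -- 13
            when "1110" => tens <= "0001"; ones <= "0100";  -- 14
            when "1111" => tens <= "0001"; ones <= "0101";  -- 15
            when others => tens <= "0000"; ones <= "0000";  -- X, U and friends
        end case;
    end process convert;
end architecture lut;
```

    Every output is assigned in every branch, so no latches are implied. Typing the table by hand is the tedious part; for the full 8-bit version you may prefer to generate the WHEN lines with a small script and paste them in.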

5) Timing Considerations:

    If you have a lot of functions which are very sensitive to changes in the clock, or to other regularly changing signals, we would suggest you use several process statements to implement complex operations rather than just one. Similar to dividing functions between components, you can put part of your operation in one process and part in another. This way they will function more in parallel with one another and you can pipeline your data path for each. Using an example from our project, our sensor interface had to read in data, subtract 40 from the number, then convert it to 4 BCD codes. As we saw earlier, we have a LUT to convert to BCD, but we don't want it constantly spitting out values; we only want it to operate when the data changes. Similarly, we do not want the part which communicates with the A/D to have to wait for the calculator to finish subtracting 40. So we use 3 processes to co-ordinate them. The converter process is sensitive to changes in the output from the calculator, and the calculator is sensitive to changes in the interface. Things operate nicely in parallel, which makes actual operation much nicer for us. The reason is that this type of system can run at a higher clock frequency due to shorter dependent data paths and pipelining.
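    The three-process arrangement can be sketched roughly as follows. This is an illustration under our own assumptions, not the project code: the entity sensor_path, the adc_valid strobe and the signal names are made up, ieee.numeric_std is assumed to be available, and the converter uses integer arithmetic as a stand-in for the BCD lookup table.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity sensor_path is
    port (
        clk       : in  std_logic;
        adc_valid : in  std_logic;                     -- hypothetical "new reading" strobe
        adc_data  : in  std_logic_vector(7 downto 0);  -- raw reading from the A/D
        negative  : out std_logic;                     -- '1' when reading - 40 is below zero
        hundreds, tens, ones : out unsigned(3 downto 0)
    );
end entity sensor_path;

architecture rtl of sensor_path is
    signal sample   : unsigned(7 downto 0) := (others => '0');
    signal adjusted : signed(8 downto 0)   := (others => '0');
begin
    -- Process 1: interface. Captures a reading whenever the A/D flags one,
    -- and never waits on the arithmetic that follows.
    adc_interface : process (clk) is
    begin
        if rising_edge(clk) then
            if adc_valid = '1' then
                sample <= unsigned(adc_data);
            end if;
        end if;
    end process adc_interface;

    -- Process 2: calculator. Wakes only when the captured sample changes.
    calculator : process (sample) is
    begin
        adjusted <= signed(resize(sample, 9)) - 40;
    end process calculator;

    -- Process 3: converter. Wakes only when the calculator output changes.
    -- The divisions below stand in for the lookup table.
    converter : process (adjusted) is
        variable magnitude : natural range 0 to 255;
    begin
        magnitude := to_integer(abs adjusted);
        if adjusted < 0 then
            negative <= '1';
        else
            negative <= '0';
        end if;
        hundreds <= to_unsigned(magnitude / 100, 4);
        tens     <= to_unsigned((magnitude / 10) mod 10, 4);
        ones     <= to_unsigned(magnitude mod 10, 4);
    end process converter;
end architecture rtl;
```

    In this sketch only the interface is clocked. Registering adjusted on clk as well would turn the calculator and converter into separate pipeline stages, at the cost of one extra cycle of latency.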

 

6) The "I hate the CAD Lab" syndrome:

   If you are at all like me, you can get very tired of the CAD lab. To do some of this work at home there are a few possibilities open to you. 

    Actel produces Designer Lite as a free download. This software is fully functional in every way except that it will only handle designs under a certain size; with the chips that this project uses, this limitation never comes into effect.  The package includes all the Actmapw tools (edn2vhdl, the VHDL compiler and the Netlist Optimizer, among others) as well as Designer.  The only drawback is that you cannot use these tools to simulate your designs; you can only write them, compile them, and lay them out. Notepad and WordPad both make very nice editors for your VHDL. **note: If you take your code from Windows to the UNIX machines, remember that UNIX terminates lines with a line feed only, while DOS uses a carriage return followed by a line feed.  To ensure that the UNIX compilers do not fail with "Unrecognized graphic character", do the following.

  • copy the file to your unix account
  • type dos2unix filename.vhd someotherfilename.vhd
  • type mv someotherfilename.vhd filename.vhd

    If you want to take your code home to work on it, use WordPad (which properly interprets the UNIX line endings) or, if you use Notepad, do the following.

  • unix2dos filename.vhd someotherfilename.vhd
  • mv someotherfilename.vhd filename.vhd
  • copy the file to your windows box.

    Another option is to do your design in some other package.  I used Warp by Cypress.  This package came with a VHDL textbook, "VHDL for Programmable Logic" (obtainable from Chapters).  The package includes a VHDL compiler, simulator, and synthesis tools, BUT they are focused on the Cypress family of FPGAs.  If all you are trying to do is produce functional VHDL then this may be of use to you. If you do try to use it, remember some tips.

 

(If you have any ideas to add to this document mail raymond.richmond@ualberta.ca with them so I can add them to my list.)