EE552

Abstract

Our project deals with the development and realization of a data compression and decompression unit for the ASCII character set. The data compression and decompression is based on two algorithms, half-byte and run length encoding/decoding.

Half-byte encoding takes advantage of the fact that many ASCII characters share the same first four bits. Hence when a stream of characters that contain the first fours bits are received these bits may be stripped off, effectively halving the amount of data. This gives a very robust compression scheme that will be effective on almost any data set.

Run length encoding is used to compress data containing repeating characters. When the same character is received consecutively four or more times, this algorithm compresses the data into a sequence of three characters. This algorithm is most effective on graphic files, which commonly contain long sequences of the same character.

Our unit gives the user the choice of using either scheme separately, or combining the two algorithms on one set of data. This way the user can use his best judgment in deciding the most effective and efficient combination of encoding schemes based on his data.

Data Compression/Decompression Devices

Data Sheet

Device #1 Run-Length Encoder

Number of I/O pins 27

Number of FG function generators 318

Number of H function generators 53

Number of flip-flops 75

Signal Name Pin Name Pin Number

Read In H16 1

Read in2 H17 2

Write Done H18 3

Read Done J17 5

Read Done2 K16 6

Write Out K17 7

Write Out2 K18 8

Reset L18 12

Output 0 M18 17

Output 1 M17 18

Output 2 N18 19

Output 3 P18 20

Output 4 N17 21

Output 5 R18 22

Output 6 T18 23

Output 7 P17 24

Write Done2 U16 30

Comp/Decomp R17 32

Input 0 P16 33

Input 1 U18 34

Input 2 T14 35

Input 3 U15 36

Input 4 V16 37

Input 5 T13 38

Input 6 U14 39

Input 7 V15 40

Device #2 Half-Byte Encoder

Number of I/O pins 27

Number of FG function generators 476

Number of H function generators 81

Number of flip-flops 67

Signal Name Pin Name Pin Number

Read Done A16 1

Read Done2 B15 2

Write Out C14 3

Write Out2 A17 4

Read In D17 6

Read In2 B18 7

Write Done F16 8

Write Done2 B16 10

Reset C18 12

Comp/Decomp D18 14

Output 0 F17 17

Output 1 E18 18

Output 2 F18 19

Output 3 G17 20

Output 4 G18 21

Output 5 H16 22

Output 6 H17 23

Output 7 H18 24

Input 0 K16 32

Input 1 K17 33

Input 2 K18 34

Input 3 L18 35

Input 4 L17 36

Input 5 L16 37

Input 6 M18 38

Input 7 M17 39

Theory of Operation

These devices were designed to be both a compressor and decompressor in one chip. Both, the Run-Length and Half-Byte device can work independently or be combined through their asynchronous interface.

The Run-Length device reduces redundant information. Upon receiving a data stream of repeated characters, (more than 3) the device activates it compression routine. The device places a marker bit followed by an information bit and the compressed character. So if you have a bit stream containing 8 bytes of 10110110 in a row the compressed form

would be reduced to 11111110 0001000 10110110 or a net saving of 5 bytes. This device acts as a good prefilter for the Half-Byte device as it reduces the number of half-byte control characters that will appear in a row.

The Half-Byte device works in a very different manner. This device reads in the data and determines if the first half of the byte is the same for more than 4 characters. If this condition is meet, compression begins. For example if

you have a data stream as follows 00101000 00101110 00101010 00101111 00101101 00101011 00101001, it would first place a control character 11111111. Then it would place the information character (the first byte) 00101000. The compressed bytes composed of the second halves of the bytes 11101010 11111101 10111001 would then be placed, followed by the end of compression marker 11111111. In this case we have a net saving of 2 bytes. The compression limit of this device is 50%. This device works well on ASCII based text as the alphabet and all the numbers only differ by five different nibbles, and the lower case alphabet by 2 nibbles. This implies words composed of entirely of the first half of the alphabet such as "beginning" or "finding" are suitable for complete compression. For a complete code listing for both algorithms see appendix C.

Device Operation

Both the Run-Length device, and Half-Byte device have the same input and output ports. It should be noted that all lines that end in 2 correspond to compression mode of operation. However there are some lines that are used for both modes of operation those being reset, clock, input, and output. The method used to operate both the Run-Length device, and Half-Byte device is the same as the other, the following is how to connect these devices to other circuits.

There are 4 inputs and 2 outputs from the device. There is a reset pin that must be asserted high before operation of the circuit. As well there is a mode select line that chooses between decompression and compression modes. This line should not be changed during data processing as data may be lost. There is the main Input, comprised of 8 data lines, the data on these lines must be valid only when Read_in or Read_in2 is high. The Read_in and Read_in2 lines are part of the asynchronous architecture of the design. These lines indicate that data is valid, and can be read in, there corresponding output lines are Read_done and Read_done2 respectively. Read_in corresponds to decompression of data mode, and Read_in2 is for compression of data mode.

The output lines indicate that: data is read, data is valid, as well as displays outputted compressed/decompressed data. The Read_done or Read_done2 line goes high once the data has been read from the input and latched. The Write_out and Write_out2 lines indicate that the data on the output is valid. Once the data has been read by some other device it must indicate this by asserting the Write_done or Write_done2 line. Failure to read the output causes the device to stop reading input.

Design Details and Problems

Run-Length Methodology

The most important part of run length encoding is to be able to count the number of occurrences of repeated characters. To implement this the number of times an input character has been repeated must be kept, as well as a flag to indicate that there is a character currently stored and ready to be compared.

When the state machine first reads in a character, the value of char_count is checked. If char_count is one, then a character needs to be stored, so the incoming characters can be compared to it. If char_count is not one, then a character is already stored and the character just read in needs to be compared to the stored character.

Once the two characters are compared, one of two events will occur. The characters will match and repeat_count will be incremented, or the two characters do not match and output needs to be sent.

There are two different ways output will be sent. If the character currently stored has been seen 4 or more times, then compression will occur and the compression indicator character, followed by the compressed character and the number of occurrences of that character will be output. If the character currently stored repeated less than four times, that character will be output. After it is output, the repeat_count is decremented. The character will be output until the repeat_count reaches 0.

Once the output is finished, the most recently read character will become the stored character and a new character will be read in.

To implement run length decoding, the input stream must be searched for the compression indicator character. If the input character is not the compression indicator, then the character read in to the decoder will be directly output. Once the compression indicator character is found, a flag is set to indicate that decompression must occur.

The characters input immediately after the compression indicator will be the compressed character, followed by the number of occurrences of the compressed character. Once the number of occurrences is read in, that value is loaded into a counter. The compressed character is then output and the counter is decremented. This cycle repeats until the counter reaches 0, indicated decompression is complete.

Design Modifications

Our initial design called for tri-state lines for communication between the compression device and the input and output devices. Unfortunately, implementing tri-state lines using Actel or Xilinx tools proves to be troublesome at the best. To overcome this difficulty, full handshaking using two lines between both input and output devices was employed.

Initially, a data producer must check a 'Read_done' line. If this line is low, then the bus is free to put data on. If the line is high, then the data on the bus may not be overwritten. Once the data is put on the bus, the producer asserts 'Read_in' line, indicating to the consumer that there is new data to be read from the bus.

Once the consumer reads the data, it asserts the 'Read_done' line, indicating to the producer that it has the data. The producer acknowledges this by deasserting the 'Read_in' line. The consumer then completes the transaction by deasserting the 'Read_done' line, allowing more data to be put on the bus.

Half-Byte Problems

There are many particular situations one has to deal with when implementing the Half-byte algorithm. The first one is the inability of the algorithm to handle an even number of bytes. This is solved by not compressing the last byte should it be even.

A more difficult problem to solve was the possibility of creating an inadvertent compression marker. Should the input bytes into compression be something like 10101111 10101111 there is a problem because the compressed version will be 11111111 which indicates that compression is over when in fact it shouldn't be. The solution for this problem, regrettably forces us to end the compression prior to such a situation, and dump the uncompressable characters. This solution was chosen for its compact nature, as module number was a concern when we were using the Actel chip. Upon switching to Xilinx we could have (but didn't have time to) used another method that would not stop compression, but rather just not compressed those characters.

Another algorithm problem is what to do with control characters that appear in the data. For this problem we simply just repeat this character twice, a better solution would be to look how many there are and perform Run-Length encoding on them, but seeing as how our project already included a Run-Length encoder we just recommend using that device first.

Testing And Test Circuit Design

In order to test our design we did extensive simulations these simulations can be found in Appendix A. From these simulations we were certain that our design was sound and should port easily to the FPGA. We were wrong, very wrong, after approximately 170 hours of recoding and testing we still were unable to get our state machines to function properly. Unfortunately the Run-Length device didn't function at all. However in the Half-Byte device we did get some results that indicated at least part of the state machine functioned properly.

The Reset function worked properly, initializing the output to 0 as well the Read_in and Read_in2 lines functioned properly as did there corresponding Read_done and Read_done2. The device would handle the case of the input being the control character properly, but failed to process any other data properly. The Write_out and Write_out2 lines functioned indicating that data was ready. As well the device responded correctly to a Write_done and Write_done2 acknowledge. The Main compression engine as well as the main decompression engine failed to function. In the end It was concluded that these problems were probably the result of too many purely combinational loops.
The device we used to test this FPGA was rather simple in design. A complete schematic can be found in Appendix B. No debouncing was done as the designs asynchronous nature doesn't allow for multiple signals without acknowledges. This test device was invaluable, but building it with department parts was more than just a little frustrating. 30% of the first SK-10 board was non-functional and 20% of the second one had dead lines.

Conclusion

This project proved to be more than challenging, in the future I would recommend that one build just a few states, test them simulate them back annotate them, and then add a few more. In this way slowly build your state machines. The functionality of our project was less than desired, however we learned invaluable information from our project. The CAD tools are far less than fool proof and appear more like hacked together programs with pretty little pictures ( this is especially true of the Actel tools). State machines are a tricky thing to get to function on an FPGA. In the future I would not recommend building any one state machine larger than 10 states.

In order to test the effectiveness of our compression algorithms we wrote programs in C language. On a typical text file with both compression algorithms in use we achieved a typical compression of 33%. Hence our algorithms were sound but are implementations failed to function correctly.

Appendix A

Appendix B

Appendix C