Abstract
Our project deals with the development and realization of a data compression and decompression unit for the ASCII character set. The data compression and decompression is based on two algorithms, half-byte and run length encoding/decoding.
Half-byte encoding takes advantage of the fact that many ASCII characters share the same first four bits. Hence when a stream of characters that contain the first fours bits are received these bits may be stripped off, effectively halving the amount of data. This gives a very robust compression scheme that will be effective on almost any data set.
Run length encoding is used to compress data containing repeating characters. When the same character is received consecutively four or more times, this algorithm compresses the data into a sequence of three characters. This algorithm is most effective on graphic files, which commonly contain long sequences of the same character.
Our unit gives the user the choice of using either scheme separately, or combining the two algorithms on one set of data. This way the user can use his best judgment in deciding the most effective and efficient combination of encoding schemes based on his data.
Data Compression/Decompression Devices
Data Sheet
Device #1 Run-Length Encoder
Number of I/O pins 27
Number of FG function generators 318
Number of H function generators 53
Number of flip-flops 75
Signal Name Pin Name Pin Number
Read In H16 1
Read in2 H17 2
Write Done H18 3
Read Done J17 5
Read Done2 K16 6
Write Out K17 7
Write Out2 K18 8
Reset L18 12
Output 0 M18 17
Output 1 M17 18
Output 2 N18 19
Output 3 P18 20
Output 4 N17 21
Output 5 R18 22
Output 6 T18 23
Output 7 P17 24
Write Done2 U16 30
Comp/Decomp R17 32
Input 0 P16 33
Input 1 U18 34
Input 2 T14 35
Input 3 U15 36
Input 4 V16 37
Input 5 T13 38
Input 6 U14 39
Input 7 V15 40
Device #2 Half-Byte Encoder
Number of I/O pins 27
Number of FG function generators 476
Number of H function generators 81
Number of flip-flops 67
Signal Name Pin Name Pin Number
Read Done A16 1
Read Done2 B15 2
Write Out C14 3
Write Out2 A17 4
Read In D17 6
Read In2 B18 7
Write Done F16 8
Write Done2 B16 10
Reset C18 12
Comp/Decomp D18 14
Output 0 F17 17
Output 1 E18 18
Output 2 F18 19
Output 3 G17 20
Output 4 G18 21
Output 5 H16 22
Output 6 H17 23
Output 7 H18 24
Input 0 K16 32
Input 1 K17 33
Input 2 K18 34
Input 3 L18 35
Input 4 L17 36
Input 5 L16 37
Input 6 M18 38
Input 7 M17 39
Theory of Operation
These devices were designed to be both a compressor and decompressor in one chip. Both, the Run-Length and Half-Byte device can work independently or be combined through their asynchronous interface.
The Run-Length device reduces redundant information. Upon receiving a data stream of repeated characters, (more than 3) the device activates it compression routine. The device places a marker bit followed by an information bit and the compressed character. So if you have a bit stream containing 8 bytes of 10110110 in a row the compressed form
would be reduced to 11111110 0001000 10110110 or a net saving of 5 bytes. This device acts as a good prefilter for the Half-Byte device as it reduces the number of half-byte control characters that will appear in a row.
The Half-Byte device works in a very different manner. This device reads in the data and determines if the first half of the byte is the same for more than 4 characters. If this condition is meet, compression begins. For example if
you have a data stream as follows 00101000 00101110 00101010 00101111 00101101 00101011 00101001, it would first place a control character 11111111. Then it would place the information character (the first byte) 00101000. The compressed bytes composed of the second halves of the bytes 11101010 11111101 10111001 would then be placed, followed by the end of compression marker 11111111. In this case we have a net saving of 2 bytes. The compression limit of this device is 50%. This device works well on ASCII based text as the alphabet and all the numbers only differ by five different nibbles, and the lower case alphabet by 2 nibbles. This implies words composed of entirely of the first half of the alphabet such as "beginning" or "finding" are suitable for complete compression. For a complete code listing for both algorithms see appendix C.
Device Operation
Both the Run-Length device, and Half-Byte device have the same input and output ports. It should be noted that all lines that end in 2 correspond to compression mode of operation. However there are some lines that are used for both modes of operation those being reset, clock, input, and output. The method used to operate both the Run-Length device, and Half-Byte device is the same as the other, the following is how to connect these devices to other circuits.
There are 4 inputs and 2 outputs from the device. There is a reset pin that must be asserted high before operation of the circuit. As well there is a mode select line that chooses between decompression and compression modes. This line should not be changed during data processing as data may be lost. There is the main Input, comprised of 8 data lines, the data on these lines must be valid only when Read_in or Read_in2 is high. The Read_in and Read_in2 lines are part of the asynchronous architecture of the design. These lines indicate that data is valid, and can be read in, there corresponding output lines are Read_done and Read_done2 respectively. Read_in corresponds to decompression of data mode, and Read_in2 is for compression of data mode.
The output lines indicate that: data is read, data is valid, as well as displays outputted compressed/decompressed data. The Read_done or Read_done2 line goes high once the data has been read from the input and latched. The Write_out and Write_out2 lines indicate that the data on the output is valid. Once the data has been read by some other device it must indicate this by asserting the Write_done or Write_done2 line. Failure to read the output causes the device to stop reading input.
Design Details and Problems
Run-Length Methodology
The most important part of run length encoding is to be able to count the number of occurrences of repeated characters. To implement this the number of times an input character has been repeated must be kept, as well as a flag to indicate that there is a character currently stored and ready to be compared.
When the state machine first reads in a character, the value of char_count is checked. If char_count is one, then a character needs to be stored, so the incoming characters can be compared to it. If char_count is not one, then a character is already stored and the character just read in needs to be compared to the stored character.
Once the two characters are compared, one of two events will occur. The characters will match and repeat_count will be incremented, or the two characters do not match and output needs to be sent.
There are two different ways output will be sent. If the character currently stored has been seen 4 or more times, then compression will occur and the compression indicator character, followed by the compressed character and the number of occurrences of that character will be output. If the character currently stored repeated less than four times, that character will be output. After it is output, the repeat_count is decremented. The character will be output until the repeat_count reaches 0.
Once the output is finished, the most recently read character will become the stored character and a new character will be read in.
To implement run length decoding, the input stream must be searched for the compression indicator character. If the input character is not the compression indicator, then the character read in to the decoder will be directly output. Once the compression indicator character is found, a flag is set to indicate that decompression must occur.
The characters input immediately after the compression indicator will be the compressed character, followed by the number of occurrences of the compressed character. Once the number of occurrences is read in, that value is loaded into a counter. The compressed character is then output and the counter is decremented. This cycle repeats until the counter reaches 0, indicated decompression is complete.
Design Modifications
Our initial design called for tri-state lines for communication between the compression device and the input and output devices. Unfortunately, implementing tri-state lines using Actel or Xilinx tools proves to be troublesome at the best. To overcome this difficulty, full handshaking using two lines between both input and output devices was employed.
Initially, a data producer must check a 'Read_done' line. If this line is low, then the bus is free to put data on. If the line is high, then the data on the bus may not be overwritten. Once the data is put on the bus, the producer asserts 'Read_in' line, indicating to the consumer that there is new data to be read from the bus.
Once the consumer reads the data, it asserts the 'Read_done' line, indicating to the producer that it has the data. The producer acknowledges this by deasserting the 'Read_in' line. The consumer then completes the transaction by deasserting the 'Read_done' line, allowing more data to be put on the bus.
Half-Byte Problems
There are many particular situations one has to deal with when implementing the Half-byte algorithm. The first one is the inability of the algorithm to handle an even number of bytes. This is solved by not compressing the last byte should it be even.
A more difficult problem to solve was the possibility of creating an inadvertent compression marker. Should the input bytes into compression be something like 10101111 10101111 there is a problem because the compressed version will be 11111111 which indicates that compression is over when in fact it shouldn't be. The solution for this problem, regrettably forces us to end the compression prior to such a situation, and dump the uncompressable characters. This solution was chosen for its compact nature, as module number was a concern when we were using the Actel chip. Upon switching to Xilinx we could have (but didn't have time to) used another method that would not stop compression, but rather just not compressed those characters.
Another algorithm problem is what to do with control characters that appear in the data. For this problem we simply just repeat this character twice, a better solution would be to look how many there are and perform Run-Length encoding on them, but seeing as how our project already included a Run-Length encoder we just recommend using that device first.
Testing And Test Circuit Design
In order to test our design we did extensive simulations these simulations can be found in Appendix A. From these simulations we were certain that our design was sound and should port easily to the FPGA. We were wrong, very wrong, after approximately 170 hours of recoding and testing we still were unable to get our state machines to function properly. Unfortunately the Run-Length device didn't function at all. However in the Half-Byte device we did get some results that indicated at least part of the state machine functioned properly.
The Reset function worked properly, initializing the output to 0 as well the Read_in and Read_in2 lines functioned properly as did there corresponding Read_done and Read_done2. The device would handle the case of the input being the control character properly, but failed to process any other data properly. The Write_out and Write_out2 lines functioned indicating that data was ready. As well the device responded correctly to a Write_done and Write_done2 acknowledge. The Main compression engine as well as the main decompression engine failed to function. In the end It was concluded that these problems were probably the result of too many purely combinational loops.
The device we used to test this FPGA was rather simple in design. A complete schematic can be found in Appendix B. No debouncing was done as the designs asynchronous nature doesn't allow for multiple signals without acknowledges. This test device was invaluable, but building it with department parts was more than just a little frustrating. 30% of the first SK-10 board was non-functional and 20% of the second one had dead lines.
Conclusion
This project proved to be more than challenging, in the future I would recommend that one build just a few states, test them simulate them back annotate them, and then add a few more. In this way slowly build your state machines. The functionality of our project was less than desired, however we learned invaluable information from our project. The CAD tools are far less than fool proof and appear more like hacked together programs with pretty little pictures ( this is especially true of the Actel tools). State machines are a tricky thing to get to function on an FPGA. In the future I would not recommend building any one state machine larger than 10 states.
In order to test the effectiveness of our compression algorithms we wrote programs in C language. On a typical text file with both compression algorithms in use we achieved a typical compression of 33%. Hence our algorithms were sound but are implementations failed to function correctly.
Appendix A
Appendix B
Appendix C