Sunday 15 March 2015

A Cache Memory

Today we revise what we know about cache memory. A cache is a memory device that improves processor performance by transparently storing data so that future requests for that data can be served faster. The data stored in a cache might be values that were computed earlier or copies of original values that are stored elsewhere.



An access to the cache results in either a cache hit or a cache miss. A cache hit means that the requested data is present in the cache; a cache miss means that it is not. On a cache hit the processor takes the data from the cache itself. On a cache miss the data is fetched from its original memory location. Cache memories are volatile and small in storage size. Because the storage is small, address decoding takes less time, which is one reason caches are faster than the main memory (RAM) of a computer.

As mentioned above, data is stored transparently in the cache. This means that the user requesting the data need not know whether it comes from the cache or from system memory; this is handled by the processor. The word cache comes from the French verb "cacher", which means "to hide" or "to conceal".


A simple cache entry contains three fields:

1. An index, which is local to the cache.
2. A tag, which is the index with reference to main memory. It tells the processor the location in main memory where an exact copy of the data is stored.
3. Data, which is the actual data needed by the processor.


When the processor needs some data from memory, it first checks the cache. It compares the requested address against the tag fields in the cache to see whether the data is available there. If a matching tag is found, the corresponding data is taken. Otherwise a cache miss is signalled and main memory is accessed. The cache is also updated with the data from this most recent memory access. This is called a cache update on cache miss.
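
The lookup and the update on a miss described above can be sketched as a small behavioural model in SystemVerilog. This is only an illustration and is not synthesizable hardware: the names (cache_line_t, cache_read, fill_ptr), the 4-line size and the stand-in main memory are all assumptions, and the line to fill on a miss is picked in simple round-robin order.

```systemverilog
// Behavioural sketch only; all names and sizes are hypothetical.
module cache_lookup_sketch;
  localparam int NUM_LINES = 4;

  typedef struct {
    bit        valid;  // set once the line holds a copy of a memory location
    bit [31:0] tag;    // main-memory address of that copy
    bit [31:0] data;   // the cached data itself
  } cache_line_t;

  cache_line_t cache [NUM_LINES];      // position in this array = local index
  bit [31:0]   main_mem [bit [31:0]];  // sparse stand-in for main memory
  int          fill_ptr = 0;           // naive choice of the line to fill on a miss

  // Returns the data stored at addr; 'hit' reports whether the cache supplied it.
  function automatic bit [31:0] cache_read(input bit [31:0] addr, output bit hit);
    foreach (cache[i]) begin
      if (cache[i].valid && cache[i].tag == addr) begin
        hit = 1;
        return cache[i].data;
      end
    end
    // Cache miss: access main memory, then update the cache with this access.
    hit = 0;
    cache[fill_ptr].valid = 1;
    cache[fill_ptr].tag   = addr;
    cache[fill_ptr].data  = main_mem.exists(addr) ? main_mem[addr] : 32'h0;
    cache_read = cache[fill_ptr].data;
    fill_ptr   = (fill_ptr + 1) % NUM_LINES;
  endfunction

  initial begin
    bit        hit;
    bit [31:0] d;
    main_mem[32'h40] = 32'hCAFE;
    d = cache_read(32'h40, hit);
    $display("first access : hit=%0d data=%h", hit, d);
    d = cache_read(32'h40, hit);
    $display("second access: hit=%0d data=%h", hit, d);
  end
endmodule
```

In this sketch the first access to address 0x40 should miss and fill a line, and the second access to the same address should hit.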


During a cache update, if the cache is full, an existing row has to be evicted. Which row is evicted is decided by a cache replacement algorithm. Some algorithms are:

1. LRU - The least recently used data is replaced.
2. MRU - The most recently used data is replaced.
3. Random replacement - Simple; used in some ARM processors.
4. Belady's algorithm - Discards the data that will not be needed for the longest time in the future. It cannot be implemented exactly in practice because it requires knowledge of future accesses.
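
As an illustration of the first policy in the list, LRU can be sketched with a SystemVerilog queue that keeps the cached tags ordered from least to most recently used. The names (lru_q, access), the 3-entry capacity and the access trace are assumptions made for this example; real hardware usually approximates LRU with a few status bits per line rather than a sorted list.

```systemverilog
// Behavioural LRU sketch; names, capacity and trace are hypothetical.
module lru_sketch;
  localparam int CAPACITY = 3;

  // Cached tags, ordered from least (front) to most (back) recently used.
  bit [31:0] lru_q [$];
  bit [31:0] trace [6] = '{1, 2, 3, 1, 4, 2};

  // Records an access to 'tag'; returns 1 on a hit and 0 on a miss.
  function automatic bit access(input bit [31:0] tag);
    int idx [$];
    idx = lru_q.find_first_index(t) with (t == tag);
    if (idx.size() != 0) begin
      lru_q.delete(idx[0]);        // hit: move the tag to the most recent position
      lru_q.push_back(tag);
      return 1;
    end
    if (lru_q.size() == CAPACITY)
      void'(lru_q.pop_front());    // cache full: evict the least recently used tag
    lru_q.push_back(tag);
    return 0;
  endfunction

  initial begin
    foreach (trace[i])
      $display("access tag %0d -> hit=%0d", trace[i], access(trace[i]));
  end
endmodule
```

With this trace only the fourth access (tag 1) should hit. The access to tag 4 evicts tag 2, which was the least recently used at that point, so the final access to tag 2 misses again.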


The average memory access time of a system with a cache can be calculated from the hit and miss ratios of the cache.

Average memory access time = (Time_cache * Hit_ratio) + ((Time_cache + Time_mm) * Miss_ratio)

where,
Time_cache and Time_mm are the times needed to access a location in the cache and in main memory respectively.
Hit_ratio and Miss_ratio are the hit and miss probabilities (Hit_ratio + Miss_ratio = 1).
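
The formula above can be tried out with a short SystemVerilog function. The function name amat and the numbers used (2 ns cache, 50 ns main memory, 90% hit probability) are assumptions chosen only for illustration.

```systemverilog
module amat_sketch;
  // Average memory access time, following the formula above.
  // hit_ratio is a probability between 0 and 1; the miss probability is 1 - hit_ratio.
  function automatic real amat(input real t_cache, input real t_mm, input real hit_ratio);
    real miss_ratio = 1.0 - hit_ratio;
    return (t_cache * hit_ratio) + ((t_cache + t_mm) * miss_ratio);
  endfunction

  initial begin
    // Hypothetical numbers: 2 ns cache, 50 ns main memory, 90% hit probability.
    $display("Average memory access time = %0.2f ns", amat(2.0, 50.0, 0.9));
  end
endmodule
```

With these numbers the result is (2 * 0.9) + (52 * 0.1), which is about 7 ns, compared with 50 ns for a system that always goes to main memory.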


There are two types of cache write policy: write back (copy back) and write through.

When the processor updates the data at a particular memory location, the new value must be written to the cache. If the data is updated only in the cache, the policy is called write back. If the data is updated both in the cache and in main memory, it is called write through. Write through keeps the cache and memory synchronized. With write back, the cached data is no longer the same as the data in main memory, so it is marked as "dirty". Dirty data is written back to main memory when that data is evicted from the cache. A miss in a write-back cache may therefore require two memory accesses to service: one to write the dirty location to memory and another to read the new location from memory.
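
The difference between the two policies can be sketched with a model that has a single cache line and a dirty bit. Everything here (the policy_e type, cache_write, the one-line cache and the addresses) is a hypothetical simplification meant only to show when main memory gets updated.

```systemverilog
// Behavioural sketch with a single cache line; all names are hypothetical.
module write_policy_sketch;
  typedef enum {WRITE_THROUGH, WRITE_BACK} policy_e;

  bit [31:0] main_mem [bit [31:0]];  // sparse stand-in for main memory
  bit        valid, dirty;           // state of the one cached line
  bit [31:0] line_addr, line_data;

  function automatic void cache_write(input policy_e policy,
                                      input bit [31:0] addr,
                                      input bit [31:0] data);
    // Replacing a dirty line: its data must be written back to memory first.
    if (valid && dirty && line_addr != addr)
      main_mem[line_addr] = line_data;
    valid     = 1;
    line_addr = addr;
    line_data = data;
    if (policy == WRITE_THROUGH) begin
      main_mem[addr] = data;  // memory updated immediately, line stays clean
      dirty = 0;
    end else begin
      dirty = 1;              // memory is now out of date until eviction
    end
  endfunction

  initial begin
    cache_write(WRITE_BACK, 32'h10, 32'hAAAA);
    $display("write back   : 0x10 in memory=%0d dirty=%0d", main_mem.exists(32'h10), dirty);
    cache_write(WRITE_BACK, 32'h20, 32'hBBBB);   // evicts the dirty line for 0x10
    $display("after evict  : mem[0x10]=%h", main_mem[32'h10]);
    cache_write(WRITE_THROUGH, 32'h30, 32'hCCCC);
    $display("write through: mem[0x30]=%h dirty=%0d", main_mem[32'h30], dirty);
  end
endmodule
```

After the first write-back store, main memory should not yet contain address 0x10; the value only reaches memory when the dirty line is evicted by the second store. The write-through store updates memory at once and leaves the line clean.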

Main memory locations may be altered without the cache being updated, for example by peripherals using DMA or by another core in a multi-core processor. This leaves out-of-date data in the cache, which is called "stale" data. To solve this problem, cache coherence protocols are used between the cache managers to keep the data consistent.


Conceptually, a cache behaves like a CAM (content-addressable memory): it is searched by content (the tag) rather than by location. For efficiency, all the tags have to be compared in one cycle, which requires parallel hardware. Also, the larger the memory, the longer the memory access time.

Let us now see how a cache is organized. Say our system has 32-bit memory addresses and the cache size is 4 KB. Also say each line in the cache stores 32 bytes, so that there are 128 lines in total. Each line has two fields: address (4 bytes) and data. The address is further divided into two fields: tag (27 bits) and offset (5 bits, used to select a particular byte within the line). Remember that the tag holds the 27 most significant bits of the address here. Caches of this kind are called fully associative caches. Since the tag is relatively long (27 bits) and has to be compared against every line, reading data from a fully associative cache takes more time. More hardware is also required to compare all the tags in parallel, as in a CAM. So they are expensive, but they use the cache space more efficiently because a memory block can be placed in any line.

Fully associative cache


Another type of cache architecture is the direct mapped cache. Here the address is divided into three fields: tag (20 bits), index (7 bits, used to select one of the 128 lines in the cache) and offset (5 bits). The drawback of this type of cache is that a block of main memory cannot be copied to any line, as it can in a fully associative cache, because all addresses with the same index are mapped to the same line. In certain situations, for example when a program alternates between two addresses that share an index, you may get a cache miss on almost every access. On the other hand, only one tag has to be compared, so the cache access time is lower. So they are cheap but less efficient.

Direct mapped Cache

Another type of cache is the set associative cache, which combines the advantages of direct mapped and fully associative caches. These are further subdivided according to the number of ways, which determines the number of bits in the index field.

1. 2-way set associative cache - In this type of cache we have two groups of lines, each containing 64 lines. The cache has the same fields as a direct mapped cache, but the tag has 21 bits and the index has 6 bits here.

2. 4-way set associative cache - Here we have 4 groups, each containing 32 lines. The index has 5 bits and the tag has 22 bits.

2-way and 4-way set associative caches
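
The field widths quoted for the different organizations all follow from the same arithmetic, which can be sketched as below for the 4 KB, 32-bytes-per-line example. The function name split, the parameter names and the sample address are assumptions for illustration; ways = 1 stands for direct mapped and ways = 128 for fully associative.

```systemverilog
module addr_fields_sketch;
  localparam int ADDR_BITS   = 32;
  localparam int OFFSET_BITS = 5;    // 32 bytes per line
  localparam int NUM_LINES   = 128;  // 4 KB / 32 bytes

  // Prints the tag/index/offset split of 'addr' for a cache with 'ways' lines per set.
  function automatic void split(input bit [ADDR_BITS-1:0] addr, input int ways);
    int index_bits = $clog2(NUM_LINES / ways);
    int tag_bits   = ADDR_BITS - OFFSET_BITS - index_bits;
    bit [ADDR_BITS-1:0] offset = addr & ((1 << OFFSET_BITS) - 1);
    bit [ADDR_BITS-1:0] index  = (addr >> OFFSET_BITS) & ((1 << index_bits) - 1);
    bit [ADDR_BITS-1:0] tag    = addr >> (OFFSET_BITS + index_bits);
    $display("ways=%0d: tag bits=%0d index bits=%0d | tag=%h index=%h offset=%h",
             ways, tag_bits, index_bits, tag, index, offset);
  endfunction

  initial begin
    split(32'h1234_5678, 1);    // direct mapped
    split(32'h1234_5678, 2);    // 2-way set associative
    split(32'h1234_5678, 4);    // 4-way set associative
    split(32'h1234_5678, 128);  // fully associative (no index field)
  end
endmodule
```

The reported tag and index widths should come out as 20 and 7, 21 and 6, 22 and 5, and 27 and 0 bits respectively, matching the numbers given in the text.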

Sunday 1 March 2015

SystemVerilog Associative Arrays

In the previous post we learned in detail about SystemVerilog dynamic arrays, which are useful for dealing with contiguous collections of variables whose number changes dynamically. But what if the size of the collection is not known at all: how much storage should you allocate to the array?

An associative array is the one to use when the size of the collection is unknown or the data space is sparse, so associative arrays are mainly used to model sparse memories. In an associative array, storage for an element is allocated only when that element is used, not up front as in dynamic arrays. An associative array can be assigned only to another associative array of a compatible type and with the same index type.

Another main difference between associative arrays and normal arrays is that the index of an associative array can be of any data type, not just an integer.

Properties of Associative arrays:
  • Dynamically allocated, non-contiguous elements
  • Accessed with integer, or string index, single dimension
  • Great for sparse arrays with wide ranging index
  • Array functions: exists, first, last, next, prev

Declaration Syntax:

data_type array_name [ index_type ];

Where,
data_type : data type of the array element. This can be any type that is allowed for Fixed Arrays
array_name : name of the array being declared.
index_type : data type to be used as index

Example : Associative array declaration

int by_wildcard [*];          // Wildcard index: can be indexed by any integral data type
int by_string   [string];     // String index
int by_class    [some_Class]; // Class index (some_Class must be a previously declared class)
int by_integer  [integer];    // Integer index
typedef bit signed [4:1] Nibble;
int by_nibble   [Nibble];     // Signed packed array index

Accessing the Associative arrays
SystemVerilog provides various built-in methods to access, analyze and manipulate associative arrays.
  • num() or size() returns the number of entries in the associative array.
  • delete(index) removes the entry at the specified index; delete() without an argument removes all entries.
  • exists(index) checks whether an element exists at the specified index of the given associative array.
  • first() assigns to the given index variable the value of the smallest/first index in the associative array. Returns 0 if the array is empty; otherwise returns 1.
  • last() assigns to the given index variable the value of the largest/last index in the associative array. Returns 0 if the array is empty; otherwise returns 1.
  • next() finds the entry with the smallest index that is greater than the given index. If such an entry exists, the index variable is assigned the index of that entry and the function returns 1. Otherwise the index is unchanged and the function returns 0.
  • prev() finds the entry with the largest index that is smaller than the given index. If such an entry exists, the index variable is assigned the index of that entry and the function returns 1. Otherwise the index is unchanged and the function returns 0.

Example : Associative Arrays in-built methods

// SystemVerilog Associative arrays
module associative_array ();
 
 integer associative_array [integer];
 
 integer i;
 
 initial begin
   // Add elements to the array
   associative_array[100] = 101;
   $display ("value stored in 100 is %d", associative_array[100]);
   associative_array[1]   = 100;
   $display ("value stored in 1   is %d", associative_array[1]);
   associative_array[50]   = 99;
   $display ("value stored in 50  is %d", associative_array[50]);
   associative_array[250] = 22;
   $display ("value stored in 250 is %d", associative_array[250]);
   // Print the size of array
   $display ("size of array is %d", associative_array.num());
   // Check if index 2 exists
   $display ("index 2 exists   %d", associative_array.exists(2));
   // Check if index 100 exists
   $display ("index 100 exists %d", associative_array.exists(100));
   // Value stored in first index
   if (associative_array.first(i)) begin
     $display ("value at first index %d value %d", i, associative_array[i]);
   end
   // Value stored in last index
   if (associative_array.last(i)) begin
     $display ("value at last index  %d value %d", i,  associative_array[i]);
   end
   // Delete index 100 (note: this is not the first index)
   associative_array.delete(100);
   $display ("Deleted index 100");
   // Value stored in first index
   if (associative_array.first(i)) begin
     $display ("value at first index %d value %d", i, associative_array[i]);
   end
    #1  $finish;
 end
 
 endmodule

Simulation Result:
value stored in 100 is 101
value stored in 1 is 100
value stored in 50 is 99
value stored in 250 is 22
size of array is 4
index 2 exists 0
index 100 exists 1
value at first index 1 value 100
value at last index 250 value 22
Deleted index 100
value at first index 1 value 100
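
The example above does not use next(), so here is a minimal sketch of how first() and next() are typically combined to walk through every entry in index order. The array name scores and its contents are made up for illustration.

```systemverilog
module assoc_iterate_sketch;
  int scores [int];  // hypothetical associative array with an int index
  int idx;

  initial begin
    scores[100] = 101;
    scores[1]   = 100;
    scores[50]  = 99;

    // Walk the indices in ascending order using first() and next().
    if (scores.first(idx)) begin
      do
        $display("scores[%0d] = %0d", idx, scores[idx]);
      while (scores.next(idx));
    end

    // A foreach loop visits the same entries without an explicit index variable.
    foreach (scores[k])
      $display("foreach: scores[%0d] = %0d", k, scores[k]);
  end
endmodule
```

Both loops should visit the indices 1, 50 and 100 in that order, regardless of the order in which the entries were added. Walking backwards works the same way with last() and prev().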


As we have seen, several data types can be used as the index of an associative array. Below are the things to keep in mind for each index data type.

1. Integer or int index : When using an integer index, keep the following rules in mind.

  • A 4-state index containing X or Z is invalid.
  • Indices smaller than integer are sign extended to 32 bits.
  • The ordering is signed numerical.
  • Indices can be any integral expression.
  • Indices are signed.
  • Example: int array_name [ integer ];

2. String index : When using a string index, keep the following rules in mind.

  • An empty string "" index is valid.
  • The ordering is lexicographical (lesser to greater).
  • Indices can be strings or string literals of any length.
  • Example: int array_name [ string ];

3. Class index : When using a class index, keep the following rules in mind.
  • A null index is valid.
  • The ordering is deterministic but arbitrary.
  • Indices can be objects of that particular type or derived from that type.
  • Example: int array_name [ some_Class ];

4. Wildcard index : When using a wildcard index (*), keep the following rules in mind.
  • A 4-state Index containing X or Z is invalid.
  • Indices are unsigned.
  • Indexing expressions are self-determined; signed indices are not sign extended.
  • The ordering is numerical (smallest to largest).
  • A string literal index is auto-cast to a bit-vector of equivalent size.
  • The array can be indexed by any integral data type.
  • Example: int array_name [*];
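
The ordering rules above can be checked with a short sketch that asks for the first index of a string-indexed and an int-indexed array. The array names and the values stored are assumptions chosen for illustration.

```systemverilog
module index_order_sketch;
  int    by_name [string];  // hypothetical string-indexed array
  int    by_int  [int];     // hypothetical int-indexed array
  string s;
  int    i;

  initial begin
    by_name["pear"]  = 3;
    by_name["apple"] = 1;
    by_name[""]      = 0;   // an empty string is a valid index

    by_int[5]  = 1;
    by_int[-3] = 2;         // int indices are signed
    by_int[0]  = 3;

    // String indices are ordered lexicographically.
    void'(by_name.first(s));
    $display("first string index: \"%s\"", s);
    void'(by_name.last(s));
    $display("last string index : \"%s\"", s);

    // Integer indices are ordered as signed numbers.
    void'(by_int.first(i));
    $display("first int index   : %0d", i);
  end
endmodule
```

Under the rules listed above, the first string index should be the empty string, the last should be "pear", and the first int index should be -3.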