In a multiprocessor system, copies of the same data can become inconsistent, either between adjacent levels of the memory hierarchy or within the same level. For instance, a cached copy may differ from the original object in main memory.
Because multiple processors operate in parallel and independently, the different caches may hold different copies of the same memory block, which leads to the cache coherence problem. To overcome this problem, parallel architectures provide cache coherence schemes that keep cached data in a consistent state.
In the figure, two processors P1 and P2 both reference a shared data element X, and initially the two cached copies and the copy in shared memory are consistent. Suppose P1 writes a new value X1 into its cache. With a write-through policy, the same value is written to shared memory immediately, but the copy in the cache of P2 is now out of date. With a write-back policy, main memory is updated only when the modified data in the cache is replaced or invalidated, so until then both main memory and the cache of P2 hold the old value.
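The effect of the two write policies on P1, P2, and shared memory can be sketched in a few lines of Python. This is a minimal model rather than a description of any real hardware: the `Cache` class and the `run` helper are hypothetical names introduced only for illustration, and the caches deliberately have no coherence mechanism.

```python
class Cache:
    """A private cache with no coherence mechanism (hypothetical model)."""

    def __init__(self, memory, write_through):
        self.memory = memory              # shared main memory (a dict)
        self.write_through = write_through
        self.data = {}                    # address -> cached value
        self.modified = set()             # addresses not yet written back

    def read(self, addr):
        if addr not in self.data:         # miss: fetch from main memory
            self.data[addr] = self.memory[addr]
        return self.data[addr]

    def write(self, addr, value):
        self.data[addr] = value
        if self.write_through:
            self.memory[addr] = value     # memory is updated immediately
        else:
            self.modified.add(addr)       # memory is updated only on eviction

    def evict(self, addr):
        if addr in self.modified:         # the write-back happens here
            self.memory[addr] = self.data[addr]
            self.modified.discard(addr)
        self.data.pop(addr, None)


def run(write_through):
    """Return the value of X seen by P1, by P2, and in main memory."""
    memory = {"X": "X"}
    p1 = Cache(memory, write_through)
    p2 = Cache(memory, write_through)
    p1.read("X")
    p2.read("X")                          # all three copies start consistent
    p1.write("X", "X1")
    return p1.read("X"), p2.read("X"), memory["X"]
```

Calling `run(True)` models write-through and `run(False)` models write-back. In both cases P2 keeps reading its old cached copy, which is the inconsistency a coherence scheme has to remove.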
Inconsistency can arise from several sources, including sharing of writable data, process migration, and I/O activity.
Snoopy protocols keep the data in the cache memories consistent with the data in shared memory in systems built around a bus-based memory system. Snoopy bus protocols maintain this consistency by using either a write-invalidate or a write-update policy.
In the first figure, processors P1, P2, and P3 each hold a copy of the data element X in their caches, consistent with the copy in shared memory. With the write-invalidate protocol, when P1 writes X1 into its cache, all other copies are invalidated over the bus. The invalidated blocks, here termed dirty, must not be used. With the write-update protocol, all cached copies are updated over the bus, and with a write-back cache the memory copy is updated as well.
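The two snoopy policies can be compared with a small simulation. The sketch below is illustrative only: `Bus`, `SnoopyCache`, and `demo` are hypothetical names, every write is broadcast on the bus, and memory is written through on each write to keep the model short (an assumption that differs from a write-back design).

```python
class Bus:
    """Hypothetical shared bus that every cache snoops on."""

    def __init__(self, memory, policy):
        self.memory = memory
        self.policy = policy              # "invalidate" or "update"
        self.caches = []

    def attach(self, cache):
        self.caches.append(cache)

    def broadcast_write(self, source, addr, value):
        for cache in self.caches:
            if cache is not source:
                cache.snoop(addr, value, self.policy)


class SnoopyCache:
    def __init__(self, bus):
        self.bus = bus
        self.lines = {}                   # addr -> value; absent means invalid
        bus.attach(self)

    def read(self, addr):
        if addr not in self.lines:        # read miss: fetch from memory
            self.lines[addr] = self.bus.memory[addr]
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.bus.memory[addr] = value     # assumption: write-through memory
        self.bus.broadcast_write(self, addr, value)

    def snoop(self, addr, value, policy):
        if addr not in self.lines:
            return                        # no local copy, nothing to do
        if policy == "invalidate":
            del self.lines[addr]          # drop the stale copy
        else:
            self.lines[addr] = value      # refresh the copy in place


def demo(policy):
    bus = Bus({"X": "X"}, policy)
    p1, p2, p3 = SnoopyCache(bus), SnoopyCache(bus), SnoopyCache(bus)
    for p in (p1, p2, p3):
        p.read("X")                       # everyone caches X
    p1.write("X", "X1")
    held = ["X" in p.lines for p in (p1, p2, p3)]
    values = [p.read("X") for p in (p1, p2, p3)]
    return held, values
```

Under `demo("invalidate")` only P1 still holds the block right after the write and the others re-fetch it on their next read; under `demo("update")` all three caches keep a valid, refreshed copy. Either way every processor ends up reading X1.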
When memory-access and invalidation commands are executed, a number of events and actions take place. They are as follows -
Snoopy cache protocols have to be modified to suit multistage networks before they can be used to build large multiprocessors with hundreds of processors. Because broadcasting is very expensive in a multistage network, the consistency commands are sent only to those caches that hold a copy of the block. This is why directory-based protocols were developed for network-connected multiprocessors.
In a directory-based system, the data to be shared is recorded in a common directory that maintains coherence among all the caches. An entry may be loaded from primary memory into a cache only when the directory permits it. When an entry is changed, the directory either updates it or invalidates the other caches that hold that entry.
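The directory idea can be sketched as follows. This is a simplified, invalidation-only model under stated assumptions: `Directory` and `DirCache` are hypothetical names, there is a single central directory, and memory is updated on every write. The point it illustrates is that invalidations go only to the caches recorded as holding the block, not to every cache in the machine.

```python
class Directory:
    """Hypothetical central directory tracking which caches hold each block."""

    def __init__(self, memory):
        self.memory = memory
        self.sharers = {}                 # addr -> set of caches with a copy
        self.messages = 0                 # point-to-point invalidations sent

    def load(self, cache, addr):
        # The directory records the new sharer before handing out the data.
        self.sharers.setdefault(addr, set()).add(cache)
        return self.memory[addr]

    def store(self, writer, addr, value):
        # Invalidate only the caches known to hold the block.
        for cache in self.sharers.get(addr, set()) - {writer}:
            cache.invalidate(addr)
            self.messages += 1
        self.sharers[addr] = {writer}
        self.memory[addr] = value         # assumption: memory updated on write


class DirCache:
    def __init__(self, directory):
        self.directory = directory
        self.lines = {}                   # addr -> value

    def read(self, addr):
        if addr not in self.lines:        # miss: ask the directory
            self.lines[addr] = self.directory.load(self, addr)
        return self.lines[addr]

    def write(self, addr, value):
        self.directory.store(self, addr, value)
        self.lines[addr] = value

    def invalidate(self, addr):
        self.lines.pop(addr, None)


def demo():
    directory = Directory({"X": "X"})
    caches = [DirCache(directory) for _ in range(4)]
    caches[0].read("X")
    caches[1].read("X")                   # only caches 0 and 1 share X
    caches[0].write("X", "X1")
    return directory.messages, caches[1].read("X")
```

In `demo`, four caches exist but only one invalidation message is sent, because the directory knows that only one other cache holds X.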
Synchronization is a special form of communication in which control information, rather than data, is exchanged between communicating processes that reside on the same or on different processors.
Multiprocessor systems implement synchronization mostly through hardware mechanisms. These include primitives such as memory read, write, and read-modify-write operations, along with inter-processor interrupts.
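As an illustration of a read-modify-write primitive, the sketch below builds a spin lock on top of test-and-set. Python has no hardware test-and-set, so the `AtomicFlag` class is a hypothetical stand-in that uses a `threading.Lock` to emulate the indivisibility the hardware would provide; `SpinLock` and `count_with_lock` are likewise names invented for this example.

```python
import threading
import time


class AtomicFlag:
    """Hypothetical memory word with an atomic test-and-set operation."""

    def __init__(self):
        self._value = False
        self._hw = threading.Lock()       # stands in for hardware atomicity

    def test_and_set(self):
        with self._hw:                    # read and write as one indivisible step
            old = self._value
            self._value = True
            return old

    def clear(self):
        with self._hw:
            self._value = False


class SpinLock:
    def __init__(self):
        self._flag = AtomicFlag()

    def acquire(self):
        # Spin until test-and-set reports that the flag was previously clear.
        while self._flag.test_and_set():
            time.sleep(0)                 # yield so the lock holder can run

    def release(self):
        self._flag.clear()


def count_with_lock(threads=4, increments=500):
    lock = SpinLock()
    total = [0]

    def worker():
        for _ in range(increments):
            lock.acquire()
            total[0] += 1                 # critical section
            lock.release()

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return total[0]
```

Because only the thread that observes the old value `False` enters the critical section, no increments are lost and `count_with_lock()` returns `threads * increments`.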
Maintaining cache coherence is an important and difficult concern in systems whose processors have cache memory, because the data held in different caches can easily become inconsistent.
The major areas of concern are −
Sharing of writable data: suppose two processors P1 and P2 hold the same data element X in their local caches. When P1 writes to X, and the caches are write-through, main memory is updated as well. When P2 then reads X, it does not see the new value, because the copy in its own cache has become outdated.
Process migration: initially, the data element X is present in the cache of P1 and not in the cache of P2. A process on P2 first writes to X and then migrates to P1. When the process then reads X on P1, it finds the outdated copy held in the cache of P1. In the opposite direction, a process on P1 writes to X and then migrates to P2. When it reads X on P2, it finds the outdated copy of X that still exists in main memory.
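I/O activity: consider a two-processor architecture in which an I/O device is attached to the bus, and both caches initially hold the data element X. When the I/O device receives a new value of X, it stores it directly in main memory, so the cached copies of X become outdated and a processor that reads X gets the old value. Conversely, if a processor writes to X in its cache and the I/O device then transmits X from main memory, the device sends an outdated copy.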
In a Uniform Memory Access (UMA) architecture, all processors in the system share the same memory. Symmetric Multiprocessors (SMPs) are the class of UMA machines most widely used for servers. In an SMP, every processor has uniform access to all system resources, such as memory, disks, and other I/O devices.
In the NUMA (Non-Uniform Memory Access) design, there are multiple SMP clusters, each with its own internal shared network, and the clusters are connected through a message-passing network. Hence NUMA is an architecture with logically shared but physically distributed memory.
In a NUMA machine, the cache controller of a processor determines whether a memory reference is to the local SMP memory or to remote memory. To reduce the number of remote accesses, NUMA architectures usually use processors that can cache remote data. Because caches are involved, cache coherency has to be maintained, and such systems are therefore also termed CC-NUMA (Cache Coherent NUMA).
In a COMA (Cache Only Memory Architecture) machine, the main memories act as caches: data blocks are hashed to a location in the DRAM cache according to their addresses. Data that is fetched remotely is stored in the local main memory. Data blocks have no fixed home location, so they can move freely through the system.
For message passing, COMA architectures mostly use a hierarchical network organized as a tree, in which each switch holds a directory of the data elements in its subtree. Because data has no home location, it has to be searched for explicitly. A remote access therefore requires a traversal along the switches of the tree, searching their directories for the desired data. If a switch receives multiple requests for the same data from its subtree, it combines them into a single request and sends that to its parent. When the requested data returns, the switch sends multiple copies of it down its subtree.
Some of the significant differences between COMA and CC-NUMA are as follows -
All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd