Parallel Computer Architecture Interview Questions & Answers


Looking to move your career into Parallel Computer Architecture? This page collects the essential material you need to clear an interview for Parallel Computer Architecture jobs. On our jobs portal you will find jobs relevant to you along with these Parallel Computer Architecture Interview Questions and Answers. Many leading companies offer roles such as Graphics & Parallel Programming Architect, Computer Systems Researcher, Deep Learning Performance Architect, Cryptanalytic Computer Scientist, Performance Architect, and many others. To save you the time of searching different websites for topics related to Parallel Computer Architecture, we have gathered them in one place. For more details, please feel free to visit our site.

Parallel Computer Architecture Interview Questions

    1. Question 1. What Is Shared Memory Architecture?

      Answer :

      A single address space is visible to all execution threads.

    2. Question 2. What Is NUMA Memory Architecture?

      Answer :

      NUMA stands for Non-Uniform Memory Access. It is a special type of shared memory architecture in which the access time from a processor to different memory locations may vary, as may the access times from different processors to the same memory location.

    3. Question 3. Name Some Network Architectures Prevalent In Machines Supporting The Message Passing Paradigm?

      Answer :

      Ethernet, InfiniBand, and tree networks.

    4. Question 4. What Is Data-Parallel Computation?

      Answer :

      Data is partitioned across parallel execution threads, each of which performs some computation on its partition, usually independently of the other threads.
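
      The partition-then-compute pattern can be sketched as follows. This is a minimal illustration, not a tuned implementation: the function names `partition` and `parallel_sum_of_squares` and the choice of a sum of squares as the per-partition work are assumptions made for the example. In standard CPython builds the threads will not speed up CPU-bound work; the sketch only shows the structure.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(data, parts):
    """Split data into `parts` contiguous chunks of nearly equal size."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional element each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def parallel_sum_of_squares(data, workers=4):
    """Each worker processes its own partition independently of the others."""
    chunks = partition(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(lambda c: sum(x * x for x in c), chunks))
    # Combine the independent partial results in a final sequential step.
    return sum(partial_sums)
```

      For example, `parallel_sum_of_squares(list(range(10)), workers=3)` splits the ten values into chunks of 4, 3, and 3 elements and returns 285.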

    5. Question 5. What Is Task-Parallel Computation?

      Answer :

      The parallelism manifests across functions. A set of functions needs to be computed, and there may or may not be ordering constraints among them.

    6. Question 6. What Is Task Latency?

      Answer :

      The time that elapses between the request for a task and the task's completion.

    7. Question 7. What Is Speedup?

      Answer :

      The ratio of some performance metric (like latency) obtained using a single processor to that obtained using a set of parallel processors.

    8. Question 8. What Is Parallel Efficiency?

      Answer :

      The speedup per processor, that is, the speedup divided by the number of processors used.
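
      As a small worked example of speedup and parallel efficiency, the sketch below uses latency as the performance metric. The function names and the timings in the usage note are hypothetical.

```python
def speedup(t_serial, t_parallel):
    """Ratio of single-processor latency to parallel latency."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, processors):
    """Speedup per processor; 1.0 would mean perfect scaling."""
    return speedup(t_serial, t_parallel) / processors
```

      If a task takes 100 seconds on one processor and 25 seconds on 8 processors, the speedup is 4.0 and the efficiency is 0.5.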

    9. Question 9. What Is An Inherently Sequential Task?

      Answer :

      One whose maximum speedup (using any number of processors) is 1.

    10. Question 10. What Is The Memory Consistency Model Supported By OpenMP?

      Answer :

      There is no “guaranteed” sharing/consistency of shared variables until a flush is called. Flush sets that overlap are sequentially consistent and the writes of a variable become visible to every other thread at the point flush is serialized. This is slightly weaker than “weak consistency.”

    11. Question 11. How Are Threads Allocated To Processors When There Are More Threads Than The Number Of Processors?

      Answer :

      Once a thread completes on a core, a new thread is run on that core. In OpenMP, the order in which loop iterations are assigned to threads can be controlled using the “schedule” clause.

    12. Question 12. What Is The Maximum Time Speedup Possible According To Amdahl's Law?

      Answer :

      1/f, where f is the inherently sequential fraction of the time taken by the best sequential execution of the task.
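
      The 1/f limit follows from the finite-processor form of Amdahl's law, speedup = 1 / (f + (1 - f) / p), as the processor count p grows without bound. A minimal sketch, with hypothetical function names:

```python
def amdahl_speedup(f, p):
    """Speedup on p processors when fraction f of the work is sequential."""
    return 1.0 / (f + (1.0 - f) / p)


def amdahl_limit(f):
    """Upper bound on speedup as the number of processors grows without bound."""
    return 1.0 / f
```

      With f = 0.1, ten processors give a speedup of about 5.26, and no number of processors can exceed the limit of 10.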

    13. Question 13. What Is SIMD?

      Answer :

      A class belonging to Flynn’s taxonomy of parallel architectures, it stands for single instruction multiple data. In this architecture, different processing elements all execute the same instruction in a given clock cycle, with the respective data (e.g., in registers) being independent of each other.

    14. Question 14. What Is A Hypercube Connection?

      Answer :

      A single node is a hypercube. An n-node hypercube is made of two n/2-node hypercubes, with their corresponding nodes connected to each other.

    15. Question 15. What Is The Diameter Of An N-Node Hypercube?

      Answer :

      log n (base 2). The diameter is the number of links on the shortest path between the two nodes that are furthest apart.
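
      The log n diameter can be checked on small hypercubes. The sketch below assumes the usual labelling convention, in which the n = 2^d nodes carry d-bit labels and two nodes are linked exactly when their labels differ in one bit; the function names are hypothetical.

```python
def hypercube_neighbors(node, dimensions):
    """Neighbors of `node` differ from it in exactly one bit position."""
    return [node ^ (1 << bit) for bit in range(dimensions)]


def hop_distance(a, b):
    """Shortest path length equals the number of bits in which a and b differ."""
    return bin(a ^ b).count("1")


def diameter(dimensions):
    """Largest shortest-path distance over all node pairs (brute force)."""
    nodes = range(1 << dimensions)
    return max(hop_distance(a, b) for a in nodes for b in nodes)
```

      For an 8-node hypercube (d = 3), node 0 has neighbors 1, 2, and 4, and `diameter(3)` returns 3, which is log2 of 8.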

    16. Question 16. How Does OpenMP Provide A Shared Memory Programming Environment?

      Answer :

      OpenMP uses pragmas to control the automatic creation of threads. Since the threads share the address space, they share memory. However, they are allowed a local view of the shared variables through “private” variables. The compiler allocates a copy of the variable for each thread and optionally initializes it with the value of the original variable. Within the thread, references to the private variable are statically changed to refer to the new copy.

    17. Question 17. What Is Common CRCW PRAM?

      Answer :

      A Parallel Random Access Machine (PRAM) model of computation with concurrent read and concurrent write (CRCW), in which multiple processors can write to a common memory address in the same step, as long as they are all writing the same value.
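
      The common-write rule can be illustrated with a small sequential simulation of one PRAM step. This is only a sketch: the names `common_crcw_step` and `parallel_or` are hypothetical, and the step is simulated in a loop rather than executed in parallel. The logical OR of n bits is a standard example that takes a single step in this model.

```python
def common_crcw_step(memory, writes):
    """Apply one synchronous step of (address, value) writes, one per processor.

    Raises ValueError if two processors write different values to the same
    address, which the common CRCW rule forbids.
    """
    pending = {}
    for address, value in writes:
        if address in pending and pending[address] != value:
            raise ValueError(f"conflicting writes to address {address}")
        pending[address] = value
    memory.update(pending)
    return memory


def parallel_or(bits):
    """Logical OR in one simulated step: every processor holding a 1 writes 1."""
    memory = {0: 0}
    writes = [(0, 1) for bit in bits if bit]
    return common_crcw_step(memory, writes)[0]
```

      Because every writing processor stores the same value 1 into address 0, the concurrent writes in `parallel_or` never conflict.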

    18. Question 18. What Is The Impact Of Limiting PRAM Model To A Fixed Number Of Processors Or A Fixed Memory Size?

      Answer :

      PRAMs with higher capacities can be simulated (with linear slowdown).

    19. Question 19. What Is The Impact Of Eliminating Shared Write From PRAM?

      Answer :

      It can be simulated by a CREW PRAM with a log n factor in the time. However, the algorithms in this model can become a little complicated, as they must ensure conflict-free writes.

    20. Question 20. What Is The Significance Of Work Complexity Analysis?

      Answer :

      Time complexity does not account for the size of the machine. Work complexity is more reflective of practical efficiency. The work-time scheduling principle describes the expected time for a p-processor PRAM as work/p.
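
      To make the work measure concrete, the sketch below sums a list by pairwise reduction and counts both the total number of additions (work) and the number of parallel rounds (depth). The estimate in `estimated_steps` adds the depth to the work/p term, which is the form usually given for Brent's scheduling bound; treat the exact form and the function names as assumptions of this sketch.

```python
import math


def tree_sum(values):
    """Sum by pairwise reduction, returning (total, work, depth)."""
    level = list(values)
    work = depth = 0
    while len(level) > 1:
        pairs = len(level) // 2
        # All additions in one round are independent and could run in parallel.
        nxt = [level[2 * i] + level[2 * i + 1] for i in range(pairs)]
        if len(level) % 2:
            nxt.append(level[-1])  # odd element is carried up unchanged
        work += pairs
        depth += 1
        level = nxt
    return (level[0] if level else 0), work, depth


def estimated_steps(work, depth, p):
    """Upper-bound estimate of time steps on p processors: work/p plus depth."""
    return math.ceil(work / p) + depth
```

      For eight values, the reduction performs 7 additions in 3 rounds, so two processors need at most an estimated 7 steps.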

    21. Question 21. What Does Bulk Synchronous Model Add To PRAM For Parallel Algorithm Analysis?

      Answer :

      PRAM assumes constant-time access to shared memory, which is unrealistic. BSP counts time in "message communication", and in this model a step isn't initiated until the input data has arrived.

    22. Question 22. Is It True That All NC Problems Parallelize Well?

      Answer :

      In general, NC problems do parallelize well, in the sense that a problem has a polylogarithmic-time solution in the PRAM model while it only has a super-logarithmic solution in the RAM model. However, for problems that already have a polylogarithmic solution in the RAM model, there may not be an effective speedup.

    23. Question 23. Is User Locking Required To Control The Order Of Access To Guarantee Sequential Consistency?

      Answer :

      Sequential consistency is independent of user locking but does require delaying of memory operations at the system level. Precise ordering of operations need not be preordained by the program logic. There just must exist a global ordering which is consistent with the local view observed by each processor.

    24. Question 24. What Is Pipelining?

      Answer :

      Pipelining is a technique in which work is processed stage by stage. The data passes through the stages in sequence, and each stage performs one operation on it. With n stages, up to n operations can be in progress at the same time, each in a different stage. Pipelining is used to increase the throughput of the processing unit: once the pipeline is full, a result can complete at every step even though each individual item still passes through all n stages.
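
      The throughput gain can be estimated with a simple cycle count. The sketch below assumes an idealized pipeline in which every stage takes exactly one cycle and nothing ever stalls; the function names are hypothetical.

```python
def unpipelined_cycles(stages, items):
    """Each item passes through all stages before the next item starts."""
    return stages * items


def pipelined_cycles(stages, items):
    """The first item takes `stages` cycles; each later item finishes one cycle after."""
    return stages + items - 1 if items else 0
```

      Under these assumptions, 100 items through a 5-stage pipeline take 104 cycles instead of 500.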

    25. Question 25. What Is Cache?

      Answer :

      Cache is a component that transparently stores data so that future requests for that data can be served faster. The data that is stored within a cache might be values that have been computed earlier or duplicates of original values that are stored elsewhere. If requested data is contained in the cache (cache hit), this request can be served by simply reading the cache, which is comparatively faster.

      Otherwise (cache miss), the data has to be recomputed or fetched from its original storage location, which is comparatively slower. Hence, the more requests can be served from the cache the faster the overall system performance is. 

      Caching is usually considered a performance-enhancement tool rather than a way to store application data. If you spend significant server resources accessing the same data repeatedly, use caching instead. Caching data can bring huge performance benefits, so whenever you find that you need to frequently access data that doesn’t often change, cache it in the cache object and your application's performance will improve.

    26. Question 26. What Is Write Back And Write Through Caches?

      Answer :

      A write-back cache uses a caching method in which modifications to data in the cache aren't copied to the cache source until absolutely necessary. A write-through cache performs all write operations in parallel: data is written to main memory and the L1 cache simultaneously.

      Write-back caching yields somewhat better performance than write-through caching because it reduces the number of write operations to main memory. With this performance improvement comes a slight risk that data may be lost if the system crashes.
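
      The difference in memory traffic can be seen in a toy model that counts writes reaching main memory. This is a deliberately simplified sketch: the class names are hypothetical, each cache line holds a single value, and capacity limits and reads are left out.

```python
class WriteThroughCache:
    def __init__(self):
        self.lines = {}
        self.memory = {}
        self.memory_writes = 0

    def write(self, address, value):
        self.lines[address] = value
        self.memory[address] = value  # every write also goes to main memory
        self.memory_writes += 1


class WriteBackCache:
    def __init__(self):
        self.lines = {}
        self.dirty = set()
        self.memory = {}
        self.memory_writes = 0

    def write(self, address, value):
        self.lines[address] = value
        self.dirty.add(address)  # main memory is now stale for this address

    def evict(self, address):
        # Only a dirty line needs to be copied back when it leaves the cache.
        if address in self.dirty:
            self.memory[address] = self.lines[address]
            self.memory_writes += 1
            self.dirty.discard(address)
        self.lines.pop(address, None)
```

      Ten writes to the same address cost ten memory writes with write-through but only one with write-back, performed at eviction. Until that eviction the value exists only in the cache, which is the crash risk mentioned above.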

    27. Question 27. What Are Different Pipelining Hazards And How Are They Eliminated?

      Answer :

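      A pipeline hazard is a situation in which the next instruction cannot execute in its intended clock cycle without producing an incorrect result. Structural hazards occur when two instructions need the same hardware resource in the same cycle; they are handled by duplicating the resource or stalling. Data hazards occur when an instruction depends on the result of an earlier instruction that is still in the pipeline; they are handled by forwarding (bypassing), stalling, or reordering instructions. Control hazards occur when the outcome of a branch is not yet known; they are handled by stalling, branch prediction, or delayed branches.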

    28. Question 28. What Are Different Stages Of A Pipe?

      Answer :

      There are two types of pipelines:

      1. Instruction pipeline, where the different stages of instruction fetch and execution are handled in a pipeline.
      2. Arithmetic pipeline, where the different stages of an arithmetic operation are handled along the stages of a pipeline.

    29. Question 29. Explain More About Branch Prediction In Controlling The Control Hazards?

      Answer :

      A branch prediction control device, in an information processing unit which performs a pipeline process, generates a branch prediction address used for verification of an instruction being speculatively executed. The branch prediction control device includes a first return address storage unit storing the prediction return address, a second return address storage unit storing a return address to be generated depending on an execution result of the call instruction, and a branch prediction address storage unit sending a stored prediction return address as a branch prediction address and storing the sent branch prediction address.

      When the branch prediction address differs from a return address, which is generated after executing a branch instruction or a return instruction, contents stored in the second return address storage unit are copied to the first return address storage unit.

    30. Question 30. Give Examples Of Data Hazards With Pseudo Codes?

      Answer :

      A hazard is a condition in a pipelined processor, caused by the simultaneous execution of multiple instructions in different stages, that can lead to an incorrect result. There are three types of hazards: data hazards, control hazards, and structural hazards. Data hazards are further classified as read-after-write (RAW), write-after-read (WAR), and write-after-write (WAW).
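
      Data hazards are usually classified by the order of the conflicting accesses to a register: read-after-write (RAW), write-after-read (WAR), and write-after-write (WAW). The sketch below encodes a few pseudo-instructions and reports which hazard a pair creates. The `Instr` record, the register names, and the instruction mnemonics are hypothetical and chosen only for illustration.

```python
from collections import namedtuple

# Hypothetical three-address instruction: one destination, a tuple of sources.
Instr = namedtuple("Instr", ["text", "dest", "srcs"])


def data_hazards(first, second):
    """Return the hazard types that arise if `second` overlaps with `first`."""
    found = []
    if first.dest in second.srcs:
        found.append("RAW")  # second reads what first has yet to write
    if second.dest in first.srcs:
        found.append("WAR")  # second overwrites what first still has to read
    if first.dest == second.dest:
        found.append("WAW")  # both write the same register
    return found


add = Instr("ADD R1, R2, R3", "R1", ("R2", "R3"))
sub = Instr("SUB R4, R1, R5", "R4", ("R1", "R5"))
mul = Instr("MUL R2, R6, R7", "R2", ("R6", "R7"))
lod = Instr("LOAD R1, 0(R8)", "R1", ("R8",))
```

      Here `data_hazards(add, sub)` reports RAW because SUB reads R1 before ADD has written it, `data_hazards(add, mul)` reports WAR because MUL writes R2 while ADD still needs to read it, and `data_hazards(add, lod)` reports WAW because both write R1.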

    31. Question 31. How Do You Calculate The Number Of Sets Given Its Way And Size In A Cache?

      Answer :

      A cache in the primary storage hierarchy contains cache lines that are grouped into sets. If each set contains k lines, then we say that the cache is k-way set associative. The number of sets is therefore the cache size divided by (line size x k).

      A data request has an address specifying the location of the requested data. Each cache-line sized chunk of data from the lower level can only be placed into one set. The set that it can be placed into depends on its address. This mapping between addresses and sets must have an easy, fast implementation. The fastest implementation involves using just a portion of the address to select the set.

      When this is done, a request address is broken up into three parts:

      • An offset part identifies a particular location within a cache line.
      • A set part identifies the set that contains the requested data.
      • A tag part must be saved in each cache line along with its data to distinguish different addresses that could be placed in the set.
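
      The set count and the three address fields can be computed as in the sketch below. It assumes that the cache size, line size, and number of ways are powers of two and that addresses are 32 bits wide; the function names and the 32 KiB example are hypothetical.

```python
def cache_geometry(cache_bytes, line_bytes, ways, address_bits=32):
    """Return (sets, offset_bits, index_bits, tag_bits) for power-of-two sizes."""
    sets = cache_bytes // (line_bytes * ways)
    offset_bits = line_bytes.bit_length() - 1  # log2 of the line size
    index_bits = sets.bit_length() - 1         # log2 of the number of sets
    tag_bits = address_bits - index_bits - offset_bits
    return sets, offset_bits, index_bits, tag_bits


def split_address(address, offset_bits, index_bits):
    """Break an address into its (tag, set index, offset) parts."""
    offset = address & ((1 << offset_bits) - 1)
    index = (address >> offset_bits) & ((1 << index_bits) - 1)
    tag = address >> (offset_bits + index_bits)
    return tag, index, offset
```

      A 32 KiB, 4-way cache with 64-byte lines has 32768 / (64 x 4) = 128 sets, so an address splits into a 6-bit offset, a 7-bit set index, and a 19-bit tag.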

    32. Question 32. Scoreboard Analysis?

      Answer :

      Scoreboarding is a centralized method, used in the CDC 6600 computer, for dynamically scheduling a pipeline so that the instructions can execute out of order when there are no conflicts and the hardware is available. In a scoreboard, the data dependencies of every instruction are logged.

      Instructions are released only when the scoreboard determines that there are no conflicts with previously issued and incomplete instructions. If an instruction is stalled because it is unsafe to continue, the scoreboard monitors the flow of executing instructions until all dependencies have been resolved before the stalled instruction is issued.

    33. Question 33. What Is Miss Penalty And Give Your Own Ideas To Eliminate It?

      Answer :

      The fraction or percentage of accesses that result in a hit is called the hit rate. The fraction or percentage of accesses that result in a miss is called the miss rate, so hit rate + miss rate = 1.0 (100%). The difference between the lower-level access time and the cache access time is called the miss penalty.

    34. Question 34. How Do You Improve The Cache Performance?

      Answer :

      1. Reduce the miss rate,
      2. Reduce the miss penalty, or
      3. Reduce the time to hit in the cache.

      CPU time = (CPU execution clock cycles + Memory stall clock cycles) x Clock cycle time

      Memory stall clock cycles = (Reads x Read miss rate x Read miss penalty) + (Writes x Write miss rate x Write miss penalty)

      Memory stall clock cycles = Memory accesses x Miss rate x Miss penalty

      CPU time = IC x (CPIexecution + Memory accesses per instruction x Miss rate x Miss penalty) x Clock cycle time, where IC is the instruction count and cache hits are included in CPIexecution

      Misses per instruction = Memory accesses per instruction x Miss rate

      CPU time = IC x (CPIexecution + Misses per instruction x Miss penalty) x Clock cycle time
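
      The last formula can be evaluated directly, as in the sketch below. The function and parameter names are hypothetical, and the numbers in the usage note are made up for illustration.

```python
def cpu_time(ic, cpi_execution, mem_accesses_per_instr, miss_rate,
             miss_penalty, cycle_time):
    """CPU time = IC x (CPIexecution + misses per instruction x miss penalty) x cycle time."""
    misses_per_instr = mem_accesses_per_instr * miss_rate
    return ic * (cpi_execution + misses_per_instr * miss_penalty) * cycle_time
```

      For one million instructions with a CPIexecution of 1.0, 1.5 memory accesses per instruction, a 2% miss rate, a 100-cycle miss penalty, and a 1 ns cycle, the stalls add 3 cycles per instruction, giving about 4 ms in total instead of 1 ms with a perfect cache.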

    35. Question 35. Different Addressing Modes?

      Answer :

      Addressing modes are an aspect of the instruction set architecture in most central processing unit (CPU) designs. The various addressing modes that are defined in a given instruction set architecture define how machine language instructions in that architecture identify the operand (or operands) of each instruction. An addressing mode specifies how to calculate the effective memory address of an operand by using information held in registers and/or constants contained within a machine instruction or elsewhere.

    36. Question 36. Computer Arithmetic With Two's Complements?

      Answer :

      The two's complement of a binary number is defined as the value obtained by subtracting the number from a large power of two (specifically, from 2^N for an N-bit two's complement). The two's complement of the number then behaves like the negative of the original number in most arithmetic, and it can coexist with positive numbers in a natural way.

      A two's-complement system or two's-complement arithmetic is a system in which negative numbers are represented by the two's complement of the absolute value; this system is the most common method of representing signed integers on computers. In such a system, a number is negated (converted from positive to negative or vice versa) by computing its two's complement. An N-bit two's-complement numeral system can represent every integer in the range −2^(N−1) to +2^(N−1) − 1.
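
      The encoding and its range can be sketched in a few lines. The function names are hypothetical; the bit patterns are held in ordinary Python integers.

```python
def to_twos_complement(value, bits):
    """Encode a signed integer as an unsigned N-bit two's-complement pattern."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lo <= value <= hi:
        raise ValueError(f"{value} does not fit in {bits} bits")
    # Masking a negative value yields 2**bits - |value|, as in the definition.
    return value & ((1 << bits) - 1)


def from_twos_complement(pattern, bits):
    """Decode an unsigned N-bit pattern back to a signed integer."""
    if pattern & (1 << (bits - 1)):  # sign bit set, so the value is negative
        return pattern - (1 << bits)
    return pattern
```

      In 8 bits, -5 is stored as 2^8 - 5 = 251 (binary 11111011), and the representable range is -128 to 127.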

    37. Question 37. About Hardware And Software Interrupts?

      Answer :

      Hardware Interrupt:

      Each CPU has external interrupt lines. External devices such as the keyboard, mouse, and other controllers can send signals to the CPU asynchronously.

      Software Interrupt:

      A software interrupt is an interrupt generated within a processor by executing an instruction. Software interrupts are often used to implement system calls because they implement a subroutine call with a CPU ring level change.

    38. Question 38. What Is Bus Contention And How Do You Eliminate It?

      Answer :

      Bus contention occurs when more than one memory module attempts to access the bus simultaneously. It can be reduced by using a hierarchical bus architecture.

All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd
