int x[2][128];
int i;
int sum = 0;
for (i = 0; i < 128; i++) {
sum += x[0][i] * x[1][i];
}
A.
C = 512, E = 1, B = 16, S = 32
total read count: 2 * 128
x[0][i] address: i*4
x[1][i] address: (128+i)4 = 512 + i4
so x[0][i] and x[1][i] are cached into same block.
miss rate 100%
B.
C = 1024, E = 1, B = 16, S = 64
sizeof(x) == 2 * 128 * 4 == C
whole array can be cached.
miss rate is depended on block size B.
B = 16, sizeof(int) = 4, so
miss rate is 25%
C.
C = 512, E = 2, B = 16, S = 16
total read count: 2 * 128
x[0][i] address: i*4
x[1][i] address: (128+i)4 = 512 + i4
so x[0][i] and x[1][i] are cached into different block in same set.
in first half, all elements can be cached. miss rate is 25%.
for (i = 0; i < 64; i++)
sum += x[0][i] * x[1][i];
in second half
for (i = 64; i < 128; i++)
sum += x[0][i] * x[1][i];
x[0][i] is not in cache. according to LRU strategy, cache x[0][i] into the same block with x[0][i-64], cache x[1][i] into the same block with x[1][i-64]. miss rate is 25%.
finally, miss rate is still 25%.
D.
No
if B is still 16, sizeof(int) = 4, block can only cache 4 int one time.
read int first time toggle memory cache, miss; next 3 time read hit.
25% miss rate is lowest.
E.
Yes
assume B = 32, block cache 8 int at one time.
only 1 miss in 8 time int read.
miss rate can be 12.5%.