MuCache

Microservice Call Graph Caching

Task #1: Reading

1. Read https://www.cis.upenn.edu/~sga001/papers/mucache-nsdi24.pdf, especially Sec. 4, Sec. 5, and the formal proof of correctness in Appendix A. Source code: https://github.com/eniac/mucache

You can see my reading traces in the annotations of the paper.

2. For the example in Figure 4, draw the execution sequence (similar to Figure 3) to show how MuCache protocol solves the challenge of the “diamond” pattern. What if the write and read operations come from independent processes in S1? Draw all possible execution sequences.

Diamond Graph

flowchart LR;
    subgraph Service1
    S1-.-> W11[W1]
    S1-.-> W12[W1]
    W12-.->CM1((CM1))
    W11-.->CM1((CM1))
    CM1-.->C1[(C1)]
    W11-.->C1
    end

    subgraph Service2
    S2-.-> W21[W2]
    S2-.-> W22[W2]
    W22-.->CM2((CM2))
    W21-.->CM2((CM2))
    CM2-.->C2[(C2)]
    W21-.->C2
    end

    subgraph Service3
    S3-.-> W31[W3]
    S3-.-> W32[W3]
    W32-.->CM3((CM3))
    W31-.->CM3((CM3))
    CM3-.->C3[(C3)]
    W31-.->C3
    end

    subgraph Service4
    S4-.-> W41[W4]
    S4-.-> W42[W4]
    W42--> D4[(D4)]
    W42-.->CM4((CM4))
    W41-.->CM4((CM4))
    CM4-.->C4[(C4)]
    W41-.->C4
    end
    
    W11-->S2
    CM2-.->CM1
    W11-->S3
    CM3-.->CM1
    W21-->S4
    CM4-.->CM2
    W31-->S4
    CM4-.->CM3

Single Process

Call Stack

Call stack of the example in the paper (with MuCache). First, S1 issues read(k) with trace S1-->S3-->S4 (to populate the caches and set their status). Then S1 issues write(k) with trace S1-->S2-->S4. Finally, S1 issues read(k) again with trace S1-->S3-->S4 (same as the first read, so it is omitted).

W1 preReqStart(ctx[call("read",k)])
CM1 startHandler(call("read",k)):
history=[Call("read",k)]
W1 preCall(ctx,ca):[ca=Call("read",k)]
readset={cid:ca}
cache.get(ca)->None

W1 call("read",k) to S3
W3 preReqStart(ctx[call("read",k)])
CM3 startHandler(call("read",k)):
history=[Call("read",k)]
W3 preCall(ctx,ca):[ca=Call("read",k)]
readset={cid:ca}
cache.get(ca)->None

S3 call("read",k) to S4
W4 preReqStart(ctx[call("read",k)])
CM4 startHandler(call("read",k)):
history=[Call("read",k)]
W4 preRead(ctx,k):
readsets={cid:[k]}
W4 read k from DB, get ret
W4 preReturn(ctx, ret)
CM4 endHandler(ca,[k],S3,ret,{S1,S3,S4}):
saved={k:(S3,ca)}

S4 return ret to S3
CM3 saveHandler(ca,ret,{S1,S3,S4}):
visited={ca:{S1,S3,S4}}
cache={ca:ret}
W3 preReturn(ctx,ret)
CM3 endHandler(ca,[ca],S1,ret,{S1,S3,S4}):
saved={ca:(S1,ca)}

S3 return ret to S1
CM1 saveHandler(ca,ret,{S1,S3,S4}):
visited={ca:{S1,S3,S4}}
cache={ca:ret}
W1 preReturn(ctx,ret)
CM1 endHandler(ca,[ca],None,ret,{S1,S3,S4}):
saved={ca:(None,ca)}
S1 return ret

S1 call("write",k,v) to S2 (preReqStart & preCall are skipped, because "write" is not read-only)
S2 call("write",k,v) to S4
S4 write(k,v) to DB:
DB={k:v}
W4 postWrite(k):
CM4 invHandler(k):
history=[Call(ca),Inv(k)]
saved[k]=(S3,ca)
saved={}

CM3 invHandler(ca):
history=[Call(ca),Inv(ca)]
saved[ca]=(S1,ca)
saved={}
cache={}

CM1 invHandler(ca):
history=[Call(ca),Inv(ca)]
saved[ca]=(None,ca)
saved={}
cache={}

A subsequent S1 call("read",k) cannot be served from the cache, because the cache has been invalidated and is now empty.

S1 would execute call("read",k) just like the first part of the call stack above.
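The invalidation chain at the end of the call stack above (CM4 --> CM3 --> CM1) can be sketched in a few lines of Python. Everything here (the CacheManager class, the save/inv method names) is a simplified assumption for illustration, not the paper's actual implementation:

```python
# Minimal sketch of invalidation propagation, assuming each cache manager
# keeps a `saved` map from keys/calls to the upstream manager to notify.

class CacheManager:
    def __init__(self, name):
        self.name = name
        self.cache = {}   # call -> cached return value
        self.saved = {}   # key or call -> (caller CM, caller's call)

    def save(self, call, ret, caller_cm, caller_call):
        # Record the cache entry and remember whom to invalidate later.
        self.cache[call] = ret
        self.saved[call] = (caller_cm, caller_call)

    def inv(self, key):
        # Drop the local entry and forward the invalidation upstream.
        self.cache.pop(key, None)
        if key in self.saved:
            caller_cm, caller_call = self.saved.pop(key)
            if caller_cm is not None:
                caller_cm.inv(caller_call)

cm1, cm3, cm4 = CacheManager("CM1"), CacheManager("CM3"), CacheManager("CM4")
ca = ("read", "k")
# First read(k): S4 records who asked; S3 and S1 cache the result.
cm4.saved["k"] = (cm3, ca)
cm3.save(ca, "v0", cm1, ca)
cm1.save(ca, "v0", None, ca)
# write(k, v): S4's postWrite triggers the invalidation chain.
cm4.inv("k")
assert cm3.cache == {} and cm1.cache == {} and cm4.saved == {}
```

After the chain runs, both caches are empty, which is why the third read(k) misses and repeats the first trace.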

Execution Sequence
(1) C1.get(call("read",k))->None
(2a) S1 call("read",k) to S3
(2b) CM1.Start(call("read",k))
(3) C3.get(call("read",k))->None
(4a) S3 call("read",k) to S4
(4b) CM3.Start(call("read",k))
(5) CM4.Start(call("read",k))
(6) Read(k)
(7a) S4 return ret to S3
(7b) CM4 End(call("read",k),{k},ret)
(8) CM3.Save call("read",k)->ret
(9) C3.set(call("read",k))->ret
(9a) S3 return ret to S1
(9b) CM3 End(call("read",k),{call("read",k)},ret)
(10) CM1.Save call("read",k)->ret
(11) C1.set(call("read",k))->ret
(12) return ret

(13) S1 call("write",k,v) to S2
(14) S2 call("write",k,v) to S4
(15) S4 Write(k,v)
(16) CM4.Inv(k)
(17) CM3.Inv(call("read",k))
(18) C3.delete(call("read",k))
(19) CM1.Inv(call("read",k))
(20) C1.delete(call("read",k))

(21) C1.get(call("read",k))->None
(22) ......(Same as (1)-(12))

Multiple Independent Processes

flowchart LR
    S1-->|W|S2
    S2-->|W|S4
    S1-->|R|S3
    S3-->|R|S4

Traces that are equivalent modulo reordering are omitted.

If, as in the single-process case, S1 holds a cache entry for call("read", k), there are only two possibilities:

  1. If the Inv has already propagated back to S1, the read follows the trace S1 -> S3 -> S4, consistent with the single-process trace above.
  2. If the Inv has not yet propagated back to S1, the read returns the content directly from the cache. (This is still correct: the read is concurrent with the write, so returning the old value is admissible.)
    Write
    ......
    Read(k)
    C1.get(ca)-->ret
    S1 return ret
    ......
    C1.Inv(ca)
    ......

Now consider the case where S1 holds no cache entry for read(k).

  1. In CM4, if Inv(k) arrives before End(read...) is processed, the cache-entry information for k has not yet been recorded in saved, so Inv(k) cannot propagate along S4-->S3-->S1, and CM4 does not send Save information to S3. However, since CM3 has not received any Inv(ca), it still sends Save to CM1.
    (1) C1.get(call("read",k))->None
    (2a) S1 call("read",k) to S3
    (2b) CM1.Start(call("read",k))
    (3) C3.get(call("read",k))->None
    (4a) S3 call("read",k) to S4
    (4b) CM3.Start(call("read",k))
    (5) CM4.Start(call("read",k))
    (6) Read(k)->ret

    (7) S1 call("write",k,v) to S2
    (8) S2 call("write",k,v) to S4
    (9) S4 Write(k,v)

    (1)-(6) and (7)-(9) proceed concurrently and independently of each other.

    (10) Inv(k)
    (11a) CM4 End(call("read",k),{k},ret)
    (11b) S4 return ret to S3

    (12a) S3 return ret to S1
    (12b) CM3 End(call("read",k),{call("read",k)},ret)
    (13) CM1.Save call("read",k)->ret
    (14) C1.set(call("read",k))->ret
    (15) return ret
  2. In CM4, if Inv(k) arrives after End(read...), consider the situation in CM3 (keeping in mind that Save and Inv cannot be reordered on a channel, so Save arrives before Inv at CM3, and similarly at CM1):
    (1) C1.get(call("read",k))->None
    (2a) S1 call("read",k) to S3
    (2b) CM1.Start(call("read",k))
    (3) C3.get(call("read",k))->None
    (4a) S3 call("read",k) to S4
    (4b) CM3.Start(call("read",k))
    (5) CM4.Start(call("read",k))

    (6) S1 call("write",k,v) to S2
    (7) S2 call("write",k,v) to S4

    (8) Read(k)
    (9) S4 Write(k,v)


    (10) CM4 End(call("read",k),{k},ret)
    (11) S4 return ret to S3
    (12) Inv(k)
    • Save-->Inv-->End

      (13) CM3.Save(ca)->ret;C3.set(ca,ret)
      (14) CM3.Inv(ca);C3.delete(ca)
      (15) CM3.End(ca,...)
      (16) S3 return ret to S1
      (17) CM1.End(ca,...)
      (18) return ret

    • End-->Save-->Inv

      (13) CM3.End(ca,...)
      (14) CM3.Save(ca)->ret;C3.set(ca,ret)
      (15) CM3.Inv(ca);C3.delete(ca)
      (16) S3 return ret to S1
      (17) CM1.End(ca,...)
      (18) CM1.Save(ca)->ret;C1.set(ca,ret)
      (19) CM1.Inv(ca);C1.delete(ca)
      (20) return ret
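The role of the FIFO channel assumption in the two orderings above can be illustrated with a small sketch (the message names and `deliver` helper are illustrative, not part of the protocol's API): when Save(ca) and Inv(ca) are delivered in order, the stale entry is created and then removed; if the channel could reorder them, the Inv would find nothing to delete and the stale value would survive.

```python
# Hypothetical illustration of why Save and Inv must not be reordered
# on a channel between cache managers.

def deliver(messages, cache):
    # Apply a sequence of (kind, call, value) messages to a cache dict.
    for kind, ca, val in messages:
        if kind == "save":
            cache[ca] = val
        elif kind == "inv":
            cache.pop(ca, None)
    return cache

ca = ("read", "k")
fifo = [("save", ca, "stale"), ("inv", ca, None)]       # FIFO order
reordered = [("inv", ca, None), ("save", ca, "stale")]  # hypothetical reorder

assert deliver(fifo, {}) == {}                  # stale entry is removed
assert deliver(reordered, {}) == {ca: "stale"}  # stale entry would survive
```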

3. Choose a microservice and provide a few example function calls to simulate which data could be cached, and simulate a cache hit and a cache miss. (You may choose any cache size and eviction policy you like)

Initialization

Cache size: 2
Eviction policy: LRU
Microservice:

classDiagram
class client
class frontend
class pic_manager
class text_manager

class text_database
class pic_database
    
client-->frontend
frontend-->pic_manager
pic_manager-->pic_database
frontend-->text_manager
text_manager-->text_database

All caches and states are empty at the beginning.

Function Sequence

   download_text(name1)
-->download_text(name2)
-->download_pic(name3)
-->uploadpic(name3,pic2)
-->download_text(name2)

Memory States

Only the memory states are simulated.

---
title: Init
---
classDiagram
class client
class frontend
class pic_manager
class text_manager

class text_database
text_database: name1->txt1
text_database: name2->txt2
class pic_database
pic_database: name3->pic1
    
client-->frontend
frontend-->pic_manager
pic_manager-->pic_database
frontend-->text_manager
text_manager-->text_database
---
title: download_text(name1), cache miss
---
classDiagram

class client
client: download_text(name1)->txt1
class frontend
frontend: download_text(name1)->txt1
class pic_manager
class text_manager
text_manager: download_text(name1)->txt1

class text_database
text_database: name1->txt1
text_database: name2->txt2
class pic_database
pic_database: name3->pic1


client-->frontend:download
frontend-->pic_manager
pic_manager-->pic_database
frontend-->text_manager:download
text_manager-->text_database:download
text_database-->text_manager:save
text_manager-->frontend:save
frontend-->client:save
---
title: download_text(name2), cache miss
---
classDiagram

class client
client: download_text(name1)->txt1
client: download_text(name2)->txt2
class frontend
frontend: download_text(name1)->txt1
frontend: download_text(name2)->txt2

class pic_manager
class text_manager
text_manager: download_text(name1)->txt1
text_manager: download_text(name2)->txt2

class text_database
text_database: name1->txt1
text_database: name2->txt2
class pic_database
pic_database: name3->pic1


client-->frontend:download
frontend-->pic_manager
pic_manager-->pic_database
frontend-->text_manager:download
text_manager-->text_database:download
text_database-->text_manager:save
text_manager-->frontend:save
frontend-->client:save
---
title: download_pic(name3), cache miss/eviction
---
classDiagram

class client
client: download_text(name2)->txt2
client: download_pic(name3)->pic1
class frontend
frontend: download_text(name2)->txt2
frontend: download_pic(name3)->pic1

class pic_manager
pic_manager: download_pic(name3)->pic1
class text_manager
text_manager: download_text(name1)->txt1
text_manager: download_text(name2)->txt2

class text_database
text_database: name1->txt1
text_database: name2->txt2
class pic_database
pic_database: name3->pic1


client-->frontend:download
frontend-->pic_manager:download
pic_manager-->pic_database:download
frontend-->text_manager
text_manager-->text_database
pic_database-->pic_manager:save
pic_manager-->frontend:save
frontend-->client:save
---
title: uploadpic(name3,pic2), cache invalidate
---
classDiagram

class client
client: download_text(name2)->txt2
class frontend
frontend: download_text(name2)->txt2

class pic_manager
class text_manager
text_manager: download_text(name1)->txt1
text_manager: download_text(name2)->txt2

class text_database
text_database: name1->txt1
text_database: name2->txt2
class pic_database
pic_database: name3->pic2


client-->frontend:upload
frontend-->pic_manager:upload
pic_manager-->pic_database:upload
frontend-->text_manager
text_manager-->text_database
pic_database-->pic_manager:inv
pic_manager-->frontend:inv
frontend-->client:inv
---
title: download_text(name2), cache hit
---
classDiagram

class client
client: download_text(name2)->txt2
class frontend
frontend: download_text(name2)->txt2

class pic_manager
class text_manager
text_manager: download_text(name1)->txt1
text_manager: download_text(name2)->txt2

class text_database
text_database: name1->txt1
text_database: name2->txt2
class pic_database
pic_database: name3->pic2


client-->frontend
frontend-->pic_manager
pic_manager-->pic_database
frontend-->text_manager
text_manager-->text_database
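The five calls above (three misses, one with an LRU eviction, an invalidation, then a hit) can be reproduced for the frontend cache with a small simulation. The LRUCache class below is an illustrative sketch built on Python's OrderedDict, not MuCache's actual cache:

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity=2):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, call):
        if call in self.data:
            self.data.move_to_end(call)        # mark as most recently used
            return ("hit", self.data[call])
        return ("miss", None)

    def put(self, call, ret):
        self.data[call] = ret
        self.data.move_to_end(call)
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)      # evict least recently used

    def invalidate(self, call):
        self.data.pop(call, None)

frontend = LRUCache(2)
events = []
for call, ret in [(("download_text", "name1"), "txt1"),
                  (("download_text", "name2"), "txt2"),
                  (("download_pic", "name3"), "pic1")]:
    status, _ = frontend.get(call)
    events.append(status)
    if status == "miss":
        frontend.put(call, ret)                # fill on miss

# download_pic(name3) evicted download_text(name1) (capacity 2, LRU).
assert ("download_text", "name1") not in frontend.data
frontend.invalidate(("download_pic", "name3"))  # uploadpic(name3, pic2)
status, ret = frontend.get(("download_text", "name2"))
assert events == ["miss", "miss", "miss"] and (status, ret) == ("hit", "txt2")
```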

Task #2: Coding

Summarize the key components to apply the MuCache protocol, and implement an example to show that the datastore is linearizable. Note: We are not asking you to build and run the open-source version of MuCache (you may use it as a reference). Please implement your own version of MuCache protocol (just a prototype to show the key ideas). You may remove any component you think is not critical in the design, and use simple functions to simulate the behavior of each service/thread in your code.
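As a summary of the key components, the sketch below is a single-threaded prototype (an illustrative assumption of this writeup, not the paper's or the repo's API) combining a datastore, a call-keyed cache, a saved-map for invalidation routing, and a read-through wrapper, and checking the basic linearizability property that a read issued after a completed write observes that write:

```python
# Hypothetical prototype: Store plays the downstream service + datastore,
# cached_read plays the caller-side wrapper.

class Store:
    def __init__(self, data):
        self.data = data
        self.saved = {}   # key -> list of (caller cache, call) to invalidate

    def read(self, key, caller_cache, call):
        # Remember whom to invalidate when key is later written.
        self.saved.setdefault(key, []).append((caller_cache, call))
        return self.data[key]

    def write(self, key, value):
        self.data[key] = value
        for cache, call in self.saved.pop(key, []):
            cache.pop(call, None)    # Inv: drop stale entries upstream

def cached_read(cache, store, key):
    call = ("read", key)
    if call in cache:                # cache hit
        return cache[call]
    ret = store.read(key, cache, call)
    cache[call] = ret                # Save: fill the caller's cache
    return ret

s4 = Store({"k": "v0"})
s3_cache = {}
assert cached_read(s3_cache, s4, "k") == "v0"   # miss, then cached
assert cached_read(s3_cache, s4, "k") == "v0"   # served from cache
s4.write("k", "v1")                             # write invalidates upstream
assert ("read", "k") not in s3_cache
assert cached_read(s3_cache, s4, "k") == "v1"   # read after write sees v1
```

The final assertion is the linearizability check: once the write completes, no cache anywhere can still serve the old value.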

Please refer to the mucache code and its README.