Discussion points for Grappa

No questions for you to answer from this paper. Grappa is a new approach for graph (what the paper calls “vertex centric”) and large-scale data-processing computations. The observation is that since these systems store state in memory distributed across machines, a system that implements a distributed shared memory would provide a more flexible programming model. The key ideas are

1.       Coarse-grained access to distributed shared memory, executed by delegates

2.       Massive multi-threading to hide latency – when a worker (which is a user-level thread) blocks on a remote access, Grappa switches to another worker. This same idea used in GPUs to hide memory latency

3.       Aggregation of messages – since the system can tolerate latency well (the point above), Grappa can afford to delay messages to get more opportunity for aggregation. (This point is not explained that well in the paper.)

The key contributions of this paper are the implementation techniques to achieve the goals above, such as the ability to have tens of thousands of workers and still perform context switches efficiently.

As you read this paper, please keep in mind the difference between Grappa and GraphLab with respect to the programming models and the consistency models.

Discussion questions for Arabesque

This paper provides a distributed programming model for graph mining problems. One thing to keep in mind is the difference between a “pattern” and a “motif” – a pattern is labelled while a motif is not. e.g. A square is a motif while a square with node colors = (red, blue, red, blue) is a pattern.

The key ideas that make this paper practical are discussed in Sections 5.1 and 5.2.

1.       In figure 5, explain why (in a short sentence each)

a.       2,4,3 is not a canonical embedding

b.       1,2,4 is not a canonical embedding

2.       What are the spurious embeddings in Figure 6? (A spurious embeddding is a path of the form (a,b,c) present in the ODAG in Fig 6, but not in the table in Fig 5.) For each of the spurious embeddings, explain how (in a short sentence each) Arabesque eliminates them.