CompSci 141 / CSE 141 / Informatics 101 Winter 2014
This webpage was adapted from Alex Thornton’s offering of CS 141.
Project #2
Due date and time: 11:59pm, Tuesday, February 4, 2014.
Late homework will not be accepted.
Part I: GC Simulator
(80 points)
Introduction
As we discussed in class, a garbage collector can automatically determine whether an object is reachable and reclaim unreachable objects. We have discussed two types of GC: mark/sweep and copying. The goal of this project is to implement a simulator for a mark/sweep GC in Java.
What does a mark/sweep GC do?
Simulating a Mark/Sweep GC in Java
Implementing a real GC requires identifying all stack variables and object references, which is very difficult to achieve without modifying a JVM. In this project, we are not implementing a real GC. Instead, we are going to implement the mark/sweep algorithm in a simulated environment. For a given Java program (i.e., a test case), the simulated GC should be able to scan the objects created in the program and report those that are unreachable and should be deallocated.
1. Building a simulation library
The first step is to build a library that makes all the heap references (pointers) available to us. Suppose our target Java program only supports the following four types of instructions:
In the simulated environment, we need our own model of each type of instruction. To do this, we create a class called GCSimulator, and inside this class, we create a static method for each type of instruction to model its effect. The skeleton of the class is as follows:
class GCSimulator {
    …

    // aName and bName represent the variable names "a" and "b", respectively;
    // o denotes the object referenced by a and b
    static void assign(String aName, String bName, Object o) {…}

    // aName is the name of variable "a", and o represents the newly created object
    static void createObject(String aName, Object o) {…}

    // bName and aName represent the two variables "b" and "a",
    // and o denotes the heap object referenced by b
    static void readObject(String bName, String aName, Object o) {…}

    // aName and bName represent the two variables "a" and "b",
    // fieldname denotes the name of the field f,
    // oa denotes the heap object referenced by a,
    // and ob denotes the heap object referenced by b
    static void writeObject(String aName, String bName, String fieldname, Object oa, Object ob) {…}

    // a gc method that will be explicitly called
    static void gc() {…}
}
In a test Java program, for each statement, we (manually) add a call to a simulation method to capture its effect. For instance, a modified Java program looks like the following:
class Test {
    public static void main(String args[]) {
        A a = new A();
        GCSimulator.createObject("a", a);
        B b = new B();
        GCSimulator.createObject("b", b);
        B c = b;
        GCSimulator.assign("c", "b", b);
        c.f = a;
        GCSimulator.writeObject("c", "a", "f", c, a);
        P p = a.m;
        GCSimulator.readObject("p", "a", p);
        GCSimulator.gc();
    }
}
In our GC simulator, we use string names such as “a”, “b”, “c”, “d” to represent stack variables. Our assumption is that different variables in a method must have different names. These strings help us identify the roots for a GC traversal. A reference can be simulated by using a hash map. In a Java program, there are two types of references: (1) a stack-heap reference: a stack variable containing a pointer that points to a heap object; and (2) a heap-heap reference: a field of an object containing a pointer that points to another heap object.
To simulate type 1 references, we can create a hash map stackRef in GCSimulator; each entry in the hash map is a pair of a string (representing a stack variable) and an object (pointed to by the variable). For example, in method assign, we add the pair <aName, o> to stackRef; in method readObject, <bName, o> should be added to stackRef.
To simulate type 2 references, we create another hash map heapRef in GCSimulator; each entry is an <Object, Map<String, Object>> pair. For each object o, heapRef maps it to another map, which, in turn, maps each field name to an object. When a field write is seen (i.e., writeObject), we need to perform the following map update: ((Map) heapRef.get(oa)).put(fieldname, ob), which replaces the old object referenced by the field “fieldname” of oa with the new object ob. By appropriately implementing each simulation method, we can simulate all real reference relationships in our environment, which will be used later to perform reachability analysis.
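The two maps described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the required solution: the class name RefMapsSketch is hypothetical, the use of IdentityHashMap (so that keys are compared with == rather than equals()) is a design choice the assignment does not prescribe, and computeIfAbsent is used so that the first write to an object creates its field map.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical sketch of the two reference maps; names other than
// stackRef, heapRef and the method signatures are assumptions.
class RefMapsSketch {
    // Type 1 references: stack variable name -> heap object it points to.
    static final Map<String, Object> stackRef = new HashMap<>();

    // Type 2 references: heap object -> (field name -> heap object).
    // Assumption: IdentityHashMap keeps two distinct objects apart even
    // if they happen to be equals() to each other.
    static final Map<Object, Map<String, Object>> heapRef = new IdentityHashMap<>();

    // Models an assignment: the variable named aName now points to o.
    static void assign(String aName, String bName, Object o) {
        stackRef.put(aName, o);
    }

    // Models a field read: the variable named bName now points to o.
    static void readObject(String bName, String aName, Object o) {
        stackRef.put(bName, o);
    }

    // Models a field write: field fieldname of oa now references ob,
    // replacing whatever it referenced before.
    static void writeObject(String aName, String bName, String fieldname,
                            Object oa, Object ob) {
        heapRef.computeIfAbsent(oa, k -> new HashMap<>()).put(fieldname, ob);
    }
}
```

A plain HashMap keyed by Object would also match the description above; the identity-based variant only matters when a test class overrides equals() or hashCode().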
2. Reachability analysis
In this project, the simulated GC does not run automatically. The programmer needs to explicitly call a library method GCSimulator.gc() to invoke our GC. We need to implement this gc method. The first step is to identify root objects. In our environment, they are the objects pointed to by stack variables. We can easily find these objects by traversing the values of the map stackRef. Next, we should develop a graph traversal algorithm that iteratively identifies transitively reachable objects by chasing references (in the map heapRef).
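The traversal described above can be sketched as a worklist algorithm. This is a minimal sketch under assumptions: the class name ReachabilitySketch and the method name mark are hypothetical, the maps are passed in as parameters purely to keep the example self-contained, and the visited set is identity-based so that a cycle in the object graph cannot cause an endless loop.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the mark phase; not the required implementation.
class ReachabilitySketch {
    // Returns every object reachable from the roots held in stackRef.
    static Set<Object> mark(Map<String, Object> stackRef,
                            Map<Object, Map<String, Object>> heapRef) {
        // Identity-based set of objects already visited.
        Set<Object> marked =
            Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
        Deque<Object> worklist = new ArrayDeque<>();

        // Roots: objects pointed to by stack variables (null means no object).
        for (Object root : stackRef.values()) {
            if (root != null && marked.add(root)) {
                worklist.push(root);
            }
        }

        // Chase heap-heap references until no new object is discovered.
        while (!worklist.isEmpty()) {
            Object current = worklist.pop();
            Map<String, Object> fields = heapRef.get(current);
            if (fields == null) {
                continue; // no field of this object was ever written
            }
            for (Object target : fields.values()) {
                if (target != null && marked.add(target)) {
                    worklist.push(target);
                }
            }
        }
        return marked;
    }
}
```

Because marked.add returns false for an object that is already in the set, each object is pushed onto the worklist at most once.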
3. Identifying unreachable objects
Any object encountered in the reachability analysis is a reachable object. Our goal is to identify and print unreachable objects. This requires us to use a set to store all objects created during the execution. For example, every time we see an object allocation (in method createObject), we add the newly created object o to the set. When GC is invoked, we traverse this set and report objects that haven’t been encountered in the reachability analysis.
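The bookkeeping described above amounts to a set difference: all allocated objects minus the reachable ones. The following sketch illustrates that step only; the names SweepSketch, allObjects, recordAllocation and unreachable are hypothetical, and the set of reachable objects is assumed to come from a separate reachability analysis.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of tracking allocations and finding unreachable objects.
class SweepSketch {
    // Every object seen by createObject; identity-based (an assumption).
    static final Set<Object> allObjects =
        Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());

    // Would be called from createObject for each newly created object.
    static void recordAllocation(Object o) {
        allObjects.add(o);
    }

    // Returns the allocated objects that the reachability analysis did not visit.
    static List<Object> unreachable(Set<Object> marked) {
        List<Object> result = new ArrayList<>();
        for (Object o : allObjects) {
            if (!marked.contains(o)) {
                result.add(o);
            }
        }
        return result;
    }
}
```

Note that the simulator itself keeps every allocated object alive through allObjects, so the real JVM collector never frees them; the simulator only reports what a mark/sweep GC would reclaim.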
4. Output of the simulator
The GC simulator should be available as a library that provides the aforementioned methods. To evaluate each project, we will create a set of test programs with calls to your library methods (including the GC method). After each GC runs, the simulator should print unreachable objects in the following format:
GC#1:
The following objects are unreachable:
Object …,
Object …,
…
GC#2:
The following objects are unreachable:
Object …,
Object …,
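One way to produce a report in this shape is to keep a counter of GC runs and build the text from the list of unreachable objects. The sketch below is an assumption-laden illustration: the class name GcReportSketch and the method format are hypothetical, and since the format above does not say what follows the word "Object", the sketch simply appends each object's toString() result as a placeholder.

```java
import java.util.List;

// Hypothetical sketch of the report formatting; the text printed after
// "Object " is an assumption, since the specification elides it.
class GcReportSketch {
    // Number of GC runs so far, used for the "GC#n:" header.
    private static int gcCount = 0;

    // Builds the report for one GC run from the list of unreachable objects.
    static String format(List<Object> unreachable) {
        gcCount++;
        StringBuilder sb = new StringBuilder();
        sb.append("GC#").append(gcCount).append(":\n");
        sb.append("The following objects are unreachable:\n");
        for (Object o : unreachable) {
            sb.append("Object ").append(o).append(",\n");
        }
        return sb.toString();
    }
}
```

A gc method could then print the result with System.out.print(GcReportSketch.format(list)).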
Part II: Call-by-value and call-by-reference
(20 points)
Consider the following program in Java:
class T {
    A foo(B c, A d) {
5       c = new B();        // The address of this object is 0x00000037
6       c.intField = 23;
7       d.bField = c;
8       return d;
    }
}
class A {
    public B bField;
}
class B {
    public int intField;
    public static void main(String[] args) {
1       T t = new T();      // Suppose the heap address of the object is 0x00000012
2       A a = new A();      // The address of this object is 0x00000025
3       B b = null;
4       A m = t.foo(b, a);
9       System.out.println(a.bField.intField);
        return;
    }
}
1. We all know that Java is a call-by-value language. Write down the values of the variables t, a, b, m, c, d at each indexed program point as well as the final output (i.e., the value printed in line 9).
2. Now let’s pretend that Java is a call-by-reference language. Write down the values of these six variables at each program point as well as the final output.
An example solution:
1 t: 0x00000012, a: ?, b: ?, m: ?, c: ?, d: ?, final output: ? (? means the value is unknown.)
2 t: 0x00000012, a: 0x00000025, b: ?, m: ?, c: ?, d: ?
3 …
4 …
5 …
6 …
7 …
8 …
9 …
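As background for question 1, the small experiment below shows what call-by-value means for object references in Java: the method receives a copy of the reference, so reassigning the parameter is invisible to the caller, while writing through the parameter to the shared object is visible. The classes IntCell and CallByValueDemo are hypothetical and are not part of the assignment program.

```java
// Hypothetical helper class holding a single int field.
class IntCell {
    int value;
}

class CallByValueDemo {
    // Reassigns the parameter only; the caller's variable is unaffected.
    static void reassign(IntCell p) {
        p = new IntCell();
        p.value = 99;
    }

    // Writes through the reference; the caller sees the change.
    static void mutate(IntCell p) {
        p.value = 42;
    }

    public static void main(String[] args) {
        IntCell cell = new IntCell();
        reassign(cell);
        System.out.println(cell.value); // prints 0
        mutate(cell);
        System.out.println(cell.value); // prints 42
    }
}
```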
Deliverables
Create a zip file with your solution to Part I (a zip file containing your .java and .class files, as well as a readme telling the TA how to run it) and the text file for Part II. Follow this link for a discussion of how to submit your document. Remember that we do not accept paper submissions of your assignments, nor do we accept them via email under any circumstances.
Adapted from a similar document from CS 141 Winter 2013 by Alex Thornton.