Program Understanding
Software engineers commonly encounter scenarios where they are asked to
work with unfamiliar software systems that are too large in scope for
them to fully comprehend within the time given to complete a requested
task. The ability to quickly locate the relevant code and safely
perform the required modifications is a crucial skill. In this
laboratory, we will revisit the Saros project, using it as a sample
large-scale project within which basic source code comprehension will
be necessary for completing the given exercises.
Problem Areas
Note: To ensure that line numbers and referenced
code matches, use the Saros code
snapshot
available from the class
svn repository.
For this lab we'll bring together two previous visited topics: Saros
and code smells. Specifically, we will be sniffing out bad code
smells within the Saros code that will help us simulate the process of
diving into a large and unfamiliar code base and determining how to go
about modifying the code to fix a given issue. Saros, like nearly
any significant body of code developed over a non-trivial amount of
time, has a few warts. This part of the lab will contain
exercises that involve finding given code smells and proposing the
changes necessary to remove the issue.
Duplicate Code
Code that has been copy/pasted into multiple locations in a system
causes massive maintenance headaches. Future changes to one
instance of a duplication almost always should
be made to the other
location as well, but often are not. This oversight is frequently
made because the person making the change does not know that the
duplication exists elsewhere, or because they just plain
forgot. Instead, the "once and only once" principle should be
applied, namely: behaviors should be declared in the code only
once. This principle is violated multiple places in the Saros
code, so we will take a look at an example.
The following code exists in both the "OutgoingSessionNegotiation" and
"OutgoingProjectNegotiation" classes (lines 372-473 and 241-342
respectively) within the "de.fu_berlin.inf.dpp.invitation" package:
}
public void localCancel(String errorMsg, CancelOption cancelOption) {
if (!cancelled.compareAndSet(false, true))
return;
log.debug("Inv" + Utils.prefix(peer) + ": localCancel: " + errorMsg);
if (monitor != null)
monitor.setCanceled(true);
cancellationCause = new LocalCancellationException(errorMsg,
cancelOption);
}
/**
* Cancels the invitation process based on the exception stored in
* {@link #cancellationCause}. This method is always called by this local
* object, so even if an another object "cancels" the invitation (
* {@link #localCancel(String, CancelOption)}, {@link #remoteCancel(String)}
* ), the exceptions will be thrown up on the stack to the caller of
* {@link #start(SubMonitor)}, and not to the object which has "cancelled"
* the process. The cancel methods (
* {@link #localCancel(String, CancelOption)}, {@link #remoteCancel(String)}
* ) do not cancel the invitation alone, but they set the
* {@link #cancellationCause} and cancel the {@link #monitor}. Now it is the
* responsibility of the objects which use the {@link #monitor} to throw a
* {@link SarosCancellationException} (or it's subclasses), which will be
* caught by this object causing a call to this method. If this does not
* happen, the next {@link #checkCancellation(CancelOption)} cancels the
* invitation.
*/
protected void executeCancellation() throws SarosCancellationException {
log.debug("Inv" + Utils.prefix(peer) + ": executeCancellation");
if (!cancelled.get())
throw new IllegalStateException(
"executeCancellation should only be called after localCancel or remoteCancel!");
String errorMsg;
String cancelMessage;
if (cancellationCause instanceof LocalCancellationException) {
LocalCancellationException e = (LocalCancellationException) cancellationCause;
errorMsg = e.getMessage();
switch (e.getCancelOption()) {
case NOTIFY_PEER:
transmitter.sendCancelInvitationMessage(peer, errorMsg);
break;
case DO_NOT_NOTIFY_PEER:
break;
default:
log.warn("Inv" + Utils.prefix(peer)
+ ": This case is not expected here.");
}
if (errorMsg != null) {
cancelMessage = "Invitation was cancelled locally"
+ " because of an error: " + errorMsg;
log.error("Inv" + Utils.prefix(peer) + ": " + cancelMessage);
monitor.setTaskName("Invitation failed. (" + errorMsg + ")");
} else {
cancelMessage = "Invitation was cancelled by local user.";
log.debug("Inv" + Utils.prefix(peer) + ": " + cancelMessage);
monitor.setTaskName("Invitation has been cancelled.");
}
} else if (cancellationCause instanceof RemoteCancellationException) {
RemoteCancellationException e = (RemoteCancellationException) cancellationCause;
errorMsg = e.getMessage();
if (errorMsg != null) {
cancelMessage = "Invitation was cancelled by the remote user "
+ " because of an error on his/her side: " + errorMsg;
log.error("Inv" + Utils.prefix(peer) + ": " + cancelMessage);
monitor.setTaskName("Invitation failed.");
} else {
cancelMessage = "Invitation was cancelled by the remote user.";
log.debug("Inv" + Utils.prefix(peer) + ": " + cancelMessage);
monitor.setTaskName("Invitation has been cancelled.");
}
} else {
log.error("This type of exception is not expected here: ",
cancellationCause);
monitor.setTaskName("Invitation failed.");
}
sarosSession.returnColor(this.colorID);
invitationProcesses.removeInvitationProcess(this);
throw cancellationCause;
}
/**
* Checks whether the invitation process or the monitor has been cancelled.
* If the monitor has been cancelled but the invitation process has not yet,
* it cancels the invitation process.
*
* @throws SarosCancellationException
* if the invitation process or the monitor has already been
* cancelled.
*/
protected void checkCancellation(CancelOption cancelOption)
throws SarosCancellationException {
if (cancelled.get()) {
log.debug("Inv" + Utils.prefix(peer) + ": Cancellation checkpoint");
throw new SarosCancellationException();
That's a lot of
duplication! Instead, the commonality should be moved into an
abstraction layer. Specifically for this case a new abstract base
class should be created to contain the cancellation logic. Namely
an "OutgoingCancellableNegotiation" which contains the duplicated
code. The two classes that previously contained the duplicate
code should then inherit from this new base class. By merging the
duplicated code into a single definition we can ensures future changes
that are required within this block of code will be automatically
applied to both places within the system that this functionality is
needed. Whenever possible the burden of work should be placed
earlier in the development process instead of delegating it to later
maintainers.
Exercise
Locate a similar chunk of duplicated code within the
"de.fu_berlin.inf.dpp.invitation" package. Describe in detail the
refactoring that should be performed to remove this duplication.
Long Method
Each method within a class should be concise and easily
understandable. Long methods work against this principle and make
it less clear what the method is suppossed to do. Clarity is king
when developing maintainable code, so many software developers operate
under the general design guideline (but not a hard and fast rule) that
a method should fit entirely on screen without scrolling. Saros
does a pretty good job of adhering closely to this goal, but there are
some methods that are a bit long. For instance, the "start"
method in the "ActivitySequencer" class (found in the
"de.fu_berlin.inf.dpp.net.internal" package) weighs in at 121 lines of
code. Digging deeper into how the method works we can see that
most of this is due to the method containing an anonymous "TimerTask"
subclass. Anonymous subclasses are good for classes that can be
extended simply, but in this case many lines of code are needed to
express the desired operation of the subclass. A separate
internal class would be better here. Indeed, by creating a
private internal class within "ActivitySequencer" it becomes clearer
what "start" does and also allows for a more natural separation of what
the "TimerTask" will do (something currently lacking even the most
basic comment documentation).
Exercise
Locate a method that is too large within the "Saros" class in the
"de.fu_berlin.inf.dpp" package. Describe in detail the
refactoring that should be done to improve the clarity of the code.
Large Class
As new functionality is added to a system, it is natural for the
underlying classes to grow as well. However, at some point it
makes sense to take a step back and determine if the class is doing too much and should be partitioned
so that it focuses on solving a primary concern. This principle
of
"separation of concerns" is a crucial concept in improving software
maintainability. Looking at the Saros code, we can find several
classes that have grown quite large. Take a look at the "Utils"
class in the "de.fu_berlin.inf.dpp.util" package. This class has
grown to over 1100 lines, far too large to quickly understand and work
with easily. It clearly has become a dumping ground for general
purpose static methods, we can help improve the clarity here by
refactoring the structure and adding additional subcategories of
utilities (indeed, the "de.fu_berlin.inf.dpp.util" package already
contains some such subcategories of utilities). Looking at the
current contents of the "Utils" class there are several groups of
methods that could be moved into their own class. One example
would be the many "runner" methods (e.g.: "runSafeSync",
"runSafeAsync", etc.). Moving these into their own "RunnerUtils"
class would help reduce the massive scope of the "Utils" class nicely.
Exercise
Locate a class that is too large within the Saros source code.
Describe in detail the refactoring that should be done to improve the
structure and clarity of the code.
Hint: The following command line
incantation can help find large classes by listing the 20 largest Java
source code files within the current working directory and
subdirectories
(see the previous lab on command line utilities
for a reminder of how this command works).
find . -name '*.java' | xargs wc | sort -n | tail -n20
Cross-cutting Concerns
Up until now in this lab, we have focused our code comprehension and
suggested changes on a mostly local scope affecting only a single or
handful of classes. However, often desired new functionality in a
system will require more extensive changes to the code. This
section will introduce such a scenario for the Saros project and ask
you to craft a design that meets the desired requirements while
simultaneously improving the maintainability of the codebase.
Your solution will require you to approach the problem in three steps:
- Locate and understand the relevant code as it is currently
implemented
- Propose a refactoring plan that will enable the change
- Propose the required additions that will need to be made given
the proposed refactoring
Exercise
As currently implemented, the Saros project is dependent upon using an
XMPP (also known as Jabber) server for all the backend
communications. In our hypothetical situation, a client
has expressed interest in using a different backend communications
infrastructure: IRC (Internet Relay Chat). They find this more
desirable than XMPP because many of them already have IRC
infrastructure readily accessible to them and do not want to go to the
trouble of deploying their own XMPP service. Your task:
- At a high level, describe how the current backend communication
system works in the Saros codebase
- Describe what refactoring needs to be performed to provide the
level of abstraction necessary to support additional communication
backends
- Describe the necessary additions to hook into the refactored code
proposed above
In your solution be sure to:
- include an overview of the new abstractions that are required to
provide a unified interface to the non-communication aspects of the
Saros project
- describe how this abstraction affects the structure of the Saros
source code
- outline what packages and classes will be affected
- provide details about what new classes will be needed
Reference materials:
Developed with guidance from
Miryung Kim
Copyright
(c) 2011
Professor Miryung Kim and a course development TA Evan Grim
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.1 or
any later version published by the Free Software Foundation; with no
Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
A copy of the license can be found on the GNU web site here.