Homework 5. Scheme code difference analyzer

The problem

Your employer Tortuous Computer Processing Inc. (TCP) is suing Unicorn Data Prospector Inc. (UDP), claiming that UDP stole large bodies of TCP's code and incorporated it into their data mining product. As part of the legal discovery process, TCP has obtained copies of UDP's data miner and wants to compare it to TCP's to find evidence of unauthorized copying. About 5% of both data miners are written in Scheme. Your team has been assigned the job of comparing the Scheme parts.

Your boss suggested that you prototype a procedure compare-expr that compares two Scheme expressions x and y, and produces a difference summary of where the two expressions are the same and where they differ. Your boss wants the difference summary to be easily checkable, in case there is a bug in compare-expr itself. So you decide to have the difference summary also be a Scheme expression which, if executed in an environment where the Scheme variable TCP is true, has the same behavior as x, and otherwise has the same behavior as y.

To keep things simple your prototype need not handle arbitrary Scheme expressions; it can be limited to the Scheme subset that consists of constant literals, variable references, procedure calls, the special form (quote datum), the special form (lambda formals body) where body consists of a single expression, the special form (let bindings body) where body consists of a single expression, and the special-form conditionals (if expr expr expr) and (if expr expr). To avoid confusion the input Scheme expressions cannot contain the symbol TCP anywhere. Your prototype need not check that its inputs are valid; it can have undefined behavior if given inputs outside the specified subset.

Assignment

First, write a Scheme procedure (compare-expr x y) that implements the specification described above. Your implementation must be free of side effects; for example you cannot use the set! special form. Returned values should share storage with arguments when possible; they should not copy their arguments.

The output expression should use if expressions to represent differences whenever the two input expressions disagree, attempting to minimize the size of the subexpressions under the generated ifs. As a special case, it should use TCP to represent a subexpression that is #t in TCP's version and #f in UDP's version, and should use (not TCP) to represent the reverse situation. You need not worry about cleverly comparing lambda or let expressions if their variables differ in any way. Here are some examples and what they should evaluate to.

(compare-expr 12 12)  ===>  12
(compare-expr 12 20)  ===>  (if TCP 12 20)
(compare-expr #t #t)  ===>  #t
(compare-expr #f #f)  ===>  #f
(compare-expr #t #f)  ===>  TCP
(compare-expr #f #t)  ===>  (not TCP)
(compare-expr 'a '(cons a b))  ===>  (if TCP a (cons a b))
(compare-expr '(cons a b) '(cons a b))  ===>  (cons a b)
(compare-expr '(cons a b) '(cons a c))  ===>  (cons a (if TCP b c))
(compare-expr '(cons (cons a b) (cons b c))
              '(cons (cons a c) (cons a c)))
  ===> (cons (cons a (if TCP b c)) (cons (if TCP b a) c))
(compare-expr '(cons a b) '(list a b))  ===>  ((if TCP cons list) a b)
(compare-expr '(list) '(list a))  ===>  (if TCP (list) (list a))
(compare-expr ''(a b) ''(a c))  ===>  (if TCP '(a b) '(a c))
(compare-expr '(quote (a b)) '(quote (a c)))  ===>  (if TCP '(a b) '(a c))
(compare-expr '(quoth (a b)) '(quoth (a c)))  ===>  (quoth (a (if TCP b c)))
(compare-expr '(if x y z) '(if x z z))  ===>  (if x (if TCP y z) z)
(compare-expr '(if x y z) '(if x y))  ===>  (if TCP (if x y z) (if x y))
(compare-expr '(if x y z) '(g x y z))
  ===> (if TCP (if x y z) (g x y z))
(compare-expr '(let ((a 1)) (f a)) '(let ((a 2)) (g a)))
  ===> (let ((a (if TCP 1 2))) ((if TCP f g) a))
(compare-expr '(+ #f (let ((a 1) (b 2)) (f a b)))
              '(+ #t (let ((a 1) (c 2)) (f a c))))
  ===> (+
        (not TCP)
        (if TCP
            (let ((a 1) (b 2)) (f a b))
            (let ((a 1) (c 2)) (f a c))))
(compare-expr '((lambda (a) (f a)) 1) '((lambda (a) (g a)) 2))
  ===> ((lambda (a) ((if TCP f g) a)) (if TCP 1 2))
(compare-expr '((lambda (a b) (f a b)) 1 2)
              '((lambda (a b) (f b a)) 1 2))
  ===> ((lambda (a b) (f (if TCP a b) (if TCP b a))) 1 2)
(compare-expr '((lambda (a b) (f a b)) 1 2)
              '((lambda (a c) (f c a)) 1 2))
  ===> ((if TCP (lambda (a b) (f a b))
		(lambda (a c) (f c a)))
	1 2)

(Note that the Racket read-eval-print loop quotes its results unless they are self-quoting, so that, for example, although 12 prints as itself, (if TCP 12 20) prints as '(if TCP 12 20).)

Second, write a Scheme procedure (test-compare-expr x y) that tests your implementation of compare-expr by using eval to evaluate the expression x in the (rnrs base (6)) library, and to evaluate the expression returned by (compare-expr x y) in the same context except with TCP bound to #t, and which checks that the two expressions yield the same value. Similarly, it should check that y in the base library evaluates to the same value that the output of compare-expr evaluates to with TCP bound to #f. The test-compare-expr function should return a true value if both tests succeed, and #f otherwise.

Third, define two Scheme variables test-x and test-y that contain data that can be interpreted as Scheme expressions that test compare-expr well. Your definitions should look like this:

(define test-x '(+ 3 (let ((a 1) (b 2)) (list a b))))
(define test-y '(+ 2 (let ((a 1) (c 2)) (list a c))))

except that your definitions should attempt to exercise all the specification in order to provide a single test case for this complete assignment.

Racket vs Scheme

The Racket language is not exactly the same as Scheme; for example, Racket does not have two-argument if. To run standard R6RS Scheme, put a #!r6rs line at the start of your source file. For more, please see Running Top-Level Programs.

Submit

Submit a file compare-expr.ss containing the definitions of compare-expr, test-compare-expr, test-x, and test-y along with any other auxiliary definitions needed. Make sure that your definitions work with racket, the Scheme implementation installed on SEASnet.


© 2015 Paul Eggert. See copying rules.
$Id: hw5.html,v 1.45 2015/11/12 15:54:55 eggert Exp $