Prasad Ram, ``A New Understanding of Greed'', Ph.D. Dissertation, UCLA Computer Science Department, June 1993.
Anyone who has read traditional presentations of what greedy solvable problems are comes away with a sense of confusion, disbelief, or both. The pervasiveness of greedy algorithms is due to their simplicity. It is hard to believe that a complex theory (e.g., matroids, greedoids) underlies the class of optimization problems that can be solved using greedy algorithms, and it seems impossible to reconcile one's intuitive understanding with matroids, polymatroids, greedoids, submodularity, etc.
Our chief contribution is to provide a new and simple characterization of greed. This is an important problem with applications throughout computer science, and befitting its prominence, it has been worked on by many researchers. Our characterization of greed is a radical departure from the other characterizations based on matroids, polymatroids, greedoids, and submodular functions, and subsumes all of these existing formalisms.
Greedy algorithms for optimization problems are typically iterative and generate a sequence as an answer. Each step of the iteration uses the prefix sequence generated up to the previous iteration to compute the next element of the sequence. We define a very useful partial order relation on sequences called majorization and immediately see the intimate relationship between greed and majorization. We define lower triangular stochastic matrices (LTSMs) and relate them to majorization, showing that a sequence x is majorized by y if y = xL where L is a lower triangular stochastic matrix. An LTSM can also be viewed as a transformer: one can start with a feasible answer, and by repeated applications of LTSMs obtain successively better answers. Order-preserving maps on sequences called Schur-convex functions are defined, and necessary and sufficient conditions for a function to be Schur-convex are identified. The work on defining and developing concepts related to majorization is important in itself, beyond its applicability to characterizing greed.
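The majorization order can be made concrete. Below is a minimal Python sketch using the classical Marshall-Olkin definition on real sequences (equal totals, dominating prefix sums after sorting in decreasing order); the dissertation's LTSM characterization is a variant of this order, so the function name and tolerance here are illustrative assumptions, not the dissertation's construction.

```python
def majorized_by(x, y, tol=1e-9):
    """Check whether x is majorized by y in the classical
    Marshall-Olkin sense: the sequences have equal totals, and
    every prefix sum of x (sorted in decreasing order) is at most
    the corresponding prefix sum of y."""
    if len(x) != len(y):
        return False
    xs = sorted(x, reverse=True)
    ys = sorted(y, reverse=True)
    px = py = 0.0
    for a, b in zip(xs, ys):
        px += a
        py += b
        if px > py + tol:  # prefix-sum dominance violated
            return False
    return abs(px - py) <= tol  # totals must agree

# Example: (1,1,1) is majorized by (2,1,0), which in turn is
# majorized by (3,0,0) -- the order measures how "spread out"
# a sequence is.
```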
An optimization problem is greedy solvable if the objective function is Schur-convex on the set of feasible answers when the majorization relation is a topped partial order. This is the first comprehensive definition of greedy solvability. The classical properties of greedy solvable problems (viz. the greedy choice property and the optimal substructure property) can be interpreted in terms of majorization and Schur-convex functions, which reconciles our intuitive understanding of greed with our characterization. This definition of greedy solvability creates exciting new possibilities for studying optimization problems in other contexts, such as `greedy heuristics' and `dynamic programming problems'.
S-T. Huang, ``Dual M-Sets: The Theoretical Framework and Applications'', Ph.D. Dissertation, Technical Report, UCLA Computer Science Department, December 1993.
A new framework, the dual-M-set, is proposed both as a means for structuring certain kinds of programs, and as a means for reasoning about aspects of software practice such as programming by patterns.
The framework is based upon the `rigid body' metaphor in physics, an outlook which models information as the result of operator applications (measurements) on immutable objects (rigid bodies). This framework takes monoids as points and vectors, and structure-preserving homomorphisms as linear operators. Order-preserving endomorphisms (in particular, induction morphisms) are also included as basic constructs. The conceptual framework treats both the objects and their functions as first-class citizens. Instead of taking a conventional approach using category theory, we emphasize the local, though universally uniform, analysis of the construction and interaction of subalgebraic closures, homomorphisms, and order-preserving morphisms.
We apply the framework to model, comprehensively, two different systems: the Theory of Lists (also known as the Bird-Meertens formalism), for specification and systematic derivation of programs; and the Lambek Calculus, a deductive system and formal foundation for linguistic applications in Categorial Grammars and formal semantics. Beyond the modeling and new proofs of old results, we also develop extensions inspired by the perspectives that accompany the dual-M-set modeling.
As the rigid body metaphor is very basic in the modeling of information, monoid structures are prevalent in current mathematical program structures. The framework of dual-M-sets aspires to a mathematical program development methodology exploiting this basic metaphor, in keeping with the common long-term aim of developing programs as mathematical objects susceptible to mathematical analysis and manipulation. With the encouraging results from its applications to the two examples, we believe that the framework of dual-M-sets is a powerful and promising tool with wide applicability.
Ping-Yu Hsu, ``Incorporating Context Into Databases'', Ph.D. Dissertation, UCLA Computer Science Department, September 1995.
In this dissertation, we study the incorporation of context into databases. Contexts are viewed as ``databases'', have names, and can be treated as values. As databases, contexts contain both schema and data associated with the schema. Treating contexts as databases allows each context to contain independent information and to be updated independently. With names, a context can be referred to by other contexts. Such a design allows users to specify relations between contexts. As values, the contents or identifiers of contexts can be returned as query values and can be used to form queries. No special syntax is required to query contexts.
With these features, we show that databases which can store sets of contexts can handle information that cannot be properly stored by conventional databases. Three directions for incorporating context into databases are shown: contexts as name spaces, contexts as nodes in a specially-designed semantic index system, and contexts as operands of generalized quantifiers. Each of these directions produces a significant contribution:
- Treating name spaces as contexts inspires the design of a system that is effective for storing evolving and irregular information. The system can also be applied to the WWW, providing a possible way to perform distributed set-oriented search on HTML pages.
- Treating nodes as contexts in semantic index systems leads to the design of a query language that allows users to consult any subset of the index systems. Such systems can guide neophyte users' searches while allowing experienced users to search as if there were no index.
- Treating generalized quantifiers as predicates of contexts allows us to implement a useful subset of generalized quantifiers. The implementation discussed here works not only with a contextual DBMS but also with an ordinary DBMS.
A. Bostani, CFC: An Efficient Stream-Processing Environment, M.S. Thesis, CSD-920099, UCLA Computer Science Department, March 1992.
This thesis describes CFC, an efficient stream-processing environment. Stream-processing applications are developed using F*, a new programming language which combines logic programming, rewriting, and lazy evaluation. Our primary focus in this work is to develop an environment for the efficient execution of F* programs and, where necessary, to provide extensions to the F* language itself. In the course of this research, we have developed a compiler that translates a class of F* programs called DF* into instructions for an abstract machine, called DFAM. We show that it is possible to directly translate DFAM programs into C programs which are extremely portable and efficient. CF* is introduced as a novel extension to the C programming language, providing a non-deterministic function call mechanism. We show that general F* programs can be compiled into an extension of DFAM, called FAM. A compiler is described which compiles FAM programs into CF*. Furthermore, we show that it is possible to efficiently implement non-deterministic control structures, such as those found in CF*, on conventional machine architectures. Finally, we have extended F* to make it suitable as a general-purpose programming language. Unlike the early implementations of F*, this extended F* programming environment does not rely on the availability of Prolog as a host environment.
M. Coleman, Aesthetics-based Graph Layout for Human Consumption, M.S. Thesis, Technical Report CSD-930004, UCLA Computer Science Department, March 1993.
M. Coleman, D.S. Parker, ``Aesthetics-based Graph Layout for Human Consumption'', UCLA Computer Science Department, June 1993. Submitted to Software -- Practice and Experience. Abstract will appear in the Proceedings of Graph Drawing '93, ALCOM International Workshop on Graph Drawing and Topological Graph Algorithms, Paris, September 26--29, 1993.
Automatic graph layout is an important and long-studied problem in computer science. The basic straight-edge graph layout problem is: Given a graph, position the vertices in a way which maximizes some measure of desirability. When graph layout is intended for human consumption, we call this measure of desirability an aesthetic. We seek an algorithm which produces graph layouts of high aesthetic quality, and which handles trees, directed acyclic graphs, and general graphs.
Our approach is to model graph layout as a multiobjective optimization problem, where the value of a layout is determined by a user-controlled set of layout aesthetics. We justify this model theoretically, and describe our (AGLO) algorithm and its implementation, the aglo library.
The AGLO algorithm combines the power and flexibility of the simulated annealing approach of Davidson and Harel (1989) with the relative speed of the method of Fruchterman and Reingold (1991), and provides a better theoretical foundation for these methods. In addition, we have developed several new layout aesthetics to support new layout styles. Using these aesthetics, we are able to produce pleasing displays for graphs on which these other methods flounder.
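A cost-minimization layout of this kind can be sketched in a few lines. The Python fragment below illustrates only the multiobjective idea (a weighted sum of two simple aesthetics minimized by local moves); the function names, weights, and hill-climbing rule are assumptions for the sketch, not the AGLO algorithm or the aglo library API.

```python
import math
import random

def layout(vertices, edges, w_edge=1.0, w_node=1.0,
           ideal=1.0, iters=2000, seed=0):
    """Illustrative multiobjective layout: hill-climb vertex
    positions to minimize a weighted sum of two aesthetics --
    edges near an ideal length, and vertices kept apart."""
    rng = random.Random(seed)
    pos = {v: [rng.random(), rng.random()] for v in vertices}

    def cost():
        c = 0.0
        for u, v in edges:          # edge-length aesthetic
            d = math.dist(pos[u], pos[v])
            c += w_edge * (d - ideal) ** 2
        vs = list(vertices)
        for i, u in enumerate(vs):  # vertex-spread aesthetic
            for v in vs[i + 1:]:
                d = math.dist(pos[u], pos[v]) + 1e-9
                c += w_node / d
        return c

    best = cost()
    step = 0.1
    for _ in range(iters):  # random local moves; keep improvements
        v = rng.choice(list(vertices))
        old = list(pos[v])
        pos[v][0] += rng.uniform(-step, step)
        pos[v][1] += rng.uniform(-step, step)
        c = cost()
        if c < best:
            best = c
        else:
            pos[v] = old  # revert a non-improving move
    return pos
```

Adjusting the weights trades one aesthetic against another, which is the user-controlled aspect of the model described above.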
S-T. Huang, D. Stott Parker, ``A Study of Variants of the Lambek Calculus and an Implementation in Prolog'', Technical Report CSD-9200028, UCLA Computer Science Department, June 1992.
The Lambek Calculus was proposed as a syntactic calculus of expressions of natural languages. The study of one of its variants leads to the discovery of the property of local permutability, and the theorems and proofs suggest directions for useful extensions of the system to meet the concerns of specific applications beyond linguistics. A couple of such variants are suggested and discussed. Some recently proposed linguistic extensions are shown to be achievable in those variants, from an algebraic consideration independent of the linguistic proposal. The study also pinpoints the similarities and differences between Categorial Grammars and Phrase Structure Grammars from a computational perspective. An implementation in Prolog is discussed and linked with the modularity and compositionality features of the calculus, an instance of gain through interdisciplinary interplay.
S-T. Huang, D. Stott Parker, ``Types in Lazy Logic Programs: Computation and Infinite Objects'', Technical Report CSD-9200029, UCLA Computer Science Department, June 1992.
A framework for types of logic programs, in particular, those embodying lazy computation and infinite objects, is proposed. Examples are given to illustrate the need. Starting with two methodological views, Types As Sets (subsets of the Herbrand Universe) and Types for The Prolog Computational Model, we give a simple monomorphic type structure and extend it with principles of polymorphism and homogeneity. The SVP model of Parker, Simon and Valduriez is given a detailed type analysis as an application of this framework. Common logic programs serve as other instances of application. Directions for enriching the type structure (Herbrand Type Universe) are discussed.
X. Wang, D. Stott Parker, ``Computing Least Fixed Points by Asynchronous Iterations'', Technical Report CSD-9200030, UCLA Computer Science Department, June 1992.
We establish conditions for convergence of asynchronous iterations and random (chaotic) iterations, such as might be used in evaluation of database queries.
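The idea can be illustrated on a familiar query: reachability computed as the least fixed point of a monotone operator over the powerset lattice, with updates applied one node at a time in arbitrary (here random) order. The encoding and names below are our illustrative assumptions; over a finite lattice, monotonicity guarantees that the chaotic schedule still converges to the least fixed point, which is the flavor of condition the report establishes.

```python
import random

def reachable_lfp(edges, nodes, seed=0):
    """Compute reach(n) = succ(n) union (reach of each successor)
    by asynchronous (chaotic) iteration: update one node's
    reachable-set at a time, in random order, until nothing
    changes. Starting from the bottom of the lattice (all empty
    sets), monotonicity makes the result independent of the
    update order."""
    succ = {n: set() for n in nodes}
    for u, v in edges:
        succ[u].add(v)
    reach = {n: set() for n in nodes}  # bottom element
    rng = random.Random(seed)
    changed = True
    while changed:
        changed = False
        order = list(nodes)
        rng.shuffle(order)  # a "chaotic" update schedule
        for n in order:
            new = set(succ[n])
            for m in succ[n]:
                new |= reach[m]
            if new != reach[n]:
                reach[n] = new
                changed = True
    return reach
```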
S-T. Huang, D. Stott Parker, ``Structure and Order Preservation: A Unifying Framework using Dual-M-Sets'', Technical Report CSD-930004, UCLA Computer Science Department, March 1993.
Structure and order are key aspects of computational objects. Structure-preserving morphisms are emphasized in the study of abstract data types, and have been analyzed with universal algebra or category theory. Order-preserving morphisms are traditionally the major focus of program semantics, in particular domain theory and denotational semantics, which typically capture order as increase in information. However, the interaction between structure preservation and order preservation has been studied less. While the two share many similarities, and while in most interesting cases, such as lattices, they can be mutually defined, the issue of their interaction is not trivial at all; even in lattices, structure preservation is not the same as order preservation.
This interaction is important in a variety of situations. Our particular interest comes from database processing, especially processing of large collections of structured (``complex object'') data. In such applications, both data structure and the morphisms defined on that data are of central importance, as in the SVP model. Structure preservation and order preservation translate directly to performance benefits in parallel processing, as they permit problems to be partitioned over multiple computers in a direct and natural way. Ordering also permits lazy evaluation techniques, and in particular permits pipeline processing.
In this article, we introduce the notion of dual-M-sets, an algebraic structure capturing dualities among the category of monoids. In special cases this exposes the interaction between structure preservation and order preservation, by simultaneously considering acceptable morphisms and structured objects. As applications, we show that several useful formal systems can be reconstructed as dual-M-sets, including the Bird-Meertens functional calculus for computable functions on lists, and certain monads (Wadler's list comprehensions). The analysis by categorical constructions also points to relevance for database applications with complex objects, in particular structured objects and queries defined in terms of structure preservation, and to possible extensions.
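The Bird-Meertens connection can be made concrete with that calculus's central construct, the list homomorphism. A minimal Python sketch follows (the function names are ours, and plain recursion stands in for the formalism's notation):

```python
def hom(f, op, e, xs):
    """A list homomorphism in the Bird-Meertens sense:
       hom f op e []          = e
       hom f op e [a]         = f(a)
       hom f op e (xs ++ ys)  = op(hom f op e xs, hom f op e ys)
    Provided op is associative with unit e, the split point is
    arbitrary, so the two recursive calls may run independently."""
    if not xs:
        return e
    if len(xs) == 1:
        return f(xs[0])
    mid = len(xs) // 2  # any split gives the same answer
    return op(hom(f, op, e, xs[:mid]),
              hom(f, op, e, xs[mid:]))

# Example: sum of squares as a homomorphism.
sum_sq = lambda xs: hom(lambda a: a * a, lambda x, y: x + y, 0, xs)
```

The independence of the two recursive calls is exactly the structure-preservation property that, in the database setting above, translates into parallelism.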
We also show that the Lambek Calculus with type judgement can be modeled as a dual-M-set. The Lambek Calculus has recently enjoyed a revival in formal linguistics as a mathematical foundation of Categorial Grammar. It is a rich grammatical formalism that can serve as the basis for powerful parsing mechanisms, like Narrowing Grammar. Again, with a suitable ordering we can relate structure and order preservation. The Lambek Calculus also translates directly to a minimal propositional fragment of Linear Logic. The translation leads to a minimal propositional Linear Logic without the Exchange structural rule, and supports the claim that Linear Logic, which is resource-conscious about formulas, may be location-conscious (about the order of formulas) as well.
The framework induced by dual-M-sets with natural orderings ties together several interesting concepts, including structure and order preservation, the duality of category and monoid, list, grammatical, and logical formalisms. The framework induced by dual-M-sets therefore appears both powerful and useful.
C. Amador, ``An Implementation of the SVP Database Model'', Technical Report, UCLA Computer Science Department, July 1993.
We describe an implementation of the SVP data model. SVP is a database model intended to model both set and stream data, and to model parallelism in bulk data processing.
SVP models collections, which include sets and streams as special cases. Collections are represented as ordered tree structures, and divide-and-conquer mappings are easily defined on these structures. We show that many useful database mappings (queries) have a divide-and-conquer format when specified using collections, and that this specification exposes parallelism.
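The divide-and-conquer format can be sketched directly. In the fragment below, a collection is encoded as an ordered binary tree of tagged tuples; this encoding and the function names are our illustrative assumptions, not the SVP implementation's actual representation.

```python
def from_list(xs):
    """Build a collection (ordered tree) from a Python list."""
    if not xs:
        return ("empty",)
    if len(xs) == 1:
        return ("leaf", xs[0])
    mid = len(xs) // 2
    return ("node", from_list(xs[:mid]), from_list(xs[mid:]))

def dc_map(f, combine, empty, coll):
    """Divide-and-conquer mapping over a collection: apply f at
    leaves, combine results at internal nodes. The two recursive
    calls are independent -- this is the parallelism such a
    specification exposes."""
    if coll[0] == "empty":
        return empty
    if coll[0] == "leaf":
        return f(coll[1])
    _, left, right = coll
    return combine(dc_map(f, combine, empty, left),
                   dc_map(f, combine, empty, right))

# Example query: count the even elements of a collection.
count_even = lambda c: dc_map(lambda x: 1 if x % 2 == 0 else 0,
                              lambda a, b: a + b, 0, c)
```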
We describe a multithreaded implementation written in C. This implementation is parametric in control schemes for forking threads; that is, one can easily modify the control scheme used for generating new threads of execution while processing queries. It has been run successfully on an Encore Multimax at INRIA Rocquencourt. While performance has not been a key focus of this implementation, we feel the implementation shows what is possible.
Xin Wang, D. Stott Parker, ``Moebius Inversion, Boolean Differentiation, and Incremental Computation'', Technical Report, UCLA Computer Science Department, July 1993.
This paper studies Moebius inversion on Boolean algebras and its relation with generalized Boolean differentials on Boolean rings. A motivation for this study stems from an interest in improving our understanding of the Moebius inversion formula as an analog of the fundamental theorem of calculus, and pursuing concepts and tools of incremental computations for functions on ordered algebras.
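For the Boolean-algebra case, Moebius inversion specializes to the inclusion-exclusion transform on the subset lattice. The sketch below (function names ours) computes the zeta transform g(S) = sum over T subset of S of f(T), and recovers f via f(S) = sum over T subset of S of (-1)^(|S|-|T|) g(T):

```python
from itertools import combinations

def all_subsets(s):
    """Yield every subset of s as a frozenset."""
    items = sorted(s)
    for r in range(len(items) + 1):
        for c in combinations(items, r):
            yield frozenset(c)

def zeta(f, universe):
    """Zeta transform on the subset lattice:
    g(S) = sum of f(T) over all T subset of S."""
    return {S: sum(f[T] for T in all_subsets(S))
            for S in all_subsets(universe)}

def moebius(g, universe):
    """Moebius inversion on the Boolean algebra of subsets:
    f(S) = sum of (-1)^(|S|-|T|) g(T) over all T subset of S.
    Exactly inverts the zeta transform."""
    return {S: sum((-1) ** (len(S) - len(T)) * g[T]
                   for T in all_subsets(S))
            for S in all_subsets(universe)}
```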
Shen-Tzay Huang, D. Stott Parker, ``Kahn's Networks of Processes as discrete analogues of Picard's Systems of Differential Equations'', Technical Report, UCLA Computer Science Department, February 1994.
Anyone who has studied parallel programming with Kahn's networks of processes and also systems theory or control theory has noticed a strong similarity between the two. For one thing, both use similar diagrams, often with feedback, to express computations formed as compositions of incremental transforms. What is a mathematical basis for reconciling the two?
Picard's theorem is a classical result in the theory of differential equations and integral equations. Kahn's theorem has similar renown in the theory of parallel programming, networks of processes, and stream processing. In this paper we clarify how these results are related, and show that Kahn's theorem and Picard's theorem are both special cases of a more general result. That we can do this may be surprising, since the foundations of the two theorems are quite different: metric function spaces and the fixed point theorem for contraction mappings in Picard's case, and complete partially ordered sets and the fixed point theorem for continuous mappings in Kahn's.
Our strategy is to translate the foundations for Kahn's theorem into appropriate metric space concepts. We accomplish this by establishing discrete versions of Picard's Theorem for the metric space of streams. We define contraction mappings over the set of functions of streams, and show that Kahn's Theorem is a variant of Picard's Theorem in this setting.
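The contraction argument can be illustrated with a tiny sketch, in which finite lists stand in for stream prefixes; the function names are our assumptions, used only to show the shape of the discrete Picard iteration.

```python
def picard(F, n):
    """Discrete Picard iteration: iterate a stream transformer F
    starting from the empty stream (the coarsest approximation).
    If F is a contraction in the prefix metric -- each application
    pins down at least one more element -- then n iterations fix
    the first n elements of the unique solution of s = F(s)."""
    s = []
    for _ in range(n):
        s = F(s)
    return s

# Example: the stream equation s = 1 : map(+1, s), whose unique
# solution is the stream 1, 2, 3, ...  Each application of F
# extends any approximation by one correct element, so F is
# contracting in the prefix metric.
F = lambda s: [1] + [x + 1 for x in s]
```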
We explore a number of representative examples showing applications of this perspective: Mealy machines (showing feedback-free solution), stream prefix accumulators and the `powerset' function (showing both simple and complex feedback loops), recursively-defined fractal objects (such as Cantor's set), and formal power series or generating functions. The metric perspective seems particularly useful for stream computations.
The result here, that Kahn's theorem is related to Picard's theorem, is not just a technical result. It gives a formal connection between stream processing and differential equations, a connection between continuous and discrete mathematical models. This connection gives new ways to formalize incremental computation, deal with infinite objects like streams, understand the role of feedback, and adapt tools in the theory of differential equations to stream processing.