You're working for Acme System Builders (ASB), which specializes in quickly building extremely large packages from source code. Your company maintains a software build factory, which pulls in source code from all over the Internet, configures, compiles, and packages it up using instructions taken from your customers, tests the resulting packages, and then delivers the resulting packages to your customers.
Currently your build system is based on a lot of shell scripts and makefiles. For example, it might use makefile entries like this:
sort: sort.o lib.a
	gcc -o sort sort.o lib.a

LIBRARIES = mpsort.o nanosleep.o error.o
lib.a: $(LIBRARIES)
	ar cru lib.a $(LIBRARIES)

.c.o:
	gcc -c -o $@ $<
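These rules say the following things:
To build the file sort, first build sort.o and lib.a, then run the command gcc -o sort sort.o lib.a.
LIBRARIES is shorthand for mpsort.o nanosleep.o error.o.
To build the file lib.a, first build mpsort.o, nanosleep.o, and error.o, then run the command ar cru lib.a mpsort.o nanosleep.o error.o.
To build any file X.o from the corresponding file X.c, run the command gcc -c -o X.o X.c.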
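The meaning of these rules can be sketched in a few lines of Python. This is only an illustration, not how make is implemented: the names RULES, rule_for, and build_order are hypothetical, and the sketch ignores file timestamps, so it lists every command rather than only those for out-of-date targets.

```python
# Hypothetical encoding of the makefile fragment: target -> (prerequisites, command).
RULES = {
    "sort": (["sort.o", "lib.a"], "gcc -o sort sort.o lib.a"),
    "lib.a": (["mpsort.o", "nanosleep.o", "error.o"],
              "ar cru lib.a mpsort.o nanosleep.o error.o"),
}


def rule_for(target):
    """Return (prerequisites, command) for target, or None for a plain source file."""
    if target in RULES:
        return RULES[target]
    if target.endswith(".o"):
        # The .c.o suffix rule: X.o is built from X.c.
        source = target[:-2] + ".c"
        return ([source], f"gcc -c -o {target} {source}")
    return None


def build_order(target, done=None, commands=None):
    """Depth-first walk listing commands in an order that respects prerequisites."""
    done = set() if done is None else done
    commands = [] if commands is None else commands
    if target in done:
        return commands
    done.add(target)
    rule = rule_for(target)
    if rule is not None:
        prerequisites, command = rule
        for prerequisite in prerequisites:
            build_order(prerequisite, done, commands)
        # A target's command is listed only after all of its prerequisites.
        commands.append(command)
    return commands


if __name__ == "__main__":
    for command in build_order("sort"):
        print(command)
```

Calling build_order("sort") lists the four compile commands and the ar command before the final link, which is one valid serial order for this fragment.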
Your company is doing well and you're getting a lot of work to do. You're building software on big machines with lots of processors, using GNU make's parallel build feature. This works reasonably well on big machines, but they're expensive (especially their network file system), and you want to do the builds more cheaply, on a cluster or perhaps with cloud computing.
Your boss has heard good things about Hadoop for these kinds of environments. He suggests that you look into the possibility of translating your company's makefiles into Pig Latin, a high-level language that can be used to control computations running on Hadoop.
Investigate the suitability of using Pig Latin as a substitute for makefiles, for building large software projects. The goal is to build lots of programs as quickly as possible, using a cluster or cloud computing to keep the cost down. Run some small Pig Latin programs to get a feel for how it works, and then take a crack at translating the above makefile fragment into Pig Latin. Assume that commands like gcc will be invoked via Hadoop Streaming. You can't run all the commands in parallel, as some depend on others' output, so you'll have to address the issue of how to implement makefile-style dependencies in Pig Latin.
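To make the dependency issue concrete, here is a minimal Python sketch (not Pig Latin, and not a solution to the translation task) that groups the targets of the makefile fragment into "waves". Every target in a wave depends only on targets from earlier waves, so the commands within one wave could in principle run in parallel. The DEPENDENCIES table and the parallel_waves function are hypothetical names introduced for illustration.

```python
# Hypothetical dependency table for the makefile fragment (target -> prerequisites).
# Source files (.c) have no entry because nothing needs to be built for them.
DEPENDENCIES = {
    "sort": ["sort.o", "lib.a"],
    "lib.a": ["mpsort.o", "nanosleep.o", "error.o"],
    "sort.o": ["sort.c"],
    "mpsort.o": ["mpsort.c"],
    "nanosleep.o": ["nanosleep.c"],
    "error.o": ["error.c"],
}


def parallel_waves(dependencies):
    """Group targets into waves; each wave depends only on earlier waves."""
    remaining = dict(dependencies)
    built = set()
    waves = []
    while remaining:
        # A target is ready when every prerequisite is either already built
        # or is a source file (something with no rule of its own).
        ready = sorted(
            target
            for target, prerequisites in remaining.items()
            if all(p in built or p not in dependencies for p in prerequisites)
        )
        if not ready:
            raise ValueError("dependency cycle detected")
        waves.append(ready)
        built.update(ready)
        for target in ready:
            del remaining[target]
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(parallel_waves(DEPENDENCIES), start=1):
        print(number, wave)
```

For this fragment the sketch yields three waves: the four object files, then lib.a, then sort. Any translation has to preserve that ordering between waves, however it expresses the work inside each one.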
Write a two-page executive summary assessing the suitability of Pig Latin and Hadoop for improving the performance of your company's software build factory. The summary should be in 10-point font or larger. You can put references on a third page if there's not enough room for them on the first two. Your summary should focus on the technologies' effects on performance, cost-effectiveness, reliability, portability (to future hardware), flexibility, and ease of use, compared to using shell scripts and makefiles. It should be suitable for software executives, that is, for readers who have some expertise in software, particularly in managing software developers, but who are not experts in Hadoop, Pig Latin, or makefiles. Please keep the resources for written reports in mind.
Pig Latin is installed on SEASnet, in /usr/local/cs/bin as usual; you can try it out with, for example, the command pig -x local.
Submit a file hw6.pdf containing your summary.