Time due: 11:00 PM Thursday, June 4 Tuesday, June 9
An online forum hosts many chat rooms. A user can join a chat, contribute to the conversation in that chat, and leave a chat. A user doesn't have to leave one chat before joining another. A user's current chat is the existing chat the user most recently joined that the user has not left since most recently joining. Your assignment is to implement a chat tracker that keeps track of the number of contributions users make to the chats. Users and chats are named by strings. Here's an example sequence of activities:
The ChatTracker class you will implement has member functions corresponding to the five types of activities illustrated above. We will use two notions: a user's count of contributions to a chat and a chat's count of contributions, and describe the effect of the functions on each.
// A user joins a chat void join(string user, string chat);
// A chat is terminated int terminate(string chat);
// A user contributes to that user's current chat int contribute(string user);
// A user leaves a chat int leave(string user, string chat);
// A user leaves that user's current chat int leave(string user);
We would represent the example above by creating a ChatTracker named, say,
ct
, and making the following sequence of calls. We will
use assert
to show you what the function return values are.
ChatTracker ct; /* 1 */ ct.join("Fred", "Breadmaking"); /* 2 */ ct.join("Ethel", "Breadmaking"); /* 3 */ assert(ct.contribute("Fred") == 1); /* 4 */ ct.join("Lucy", "Lint Collecting"); /* 5 */ assert(ct.contribute("Ethel") == 1); /* 6 */ assert(ct.contribute("Fred") == 2); /* 7 */ assert(ct.contribute("Fred") == 3); /* 8 */ ct.join("Ricky", "Lint Collecting"); /* 9 */ ct.join("Fred", "Lint Collecting"); /* 10 */ assert(ct.contribute("Fred") == 1); /* 11 */ ct.join("Lucy", "Elbonian Politics"); /* 12 */ assert(ct.contribute("Fred") == 2); /* 13 */ assert(ct.contribute("Ricky") == 1); /* 14 */ assert(ct.contribute("Lucy") == 1); /* 15 */ ct.join("Lucy", "Breadmaking"); /* 16 */ ct.join("Ethel", "Breadmaking"); /* 17 */ assert(ct.contribute("Lucy") == 1); /* 18 */ assert(ct.terminate("Lint Collecting") == 3); /* 19 */ ct.join("Lucy", "Burmese Cats"); /* 20 */ assert(ct.contribute("Lucy") == 1); /* 21 */ ct.join("Lucy", "Worm Farming"); /* 22 */ ct.join("Lucy", "Elbonian Politics"); /* 23 */ assert(ct.leave("Lucy", "Breadmaking") == 1); /* 24 */ assert(ct.contribute("Lucy") == 2); /* 25 */ assert(ct.leave("Ethel") == 1); /* 26 */ assert(ct.leave("Lucy") == 2); /* 27 */ ct.join("Ricky", "Lint Collecting"); /* 28 */ assert(ct.contribute("Ricky") == 1); /* 29 */ assert(ct.contribute("Ricky") == 2); /* 30 */ assert(ct.leave("Ricky") == 2); /* 31 */ ct.join("Lucy", "Breadmaking"); /* 32 */ assert(ct.leave("Fred") == 3);
Here is a declaration of the interface for the ChatTracker class (the constructor argument will be explained later on in this spec):
class ChatTracker { public: ChatTracker(int maxBuckets = 20000); ~ChatTracker(); void join(std::string user, std::string chat); int terminate(std::string chat); int contribute(std::string user); int leave(std::string user, std::string chat); int leave(std::string user); };
We have written a correct but inefficient
ChatTracker implementation in ChatTracker.slow.cpp
. Your
assignment is to write a more efficient correct implementation. If you
wish, you can do this by starting with our implementation, and changing
the data structures and algorithms that implement the class and its member
functions. You may change or add classes/structs, data members, and/or
member functions as you wish. Your implementation, though, must produce
correct results and not leak any memory, just like the one given, except
that it should be faster. Correctness will count for 40% of your score,
although if you turn in a correct implementation that is no faster than
the inefficient one we gave you, you will get zero correctness points
(since you could just as well have turned in that same inefficient
implementation).
Of course, we have to give you some assumptions about the way clients will use the ChatTracker so that you know what needs to be faster. There may be thousands of chats and tens of thousands of users. The average number of users per chat is about 10, with many chats having much fewer, and a few chats having many more.
Performance will count for 50% of your score (and your report for the remaining 10%). To earn any performance points, your implementation must be correct and not leak memory. (Who cares how fast a program is if it produces incorrect results?) This is a critical requirement — you must be certain your implementation produces the correct results, doesn't leak memory, and does not do anything undefined. Given that you are starting from a correct implementation, this should be easier than if you had to start from scratch. The faster your program performs on the tests we make, the better your performance score. We have to be strict about no memory leaks so that you don't try to save time by not deleting allocated memory.
You'll have to come up with your own test cases to assure yourself your
program is correct. (The example above is the basic correctness test that
the testChatTracker.cpp
file contained in
our correct but inefficient implementation
performs.) To build your confidence that your program is correct for
much larger data sets, the thorough correctness test in
testChatTracker.cpp
reads a file named commands.txt
that you provide. Each line of that file requests a call to one of the
ChatTracker member functions; the five kinds of lines are
j user chat
, which requests a call to join(user, chat)
t chat
, which requests a call to terminate(chat)
c user
, which requests a call to contribute(user)
l user chat
, which requests a call to leave(user, chat)
l user
, which requests a call to leave(user)
One limitation of this file format is that it doesn't provide for user names
that have any internal whitespace, such as Little Ricky
, although
it handles chat names that have internal whitespace, such as Lint
Collecting
.
Because the tool we're about to describe generates large files like these for you, you don't really need to understand their format, but as an example for the curious, here's a test file that produces the same sequence of calls as the example above.
The thorough correctness test compares your functions' return values against
those of a correct (but slow) solution. To run the test, create a project with
our ChatTracker.h
, our
testChatTracker.cpp
, and your ChatTracker.cpp
.
The file generateTests.cpp produces an
executable that generates a random test file suitable for use for
correctness and performance testing. You can specify the file name it should
produce. (You'd copy that file to commands.txt
to run the
tests.) One run we did produced this test file
(2MB), involving 1000 chats and 10000 users. (These are constants
defined at the top of generateTests.cpp.)
Our performance test takes the same commands.txt
file and times
how long your implementation takes to process its commands. As an example,
we used that test file (2MB) to time our
correct but inefficient implementation of ChatTracker. We then replaced
that ChatTracker.cpp
with a better one. The timing results on the
SEASnet Linux server were 5950 milliseconds for the inefficient implementation,
and 40 milliseconds for the better one; on a 2.5 GHz MacBook Pro, the
numbers were 5130 milliseconds and 20 milliseconds, respectively. On the
SEASnet Windows server, they were 5010 and 54 milliseconds, respectively.
Around line 19 of testChatTracker.cpp
is a declaration that
specifies the name of the test file, commands.txt
. If you leave
that string alone, then be aware of where that file must be located: If you
launched your program from Visual Studio, it must be in the same folder as
your C++ source file; if you're using Linux, the file must be in the
current directory; if you're using Xcode, the file must be in the
yourProject/DerivedData/yourProject/Build/Products/Debug
or …/Products/Release folder.
So many students waste time on trying to figure out where to put input files,
so it might be easier to just change the "commands.txt"
string
literal to a full path name like "C:/CS32/P4/commands.txt"
(note the forward slashes) or
"/Users/yourUserName/CS32/P4/commands.txt"
. You don't
even have to use the name commands.txt
for the file.
When you use Visual C++ or Xcode, the default build configuration is the Debug configuration, one in which the compiler inserts extra code to check for a variety of runtime problems. This is nice when you're not yet sure your program is correct, but these extra checks take time, and in the case of checks involving STL containers, potentially a lot of time. The g32 command is similar in this regard.
When you are sure your program is correct, and you're ready to test its speed,
you'll want to build an optimized version of your program (which can run an
order of magnitude faster than the Debug version). See the last problem of
Homework 4 for instructions
on how to build in Release mode for Xcode and Visual C++; for g++, on
cs32.seas.ucla.edu
say
g32fast -o tester ChatTracker.cpp testChatTracker.cpp ./tester
Here are some requirements you must meet:
You must not change ChatTracker.h
in any way. (In fact, you
will not turn that file in; when we test your program, we'll use ours.)
You can change ChatTracker.cpp
however you want, subject to
this spec. (Notice that by giving ChatTracker just one private data
member, a pointer to a ChatTrackerImpl object (which you can define
however you want in ChatTracker.cpp
), we have given you free
rein for the implementation while still preventing you from changing the
interface in any way.
You may design interesting data structures of your own, but the only
containers from the C++ library you may use are vector
,
list
, forward_list
(a singly-linked list),
stack
, and queue
(and string
);
you must not use set
, multiset
,
map
, multimap
, unordered_set
,
unordered_multiset
, unordered_map
, or
unordered_multimap
. If you want anything fancier like a hash
table or a binary search tree, implement it yourself. (It'll be a good
exercise for you.) Although we're limiting your use of
containers from the library, you are free to use library
algorithms (e.g., sort
).
If you choose to use hash tables in your implementation, no hash table may
ever have a number of buckets greater than the argument passed to the
ChatTracker constructor. A simple way to test that you handle collisions
correctly would be to specify the number of buckets to be 2 or 3 (e.g.,
ChatTracker ct(3);
and run a small correctness test that puts
several times that many items in the table. If you don't use any hash
tables, your constructor may ignore the value of its parameter (as the one
in ChatTracker.slow.cpp does).
A project consisting of your ChatTracker.cpp
and our
ChatTracker.h
and testChatTracker.cpp
from our
inefficient implementation must build correctly, and when executed, must
run to completion without error.
During execution, your program must not perform any undefined actions, such as dereferencing a null or uninitialized pointer, and must not leak memory.
By Wednesday, June 3, there will be a link on the class webpage that will enable you to turn in your source file and report. You will turn in a zip file containing these two files:
ChatTracker.cpp
. You will not turn in ChatTracker.h
or a main routine.
report.docx
, report.doc
, or report.txt
,
a report containing
a description of your algorithms and data structures (good diagrams may help reduce the amount you have to write), and why you made the choices you did. You can assume we know all the data structures and algorithms discussed in class and their names.
pseudocode for non-trivial algorithms.
a note about any known bugs, serious inefficiencies, or notable problems you had.