Spring 2020 CS 32

Programming Assignment 4
Chat Stats

Time due: 11:00 PM Thursday, June 4 Tuesday, June 9

An online forum hosts many chat rooms. A user can join a chat, contribute to the conversation in that chat, and leave a chat. A user doesn't have to leave one chat before joining another. A user's current chat is the existing chat the user most recently joined that the user has not left since most recently joining. Your assignment is to implement a chat tracker that keeps track of the number of contributions users make to the chats. Users and chats are named by strings. Here's an example sequence of activities:

  1. User Fred joins chat Breadmaking (so that becomes his current chat)
  2. User Ethel joins chat Breadmaking (so that becomes her current chat)
  3. User Fred contributes (to his current chat, Breadmaking)
  4. User Lucy joins chat Lint Collecting (so that becomes her current chat)
  5. User Ethel contributes (to her current chat, Breadmaking)
  6. User Fred contributes (to his current chat, Breadmaking)
  7. User Fred contributes (to his current chat, Breadmaking)
  8. User Ricky joins chat Lint Collecting (so that becomes his current chat)
  9. User Fred joins chat Lint Collecting (so that becomes his current chat)
  10. User Fred contributes (to his current chat, Lint Collecting)
  11. User Lucy joins chat Elbonian Politics (so that becomes her current chat)
  12. User Fred contributes (to his current chat, Lint Collecting)
  13. User Ricky contributes (to his current chat, Lint Collecting)
  14. User Lucy contributes (to her current chat, Elbonian Politics)
  15. User Lucy joins chat Breadmaking (so that becomes her current chat)
  16. User Ethel joins chat Breadmaking (no effect; it's already her current chat)
  17. User Lucy contributes (to her current chat, Breadmaking)
  18. Chat Lint Collecting is terminated (so Fred's current chat reverts to Breadmaking, Lucy's current chat is still Breadmaking, and Ricky has no current chat)
  19. User Lucy joins chat Burmese Cats (so that becomes her current chat)
  20. User Lucy contributes (to her current chat, Burmese Cats)
  21. User Lucy joins chat Worm Farming (so that becomes her current chat)
  22. User Lucy joins chat Elbonian Politics (so that becomes her current chat)
  23. User Lucy leaves chat Breadmaking (her current chat remains Elbonian Politics)
  24. User Lucy contributes (to her current chat, Elbonian Politics)
  25. User Ethel leaves chat Breadmaking (so Ethel has no current chat)
  26. User Lucy leaves her current chat (so her current chat reverts to Worm Farming)
  27. User Ricky joins chat Lint Collecting (so that becomes his current chat)
  28. User Ricky contributes (to his current chat, Lint Collecting)
  29. User Ricky contributes (to his current chat, Lint Collecting)
  30. User Ricky leaves his current chat (so Ricky has no current chat)
  31. User Lucy joins chat Breadmaking (so that becomes her current chat)
  32. User Fred leaves his current chat (so Fred has no current chat)

The ChatTracker class you will implement has member functions corresponding to the five types of activities illustrated above. We will use two notions: a user's count of contributions to a chat and a chat's count of contributions, and describe the effect of the functions on each.

  // A user joins a chat
void join(string user, string chat);
The user joins a new or existing chat. Whether or not the user was associated with that chat before, that chat is now the user's current chat. If the user had previously joined that chat and not left it since then, the user's count of contributions to that chat remains the same; otherwise, it becomes 0. If the chat had previously been joined by this or any other user and has not been terminated since then, the chat's count of contributions is unchanged; otherwise, it becomes 0.
  // A chat is terminated
int terminate(string chat);
If the chat has no users associated with it or does not exist, this function returns 0; otherwise, all users associated with the chat are no longer associated with it (as if they left the chat), and the function returns the chat's count of contributions. If that chat was any user's current chat, then the existing chat the user most recently joined and hasn't left since most recenty joining becomes the user's current chat; if there is no such chat, then that user has no current chat.
  // A user contributes to that user's current chat
int contribute(string user);
If the user has no current chat, return 0. Otherwise, both the user's count of contributions to that user's current chat and that current chat's count of contributions are increased by one. Return the resulting user's count of contributions to that current chat.
  // A user leaves a chat
int leave(string user, string chat);
If the user is not associated with the indicated chat, return -1. Otherwise, the user is no longer associated with that chat, and the function returns the user's count of contributions to that chat. If that chat was the user's current chat, then the existing chat the user most recently joined and hasn't left since most recenty joining becomes the user's current chat; if there is no such chat, then that user has no current chat. This function leaves the chat's count of contributions unchanged.
  // A user leaves that user's current chat
int leave(string user);
If the user has no current chat, return -1. Otherwise, the user is no longer associated with that user's current chat, and the function returns the user's count of contributions made to that chat. The existing chat the user most recently joined and hasn't left since most recenty joining bcomes the user's current chat; if there is no such chat, then that user has no current chat. This function leaves the chat's count of contributions unchanged.

We would represent the example above by creating a ChatTracker named, say, ct, and making the following sequence of calls. We will use assert to show you what the function return values are.

          ChatTracker ct;
/*  1 */  ct.join("Fred", "Breadmaking");
/*  2 */  ct.join("Ethel", "Breadmaking");
/*  3 */  assert(ct.contribute("Fred") == 1);
/*  4 */  ct.join("Lucy", "Lint Collecting");
/*  5 */  assert(ct.contribute("Ethel") == 1);
/*  6 */  assert(ct.contribute("Fred") == 2);
/*  7 */  assert(ct.contribute("Fred") == 3);
/*  8 */  ct.join("Ricky", "Lint Collecting");
/*  9 */  ct.join("Fred", "Lint Collecting");
/* 10 */  assert(ct.contribute("Fred") == 1);
/* 11 */  ct.join("Lucy", "Elbonian Politics");
/* 12 */  assert(ct.contribute("Fred") == 2);
/* 13 */  assert(ct.contribute("Ricky") == 1);
/* 14 */  assert(ct.contribute("Lucy") == 1);
/* 15 */  ct.join("Lucy", "Breadmaking");
/* 16 */  ct.join("Ethel", "Breadmaking");
/* 17 */  assert(ct.contribute("Lucy") == 1);
/* 18 */  assert(ct.terminate("Lint Collecting") == 3);
/* 19 */  ct.join("Lucy", "Burmese Cats");
/* 20 */  assert(ct.contribute("Lucy") == 1);
/* 21 */  ct.join("Lucy", "Worm Farming");
/* 22 */  ct.join("Lucy", "Elbonian Politics");
/* 23 */  assert(ct.leave("Lucy", "Breadmaking") == 1);
/* 24 */  assert(ct.contribute("Lucy") == 2);
/* 25 */  assert(ct.leave("Ethel") == 1);
/* 26 */  assert(ct.leave("Lucy") == 2);
/* 27 */  ct.join("Ricky", "Lint Collecting");
/* 28 */  assert(ct.contribute("Ricky") == 1);
/* 29 */  assert(ct.contribute("Ricky") == 2);
/* 30 */  assert(ct.leave("Ricky") == 2);
/* 31 */  ct.join("Lucy", "Breadmaking");
/* 32 */  assert(ct.leave("Fred") == 3);

Here is a declaration of the interface for the ChatTracker class (the constructor argument will be explained later on in this spec):

	class ChatTracker
	{
	  public:
	    ChatTracker(int maxBuckets = 20000);
	    ~ChatTracker();
            void join(std::string user, std::string chat);
            int terminate(std::string chat);
            int contribute(std::string user);
            int leave(std::string user, std::string chat);
            int leave(std::string user);
	};

We have written a correct but inefficient ChatTracker implementation in ChatTracker.slow.cpp. Your assignment is to write a more efficient correct implementation. If you wish, you can do this by starting with our implementation, and changing the data structures and algorithms that implement the class and its member functions. You may change or add classes/structs, data members, and/or member functions as you wish. Your implementation, though, must produce correct results and not leak any memory, just like the one given, except that it should be faster. Correctness will count for 40% of your score, although if you turn in a correct implementation that is no faster than the inefficient one we gave you, you will get zero correctness points (since you could just as well have turned in that same inefficient implementation).

Of course, we have to give you some assumptions about the way clients will use the ChatTracker so that you know what needs to be faster. There may be thousands of chats and tens of thousands of users. The average number of users per chat is about 10, with many chats having much fewer, and a few chats having many more.

Performance will count for 50% of your score (and your report for the remaining 10%). To earn any performance points, your implementation must be correct and not leak memory. (Who cares how fast a program is if it produces incorrect results?) This is a critical requirement — you must be certain your implementation produces the correct results, doesn't leak memory, and does not do anything undefined. Given that you are starting from a correct implementation, this should be easier than if you had to start from scratch. The faster your program performs on the tests we make, the better your performance score. We have to be strict about no memory leaks so that you don't try to save time by not deleting allocated memory.

You'll have to come up with your own test cases to assure yourself your program is correct. (The example above is the basic correctness test that the testChatTracker.cpp file contained in our correct but inefficient implementation performs.) To build your confidence that your program is correct for much larger data sets, the thorough correctness test in testChatTracker.cpp reads a file named commands.txt that you provide. Each line of that file requests a call to one of the ChatTracker member functions; the five kinds of lines are

One limitation of this file format is that it doesn't provide for user names that have any internal whitespace, such as Little Ricky, although it handles chat names that have internal whitespace, such as Lint Collecting.

Because the tool we're about to describe generates large files like these for you, you don't really need to understand their format, but as an example for the curious, here's a test file that produces the same sequence of calls as the example above.

The thorough correctness test compares your functions' return values against those of a correct (but slow) solution. To run the test, create a project with our ChatTracker.h, our testChatTracker.cpp, and your ChatTracker.cpp.

The file generateTests.cpp produces an executable that generates a random test file suitable for use for correctness and performance testing. You can specify the file name it should produce. (You'd copy that file to commands.txt to run the tests.) One run we did produced this test file (2MB), involving 1000 chats and 10000 users. (These are constants defined at the top of generateTests.cpp.)

Our performance test takes the same commands.txt file and times how long your implementation takes to process its commands. As an example, we used that test file (2MB) to time our correct but inefficient implementation of ChatTracker. We then replaced that ChatTracker.cpp with a better one. The timing results on the SEASnet Linux server were 5950 milliseconds for the inefficient implementation, and 40 milliseconds for the better one; on a 2.5 GHz MacBook Pro, the numbers were 5130 milliseconds and 20 milliseconds, respectively. On the SEASnet Windows server, they were 5010 and 54 milliseconds, respectively.

Around line 19 of testChatTracker.cpp is a declaration that specifies the name of the test file, commands.txt. If you leave that string alone, then be aware of where that file must be located: If you launched your program from Visual Studio, it must be in the same folder as your C++ source file; if you're using Linux, the file must be in the current directory; if you're using Xcode, the file must be in the yourProject/DerivedData/yourProject/Build/Products/Debug or …/Products/Release folder.

So many students waste time on trying to figure out where to put input files, so it might be easier to just change the "commands.txt" string literal to a full path name like "C:/CS32/P4/commands.txt" (note the forward slashes) or "/Users/yourUserName/CS32/P4/commands.txt". You don't even have to use the name commands.txt for the file.

When you use Visual C++ or Xcode, the default build configuration is the Debug configuration, one in which the compiler inserts extra code to check for a variety of runtime problems. This is nice when you're not yet sure your program is correct, but these extra checks take time, and in the case of checks involving STL containers, potentially a lot of time. The g32 command is similar in this regard.

When you are sure your program is correct, and you're ready to test its speed, you'll want to build an optimized version of your program (which can run an order of magnitude faster than the Debug version). See the last problem of Homework 4 for instructions on how to build in Release mode for Xcode and Visual C++; for g++, on cs32.seas.ucla.edu say

	g32fast -o tester ChatTracker.cpp testChatTracker.cpp
	./tester

Here are some requirements you must meet:

Turn it in

By Wednesday, June 3, there will be a link on the class webpage that will enable you to turn in your source file and report. You will turn in a zip file containing these two files: