[Note: This writeup is not relevant for Project 2.]
A string is a sequence of zero or more characters. The size
function tells you how many characters are in the string:
string s = "Hello";
cout << s.size(); // writes 5
s = "Wow";
cout << s.size(); // writes 3
s = "";
cout << s.size(); // writes 0
(For historical reasons, there is also a length function that
returns the same value that size does. In other words,
s.length() and s.size() may be used interchangeably.)
You can access individual characters in a string using the at
function. The positions of the characters in a string are numbered from left
to right, starting at 0. Your program will die with a runtime error if it
tries to access a character at a position that is out of range for the string.
(You can also access individual characters in a string using the
[] operator. Your program's behavior is undefined if it tries to
access a character at a position that is out of range for the string.)
                     // 01234
string s = "Hello";  // Hello
cout << s.at(0); // writes H
cout << s.at(4); // writes o
cout << s.at(6); // Runtime error!
cout << s.at(-1); // Runtime error!
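One way to avoid that runtime error is to compare the position against s.size() before calling at. Here is a minimal sketch of that idea; the function name charAtOr and its fallback parameter are made up for illustration and are not part of the library.

```cpp
#include <string>
using namespace std;

// Hypothetical helper: returns the character at position pos in s,
// or fallback if pos is not a valid position for s.
char charAtOr(const string& s, size_t pos, char fallback)
{
    if (pos < s.size())      // valid positions are 0 through s.size()-1
        return s.at(pos);
    return fallback;
}
```

With this helper, charAtOr("Hello", 4, '?') gives 'o', while charAtOr("Hello", 6, '?') gives '?' instead of stopping the program.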
To visit every character in a string (for example, to write each character of the string on a line by itself), you can say
string s = "Hello";
for (int k = 0; k != s.size(); k++)
    cout << s.at(k) << endl;
[You don't have to read the nerdy footnote at the bottom of this page that has something to say about the loop above.]
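The same visiting pattern works for any task that looks at each character once. As a sketch, here is a function that counts how many times a given character appears in a string. The name countChar is hypothetical, and the index is declared as size_t (the footnote explains that choice) rather than int.

```cpp
#include <string>
using namespace std;

// Hypothetical helper: counts how many characters of s are equal to target.
int countChar(const string& s, char target)
{
    int count = 0;
    for (size_t k = 0; k != s.size(); k++)
    {
        if (s.at(k) == target)   // compare one char with another char
            count++;
    }
    return count;
}
```

For example, countChar("Hello", 'l') is 2, and countChar("", 'a') is 0 because the loop body never runs for an empty string.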
Another thing you can do with a string is to append characters to the end
of the string. The += operator lets you do this. (This is
a different use of the operator than the one that lets you add a number
to an int or double variable.) Here's an example
where we copy all of the non-blank characters from the string s
to the string t:
string s = "Hello there! How are you?";
string t; // automatically initialized to the empty string
for (size_t k = 0; k != s.size(); k++)
{
    if (s.at(k) != ' ')   // If s.at(k) is not a blank
        t += s.at(k);     // append s.at(k) to t
}
cout << t; // writes Hellothere!Howareyou?
Notice that when talking about constants representing single characters, we
use single quote marks, not double quote marks. C++ distinguishes between
the type string, objects of which are sequences of zero or more
characters, and the type char, objects of which are always a
single character. If s is a string, then the
expression s.at(k) is a char. The language lets us
compare a char with another char, like the
constants ' ' or '@' or 'A'.
(The single quotes denote a char constant.)
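Because string and char are different types, it helps to know that += accepts either one on its right-hand side: a char appends one character, and a string appends all of that string's characters. The sketch below uses both forms; the function name joinChars and its parameter names are invented for this example.

```cpp
#include <string>
using namespace std;

// Hypothetical helper: returns a copy of s with sep placed between
// each pair of adjacent characters.
string joinChars(const string& s, const string& sep)
{
    string result;                // automatically initialized to the empty string
    for (size_t k = 0; k != s.size(); k++)
    {
        if (k != 0)
            result += sep;        // append a whole string
        result += s.at(k);        // append a single char
    }
    return result;
}
```

For example, joinChars("abc", "-") produces a-b-c.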
You are also able to copy a substring of a string. For example, here's how
we can copy the substring of s starting at position 5 and going
for 3 characters:
                         // 012345678
string s = "duplicate";  // duplicate
cout << s.substr(5,3); // writes cat
Here's how to clip off the first six characters of a string:
string t = "fingernail";
t = t.substr(6, t.size()-6); // t is now "nail"
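If you leave out the second argument, substr copies everything from the starting position to the end of the string, so t.substr(6) does the same job as t.substr(6, t.size()-6). Here is a sketch of a clipping function built on that form. The name dropFirst is hypothetical, and returning the empty string when the string is too short is a choice made for this example.

```cpp
#include <string>
using namespace std;

// Hypothetical helper: returns s with its first n characters removed,
// or the empty string if s has n or fewer characters.
string dropFirst(const string& s, size_t n)
{
    if (n >= s.size())
        return "";
    return s.substr(n);   // one-argument form: from position n to the end
}
```

The check against s.size() matters because substr, like at, causes a runtime error if the starting position is past the end of the string.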
Sometimes we want to classify characters, asking, for example, whether they are letter characters or digit characters. If you say
#include <cctype>
then you can use character classification functions like these:
                         // 012345678
string s = "30 For 30";  // 30 For 30
if (isdigit(s.at(0))) // tests as true, since '3' is a digit character
...
if (isalpha(s.at(3))) // tests as true, since 'F' is a letter
...
if (isupper(s.at(3))) // tests as true, since 'F' is an uppercase letter
...
if (islower(s.at(5))) // tests as true, since 'r' is a lowercase letter
...
if (islower(s.at(3))) // tests as false, since 'F' is not a lowercase letter
...
if (isalpha(s.at(2))) // tests as false, since ' ' is not a letter
...
if (isalpha(s.at(0))) // tests as false, since '3' is not a letter
...
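Classification functions are often combined with a loop to check a whole string. The sketch below tests whether a string consists only of digit characters. The name isAllDigits is hypothetical, treating the empty string as "not all digits" is a choice made for this example, and the code assumes the string holds only plain ASCII characters like the ones above.

```cpp
#include <cctype>
#include <string>
using namespace std;

// Hypothetical helper: true if s is nonempty and every character is a digit.
bool isAllDigits(const string& s)
{
    if (s.size() == 0)
        return false;
    for (size_t k = 0; k != s.size(); k++)
    {
        if (!isdigit(s.at(k)))
            return false;         // found a character that is not a digit
    }
    return true;                  // every character passed the test
}
```

For example, isAllDigits("2025") is true, while isAllDigits("30 For 30") is false because of the blanks and letters.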
This code copies all non-letters in a string:
string s = "#1 in 2025: Yeah!";
string t;
for (size_t k = 0; k != s.size(); k++)
    if (!isalpha(s.at(k)))  // if not a letter
        t += s.at(k);       // append it to t
// t is now "#1  2025: !"
Caution: For historical reasons, isalpha,
isdigit, etc., return an int, not a bool. If the condition they
test for is met, they return a non-zero value (which tests as true), but that
value might be a non-zero value other than 1. So to test if the condition is
met, write your test as, say,
if (isalpha(ch))
instead of
if (isalpha(ch) == true) // WRONG!!!!
since, in a comparison between an int and a bool, the bool is converted to int. Because true converts to 1, and the non-zero int that isalpha returns for a letter might not be 1, the condition for the if might evaluate to false even when ch is a letter.
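If you really do need a bool (to store in a variable or return from a function, say), one safe approach is to compare the result against 0 rather than against true. A minimal sketch, where the name isLetter is hypothetical:

```cpp
#include <cctype>
using namespace std;

// Hypothetical helper: turns isalpha's int result into a genuine bool.
bool isLetter(char ch)
{
    return isalpha(ch) != 0;   // any non-zero result means ch is a letter
}
```

Since isLetter returns a real bool, a test like isLetter(ch) == true would behave as expected, although plain if (isLetter(ch)) is still simpler.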
The function tolower, when given an uppercase letter, returns
the lowercase equivalent of that letter; when given any other character,
just returns that same character. So
string s = "Don't SHOUT!";
string t;
for (size_t k = 0; k != s.size(); k++)
    t += tolower(s.at(k));
cout << t;
writes don't shout!. Similarly, the function toupper
returns the uppercase equivalent of a letter.
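A common use of tolower is comparing two strings while ignoring the difference between uppercase and lowercase letters. The following is a sketch of one way to do that. The name sameIgnoringCase is hypothetical, and the code assumes plain ASCII text.

```cpp
#include <cctype>
#include <string>
using namespace std;

// Hypothetical helper: true if a and b are the same apart from letter case.
bool sameIgnoringCase(const string& a, const string& b)
{
    if (a.size() != b.size())
        return false;             // different lengths can never match
    for (size_t k = 0; k != a.size(); k++)
    {
        if (tolower(a.at(k)) != tolower(b.at(k)))
            return false;         // characters differ even after lowercasing
    }
    return true;
}
```

For example, sameIgnoringCase("Hello", "hELLO") is true. Comparing the two tolower results directly works because both are int values.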
There's a lot more you can do with strings and characters, but the information in this tutorial will suffice to enable you to do Project 3.
While a loop starting
for (int k = 0; k != s.size(); k++)
will work for everything you're doing in this class, technically the
expression s.size() returns a number of a special type
defined in the library: not int, but
string::size_type. This type name is a synonym for some
unsigned integer type. (An unsigned integer variable can contain only
whole numbers, no negatives.) It turns out that a consequence of the C++
expression rules is that if k is an int, the loop
above might not work correctly for strings over 2 billion characters long,
and the compiler might give you a warning about that, phrased as a
"signed/unsigned mismatch" or a "comparison of integer expressions of
different signedness". Since we won't be using such ridiculously long
strings, declaring k to be an int is fine.
Still, it's good practice to try to get a clean build with no warnings. As with the boy who cried wolf, if the compiler gives you many warnings about things that are harmless, you won't notice the warnings you should take seriously. To eliminate the warning you might get, you should declare k to be of the technically proper type:
for (string::size_type k = 0; k != s.size(); k++)
Most C++ library implementations make size_t synonymous with
string::size_type, so you can get away with the somewhat shorter
for (size_t k = 0; k != s.size(); k++)
Again, you don't have to do this; you can declare k to
be an int if you like, but in that case be prepared for possible
(harmless) signed/unsigned warnings.
If you do choose to declare k to be of type
string::size_type or size_t, you need to be sure
that you never try to make k negative. For example, if you try to
traverse a string backward, then your saying
for (string::size_type k = s.size()-1; k >= 0; k--) // WRONG
{
    ... s.at(k) ...
}
would not work. (If an unsigned integer k is 0
when you execute k--, it will end up with a huge positive
value. An unsigned integer is always >= 0, so the loop condition is never
false: we execute the loop body again and try to talk about a character at
a position way past the end of the string, which makes at die with a
runtime error.) One correct way to write the loop is
string::size_type k = s.size();
while (k > 0)
{
    k--;
    ... s.at(k) ...
}
Again, if we choose to make k an int, the
for loop version would be fine, but we'd get the (harmless)
signed/unsigned warning.
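To see the backward traversal with an unsigned index in a complete function, here is a sketch that builds a reversed copy of a string using the while loop shown above. The name reversed is hypothetical.

```cpp
#include <string>
using namespace std;

// Hypothetical helper: returns the characters of s in reverse order.
string reversed(const string& s)
{
    string result;
    string::size_type k = s.size();
    while (k > 0)
    {
        k--;                  // k is now a valid position in s
        result += s.at(k);
    }
    return result;
}
```

Because k is decremented only after the k > 0 test succeeds, it never drops below 0, and the loop also behaves correctly for the empty string.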