We have already talked a bit about characters. The primitive data type char represents a single character (i.e. anything that can be typed on a keyboard).
In most computer languages, characters are stored using ASCII (American Standard Code for Information Interchange). When you save a file as "plain text", it is typically stored using ASCII.
ASCII format uses 1 byte per character.
1 byte gives only 256 possible characters (128 standard and 128 non-standard). Page 357 lists the standard characters, or see the table on the web.
Example: The letter 'A' has a code of 65 (base 10), which is stored internally as 01000001 (base 2).
Note that the upper case and lower case letters of the alphabet are stored consecutively. This is why "addition" works on characters. That is, 'A'+1 corresponds to 65+1=66 which is the code for 'B'.
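The character arithmetic described above can be tried out with a short sketch like the one below. The class name CharArithmetic and the helper next are made up for this illustration. Note that in Java the expression c + 1 is an int, so a cast is needed to turn the result back into a char.

```java
class CharArithmetic {
    // Hypothetical helper: returns the character whose code is one greater than c's.
    static char next(char c) {
        return (char) (c + 1); // c + 1 is an int, so cast it back to char
    }

    public static void main(String[] args) {
        char a = 'A';
        System.out.println((int) a);   // prints 65
        System.out.println(next(a));   // prints B
        System.out.println(next('a')); // prints b
    }
}
```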
In order to accommodate more possible characters (e.g. from other natural languages), a newer code called Unicode was introduced.
The Unicode format discussed here (the one Java's char type uses) takes 2 bytes per character,
which allows for 2^16 = 65,536 (about 65K) different characters. The downside is that a Unicode text file stored this way takes up twice as much space as the same text in ASCII. See charts for examples.
All characters that have an ASCII representation have the same code in Unicode (except that it is stored in 2 bytes).
Java uses the Unicode format.
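As a rough check of these points, the sketch below prints the size of a char in bits and the numeric codes of a few characters. The class name UnicodeDemo and the helper codeOf are invented for this example; Character.SIZE is a constant from the standard library.

```java
class UnicodeDemo {
    // Hypothetical helper: returns the numeric code stored in a char.
    static int codeOf(char c) {
        return (int) c;
    }

    public static void main(String[] args) {
        System.out.println(Character.SIZE);   // prints 16 (bits in a char, i.e. 2 bytes)
        System.out.println(codeOf('A'));      // prints 65, the same code as in ASCII
        System.out.println(codeOf('\u00e9')); // prints 233, a letter outside standard ASCII
        System.out.println('A' == '\u0041');  // prints true (hex 41 is decimal 65)
    }
}
```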
A char can hold a single character. A char is a primitive type.
A String, on the other hand, can hold a sequence of characters. It is not a primitive type. A String is a class.
To create a String, one can write either
String name = "Alice";
or
String name = new String("Alice");
The two declarations above are not quite equivalent (see the discussion later about comparing Strings).
A String is stored as an array (we'll study this in the next chapter), i.e. as a sequence of characters. Each character has an index that can be used to reference that character.
The first character is always at index 0. For example, with name set to "Alice":
char z = name.charAt(3); // set z equal to 'c'
Note that characters might be blanks:
String day = " Wednesday";
char d = day.charAt(0); // set d = ' '
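To illustrate indexing further, here is a small sketch that walks through a String one index at a time using charAt and length. The class CharAtDemo and the method countChar are hypothetical names chosen for this example.

```java
class CharAtDemo {
    // Hypothetical helper: counts how many times target occurs in s.
    static int countChar(String s, char target) {
        int count = 0;
        // Valid indexes run from 0 to s.length() - 1.
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == target) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String name = "Alice";
        System.out.println(name.charAt(0));                 // prints A
        System.out.println(name.charAt(name.length() - 1)); // prints e (the last character)

        String day = " Wednesday";
        System.out.println(countChar(day, 'e')); // prints 2
        System.out.println(countChar(day, ' ')); // prints 1 (the leading blank)
    }
}
```

Asking for an index outside the range 0 to length() - 1 causes a StringIndexOutOfBoundsException at run time.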
The String class has a number of other useful methods. Take a moment to look up String in the API reference.
length(): gives the number of characters in the String
Example: outputBox.printLine(name.length()); // prints 5
toUpperCase(): returns a new String with all letters in upper case. It does not change the original String.
Example: outputBox.printLine(name.toUpperCase());
// prints "ALICE". Note, name is not changed
Also look up: concat, compareTo, indexOf, valueOf, etc.
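A minimal sketch of those four methods follows. The class StringMethodsDemo and the helper describeOrder are made up for illustration; the String methods themselves are the standard ones from the API reference.

```java
class StringMethodsDemo {
    // Hypothetical helper: reports how a sorts relative to b using compareTo.
    static String describeOrder(String a, String b) {
        int result = a.compareTo(b); // negative, zero, or positive
        if (result < 0) return "before";
        if (result > 0) return "after";
        return "same";
    }

    public static void main(String[] args) {
        String name = "Alice";
        System.out.println(name.concat(" Smith"));            // prints Alice Smith
        System.out.println(name.indexOf('c'));                // prints 3
        System.out.println(name.indexOf('z'));                // prints -1 (not found)
        System.out.println(describeOrder("apple", "banana")); // prints before
        System.out.println(String.valueOf(42).length());      // prints 2 (the String "42")
    }
}
```

Note that compareTo compares character codes, so every upper case letter sorts before every lower case letter.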
Comparing Strings is tricky. There are several different ways of doing it depending on what you want to compare and how the String is stored.
Given strings s1 and s2, one can compare using
s1 == s2
This is true only if s1 and s2 refer to the same object.
s1.equals(s2)
This is true if the characters of s1 and s2 match exactly.
Can you predict what the output of the following will be?
String n1 = "cat" ;
String n2 = "cat";
String n3 = new String("cat");
String n4 = n1;
String n5 = " cat";
// ********* compare n1 and n2
if (n1.equals(n2)) System.out.println("n1 equals n2");
else System.out.println("n1 does not equal n2");
if (n1 == n2) System.out.println("n1 == n2");
else System.out.println("n1 != n2");
// ********* compare n1 and n3
if (n1.equals(n3)) System.out.println("n1 equals n3");
else System.out.println("n1 does not equal n3");
if (n1 == n3) System.out.println("n1 == n3");
else System.out.println("n1 != n3");
// ********* compare n1 and n4
if (n1.equals(n4)) System.out.println("n1 equals n4");
else System.out.println("n1 does not equal n4");
if (n1 == n4) System.out.println("n1 == n4");
else System.out.println("n1 != n4");
// ********* compare n1 and n5
if (n1.equals(n5)) System.out.println("n1 equals n5");
else System.out.println("n1 does not equal n5");
if (n1 == n5) System.out.println("n1 == n5");
else System.out.println("n1 != n5");
When you declare a String as
String n = new String("a unique string");
you are always creating a new String object. However, when you create a string from a literal (an "anonymous" string), as in
String n = "an anonymous string";
you are creating a new object only if there is not already a literal String object with exactly the same characters. Otherwise, the existing object is reused.
Usually what you are interested in is whether or not two strings contain the same sequence of characters. In this case, using the equals() method will always work regardless of how the string is declared.
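One more case worth trying is a string that is assembled while the program runs. The sketch below assumes a hypothetical helper named join; because its result is built at run time, it is a new object even though no "new String" appears in the code, so == reports false while equals() reports true.

```java
class StringCompareDemo {
    // Hypothetical helper: builds a String at run time from two pieces.
    static String join(String first, String second) {
        return first + second;
    }

    public static void main(String[] args) {
        String literal = "cat";
        String built = join("ca", "t"); // same letters, but a different object

        System.out.println(literal.equals(built)); // prints true  (same characters)
        System.out.println(literal == built);      // prints false (different objects)
    }
}
```

This is one reason equals() is the safer habit: strings read from the keyboard or a file are also built at run time.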
Strings are said to be immutable, meaning they can't be changed. However, you can set a string reference to point to a new String:
String d1 = "Saturday";
d1 = "Wednesday"; // d1 now refers to a different String; "Saturday" itself was never modified
If you concatenate, the result is a new String and the original is left unchanged:
String n1 = "Willamette";
String n2 = n1.concat(" University"); // concat returns
// a new string but doesn't change n1
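The following sketch shows immutability in action. The class ImmutableDemo and the helper withSuffix are invented names; the point is that methods such as concat and toUpperCase hand back a new String, so the result must be stored somewhere if you want to keep it.

```java
class ImmutableDemo {
    // Hypothetical helper: returns base followed by suffix; base is not modified.
    static String withSuffix(String base, String suffix) {
        return base.concat(suffix);
    }

    public static void main(String[] args) {
        String n1 = "Willamette";
        String n2 = withSuffix(n1, " University");
        System.out.println(n1); // prints Willamette (unchanged)
        System.out.println(n2); // prints Willamette University

        n1.toUpperCase();       // result is thrown away, so nothing appears to happen
        System.out.println(n1); // prints Willamette

        n1 = n1.toUpperCase();  // n1 now refers to the new String
        System.out.println(n1); // prints WILLAMETTE
    }
}
```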
What do you do if you want to manipulate the actual characters in a String? Use a StringBuffer, which can be changed in place.
StringBuffer name = new StringBuffer("willamette");
name.setCharAt(0,'W');
name.append(" University");
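Put together as a complete program, the StringBuffer steps above might look like the sketch below. The class StringBufferDemo and the method capitalizeAndExtend are hypothetical, and the method assumes its first argument is not empty.

```java
class StringBufferDemo {
    // Hypothetical helper: capitalizes the first letter of s and appends suffix.
    // Assumes s contains at least one character.
    static String capitalizeAndExtend(String s, String suffix) {
        StringBuffer buffer = new StringBuffer(s);
        buffer.setCharAt(0, Character.toUpperCase(buffer.charAt(0))); // change in place
        buffer.append(suffix);                                        // grow in place
        return buffer.toString(); // convert back to an ordinary String
    }

    public static void main(String[] args) {
        StringBuffer name = new StringBuffer("willamette");
        name.setCharAt(0, 'W');
        name.append(" University");
        System.out.println(name); // prints Willamette University

        System.out.println(capitalizeAndExtend("alice", " Smith")); // prints Alice Smith
    }
}
```

Unlike the String methods, setCharAt and append modify the buffer itself, so no reassignment is needed.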