Strings
A string literal is a sequence of characters enclosed within double quotes:
"Put a disk in drive A, then press any key to continue\n"
String literals often appear in calls of printf and scanf.
If a string literal is too long to fit on one line, it can be continued on the next line by ending the first line with a backslash (\):
printf("Put a disk in drive A, then \
press any key to continue\n");
The drawback is that the literal must continue at the very beginning of the next line, which ruins the program's indented structure.
A better way was added to C when the language was standardized. We can split a string literal like this:
printf("Put a disk in drive A, then "
       "press any key to continue\n");
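When two string literals are adjacent, separated only by white space, the compiler joins them into a single string. The following is a minimal sketch of that rule; the function name check_spliced_literal is made up for illustration, and the check is written as an assertion.

```c
#include <assert.h>
#include <string.h>

/* Illustrative check: a literal split across two lines is the same
   string as the literal written on one line. */
void check_spliced_literal(void)
{
    const char *one_line = "Put a disk in drive A, then press any key to continue\n";
    const char *spliced  = "Put a disk in drive A, then "
                           "press any key to continue\n";

    assert(strcmp(one_line, spliced) == 0);
}
```

Note that the space after "then" has to be written inside the first literal; nothing is inserted between the two pieces when they are joined.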
C treats string literals as character arrays.
When a C compiler encounters a string literal of length n in a program, it sets aside n + 1 characters of memory for the string. This area of memory will contain the characters in the string, plus one extra character – the null character – to mark the end of the string.
The null character is the very first character in the ASCII character set, so it’s represented by the \0 escape sequence.
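The n + 1 rule can be observed with sizeof, which reports the size of the array that holds the literal. This is only a small sketch; the function name check_literal_storage is an assumption made for the example.

```c
#include <assert.h>

/* Illustrative check: the literal "abc" has length 3 but occupies
   4 characters, the last of which is the null character. */
void check_literal_storage(void)
{
    assert(sizeof("abc") == 4);   /* 3 characters + null character */
    assert("abc"[3] == '\0');     /* the extra character marks the end */
    assert('\0' == 0);            /* the null character has code 0 */
    assert(sizeof("") == 1);      /* even the empty string stores a null */
}
```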
Ex. sheet Strings 1.
Since a string literal is stored as an array, the compiler treats it as a pointer of type char *. Both printf and scanf expect a value of char * as their first argument.
printf("abc");
When printf is called, it is passed the address of "abc" (a pointer to where the letter a is stored in memory).
char *p;
p = "abc";
This assignment doesn't copy the characters in "abc"; it makes p point to the first character of the string.
A string literal can also be subscripted:
char ch;
ch = "abc"[1];
ch is now the letter b.
Subscript 0 would give the letter a, 2 the letter c, and 3 the null character. Subscripting a literal is not used much.
Ex. A function that converts a number between 0 and 15 into the corresponding hex digit:
char digit_to_hex_char(int digit)
{
    return "0123456789ABCDEF"[digit];
}
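As a possible use of digit_to_hex_char, the sketch below converts one byte into its two hex digits. The helper byte_to_hex and the check function are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <string.h>

char digit_to_hex_char(int digit)
{
    return "0123456789ABCDEF"[digit];
}

/* Hypothetical helper: writes the two hex digits of byte (0..255)
   into out, which must have room for 3 characters. */
void byte_to_hex(unsigned int byte, char out[])
{
    out[0] = digit_to_hex_char((byte >> 4) & 0xF);  /* high four bits */
    out[1] = digit_to_hex_char(byte & 0xF);         /* low four bits */
    out[2] = '\0';
}

void check_byte_to_hex(void)
{
    char buf[3];

    byte_to_hex(0xAF, buf);
    assert(strcmp(buf, "AF") == 0);
    byte_to_hex(9, buf);
    assert(strcmp(buf, "09") == 0);
}
```

digit_to_hex_char itself does no range checking, so the masking with 0xF in byte_to_hex is what keeps the subscript between 0 and 15.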
Attempting to change the characters in a string literal causes undefined behavior and should be avoided.
char *p = "abc";
*p = 'b';   /* WRONG: attempts to modify a string literal */
With some compilers the literal really does become "bbc"; with others, which store literals in read-only memory, the program may crash or behave erratically.
A string literal containing one character isn't the same as a character constant. "a" is represented by a pointer to a memory location that contains the character a (followed by a null character). The character constant 'a' is represented by an integer (the ASCII code for the character).
printf('\n');   /* WRONG */
This is not legal because printf expects a pointer as its first argument.
If we need a string variable that can store up to 80 characters:
#define STR_LEN 80
char str[STR_LEN+1];
We add 1 to the macro to leave room for the null character. STR_LEN is defined as 80 rather than 81 to emphasize the fact that str can store strings of no more than 80 characters.
This doesn't mean that str will always contain a string of STR_LEN characters. The length of a string depends on the position of the terminating null character, not on the length of the array in which the string is stored.
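The difference between the size of the array and the length of the string stored in it can be checked with sizeof and the library function strlen (declared in <string.h>). A minimal sketch, with check_length_vs_capacity as an illustrative name:

```c
#include <assert.h>
#include <string.h>

#define STR_LEN 80

void check_length_vs_capacity(void)
{
    char str[STR_LEN+1] = "abc";

    assert(sizeof(str) == STR_LEN + 1);  /* room set aside for the array */
    assert(strlen(str) == 3);            /* characters before the first null */

    str[1] = '\0';                       /* move the terminating null */
    assert(strlen(str) == 1);            /* the string is now "a" */
    assert(sizeof(str) == STR_LEN + 1);  /* the array itself is unchanged */
}
```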
A string variable can be initialized at the point of declaration:
char date1[8] = "June 14";
The compiler copies the characters from "June 14" into the date1 array and adds a null character so that date1 can be used as a string.
date1 array: 'J' 'u' 'n' 'e' ' ' '1' '4' '\0'
In this context "June 14" is not a string literal; C views it as an abbreviation for an array initializer:
char date1[8] = {'J', 'u', 'n', 'e', ' ', '1', '4', '\0'};
If the initializer is shorter than the array, the compiler adds extra null characters:
char date2[9] = "June 14";
date2 array: 'J' 'u' 'n' 'e' ' ' '1' '4' '\0' '\0'
An initializer that is longer than the string variable is illegal for strings, just as for other arrays. C does allow the initializer (not counting the null character) to have exactly the same length as the variable:
char date3[7] = "June 14";
date3 array: 'J' 'u' 'n' 'e' ' ' '1' '4'
There is no room for the null character, so the compiler doesn't store one.
So be sure the array is longer than the initializer; otherwise the compiler will quietly omit the null character and the array will be unusable as a string.
The length of the array can be omitted:
char date4[] = "June 14";
The compiler then sets aside eight characters for date4, enough to store the characters in "June 14" plus a null character. The length can't be changed later; once the program is compiled, the length of date4 is fixed at eight.
Omitting the length is especially useful if the initializer is long, since computing the length by hand can be error-prone.
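The effect of the different declarations can be confirmed with a few assertions. This is a sketch only; check_initializers is an illustrative name, and date3 is deliberately left out because it holds no null character and so can't safely be passed to string functions.

```c
#include <assert.h>
#include <string.h>

void check_initializers(void)
{
    char date1[8] = "June 14";   /* exactly enough room for the null */
    char date2[9] = "June 14";   /* one spare element, filled with a null */
    char date4[]  = "June 14";   /* length computed by the compiler */

    assert(date1[7] == '\0');
    assert(date2[7] == '\0' && date2[8] == '\0');
    assert(sizeof(date4) == 8);           /* 7 characters + null character */
    assert(strcmp(date1, date4) == 0);    /* same string in both arrays */
}
```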
Character Arrays vs Character Pointers
char date[] = "June 14";
Here date is an array of characters.
char *date = "June 14";
Here date is a pointer to a string literal.
Either version of date can be used as a string. There are differences:
Array: the characters stored in date can be modified, like the elements of any array.
Pointer: date points to a string literal, which shouldn't be modified.
Array: date is an array name.
Pointer: date is a variable that can be made to point to other strings during program execution.
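The differences listed above might be exercised as follows. The variable names date_array and date_ptr are invented for this sketch, and the pointer is declared const char * (an assumption that goes beyond the declarations above) so that the compiler rejects accidental writes to the literal.

```c
#include <assert.h>
#include <string.h>

void check_array_vs_pointer(void)
{
    char date_array[] = "June 14";      /* array: holds its own copy */
    const char *date_ptr = "June 14";   /* pointer: refers to a literal */

    date_array[5] = '2';                /* fine: array elements can be modified */
    date_array[6] = '1';
    assert(strcmp(date_array, "June 21") == 0);

    date_ptr = "July 4";                /* fine: the pointer can point elsewhere */
    assert(strcmp(date_ptr, "July 4") == 0);

    date_ptr = date_array;              /* it can also point to an array */
    assert(date_ptr[0] == 'J');
}
```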
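char *p;
This declaration sets aside memory for a pointer variable; it doesn't allocate space for a string.
Before p can be used as a string, it must point to an array of characters. One possibility is to make it point to a string variable that already exists:
char str[STR_LEN+1], *p;
p = str;
p now points to the first character of str, so we can use p as a string.
Using an uninitialized pointer variable as a string is a serious error:
char *p;
p[0] = 'a';   /* WRONG */
p hasn't been initialized, so we don't know where it's pointing. Writing through it causes undefined behavior.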
Reading and Writing Strings
Writing a string is easy using printf or puts.
Reading a string is harder, because the input string may be longer than the string variable into which it's being stored.
To read a string in a single step, we can use scanf or gets. As an alternative, we can read strings one character at a time.
Writing
The %s conversion specification allows printf to write a string:
char str[] = "Are we having fun yet?";
printf("Value of str: %s\n", str);
The output will be:
Value of str: Are we having fun yet?
printf writes the characters in a string one by one until it encounters a null character. (If the null character is missing, printf continues past the end of the string until it finds a null character somewhere in memory.)
To print just part of a string, we can use the conversion %.ps, where p is the number of characters to be displayed:
printf("%.6s\n", str);
This will print:
Are we
A string, like a number, can be printed within a field. The %ms conversion displays a string in a field of size m. (A string with more than m characters is printed in full, not truncated.) If the string has fewer than m characters, it is right-justified within the field. To force left justification, we put a minus sign in front of m. The two can be combined: %m.ps causes the first p characters of a string to be displayed in a field of size m.
puts(str);
puts has only one argument, the string to be printed; there is no format string. After writing the string, puts always writes an additional new-line character.
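The field-width and precision rules can be checked without looking at the screen by formatting into a buffer with snprintf, which accepts the same conversions as printf. This is a sketch under that assumption; the square brackets are added only to make the padding visible, and check_string_conversions is an illustrative name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

void check_string_conversions(void)
{
    char str[] = "Are we having fun yet?";
    char out[40];

    snprintf(out, sizeof(out), "[%.6s]", str);      /* first 6 characters */
    assert(strcmp(out, "[Are we]") == 0);

    snprintf(out, sizeof(out), "[%10.6s]", str);    /* right-justified in a field of 10 */
    assert(strcmp(out, "[    Are we]") == 0);

    snprintf(out, sizeof(out), "[%-10.6s]", str);   /* left-justified in a field of 10 */
    assert(strcmp(out, "[Are we    ]") == 0);

    snprintf(out, sizeof(out), "[%3s]", "abcdef");  /* longer than the field: not truncated */
    assert(strcmp(out, "[abcdef]") == 0);
}
```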
Reading
The %s conversion specification allows scanf to read a string:
scanf("%s", str);
There's no need to put the & operator in front of str in the call of scanf; since str is an array name, it's treated as a pointer automatically.
scanf skips white space, then reads characters and stores them into str until it encounters a white-space character. scanf always stores a null character at the end of the string. A string read using scanf will never contain white space. Consequently, scanf won't usually read a full line of input; a new-line character will cause scanf to stop reading, but so will a space or tab character.
To read an entire line of input at a time, we can use gets. Like scanf, gets reads input characters into an array, then stores a null character.
In other respects gets differs from scanf:
gets doesn't skip white space before starting to read the string (scanf does).
gets reads until it finds a new-line character (scanf stops at any white-space character). gets discards the new-line character instead of storing it in the array; the null character takes its place.
char sentence[SENT_LEN+1];
printf("Enter a sentence:\n");
scanf("%s", sentence);
Suppose that after the prompt, the user enters:
To C, or not to C: that is the question.
scanf will store the string "To" into sentence. The next call of scanf will resume reading the line at the space after the word To.
If we replace the call of scanf by gets(sentence);
gets will store the string
"To C, or not to C: that is the question." into sentence.
scanf and gets have no way to detect when they've filled the array, so they may store characters past the end of the array. scanf can be made safer by using %ns instead of %s, where n is a number indicating the maximum number of characters to be stored. gets is inherently unsafe and was removed from the standard library in C11; fgets is a safer replacement.
gets and puts are generally faster than scanf and printf, since the functions are simpler.
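The %ns form can be tried out with sscanf, which follows the same conversion rules as scanf but reads from a string, so the example doesn't depend on keyboard input. The macro WORD_LEN and the function name are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define WORD_LEN 5

void check_bounded_read(void)
{
    const char *input = "  question. To C";
    char word[WORD_LEN+1];
    int n;

    /* %5s skips white space, then stores at most 5 characters
       plus the null character */
    n = sscanf(input, "%5s", word);

    assert(n == 1);
    assert(strcmp(word, "quest") == 0);
}
```

The number in the conversion has to be written into the format string itself and should be one less than the size of the array, leaving room for the null character.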
Reading Strings Character by Character
Since both scanf and gets are risky and insufficiently flexible for many applications, C programmers often write their own input functions. By reading strings one character at a time, these functions provide a greater degree of control than the standard input functions.
Issues to consider when designing such a function:
Should it skip white space before beginning to store the string?
What character causes the function to stop reading: a new-line character, any white-space character, or some other character? Is this character stored in the string or discarded?
What should the function do if the input string is too long to store: discard the extra characters or leave them for the next input operation?
Suppose we need a function read_line that doesn't skip white-space characters, stops reading at the first new-line character (which isn't stored), and discards extra characters. Its prototype:
int read_line(char str[], int n);
str is the array into which the input is stored; n is the maximum number of characters to be stored, so the array needs room for at least n + 1 characters.
read_line returns the number of characters it actually stored into str.
read_line reads characters one by one using getchar:
int read_line(char str[], int n)
{
    int ch;      /* int rather than char, so that EOF can be detected */
    int i = 0;
    while ((ch = getchar()) != '\n' && ch != EOF)
        if (i < n)
            str[i++] = ch;
    str[i] = '\0';   /* terminates string */
    return i;        /* number of characters stored */
}
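To try the function without typing input, the sketch below uses a hypothetical variant, read_line_from, that takes the stream as a parameter and is otherwise the same loop. The check writes two lines to a temporary file created with tmpfile and reads them back; the names and the test data are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical variant of read_line that reads from any stream. */
int read_line_from(FILE *fp, char str[], int n)
{
    int ch, i = 0;

    while ((ch = getc(fp)) != '\n' && ch != EOF)
        if (i < n)
            str[i++] = ch;
    str[i] = '\0';
    return i;
}

void check_read_line_from(void)
{
    char line[5+1];
    int n;
    FILE *fp = tmpfile();

    assert(fp != NULL);
    fputs("  To C, or not\nnext\n", fp);
    rewind(fp);

    n = read_line_from(fp, line, 5);
    assert(n == 5);
    assert(strcmp(line, "  To ") == 0);   /* leading spaces kept, excess discarded */

    n = read_line_from(fp, line, 5);
    assert(n == 4);
    assert(strcmp(line, "next") == 0);    /* new-line character not stored */
    fclose(fp);
}
```

Because the loop keeps reading after the array is full, the second call starts cleanly at the next line rather than in the middle of the first one.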
Accessing the Characters in a String
Since strings are stored in arrays, we can use subscripting to access the characters in a string.
For example, the following function counts the number of spaces in a string:
int count_spaces(const char s[])
{
    int count = 0, i;
    for (i = 0; s[i] != '\0'; i++)
        if (s[i] == ' ')
            count++;
    return count;
}
s is declared const because count_spaces doesn't change the array.
If s weren't a string, count_spaces would need a second argument indicating the length of the array. Because s is a string, the function can find the end by testing for the null character.
The function can be rewritten using pointer arithmetic:
int count_spaces(const char *s)
{
    int count = 0;
    for (; *s != '\0'; s++)
        if (*s == ' ')
            count++;
    return count;
}
const doesn't prevent count_spaces from modifying s; it's there to prevent the function from modifying what s points to. Since s is a copy of the pointer that's passed to count_spaces, incrementing s doesn't affect the original pointer.
Which is better for accessing the characters in a string: array operations or pointer operations?
Use whichever is more convenient, or mix the two. In count_spaces, using a pointer simplifies the function slightly by removing the need for the variable i. Traditionally, C programmers lean toward using pointer operations for processing strings.
Should a string parameter be declared as an array or as a pointer?
There is no difference between the two; the compiler treats an array parameter as though it had been declared as a pointer.
Does the form of the parameter (s[] or *s) affect what can be supplied as an argument?
No. The argument could be an array name, a pointer variable, or a string literal; count_spaces can't tell the difference.
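The last answer can be illustrated by calling count_spaces with each kind of argument. This is a small sketch; the test strings and the name check_count_spaces are made up for the example.

```c
#include <assert.h>

int count_spaces(const char *s)
{
    int count = 0;

    for (; *s != '\0'; s++)
        if (*s == ' ')
            count++;
    return count;
}

void check_count_spaces(void)
{
    char sentence[] = "To C, or not to C";
    const char *p = sentence + 3;             /* points to "C, or not to C" */

    assert(count_spaces(sentence) == 5);      /* array name */
    assert(count_spaces(p) == 4);             /* pointer variable */
    assert(count_spaces("a b c") == 2);       /* string literal */
}
```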
C String Library
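Strings can't be copied or compared using C's operators. Copying a string into a character array with the = operator is not possible:
char str1[10], str2[10];
str1 = "abc";   /* WRONG */
str2 = str1;    /* WRONG */
These statements are illegal: an array name can't be the left operand of =, and C would in any case treat them as assignments of one pointer to another, not as copies of the characters.
char str1[10] = "abc";
Initializing an array with = in its declaration is legal; in that context = is not the assignment operator.
Comparing strings with an equality or relational operator is legal, but it won't produce the desired result:
if (str1 == str2) ...   /* WRONG */
This compares str1 and str2 as pointers; it doesn't compare the contents of the two arrays. Since str1 and str2 have different addresses, the value of the expression is 0.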
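The usual way to compare contents is the library function strcmp from <string.h>, which returns 0 when the two strings are equal. The sketch below contrasts it with a pointer comparison; the function name check_compare and the pointer variables p1 and p2 are introduced only for illustration.

```c
#include <assert.h>
#include <string.h>

void check_compare(void)
{
    char str1[10] = "abc", str2[10] = "abc";
    char *p1 = str1, *p2 = str2;

    assert(p1 != p2);                  /* two arrays, two different addresses */
    assert(strcmp(str1, str2) == 0);   /* but the same contents */
    assert(strcmp(str1, "abd") < 0);   /* "abc" comes before "abd" */
}
```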
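The C library provides a set of functions for performing operations on strings. Their prototypes reside in the <string.h> header, so programs that need string operations should contain the line
#include <string.h>
String parameters in <string.h> are declared to have type char * (or const char * when the function doesn't change the string). An argument that a function will modify shouldn't be a string literal.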
strcpy - string copy
char *strcpy(char *s1, const char *s2);
strcpy copies the string s2 into the string s1. (More precisely, strcpy copies the string pointed to by s2 into the array pointed to by s1.)
It copies characters from s2 to s1 up to and including the first null character in s2. strcpy returns s1 (a pointer to the destination string). The string pointed to by s2 isn't modified, so it's declared const.
We can't write
str1 = "abcd";   /* WRONG */
because an array name can't be on the left side of an assignment. Instead, we call strcpy:
strcpy(str1, "abcd");   /* str1 now contains "abcd" */
Similarly, we can't assign str1 to str2 directly, but we can call strcpy:
strcpy(str2, str1);   /* str2 now contains "abcd" */
strcpy has no way to check that the string fits in the destination array. If the string in str1 is longer than str2 can hold, strcpy still keeps copying until it reaches the null character, writing past the end of str2. strncpy, which takes a third argument limiting the number of characters copied, is safer.
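A common way of using strncpy is sketched below. strncpy doesn't add a null character when the source is at least as long as the limit, so the terminator is stored by hand; the array sizes and the name check_bounded_copy are assumptions chosen for the example.

```c
#include <assert.h>
#include <string.h>

void check_bounded_copy(void)
{
    char str1[10] = "abcdefghi";   /* 9 characters plus null character */
    char str2[5];

    strncpy(str2, str1, sizeof(str2) - 1);   /* copy at most 4 characters */
    str2[sizeof(str2) - 1] = '\0';           /* make sure str2 is terminated */

    assert(strcmp(str2, "abcd") == 0);       /* truncated, but a valid string */
}
```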
Usually we discard strcpy's return value. Occasionally it's useful to call strcpy as part of a larger expression, for example to chain a series of calls:
strcpy(str2, strcpy(str1, "abcd"));
/* both str1 and str2 now contain "abcd" */
strcat – String Concatenate function
char *strcat(char *s1, const char *s2);
strcat appends the contents of the string s2 to the end of the string s1; it returns s1 (a pointer to the resulting string).
Ex.
strcpy(str1, "abc");
strcat(str1, "def");   /* str1 now contains "abcdef" */
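Like strcpy, strcat doesn't check whether the destination array is large enough for the result. One possible precaution is to test the lengths first, as in this sketch; append_if_fits is a hypothetical helper, not a library function.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: appends src to dst only if the result, including
   the null character, fits in an array of dst_size characters.
   Returns 1 on success, 0 if dst was left unchanged. */
int append_if_fits(char dst[], size_t dst_size, const char *src)
{
    if (strlen(dst) + strlen(src) + 1 > dst_size)
        return 0;
    strcat(dst, src);
    return 1;
}

void check_append(void)
{
    char str1[7];
    int ok;

    strcpy(str1, "abc");
    ok = append_if_fits(str1, sizeof(str1), "def");   /* 6 characters + null fit */
    assert(ok == 1);
    assert(strcmp(str1, "abcdef") == 0);

    ok = append_if_fits(str1, sizeof(str1), "g");     /* no room left */
    assert(ok == 0);
    assert(strcmp(str1, "abcdef") == 0);
}
```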