STAT 6250 - Character Functions - Dr. Fan
Reading assignment: Chapter 12
Changing the Case of Characters
- UPCASE(char. var.) converts all letters in the char. var. to uppercase
- LOWCASE(char. var.) converts all letters in the char. var. to lowercase
- PROPCASE(char. var.) converts the char. var. to "proper" case, i.e. for each word, it capitalizes the first letter and converts the rest to lowercase
Example: program 12-2
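The three case functions can be compared side by side in a short DATA step. This is a minimal sketch, not one of the textbook programs; the variable names (name, upper, lower, proper) and the sample value are made up for illustration.

```sas
data _null_;
   length name $ 20;
   name = 'rONALD cODY';          /* hypothetical sample value */

   upper  = upcase(name);         /* RONALD CODY */
   lower  = lowcase(name);        /* ronald cody */
   proper = propcase(name);       /* Ronald Cody */

   put upper= / lower= / proper=; /* write the results to the log */
run;
```

Converting both sides of a comparison to the same case (for example with UPCASE) is a common way to make a match case-insensitive.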
Handling Blanks
Each of the following functions handles blanks in a character variable:
- COMPBL(char. var.) compresses each run of multiple blanks to a single blank
- COMPRESS(char. var.) removes all blanks from each string; an optional second argument lists other characters to remove (p. 219)
- LEFT(char. var.) left-aligns each string by moving any leading blanks to the end
- TRIM(char. var.) trims off all trailing blanks in each string
- STRIP(char. var.) removes all leading and trailing blanks in each string
Example: programs 12-3, 12-5, 12-6, 12-7
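The blank-handling functions listed above can be sketched as follows. The variable names and the sample value are assumptions made for illustration. LENGTHC, which is not covered in this handout, is a SAS function that returns the length of a string including trailing blanks; it is used here only to make the effect of TRIM and STRIP visible.

```sas
data _null_;
   length raw $ 20;
   raw = '  New   York  ';            /* hypothetical value: 2 leading blanks, 3 in the middle */

   one_blank = compbl(raw);           /* each run of blanks becomes a single blank */
   no_blank  = compress(raw);         /* all blanks removed: NewYork */
   shifted   = left(raw);             /* leading blanks moved to the end */

   len_trim  = lengthc(trim(raw));    /* 12: trailing blanks gone, leading blanks kept */
   len_strip = lengthc(strip(raw));   /* 10: leading and trailing blanks gone */

   put no_blank= len_trim= len_strip=;
run;
```

Because SAS pads a character variable with blanks up to its declared length, TRIM and STRIP matter mostly inside expressions such as concatenation or length checks, as in the two LENGTHC calls.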
Searching for Characters, Words, and so on
- FIND(string, find-string, modifiers, starting-position) searches the string for the find-string, beginning at starting-position and applying any modifiers (p. 219); it returns the position of the first occurrence, or 0 if the find-string is not found. The modifiers and starting-position arguments are optional
- FINDC(string, find-string, modifiers, starting-position) searches the string for any one of the individual characters in the find-string, beginning at starting-position and applying any modifiers; it returns the position of the first such character, or 0 if none occurs
- FINDW(string, word, delimiters, modifiers, starting-position) searches the string for the word as a whole word (set off by the delimiters), beginning at starting-position and applying any modifiers; it returns the position where the first occurrence of the word starts, or 0 if there is none
Example: programs 12-8, 12-9
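The difference between searching for a substring, for a character, and for a whole word shows up clearly on one sample string. This is a minimal sketch with made-up variable names and data; the 'i' modifier (ignore case) is the only modifier used.

```sas
data _null_;
   string = 'The cat sat on the mat';     /* hypothetical sample value */

   pos_the   = find(string, 'the');       /* 16: case-sensitive, skips 'The' */
   pos_the_i = find(string, 'the', 'i');  /* 1: 'i' modifier ignores case */
   pos_at    = find(string, 'at');        /* 6: substring inside 'cat' */
   pos_char  = findc(string, 'xyzs');     /* 9: first of the characters x, y, z, s */
   word_at   = findw(string, 'at');       /* 0: 'at' never appears as a whole word */
   word_sat  = findw(string, 'sat');      /* 9: 'sat' is a whole word */

   put pos_the= pos_the_i= pos_at= pos_char= word_at= word_sat=;
run;
```

FIND locates 'at' inside 'cat', while FINDW returns 0 for the same search because 'at' is never set off by delimiters.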
- ANYALNUM(string) returns the position of the first alphanumeric character (letter or digit) in the string, and 0 if there is none
- ANYALPHA(string) returns the position of the first alpha character (letter) in the string, and 0 if there is none
- ANYDIGIT(string) returns the position of the first digit in the string, and 0 if there is none
- ANYPUNCT(string) returns the position of the first punctuation character in the string, and 0 if there is none
- ANYSPACE(string) returns the position of the first space character (blank, tab, etc.) in the string, and 0 if there is none
- NOT functions mirror the ANY functions: each returns the position of the first character that is not in the specified class, and 0 if every character is in the class; for example, NOTDIGIT returns the position of the first non-digit character
- VERIFY(string, valid-string) returns the position of the first character in the string that does not appear in the valid-string, and 0 if every character is valid
Example: programs 12-10, 12-11, 12-12
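A short sketch of the ANY functions, a NOT function, and VERIFY applied to one value. The variable names and the sample ID are assumptions made for illustration.

```sas
data _null_;
   id = 'AB123-X';                        /* hypothetical ID, no trailing blanks */

   first_digit    = anydigit(id);         /* 3: first digit is the '1' */
   first_punct    = anypunct(id);         /* 6: the hyphen */
   first_space    = anyspace(id);         /* 0: no space characters */
   first_nonalpha = notalpha(id);         /* 3: first character that is not a letter */

   /* position of the first character outside the list of valid characters */
   first_invalid  = verify(id, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789');   /* 6 */

   put first_digit= first_punct= first_space= first_nonalpha= first_invalid=;
run;
```

When a variable is longer than its value, the trailing blanks are part of the string, so the NOT functions and VERIFY may report the position of the first trailing blank. Applying TRIM or STRIP to the argument first avoids this.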
Extracting Characters/Words
- SUBSTR(string, starting position, length of substring) extracts a substring of the given length from the string, beginning at the starting position
- SCAN(string, n, delimiters) extracts the nth word in the string, where words are separated by any of the delimiter characters
Example: programs 12-13, 12-14, 12-15
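SUBSTR extracts by position and SCAN extracts by word count, which the following sketch contrasts on one record. The record layout and variable names are made up for illustration.

```sas
data _null_;
   record = 'Smith, John 2015-03-09';     /* hypothetical record */

   year  = substr(record, 13, 4);         /* 4 characters starting at position 13: 2015 */
   last  = scan(record, 1, ', ');         /* 1st word, comma and blank as delimiters: Smith */
   first = scan(record, 2, ', ');         /* 2nd word: John */

   put year= last= first=;
run;
```

SUBSTR suits fixed-column data, while SCAN is the better fit when the pieces vary in length but are separated by known delimiters.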
Other Data Cleaning Tools
- CATS(char. var1, char. var2) removes all leading and trailing blanks from var1 and var2, then joins them together
- CATX('separator', char. var1, char. var2) removes all leading and trailing blanks from var1 and var2, then joins them together with the given separator in between
Example: program 12-4
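The two concatenation functions can be sketched as below. The variable names and values are assumptions made for illustration, not taken from program 12-4.

```sas
data _null_;
   length first last $ 10;
   first = '  Ada';                       /* hypothetical values with extra blanks */
   last  = 'Lovelace';

   joined = cats(first, last);            /* AdaLovelace */
   full   = catx(' ', first, last);       /* Ada Lovelace */
   dashed = catx('-', first, last);       /* Ada-Lovelace */

   put joined= / full= / dashed=;
run;
```

Both functions strip the blanks that SAS pads onto character variables, so no separate TRIM or STRIP call is needed before joining.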
- COMPARE(string1, string2, modifiers) compares string1 and string2 after applying the modifiers: it returns 0 if they match, and otherwise a nonzero value whose magnitude is the position of the first character that differs
- SPEDIS(string1, string2) returns the spelling distance between string1 and string2: 0 for an exact match, and larger values the more the two strings differ
Example: programs 12-16, 12-18
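A minimal sketch of exact and approximate matching with COMPARE and SPEDIS. The names and values are made up for illustration; no specific SPEDIS score is shown for the misspelled name because the exact value depends on the function's internal cost rules.

```sas
data _null_;
   a = 'Robert';                          /* hypothetical values */
   b = 'robert';
   c = 'Robret';

   exact  = compare(a, b);                /* nonzero: the strings differ at position 1 */
   nocase = compare(a, b, 'i');           /* 0: 'i' modifier ignores case */

   same   = spedis(a, a);                 /* 0: identical strings */
   close  = spedis(a, c);                 /* positive: two letters are interchanged */

   put exact= nocase= same= close=;
run;
```

In data cleaning, a small SPEDIS value is often used as a threshold to flag likely misspellings of the same name.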
- TRANSLATE(variable, 'string1', 'string2') replaces each character of string2 found in the variable with the corresponding character of string1 (note the order: the "to" characters come first, the "from" characters second)
- TRANWRD(variable, 'word1', 'word2') replaces every occurrence of word1 in the variable with word2
Example: programs 12-19, 12-20
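The argument order of TRANSLATE and TRANWRD is easy to mix up, so a short sketch may help. The variable names and values are assumptions made for illustration.

```sas
data _null_;
   answer = 'abcab';                      /* hypothetical coded responses */
   /* TRANSLATE: "to" characters first, "from" characters second */
   coded  = translate(answer, '12', 'ab');        /* a->1, b->2: 12c12 */

   address = '12 Main Street';            /* hypothetical address */
   /* TRANWRD: word to find first, replacement second */
   short   = tranwrd(address, 'Street', 'St.');   /* 12 Main St. */

   put coded= / short=;
run;
```

TRANSLATE works character by character, while TRANWRD replaces whole substrings, and the two take their "from" and "to" arguments in opposite orders.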