STAT6250    Character Functions    Dr. Fan

Reading assignment: Chapter 12

Changing the Case of Characters

  • UPCASE(char. var.) converts all letters in the char. var. to uppercase
  • LOWCASE(char. var.) converts all letters in the char. var. to lowercase
  • PROPCASE(char. var.) converts the char. var. to “proper” case, i.e., capitalizes the first letter of each word and converts the remaining letters to lowercase

Example: program 12-2
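
A minimal sketch of the three case functions, separate from the textbook program. The variable names and the sample value are made up for illustration.

```sas
data _null_;
   length name $ 20;
   name   = 'jOHN smith';      /* hypothetical sample value */
   upper  = upcase(name);      /* JOHN SMITH */
   lower  = lowcase(name);     /* john smith */
   proper = propcase(name);    /* John Smith */
   put upper= lower= proper=;  /* write the results to the log */
run;
```

Converting to one case before comparing values is a common use: UPCASE(name) = 'JOHN SMITH' matches regardless of how the name was typed.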

Handling Blanks

The following functions all handle blanks in a character variable:

  • COMPBL(char. var.) compresses each run of multiple blanks to a single blank in each string
  • COMPRESS(char. var.) removes blanks (or other specified characters) from each string (p. 219)
  • LEFT(char. var.) left-aligns each string by moving its leading blanks to the end
  • TRIM(char. var.) trims off all trailing blanks in each string
  • STRIP(char. var.) removes all leading and trailing blanks in each string

Example: programs 12-3, 12-5, 12-6, 12-7
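
The blank-handling functions can be compared side by side in a short sketch. The variable names and the sample string are assumptions for illustration; the brackets are added only to make leading and trailing blanks visible in the log.

```sas
data _null_;
   length raw $ 20;
   raw = '  New    York  ';               /* leading, embedded, and trailing blanks */

   one_blank = compbl(raw);               /* each run of blanks becomes one blank */
   no_blank  = compress(raw);             /* NewYork: all blanks removed */
   left_al   = left(raw);                 /* leading blanks moved to the end */
   trimmed   = '[' || trim(raw)  || ']';  /* [  New    York] : trailing blanks dropped */
   stripped  = '[' || strip(raw) || ']';  /* [New    York]   : both ends dropped */

   put no_blank= / trimmed= / stripped=;
run;
```

TRIM and STRIP matter mainly inside concatenations, because a character variable is always padded with blanks to its full length when it is stored.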

Searching for Characters, Words, and so on

  • FIND(string, find-string, modifiers, starting-position) searches the string for the find-string, beginning at starting-position and subject to the modifiers (p. 219); it returns the position of the first occurrence, or 0 if the find-string is not found; i.e., it searches for the first occurrence of a substring
  • FINDC(string, find-string, modifiers, starting-position) searches the string for any one of the characters listed in the find-string, beginning at starting-position and subject to the modifiers; it returns the position of the first such character, or 0 if none is found; i.e., it searches for the first occurrence of individual characters
  • FINDW(string, word, delimiters, modifiers, starting-position) searches the string for the word, set off by the delimiters, beginning at starting-position and subject to the modifiers; it returns the starting position of the first occurrence, or 0 if the word is not found; i.e., it searches for the first occurrence of a whole word

Example: programs 12-8, 12-9
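
The difference between searching for a substring, a character, and a whole word is easiest to see on one string. This is an illustrative sketch with a made-up sample sentence, not one of the textbook programs.

```sas
data _null_;
   text = 'The cat sat on the Mat';

   p1 = find(text, 'the');         /* 16: the search is case-sensitive by default */
   p2 = find(text, 'the', 'i');    /* 1:  the i modifier ignores case */
   p3 = find(text, 'at', 7);       /* 10: search begins at position 7, skipping 'cat' */
   p4 = findc(text, 'xyzM');       /* 20: first character that is x, y, z, or M */
   p5 = findw(text, 'at');         /* 0:  'at' never occurs as a whole word */
   p6 = findw(text, 'sat');        /* 9:  'sat' occurs as a whole word */

   put p1= p2= p3= p4= p5= p6=;
run;
```

FINDW is the safer choice when a short search term such as 'at' could also appear inside longer words.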

  • ANYALNUM(string) returns the position of the first alphanumeric character (letter or digit) in the string, and 0 if there is none
  • ANYALPHA(string) returns the position of the first alphabetic character (letter) in the string, and 0 if there is none
  • ANYDIGIT(string) returns the position of the first digit in the string, and 0 if there is none
  • ANYPUNCT(string) returns the position of the first punctuation character in the string, and 0 if there is none
  • ANYSPACE(string) returns the position of the first white-space character (blank, tab, etc.) in the string, and 0 if there is none
  • The NOT functions mirror the ANY functions: each returns the position of the first character that is not in the specified class, and 0 if every character is in the class; for example, NOTDIGIT returns the position of the first non-digit character
  • VERIFY(string, valid-string) returns the position of the first character in the string that does not appear in the valid-string, and 0 if every character is valid

Example: programs 12-10, 12-11, 12-12
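
A short sketch of the ANY, NOT, and VERIFY functions applied to a single made-up ID value (the variable names are hypothetical). Each function returns a position, so a result greater than 0 also serves as a "found" flag.

```sas
data _null_;
   id = 'AB12-9';                          /* length 6, no trailing blanks */

   first_digit    = anydigit(id);          /* 3 */
   first_alpha    = anyalpha(id);          /* 1 */
   first_punct    = anypunct(id);          /* 5: the hyphen */
   first_nondigit = notdigit(id);          /* 1 */
   first_nonalpha = notalpha(id);          /* 3 */
   bad_pos        = verify(id, 'AB0123456789');   /* 5: hyphen is not in the valid list */
   all_valid      = verify('ABBA', 'AB');         /* 0: every character is valid */

   put first_digit= first_alpha= first_punct=
       first_nondigit= first_nonalpha= bad_pos= all_valid=;
run;
```

When the variable is longer than its value, the NOT functions and VERIFY treat the trailing blanks as characters too, so it is usual to wrap the argument in TRIM or STRIP first.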

Extracting Characters/Words

  • SUBSTR(string, starting-position, length) extracts a substring of the given length from the string, beginning at starting-position
  • SCAN(string, n, delimiters) extracts the nth word in the string, where words are separated by the delimiters

Example: programs 12-13, 12-14, 12-15
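
As a rough illustration, SUBSTR extracts by position while SCAN extracts by word. The phone number and variable names below are invented for the sketch.

```sas
data _null_;
   phone = '(908) 555-1234';

   area     = substr(phone, 2, 3);      /* 908:  3 characters starting at position 2 */
   exchange = scan(phone, 2, ' ()-');   /* 555:  2nd word; blank, parentheses, hyphen delimit */
   last4    = scan(phone, -1, '-');     /* 1234: a negative n counts words from the right */

   put area= exchange= last4=;
run;
```

SCAN is usually the better fit when the pieces vary in length, since SUBSTR needs fixed positions.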

Other Data Cleaning Tools

  • CATS(char. var1, char. var2) removes leading and trailing blanks from var1 and var2 and joins them together
  • CATX('separator', char. var1, char. var2) removes leading and trailing blanks from var1 and var2 and joins them together with the given separator in between

Example: program 12-4
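
A minimal sketch contrasting the || operator with CATS and CATX. The names and values are assumptions for illustration only.

```sas
data _null_;
   length first last $ 10;
   first = '  Ada ';
   last  = 'Lovelace';

   padded = first || last;          /* keeps the blanks stored in first */
   tight  = cats(first, last);      /* AdaLovelace */
   spaced = catx(' ', first, last); /* Ada Lovelace */
   dashed = catx('-', first, last); /* Ada-Lovelace */

   put tight= / spaced= / dashed=;
run;
```

CATS and CATX do the work that would otherwise take STRIP on every argument of a || concatenation.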

  • COMPARE(string1, string2, modifiers) compares string1 and string2 after applying the modifiers: it returns 0 if they match; otherwise it returns a non-zero value whose magnitude is the position of the first differing character
  • SPEDIS(string1, string2) returns the spelling distance between string1 and string2: 0 for an exact match, with larger values indicating a poorer match

Example: programs 12-16, 12-18
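
The following sketch shows how COMPARE and SPEDIS might be used for exact and approximate matching. The sample names are made up, and no specific SPEDIS score is claimed for the misspelled value.

```sas
data _null_;
   a = 'Robert';
   b = 'robert';
   c = 'Robret';                         /* two letters transposed */

   exact     = compare(a, b);            /* non-zero: the strings differ at position 1 */
   no_case   = compare(a, b, 'i');       /* 0: the i modifier ignores case */
   where_dif = abs(compare(a, c));       /* 4: first difference is at position 4 */
   dist_same = spedis(a, a);             /* 0: identical strings */
   dist_typo = spedis(c, a);             /* positive: a small spelling distance */

   put exact= no_case= where_dif= dist_same= dist_typo=;
run;
```

In fuzzy matching, a record is typically accepted when the SPEDIS value falls below a cutoff chosen by the analyst.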

  • TRANSLATE(variable, 'string1', 'string2') replaces each character of string2 that appears in the variable with the corresponding character of string1 (note the order: the "to" characters come first, the "from" characters second)
  • TRANWRD(variable, 'word1', 'word2') replaces every occurrence of word1 in the variable with word2

Example: programs 12-19, 12-20
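
The argument order of these two functions is easy to mix up, so a small sketch may help. The variables and values are hypothetical.

```sas
data _null_;
   length addr $ 30;
   grade = 'ABCAB';
   addr  = '12 Main Street';

   /* TRANSLATE: "to" list first, "from" list second; A->1, B->2, C->3 */
   grade_num  = translate(grade, '123', 'ABC');    /* 12312 */

   /* TRANWRD: the text to find first, its replacement second */
   addr_short = tranwrd(addr, 'Street', 'St.');    /* 12 Main St. */

   put grade_num= / addr_short=;
run;
```

TRANSLATE works character by character, while TRANWRD replaces a whole substring at a time.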
