Which of the following is *not* a mapping rule for token canonicalization?
1) Removing characters such as hyphens, periods, and accents
2) Reducing all letters to lower case (case-folding)
3) Collapsing alternate spellings (colour → color)
4) Keeping synonyms as different classes to include more diverse tokens
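
For reference, the character-level, case, and spelling mappings named in the options can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function name `canonicalize` and the one-entry `SPELLING_VARIANTS` table are assumptions made for the example, and a real system would rely on a much larger lexicon.

```python
import unicodedata

# Hypothetical lookup table of alternate spellings (assumption for illustration).
SPELLING_VARIANTS = {"colour": "color"}


def canonicalize(token: str) -> str:
    """Map a raw token to a canonical form."""
    # Case-folding: reduce all letters to lower case.
    token = token.lower()
    # Accent removal: decompose characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", token)
    token = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Remove hyphens and periods.
    token = token.replace("-", "").replace(".", "")
    # Collapse alternate spellings to a single form.
    return SPELLING_VARIANTS.get(token, token)


if __name__ == "__main__":
    for raw in ["U.S.A.", "Colour", "naïve", "anti-discriminatory"]:
        print(raw, "->", canonicalize(raw))
```

Each step merges surface variants of a token into one equivalence class, which is the common purpose of canonicalization rules.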