PharmaSUG 2013 - Paper CC30
Useful Tips for Handling and Creating Special Characters in SAS®
Bob Hull, SynteractHCR, Inc., Carlsbad, CA
Robert Howard, Veridical Solutions, Del Mar, CA
ABSTRACT
This paper discusses various ways of creating and handling special characters in SAS. Many people experience
difficulty when reading in Excel files and discover that strange "boxes" appear in the data. We discuss what these
are and how they can be dealt with. Can special characters be saved in a SAS program? How can these
characters be typed if they aren’t on the keyboard? We also provide examples of how to include special
characters such as Greek letters (μ), less than or equal to (≤), and registered trademark (®) in your SAS programs and
RTF output. This paper will help you better understand some of the ways special characters can be used within SAS.
INTRODUCTION
It can be said that the relationship between "special characters" and SAS is a tenuous one. Sometimes problems or
errors are encountered when trying to read in external data or even when simply accessing SAS datasets which have
variables containing special characters. As a result, either the file cannot be accessed or unrecognizable characters
appear.
After having to solve some real-life problems, we decided it would be best to summarize some of our solutions for
handling these issues. In this paper, we'll first look at some solutions for reading in data containing special characters
and look at some examples. Next, we'll go over some tricks for writing out special characters to either your SAS
output or RTF files.
READING IN SPECIAL CHARACTERS
By looking at a few examples, we will provide practical solutions for handling issues caused by reading in data
containing special characters.
UNICODE DATA ERROR WHEN READING IN SAS DATASETS
When reading in SAS datasets it’s possible that special characters will prevent you from being able to use the data.
Have you seen this transcoding error in your log?
ERROR: Some character data was lost during transcoding in the dataset DB.LABS. Either the data contains
characters that are not representable in the new encoding or truncation occurred during transcoding.
This error occurs when the dataset contains Unicode characters and your SAS session is not set up for Unicode, even
though the dataset looks the same as any other SAS dataset. Unicode support, which allows characters from many
different languages to be stored, became available beginning in Version 9.1.3. See the Recommended Reading section
for more information on Unicode from SAS.
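To confirm that an encoding mismatch is the cause, it can help to compare the encoding of the SAS session with the encoding stored in the dataset. The following is a minimal sketch, assuming the DB libref named in the error message is already assigned:

```sas
/* Display the encoding of the current SAS session */
proc options option=encoding;
run;

/* The same value is available in an automatic macro variable */
%put NOTE: Session encoding is &sysencoding;

/* The Encoding attribute in the PROC CONTENTS output shows how the
   dataset was stored (for example, utf-8 for Unicode data) */
proc contents data=db.labs;
run;
```

If the session reports a single-byte encoding such as wlatin1 while the dataset reports utf-8, SAS attempts to transcode the character data when the dataset is read, which is where the error can arise.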
The ideal solution is to read in the data using a SAS session with Unicode support, in which case the special
characters will display correctly. If a Unicode session is not available, the data can still be read in by telling
SAS not to transcode the character data, using the following code:
data temp;
   /* ENCODING='asciiany' reads the character data as is, without transcoding */
   set db.labs (encoding='asciiany');
run;
Although the data can now be accessed, closer inspection reveals that the special characters, which were the root of
the problem, appear in one of the variables (CUTOFF). The Greek letter "μ" has been converted to "Î¼": because no
transcoding takes place, the two bytes that make up "μ" in UTF-8 are each displayed as a separate single-byte character.
See Output 1 below for an example of how these values may appear.
Output 1. In this example the data is read in, but the Greek letter "μ" is converted to something indiscernible.
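To find which records are affected, one option is to flag values that contain bytes outside the 7-bit ASCII range. The following is a minimal sketch, assuming the TEMP dataset and CUTOFF variable from the example above and a single-byte session encoding; the dataset name NONASCII and the variable FIRST_BAD are hypothetical names chosen for illustration:

```sas
data nonascii;
   set temp;
   /* COLLATE(0,127) returns the 128 ASCII characters. VERIFY returns the
      position of the first character in CUTOFF that is not in that list,
      or 0 if every character is plain ASCII. */
   first_bad = verify(cutoff, collate(0, 127));
   if first_bad > 0;   /* keep only records with a non-ASCII byte */
run;

proc print data=nonascii;
   var cutoff first_bad;
run;
```

The same check could be repeated for other character variables to see where special characters occur before deciding how to handle them.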