Mathematics Homework Help
SAS questions #4: Reading Raw Data; Manipulating Data
Part I-Reading SAS data sets
1. Reading a Space-Delimited Raw Data File
a. Write a DATA step to create a new data set named work.qtrdonation. Read the space-delimited raw data file, which can be named as follows:
Windows | "&path\donation.dat"
UNIX | "&path/donation.dat"
z/OS (OS/390) | "&path..rawdata(donation)"
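The file references above depend on a macro variable named path being defined before the DATA step runs. The following is a minimal sketch of one way to set it; the folder name is a hypothetical placeholder, not a location given in the assignment.

```sas
/* Hypothetical folder: substitute the location that holds the raw data files. */
%let path=/home/student/data;

/* Write the resolved UNIX-style file reference to the log as a quick check. */
%put NOTE: donation file is &path/donation.dat;
```

Macro variable references resolve inside double quotation marks but not inside single quotation marks, so the file name in the INFILE statement needs double quotation marks. In the z/OS form, the first period ends the macro variable name and the second is the literal period in the data set name.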
Partial Raw Data File
120265 . . . 25
120267 15 15 15 15
120269 20 20 20 20
120270 20 10 5 .
120271 20 20 20 20
b. Read the following fields from the raw data file:
Name | Type | Length
IDNum | Character | 6
Qtr1 | Numeric | 8
Qtr2 | Numeric | 8
Qtr3 | Numeric | 8
Qtr4 | Numeric | 8
c. Write a PROC PRINT step to create the report below. The results contain 124 observations.
Partial PROC PRINT Output
Obs IDNum Qtr1 Qtr2 Qtr3 Qtr4
1 120265 . . . 25
2 120267 15 15 15 15
3 120269 20 20 20 20
4 120270 20 10 5 .
5 120271 20 20 20 20
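One possible approach to exercise 1 is sketched below. It assumes the UNIX form of the file name and that the macro variable path is already defined. List input reads space-delimited fields in order and treats a lone period as a missing numeric value.

```sas
data work.qtrdonation;
   /* Set the character length before INPUT so IDNum is stored with length 6. */
   length IDNum $ 6;
   /* Assumption: UNIX-style path; a blank is the default delimiter. */
   infile "&path/donation.dat";
   /* Numeric variables get the default length of 8. */
   input IDNum $ Qtr1 Qtr2 Qtr3 Qtr4;
run;

proc print data=work.qtrdonation;
run;
```

Placing the LENGTH statement first also fixes IDNum as the first variable in the data set, which matches the column order in the report.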
2. Reading a Delimited Raw Data File with Nonstandard Data Values
a. Write a DATA step to create a temporary data set, prices. Read the delimited raw data file named as follows:
Windows | "&path\pricing.dat"
UNIX | "&path/pricing.dat"
z/OS (OS/390) | "&path..rawdata(pricing)"
All data fields are numeric.
Partial Raw Data File
210200100009*09JUN2011*31DEC9999*$15.50*$34.70
210200100017*24JAN2011*31DEC9999*$17.80*22.80
210200200023*04JUL2011*31DEC9999*$8.25*$19.80
210200600067*27OCT2011*31DEC9999*$28.90*47.00
210200600085*28AUG2011*31DEC9999*$17.85*$39.40
b. Generate the report below. The results should contain 16 observations.
Partial PROC PRINT Output
2011 Pricing
                                                    Sales
Obs     ProductID   StartDate     EndDate    Cost   Price
  1  210200100009  06/09/2011  12/31/9999   15.50   34.70
  2  210200100017  01/24/2011  12/31/9999   17.80   22.80
  3  210200200023  07/04/2011  12/31/9999    8.25   19.80
  4  210200600067  10/27/2011  12/31/9999   28.90   47.00
  5  210200600085  08/28/2011  12/31/9999   17.85   39.40
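A sketch of how exercise 2 might be approached follows. The variable name SalesPrice is an assumption inferred from the split column heading in the report, and the informat and format widths are illustrative choices. The colon modifier applies an informat while still reading up to the next delimiter, which is what lets the dates and dollar amounts be read as numeric values.

```sas
data work.prices;
   /* Assumption: UNIX-style path; fields are separated by asterisks. */
   infile "&path/pricing.dat" dlm='*';
   input ProductID
         StartDate  :date9.     /* for example, 09JUN2011 */
         EndDate    :date9.
         Cost       :dollar7.   /* strips the dollar sign if one is present */
         SalesPrice :dollar7.;
   /* Display formats chosen to match the report layout. */
   format StartDate EndDate mmddyy10.
          Cost SalesPrice 8.2;
run;

title '2011 Pricing';
proc print data=work.prices;
run;
title;
```

The DOLLARw.d informat also reads values that have no dollar sign, so the records with a plain sales price need no special handling.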
3. Reading a Delimited File with Missing Values
a. Write a DATA step to create a temporary data set, prices. Use the asterisk-delimited raw data file, which can be named as follows:
Windows | "&path\prices.dat"
UNIX | "&path/prices.dat"
z/OS (OS/390) | "&path..rawdata(prices)"
Partial Raw Data File
210200100009*09JUN2007*31DEC9999*$15.50*$34.70
210200100017*24JAN2007*31DEC9999*$17.80
210200200023*04JUL2007*31DEC9999*$8.25*$19.80
210200600067*27OCT2007*31DEC9999*$28.90
210200600085*28AUG2007*31DEC9999*$17.85*$39.40
There might be missing data at the end of some records. Read the following fields from the raw data file:
Name | Type | Length
ProductID | Numeric | 8
StartDate | Numeric | 8
EndDate | Numeric | 8
UnitCostPrice | Numeric | 8
UnitSalesPrice | Numeric | 8
b. Define labels and formats in the DATA step so that the data set produces the following output when the labels are displayed by the PROC PRINT step. The results should contain 259 observations.
Partial PROC PRINT Output
2007 Prices
                                                           Sales
                     Start of      End of  Cost Price  Price per
Obs    Product ID  Date Range  Date Range    per Unit       Unit
  1  210200100009  06/09/2007  12/31/9999       15.50      34.70
  2  210200100017  01/24/2007  12/31/9999       17.80          .
  3  210200200023  07/04/2007  12/31/9999        8.25      19.80
  4  210200600067  10/27/2007  12/31/9999       28.90          .
  5  210200600085  08/28/2007  12/31/9999       17.85      39.40
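Exercise 3 could be sketched as shown below. The label text is reconstructed from the column headings in the report, and the informat and format widths are assumptions. The MISSOVER option keeps SAS from moving to the next record when a line ends before UnitSalesPrice has been read, so the variable is set to missing instead.

```sas
data work.prices;
   /* Assumption: UNIX-style path. MISSOVER handles records that end early. */
   infile "&path/prices.dat" dlm='*' missover;
   input ProductID
         StartDate      :date9.
         EndDate        :date9.
         UnitCostPrice  :dollar7.
         UnitSalesPrice :dollar7.;
   /* Permanent labels and formats are stored with the data set. */
   label ProductID      = 'Product ID'
         StartDate      = 'Start of Date Range'
         EndDate        = 'End of Date Range'
         UnitCostPrice  = 'Cost Price per Unit'
         UnitSalesPrice = 'Sales Price per Unit';
   format StartDate EndDate mmddyy10.
          UnitCostPrice UnitSalesPrice 8.2;
run;

title '2007 Prices';
/* The LABEL option tells PROC PRINT to show labels instead of variable names. */
proc print data=work.prices label;
run;
title;
```

Without MISSOVER, list input would continue onto the following line to find a value for UnitSalesPrice, which would misread the short records and reduce the observation count.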