Suppose we have a table or SAS data set that contains duplicate rows/ observation.
We just want to calculate / count the number of duplicate rows from that table / SAS data set .
It would be great if we can generate an output column which could show that how many times each row occurs.
Example :
Suppose we have a table / SAS data set that contain two variable / column known as last name (Lname) and first name (Fname) showing as :
Data DSN;
input Fname $ 1-7 Lname $ 8-15 ;
Cards;
Rebecca Smith
Kathy John
Rebecca Smith
Tom Andrew
Reed Samuel
Rebecca Smith
Kathy John
Reed Samuel
Davis Kerry
;
Run;
and the result would be as :
* DSN is the name of the SAS data set.
We just want to calculate / count the number of duplicate rows from that table / SAS data set .
It would be great if we can generate an output column which could show that how many times each row occurs.
Example :
Suppose we have a table / SAS data set that contain two variable / column known as last name (Lname) and first name (Fname) showing as :
Data DSN;
input Fname $ 1-7 Lname $ 8-15 ;
Cards;
Rebecca Smith
Kathy John
Rebecca Smith
Tom Andrew
Reed Samuel
Rebecca Smith
Kathy John
Reed Samuel
Davis Kerry
;
Run;
The output would be like that :
To generate the number of duplicate column from the table :
Proc SQL;
Select *, count (*) as count
From DSN
Group by Fname, Lname
Having count (*) > 1;
Quit;
and the result would be as :
* DSN is the name of the SAS data set.
No comments:
Post a Comment