
How to .. with EpiData Analysis. Document version Dec 2016.

Hints and solutions on how to accomplish various tasks. Suggest additions on the EpiData list or by e-mail to info@epidata.dk
Groups available:

Read & Start - Folders and Files
Control set parameters Set <parameter>

Set parameters control how much information you get in the output window (e.g. set statistics), how commands are executed (e.g. set replacedatafile) and other aspects.

Issue "set" with no words after it to see which parameters are defined.
Folder changes cd "name/folder"

To make a particular folder the working folder, either put a "cd ......." command in the epidatastat.ini file or define the folder in the shortcut (.lnk) from which you start EpiData Analysis.

Basic analysis
Select by date - existing variable Select <logical criteria>

Solution: use the built-in functions. Adapt according to your date format (dmy/mdy/ymd).
E.g. to select all born before 12th of February 1988:

Select birthday < createdate(12,2,1988) // using function createdate(day,month,year)
Select birthday < createdate("02/12/1988", "mdy") // using createdate function with "mdy"
Select birthday < createdate("12/02/1988", "dmy") // using createdate function with "dmy"

Select by date - create date variable
Assume you have age in 1996 in the field "age".
new var date born;
new var int yearborn := 1996 - age;
born := createdate(1, 7, yearborn);
select born < createdate(12, 2, 1988); // select all born before 12th of February 1988
 begin
   ...
 end
Test if a string variable contains legal dates
You have read a text (delimited) file and wish to convert a string variable to a date.
Assumed variable names: the string variable with dates is "txtdate" and the ID number is "id".
read myfile.csv
var
// to see the variables
* Now convert to date:
gen d mydate = date(txtdate)
// Incorrect date values will be "."
gen d mydate1 = date(txtdate,"V")
// Incorrect date values will be "01/01/1980"
gen i mydateok = (mydate <> mydate1)
// 0 if they are the same, 1 if different
tables mydateok
// to see the count
select if mydateok = 1
list id txtdate mydate mydate1
// to see the values
select
list id txtdate mydate mydate1 if mydateok = 1
// alternative to select
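The same check can be sketched outside EpiData Analysis. The following is a minimal Python illustration, not EpiData syntax: it assumes the text dates are in dd/mm/yyyy format, and the names parse_date, rows and invalid are hypothetical. It flags every record whose text cannot be converted to a legal date.

```python
from datetime import datetime


def parse_date(text, fmt="%d/%m/%Y"):
    """Return a date if text is a legal date in the given format, else None."""
    try:
        return datetime.strptime(text.strip(), fmt).date()
    except ValueError:
        # Raised for malformed text and for impossible dates such as 31/02/1990
        return None


# Hypothetical (id, txtdate) pairs standing in for the records read from the file
rows = [(1, "12/02/1988"), (2, "31/02/1990"), (3, "n/a"), (4, " 01/07/1975 ")]

# Keep only the records whose text date could not be converted
invalid = [(rec_id, txt) for rec_id, txt in rows if parse_date(txt) is None]
print(invalid)
```

Listing the id together with the original text, as in the "list" command above, makes it easy to go back to the source and correct the value.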
Check for missing data in several variables
Generate an indicator variable m:
define m _____________________________
if v1 = . then m = trim(m) + "N" else m = trim(m) + "V"
if v2 = . then m = trim(m) + "N" else m = trim(m) + "V"
etc. for all other variables,
freq m
* alternative solution:
m = ""
m = trim(m)+string(mv(sex)) // function mv = 0 for actual value
m = trim(m)+string(mv(age)) // function mv = 1 if variable=.
m = trim(m)+string(mv(km)) // function mv = 2 if variable=defined missing value
tab m
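To show what the pattern variable ends up containing, here is a small Python sketch (an illustration only, not EpiData syntax). It mirrors the coding described for the mv function: 0 for an actual value, 1 for missing, 2 for a defined missing value. The record layout, the function name missing_pattern and the value 999 as a defined missing value for km are assumptions made for the example.

```python
from collections import Counter


def missing_pattern(record, variables, defined_missing=None):
    """Build one code per variable: '0' value, '1' missing, '2' defined missing."""
    defined_missing = defined_missing or {}
    codes = []
    for name in variables:
        value = record.get(name)
        if value is None:
            codes.append("1")  # plain missing, corresponds to "."
        elif value in defined_missing.get(name, ()):
            codes.append("2")  # value declared as a missing-value code
        else:
            codes.append("0")  # actual value
    return "".join(codes)


# Hypothetical records; None stands for a missing value
records = [
    {"sex": 1, "age": 34, "km": 12},
    {"sex": None, "age": 51, "km": 999},
    {"sex": 2, "age": None, "km": 7},
    {"sex": 1, "age": 40, "km": 3},
]

variables = ["sex", "age", "km"]
pattern_counts = Counter(
    missing_pattern(r, variables, {"km": {999}}) for r in records
)
print(pattern_counts)
```

Each position in the pattern corresponds to one variable, so a frequency table of the patterns shows at a glance which combinations of variables are missing together.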
Basic graphs
Split Y variable
Assume you wish to make a scatter plot of haemoglobin values by age where males are shown as circles and females as filled circles.
"Scatter age Hb by sex" would be convenient, but you cannot do that since "by sex" is not allowed.
Solution: assume sex is 1 for males and 2 for females, and Hb is haemoglobin.

gen male = Hb if sex = 1
gen female = Hb if sex = 2
Scatter age male female /ti="My nice graph of variable Y by sex" /sub="With this subtitle"
Add text for Y-axis
Solutions, e.g. to add "y=log(age)" to a scatter plot:
  • Add a textbox at top of Y-axis: scatter x-var y-var /text="15,50,Y=log(age),0"
    Experiment a little with the numbers 15 and 50 (pixels from top left)
  • Use "/edit" and in "edit all" add title to left axis. With this you can change direction of text.
  • Or use a combination of /text and /edit
Same for other graph types.
Output
Change Output Colour and design
Design of output is controlled by a "stylesheet" contained in epiout.css, which you can edit.
Several web sites and books explain more on how to use stylesheets.
To change background for tables to white:
   1. Open the epiout.css file, found where EpiData Analysis is installed, in the editor
   2. Change this line:
    table {background-color:black; border: solid #000 3px; width: 500px; }
   to
    table {background-color:white; border: solid #000 3px; width: 800px; }
   
   and for the cell contents change color to black or another colour, otherwise the contents are invisible. E.g.:
     .firstcol {color: black;font-size: 10pt;font-family: verdana; font-weight: normal; text-align: right; padding-right:10px; }
   3. Save the css file as whitetable.css and use it with set stylesheet="whitetable.css"
Submit examples to info@epidata.dk for dissemination to other users.
See further explanation in document: Formatting output   
Generate/change variables
Boolean Variables define lonely <Y>
This defines a Boolean variable. To assign values use:
lonely = True
lonely = False
Clean up & stop
Dates and date variables
What are dates
In EpiData Analysis dates are stored internally as the number of days since 1899/12/30 but shown in a known date format, such as
  • dmy: "01/01/2000"
  • mdy: "12/25/2000"
  • ymd: "2000/01/25"
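The day-number representation can be illustrated with a short Python sketch (an illustration only, not EpiData syntax). It takes 1899/12/30 as day 0, as stated above; the function names to_day_number and from_day_number are hypothetical.

```python
from datetime import date, timedelta

EPOCH = date(1899, 12, 30)  # day 0, as described in the text


def to_day_number(d):
    """Number of days from the epoch to the given date."""
    return (d - EPOCH).days


def from_day_number(n):
    """Date that lies n days after the epoch."""
    return EPOCH + timedelta(days=n)


print(to_day_number(date(2000, 1, 1)))  # 36526
print(from_day_number(36526))           # 2000-01-01
```

Because a date is just a day count, subtracting one date from another gives the number of days between them, which is what the age calculations further down rely on.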
Below you can find examples of how to handle date variables.
Exclude date from table
Assume you have assigned Jan. 1st 1800 as the missing value in a date variable "born". Exclude that value from the table:
freq born if born <> date("1,1,1800")
select born <> createdate(1,1,1800) freq born
Calculate age on day of visit
Assume you have date of birth in variable DOB and date of visit in VISIT.
new var int agedays := visit - dob // days
new var int ageyear := integer(agedays / 365.25) // years
Calculate age on day of visit - simulation example
To see the example you can use this code:
close
generate 10
gen d dob = dmy(15,1,(1930+5*_n))
gen d visit = dmy(15,_n,2007)
gen i age = trunc((visit-dob)/365.25)
gen i age1 = round((visit-dob)/365.25)
gen age2 = (visit-dob)/365.25
list
Notice that by rounding you get incorrect age, whereas truncating gives the correct one.
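The difference between truncating and rounding can be checked with a small Python sketch (an illustration only, not EpiData syntax). The function name ages is hypothetical, and the calendar-based age is added here as a comparison that is not part of the original example.

```python
import math
from datetime import date


def ages(dob, visit):
    """Return (truncated, rounded, calendar) age in years at the visit date."""
    years = (visit - dob).days / 365.25
    truncated = math.trunc(years)  # drops the fraction
    rounded = round(years)         # goes up once the fraction passes one half
    # Calendar age: subtract one if the birthday has not yet occurred this year
    calendar = visit.year - dob.year - (
        (visit.month, visit.day) < (dob.month, dob.day)
    )
    return truncated, rounded, calendar


# Born 15 January 1980, visit 15 October 2007: 27 years and 9 months old
print(ages(date(1980, 1, 15), date(2007, 10, 15)))  # (27, 28, 27)
```

Rounding reports 28 although the person is still 27. Note also that dividing by 365.25 is an approximation: exactly on a birthday the truncated result can be one year too low, for example for someone born 15 January 2001 and seen on 15 January 2002, so a calendar-based comparison is safer when exact birthdays matter.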
Select by date - create date variable
Assume you have age in 1996 in the field "age".
define born <dd/mm/yyyy>
gen yearborn = 1996-age
born = dmy(1,7,yearborn)
Select born < dmy(12,2,1988)
// select all born before 12th of February 1988
Information
Further analysis
Regression Logistic Outcome Indicator Confounder ......

EpiData Analysis cannot do logistic regression or other advanced statistics. Use other software for these.
Data can be exported with labels to Stata (www.stata.com), SAS, SPSS and as delimited files to other software.
Special graphs
Special data handling
Disk commands