Syntax: command variables [/option] [/option=a|b] [if condition]
[ ]: optional specification. a|b... indicates alternative choices
Read Data, Save Data etc.
read  read [filename[.{rec|dbf|csv}]] [/close] [/CB] Read a copy of the data file into memory. Options:
Note: The default working folder changes when you use the open file dialog, but not when you provide the full filename.
savedata  savedata filename [field list] [/replace] Save a copy of all variables in memory to a file  to use the data again later Options:


append  append [var1 var2...] [/file=filename[.{rec|dbf|csv}]] Add records (observations) after all observations in the current file. Options:
Variables from previous read which are not in the appended file will be set to missing for the appended records  
merge  merge key_{1} key_{2} ....key_{n} [/file=filename] [/table] [/update/updateall] merge the current data file to another data file based on key variables Default: Values from all variables in memory are left unchanged. Options:
After merge, the variable mergevar indicates the source of information for each observation (record). mergevar is defined with value labels for these values:
1 : Only in memory (original)
2 : Only in external file
3 : In both
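The mergevar coding can be illustrated outside EpiData. The following Python sketch is not EpiData's implementation; it assumes records are plain dictionaries and that records found only in the external file are appended. The function name merge_records is hypothetical.

```python
def merge_records(memory, external, key):
    """Join two lists of record dicts on `key` and tag each row with mergevar.

    Assumption: values already in memory win over external values, mirroring
    the documented default ("values from all variables in memory are left
    unchanged").
    """
    external_by_key = {row[key]: row for row in external}
    seen = set()
    merged = []
    for row in memory:
        k = row[key]
        seen.add(k)
        if k in external_by_key:
            # In both: external values only fill variables missing in memory
            merged.append({**external_by_key[k], **row, "mergevar": 3})
        else:
            merged.append({**row, "mergevar": 1})  # only in memory
    for row in external:
        if row[key] not in seen:
            merged.append({**row, "mergevar": 2})  # only in external file
    return merged


memory = [{"id": 1, "age": 30}, {"id": 2, "age": 40}]
external = [{"id": 2, "age": 99, "sex": "F"}, {"id": 3, "sex": "M"}]
result = merge_records(memory, external, "id")
for row in result:
    print(row)
```

Tabulating mergevar afterwards (e.g. with freq) is a quick way to check how many records matched.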

aggregate agg 
aggregate variables /Options /statistics Aggregate  collapse  combine  data when you wish to change from individual to group level. See explanation in Stattables for all Options. Options:


sort  sort variable1 [variable2] [variable3 ...] sort the current dataset by one or more variables 

Common analysis commands
count count if 
count count if [logical statement] Counts the number of records. With count if (logical statement), only records for which the logical statement is true are counted. The count is saved as the result variable $count; see var result for names and structure.
Note: if must be used with caution with float variables, e.g.
count if 10*dectime >= 31 // is ok if you wish to count records where dectime >= 3.1
Remember that the user is responsible for checking that complex if statements work correctly.
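The caution about float variables reflects how binary floating point stores decimals. The Python sketch below is not EpiData code; it only illustrates why a comparison can give an unexpected answer and shows one tolerance-based alternative. The helper names at_least and strictly_greater are hypothetical.

```python
import math


def at_least(value, threshold, rel_tol=1e-9):
    """True if value >= threshold, treating nearly equal values as equal."""
    return value > threshold or math.isclose(value, threshold, rel_tol=rel_tol)


def strictly_greater(value, threshold, rel_tol=1e-9):
    """True only if value exceeds threshold by more than the tolerance."""
    return value > threshold and not math.isclose(value, threshold, rel_tol=rel_tol)


total = 0.1 + 0.2                      # stored as 0.30000000000000004
print(total == 0.3)                    # False
print(total > 0.3)                     # True, although arithmetically equal
print(strictly_greater(total, 0.3))    # False
print(at_least(total, 0.3))            # True

# Scaling to whole numbers, as in the count example, avoids the problem
print(round(10 * 3.1) >= 31)           # True
```

The same reasoning applies to select and to temporary if: comparing a scaled or rounded value is usually safer than comparing many decimals directly.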

describe des 
describe [variable1] [variable2 ...] [/Options] Descriptive distributional statistics for each numeric field. Without variables, describe will describe all variables. Options:
Percentiles can be imprecise for small number of observations (< 11)


means mea 
means variable1 [/by=variable2] [/T] [/q] basic descriptive statistics for numeric variable1, optionally stratified by variable2 Options:
 
kwallis  kwallis variable1 /by=variable2 Kruskal-Wallis analysis of variance, where variable2 is a categorical factor. Specify design: set table design=line[box][filled][shaded][...] 

regress  regress yvar xvar1 [xvar2 xvar3 ...] linear regression with yvar as the dependent variable Maximum of 5000 records Specify design: set table design=line[box][filled][shaded][...] 

correlate cor 
correlate var1 var2 [var3 ...] Calculate correlation coefficients between all variables. Record limits same as regression. ** if correlation is undefined a floating point error will be generated Specify design: set table design=line[box][filled][shaded][...] 

Tables
freq fre 
freq variable_{1} [variable_{2} variable_{3} ...] [/Options] Frequency distribution for each variable. (Alternative to Freq: Tables ..../F or, for counts only, Stattables, e.g. Stab variable_{1}) Options:
Notice you can obtain a table and graph of CI for several variables with CIPLOT

tables tab 
tables variable_{1} [variable_{2} variable_{3} ...][/Options] The tables (brief: tab) command shows frequency or cross tables for the variables chosen. Default table without Options:
With user specified Options many aspects can be controlled:


stattables stab 
stattables variables /stat="...statistics key words ..." [/by=] [Options] Show a collapsed table with the same summary statistics for all the variables optionally grouped or stratified Options:
Example Stab age weight /by="sex class" /stat="mci min max"


set parameters for Tables 
All tables:
set table design=[stat][system][freq][summary]=line[box][filled][shaded][system]... (Design of tables)
Tables with percentages (Options: /r /c /to /pct):
set table percent format col="P1{}" (Col Percents format, e.g. "P2 %")
set table percent format row="P1()" (Row Percents format, e.g. "P0[]")
set table percent format total="P0[]" (Total Percents format, e.g. "P0[]")
set table percent header="%" (Contents of column header for percents)
set table percent header [row][col][total]="%" (Contents of column header for percents, row/col/total one at a time!)
set table ct or header="OUTCOME:,CASE,NON CASE,N,EXPOSED,NON EXPOSED"
set table ct rr header="OUTCOME:,EXPOSED,NOT EXPOSED,N,N,ILL,RR,AR (%)"
Statistics: Specify confidence interval text:
set TABLE CI FORMAT [HEADER]="()"
set TABLE CI HEADER="(95% CI)"

Life Tables and Kaplan-Meier Plots
lifetable ltab 
lifetable outcome time [/by=group variable] [/Options]
lifetable outcome TimeStart TimeEnd [/by=group variable] [/Options]
The lifetable command creates a standard life table and a Kaplan-Meier curve, depending on Options. The time variable is read as integer. If a float variable is used, a default interval of 1 will be used.
With the Statistics dialog only the life table is shown: lifetable outcome Time /NG. With the graph dialog, lifetable outcome Time /NOLT is used.

set parameters for life tables 
set table design=[stat][system][freq][summary]=line[box][filled][shaded][system]... (Design of tables)
set lifetable header="INTERVAL,N_{AT RISK},DEATHS,LOST,SURVIVAL,STD. ERROR"
set lifetable interval="0,7,15,30,60,90,180,360,540,720,3600,7200,15000"
set TABLE CI FORMAT [HEADER]="()"
set TABLE CI HEADER="(95% CI)"

Basic graphs
bar  bar variable1 [/by=...]
A bar graph shows counts of the given variable with a categorical x axis. Only values included in the variable will be shown. Bars are centered at each tick mark, and value labels (if defined) are shown. Compare with definition of histogram Options:


histogram his 
histogram variable1 [/xmin /xmax /xinc] [/by=...] A histogram shows counts grouped into "bins", but scaled on the X-axis. Each group (bin) starts at the tick mark, but is centered on the tick mark with /by. If the By variable has more than four values, use bar. Compare with the definition at bar. Options:


boxplot  boxplot variable_{1} [variable_{2} ....] [/out] [/by=...] [/R] [/P1090] Box and whisker plot of field. The box shows the interquartile range (25-75) with the median highlighted. Whiskers cover the interval from (p25 - 1.5*interquartile range) to (p75 + 1.5*interquartile range), but only if a data value is present there; otherwise the nearest inside value is used. Options:
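The whisker rule can be worked through on a small data set. This Python sketch is not EpiData's code; it assumes the "inclusive" quartile method of Python's statistics module, which may differ from the percentile method EpiData uses, so exact quartile values can deviate. The function name box_summary is hypothetical.

```python
from statistics import quantiles


def box_summary(values):
    """Quartiles, whisker ends and outliers under the 1.5 * IQR rule."""
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme data values still inside the fences
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


summary = box_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(summary)
```

Here the upper fence is 7.75 + 1.5 * 4.5 = 14.5, so the upper whisker stops at 9, the nearest value inside the fence, and 30 falls outside the whiskers.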


line  line xvar yvar [yvar2] [yvar3 ...] [/by=...] line plot of one or more yvar against xvar; multiple y variables may be plotted against xvar 

scatter sca 
scatter xvar yvar1 [yvar2 ...] [/by=...] scatter plot of yvar against xvar; multiple y variables may be plotted against xvar Options:


dotplot  dotplot variable [/by=group variable] [/Options] Dotplot shows one dot per observation. A small displacement is added to the value, such that all observations can be seen. If groups overlap, extend the width of the graph, e.g. dotplot var /sizex=600. The variable chosen is used as Y-axis. Options:


cdfplot  cdfplot variable [/Options] CDFplot shows a scatter plot of cumulated percentage points (counts) with the variable used as X-axis. Options:


ciplot  ciplot outcome variables [/Options] CIplot shows a table and a plot of proportions of outcome with 95% Confidence intervals in strata defined by individual values of the remaining variables. Options:


epicurve  epicurve outcome time [/by=group] [/Options] Epicurve shows development in a possible epidemic as stacked bars on each day from start to end of data. Example: epicurve case dayonset /by=floor /legend /xa /frame /tab Options:


pie  pie var1 pie chart of frequencies of the values of var1 Percentages can be incorrect in some instances 

erasepng  erasepng [/noconfirm] [/all] Erases graph*.png files in current folder shown on statusbar left side. Confirm each erase. Options:


Common Graph Options  /save="file.png[.wmf][.bmp]" /xlabel="variable" /text="x,y,text,box" /ti="title" /sub="subtitle" /noedit /nolegend


set parameters for Graphs
set option graph="/sizex=value /sizey=value" (Default Options for all graphs; width of graph, e.g. value=600)
set graph savetype=png[wmf][bmp] (Which type of file to save)
set graph clipboard=on[off] (copy graph to clipboard after creation)
set graph footnote=text (footnote for graphs, default: EpiData Analysis Graph)
set graph filename show=off[on] (show name of file below the graph)
set graph filename folder=off[on] (Include folder in graph file name)
set graph font size=value (font size, default 10. Titles are scaled relatively)
set graph colour="1234567890" See above
set graph colour text="2133" See above
set graph symbol="1234567890" See above

Select observations
select if select 
select [logical expression] Work with selected records. Multiple select commands are joined by and. Without parameters, select clears all select commands. The current select will be shown when running analysis commands.
Warning: Be careful when you select on float variables, e.g. v1 > 3.1. To test this use "browse ....". Make sure the program handles missing data as you expect.
Note: String variables are compared excluding trailing blanks. To exclude leading blanks use the trim function, e.g. select trim(User) = "Jorge" or select trim(upper(User)) = "JORGE"
Select cannot be used with UPDATE.
temporary select 
follow another command with if (logical_expression) Processes the data file using only those records for which logical_expression is true. For complex logical expressions, use parentheses; they are optional for simple expressions.
Note also: Options must be placed before if. Always check what happens with missing data and with float variables with many decimals.
Note: String variables are compared right trimmed, that is without trailing blanks, e.g. "Lion " is the same as "Lion", but not the same as " Lion". Also make sure upper and lower case work correctly!
You can query by asking the user, e.g. "count if sex = ?Write value 1 2?"
The current if and select will be shown when running analysis commands.
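The string comparison rule (trailing blanks ignored, leading blanks and case significant) can be mimicked as follows. This is a Python illustration of the documented behaviour, not EpiData code, and the function name epi_equal is hypothetical.

```python
def epi_equal(a, b):
    """Compare two strings the way the note describes: right-trimmed only."""
    return a.rstrip(" ") == b.rstrip(" ")


print(epi_equal("Lion ", "Lion"))    # True: trailing blank ignored
print(epi_equal(" Lion", "Lion"))    # False: leading blank counts
print(epi_equal("lion", "Lion"))     # False: case counts

# Equivalent of select trim(upper(User)) = "JORGE"
user = "  Jorge "
print(epi_equal(user.strip().upper(), "JORGE"))   # True
```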
Statistical Process Control
pareto  pareto groupvar [/Options] SPC: A Pareto diagram shows a bar chart of the variable, where columns are sorted in descending order of frequency. Superimposed is a line showing cumulative percentages. Counts are shown on the left Y-axis and cumulative percentage on the right Y-axis. Pareto charts are particularly suited for decision making when multiple outcomes are possible for a given situation, e.g. to find the aspects responsible for 80 percent of errors among many possible causes. Options:
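The numbers behind a Pareto diagram are a frequency table sorted in descending order plus a running percentage. A minimal Python sketch, assuming the categories are given as a plain list (the function name pareto_table and the example categories are hypothetical):

```python
from collections import Counter


def pareto_table(values):
    """Return (category, count, cumulative percent) rows, most frequent first."""
    counts = Counter(values).most_common()   # sorted by descending frequency
    total = sum(n for _, n in counts)
    rows = []
    running = 0
    for category, n in counts:
        running += n
        rows.append((category, n, 100.0 * running / total))
    return rows


errors = ["dose"] * 5 + ["label"] * 3 + ["route"] * 2
rows = pareto_table(errors)
for row in rows:
    print(row)
```

In this example the first two categories already account for 80 percent of the errors, which is the kind of reading the chart is meant to support.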


runchart  runchart measurement [time] [/Options]
SPC: A runchart shows the median of the measurements. Without a time variable, the sequence of observations is used as x-axis. The chart includes the median and an indication of tests of special cause. Runchart is a process control type graph. Runs and tests adapted in v2 (ref: ) Options: SPC Options, general graph Options and SPC tests for attributable cause 

ichart  ichart measurement [time] [/Options] SPC: Ichart (Individual, also called XMR chart) showing overall mean, control limits and actual measurements. Without a time variable, the sequence of observations is used as x-axis. "Measurement" can be any continuous measurement or count at that time.


pchart  pchart count total [time] [/Options] SPC: A Pchart is created with the proportion of count/total for each time value. Without a time variable, the sequence of observations is used as x-axis. The chart includes overall mean, control limits and proportions. Pchart is a process control type graph. Options: SPC Options, general graph Options and SPC tests for attributable cause. The sampling basis for pcharts is a binomial process. 
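Because the sampling basis is binomial, the textbook P chart limits are the overall proportion plus or minus three binomial standard deviations, which vary with the total at each time point. The Python sketch below uses that textbook formula; whether EpiData Analysis computes its limits in exactly this way is an assumption, and the function name p_chart is hypothetical.

```python
import math


def p_chart(counts, totals, sigmas=3.0):
    """Return the overall proportion and (proportion, lower, upper) per point."""
    p_bar = sum(counts) / sum(totals)        # center line
    rows = []
    for count, n in zip(counts, totals):
        sigma = math.sqrt(p_bar * (1.0 - p_bar) / n)   # binomial standard deviation
        lower = max(0.0, p_bar - sigmas * sigma)       # a proportion cannot be < 0
        upper = min(1.0, p_bar + sigmas * sigma)       # or > 1
        rows.append((count / n, lower, upper))
    return p_bar, rows


p_bar, rows = p_chart(counts=[4, 6, 5], totals=[50, 60, 40])
print(p_bar)
for row in rows:
    print(row)
```

Points with a larger total get narrower limits, which is why the control limits of a P chart are usually stepped rather than straight lines.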

xbar  xbar measurement time/sequence An Xbar chart is created with the mean value of measurement for each subgroup. A subgroup is one time/sequence or date value used for grouping individual measurements of different samples or observations observed at the same time. Subgroup values serve as X-axis. If any X-value has only one observation, a zero value will be shown for Sigma (or Range) at this X, and the point value is shown in the Xbar chart. Xbar chart is a process control type graph. Notice that problems can occur with some date variables.
It is displayed together with either:
a Range chart, which indicates the range between the Max and Min measurements within each subgroup, or
a Sigma chart, which indicates the process variation using a weighted method. (Xbar and S should always be used when subgroup size > 1)
The sigma limits for the S chart are calculated with varying limits depending on n in each subgroup. The Sigma average is an arithmetic overall average. Currently we are looking into the optimal way of calculating this, since there is some disagreement in the literature. The current implementation follows the principle mentioned in Hart & Hart, page 331, but with an arithmetic average of the individual Sigma-bars.


uchart  uchart count volume [time] [/Options] A Uchart is created with the ratio of count/volume. Without a time variable, the sequence of observations is used as x-axis. The chart includes overall mean and control limits for the ratio. Uchart is a process control type graph. The basis for the count is "defectives", e.g. falls in a hospital ward. The total is the observation volume, which typically varies between time points. Options: SPC Options, general graph Options and SPC tests for attributable cause. /Per=x Use x as multiplier (e.g. to show per 1000) 

Cchart  cchart count [time] [/Options] A Cchart is created on the basis of counts for each data point. The observation volume (total) is assumed to be constant for all time points. Without a time variable, the sequence of observations is used as x-axis. The chart includes overall mean and control limits. Cchart is a process control type graph. The basis for the count is "defectives", e.g. falls in a hospital ward. Options: SPC Options, general graph Options and SPC tests for attributable cause. /Per=x Use x as multiplier (e.g. to show per 1000) 

Gchart  gchart variable A Gchart is used for rare outcomes, e.g. infection following surgical procedures. Typically the variable used contains the recorded date of occurrence, and the Gchart will calculate and graph the number of days between the occurrences. Each measurement in the variable is graphed as an x-value. A Gchart is a process control type graph.
If you have data already summarised, use a technique such as the following (assuming the presummarised data are in "datavar"):
gen i sumdata
sumdata = datavar
if _n > 1 then sumdata = sumdata[_n-1] + datavar
gchart sumdata
Notice that for Gcharts it is desirable to have many observations away from 0. For other SPC charts "good" performance is near the bottom.
Options: SPC Options, general graph Options and SPC tests for attributable cause
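The two data layouts mentioned for the Gchart can be related to each other with a short Python sketch. It is an illustration only, under the assumption that the chart plots day differences between consecutive occurrences; the function names days_between and running_total are hypothetical.

```python
from datetime import date
from itertools import accumulate


def days_between(dates):
    """Number of days between consecutive occurrences."""
    ordered = sorted(dates)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


def running_total(values):
    """Cumulative sum, the same idea as sumdata = sumdata[_n-1] + datavar."""
    return list(accumulate(values))


occurrences = [date(2004, 1, 1), date(2004, 1, 11), date(2004, 2, 10)]
print(days_between(occurrences))      # [10, 30]

# Presummarised intervals turned back into a cumulative position variable
print(running_total([3, 5, 2]))       # [3, 8, 10]
```

Taking differences of the running total returns the original intervals, which is why a cumulative variable can stand in for the dates of occurrence.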

Options for process control charts 


Principles and tests 
Many principles exist for SPC graph testing. The principles implemented in EpiData Analysis are documented further in a reference document, which also includes references (see epidata.dk). In short these principles are implemented:
SPC charts use a time variable along the X-axis and the observation as the Y-value. I and P charts have control limits calculated as center line ± x Sigma, where x by default equals 3 but, with the option /tlimit, can depend on the number of observations in each subsection of the graph.
The graphs do not demand a specific number of observations. The assumption for using tests for special variation is that between 20 and 30 observations are included in each portion (break). With fewer observations the chance of a Type II error (overlooking an actual special cause) is larger, and with larger numbers of observations the chance of a Type I error (false positive test) is larger. Therefore users should explore the usage of the /tlimit principle. See Hart & Hart for more information.
Notice that in SPC (Statistical Process Control) charts, control limits are NOT the same as confidence limits. Confidence intervals indicate limits for the mean (phrased: the mean or central line of the SPC chart is this ..., but with the current sample we cannot decide whether it could be as high as ... (upper CI) or as low as ... (lower CI)). Control limits, in contrast, are indications of the type "Within these limits one should expect 99.5 % (the percentage depends on the sigma value) of the observations to be contained, given that the process is in control".
The choice of SPC chart depends on the data at hand, in technical terms on which type of process generated the data. For an overview, open the "graph" menu and choose "spc", which will show a grid assisting you in deciding which graph to choose among the implemented ones. Currently: RunChart, IChart, PChart, XbarS, XbarR, UChart, CChart, GChart, Pareto.


Save & Clear output
cls  cls Clears the output screen. Notice F12 will do the same and can be used during execution if speed slows down.

logopen  logopen [filename[.{html|txt}]] [/close] [/append] Start a log file. Without parameters, the open file dialogue is started.
/close : close an open log file
/append : add to an existing file
/replace : replace an existing log file

logclose  logclose close the current log file 

View data
browse  browse [variable1 [variable2 ...]] Browse values in a spreadsheet for all variables listed. Without parameters, browse all variables. Note that browse is much faster than list. Note that the browse window closes when you move away from "browse", unless you keep the browser open with: Set display databrowser=on. In the same way, "minimise" browse is equal to "close", unless you have: Set display databrowser=on. Remember that the browse window can be opened quickly at any time with the F6 key. Notice the use of right click on the form (sorting, copy to clipboard).

list  list [variable1 [variable2 ...]] [/no] [/v /vl] [if ... ] Show values on the screen for all variables listed, with one record per line and no limit to the width of the display. Without parameters, list all variables. Values are shown, not labels.
/NO : do not show record (observation) numbers
/v /vl : control whether values or labels are shown (or both)
Note that browse is much faster than list. The choice of font might make the list display incorrectly. With select or if, the number shown is the sequence within the current select or "if"; if you use list with a temporary if, the number is not the same as recnumber.

Update  Update [variable1 [variable2 ...]] [/id=variable] Allows grid editing of data  without parameters, works on all variables /id=variable Indicate variable containing unique id Update cannot be combined with SELECT Notice use of right click on update form (sorting, select id, copy to clipboard) 

Generate/change variables
define  define var1 fieldtype [cumulative|global] Create a new variable based on an EpiData fieldtype (###, ___, <Y>, <AAAAAA>, or a valid date format). var1 will initially be missing in all records. Cumulative variables retain their values from one record to the next (not functioning). Global variables retain their values following a close command and are like constants (only one value).

gen  gen var1 = expression | resultvar Create a new numeric variable based on the expression, or equal to a constant from a result variable. Equivalent to define and let, with the variable type implied by the expression. If the result of expression is boolean, var1 will be 0 (FALSE) or 1 (TRUE). Result variables are created by some commands, e.g. means and describe.
If the user specifies a type, that type of variable is generated (examples):
gen s(10) var1 = expression
gen d var1 = expression
gen i var1 = expression
gen f var1 = expression
Compare with values from other records:
gen i age = (age - age[_n+1]) if id = id[_n+1]
let bmidif = (bmi - bmi[_n-1]) if id = id[_n-1] // -1 could be -4, +1 etc.
Notice that integer variables are maximum 4 digits. For larger integers use type float with zero decimals.
Always verify generation of complex variables or logical statements, e.g. gen .... if ... with define ... if ... then ...
generate  generate value creates a new empty dataset with value records. E.g. for simulation or testing.  note the difference to gen command, which creates variables.  
if ... then  if (logical_expression) then [let] ... [else [let] ...] evaluates logical_expression for each record; the else clause is optional  for complex logical expressions, use parentheses; they are optional for simple expressions  some other commands might work, but only let is practical 

let  [let] var1 = expression | resultvar Assign a value to an existing variable; the word let is optional. If the result of expression is boolean, var1 will be 0 (FALSE) or 1 (TRUE). Only means and describe commands create result variables.

recode  recode variable1 to var2 values1 = newval1 [values2 = newval2 ...] Create or change codes for subgroups of records. values1 takes one of three forms: a single value, a series of values separated by commas, or a range of consecutive values like 7-12 or "A"-"D".
E.g. recode v1 to v2 lo-18.499=1 18.50-hi=2 (values up to, but not including, 18.50 get the value 1)
E.g. recode x lo-3.00=1 3.0001-4.0000=4 4.0001-5.49000=5 5.5-hi=7
If to var2 is omitted, the original values of variable1 will be lost.
recode variable1 to var2 by value (value must be an integer > 0): the variable1 values will be recoded to the numerical variable var2, with value labels indicating the limits. E.g. recode age to agegroup by 10 recodes the age variable to 10 year age groups.
Note: define agegroup before recoding: define agegroup ###
Note: EpiData Analysis shows the if ... then and the labelvalue commands doing the recode.
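Range recoding is easy to get subtly wrong when the ranges leave gaps. The Python sketch below mimics the lo-18.499=1 18.50-hi=2 example; it is not EpiData's implementation. It assumes both bounds of a range are inclusive and returns None for values that match no range, so that gaps become visible. The names recode_value, LO and HI are hypothetical.

```python
import math

LO, HI = -math.inf, math.inf   # stand-ins for the "lo" and "hi" keywords


def recode_value(value, rules, unmatched=None):
    """Return the new code of the first (low, high, code) range containing value."""
    for low, high, code in rules:
        if low <= value <= high:
            return code
    return unmatched


rules = [(LO, 18.499, 1), (18.50, HI, 2)]
print(recode_value(17.2, rules))      # 1
print(recode_value(18.5, rules))      # 2
print(recode_value(18.4995, rules))   # None: falls in the gap between ranges
```

A variable stored with more decimals than the range limits can therefore escape both ranges, which is one more reason to verify a recode with a table of old against new values.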

Label data  also called metadata
labeldata  labeldata "text" Assign the descriptive text as a label for the data file. An existing label will be replaced with the new one. To keep the label you must save the data. 

label  label var "text" Assign the descriptive text as a label for the variable. An existing variable label will be replaced with the new one. To keep the variable label you must save the data. 

labelvalue  labelvalue var /x="text with spaces" /y=text2 /z=text3 [/clear] Assign the descriptive text as a value label for the values (x y z). /clear will remove any value not mentioned on the line. For several variables in sequence: labelvalue v1-v17 /1="Yes" /0="No"
Note! If you change value labels for a variable which shares labels with other variables, then the label is changed for all the variables! Shared value labels are defined as part of data entry in EpiData Entry. Note that value labels are automatically created by the command recode.

missingvalue  missingvalue var [var1-varx] /x /y /z [/clear] Assign from 1 to 3 values as a defined missing value. /clear will remove any previous definition. For several variables in sequence: missingvalue v1-v17 /9 

Clean up  stop
close  close stop using a dataset  all unsaved variables and changes to existing fields will be lost  global variables will remain in memory 

quit or exit  quit exit Exits from EpiData Analysis. Closes any open output file. NOTE: To save data in memory before closing use the savedata command. Automatic save of command history is done on exit. Filename defined by "set command history filename", default temp.pgm If you write exit or quit in command prompt no confirmation question will be asked. 

savepgm  savepgm filename[.pgm] saves recent commands in a program file  without a parameter, the save file dialogue is opened Automatic save of command history is done on exit. Filename defined by "set command history filename", default temp.pgm 

clear output window  cls Clear the output screen with results  when output slows down press F12. F12 (=cls) can be used in the middle of other commands running. 

clear command buffer  clh Clear the buffer of previous commands.  It is the same list shown when pressing F7, right click on "F7" window to clear Notice  set commands on history. See below. 

Set parameters
set  set [parameter=value]
 
 
Information
newpage  newpage When printing the output, forces a new page after this line. The command itself is not shown in the output; it is added as "hidden" information.

type  type "Text to display [@$result1] " [/class=x] [/style=" "] [/h1] [/h2] [/h3] [/h4] [/h5]
echo Text to display [@$result1]
Display text on the screen. If Options are not used, the text will be added as a standard paragraph (html: <p>). Options add html specifications: /class: (html: <p class= > text </p>), h1..h5: (html: <hx> text </hx>), /style="valid css style definition", e.g. /style="color:blue; Fontsize=0.6em".
Result or globally defined variables may be displayed by putting @ before the variable name.
Use ' ' to include quoted text in type commands, e.g. for < href=' ....'>

title  title "Text to display [@$result1] " Display text on the screen as (html: <h1> text </h1>) result or globally defined variables may be displayed by putting @ before the variable name 

show  show filename Add the contents of "filename" to the output window. The file must be plain text, e.g. NOT a word processor file, but may contain HTML formatting blocks without header.

View  View filename View an html file in the viewer. The file must be HTML formatted.

HelpView  Helpview filename View an html file in the help file viewer. The file must be HTML formatted.

rename  rename oldname to newname rename the variable from "oldname" to "newname" 

var variables 
variables or var list currently defined variable names, types, formats and labels 

drop var drop 
drop variable1 [variable2 ...] remove the listed variables from memory 

keep var keep 
keep variable1 [variable2 ...] Remove all variables not listed from memory 

result var result 
result list all current result variables and their values  means, describe, tables and other estimation commands create result variables, e.g. $mean1 or $count All result variables are cleared when running a new command, except for $assert and $assert_error, See var temp clear and runtest 

var temp clear  var temp clear Removes ALL result variables and all temporary global variables defined as global. $assert, $assert_error and other internal variables are also cleared.

Version  Version Compare the current version of EpiDataStat.exe with the latest version (requires internet). Note: No information is transferred from your PC. The latest version is read from Http://www.epidata.dk/version/epidatastat.version if you are connected to the internet.

assert  assert if (logical statement) Check if the statement is correct (will not test all observations!). Returns the text "Assert failed" if the statement failed. E.g. assert ((pregnant = "Yes" and age < 40) or (pregnant = "No")) if id = 1
?  ? (statement) Show result of statement, e.g. a calculation or logical check. Does not depend on or check any data. E.g. ? 241/34 ? (23>19) ? "a " + "b " + "c" ? findfile("myfile.pgm")  
Obsolete commands
output  output {describe ...  means ...} Command replaced by new command aggregate  
route  command replaced by SAVEDATA and LOGOPEN commands  
write  command replaced by SAVEDATA and LOGOPEN commands  
Disk commands
cd  cd "directory name" change the working directory

copyfile  copyfile "filespec1" "filespec2" copy file specified by filespec1 to new file specified by filespec2  filespec must identify only one file  do NOT include wild cards (* or ?) To overwrite: ../replace Could overwrite your data !! 

erase  erase filename permanently erase file specified by filename  filename must identify only one file  do NOT include wild cards (* or ?) 

rename file  use copy and erase To rename a file, use copyfile to copy from the existing file. Afterwards you can erase the existing file with erase.

dir  dir [filespec] list files in a directory  filespec may include wild cards (* or ?) Define design by set table design system=line[box][filled][shaded][system]... 

dos ! 
dos text execute any valid MSDOS command and return to EpiData  dos command will open an MSDOS window /open : Keep window open after execution !works only on XP+ Pc's  
Programming aids  not normally used in interactive mode
*  * [any text] Use to document programs, usually as the first character in a line. * is not recognized in interactive mode.  
\  \ Any command can be extended on next line, e.g. to specify many Options for graphs  
;  ; to specify more than one command on a given command line in prompt or pgm  
//  [any command] // [any text] Use to document programs and may appear anywhere on a line.  
imif  IMIF (logical condition) then ..... [else] ..... endif Use to divert course in a pgm file depending on parameters, which could be acquired by "? ?"  
closehelp  closehelp Will close the help window if this is open.  
? ?  [any command] [parameters] ?Prompt to user? [parameters] The text between the two ? will be a prompt to the user to type a response, followed by <Enter>. The response will then be treated as part of the command. For select if age<=?Maximum age to include?, if the user types 50 then EpiData sees select if age<=50. EpiData does no checking of the typed response before making the substitution.
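The substitution described here is purely textual, which the following Python sketch illustrates. It is not how EpiData is implemented; the regular expression, the function name expand_prompts and the ask parameter are all hypothetical and chosen for the illustration.

```python
import re

# Text enclosed in a pair of question marks is treated as a prompt
PROMPT = re.compile(r"\?([^?]*)\?")


def expand_prompts(command, ask=input):
    """Replace every ?prompt? in command with the answer returned by ask."""
    return PROMPT.sub(lambda match: ask(match.group(1) + " "), command)


# A canned answer stands in for the user typing 50 followed by <Enter>
expanded = expand_prompts(
    "select if age<=?Maximum age to include?",
    ask=lambda prompt: "50",
)
print(expanded)   # select if age<=50
```

Since the answer is pasted in unchecked, a mistyped response produces an invalid or unintended command, so pgm files that rely on prompts benefit from a following check such as count or assert.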

run  run [filename[.pgm]] Execute sequence of commands saved in a pgm file  without parameters, the open file dialogue is started 

runtest  runtest [filename|folder name] Run all pgm's /single pgms to verify function.  suited for testing of correct estimation etc. 
In the following, takes indicates the variable type for each parameter and result indicates the type
of the result of the function:
s: string; b: boolean; d: date; i: integer; f: floating point; n: any numeric
parameters may be variables read from fields, new created variables, or any expression that evaluates to the correct type
String functions
function  takes  result  example 
length(str)  s  i  length("Abcde") => 5 
lower(str)  s  s  lower("Abcde") => "abcde" 
pos(instr,findstr)  s  i  pos("Abcde","cd") => 3 pos("Abcde","z") => 0 
substr(str,start,len) copy(str,start,len)  s,i,i  s  substr("Abcde",2,3) => "bcd" copy("Abcde",2,3) => "bcd" 
trim(str)  s  s  trim("Abcde ") => "Abcde" 
upper(str)  s  s  upper("Abcde") => "ABCDE" 
Arithmetic functions (including Random numbers)
function  takes  result  example 
abs(x)  n  n  abs(-12) => 12 
exp(x)  n  f  exp(1) => 2.71828182845905 
frac(x)  f  f  frac(12.34) => 0.34 
int(x) trunc(x)  f  f  int(12.34) => 12.0 trunc(12.34) => 12.0 
integer(x)  f  i  integer(12.34) => 12 
ln(x)  n  f  ln(2.71828182845905) => 1 ln(0) => missing 
log(x)  n  f  log(10) => 1 log(0) => missing 
power(x,a)  n,n  f  power(2,3) => 8 
round(x,digits)  f  f  round(12.44,1) => 12.4 round(12.5,0) => 13 
sqr(x)  n  f  sqr(4) => 16 
sqrt(x)  f  f  sqrt(4) => 2 
ran(x)  n  n  Random integer from 0 to x. gen integer x = ran(100) 
rnd(1)  1  f  Random float from 0 to 1. gen float x=rnd(1) 
rang(mean,sd)  f,f  f  Random based on mean and sd. gen float x=rang($mean1,$sd1) 
Trigonometry functions
function  takes  result  example 
arctan(x)  f  f  arctan(1) => pi/4 
cos(r)  f  f  cos(pi/2) => 6.12303176911189E-17 cos(pi) => -1 
pi    f  pi => 3.14159265358979 
sin(r)  f  f  sin(pi/2) => 1 sin(pi) => 6.12303176911189E-17 
Date functions
function  takes  result  example 
today    d/i  returns today's date; may be assigned to a date variable or an integer 
date(datestr)  s  d  date("31/12/04") => "31/12/2004" datestr must be of form <dd/mm/yy> or <dd/mm/yyyy> 
date(datestr,fmtstr)  s,s  d  date("12/31/04","%mdy") => "31/12/2004" fmtstr must be "%mdy" or "%dmy". Date separator can be anything, e.g. "31-12-2004" is accepted 
day(d)  d  i  day("31/12/2004") => 31 
dayofweek(d)  d  i  dayofweek("31/12/2004") => 5 Monday=1, Sunday=7 
dmy(d,m,y)  i,i,i  d  dmy(31,12,2004) => "31/12/2004" 
month(d)  d  i  month("31/12/2004") => 12 
weeknum(d)  d  i  weeknum("22/02/2001") => 8 
year(d)  d  i  year("31/12/2004") => 2004 
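The date examples in this table can be cross-checked with any calendar library. The Python sketch below does so with the standard datetime module; it assumes ISO week numbering for weeknum, which agrees with the documented example but is not confirmed as EpiData's rule. The helper name parse_dmy is hypothetical.

```python
from datetime import date, datetime, timedelta


def parse_dmy(text):
    """Parse a dd/mm/yyyy string into a date."""
    return datetime.strptime(text, "%d/%m/%Y").date()


d = parse_dmy("31/12/2004")
print(d.day, d.month, d.year)                       # 31 12 2004
print(d.isoweekday())                               # 5 (Monday=1, Sunday=7)
print(parse_dmy("22/02/2001").isocalendar()[1])     # 8
print(parse_dmy("30/11/2004") + timedelta(days=31)) # 2004-12-31
print((d - parse_dmy("30/11/2004")).days)           # 31
```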
Logic functions
function  takes  result  example 
b1 and b2  b,b  b  (1=1) and (2=2) => TRUE (1=1) and (1=2) => FALSE 
b1 or b2  b,b  b  (1=1) or (1=2) => TRUE (1=2) or (2=3) => FALSE 
not(b)  b  b  not(1=1) => FALSE not(1=2) => TRUE 
iif(b,x,y)  b,any,any  any  iif(1=1,2,0) => 2 iif(1=2,sqrt(4),sqr(4)) => 16 
Conversion functions
function  takes  result  example 
boolean(x)  n  b  boolean(x) => TRUE, for any nonzero x boolean(0) => FALSE 
integer(x)  f  i  integer(1.23) => 1 
integer(s)  s  i  integer("12") => 12 
float(i)  i  f  float(1) => 1.000 
string(x)  n  s  string(1.23) => "1.23" 
Test and special functions
function  takes  result  example 
lre(x,y)  n  n  lre($mean1,1.23456789123456) returns number of digits precision of $mean1 
samenum(x,y)  n  b  samenum($mean1,1.23456789123456) returns true or false indicating if x = y 
samenum(x,y,z)  n  b  samenum($mean1,1.23456789123456,107) returns true or false indicating if (x-y) < z 
mv(var)  variable name  0,1,2  Returns 0 if variable has a valid value, 1 if system missing (.), and 2 if a defined missing value 
var[recnumber]  n  data value  Not a function, but a way to get a value for a given record. E.g. gen i x=age[recnumber] = age[recnumber-1] or gen i x=age[_n] = age[_n-1] 
findfile("filename.ext")  s  1 or 0  Checks if the file exists and returns a 1 if so otherwise a 0.
use e.g. imif findfile("myexport.csv") then ....... endif 
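The lre and samenum rows above describe precision checks of the kind used when testing estimates against certified values. The Python sketch below uses the common definition of the log relative error, -log10(|x - c| / |c|); that EpiData's lre follows exactly this definition is an assumption, as is the use of the absolute difference in samenum.

```python
import math


def lre(x, c):
    """Log relative error: roughly the number of correct significant digits of x."""
    if x == c:
        return math.inf                  # identical values: no measurable error
    if c == 0:
        return -math.log10(abs(x))       # fall back to absolute error
    return -math.log10(abs(x - c) / abs(c))


def samenum(x, y, z=None):
    """Exact equality without z, otherwise equality within the tolerance z."""
    if z is None:
        return x == y
    return abs(x - y) < z


print(lre(1.0001, 1.0))                # about 4 correct digits
print(samenum(0.1 + 0.2, 0.3))         # False: differs in the last bit
print(samenum(0.1 + 0.2, 0.3, 1e-7))   # True within the tolerance
```

A tolerance-based comparison of this kind is what makes automated checks (e.g. in pgm files executed by runtest) robust against harmless rounding differences.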
Operators used in EpiData Analysis
operator  syntax  result  meaning  example 
+  n+n  n  addition  1+2 => 3 
+  s+any any+s  s  concatenation  "A"+"B" => "AB" "A"+1 => "A1" 
+  d+n  d  date addition  "30/11/2004"+31 => "31/12/2004" 
-  n-n  n  subtraction  2-1 => 1 
-  d-d  n  date subtraction  "31/12/2004"-"30/11/2004" => 31 
-  d-n  d  date subtraction  "31/12/2004"-31 => "30/11/2004" 
*  n*n  n  multiplication  2*3 => 6 
/  n/n  n  division  5/2 => 2.5 5/0 => missing 
div  n div n  i  integer result of division  5 div 2 => 2 5 div 0 => missing 
^  n^n  f  exponentiation  5^2 => 25 4^0.5 => 2 
( )  group expressions  (5*(2+4))/2 => 15 5*2+4/2 == (5*2)+(4/2) => 12  
<  n<n  b  less than  1<2 => TRUE 
>  n>n  b  greater than  1>2 => FALSE 
<=  n<=n  b  less than or equal  1<=2 => TRUE 2<=2 => TRUE 
>=  n>=n  b  greater than or equal  1>=2 => FALSE 2>=2 => TRUE 
<>  n<>n  b  not equal to  1<>2 => TRUE 1<>1 => FALSE 
@  @var1  value substitution  used in any command, replaces @var1 with the contents of var1 before executing the command  
$  $resultvar  result value  used in let or gen, takes content of $resultvar as a constant 