EpiData Analysis: Command and Function Reference Guide          Document version 2.7

Commands only available in EpiData Analysis Classic:
  • Descriptive analysis & further Statistical Tests
  • Life table & Kaplan Meier plot
  • Graphs
  • SPC graphs - Pareto Charts, Ichart etc.

Syntax for all commands: command <variables> [!option] [!option := a|b]
[ ] : optional specification.
{a|b|...} : indicates alternative choices
<...> : indicates user specified name/identifier

If you are in doubt about when to use double quotes "", the rule is:
Use "..." for all external references (e.g. "file names.ext") and for assignments of text values (e.g. set "COMMANDLOG" := "ON").
Double quotes are NOT needed for variables, defined value labels, dataset names, etc.


Read & Save Data
read
read [{"<filename>.{rec|csv|epx|epz|dta}" | <expression>}] [!options ...]
Read a copy of the data file into memory. Note: the name must be enclosed in " " and the extension must be supplied.
If you use a case-sensitive operating system (macOS or Linux), the case of the filename and extension is important.
  • <filename> An optional filename may be given. The file format is detected based on the file extension (rec|csv|epx|epz|dta). If no filename is given the open file dialog is started.
  • <expression> An optional expression (instead of <filename>) that resolves to a string containing the filename. E.g. a concatenation of strings or a global variable. (see examples)
  • !force Will force reading a locked epx/epz file
    Note: Use with caution since the project may be used by someone else
  • !c Will close current project. It is only needed if you have changed the content of the previous project. E.g. data, labels, variables, valuelabels, etc.
  • !d := "<single character>" Will force the delimiter used when reading .txt/.csv files
    This option is only valid when importing delimited files. During import the delimiter will be validated and if the structure of the file does not fit with the selected delimiter, it will be rejected.
    If no delimiter is set for importing delimited files, the program will guess which delimiter is used based on a fixed set of possible delimiters.
  • !q := "<single character>" Will force the character used for recognizing quoted strings when reading .txt/.csv files.
    This option is only valid when importing delimited files. If the quote character is set to empty/nothing (e.g. !quote := "") then the complete content of the file is used in guessing/validating delimiters.
    If no quote character is set, the default " (double quote) is used to identify strings.
  • !vn := {true|false} Will force the program to read first line in a .txt/.csv file as either variable names or as data.
    true forces it to read first line as variable names, with no regards to the actual content.
    false forces it to read first line as data, with no regards to the actual content.
    If the option is not used when reading delimited files, the program will guess based on the content if first line is data or headers.

    Note: In files with all string data and where first line is known to be variable names, the program has a high probability of guessing incorrectly on whether first line is headers or not. In such cases please use this option to ensure correct reading of data.

  • !pw := "<string>" If you use this option with a password protected file (either .epx/.epz or .rec file) reading will occur directly, otherwise you will be prompted for a password.
  • !login := "<string>" If an EpiData project in an EPX/EPZ file has Extended Access enabled, you will be prompted to log in with a username and a password. With this option, the file is opened with the supplied login. A password for that user may be supplied with the option !pw := "<string>"
Examples:
read "bromar.rec"; // complete filename provided
read "bromar" + ".epx"; // expression using two strings in concatenation
new global fn string := "bromar.epx";
read fn; // expression where the variable fn is used, the file "bromar.epx" will be read.
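The delimiter and header options described above can be combined on one command line. A minimal sketch, assuming a hypothetical semicolon-delimited file "patients.csv" whose first line holds variable names, and a hypothetical password-protected project "secure.epx" (both file names and the password are placeholders):

```
// force the delimiter and treat the first line as variable names
read "patients.csv" !d := ";" !vn := true;

// close the current (modified) project and open a protected file without a password prompt
read "secure.epx" !c !pw := "my password";
```

Forcing !vn is mainly useful for files that contain only string data, where the automatic guess is least reliable.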
save
save [{"<filename>.{csv|epx|epz|dta}" | <expression>}] [!replace] [!format := "{stata|epidata|csv}"] [!force]
Save a copy of all variables in memory to a file, so that the data can be used again later
  • <filename> An optional filename may be given. The file format is detected based on the file extension (epx/dta/etc...). If no filename is given the save file dialog is started.
  • <expression> An optional expression (instead of <filename>) that resolves to a string with the filename. This can be e.g. a concatenation of strings, a global variable or something else. See read for examples.
  • !output [:= "html | text"] Instead of saving the project, this option saves the output to a file. If no format is specified (text/html) then the current output format is used.
  • !replace Overwrite existing file
  • !format := "{stata|epidata|csv}" Save the project as a specific type. Note: This will ONLY change the content. The user must ensure the correct extension for the given type.
  • !force Will force overwriting a locked epx/epz file
    Note: Use with caution since the project may be used by someone else
  • !d := "<single character>" Will force the delimiter of values/names used when writing .txt/.csv files
    This option is only valid when saving delimited files. The default value is "," (comma)
  • !q := "<single character>" Will force the character used for writing quoted strings.
    This option is only valid when saving delimited files. The default value is " (double quotation mark)
  • !version := <integer> Specify which Stata version to save the data as. Accepted values range from 4 to 14; the default is 14
  • !vn := {true|false} If true the first line in the .csv/.txt file is the variable names. Default value is "true"
  • !dated := "<single character>" Will force the character used for writing date delimiters. Default value is "/" (slash)
  • !timed := "<single character>" Will force the character used for writing time delimiters. Default value is ":" (colon)
  • !decd := "<single character>" Will force the character used for writing decimal separators. Default value is "." (dot)
  • !nl := "<single character>" Will force the character used for writing newlines. Default value is that of the operating system:
    Linux: #10 MacOS: #13 Windows: #13#10
  • !memonl := "<single character>" Will force the character used for writing the linebreaks from a memo variable. Default value is " " (space)
  • !fixed Forces the writing of the .csv/.txt in a fixed format, where the length of each variable is used to specify the length of each column.
    Using !fixed ignores the use of the option !d
  • !bom This option adds the UTF-8 Byte Order Mark to the .csv/.txt file.
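The export options above can be sketched as follows. This assumes data are already in memory; the output file names are hypothetical, and the option values are only one possible choice:

```
// semicolon-delimited text file with decimal comma and a header line
save "bromar_export.csv" !replace !d := ";" !decd := "," !vn := true;

// Stata file written in an older Stata version format
save "bromar_export.dta" !replace !version := 13;
```

A decimal comma combined with a semicolon delimiter avoids ambiguity between the value separator and the decimal separator.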
Append, Merge & Aggregate
append
append [<var1> <var2>...] [!ds := <dataset>] [!fn := <filename>]

Add observations after all observations in current file

Options:
  • !fn := <filename> Append this file - without !fn the file open dialog is shown.
    All known file types may be used and all options from read associated with reading of data can be used too.
    e.g. !d or !h options for reading csv files.
  • !ds := <dataset> Optional! Specifies which dataset to use from the external file. It is only needed
    if the external file contains multiple datasets.
- only fields with same name as variables in memory will be read. Variables from previous read which are not in the appended file will be set to missing for the appended observations

See variables on using referenced variables for this command
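A minimal sketch of append, assuming two hypothetical files "bromar_new.epx" and "bromar_new.csv" that contain variables with the same names as those in memory:

```
read "bromar.epx";

// add all observations from a second project file
append !fn := "bromar_new.epx";

// add observations from a delimited file, passing a read option along
append !fn := "bromar_new.csv" !d := ";";
```

Variables that exist in memory but not in the appended file are set to missing for the new observations.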

merge
merge [<key1> <key2> ...] [!fn [:= "<filename>"]] [!ds := <dataset>] [!table] [!combine | !update | !replace]

Merge the current data file with another dataset file based on key variable(s). The result is a NEW dataset which is added to the top level of the project.

Options:
  • !fn [:= <filename>] Merge with a dataset in this file. Without !fn it is assumed that the currently used dataset should be merged with a related dataset.
    If no <filename> is provided then a dialog is shown to open the external file.
    All known file types may be used and all options from read associated with reading of data can be used too.
    e.g. !d or !h options for reading csv files.
  • !ds := <dataset> Optional! Specifies which dataset to use. It is only needed if multiple datasets exist (either as related datasets or in the external file).
  • !combine All non-missing values in the external dataset replace all MISSING values for common variables
  • !update All non-missing values in the external dataset replaces ALL values for common variables
  • !replace All values in the external dataset replace all values for common variables
  • !table The external file is used as a lookup table.
    E.g. to add person information to a file with clinical results
  • !r Specify a different name for the resulting dataset. Default naming is a concatenation of the two used datasets
To keep information in variables with an identical name (e.g. mergevar or name) from all files, rename the variables (using edit) before you merge the next file. E.g.
read "pt.epx";
edit variable name !rename := "ptname";
merge hospid !filename := "hospital.epx" !r := MergeDs;
use MergeDs;
edit variable name !rename := "hospname";
etc ....

After merge the variable mergevar indicates the source of information for each observation (record).
mergevar is defined with value labels for these values:

  • 1: In main dataset only
  • 2: In merged dataset only
  • 3: In both datasets
Note: if mergevar already exists in the dataset an error will occur and the merge will be stopped. Drop the mergevar variable first.
Examples:
// Load a project
read "Clinical Example.epx";

// A: Internal merge (1 related dataset)
// Since the currently used dataset only has a single related dataset, the key variable provides enough information to combine the two datasets.
merge;

// B: Internal merge (2+ related datasets)
// The currently used dataset has 2 or more related datasets, so we need to use the option !ds := <id> to specify which dataset we want to merge with.
// Use the command
//   list ds
// to see which dataset ids you have in the project
merge !ds := labdata;

// C: External merge (1+ dataset in external file - e.g. a .csv file)
// In order to merge with an external file the key variables to merge on must be provided
merge patientid !ds := "PatientNames.csv":labdata

// D: External merge (2+ datasets in external file)
// In order to merge with an external file that has 2+ datasets, both the key variables AND the dataset you wish to merge with must be provided!
merge patientid !ds := firstdataset !filename := "PatientNames.epx";
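After a merge, mergevar can be used to check how well the two sources matched. A minimal sketch, reusing the file and variable names from the rename example above (pt.epx, hospital.epx, hospid, MergeDs), which are assumed to exist:

```
read "pt.epx";
merge hospid !fn := "hospital.epx" !r := MergeDs;
use MergeDs;

select (mergevar = 1) do count; // observations found in the main dataset only
select (mergevar = 2) do count; // observations found in the merged dataset only
select (mergevar = 3) do count; // observations found in both datasets
```

If a second merge is planned on the result, drop mergevar first (drop variable mergevar;), since an existing mergevar stops the merge.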

See variables on using referenced variables for this command

aggregate
agg
agg [<var1> <var2>...] [!options]

Aggregate - collapse - combine - data when you wish to change from individual to group level. Options:

See variables on using referenced variables for this command

  • Overall options:
    • !m Include observations with missing data (.)
    • !q Hides all output
    • !nc Do not show counts of observations for each variable
    • !nt Do not show total counts of observations
    • !ds := <dataset> Save the aggregated dataset as its own dataset with the given name
    • !replace Use in combination with !ds to replace an existing dataset
    • !caption := "<string>" Give the dataset a caption (used both in output and with !ds)
    • !h := <global string vector> Give a custom label for the generated variables
      and show these as column descriptors in the table head
    • !u Use in combination with !ds. Use the generated dataset after this command completes. Cannot be used with an active select!
    • !full Expands the resulting dataset with ALL possible value-combinations from <var1> <var2>...
      All entries with no data will contain system missing!
    • See labeling for options on changing between labels/values
    • See formatting for options on formatting percentages
  • Summary statistics:
    All options must have exactly one variable, but each option can be used multiple times with different variables
    • !mv Count of missing values (system and user defined)
    • !mean Calculates mean of the variable
    • !sd Calculates standard deviation
    • !sv Calculates standard variance
    • !min Calculates minimum value
    • !med Calculates median value
    • !max Calculates maximum value
    • !pXX Calculates XX percentile (XX = 1, 5, 10, 25, 50, 75, 90, 95, 99)
    • !sum Calculates the sum of values
    • !des = Min, Median and Max
    • !iqr = p25 and p75
    • !idr = p10 and p90
    • !isr = p5 and p95
    • !mci = mean and 95% CI (low + high)
Example
// define a global vector for column texts:
new global columntxt[5] string;
columntxt[1] := "Group";
columntxt[2] := "N Total";
columntxt[3] := "n (observed)";
columntxt[4] := "Mean";
columntxt[5] := .;
agg sex age !by:=family !header:=columntxt !mci:=economy !mci:=children ;
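The aggregated result can also be stored as a dataset and used directly. A minimal sketch, assuming variables sex and age exist in the current dataset; the dataset name agebysex is hypothetical:

```
// mean and standard deviation of age for each value of sex,
// saved as dataset "agebysex" (replacing it if it exists) and made the active dataset
agg sex !mean := age !sd := age !ds := agebysex !replace !u;
list data;
```

Because !u cannot be combined with an active select, run this outside any select ... do block.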
Using datasets & Sorting
use
use <dataset>

Change the active dataset of a project.

See variables on using referenced variables for this command

Example:
read "Clinical Example.epx";
list dataset;
use datafile_id_2;
sort
sort variable1 [variable2 ...] [!descending]

Sorts the current dataset based on the given variables. Sort respects current select!

  • !descending Sorts the dataset in descending order
  • !d Same as above

See variables on using referenced variables for this command

Create new content
new project
new p
new project

Creates a new empty project, e.g. for simulation or testing.

  • !size:= <integer> Adds an initial dataform with <size> observations in it. If omitted, no dataform is created and must be created manually.
  • !title:= "<text>" Adds a title to the project. If not used a default title is given.
  • !c Closes an open project if modified
  • !pw := <string> Encrypts the data of the project with a single password. All data is encrypted with the AES/Rijndael algorithm, but metadata is not encrypted.
new dataset
new ds
new dataset dataset1 [!options...]

Create a new dataset for the project. Use the options to specify relations between datasets
If the command completes successfully, the newly created dataset is automatically used

  • !parent := "<parentform id>" Used for creating parent-child relations. If omitted the dataform is created
    as a top-level dataform.
  • !label := "<text>" Assign the descriptive text as a caption/label for the dataset.
  • !childobs := <integer> Used only in combination with !parent. Gives the number of allowed child observations in the child dataset
    (0 = no limit)
  • !afterobs := <integer> Used only in combination with !parent. Tells EntryClient what happens after entry of one complete observation
    0 = new observation, 1 = return to parent, 2 = return on max number of observations, 3 = stay on the current observation
  • !statusbar := "<text>" Sets the "content string" of a dataform (see manager for formatting).
  • !size := <integer> Initialize the dataform with <size> empty observations.

See variables on using referenced variables for this command
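A parent-child structure can be sketched with the options above. The dataset and variable names (household, member, hhid) are hypothetical, and it is an assumption that defining the key variable in the parent before creating the child is sufficient:

```
new project !title := "Household survey";

// top-level dataset with a key variable
new dataset household !label := "Households";
new variable hhid integer !key;

// child dataset: unlimited child observations, return to parent after each entry
new dataset member !parent := "household" !label := "Household members" !childobs := 0 !afterobs := 1;

list dataset;
```

Since a newly created dataset is automatically used, new variable commands issued after the last line of the sketch would apply to member.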

new variable
new var
new v
new variable variable1 <type> [:= expression] [!options...]

Create a new variable of a given type and optionally assign the value in expression.
The variable type and the expression type must be compatible or else an error will occur. Variables contain a value for each observation.

  • !label := "<text>" Assign the descriptive text as a label for the variable.
    An existing variable label will be replaced with the new one.
  • !valuelabel := <valuelabel name> Assign an existing valuelabel set to the variable.
    An existing assignment will be replaced but not deleted. To delete the existing valuelabel set see deleting content
  • !length := <integer> Changes the entry length of a variable
  • !decimal := <integer> Changes the decimal entry length for floating point variables. Changing the decimal length for other variable types has no impact
  • !rangelow := <value> Set the lower bound for a range of values. Must be used in combination with !rangehigh
  • !rangehigh:= <value> Set the upper bound for a range of values. Must be used in combination with !rangelow
  • !entrymode := <integer> Changes the entry mode used in EpiData EntryClient
    0 = default, 1 = must enter, 2 = no enter
  • !confirm If used, the variable has the "confirm entry" flag set. Used in EpiData EntryClient
  • !key Adds the variable to be part of the key for the current dataset
  • !cmpX := variable Where "X" is replaced with one of GT, LT, GE, LE, EQ, NE. Adds comparison between the new variable and the assigned variable
  • !u | !memo When creating a string variable it is possible to specify the sub type using one of the above options. !u specifies this is an uppercase string variable. !memo specifies this a memo variable
  • !dmy | !mdy | !ymd When creating a date variable it is possible to specify the sub type using one of the above options. !dmy is the default type if no option is used else the specified sub type is used
  • !auto [:= <integer>] When creating a variable that supports automatic content (date, time or integer) using this option changes the default type to the automatic type. Integers -> AutoIncrement, Dmy -> AutoDMY, etc..
    For time and date variables it is possible to specify a number 0 (default), 1 or 2 which specifies when the variable is updated:
    0 = When observation is created, 1 = When observation is first saved, 2 = Each time the record is saved after being edited
Examples where all observations get the same value:
new variable v1 integer := 1 + 2 * 3 - 4;
new variable v2 float := (2 * pi) * 5;
new variable v3 string := "Hello World!";
new variable v4 time := now();
new variable v5 boolean := (2 > 3);
new variable v6 date := today();

Examples where a value depends on other variables:
new variable v1 integer := v14 + v17; // v1 is equal to the sum of v14 and v17
new variable age integer := integer((today() - dateborn)/365.25); // calculated age
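The metadata options can be given on the same line as the variable definition. A minimal sketch with hypothetical variable names and limits:

```
// float with label, entry length, decimals and an allowed range
new variable bmi float !label := "Body mass index" !length := 5 !decimal := 1 !rangelow := 10 !rangehigh := 70;

// date variable with year-month-day sub type
new variable visit date !ymd !label := "Date of visit";

// integer that is filled automatically (AutoIncrement)
new variable recno integer !auto;

// memo (long text) string variable
new variable remarks string !memo;
```

Note that !rangelow and !rangehigh must be used together.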

See variables on using referenced variables for this command

new global
new g
new global variable1 <type> [:= expression]
new global variable1[<integer expression>] <type> [:= expression]

Create a new global parameter variable with a given type and optionally assign the value given in an expression.
The global variable or parameter has only one value, whereas a standard variable has one value for each observation. The global variable type and expressions type must be compatible otherwise an error will occur.
Global variables can for the most part be used like regular variables, but they cannot be evaluated as a vector.

If the name is post-fixed with square brackets [...], then a global vector is created, where each entry can be individually addressed using e.g. "g[3] := 20". If a value is assigned when creating a new global vector all entries of the vector will have the same value!

Examples:
new global g1 integer := 1 + 2 * 3 - 4;
new global g2 float := (2 * pi) * 5;
new global g3 string := "Hello World!";
new global g4 time := now();
new global g5 boolean := (2 > 3);
new global g6 date := today();
new global g7[10] integer := 10;
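Individual entries of a global vector can be changed after creation. A minimal sketch with a hypothetical vector named limits, using the ? command that appears in the loop examples elsewhere in this guide to show a value:

```
new global limits[3] integer := 0; // all three entries start with the value 0
limits[2] := 20;                   // change only the second entry
? limits[2];                       // show the value of the second entry
```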

See variables on using referenced variables for this command

new valuelabel
new vl
new valuelabel valuelabel1 <type> (<value> , <label>) (...) [!m := <value>]

Create a new value label set with a given type (boolean not supported) and assign at least one (value, label) pair. Each (value, label) pair will be added to the newly created set. The datatype of the value MUST match the defined datatype for the value label set itself.
It is not possible to create an empty valuelabel set.

Note: An empty set will restrict data entry to system missing only!

  • !m := <value> Marks the given value in the value label set as missing. If the value is not part of the (value, label) pairs or the datatype does not match an error will be reported.
    This option can be used multiple times with different values.
Examples:
// "normal" value label
new valuelabel _VL1 integer (1, "Value A") (2, "Value B") (9, "Missing") !m := 9;

// using expression
new valuelabel _VL2 integer (0 + 1, "This " + "is " + "value " + 1) (1 + 1, "This " + "is " + "value " + 2) (2 + 1, "This " + "is " + "value " + 3);
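A value label set only takes effect once it is assigned to a variable. A minimal sketch, with a hypothetical set _SEX and variable sex:

```
// value label set where 9 is marked as missing
new valuelabel _SEX integer (1, "Male") (2, "Female") (9, "Unknown") !m := 9;

// assign the set while creating the variable
new variable sex integer !valuelabel := _SEX !label := "Sex";

list valuelabel; // check the pairs and the missing mark
```

For a variable that already exists, the set can be assigned with edit variable and the !vl option instead.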

See edit valuelabels for more advanced use of variables and loops to create additional valuelabels

See variables on using referenced variables for this command

List content
browse
browse [variable1 [variable2 ...] ] [options]

Show the variables mentioned in a spreadsheet grid
- without parameters, browse all variables
- after browse has started you may Right Click and see how to close or adapt columns
- browse will as a default follow the show formats setting

  • !caption := "<string>" Give the browser window a custom caption!
  • !c Close all currently open browsers
  • !a Arrange all browsers in a cascade
  • !vn Show variable names instead of following the
    set "FORMAT VALUE LABEL" and set "FORMAT VARIABLE LABEL" setting

See variables on using referenced variables for this command

Note that browse is much faster than list

list data
list d
list data [variable1 [variable2 ...]]

Show values on the screen for all variables mentioned, with one observation per line (not limited by the width of the display)
- without variable names: list all variables.

  • See labeling for options on changing between labels/values

Note that browse is much faster than list.

Select: the sequence number is within the current select if you used a select statement, not for the whole dataset.

See variables on using referenced variables for this command

list project
list p
list project

Shows a brief overview of the project

  • !info Also outputs the study information
list dataset
list ds
list dataset

Shows a list of datasets for the project

  • !all Outputs additional information about the listed datasets
list variable
list var
list v
list variable

List currently defined variable names, types, formats and labels

list valuelabel
list vl
list valuelabel

Shows the full list of all valuelabel sets. Each set is listed individually as value/label pair and marked whether a value is considered missing or not.

list results
list res
list r
list results

List all current result variables and their values - means, describe, tables and other estimation commands create result variables, e.g. $mean1 or $count
All result variables are cleared when running a new command.

list global
list g
list global

List currently defined global parameters and their types and value. Global parameters can only contain a single value. But you may give them varying values and use functions to create content values.

Edit definitions of projects, datasets, variables and labels
edit project
edit p
edit project

Edits a project.

  • !title:= "<text>" Adds a title to the project. If not used a default title is given.
edit dataset
edit ds
edit dataset dataset1 [!options...]

Edits an existing dataset in the project.

  • !label := "<text>" Assign the descriptive text as a caption/label for the dataset.
  • !childobs := <integer> Used only if dataset is related to a parent. Gives the number of allowed child observations
    (0 = no limit)
  • !afterobs := <integer> Used only if the dataset is related to a parent. Tells EntryClient what happens after entering the whole observation
    0 = new observation, 1 = return to parent, 2 = return on max observation, 3 = stay on current observation
  • !statusbar := "<text>" Sets the "content string" of a dataform (see manager for formatting).
  • !size := <integer> Changes the size of the dataset to <size> amount of observations.
  • !r := <new dataset name> Changes the name of the dataset. If the name is already in use an error will occur.
  • !noparent Moves the current dataset (and all related datasets) to be a top-level dataset.
    Notice that one must create a new empty child dataset to restore the relation, followed by a merge of the data in the child datasets.

See variables on using referenced variables for this command

edit variable
edit var
edit v
edit variable variable1 [!<options>...]

Edit the metadata of variable1. The options specify which metadata are changed, multiple options may be used at once

  • !label := "<text>" Assign the descriptive text as a label for the variable.
    An existing variable label will be replaced with the new one.
  • !vl := <valuelabel id> Assign an existing valuelabel set to the variable.
    An existing assignment will be replaced but not deleted. To delete the existing valuelabel set see deleting content
  • !novl Removes an existing valuelabel set from the variable.
  • !l := <integer> Changes the entry length of a variable
  • !d := <integer> Changes the decimal entry length for floating point variables. Changing the decimal length for other variable types has no impact
  • !min := <value> Set the lower bound for a range of values. Must be used in combination with !max
  • !max:= <value> Set the upper bound for a range of values. Must be used in combination with !min
  • !norange Removes an existing defined range for the variable
  • !entry := <integer> Changes the entry mode used in EpiData EntryClient
    0 = default, 1 = must enter, 2 = no enter
  • !cmpX := variable Where "X" is replaced with one of GT, LT, GE, LE, EQ, NE. Adds comparison between this variable and the assigned variable.
  • !confirm If used, the variable has the "confirm entry" flag set. Used in EpiData EntryClient.
  • !noconfirm If used, the variable has the "confirm entry" flag unset. Used in EpiData EntryClient.
  • !key Adds the variable to be part of the key for the current dataset
  • !nokey Removes the variable from being part of the key for the current dataset
  • !r := <new variable name> Changes the name of the variable. If the name is already in use an error will occur.

Note: Data values are NOT changed! Even if the new length or decimals is shorter than actual content.

To keep the changes made you must save the data.

See variables on using referenced variables for this command
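Several metadata options can be combined in one edit, followed by a save to keep the changes. A minimal sketch, assuming the project "bromar.epx" contains a variable age; the label and limits are hypothetical:

```
read "bromar.epx";

// new label, allowed range 0-120 and "must enter" entry mode
edit variable age !label := "Age in years" !min := 0 !max := 120 !entry := 1;

// metadata changes are only kept if the project is saved
save "bromar.epx" !replace;
```

Existing data values outside the new range are not changed by this command.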

edit valuelabel
edit vl
edit valuelabel valuelabel1 [(<value> , <text>) ...] [!m := <value>] [!delete := <value>] [!nomissing := <value>]

Edits an existing value label set and optionally assign any number of (value, label) pairs.
If a (value, label) pair already exist, the new label will replace the old label. Otherwise the (value, label) pair will be added to the set. The datatype of the value MUST match the datatype for the value label set itself.

  • !m := <value> Marks the given value in the value label set as missing. If the value is not part of the given (value, label) pairs or an already existing pair, or if the datatype does not match, an error will be reported.
    This option can be used multiple times with different values.
  • !d := <value> Deletes the value label pair with the given value. If no such pair exists an error will be reported.
    This option can be used multiple times with different values.
  • !nom := <value> Removes the mark on the given value label pair that it should be considered missing. If no such pair exists an error will be reported.
    This option can be used multiple times with different values.
  • !r := <new value label name> Changes the name of the value label. If the new name is already in use an error will occur. Note: All variables already using this value label will continue to have the same label. To remove a valuelabel from a variable, see edit variable
Examples:
// Create a new simple valuelabel
new vl _VL1 int (1, "A");

// Simple edit: add another valuelabel
edit vl _VL1 (2, "B");

// Simple edit: replace an existing valuelabel
edit vl _VL1 (2, "Replaced B");

// Create a valuelabel set using a loop.
new valuelabel _VL2 int (1, "This is the first value label"); // create a new valuelabel set
new global i integer; // we also need a loop variable
// now add the pairs (2, "... 2") (3, "... 3") ... (5, "... 5")
for i := 2 to 5 do
  edit valuelabel _VL2 (i, "This is valuelabel no: " + i);

See variables on using referenced variables for this command

edit data
edit d
edit data [!md] [!nomd] [!mv] [!nomv]

Edit the status of observations

  • !md / !nomd Marks / Unmarks the currently selected observations for deletion
  • !mv / !nomv Marks / Unmarks the currently selected observations as verified
Deleting content
drop dataset
drop ds
drop dataset dataset1 [dataset2 ...]

Remove the listed datasets (and related datasets) from memory

See variables on using referenced variables for this command

drop variable
drop var
drop v
drop variable variable1 [variable2 ...]

Remove the listed variables from memory

See variables on using referenced variables for this command

drop global
drop g
drop global [variable1 ...] [!all [:= <type>]]

Remove the listed global variables from memory

  • !all Removes ALL global variables. If this option is used, no variables may be listed.
  • !all := <type> Removes ALL global variables of a certain type. If this option is used, no variables may be listed.
Example:
new global i1 int;
new global i2 int;
new global f1 float;
new global f2 float;
new global s1 string;
drop global i2; // drops i2
drop global !all := "float"; // drops f1 and f2
drop global !all; // drops i1 and s1 (because they are left)

See variables on using referenced variables for this command

drop valuelabel
drop vl
drop valuelabel valuelabel1 [valuelabel2 ...]

Remove the listed value label sets from memory. If the set is assigned to a variable, then this assignment is automatically removed

See variables on using referenced variables for this command

drop data
drop d
drop data [!del]

Drops all data within current select from memory. Save the data first if you wish to keep any changes.

  • !del Drops all observations marked for deletion. Note: Make sure to test whether this creates a problem in a related dataset with the check command
Examples:
read "bromar.epx";
select (id > 1000) do drop data; // Drops all observations where id > 1000, but keeps the rest.

read "bromar.epx";
drop data !del ; // drop all observations "marked for deletion"
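Marking and dropping can be done as two separate steps, which leaves room to inspect the marked observations first. A minimal sketch combining edit data and drop data; the output file name is hypothetical:

```
read "bromar.epx";

// step 1: mark, but do not remove yet
select (id > 1000) do edit data !md;

// step 2: remove everything marked for deletion
drop data !del;

// keep the reduced data under a new name
save "bromar_cleaned.epx";
```

A mark set by mistake can be undone with edit data !nomd before the drop.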
Select observations
select ... do
select <logical expression> do <statement>

Work with selected observations (subgroup of data)

Select requires a command or a begin ... end segment following the expression. After the command(s) are executed, the dataset is no longer reduced.

Note: The commands merge, append, save cannot be used with the select ...do ...; statement.

It is possible to make nested selects which will combine the selects. Example:

select (V1 > 3) do
begin
  count; // This command counts the number of observations where (V1 > 3)
  select (V1 < 4) do
  begin
    count; // This command counts the number of observations where ((V1 > 3) and (V1 < 4)).
  end;
end;
Changing flow of the script
if ... then ... else ...
if <logical_expression> then <statement or command> [else <statement or command>]

This is a "flow" control statement - which will evaluate logical_expression once and execute the command(s) following then when the logical expression is true. Note: This statement is not normally used to change values of variables (see the example below). The else clause is optional and is only executed if evaluation of the logical expression is false.
- for complex logical expressions, use parentheses for clarity.
Note: The if is changing flow of programming NOT values in variables unless the logic in the if ... ; defines this.

Examples:
if (dayofweek(today()) = 1) then
  freq v12 // will show variable v12 on mondays
else
  freq v13; // will show variable v13 on other days of the week.

Note: The flow control of "if ... then ..." was completely different in EpiData Analysis Classic v2.2.

// In Analysis 2.2 to change all system missing values in V1 with the value 99
if (V1 = .) then V1 := 99;
// To do the same in current Analysis use this syntax:
select (V1 = .) do V1 := 99;
for .. to .. do
for <var> := <start value> {to|downto} <end value> do <statements>;

Loops through the integer values from start to end either in ascending order (to) or descending order (downto).
If the values are in the wrong order (e.g. start value > end value, and the order is ascending) then the statements are not executed.
<var>: Must be global variable (single or vector) and integer datatype!
The result of <start value> and <end value> must be an integer value

Examples:
new variable ID integer; // this creates the new ID variable
new global i integer;
for i := 1 to size(@dataset[1]) do
  ID[i] := i; // This gives each observation its sequential number in the dataset. You may sort the data before doing so.

See variables on using referenced variables for this command

for .. in .. do
for <var> in [<value1>, <value2>, ...] do statements;

Loops through the values one by one, assigning each in turn to <var>

The only restriction in this loop is that the datatype of <var> and <value> must be the same!

Examples:
new global I integer;
for I in [1, 3, 5, 7, 9] do
  ? I < 10; // "regular" integer loop.

new global F float;
for F in [1.1, 3.3, 5.5] do
  ? fraction(F); // Looping with floating values

new global S string;
for S in ["Denmark", "Norway", "Sweden", "Finland"] do
  ? S + " is a nordic country";

See variables on using referenced variables for this command

Reordering variables and ?
reorder reorder var1 [var2 ...] [!options]

Reorders the specified variables. This can be used to place specific variables together, for example for use with variable expansion.

  • !before := <var> Places the variables before this variable
  • !after := <var> Places the variables after this variable
  • !last Places the variables at the end of the list

If no options are specified, the default is to place the variables before all other variables in the list. Use F3 or "list variable" to show the current order of variables.

Examples:
read "bromar.epx";                // Load the project
reorder kmgrp agegrp decgrp;      // Moves the variables kmgrp, agegrp and decgrp to the front of the list
reorder age km !before := agegrp  // Moves the variables age and km in front of agegrp

See variables on using referenced variables for this command

? ? <expression>

Show the result of an expression. It is possible to use all types of variables (standard, result or global) in the expression.
Note: if you are using standard variables, it is possible to refer to a specific observation with [x] (x is an integer).
Examples:

? v1[5] + 10   // this will use the 5th observation in the standard variable v1
? g1 - 10      // this will use the content of the global variable g1
? 241/34
? (23 > 19)
? "a " + "b " + "c"
Change variable content
  Assign the value given in an expression. The variable type and expressions must be compatible otherwise an error will occur.

Here all observations get the same value:
v1 := 1 + 2 * 3 - 4;
v2 := (2 * pi) * 5;
v3 := "Hello World!";
v4 := now();
v5 := (2 > 3);
v6 := today();

It is also possible to assign a value to individual entries of a variable:

v1[1] := 3;
v2[2] := 31.41596;
v3[3] := "It works!";
v4[1] := Createtime(12, 34, 56);

If you make the assignment conditional on other variables by using a select statement, the change only occurs for the subgroup defined by the select:

select (v1 = 0) do v17 := 17;
select ((v1 = 0) and (v2 = .)) do v17 := 27;

Functions may be used:

select (age = .) do age := integer((today() - dateborn)/365.25) // calculated age
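The age calculation above can be checked with a small Python sketch of the same arithmetic. It assumes that integer() truncates the fractional part, as in integer(1.23) => 1; the function name age_in_years and the dates are hypothetical.

```python
from datetime import date

def age_in_years(dateborn, on_date):
    # Same arithmetic as integer((today() - dateborn)/365.25):
    # days between the two dates, divided by the average year length,
    # with the fractional part dropped.
    return int((on_date - dateborn).days / 365.25)

# One day before and one day after a 40th birthday.
day_before = age_in_years(date(1980, 6, 15), date(2020, 6, 14))
day_after = age_in_years(date(1980, 6, 15), date(2020, 6, 16))
print(day_before, day_after)
```

Because 365.25 is only an average year length, the result can be off by one close to a birthday.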
Tables & Frequencies
freq
fre
freq variable1 [!<option> ...]

Frequency distribution for variable1

See variables on using referenced variables for this command

  • !m Include observations with missing data (.)
  • !cum Add cumulative percentage
  • !r Add row percentage
  • See labeling for options on changing between labels/values
  • See formatting for options on formatting percentages
tables
tab
tab <column variable> <row variable> [!<option> ...]

The tables (brief: tab) command shows a cross table for the variables chosen.

See variables on using referenced variables for this command

The default sorting for the cross tables is by increasing value, regardless of any value labels for the variables. Use the options below to change the sort order!

  • Data and output:
    • !m Include observations with missing data (.)
    • !w := <variable> Use number of observations in the variable as frequency weight
    • !by := <variable> Stratify the data by this variable.
      If multiple !by options are used, each unique combination of values from the by-variables will have its own sub-table.
    • !q Hide all output! Result variables are still calculated
    • !nc Hide combined/unstratified tables
    • !nb Hide sub/stratified tables
    • !ns Hide summary table
    • See labeling for options on changing between labels/values
    • See formatting for options on formatting percentages
  • Percentages:
    • !pr Show row percents for each table cell and col/row totals
    • !pc Show col percents for each table cell and col/row totals
    • !pt Show total percents for each table cell and col/row totals
  • Sorting:
    • Indicate by !Sxxx where the x indicate:
      R:row C:Column A:Ascending D:Descending T:Total L:label (else numerical)
    • !sa Sort col & row in ascending value order
    • !sd Sort col & row in descending value order
    • !sla Sort col & row in ascending label order
    • !sld Sort col & row in descending label order
    • !sca := <integer> Sort col ascending value order in given index
    • !scd := <integer> Sort col descending value order in given index
    • !sra := <integer> Sort row ascending value order in given index
    • !srd := <integer> Sort row descending value order in given index
    • !scta Sort on col totals ascending order
    • !sctd Sort on col totals descending order
    • !srta Sort on row totals ascending order
    • !srtd Sort on row totals descending order
  • Estimation and testing:
    • !t Chi2
Means and Count
count
count

Counts number of observations. Count may be used with select to count within a subgroup


Result variable: $count. Use list results for details
means means variable1 [!by := variable2] [!t]

Basic descriptive statistics for variable1, optionally stratified by variable2

See variables on using referenced variables for this command

  • !by: Stratify by this variable
  • !t: Test for homogeneity of the mean across strata (same or different mean)
    Including Bartlett's test for homogeneity of variance with more than two strata
    F-test is provided, can be interpreted as T-test with two strata
  • Without !by and with !t : Test that mean=0
    (e.g. as a paired T-test for difference in before and after measure)
  • Estimates are saved as result variables. Use list results for details
  • See labeling for options on changing between labels/values
Note: Confidence Intervals given are based on the T-distribution with N-1 degrees of freedom.
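The "mean=0" test for a paired design can be sketched as follows. This is Python, not EpiData, and it shows the standard one-sample t statistic on the differences; that EpiData computes it in exactly this way is an assumption, and the measurements are hypothetical.

```python
import math
import statistics

# Hypothetical before/after measurements for five subjects.
before = [12.0, 15.0, 11.0, 14.0, 13.0]
after = [10.0, 14.0, 11.0, 11.0, 12.0]
diff = [b - a for b, a in zip(before, after)]

n = len(diff)
mean = statistics.mean(diff)
sd = statistics.stdev(diff)   # sample standard deviation (N-1 in the denominator)
se = sd / math.sqrt(n)        # standard error of the mean
t_stat = mean / se            # test statistic for "mean = 0"
df = n - 1                    # degrees of freedom, as in the note above

print(round(mean, 3), round(t_stat, 3), df)
```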
Consistency and Validity Check of data
check data check data [var1 ...]

Use this command to perform a check of the data in selected variables (if no variables are specified, then ALL variables are checked).
The data is checked for:

  • Data length: Is the number of characters used in data within the length specified for the variable
  • Range/Valuelabel: Is the data within the specified range and/or is it a legal value label
  • Must Enter: Does the variable have data for all observations if it is marked as Must Enter
  • Jumps: If a variable has jumps assigned, do the skipped fields have the correct values
  • Comparison: If a variable is compared to another variable, is the comparison upheld
Examples:
read "bromar.epx"
check data                    // Checks all variables
check data dectime kmgrp age  // Only checks the variables dectime, kmgrp and age

See variables on using referenced variables for this command

check key check key [var1 ...]

Use this command to check whether the data in the specified variables is unique and can represent a key.
If no variables are specified and a key is already present in the current dataset, this key is checked.

Examples:
read "bromar.epx"
check key id // Checks if the variable ID represents a unique key

See variables on using referenced variables for this command
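What "unique and represents a key" means can be sketched in Python. This is an illustration of the idea only, not how check key is implemented; the function duplicate_keys and the sample rows are hypothetical.

```python
from collections import Counter

def duplicate_keys(rows, key_vars):
    """Return the key combinations that occur more than once."""
    counts = Counter(tuple(row[v] for v in key_vars) for row in rows)
    return [key for key, n in counts.items() if n > 1]

rows = [
    {"id": 1, "visit": 1},
    {"id": 1, "visit": 2},
    {"id": 2, "visit": 1},
]

print(duplicate_keys(rows, ["id"]))           # id alone is not unique
print(duplicate_keys(rows, ["id", "visit"]))  # the combination is unique
```

An empty list means the chosen variables identify every observation uniquely.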

check relate check relate

Performs a check on data and whether all observations have a valid parent observation

Examples:
read "related_data.epx"; // Load the project
use child_dataset;       // Change dataset to a related dataset
check relate;            // Perform the check from the child dataset "upwards" to the parent. Must be repeated if you have more levels
check study check study

Performs a check on the study information of whether it is filled or not.

Examples:
read "samplev3.epx"; // Load the project
check study;         // Perform the check
Reports
report users report users

If a project is using Extended Access control, this command will show a condensed report of the log entries and a list of failed login attempts.

If the project is not using Extended Access control, an error will be displayed.
report val report validate [var1 var2 ...] [!options]
Compares two datasets / projects against each other, validating the data content, and outputs a report of the differences found.
The variables var1 .. varn denote the sorting variables. They are required when not comparing whole projects OR when the datasets do not contain any key variables.
  • !fn := "<string>" Opens an external file to compare with.
  • !ds := <dataset id> Specifies a single dataset (internal/external) to compare with.
  • !nos Excludes all string types from comparison
  • !nodt Excludes all date and time types from comparison
  • !noauto Excludes all auto types from comparison
  • !noc All text comparisons are done case-insensitively
  • !nol Only show the condensed report - do not show the list of observations
  • !val All records that pass the comparison will be marked as verified. Whether a record passes depends on the options chosen above!
Examples:
read "bromar.epx"; // Load the project

// Run a report based on the two internal datasets (1st is currently used, 2nd is the one marked with !ds := ...)
report val id !ds := ds2;

// Run a report based on two datasets, one internal and one external (1st is currently used, 2nd is the one marked with !ds := ...)
report val id !fn := "double_entry.epx" !ds := ds1

// If you have two projects there are two ways to compare them. To compare individual datasets, use the options above.
// To make a complete validation of two projects, use the following:
// Run a report based on the two complete projects, one internal and one external
report val !fn := "double_entry.epx"

// The last example is a special case where both the internal and the external project contain only a single dataset each.
// In this case you only need to specify the sorting variable(s) and the external file. The dataset option is not needed
// since the external project only has a single dataset.
report val id !fn := "double_entry.epx"
report cby report cby [var1 var2 ...] [!options]
Compares the combination of variables across several datasets. The variables var1 .. varn are considered a "key" and each unique combination of this key is counted across all the specified datasets.
The output is a report with a condensed table of the found keys and a complete table with the found unique key values and the count of these in each dataset.
  • !fn := <global string vector> A global vector with the names of the files that are included in the report
  • !ds := <global string vector> A global vector with the names of the datasets that are included in the report
  • !nol Only show the condensed report - do not show the list of observations
Examples:
// Set up the input for the report:
new global filenames[5] string;
filenames[1] := "count_file_1.epx";
filenames[2] := "count_file_2.rec";
filenames[3] := "count_file_3.dta";
filenames[4] := "count_file_4.csv";
filenames[5] := .;
// NOTE: the files are not required to be in the same format, only the variable names MUST be the same.
// If an entry in the vector is sys.missing, the command assumes the dataset is in the currently opened project.

// The number of entries in the datasets vector MUST be the same as in the filenames vector
new global datasets[5] string;
datasets[1] := "ds1";
datasets[2] := "ds1";
datasets[3] := "ds1";
datasets[4] := "ds1";
datasets[5] := "ds1";

// Run the report:
report cby id !fn := filenames !ds := datasets
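The counting that report cby describes can be pictured with a short Python sketch. It only illustrates the idea of counting each unique key combination per dataset; the function count_by_key and the two small datasets are hypothetical and say nothing about the actual report layout.

```python
from collections import Counter

def count_by_key(datasets, key_vars):
    """For every key combination, count its occurrences in each dataset."""
    per_dataset = [
        Counter(tuple(row[v] for v in key_vars) for row in rows)
        for rows in datasets
    ]
    all_keys = sorted(set().union(*per_dataset))
    # A Counter returns 0 for keys it has not seen.
    return {key: [c[key] for c in per_dataset] for key in all_keys}

ds_a = [{"id": 1}, {"id": 2}, {"id": 2}]
ds_b = [{"id": 2}, {"id": 3}]
table = count_by_key([ds_a, ds_b], ["id"])
for key, counts in table.items():
    print(key, counts)
```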
Disk commands
cd cd ["<directory path>"]

Change the working directory (folder) to the specified path.
If no path is given a dialog is shown to select the working directory.

ls
dir
ls ["<directory path>"]

List the files in a directory
- directory name may include wild cards (* or ?)
If no path is given a dialog is shown to select the working directory

erase erase "<file name>"

Delete the file from disk.
- Use of wild cards (* or ?) in the file name is restricted by the operating system.
If no path is given, the current working directory is used.

Warning: The file is deleted (if the file exists) without asking for confirmation

Set parameters
set set ["parameter"] := ["value"]
  • Change the value of an EpiData "set" parameter

    - Without parameters, set provides a list of available parameters and their current values.
    - More set parameters will be implemented as development continues. To see those currently defined, issue "set" without parameters.
    - All set ["parameter"] definitions may be added to the file startup.pgm, so that you define your own defaults, for example colour or font selection. Note that the location of this file can depend on the operating system, but for most systems it is where your exe file is placed.
  • Values can be a number, text, ON/OFF or a hexadecimal colour code.
    See the table below.
  • To see the current value: set ["parameter"], e.g.: set "echo"
  • For parameters with ON/OFF or a text value, enclose the value in " ", e.g.: set "echo" := "off"  set "COMMANDLINE FONT COLOUR" := "#FFF000"

Option | Possible values | Default value | Comments or function
BROWSER BG COLOUR | <hex colour code> | "#FFFFFF" | Adjust the colour of the background, e.g. #000000 is black.
BROWSER FONT COLOUR | <hex colour code> | "#000000" | Adjust the colour of the font, e.g. #FFF000 is yellow.
BROWSER FONT NAME | <string> | (depends on the operating system) | Name of the font used in the browser.
BROWSER FONT SIZE | <integer> | 10 | Adjust the size of the font in the browser.
BROWSER FONT STYLE | <fsBold/fsItalic/fsUnderline> | " " | Adjust the style of the text in the browser, e.g. underlined text, bold text.
BROWSER OBS DEFAULT COLOUR | <hex colour code> | "#F0F0F0" | Adjust the colour of the "obs" column for normal/default observations.
BROWSER OBS DELETED COLOUR | <hex colour code> | "#FF0000" | Adjust the colour of the "obs" column for observations marked for deletion.
BROWSER OBS VERIFIED COLOUR | <hex colour code> | "#008080" | Adjust the colour of the "obs" column for verified observations.
BROWSER VALUE LABEL | L/V/LV/VL | V | Default option for output of variable data (value and/or label). See Valuelabels for options. This option applies to "list data" and "browse" only.
BROWSER VARIABLE LABEL | VLA / VLN / VN / VNL | VN | Default option for displaying variable name and/or label. See Variable labels for options. This option applies to "list data" and "browse" only.
COMMANDLINE BG COLOUR | <hex colour code> | "#FFFFFF" | Adjust the colour of the background, e.g. #000000 is black.
COMMANDLINE FONT COLOUR | <hex colour code> | "#000000" | Adjust the colour of the font, e.g. #FFF000 is yellow.
COMMANDLINE FONT NAME | <string> | (depends on the operating system) | Name of the font used in the commandline edit.
COMMANDLINE FONT SIZE | <integer> | 10 | Adjust the size of the font in the commandline edit.
COMMANDLINE FONT STYLE | <fsBold/fsItalic/fsUnderline> | " " | Adjust the style of the font, e.g. bold, underline etc.
COMMANDLOG | ON/OFF | ON | When "ON" a complete list of executed commands is saved to a file in the current active directory.
COMMANDLOGFILE | <string> | commandlog.pgm | Name of the file to save the executed commands.
COMMANDLOGLINES | <integer> | 1000 | The number of lines kept in the commandlog file. If the number of lines is exceeded, lines are dropped from the beginning.
CSV DELIMITER | <any desired delimiter> | , | The separator used between variables when you export to the clipboard from the browser.
ECHO | ON/OFF | ON | When "ON" results are shown, "OFF" = "silent". NOTE: output from errors ignores this setting! Use "SHOW ERROR" := "OFF" if you wish to suppress errors too!
EDITOR FONT NAME | <string> | (depends on the operating system) | Name of the font used in the editor.
EDITOR FONT SIZE | <integer> | 10 | Adjust the size of the font.
EXITSAVE | YES/NO | NO | If "YES" the user is prompted on closing the program for saving if a project is open and has been modified.
INCLUDE DELETED | ON/OFF | OFF | If "ON" then observations marked for deletion are also included in calculations.
OUTPUT BG COLOUR | <hex colour code> | "#FFFFFF" | Adjust the colour of the output background, e.g. #000000 is black.
OUTPUT FONT COLOUR | <hex colour code> | "#000000" | Adjust the colour of the output font, e.g. #FFF000 is yellow.
OUTPUT FONT NAME | <string> | (depends on the operating system) | Name of the font used in the text output.
OUTPUT FONT SIZE | <integer> | 10 | Adjust the size of the font in the text output.
OUTPUT FONT STYLE | <fsBold/fsItalic/fsUnderline> | " " | Adjust the style of the font, e.g. bold, underline etc.
OUTPUT FORMAT | TEXT/HTML | TEXT | Format of the output window (currently only TEXT is implemented).
SHOW COMMAND | ON/OFF | ON | If "ON" then each line that is run (from command line or editor) is added to the output as ".<command...>". "OFF" = no output.
SHOW DEBUG | ON/OFF | ON | If "ON" then lines containing debug information are shown. "OFF" = no output.
SHOW ERROR | ON/OFF | ON | If "ON" then lines containing error information are shown. "OFF" = no output.
SHOW INFO | ON/OFF | ON | If "ON" then lines containing informational output are shown. "OFF" = no output.
SHOW WARNING | ON/OFF | ON | If "ON" then lines containing warning information are shown. "OFF" = no output.
STATISTICS VALUE LABEL | L/V/LV/VL | L | Default option for output of variable data (value and/or label). See Valuelabels for options. This option applies to commands not covered by "BROWSER VALUE LABEL".
STATISTICS VARIABLE LABEL | VLA / VLN / VN / VNL | VLA | Default option for displaying variable name and/or label. See Variable labels for options. This option applies to commands not covered by "BROWSER VARIABLE LABEL".
Labeling and formatting data
Valuelabels
  • !v: Show only the value, (fallback if no label to corresponding value)
  • !l: Show only the label (default)
  • !vl: Show the value then the label
  • !lv: Show the label then the value
Variable Labels
  • !vn: Show only the name, (fallback if no variable label assigned)
  • !vla: Show only the label (default)
  • !vnl: Show the name then the label
  • !vln: Show the label then the name
Formatting
  • !d0 Formats percentages with 0 decimals
  • !d1 Formats percentages with 1 decimal
  • !d2 Formats percentages with 2 decimals
  • !d3 Formats percentages with 3 decimals
  • !d4 Formats percentages with 4 decimals
  • !d5 Formats percentages with 5 decimals
Variable types
integer
int
i
A variable (standard, result or global) that contains an integer value.
float
f
A variable (standard, result or global) that contains a floating point value.

Note: all floating points shown on screen appear in the current national setting (locale), but input (from editor or command line) must always use "." (period) as the decimal separator. The saved data in a given project can be used in different national settings without giving problems or need for conversions.
string
str
s
A variable (standard, result or global) that may contain any string
boolean
bool
b
A variable (standard, result or global) that contains only true or false
time
t
A variable (standard, result or global) that contains a time value.
date
d
A variable (standard, result or global) that contains a date value. All new date variables created will be a DMY type, but this may change in the future.
Using Variables and references
Variable Expansion
var1 - var4
va*
var?

Any command that accepts more than 1 (one) variable can use the scheme for variable expansion.

  • Using "-" (dash) tells the program to use all variables between the two specified variables (both included).
  • The "*" (asterisk) is used as a replacement for 0 to many characters
  • The "?" (question mark) is used as a replacement for exactly 1 character
Please note: if no variables match the pattern, e.g. t? when you have no variables starting with t, you get an error.

It is possible to combine "*" and "?" for more elaborate expressions, but neither can be combined with "-".
Variables for expansion cannot start with "*" or "?", but must start with a "normal" character.
Example:
// Consider the following set of variables (and in that order):
// V1, V2, V3, V4, V10, V11, V100
list data V2 - V10; // V2 - V10 is expanded to the variables V2, V3, V4 and V10
list data V1*;      // V1* is expanded to V1, V10, V11, V100 because * can be "" (empty), "0", "1" and "00"
list data V?;       // V? is expanded to V1, V2, V3 and V4, because ? can be replaced by "1", "2", "3" and "4" but nothing else
list data V1??;     // V1?? is expanded to V100 only!

Referenced variables may also be used in the expansion. These will be evaluated before the expansion!
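The expansion rules can be mimicked in a few lines of Python. This is a sketch of the rules as described in this section, not EpiData's implementation; the function expand is hypothetical, and exact-case matching of variable names is an assumption.

```python
from fnmatch import fnmatchcase

def expand(pattern, variables):
    """Expand 'a - b', '*' and '?' patterns against an ordered variable list."""
    if " - " in pattern:
        # Range: every variable from the first to the last name, both included.
        first, last = (p.strip() for p in pattern.split(" - "))
        i, j = variables.index(first), variables.index(last)
        return variables[i:j + 1]
    # Wild cards: * matches 0 to many characters, ? matches exactly one.
    matches = [v for v in variables if fnmatchcase(v, pattern)]
    if not matches:
        raise ValueError(f"no variables match {pattern!r}")
    return matches

variables = ["V1", "V2", "V3", "V4", "V10", "V11", "V100"]
for pattern in ["V2 - V10", "V1*", "V?", "V1??"]:
    print(pattern, "->", expand(pattern, variables))
```

Note that the range form depends on the order of the variables in the dataset, which is why reorder can change what a range expands to.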

Variable
variable1
Any command that accepts variables can use this variant. This is the "default" way to provide a variable to a command.
Referenced Variable
@{variable1}

With a referenced variable, you essentially use the content of another variable (global, result) as the variable.

Examples:
new global gvar1 string := "sex";
read "bromar.epx";
freq sex;       // Outputs a frequency table for the variable "sex"
freq @{gvar1};  // Does the same as above, because the content of gvar1 is "sex"

This can be combined with e.g. indexing of a variable, using some of the built-in result variables like $dataset and $variable:

new global i integer;
for i := 1 to size($variable) do
begin
  // Here we output the name of all the variables:
  // - not using the @{..} because we want the content of the $variable result var.
  ? $variable[i];
  // Here we do a frequency table of the variable.
  // - using @{..} because "freq" needs a variable and not the content of $variable
  freq @{$variable[i]};
end;

The Variable inside the @{..} may itself be another reference (with or without index), making it possible to combine multiple levels of references.

new global f string := "age";
new global g[3] string := "f"; // All entries have the value "f", but that is fine for this example.
new global h[3] string := "g"; // All entries have the value "g", but that is fine for this example.

// The line below is a valid construction, which evaluates the following way:
// 1: h[1] is evaluated into the string "g"
// 2: "g" is used in @{"g"}, which means - use the content of g as a variable
// 3: g[1] is evaluated into the string "f"
// 4: "f" is used in @{"f"}, which means - use the content of f as a variable
// 5: @{f} is evaluated to the variable AGE
// 6: The command freq is run on the variable AGE.
freq @{ @{h[1]}[1] };

Example of how to loop over a referenced variable:

// You wish to estimate the time for parts of an analysis and have created a number of time stamps:
new global tx t := now(); // where x is 1, 2, 3 etc.

// Now to display these and the differences - assume you had five of these:
new global i i;
for i := 1 to 5 do
begin
  ? i + " time: " + @{"t" + i}; // this works because the expression inside @{..} becomes t1, t2, t3 etc.
end;

// Now also calculate the difference in time between consecutive time stamps:
new global tdif t; // tdif is a time difference
for i := 2 to 5 do
begin
  tdif := (@{"t" + i} - @{"t" + (i-1)}); // e.g. tdif = t2 - t1 when i = 2
  ? i + " difference : " + tdif;
end;
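The multi-level reference can be traced step by step with a Python sketch. A dictionary stands in for the global variables; the helper deref is hypothetical and only mirrors the evaluation order listed in the comments above.

```python
# Globals hold the names of other identifiers (same values as in the example).
globals_ = {"f": "age", "g": ["f", "f", "f"], "h": ["g", "g", "g"]}

def deref(name, index=None):
    """Return the content of a global; index is 1-based, as in h[1]."""
    value = globals_[name]
    return value if index is None else value[index - 1]

step1 = deref("h", 1)     # h[1]            -> "g"
step2 = deref(step1, 1)   # @{"g"}[1] = g[1] -> "f"
target = deref(step2)     # @{"f"}   = f     -> "age"

# Building names in a loop works the same way as @{"t" + i}.
stamp_names = ["t" + str(i) for i in range(1, 6)]

print(step1, step2, target, stamp_names)
```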
Programming aids - not normally used in interactive mode
runtest runtest ["<directory path>"]

Run all pgm's in a given directory (folder) to verify function.
- suited for testing of correct estimation etc.
If no path is given, a dialog is shown to select the working directory.

run run ["<filename.pgm>"]

Execute sequence of commands saved in a pgm file
- without parameters, the open file dialogue is started

Clean up - clear screen and history
close close
Stop using a project
- all unsaved variables and changes to existing variables and labels will be lost
- global variables will remain in memory
cls cls
Clear the output screen
clh clh
Clear the history of commands
reset reset

Complete reset of all parameters of the program!
This is almost equivalent to doing:

close; drop global !all; cls; clh;
But this also clears all result variables!
Functions available in EpiData Analysis
In the following, takes indicates the variable type for each parameter and result indicates the type of the result of the function:
     s: string; b: boolean; d: date; t: time; i: integer; f: floating point; n: any numeric; v: variable
parameters may be variables read from fields, new created variables, or any expression that evaluates to the correct type.
String functions
function takes result example
length(str) s i length("Abcde") => 5
pos(instr, findstr) s, s i pos("Abcde", "cd") => 3
pos("Abcde", "z") => 0
substring(str, start, len) s, i, i s substring("Abcde", 2, 3) => "bcd"
trim(str) s s trim("Abcde ") => "Abcde"
trim(" Abcde") => "Abcde"
lower(str) s s lower("Abcde") => "abcde"
upper(str) s s upper("Abcde") => "ABCDE"
concat(X, s1, s2, ..., sn) s, any, ... s Concat(...) concatenates the values s1 -> sn into a string. If any of the sx parameters returns sys. missing, it will be replaced by the value of X
concat("X", "a", v1[_n]) => "aX" if v1 is missing, else a + the value of v1

NOTE: the concat function only adds the first parameter for system missing (.), for user defined missing the actual value is added for that variable.
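The replacement rule for concat can be sketched in Python. This is an illustration only: None stands in for system missing (.), and user defined missing values are simply treated as ordinary values, as the note above describes.

```python
MISSING = None  # stand-in for system missing (.)

def concat(replacement, *values):
    """Join values as text; system missing is replaced by `replacement`."""
    return "".join(replacement if v is MISSING else str(v) for v in values)

print(concat("X", "a", MISSING))  # the missing entry becomes "X"
print(concat("X", "a", 5))        # a present value is used as it is
```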
Arithmetic functions (including Random numbers)
function takes result example
abs(x) n n abs(-12) => 12
exp(x) n f exp(1) => 2.71828182845905
fraction(x) f f fraction(12.34) => 0.34
ln(x) n f ln(2.71828182845905) => 1
ln(0) => missing
log(x) n f log(10) => 1
log(0) => missing
round(x, digits) n, d, t f round(12.44,1) => 12.4
round(12.5,0) => 13
sqrt(x) n f sqrt(4) => 2
random(x) i i Random integer from 0 to x
sum(n1, n2, ..., nn) n, ... n Sums the values n1 -> nn, but ignores entries that are either sys. missing or user defined missing
Trigonometry functions
function takes result example
tan(x) f f tan(0) => 0
arctan(x) f f arctan(1) => pi/4
cos(r) f f cos(pi/2) => 6.12303176911189E-17
cos(pi) => -1
arccos(r) f f arccos(0) => pi / 2
sin(r) f f sin(pi/2) => 1
sin(pi) => 6.12303176911189E-17
arcsin(r) f f arcsin(0) => 0
Date functions
function takes result example
createdate(datestr) s d createdate("31/12/2016") => 31/12/2016
The form of datestr is automatically detected, but if the string is ambiguous the preference is always DMY over MDY.
If parts of the datestr are omitted, then these parts are filled with today's values.
createdate(datestr, fmt) s, s d createdate("31/12/2016", "dmy") => 31/12/2016
createdate("12/31/2016", "mdy") => 31/12/2016
createdate("2016/12/31", "ymd") => 31/12/2016
createdate(d, m, y) i, i, i d createdate(31, 12, 2016) => 31/12/2016
today() - i returns today's date; may be assigned to a date variable or an integer
day(d) d i day(31/12/2004) => 31
dayofweek(d) d i dayofweek(31/12/2004) => 5
Monday=1, Sunday=7
month(d) d i month(31/12/2004) => 12
week(d) d i week(22/02/2001) => 8
year(d) d i year(31/12/2004) => 2004
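The example values in the date table can be cross-checked with Python's datetime module. Two assumptions are made here: that week() follows ISO week numbering, and that dates are counted in days from 30/12/1899, which is the offset that reproduces integer(31/12/2016) => 42735 from the conversion functions. Neither is stated in this guide.

```python
from datetime import date

d = date(2004, 12, 31)
day_of_week = d.isoweekday()                      # Monday=1 ... Sunday=7
week_number = date(2001, 2, 22).isocalendar()[1]  # ISO week number (assumption)

# Assumption: day count starts at 30/12/1899.
EPOCH = date(1899, 12, 30)
serial = (date(2016, 12, 31) - EPOCH).days

print(d.day, d.month, d.year, day_of_week, week_number, serial)
```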
Time functions
function takes result example
createtime(timestr) s t createtime("12:34:56") => 12:34:56
The form of timestr is automatically detected.
If parts of the timestr are omitted, then these parts are filled with 0 (zero).
createtime(h, m, s) i, i, i t createtime(12, 34, 56) => 12:34:56
now() - f returns the time right now. It can be assigned to a time or float variable
second(t) t i second(12:34:56) => 56
minute(t) t i minute(12:34:56) => 34
hour(t) t i hour(12:34:56) => 12
Logic functions
function takes result example
b1 and b2 b,b b true and true => TRUE
true and false => FALSE
false and true => FALSE
false and false => FALSE
b1 or b2 b,b b true or true => TRUE
true or false => TRUE
false or true => TRUE
false or false => FALSE
b1 xor b2 b,b b true xor true => FALSE
true xor false => TRUE
false xor true => TRUE
false xor false => FALSE
not(b) b b not(true) => FALSE
not(false) => TRUE
Conversion functions
function takes result example
boolean(x) any b boolean(x) => TRUE, for any non-zero x
boolean(0) => FALSE
boolean("true") => TRUE, the text "true" is case-insensitive
boolean(x) => FALSE, for any text other than "true"
integer(x) any i integer(1.23) => 1
integer(31/12/2016) => 42735
integer("2") => 2
integer("a") => .
Any input x that cannot be interpreted as an integer returns missing "."
float(x) any f float(1) => 1.00
float("12,34") => 12.34
Any input x that cannot be interpreted as a float returns missing "."
string(x) n s string(1.23) => "1.23"
Identifier functions
function takes result example
exist(x) v b Returns true/false whether the provided identifier exist
idtype(x) v i Returns the type of the identifier provided. This function can be used on all valid identifiers and the integer value returned has the following associations:

0: Global variable
1: Global vector
2: Regular Variable
3: Dataset
4: Valuelabel
5: Result Variable
6: Result Vector
7: Result Matrix

Note: if using idtype(x) with the eval function "?", the output will be in text.
datatype(x) v i Similarly to idtype(x), this function takes any variable, but in this case returns the type of data stored in the variable. The integer value returned has the following associations:

-1: Variable has no data type - e.g. a dataset variable.
0: Boolean
1: Integer
2: Auto Increment
3: Float
4: DMY Date
5: MDY Date
6: YMD Date
7: DMY Auto Date
8: MDY Auto Date
9: YMD Auto Date
10: Time
11: Auto Time
12: Uppercase String
13: String
14: Memo

Note: if using datatype(x) with the eval function "?", the output will be in text.
size(x) v i Size returns the size/length of an identifier (if applicable). The function works as follows:
Global & Result variables always have size 1
Global vector, Result Vector, Variable & Valuelabel return the length/size/count of elements/data
Result Matrix is not implemented yet - it returns -1;
Dataset returns the total number of observations (even if a select is applied).
label(v) v s Return the descriptive label of the identifier. This is only possible for variables and datasets.
Test and special functions
function takes result example
lre(x,y) n n lre($mean1, 1.23456789123456) returns number of digits precision of $mean1
iif(b, x, y) b, n, n n iif(..., true value, false value) evaluates the boolean expression (b) inline, and based on the result either returns the true value or false value.
iif(2 = 3, "This is true", "This is false") => "This is false"
samevalue(x, y, z) x,y = n, d, t; z = i b samevalue($mean1, 1.23456789123456, 10^-7) returns true or false indicating if |(x-y)| < 10^z
Best used for comparing floating point values. Since the internal binary representation of two seemingly equal numbers may differ, using x = y can fail.
samevalue(x, y) n, d, t b samevalue($mean1, 1.23456789123456) returns true or false indicating if x = y
Essentially the same as calling samevalue(x, y, 15)
cwd() - s Returns the current working directory
deleted([index]) [i] b Returns true/false whether the record is marked for deletion. If no index is supplied the current record number is tested:
select deleted() do edit data !nomd // selects current records marked for deletion and unmark them
verified([index]) [i] b Returns true/false whether the record is marked as verified. If no index is supplied the current record number is tested:
select verified() do edit data !nomd // selects current records marked for deletion and unmark them
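Why samevalue is preferable to x = y for floating point values can be shown with a short Python sketch. The helper same_value is hypothetical and takes the tolerance directly; how EpiData derives the tolerance from the z parameter is not reproduced here.

```python
a = 0.1 + 0.2
b = 0.3

# The two numbers look equal but differ in their binary representation.
exact_equal = (a == b)

def same_value(x, y, tol):
    """True when x and y differ by less than tol."""
    return abs(x - y) < tol

close_enough = same_value(a, b, 1e-7)
print(exact_equal, close_enough)
```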

Operators used in EpiData Analysis
operator syntax result meaning example
+ n+n n addition 1+2 => 3
+ s+any, any+s s concatenation "A"+"B" => "AB"
"A"+1 => "A1"
+ d+n d date addition "30/11/2004"+31 => "31/12/2004"
- n-n n subtraction 2-1 => 1
- d-d n date subtraction "31/12/2004"-"30/11/2004" => 31
- d-n d date subtraction "31/12/2004"-31 => "30/11/2004"
* n*n n multiplication 2*3 => 6
/ n/n n division 5/2 => 2.5
5/0 => missing
div n div n i integer result of division 5 div 2 => 2
5 div 0 => missing
^ n^n f exponentiation 5^2 => 25
4^0.5 => 2
( ) group expressions (5*(2+4))/2 => 15
5*2+4/2 == (5*2)+(4/2) => 12
= n = n b equal 1 = 2 => FALSE
< n < n b less than 1<2 => TRUE
> n > n b greater than 1>2 => FALSE
<= n <= n b less than or equal 1<=2 => TRUE
2<=2 => TRUE
>= n >= n b greater than or equal 1>=2 => FALSE
2>=2 => TRUE
<> n <> n b not equal to 1<>2 => TRUE
1<>1 => FALSE
$ $resultvar result value ? $count => 4027
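The difference between / and div, and the grouping examples in the table, can be replayed in Python. This is a sketch under assumptions: None stands in for missing, and div is assumed to drop the fractional part of the quotient, which matches the table for positive numbers.

```python
MISSING = None  # stand-in for missing

def divide(a, b):
    """'/' : ordinary division, missing when dividing by zero."""
    return MISSING if b == 0 else a / b

def int_divide(a, b):
    """'div' : integer part of the quotient, missing when dividing by zero."""
    return MISSING if b == 0 else int(a / b)

print(divide(5, 2), divide(5, 0))
print(int_divide(5, 2), int_divide(5, 0))

# Multiplication and division bind tighter than addition,
# so parentheses are needed to group the sum first.
grouped = (5 * (2 + 4)) / 2
ungrouped = 5 * 2 + 4 / 2
print(grouped, ungrouped)
```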