Notes on the BASIC programs for Windows

These programs don't do much, but some of them provide tests which aren't readily possible in SPSS or Stata, or are a slight hassle to do in a SEM program. Others do calculations which would be onerous to do manually. These notes provide some information about the programs and references so you can learn more about what they do.

The programs are stored in Dropbox. In order to avoid problems with anti-virus programs, which are understandably wary of .exe files, the filenames end with .dum rather than .exe. When you left-click on the link for the program you want to download (the links are the headings of the descriptions given below), Dropbox will tell you that .dum files can't be previewed. It will then allow you to download the file.

In order for the programs to run on your computer, they need to be renamed so the filenames end in .exe. Probably the best time to do this is during the download: when Windows shows the save dialog, use the opportunity to change .dum to .exe before clicking on Save.

If this seems a huge hassle, you're probably right. If you're still interested in using a program, email me at and I'll send you a link so you can download the .exe file directly.

When you finally double-click on the .exe, Windows may (very reasonably) warn you against running these programs. You can click on 'Learn more' and then 'Run anyway'. You should only have to do that once.



All the programs (with one exception) are text-only console programs, and run in a command-line interface, in this case the command prompt (DOS prompt) in Windows 10. There's no point-and-click, just entry from the keyboard and pressing Enter to continue. In some cases the data can be read in from a text file which follows the format described. Some programs allow results to be saved in a text file. Notepad++, a free text editor, is highly recommended for creating and reading text files.

When telling the programs what you want to do, you typically have to enter a number, often '1' for 'yes', or simply press Enter (which enters 0) to accept a default. Instructions are provided by the programs.

The programs don't offer any opportunity to correct data-entry errors. This is partly because such facilities require tedious programming and also because none of the programs require a great deal of data entry. So, if you make a mistake, just press Ctrl-Pause/Break to crash out of the program and start again.

In some programs, if you specify the name of a file from which you want to read data and get the filename wrong, the program exits without warning: no 'sorry, try again'. That is, unless the programmer has built in some code to catch the error and offer a second chance. This has been done in some of the programs in this collection.

The sections of programs which calculate p-values using the normal and chi-squared distributions were taken from Cooke, Craven & Clarke (1982). A bonza book if you're writing these sorts of programs.

Cooke, D., Craven, A. H. & Clarke, G.M. (1982). Basic Statistical Computing. London: Edward Arnold.

Calculate Cohen's (unweighted) kappa

This can be applied to ratings of N objects by a single pair of judges, or to ratings by N pairs of judges who are responding on the same scale to a given issue. The data are entered in the form of a contingency table, each entry being the number of pairs of judgements which fall into cell (i,j) of a K by K table, where K is the number of categories in the rating or judgement scale. Confidence limits and tests that kappa is zero are based on variances calculated according to the methods given by Fleiss, Cohen & Everitt (1969). There are free utilities which carry out this task and will also calculate weighted kappa. Some of my favourites may not work under Windows 10.

Fleiss, J.L., Cohen, J., & Everitt, B.S. (1969). Large sample standard errors of kappa and weighted kappa. Psychological Bulletin, 72, 323-327.
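The program itself is a BASIC executable, but the core calculation is easy to sketch. Here is a minimal Python version of unweighted kappa from a K by K table (the function name is mine; the Fleiss, Cohen & Everitt variance and confidence-limit calculations are not reproduced):

```python
def cohens_kappa(table):
    """Unweighted Cohen's kappa from a K x K contingency table;
    table[i][j] is the number of pairs of judgements in cell (i, j)."""
    n = sum(sum(row) for row in table)
    k = len(table)
    p_obs = sum(table[i][i] for i in range(k)) / n        # observed agreement
    row_p = [sum(row) / n for row in table]
    col_p = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    p_exp = sum(row_p[i] * col_p[i] for i in range(k))    # chance agreement
    return (p_obs - p_exp) / (1 - p_exp)
```

For example, the table [[20, 5], [10, 15]] gives observed agreement .70, chance agreement .50, and kappa .40.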

Calculate Cohen's kappa for more than two judges using the method of Fleiss (1971)

You'd use this in the situation where a fixed number of judges rate each of a number of targets but they're not necessarily the same judges for each target. The data can be keyed in or read from a file laid out in the way described by the program.

Fleiss (1971) and Siegel & Castellan (1988) describe the basis of the test and give examples.

Fleiss, J.L. (1971). Measuring nominal scale agreement among many raters. Psychological Bulletin, 76, 378-382.
Siegel, S., & Castellan, N.J. (1988). Nonparametric statistics for the behavioral sciences (Second edition). McGraw-Hill.
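As a sketch of the calculation (Python rather than BASIC; the function name and data layout are mine): counts[i][j] is the number of raters who assigned target i to category j, with the same number of raters for every target.

```python
def fleiss_kappa(counts):
    """Fleiss' (1971) kappa for a fixed number of raters per target.
    counts[i][j] = number of raters assigning target i to category j."""
    N = len(counts)                  # targets
    n = sum(counts[0])               # raters per target
    k = len(counts[0])               # categories
    p_j = [sum(row[j] for row in counts) / (N * n) for j in range(k)]
    P_i = [(sum(c * c for c in row) - n) / (n * (n - 1)) for row in counts]
    P_bar = sum(P_i) / N             # mean per-target agreement
    P_e = sum(p * p for p in p_j)    # chance agreement
    return (P_bar - P_e) / (1 - P_e)
```

Complete agreement on every target gives kappa of 1; maximal disagreement pushes it below zero.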


Calculate cumulative odds ratios

This is just a calculation aid. When there are more than two categories in a frequency table, the cells can be partitioned in various ways when calculating odds ratios. In ordinal logistic regression, the proportional odds model assumes that the ORs are more or less the same no matter how the cells are partitioned. With this program you can calculate the odds ratios for various partitions with a categorical IV. The calculations are explained in the program itself. You can save a file of results.
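For the common case of a binary IV and an ordinal DV, the cumulative partition looks like this in Python (a sketch with my own function name; the program's own partitioning options are described within the program):

```python
def cumulative_odds_ratios(row0, row1):
    """Cumulative odds ratios for a 2 x K table (two groups, ordinal
    outcome in K categories). At each cut-point j the table is collapsed
    to (category <= j) versus (category > j) and an ordinary odds ratio
    is computed. Under proportional odds these are roughly constant."""
    K, n0, n1 = len(row0), sum(row0), sum(row1)
    ors = []
    for j in range(K - 1):
        a = sum(row1[:j + 1]); b = n1 - a    # group 1: at/below cut, above cut
        c = sum(row0[:j + 1]); d = n0 - c    # group 0: at/below cut, above cut
        ors.append((a * d) / (b * c))
    return ors
```

For example, cumulative_odds_ratios([10, 10, 20], [20, 10, 10]) returns [3.0, 3.0]: identical ORs at both cuts, exactly what the proportional odds model assumes.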

Calculate intraclass correlation, deff from means, SDs and Ns of clusters

This is more of a self-demonstration program than a tool for analysis, but it could come in handy. You can easily fiddle around with the differences between the means of the clusters (between-group variation), the SDs of the observations in the clusters (within-group variation), and the numbers of cases in the clusters, and see the effects on the intraclass correlation and the DEFF, which are indices of the lack of independence of observations.

The greater the intraclass correlation, the greater the departure from the assumption of the independence of observations which underlies most statistical inference. If a sample is clustered, standard errors calculated on the assumption of independence will be smaller than they should be, and results of statistical tests could be misleading. The DEFF, or design effect, gives the researcher the extent of the bad news about the effects of lack of independence. The DEFF shows how much larger a clustered sample has to be to have the same accuracy (as reflected by confidence intervals) as a simple random (unclustered) sample. DEFF = 1 + icc(k-1), so it becomes larger with larger intraclass correlations (icc) and larger clusters (k). For example, with an icc of .10 and 10 people per cluster, DEFF = 1.9, so the clustered sample would have to be almost twice the size of a simple random sample to achieve the same accuracy. This can be neatly demonstrated with this program. Results from various analyses can be saved in a file.
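The formula is simple enough to check in a couple of lines of Python (function names mine):

```python
def deff(icc, k):
    """Design effect: how much larger a clustered sample must be than a
    simple random sample of the same accuracy (icc = intraclass
    correlation, k = cluster size)."""
    return 1 + icc * (k - 1)

def effective_n(n, icc, k):
    """Effective sample size of n clustered observations."""
    return n / deff(icc, k)
```

deff(0.10, 10) gives 1.9, the example above; 1000 such clustered observations are worth only about 526 independent ones.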

Donner, A., & Koval, J. (1980). The estimation of intraclass correlation in the analysis of family data. Biometrics, 36, 19-25.
Donner, A., & Koval, J. (1980). The large sample variance of an intraclass correlation. Biometrika, 67, 719-722.

Calculating the sample size needed to estimate a proportion with given accuracy

Enter the approximate value of the proportion you hope to estimate with your sample, together with the accuracy you'd like (in terms of the confidence interval), and the program gives you a sample size. You can ask it to take lack of independence (expressed as the DEFF, see above) into account, and also apply a finite population correction.

It's interesting to see how the required sample varies under various conditions. For example, for a proportion of .5 and a 95% confidence interval of .45-.55, a sample size of 384 is needed; with a DEFF of 2, the required sample is 768 (as noted above). If, in addition, the population you're sampling from numbers only 5000, the sample size drops to 666. If you wanted to be really accurate, and specified a CI of .49-.51, the sample size is 9604. Whoops, perhaps that's a bit ambitious. I could live with +/- .025 perhaps? Yes, a more manageable 1536 or so.

The results can be saved in a file, and the program remembers the last value entered for each specification, which facilitates experimentation.
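The underlying Wald calculation can be sketched as follows (Python, my own function name; I've assumed rounding to the nearest whole number, which reproduces the figures quoted above, though whether the program rounds or rounds up is an assumption on my part):

```python
def sample_size_for_proportion(p, half_ci, z=1.96, deff=1.0, pop_size=None):
    """Wald sample size to estimate proportion p to within +/- half_ci,
    inflated by a design effect and optionally reduced by the finite
    population correction."""
    n = deff * z * z * p * (1 - p) / (half_ci * half_ci)
    if pop_size is not None:
        n = n / (1 + n / pop_size)       # finite population correction
    return round(n)
```

sample_size_for_proportion(0.5, 0.05) gives 384; adding deff=2 gives 768; adding pop_size=5000 as well brings it down to 666, as in the example above.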


A number of people (e.g., Brown, Cai & DasGupta, 2001, 2002; Goncalves, Oliveira, Pascoal & Pires, 2012) have questioned the accuracy (the coverage) of the confidence intervals calculated by the so-called Wald method that's the basis for this program. It turns out that the problems with the Wald method start cutting in when the proportion to be estimated is less than .05 (or greater than .95), when the sample sizes suggested are way too small. There are also wobbly estimates with some larger proportions and small samples.

While doubting that many users of this program will want to estimate proportions as small (or as large) as the values above, I've included a better estimator along with the original one for simple random samples, thanks to a closed-form first-order approximation of the Wilson (1927) method provided by Goncalves, de Oliveira, Pascoal & Pires (2012). The sample size based on their formula pops up beside the original one in the output, and can serve as a warning when it disagrees with the Wald result.

Below is some output from the program showing the Wald sample size (srs_size) and the Wilson (srs_Wilson) for progressively smaller proportions. A hint of trouble to come is evident with a proportion of .05 (.95) and things go seriously ahoo after that until .001, where the Wald estimate of the sample size is 6 versus the (more accurate) Wilson estimate of 76. The value of .025 for the half confidence interval was used in this example to follow Goncalves, Oliveira, Pascoal & Pires (2012), who used a full CI width of .05.

prop 1/2ci z srs_size srs_Wilson

0.500 0.025 1.960 1537 1533
0.400 0.025 1.960 1475 1471
0.300 0.025 1.960 1291 1288
0.200 0.025 1.960 983 982
0.100 0.025 1.960 553 556
0.050 0.025 1.960 292 304
0.025 0.025 1.960 150 176
0.010 0.025 1.960 61 108
0.005 0.025 1.960 31 89
0.002 0.025 1.960 15 81
0.001 0.025 1.960 6 76
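I don't reproduce Goncalves et al.'s closed-form approximation here, but the srs_Wilson column can be cross-checked numerically: find the smallest n at which the Wilson interval's half-width (evaluated at the anticipated proportion) is no greater than the target. A Python sketch (function names mine):

```python
import math

def wilson_half_width(p, n, z=1.96):
    """Half-width of the Wilson score interval when the sample proportion
    is p and the sample size is n."""
    return z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / (1 + z * z / n)

def wilson_sample_size(p, half_ci, z=1.96):
    """Smallest n whose Wilson half-width does not exceed half_ci."""
    n = 1
    while wilson_half_width(p, n, z) > half_ci:
        n += 1
    return n
```

wilson_sample_size(0.001, 0.025) returns 76 and wilson_sample_size(0.5, 0.025) returns 1533, matching the table above.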

Brown, L.D., Cai, T., & DasGupta, A. (2001). Interval estimation for a binomial proportion. Statistical Science, 16, 101-133.
Brown, L.D., Cai, T., & DasGupta, A. (2002). Confidence intervals for a binomial proportion and asymptotic expansions. Annals of Statistics, 30, 160-201
Goncalves, L., de Oliveira, M.R., Pascoal, C., & Pires, A. (2012). Sample size for estimating a binomial proportion: comparison of different methods. Journal of Applied Statistics, 39, 2453-2473.
Wilson, E.B. (1927). Probable inference, the law of succession and statistical inference. Journal of the American Statistical Association, 22, 209-212

Chi-squared test of independence with partitioning and residuals

This is based on a program given by Siegel & Castellan (1988). It's worth having because it does orthogonal partitioning of the table (and of the chi-squared statistic). The partitioning is based on the order of the categories as entered, so the order of entry could be varied to obtain partitions which are appropriate for the hypotheses under test. I've added the calculation of adjusted standardised residuals (the same as those provided by SPSS), which can also help when following up a significant chi-squared. It's worth noting that the expected frequencies used for testing the sub-tables are based on the whole table, not the sub-tables.

The table frequencies are entered from the keyboard. The results can be saved in a file, which is helpful if you've produced a large number of sub-tables.

Siegel, S., & Castellan, N.J. (1988). Nonparametric statistics for the behavioral sciences (Second edition). McGraw-Hill.
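The adjusted standardised residuals mentioned above are straightforward to compute; here is a Python sketch (my function name) of the usual formula, (O - E) / sqrt(E(1 - row proportion)(1 - column proportion)):

```python
import math

def adjusted_residuals(table):
    """Adjusted standardised residuals for a two-way frequency table,
    as printed by SPSS CROSSTABS."""
    n = sum(sum(row) for row in table)
    rows = [sum(row) for row in table]
    cols = [sum(row[j] for row in table) for j in range(len(table[0]))]
    result = []
    for i, row in enumerate(table):
        result.append([])
        for j, obs in enumerate(row):
            e = rows[i] * cols[j] / n                       # expected count
            v = e * (1 - rows[i] / n) * (1 - cols[j] / n)   # its variance
            result[i].append((obs - e) / math.sqrt(v))
    return result
```

In a 2 x 2 table the four residuals are equal in absolute size, and their square equals the Pearson chi-squared.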

Comparing Cronbach's alpha over two or more independent groups

Enter the number of groups, the number of cases in each group, the number of items in each scale (which can differ over groups) and the alphas, and you get an overall test of the difference in alpha over the groups and pairwise tests. Probably harder to find an occasion to use this program than it is to run it. The method was described by Feldt, Woodruff & Salih (1987).

Feldt, L. S., Woodruff, D.J., & Salih, F. A. (1987). Statistical inference for coefficient alpha. Applied Psychological Measurement, 11, 93-103.
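As a sketch of the pairwise part only, assuming the usual Feldt form in which W = (1 - alpha1)/(1 - alpha2) is referred to an F distribution with N1 - 1 and N2 - 1 degrees of freedom (the overall multi-group test in Feldt, Woodruff & Salih is more involved and isn't reproduced here; SciPy supplies the F probability, and the function name is mine):

```python
from scipy.stats import f

def compare_alphas(alpha1, n1, alpha2, n2):
    """Pairwise Feldt-type test of two independent Cronbach's alphas.
    Returns W = (1 - alpha1)/(1 - alpha2) and a two-sided p-value from
    an F distribution with (n1 - 1, n2 - 1) degrees of freedom."""
    W = (1 - alpha1) / (1 - alpha2)
    tail = f.sf(W, n1 - 1, n2 - 1) if W > 1 else f.cdf(W, n1 - 1, n2 - 1)
    return W, min(1.0, 2 * tail)
```

Equal alphas give W = 1 and p = 1; a large gap in alpha with decent samples gives a small p.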

Deciding which of two variables adds more to the prediction of the criterion

This implements an analysis given by Olkin & Finn (1995) in a terrific paper called Correlations Redux. The program is for their Model B and compares the models y = a + b and y = a + c to see whether there's a difference between what b and c add to the prediction of y.

The program needs the correlations ya, yb, yc, ab, ac and bc and the number of cases in the group.

Olkin, I., & Finn, J. (1995). Correlations redux. Psychological Bulletin, 118, 155-164.


Independent groups t-test and Cohen's d from means, SDs and Ns

This one has a GUI. Fun to use, but not to program. The t-test and Cohen's d results for equal and unequal-sized groups are given.
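The calculation from summary statistics is standard; a Python sketch using the pooled standard deviation (function name mine; SciPy supplies the p-value):

```python
import math
from scipy.stats import t as t_dist

def t_and_d(m1, sd1, n1, m2, sd2, n2):
    """Independent-groups t-test and Cohen's d from means, SDs and Ns,
    using the pooled standard deviation."""
    sp = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))
    t = (m1 - m2) / (sp * math.sqrt(1 / n1 + 1 / n2))
    d = (m1 - m2) / sp                       # standardised mean difference
    p = 2 * t_dist.sf(abs(t), n1 + n2 - 2)   # two-tailed p
    return t, d, p
```

For example, means of 10 and 9 with SDs of 2 and 50 cases per group give t = 2.5 and d = 0.5.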


Inverse-F: enter a p-value and the degrees of freedom and obtain the corresponding F-ratio

For example, p = .05, df = 3, 40, F = 2.84. What more do you want?

Cooke, D., Craven, A. H., & Clarke, G.M. (1982). Basic Statistical Computing. London: Edward Arnold.
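If you have Python handy, SciPy does the same job (note that ppf takes the cumulative probability, so an upper-tail p of .05 becomes .95):

```python
from scipy.stats import f

# F-ratio cutting off an upper-tail probability of .05 with df = 3, 40
F_crit = f.ppf(1 - 0.05, 3, 40)    # about 2.84
```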

Inverse normal: enter a p-value and obtain the corresponding normal deviate (z-value)

For example, p = .975, z = 1.9599.

Based on an algorithm by J.D. Beasley and S.G. Springer, first published in Applied Statistics, 1977, 26, p. 118.
Reprinted in Applied Statistics Algorithms, P. Griffiths & I.D. Hill (Eds), 1985

Randomisation test: oneway analysis of variance (independent groups)

A translation of a program given by Edgington (1987), program 4.3, page 73. It does a random set of permutations of subjects, not every possible permutation.

Compiled in PowerBASIC, and with an i7 processor, a large number of permutations is done in a flash.

With randomisation tests, the null hypothesis is taken seriously: the results obtained are simply a result of the random assignment of cases to groups. OK, so what if we make a large number of possible assignments of cases to groups at random: how likely are we to obtain the result we actually got? If it, or a less likely result, is highly unlikely to happen by chance (.05 or .01, etc.), we suspect the null hypothesis is incorrect.

The data can be entered from the keyboard or from a file, formatted as specified by the program. If you enter the data using the keyboard, you can save them to use again. Optionally, the statistic generated from each permutation can be saved in a file so that the statistics can be shown in a histogram after reading the file into SPSS. The syntax is available on the same site as this program. To speed things up, the program doesn't create the usual statistic, F, but a number which is equivalent to F for the purposes of the analysis.

Edgington, E. S. (1987). Randomization tests (Second edition). New York: Marcel Dekker.
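The logic can be sketched in Python (my own code, not Edgington's). I've used the common shortcut statistic, the sum over groups of (group total)^2 / (group size), which orders permutations the same way as F when the data are fixed; the program's exact "equivalent" statistic is an assumption on my part.

```python
import random

def randomisation_anova(groups, n_perm=10000, seed=1):
    """Randomisation test for a oneway independent-groups design: the
    p-value is the proportion of random reassignments of cases to groups
    whose statistic is at least as large as the observed one."""
    rng = random.Random(seed)
    sizes = [len(g) for g in groups]
    data = [x for g in groups for x in g]

    def stat(xs):
        total, start = 0.0, 0
        for size in sizes:
            t = sum(xs[start:start + size])
            total += t * t / size
            start += size
        return total

    observed = stat(data)
    hits = 1                         # count the observed arrangement itself
    for _ in range(n_perm):
        rng.shuffle(data)
        if stat(data) >= observed - 1e-9:
            hits += 1
    return observed, hits / (n_perm + 1)

obs, p = randomisation_anova([[1, 2, 3], [11, 12, 13]])
```

With these toy data the exact permutation p-value is .10 (only the observed split and its mirror image are as extreme), and the random sample of permutations lands close to that.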

Randomisation test: oneway repeated measures analysis of variance

A translation of a program given by Edgington (1987), program 5.2, page 113. It does a random set of permutations of observations, not every possible permutation.

The data are read from a file formatted as specified by the program. The test statistic for each permutation can be saved in a file.
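A sketch of the repeated-measures version (again my own Python, not Edgington's program): scores are shuffled within each subject, and the sum of squared condition totals serves as the statistic. It is equivalent to the repeated-measures F here because subject totals, and hence the other sums of squares, are unchanged by within-subject permutation.

```python
import random

def randomisation_rm_anova(data, n_perm=10000, seed=1):
    """Randomisation test for a oneway repeated-measures design.
    data[i] holds subject i's scores, one per condition."""
    rng = random.Random(seed)
    k = len(data[0])                 # number of conditions

    def stat(rows):
        totals = [sum(row[c] for row in rows) for c in range(k)]
        return sum(t * t for t in totals)

    rows = [list(r) for r in data]
    observed = stat(rows)
    hits = 1
    for _ in range(n_perm):
        for row in rows:
            rng.shuffle(row)         # permute within each subject only
        if stat(rows) >= observed - 1e-9:
            hits += 1
    return observed, hits / (n_perm + 1)

obs, p = randomisation_rm_anova([[1, 5], [2, 6], [1, 7]])
```

With three subjects and two conditions there are only eight within-subject arrangements, two of which are as extreme as the data, so the p-value hovers around .25.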


Randomisation tests: SPSS syntax to produce a histogram of the test statistics

This is an SPSS syntax file which reads the data file produced by either of the randomisation programs, prints the observed value of the test statistic and produces a histogram of the statistics produced from the permuted dataset. The user has to insert the name of the data file in the appropriate part of the syntax.

RobsRand: Producing Random Sequences

This program was written for Robyn Boyle when she was a postgraduate student at Macquarie. It's probably of limited usefulness now, given the number of programs available on the web, but here it is.

One feature is that if the number of possible values in a sequence is greater than the number of items in the sequence, RobsRand will equalise the numbers of each value over a collection of sequences, if that's possible. If it's not possible, RobsRand will tell you.

For example, the following default specs produce the collection shown below:

In this case, each of the 10 sequences contains all possible values, 1-10.

Now let's say that we ask for sequences of 10 values ranging from 1 to 15. We can't have all 15 values in a sequence of 10, but RobsRand produces a collection of ten sequences where, as shown in the listing below the sequences, each of the 15 values occurs with equal frequency.

The sequences can be saved, and the user can ask for information about what the program does within the program itself.


Significance of the difference between correlations for dependent measures

At time 1, the correlation between measures X and Y is obtained for a group of subjects; at time 2, the correlation between the same variables is obtained for the same subjects. This program tests whether the two correlations are significantly different. It would also apply if the correlation between X and Y was obtained for two groups of subjects who were matched one to one. It was given by Jaccard, Turrisi & Wan (1990).

The program requires all six correlations among X1, X2, Y1, and Y2, and the number of cases or pairs.

Jaccard, J., Turrisi, R., & Wan, C.K. (1990). Interaction effects in multiple regression. Newbury Park, California: Sage.

Significance of the difference between correlations for two or more independent groups

A common situation: we have the correlation between X and Y for two or more different groups of subjects; do the correlations differ over the groups? Look no further. The program needs the number of groups and the correlation and number of cases in each group. A file of results can be saved. The program is based on the method given by Edwards (1976), a gem of a book.

Edwards, A.L. (1976). An introduction to linear regression and correlation. San Francisco: W.H. Freeman.
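The standard approach, and I believe the one Edwards describes, uses Fisher's r-to-z transformation; here is a Python sketch of the k-group chi-squared test (function name mine; SciPy supplies the chi-squared probability):

```python
import math
from scipy.stats import chi2

def compare_independent_rs(rs, ns):
    """Test that k independent correlations are equal. Each r is
    transformed with Fisher's z; the statistic sum (n_i - 3)(z_i - zbar)^2
    is referred to chi-squared with k - 1 degrees of freedom, where zbar
    is the mean z weighted by n_i - 3."""
    zs = [0.5 * math.log((1 + r) / (1 - r)) for r in rs]
    ws = [n - 3 for n in ns]
    zbar = sum(w * z for w, z in zip(ws, zs)) / sum(ws)
    stat = sum(w * (z - zbar) ** 2 for w, z in zip(ws, zs))
    return stat, chi2.sf(stat, len(rs) - 1)
```

Identical correlations give a statistic of zero; widely separated correlations in decent-sized groups give a tiny p-value.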

Significance of the difference in correlations between two variables (X1 and X2) and Y for the same cases

Based on formulae given by Steiger (1980). Needs the correlations among all three variables and the number of cases.

Steiger, J. H. (1980). Tests for comparing elements of a correlation matrix. Psychological Bulletin, 87, 245-251.


Testing the difference between independent or related intraclass reliability coefficients

Uses the methods of Alsawalmeh & Feldt (1992, 1994).

The number of items (measurements, judges) used for two INDEPENDENT groups may be different. For RELATED comparisons, it is assumed that the same group of subjects (targets) is used to obtain the two intraclass correlations. However, the number of judges/items can differ. For RELATED comparisons, the correlation between the two measures is needed.


Alsawalmeh, Y. N., & Feldt, L. S. (1992). Test of the hypothesis that the intraclass reliability coefficient is the same for two measurement procedures. Applied Psychological Measurement, 16, 195-205. (independent subjects and judges)

Alsawalmeh, Y. N., & Feldt, L. S. (1994). Testing the equality of two related intraclass reliability coefficients. Applied Psychological Measurement, 18, 183-190. (related subjects and/or judges/items)





The programs in the BASIC folder were originally written in Microsoft QuickBASIC, which was a nifty version of BASIC with a nice (for those days) programming environment or IDE. Unfortunately programs compiled in QB4.5 don't run under Windows 10, so many of the programs have been re-compiled using PowerBASIC. Others have been re-compiled in QB64, 'BASIC for the modern era', which retains QB4.5 compatibility. A lovely piece of work.

Why console versions? Mainly because they're easier to write than programs with GUIs, and most people who use them will do so only a few times and won't be too upset by the relatively primitive interface. Look, I grew up with dumb terminals. The command line interface is still cool with me.


These programs were produced while I was working at Macquarie University. Thanks to the people who provided the occasions for me to write them; it was a pleasure to do so and to see them used.


Alan Taylor