ISI - International Statistical Institute
Newsletter Volume 26, No. 1 (76) 2002

Free Statistical Tools on the WEB

There is a great deal of statistical information available for free on the Web. This includes data sets, general statistical textbooks, e-mail lists, software, and many sites on special topics such as epidemiology, forecasting, data presentation, data editing, multiple imputation, and propensity score analysis. This article is a brief review of some useful sites covering these topics.

The best place to start learning about statistics is HyperStat Online, at http://davidmlane.com/hyperstat/index.html. Aside from being a nice statistics book in its own right, it includes a comprehensive list of other on-line statistics books. Most of these are at a basic to intermediate level. One book, the Statsoft text, at http://www.statsoft.com/textbook/stathome.html, covers fairly advanced topics.

The best place to start for epidemiology is EpiMonitor, http://www.epimonitor.net/index.htm, which has a very comprehensive list of links. Another very good place to start is http://www.epidemiolog.net/. This site also has a fairly comprehensive listing of epidemiology sites, as well as an on-line textbook. Another on-line textbook is Epidemiology for the Uninitiated, at http://www.bmj.com/epidem/epid.html. Finally, one more interesting site is the Epidemiology Supercourse, http://www.pitt.edu/~super1/, which is a set of on-line lectures on various epidemiology topics. These lectures can be downloaded and used, whole or in part, in your own lectures.

After analyzing data, it is very helpful to know how best to present the results. Two very good sites are Conventions for the Graphical Display of Data, and the Washington Statistical Society Methodology Seminars page Data Presentation: A Guide To Good Graphics, at http://www.science.gmu.edu/~wss/methods/zawitzg.html. The BTS Guide to Good Statistical Practice, at http://www.bts.gov/programs/statpol/btsguide.html, also has a useful section on presenting results.

There is also a great deal of free software on the net. The best place to find free statistical software is the Free Statistical Software site at http://members.aol.com/johnp71/javasta2.html. This site lists general-purpose software, as well as software devoted to specific purposes, such as curve fitting, epidemiology, surveys, and programming. It also gives a brief description of each package.

One general site, especially helpful for students, is the American Statistical Association, at http://www.amstat.org/. It has a very good list of links, including links to other statistical societies, electronic resources and granting agencies. It also has a job site. People who are interested in surveys should check out the Survey Research Methods Section, http://www.amstat.org/sections/srms/, especially its What is a Survey series.

The best place to start for e-mail lists is Allstat, at http://www.ltsn.gla.ac.uk/allstat/. Besides hosting a nice e-mail list itself, it maintains a comprehensive list of other statistical e-mail lists. Probably the most popular general statistics list is stat-l, at http://www.cmh.edu/stats/faq/faq.htm.

The best general place to look for sources of data is Statistical Resources on the Web, at http://www.lib.umich.edu/govdocs/stats.html. This is a comprehensive guide to data on many topics, including health, demographics, labour, economics, environment, and much more. Another such site is Statistics, at http://www.statistics.com/. A starting point for social, political and economic data is the Social Change data page, at http://gsociology.icaap.org/data.html, which also links to a number of other sites that collect data links. Another very good starting point is the Social Policy Virtual Library data page, at http://users.utu.fi/thepap/world3.htm.

There are dozens of special topic sites. One is the Multiple Imputation FAQ page, at http://www.stat.psu.edu/~jls/mifaq.html. Multiple imputation is a method of filling in missing data by using other variables to predict the missing values, repeating the process several times so that the uncertainty in the filled-in values is reflected in the results. Another site is a paper by Rubin explaining propensity score analysis, at http://www.symposion.com/nrccs/rubin.htm. Propensity score analysis is a method of dealing with self-selection bias by comparing subjects who had a similar estimated probability of being in the treatment group. The Federal Committee on Statistical Methodology, at http://www.fcsm.gov/spwptbco.html, has some interesting papers, especially RL2 (not RL1), Record Linkage Techniques - 1997: Proceedings of an International Workshop and Exposition. Another interesting special topic site is the multilevel modelling project at http://multilevel.ioe.ac.uk/. A site about data mining is kdnuggets, at http://www.kdnuggets.com/ (a newsletter and a general links site). A forecasting site is the Federal Forecasters Conference, at http://nces.ed.gov/surveys/ffc/. Conference proceedings can be downloaded from this site. Another useful site is Statistical Data Collection and Processing, at http://www.unece.org/stats/archive/02.02.e.htm, from which reports and working papers can be downloaded.
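
As a rough illustration of the multiple imputation idea mentioned above, the following Python sketch fills in missing values of one variable from a second variable, several times over, and averages the results. It is a simplified sketch under stated assumptions, not the method described on the FAQ page: the function names (fit_line, multiply_impute_mean), the straight-line prediction model, and the example data are all hypothetical, and a full multiple imputation procedure would also account for uncertainty in the fitted line and combine variances, not just means.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b, residual_sd)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    sd = (sum(r * r for r in resid) / (len(xs) - 2)) ** 0.5
    return a, b, sd


def multiply_impute_mean(xs, ys, m=5, seed=0):
    """Return (pooled_mean, per_copy_means) for y; None in ys marks a missing value.

    Assumption: y is roughly linear in x, and x is never missing.
    """
    rng = random.Random(seed)
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    a, b, sd = fit_line([x for x, _ in observed], [y for _, y in observed])
    means = []
    for _ in range(m):
        # Each copy gets its own random draw around the predicted value,
        # so the copies differ and reflect imputation uncertainty.
        completed = [
            y if y is not None else a + b * x + rng.gauss(0, sd)
            for x, y in zip(xs, ys)
        ]
        means.append(statistics.fmean(completed))
    # Pool the m estimates by averaging them.
    return statistics.fmean(means), means


# Hypothetical example data: two values of y are missing.
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, None, 8.2, None, 12.1]
pooled, per_copy = multiply_impute_mean(xs, ys)
print(pooled, per_copy)
```

Averaging over several filled-in copies, instead of filling in each missing value once, is what distinguishes multiple imputation from single imputation: the spread among the copies shows how much the answer depends on the values that were never observed.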

Finally, many faculty have class notes posted on the web. Two of these are Bob Nau's class notes on forecasting at http://www.duke.edu/~rnau/411out00.html, and Hossein Arsham's Time Series Analysis and Forecasting Techniques, at http://obelia.jde.aca.mmu.ac.uk/resdesgn/arsham/opre330Forecast.htm.

Gene Shackman* 
Research Methods Website Manager 
http://gsociology.icaap.org/methods 

 

* Neither Dr. Shackman nor ISI endorses any of the sites listed here, and neither assumes responsibility for the content of the Websites listed in this article. This article is presented solely for educational and reference purposes.

