ISI - International Statistical Institute
A short version of this article first appeared in the International Statistical Association newsletter, Vol 26, Number 1 (76), 2002, and is at http://isi.cbs.nl/NLet/NLet021-04.htm and http://isi.cbs.nl/FreeTools.htm
There is a great deal of information about statistics available for free on the web. This information includes data sets, general statistical textbooks, email lists, software, and many sites about special topics, such as epidemiology, forecasting, data presentation, data editing, multiple imputation, and propensity score analysis. This article is a brief review of some useful sites covering these topics.
To start with, World Statistics Day http://unstats.un.org/unsd/wsd/Default.aspx was recently celebrated, on October 20, 2010. According to the UN, the goal of this day was to "pay tribute to statisticians’ outstanding work in producing and disseminating the necessary data to respond to the every day new challenges and to measure progress in people’s lives." (World Statistics press release, http://unstats.un.org/unsd/wsd/docs/WSD_18Oct2010.pdf .) This was billed as the first World Statistics Day, so perhaps there will be more.
When looking for statistical information, there are several sites that serve as general link collections. One is the Intute statistics page, http://www.intute.ac.uk/statistics/ , which has sub-pages on demography, international and national indicators, and statistical theory. The Intute site has a variety of categories, such as data, educational material, government sites, mailing lists and societies. Two other general sites are Betty Jung's statsites http://www.bettycjung.net/Statsites.htm and statsci http://www.statsci.org/index.html
The best place to start for learning about statistics is HyperStatistics Online, at http://davidmlane.com/hyperstat/ . It is both a nice statistics book and a comprehensive list of other on line statistics books, most of which are basic to intermediate. One book, the Statsoft text, http://www.statsoft.com/textbook/ has the basics as well as fairly advanced topics. Another, Statistics at Square One http://resources.bmj.com/bmj/readers/statistics-at-square-one/statistics-at-square-one is a fairly introductory book, but dates from 1997. Another approach is Robert Niles' site Statistics Every Writer Should Know http://www.robertniles.com/stats/ with plain English explanations of many basic statistical concepts.
People can also take free on line training classes on statistics, for example, from the North Carolina Center for Public Health Preparedness Training Web Site, http://nccphp.sph.unc.edu/training/index.php, or the University of Minnesota's Midwest Center for Life-Long Learning in Public Health http://www.sph.umn.edu/ce/mclph/ . These classes offer a certificate at the end of the training. StatTrek http://stattrek.com/ also has a couple of on line tutorials. Another project, from Claremont Graduate University, is the Web Interface for Statistical Education http://wise.cgu.edu/ , also with some on line tutorials and links to resources. An open course from Carnegie Mellon http://oli.web.cmu.edu/openlearning/forstudents/freecourses presents material used in the course taught at the university.
Since statistics is difficult to learn and it is not always clear to the general public how statistics may be useful, there is one project aimed at educating the public: the International Statistical Literacy Project http://www.stat.auckland.ac.nz/~iase/islp/home . The mission of this project "is to support, create and participate in statistical literacy activities and promotion around the world." A similar project is Statistical Literacy http://www.statlit.org/ which is a central resource for events and links to presentations and other information. A related project is stats.org http://www.stats.org/ from George Mason University. This project describes basic statistical terms, but its main focus seems to be discussing news stories and how to understand the statistics in those news stories.
Two government websites also try to help the public understand statistics. The Australian Bureau of Statistics http://www.abs.gov.au/websitedbs/a3121120.nsf/home/Understanding%20statistics has an on line class and a page defining statistical terms. The US National Atlas has a page on Understanding Descriptive Statistics http://www.nationalatlas.gov/articles/mapping/a_statistics.html
There are a number of statistical associations. An international association is the International Statistical Institute http://isi-web.org/ . Some other associations are the American Statistical Association http://www.amstat.org/ , the International Chinese Statistical Association http://www.icsa.org/ and the International Indian Statistical Association http://www.intindstat.org/ . Statsci has a list of associations http://www.statsci.org/soc.html as does the International Statistical Institute http://isi-web.org/statsoc/nsslist .
There are a number of email lists. Allstat, at https://www.jiscmail.ac.uk/cgi-bin/webadmin?A0=allstat is a general list, although many of the postings appear to be about jobs or training courses. Another list, stat-l, at http://lists.mcgill.ca/archives/stat-l.html focuses more on statistical questions. Another useful list, not on Allstat, is Epidemio, at http://www.listes.umontreal.ca/wws/info/epidemio-l . This list is about epidemiology. Another form of discussion group is the forum. TalkStats http://www.talkstats.com/ is one forum, with discussions ranging from basic to advanced and from homework to theory. A smaller forum is from Statistics.com http://www.statistics.com/resources/discussionboards/ with only two general categories, statistical methods and homework.
There are a number of comprehensive places to look for data. One starting point for social, political and economic data is the Global Social Change Research Project http://gsociology.icaap.org/, which has both links to a very large number of other data link sites, and a page of data sets compiled or created from other data sets. Many of the data sets listed on this project site are public domain. All of the data are free to use. This UN site http://data.un.org/ has data on nearly every topic, from the UN and its various associates. The World Bank also has a data page http://data.worldbank.org/ . Most of the data on the World Bank site and all of the data on the UN site may be used freely. This UN site http://unstats.un.org/unsd/methods/inter-natlinks/sd_natstat.asp and this BLS site http://www.bls.gov/bls/other.htm link to national statistical centers of most countries of the world.
There are a number of statistical journals on the web with free content. Many of these are listed at the Directory of Open Access Journals http://www.doaj.org/doaj?func=subject&cpid=59 page on statistics. Some of the journals listed here include the Latin American Journal of Probability and Mathematical Statistics http://alea.impa.br/english/index_v7.htm , the Electronic Journal of Applied Statistical Analysis http://siba-ese.unisalento.it/index.php/ejasa/index , and the Journal of Official Statistics http://www.jos.nu/
There are resources about dozens of specific topics on the web. Some of these topics include epidemiology, graphical analysis and presentation, missing data, forecasting, gathering data and meta-analysis.
Epidemiology: The two best places to start for epidemiology are EpiMonitor, http://www.epimonitor.net/index.htm, which has a very comprehensive list of links, and the WWW Virtual Library: Epidemiology http://www.epibiostat.ucsf.edu/epidem/epidem.html , another gateway. Another very good place to start is epidemiolog, at http://www.epidemiolog.net/. This site also has a fairly comprehensive listing of epidemiology sites, as well as an on-line textbook. First time visitors should start at http://www.epidemiolog.net/evolving/ . Another free on-line textbook is Epidemiology for the Uninitiated, at http://resources.bmj.com/bmj/readers/epidemiology-for-the-uninitiated/epidemiology-for-the-uninitiated-fourth-edition (from 1997).
A very good place to find world epidemiological data, reports, issues and information is WHO http://www.who.int/topics/epidemiology/en/ which includes, for example, the 10 leading causes of death and the Weekly Epidemiological Record.
There are also two interesting sites for learning epidemiology. One is the Epidemiology Supercourse, http://www.pitt.edu/~super1/, which is a set of on line lectures on various epidemiology courses. These lectures can be downloaded and used, whole or in part, in your own lectures. The North Carolina Center for Public Health Preparedness Training Website http://nccphp.sph.unc.edu/training/ has free on line training for biostatistics, epidemiology, and other topics. You can get certificates for each class you complete. Each class is 1/2 to 1 hour.
Presenting Results: After analyzing data, it is very helpful to know how to best present the results. Very good sites are: Informative Presentation of Tables, Graphs and Statistics, at http://www.reading.ac.uk/ssc/publications/guides/toptgs.html , Washington Statistical Society Methodology Seminars, Data Presentation: A Guide To Good Graphics http://www.scs.gmu.edu/~wss/methods/zawitzg.html and Presenting Data http://lilt.ilstu.edu/gmklass/pos138/datadisplay/ . Also BTS’s Guide to Good Statistical Practice has a useful section on presenting results, at http://www.bts.gov/publications/guide_to_good_statistical_practice_in_the_transportation_field/index.html . For some interesting good and bad examples, see the Gallery of Data Visualization, at http://www.math.yorku.ca/SCS/Gallery/ . More recently, there are sites showing moving charts, like Gapminder http://www.gapminder.org/ or mapping international data like Show http://show.mappingworlds.com/world/
Missing Data: Two sites that give overviews of missing data are the University of Texas Statistical Services FAQ page, #25, at http://www.utexas.edu/its-archive/rc/answers/general/gen25.html and Professor von Hippel's FAQ page http://www.sociology.ohio-state.edu/people/ptv/ where he talks about whether data are missing at random or not, and how to deal with the missing data. Also see the first couple of paragraphs of Dr. Howell's page http://www.uvm.edu/~dhowell/StatPages/More_Stuff/Missing_Data/Missing.html . One way to deal with missing data is multiple imputation, described at the Multiple Imputation FAQ page, at http://www.stat.psu.edu/~jls/mifaq.html . Multiple imputation fills in missing data by using other variables to predict the missing values, repeating this several times so that the analysis can reflect the uncertainty in the filled-in values. This method is also described at Joseph Schafer’s site, in a 1999 article, "Multiple imputation: a primer", at http://www.stat.psu.edu/~jls/index.html . One software program for estimating missing data is AMELIA, at http://gking.harvard.edu/stats.shtml
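To make the idea of multiple imputation concrete, here is a minimal sketch in Python. It is not taken from any of the sites above. The function names (fit_line, impute_once, pooled_mean) and the small data set are hypothetical, and the approach is deliberately simplified: each missing y value is predicted from x by a fitted line plus random noise, and the results from several completed data sets are combined with Rubin's rules. A full multiple imputation procedure would also account for uncertainty in the fitted line itself, which this sketch does not do.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, residual_sd)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    sd = (sum(r * r for r in resid) / (len(xs) - 2)) ** 0.5
    return a, b, sd


def impute_once(xs, ys, rng):
    """Fill each missing y (None) with a regression prediction plus noise."""
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    a, b, sd = fit_line([x for x, _ in observed], [y for _, y in observed])
    return [y if y is not None else a + b * x + rng.gauss(0, sd)
            for x, y in zip(xs, ys)]


def pooled_mean(xs, ys, m=20, seed=0):
    """Estimate the mean of y from m imputed data sets using Rubin's rules."""
    rng = random.Random(seed)
    n = len(ys)
    estimates, variances = [], []
    for _ in range(m):
        completed = impute_once(xs, ys, rng)
        estimates.append(statistics.fmean(completed))
        variances.append(statistics.variance(completed) / n)
    q_bar = statistics.fmean(estimates)        # pooled point estimate
    within = statistics.fmean(variances)       # average within-imputation variance
    between = statistics.variance(estimates)   # variance between imputations
    total_var = within + (1 + 1 / m) * between
    return q_bar, total_var


# Hypothetical data: y is missing (None) for two of the eight cases.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, None, 8.2, 9.8, None, 14.1, 16.0]
estimate, variance = pooled_mean(xs, ys)
print(f"pooled mean = {estimate:.2f}, standard error = {variance ** 0.5:.2f}")
```

The between-imputation term is what distinguishes this from filling in each gap once: it widens the standard error to reflect that the imputed values are guesses.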
Forecasting: Two faculty members have lectures about forecasting on the web. These are Bob Nau's class notes on forecasting at http://www.duke.edu/~rnau/411out00.html, and Hossein Arsham's Time Series Analysis and Forecasting Techniques, at http://home.ubalt.edu/ntsbarsh/Business-stat/stat-data/Forecast.htm . Another forecasting site is the Federal Forecasters Consortium, at http://www1.va.gov/vhareorg/ffc.htm . Conference proceedings can be downloaded from this site.
Methods of gathering data: There are a number of sites on gathering data. Two places to start are Resources for Methods in Evaluation and Social Research, at http://gsociology.icaap.org/methods/ and The World Wide Evaluation Information Gateway http://www.policy-evaluation.org/ . These sites link to other sites about methods, both quantitative and qualitative. Some sites are about specific tools in data gathering. The Statnotes site has a section on survey methods, at http://faculty.chass.ncsu.edu/garson/PA765/survey.htm . Tom O'Connor's lecture notes, at http://www.drtomoconnor.com/3760/default.htm cover various issues such as measurement, validity and reliability, and scales and indexes.
Meta-analysis: There are several introductions to meta-analysis. One is an Epidemiology Supercourse lecture, How to conduct a Meta-Analysis http://www.pitt.edu/~super1/lecture/lec1171/index.htm . Another is an on line book, Meta-Analysis: Methods of Accumulating Results Across Research Domains, by Larry C. Lyons, at http://www.lyonsmorris.com/MetaA/index.htm (this link sometimes doesn't work). Another site is The Meta Analysis of Research Studies http://echo.edres.org:8080/meta/ which has an overview and links to documents and resources.
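As a rough illustration of what a basic meta-analysis computes, the Python sketch below pools effect estimates from several studies using fixed-effect inverse-variance weighting, one common approach and not necessarily the one taught on the sites above. The function name fixed_effect_pool and the three study results are hypothetical, made up only for this example.

```python
import math


def fixed_effect_pool(effects, std_errors):
    """Inverse-variance weighted average of study effects and its standard error."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


# Hypothetical effect sizes and standard errors from three studies.
effects = [0.30, 0.10, 0.25]
std_errors = [0.10, 0.20, 0.05]
pooled, pooled_se = fixed_effect_pool(effects, std_errors)
low, high = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
print(f"pooled effect = {pooled:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

More precise studies (smaller standard errors) get more weight, and the pooled standard error is smaller than that of any single study. This sketch assumes all studies estimate the same underlying effect; when that is doubtful, a random-effects model is usually considered instead.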
Other topics include propensity score analysis http://www.epa.gov/caddis/da_advanced_5.html . Propensity score analysis is a method of dealing with self selection bias. Also, the Federal Committee on Statistical Methodology, at http://www.fcsm.gov/reports/ , has some interesting papers, especially RL2, Record Linkage Techniques - 1997: Proceedings of an International Workshop and Exposition. (This is RL2, not RL1.) Another interesting special topic site is the Centre for Multilevel Modelling at http://www.cmm.bristol.ac.uk/ . One site about data mining is kdnuggets at http://www.kdnuggets.com/ (a newsletter and a general links site).
Gene Shackman
Neither Dr. Shackman nor ISI endorses any of the sites listed here, and neither assumes responsibility for the content of the websites listed in this article. This article is presented solely for educational and reference purposes.
Last updated and verified 11/10/2010.