Dec 4
Smerconish
on death penalty. -
Capital
Punishment and Homicide -
Myths of
Murder and Multiple Regression.
Police Crackdowns
and Slowdowns.
December 4 - Field Research
Margaret
Mead, the only anthropologist (or
sociologist)
to get her own postage stamp, won fame through field work, primarily
her
book Coming
of Age in Samoa. Later, this book was denounced by
anthropologist
Derek Freeman in his book Margaret
Mead and the Heretic : The Making and Unmaking of an Anthropological
Myth. Anthropologists
have come to Mead's defense, and
have restudied the case, but I would have to agree with your text
that
"had Mead come back from Samoa with an accurate ethnographic report, it
would not have made her famous."
More recently, a raging controversy has surrounded the book Darkness
in El Dorado, about research on the Yanomamo in Venezuela. It is the
latest
ethical controversy, and it also raises important methodological
questions.
Many of the book's allegations, however, have
been contested by the National Academy of Sciences.
The combining
of fiction with factual research is increasingly common both in
anthropology
and in biographies. Sometimes this is
openly
done as a literary form, in other cases such as that of Rigoberta
Menchu,
it is only admitted when
critics discover it.
The
Rigoberta Menchu Controversy by Arturo Arias.
There
are many problems with field research: ethical issues, problems
of
reliability and validity when data are gathered by only one researcher,
etc. A controversial book is Laud Humphreys's Tearoom
Trade, which raises ethical issues. He studied gay sex in a
men's
room in a park in St. Louis, without informing the participants what he
was doing.
Field researchers sometimes seem to find examples that fit their
preconceptions,
and their work is often ignored by those who do not like the results,
e.g.,
Leon Dash's books When
Children Want Children and
Rosa Lee, which are simply ignored by welfare advocates who prefer
more sympathetic treatments. One of the best field studies is
Kathryn Edin's book Making
Ends Meet, which is highly sympathetic to the mothers.
However,
Edin collected statistical data as well as her illustrative
observations.
The statistics showed that almost none of the mothers actually lived
off
their grants alone. Elijah Anderson's book Streetwise,
on men in a Philadelphia ghetto, has been well received, in large part
because
it goes beyond one-sided advocacy.
James Flatley, Etienne Jackson and Robert Wood's
Video version of
Down
Germantown Avenue.
A great strength of field work is observing behaviors that the people
themselves
don't understand or aren't even aware of, or at any rate are unable
or unwilling to talk about. Anthropologist Jules
Henry spent a week living in each of the homes of several children
who had grown up mentally ill,
trying
to discern patterns in the family interactions that contributed to the
illness. Myra Bluebond-Langner's book The
Private Worlds of Dying Children has been very influential;
she
has just published a sequel called In
the Shadow of Illness : Parents and Siblings of the Chronically Ill
Child
Field research offers a richness of description and possibility of new
insights
that is unparalleled by any other method. Unless it is supplemented
with other methods, it does not provide statistical data, and it is
hard
to replicate.
Myra Bluebond-Langner of our Anthropology Department wrote a classic, The
Private Worlds of Dying Children, and more recently, In
The Shadow of Illness.
Coming
of Age in New Jersey.
The
Corner. Memoirs: Frey
dispute with Oprah.
Black
American Students in an Affluent Suburb. by John
Ogbu.
Elijah Anderson, The
Code of the Street and others.
Commentary
on Ogbu's research.
Many scholars who have disputed those findings rely on a
continuing survey of about 17,000 nationally representative students,
which is conducted by the National Center for Education Statistics, an
arm of the federal government. This self-reported survey shows that
black students actually have more favorable attitudes than whites
toward education, hard work and effort.
But that has by no means settled the debate. In the February
issue of the American Sociological Review, for example, scholars who
tackled the subject came to opposite conclusions. One article (by three
scholars) said that the government data were not reliable because there
was often a gap between what students say and what they do; another
article by two others said they found that high-achieving black
students were especially popular among their peers.
"It's difficult to determine what's going on," said Vincent
J. Roscigno, a professor of sociology at Ohio State University who has
studied racial differences in achievement. "I'm sort of split on Ogbu.
It's hard to compare a case analysis to a nationally representative
statistical analysis. I do have a hunch that rural white poor kids are
doing the same thing as poor black kids. I'm tentative about saying
it's race-based."
Indeed, Professor Mickelson of the University of North
Carolina found that working class whites as well as middle-class blacks
were more apt to believe that doing well in school compromised their
identity.
All these years later, Professor Fordham said, she fears that
the acting-white idea has been distorted into blaming the victim. She
said she wanted to advance the debate by looking at how race itself was
a social fiction, rooted not just in skin color but also in behaviors
and social status.
"Black kids don't get validation and are seen as trespassing
when they exceed academic expectations," Professor Fordham said,
echoing her initial research. "The kids turn on it, they sacrifice
their spots in gifted and talented classes to belong to a group where
they feel good."
Frey
Dispute with Oprah Dutch:
Fictionalized Reagan Bio. NY
Times review.
November 29 - The powerpoint on Unobtrusive Research is in
SAKAI.
Content Analysis - uses "unobtrusive
data": data created
by a bureaucratic system, e.g., police records, or often by the
media.
We study television or newspapers either because the media themselves are
our interest,
or as a way of getting information, e.g., on crime reported in the news.
Similar to survey research, except
that you do coding
instead of interviewing. Coding means that you assign numbers to
phenomena that you observe. Counting things. Each of your
variables
is coded from the published information.
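Coding can be sketched in a few lines of Python. This is only an illustration: the coding scheme, the topic labels and the list of stories are all invented here, and in a real study the coder's judgments come from reading the published material.

```python
from collections import Counter

# Hypothetical coding scheme: each topic category gets a number.
TOPIC_CODES = {"crime": 1, "politics": 2, "sports": 3, "other": 9}

# Hypothetical judgments a coder made about six news stories.
coded_by_hand = ["crime", "politics", "crime", "other", "sports", "crime"]

# Coding: assign the number for each observed category.
codes = [TOPIC_CODES[topic] for topic in coded_by_hand]

# Counting: how often does each code appear?
frequencies = Counter(codes)

for code, count in sorted(frequencies.items()):
    print(f"code {code}: {count} stories")
```

Once every story has a numeric code for each variable, the data can be tabulated just like survey answers.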
Conceptualization.
Measurement. Reliability and Validity.
Manifest Content - what it's about on
the surface
Latent Content - things that we infer
about the content,
e.g., does the writer sound angry, indignant, sexy?
A Content
Analysis Study of Editorial Cartoons.
A
Content Analysis of Internet-Accessible Written Pornographic Depictions.
Our
interview data: Coding scheme in Microcase.
Frequencies. How to interpret?
November 27
Regression equation. This equation is used with two variables
measured on an interval scale, or at least an ordinal scale that
approximates the interval. It assumes a linear relationship
between variables, i.e., that if you plot them on a scatter plot they
will approximate a straight line. If they do not, then the
regression equation is not an appropriate tool, at least not unless the
data can be modified in such a way that the relationship appears linear
(e.g., by converting to logarithms). This can best be illustrated
with the scatterplot tool in Microcase.
The equation is used to predict the Dependent
variable (Y) with the Independent variable (X).
Y = a + b X where:
X is the independent variable
Y is the dependent variable
b is the "unstandardized regression coefficient"
a is the "intercept"
In the exercise we will be doing, year will be the independent
variable. The advantage of this is it allows us to project into
the future, assuming that trends are linear and will continue to be so.
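A minimal sketch of the equation in Python rather than Microcase. It estimates a and b by ordinary least squares, which is the usual method (these notes do not name one). The yearly counts are invented for illustration, and fit_line and predict are hypothetical helper names.

```python
def fit_line(xs, ys):
    """Least-squares estimates of a (intercept) and b (slope) in Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict(a, b, x):
    """Predicted value of the dependent variable for a given X."""
    return a + b * x


# Hypothetical data: year is the independent variable.
years = [2000, 2001, 2002, 2003, 2004]
counts = [50, 54, 58, 62, 66]

a, b = fit_line(years, counts)
projection_2010 = predict(a, b, 2010)
print(f"b = {b:.2f} per year, projected 2010 value = {projection_2010:.1f}")
```

The projection is only as good as the assumption that the trend stays linear.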
November 13: Powerpoint on Survey Research is in SAKAI.
Questionnaire
design.
Tips
for Writing Questionnaire Items.
National Crime
Victimization Survey as an example.
Interviewing
Guidelines.
Survey
Interviewing.
November 8.
Critical
article from NY Times on ethics panels. Carol Tavris,
The High Cost
of Skepticism. The test we took in class today is in SAKAI in
the resources folder with the filename IRBtest.doc.
November 6. Experimental Methods. Two
powerpoints are in SAKAI: ExperimentalResearch.ppt and
ExperimentalMacroSociology.ppt
Experimental
Designs. See the graphs in the book
or on Trochim's WEB site:
Types of
Designs.
Essential
characteristics:
- Two or more groups are matched, usually by random
assignment,
sometimes by a kind of stratified random selection, e.g., an equal
number
of men and women or blacks and whites in each group. But the key
is random assignment so that the groups can be assumed to be the same
on
all variables. "Quasi-experiments" are when we use groups that
are
pretty much the same but we didn't assign people at random
- The Independent Variable is "manipulated," i.e.,
it is applied
to one group and not to the other
- Change in the Dependent Variable is measured
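Random assignment can be sketched as follows. This is an illustration under stated assumptions: the subject labels are invented, assign_groups is a hypothetical helper, and the seed is fixed only so the example is repeatable.

```python
import random


def assign_groups(subjects, seed=None):
    """Shuffle the subjects and split them into two equal-sized groups."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical pool of 20 volunteers.
subjects = [f"subject_{i}" for i in range(1, 21)]

# The treatment group gets the independent variable, the control group does not.
treatment, control = assign_groups(subjects, seed=42)
print("treatment:", treatment)
print("control:", control)
```

Because chance alone decides who lands in which group, the two groups can be assumed to be alike on all variables, within sampling error.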
Experiments can be done:
- In laboratory settings with volunteers, e.g.,
student volunteers
- The Milgram
Experiment on Obedience to Authority
- In institutional settings such as prisons,
hospitals, rehabilitation
centers, etc., where people are assigned to treatment groups
- New drugs and medical treatments generally must
be shown
to work in experiments before they are approved for use. Often,
treatment
is compared to a placebo. These experiments are usually
"double-blind,"
to control for the psychological effects of knowing one is getting
treatment.
This is a way of controlling subject bias and experimenter bias.
- In criminal justice, one might do an experiment
comparing
a "halfway house" to a drug treatment program to a prison term for
offenders.
To do this, you would have to get the judge to assign offenders to
different
programs at random. Ethical issues are raised here and there are
likely to be objections
- Occasionally in natural settings, for example
- welfare reform
experiment, assign some recipients to the
old program, some to the
new. This didn't work very well, there
were
errors in the group assignments and the women often forgot which group
they were in anyway
- vaccination experiments
- guaranteed annual income experiments
- Kansas
City Patrol Experiment.
Although logically experiments are the most rigorous
way
to test causal hypotheses, there are practical problems:
- It may be hard to manipulate the independent
variable effectively,
it may not have enough importance to people that they notice it
- Experimental conditions may not be realistic
enough, e.g.,
the Milgram experiments having people apply electric shock to people,
experiments
that simulate being in prison. An experiment is not the real
world
and people know it. This is the question of external validity: does the
experiment
match real-world conditions?
- There may be problems of internal validity,
difficulties
in carrying out the experiment:
- "History" effects - the world changes during
the experiment,
and people are affected by things in the real
world
- Maturation, people get older, learn more
- Testing effects, taking the pretest measure
affects people,
causes them to change. Sometimes we have a matched but untested
control
group that is measured only after the experiment.
- Instrument effects, the testing instrument may
change.
You can't use the same exact test sometimes because people will
remember
it, so items change
- Regression to the mean, just by chance the
people who got
extremely high or low scores on a pretest are likely to get more
average
scores on the second test.
- Subject "mortality" - we may lose people.
This is especially
a problem in testing things like drug rehabilitation, it works for the
people who stick with it, the failures drop out
- Ethical concerns: people may not be willing
to be experimented
on, or it may be harmful to subject them to experimental conditions,
e.g.,
- The Tuskegee syphilis experiment denied some men
penicillin.
You can only deny an experimental drug if you are not "certain" that it
works or if the condition is not serious, e.g., common cold research
- A big strength of experiments is resolving
questions that
involve different recollections of events, e.g., children's reports of
abuse. You don't know what "really" happened and people disagree
on how well they accept the recollections of different people. In
an experiment, you know what really happened, so you can check the
accuracy
of perception. We find that children often remember things that
didn't
really happen. The 20/20 report on child abuse experiments
(VIDEO shown in class from an ABC News 20/20 show aired October 22,
1993, hosted by Hugh Downs. Transcript available at
www.transcriptstv.com) demonstrates false memory: we know what
really happened because it happened in a controlled experimental
setting. This is much more difficult to establish in real life
case histories: Loftus:
Who Abused
Jane
Doe? There is other information
online on the Kelly
Michaels case and other cases.
Another example we can look at is an experimental study of internet
downloads. This was published in Science magazine because it
demonstrates a sociological principle with rigorous experimental
data. Several documents from this study are in WEBCT,
the most accessible summary is in a file called "Experimental
Macrosociology".
October 30 - Sampling - The
powerpoint used in this class is in SAKAI with the title Sampling
Powerpoint.ppt. However, you are not responsible for everything
in the powerpoint or in Babbie, just for the material included in these
notes. Use the books to help you understand anything that is not
clear.
SAMPLING
is
used when we are
interested in studying a population that is too large for us to study
each individual. The first step is to define the
population
we wish to make statements about, e.g. adults in New Jersey, probable
voters, people convicted of felonies, graduates of our
department. We might want to study the entire population of the
USA. If we try to collect data from everyone, this is a
census. The Census Bureau does this once every decade, and misses
a lot of people. Everyone else does sampling: we select a
cross-section to represent the population. If you
try to study the whole population, you often fail to do a good job.
Gallup:
How Polls are Conducted.
Terms we need to know:
Sample statistic - data
from our sample, e.g., a percent or mean score
population parameter - the
true value for the whole population for the same percent or mean score
Margin of Error: How much
a sample statistic is likely to vary
from the population parameter. We say that we are 95% sure that
the sample is not off by more than the margin of error. See how this
is presented in the
NY Times. "19 out of 20" is another way of saying
95%. 95% is our
confidence
level. We could use another one, but the convention is to use
a 95% confidence level.
Standard Error - this is
actually half of what is usually called the "margin of error" - it corresponds
to about a 68% confidence level: we can be about 68% sure that the statistic is within
one standard error of the parameter value. Many statistical
reports give a standard error instead of a margin of error.
In that case, look to see if the result is more than two standard
errors from the null hypothesis (often zero).
Confidence interval: the range within which we think the
population
parameter lies, given the sample statistic.
We get this by taking the
margin of
error and subtracting it from the sample statistic, then adding
it to the sample statistic, e.g., if the margin of error is 3% and the
sample
statistic is 67%, the confidence interval is from 64% to 70%. We
are 95% sure that the true figure or
population parameter is within these
limits.
Note that all of this assumes random sampling; it does not compensate
for errors due to non-response or deficiencies in the
sampling frame, the actual list from
which we select respondents.
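The confidence interval arithmetic above, as a small Python sketch. The function name confidence_interval is just an illustrative choice, and the figures are the 67% and 3% from the example in these notes.

```python
def confidence_interval(statistic, margin_of_error):
    """Return (low, high): the statistic minus and plus the margin of error."""
    return statistic - margin_of_error, statistic + margin_of_error


# Sample statistic of 67% with a 3% margin of error, both in percentage points.
low, high = confidence_interval(67, 3)
print(f"95% confidence interval: {low}% to {high}%")
```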
The mathematics is simpler if we use a
simple random sample, which means
that each
person (or other sampling unit) in the population has the same chance
of appearing in the sample. In practice, however, we often do not
use simple random samples, for several reasons: (see page 208 in
Ayers for a condensed summary of these techniques)
- we may not have a list of the population. If we do not, we
first divide the population into sub-groups of some kind (census tracts,
blocks, classrooms, organizations, depending on the nature of the
study). We then sample the subgroups and list the populations in
them. This is called multistage
cluster sampling. It is often used
for interviewing people at home instead of over the telephone.
- We may be interested in differences between sub-groups of the
sample and need to make sure we have enough of them. In this case
we select random samples of each of the relevant sub-groups, and weight
the results appropriately. This is called stratified
sampling, e.g., the NY
Times explains that its surveys oversample black
respondents. Most surveys today are on the telephone. This
is still random sampling, but it is random within the strata. The
results have to be weighted to
get accurate population figures.
- Sometimes we just go down a list, which is called systematic
sampling. This gives the same results as simple random
sampling,
unless there is some systematic ordering to the list that causes a
distortion
- Sometimes we use non-random
or "quota" sampling. This
is
done for convenience, or because we just want to know what the range of
differences is without putting numbers on them. Internet surveys
tend to be of this sort, although there are some randomly selected
Internet panels.
These things are all explained in Babbie. The main terms are also
discussed in the powerpoint that has been posted. Babbie also
discusses some other terms, but I am going to limit the test to terms
covered in these notes.
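The weighting step mentioned under stratified sampling can be sketched like this. Everything in the example is invented for illustration: the two strata, their population shares, the sample sizes and the percentages. Weighting each stratum by its population share is one common approach and is assumed here, not taken from the NY Times procedure.

```python
# Hypothetical strata: group_b is oversampled (20% of the population,
# but half of the interviews).
strata = {
    "group_a": {"population_share": 0.80, "sample_n": 300, "percent_yes": 40.0},
    "group_b": {"population_share": 0.20, "sample_n": 300, "percent_yes": 60.0},
}


def unweighted_percent(strata):
    """Percent saying yes if we ignore the oversampling."""
    total_n = sum(s["sample_n"] for s in strata.values())
    yes = sum(s["sample_n"] * s["percent_yes"] for s in strata.values())
    return yes / total_n


def weighted_percent(strata):
    """Percent saying yes with each stratum counted by its population share."""
    return sum(s["population_share"] * s["percent_yes"] for s in strata.values())


print(f"unweighted: {unweighted_percent(strata):.1f}%")
print(f"weighted:   {weighted_percent(strata):.1f}%")
```

The unweighted figure overstates the answers of the oversampled group. The weighted figure is the estimate for the whole population.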
Babbie does not do sampling statistics, but we will do some in this
course. You need to consult these notes for the statistics (you
could find them in any statistics book, of course, along with much
more). I am focusing on some practical questions that come
up. The first is:
How big of a sample do
I
need?
Size
of the sample does not depend on the size of the population.
It depends on two things:
1. How much error you can tolerate.
2. Whether you need to calculate statistics for subgroups of the
population (in that case, each one becomes a separate sample for
statistical purposes).
To get the sample size needed:
1. Decide how many groups you need to calculate statistics
about. If you need statistics for only the whole population, the
number is one. If you want statistics for five counties in South
Jersey, it is five. If you want statistics for freshmen,
sophomores, juniors and seniors, it is four, etc.
2. Decide on the margin of error you can tolerate, e.g., 5%, 3%,
10%, for statements about each subgroup
3. convert the margin of error to a proportion, e.g, 5% becomes
.05
4.
The sample size,
n, is computed with the following formula, where m is the margin of
error expressed as a proportion: n = 1 / m^2
For
example, for a 5% margin of error, n = 1 / .05^2,
which is 1 / .0025 or 400.
5. Multiply this result by the number of groups. Thus if
you needed results for black, white and Hispanic respondents
separately, you need a sample of 1200 to get a 5% margin of error for
each group.
6. Ignore any information about the size of the population
or the size of the groups
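The steps above can be sketched in Python. The function name sample_size is an illustrative choice. The result is rounded up to a whole respondent, and the small rounding step before that only guards against floating-point noise.

```python
import math


def sample_size(margin, groups=1):
    """Sample size for a given margin of error (as a proportion), using n = 1 / m^2.

    The per-group size is multiplied by the number of groups that need
    their own statistics. Population size is deliberately not an input.
    """
    per_group = math.ceil(round(1 / margin ** 2, 6))
    return per_group * groups


print(sample_size(0.05))            # whole population only
print(sample_size(0.05, groups=3))  # e.g., three groups reported separately
```

A smaller margin of error gets expensive quickly: halving the margin quadruples the sample.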
If you already have a sample and need to compute the margin of error,
just reverse the formula
m = 1/sqrt(n).
or in words
Margin
of error is equal to
one
divided by the square root of the
sample size.
For example, with a sample of
400,
the square root is 20. 1/20 = .05 or 5%. Suppose that of the
400 you interviewed, 300 were white, 50 were black and 50 were others. If you have subgroups, you need to use only the number
of respondents in each group to compute the margin of error for that
group. For the
black respondents,
with a sample of 50, we would have a 14% margin of error. For the
white respondents, with a sample of 300, we would have a 5.8% margin of
error.
An example: take 300. The square root
of 300 is
17.32.
1 / 17.32 = .0577, and .0577 * 100 = 5.8%
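The same calculation in Python, for the whole sample and for each subgroup. The function name margin_of_error is illustrative. The rule m = 1/sqrt(n) is the rule of thumb used in these notes, and it comes out close to the textbook 95% margin for a percentage near 50%.

```python
import math


def margin_of_error(n):
    """Rule-of-thumb 95% margin of error for a proportion: 1 / sqrt(n)."""
    return 1 / math.sqrt(n)


# The sample from the example: 400 interviews split into three groups.
sample = {"white": 300, "black": 50, "other": 50}
total = sum(sample.values())

print(f"whole sample (n={total}): {margin_of_error(total):.1%}")
for group, n in sample.items():
    # Use only the number of respondents in the group itself.
    print(f"{group} (n={n}): {margin_of_error(n):.1%}")
```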
The formula above is for proportions or percents
(if you move the decimal over two). We use this if we are not
told otherwise. However, sometimes we need margins of error for
mean scores instead of percentages.
If you need a margin
of error for a mean score (an average such as income
in dollars or scores on a test), you need to know the standard
deviation
(sd) and the sample size (N). Ignore any other
information
you are given, including the size of the population.
Use the following
formula:
M = 2 * sd / SQRT(N)
Suppose
I sample 457 Camden residents and the mean income is $27,541 and
the standard deviation is $3452.
M = (2 * 3452) / SQRT(457). This result will be in dollars, not
percentages.
M = 6904 / 21.378 = $322.95.
Confidence
Interval: I am 95% sure that the population figure is
between: $27,218.05 and $27,863.95
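A sketch of the mean-score formula in Python, using the Camden figures from the example. The function name margin_of_error_mean is illustrative. The factor of 2 is the notes' round-number stand-in for the 95% level.

```python
import math


def margin_of_error_mean(sd, n):
    """Margin of error for a mean: 2 * sd / sqrt(n), in the units of the variable."""
    return 2 * sd / math.sqrt(n)


mean_income = 27541
m = margin_of_error_mean(sd=3452, n=457)
low, high = mean_income - m, mean_income + m

print(f"margin of error: ${m:,.2f}")
print(f"95% confidence interval: ${low:,.2f} to ${high:,.2f}")
```

The printed margin can differ from the hand calculation by a cent or so, because the hand calculation rounds the square root to three decimals.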
Some margin of error problems:
Suppose
I did a sample of 400, selected from the 7,357,218 people living in New
Jersey. What is the margin of error?
M = 1 / SQRT(N). N is the sample size, not the
population size.
N = 400. SQRT of N = 20. 1/20
= .05 or 5%. If I find that 42%
agree, that is my sample "statistic." The
population parameter
is the true value, and I would say that I am 95% sure (my confidence
level) that the parameter is between 42% - 5% and 42% + 5%.
The true
value should be between 37% and 47%.
Suppose I go to 1000, what is my margin of error?
M = 1/SQRT(1000). = 1/ 31.62 = .0316
or 3.2%. The confidence interval is between 38.8% and
45.2%.
This applies to statements made about the whole sample. 42% of
the respondents said yes, the margin of error is 3.2%.
For statements about a subgroup, the N is the number of people in that
sub group (genders, races, sports fans).
We have a sample of 1200, of whom 800 are white, 300 are black and 100
are Hispanic. 57% of the Hispanics said yes to the item.
What is the
margin of error for this percent? Since it says "of the
Hispanics" our
N is the number of Hispanics, or 100. M = 1/SQRT(100) = .10
or 10%.
For the black respondents, our margin of error is M=1/SQRT(300).
= 1 / 17.32 = .0577 = 5.8%
For the white respondents M = 1/SQRT(800) =
.03535 or 3.5%.
How large a sample do I need to get a 5% margin of error, with a
population of 485,321? N = 1 / M^2. M
must be expressed as a proportion, not a percent. M =
.05. .05 * .05 = .0025.
Sample size = 1/.0025 = 400. The population size does not enter the calculation.
Suppose I wish to study the black, white and Hispanic population and I
need a margin of error of 5% for each group. How large a sample do
I
need? Each group needs 400, so I need 3 * 400 = 1200.
The other thing we need to deal with is margins of error for mean
scores. In a survey of 300 county residents, the mean
income is
$45,321. We need to have the standard deviation. The
Standard
Deviation is a measure of variation. The standard deviation is
$3521.
M = 2 * sd / SQRT(N). N = 300. SQRT(300) = 17.32. 2 * 3521 / 17.32 = 7042 / 17.32 =
$406.58.
October 25 - Note: The Powerpoint used for this class is in
SAKAI with the title IndexandScaleConstruction.ppt.
Scaling and
index construction are techniques for measuring abstract concepts with
a number of items. This can be an attitude, e.g, "conservatism"
or
"authoritarianism." Or it can be a category of behavior, e.g,
"violent crime" which is measured by averaging together the frequency
of a number of crimes. Magazines like to do this to come up with
ranking, e.g, of the best community to live in or the worst one, the
best colleges, etc. Usually this is done just by adding up a
number of characteristics (in which case your
texts would call it an "index", although many people still use the term
scale). This gives us a rank order of sorts, and some idea of how
high or low cases are, but we really don't know how big the differences
are or what the scores really mean. What does it mean that Camden
is the "most dangerous city"? Is it really that much more
dangerous than the next most dangerous? Or than an average city?
Most of the measures we actually use are what your textbooks
call indexes. Our midterm exam is an example. We just add
up the
points to measure the general
variable "knowledge of research methods as covered in the first part of
the course." This is an index. To make a true scale we
would have to rank the items from easy to hard. This is tricky,
because items that are hard for some people are easy for others.
When we
make an index or scale, we get measures that can be treated as
interval, even if they are not strictly interval. Scaling methods
can be more precise, but these are not used as often in sociology or
CJ because they are more difficult and the added information is not
always needed or useful.
Attitude scaling methods include Thurstone
scaling, Guttman
scaling, and true Likert
scaling. Usually Likert-type items are just used to
make an index. Thurstone scaling tries to get true interval
measurement. Guttman scaling gets ordinal measurement.
Index construction approximates interval
scaling in some ways, but it is hard to know how well.
For an example of true scaling in criminal justice, we could
scale the seriousness
of crimes. There are various methods of
measuring this. One of them, paired comparisons, means asking a sample of
people to compare crimes two at a time and say which is more serious.
New
Zealand Study on Attitudes to Crime. Crime
Victims United (Oregon).
Indexes work well
IF the items actually measure a common trait in the minds of the
respondents (not just in the mind of the researcher). We
determine this by measuring the correlations between answers to the
specific items. If the items
intercorrelate well, an index will work well. The exercises in
Ayers illustrate this process.
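Checking whether items intercorrelate can be sketched in Python. The answers below are invented (six respondents, three Likert-type items scored 1 to 5), and pearson is a hand-written helper for the ordinary correlation coefficient. The notes do not say which software or which threshold to use, so none is assumed.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient between two lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sy = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical answers: item_a and item_b move together, item_c does not.
items = {
    "item_a": [1, 2, 3, 4, 5, 4],
    "item_b": [2, 2, 3, 5, 5, 4],
    "item_c": [5, 1, 4, 2, 3, 3],
}

# Correlate every pair of items.
for (name1, xs), (name2, ys) in combinations(items.items(), 2):
    print(f"{name1} vs {name2}: r = {pearson(xs, ys):.2f}")

# A simple additive index: each respondent's answers summed across items.
index_scores = [sum(answers) for answers in zip(*items.values())]
print("index scores:", index_scores)
```

Items that correlate poorly with the rest, like item_c here, are candidates to drop before adding the items into an index.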
A very popular
test is the Myers-Briggs
Type Indicator, based on Jungian personality theory. You can
take several free versions of this and related tests online (Wikipedia article).
Several are available from similarminds.
A problem with this is that it sorts people into categories although
the measure is really continuous. This makes it understandable
and it is very widely used.
October 23. The midterm exams will be returned together
with the answer keys. We can discuss any questions about them on
Thursday. The
deadline to
withdraw with a W is November 12.
For our next
assignment, we will be using the General
Social Survey 1972-2006 data file. This very large data set is
described in
an article
from NY Times. To access the
data we can use the Survey Data and Analysis program at the University
of California, Berkeley. Just click on
SDA.
to go to the SDA Frequencies/Crosstabulation program.
Your task is to produce a multiple bivariate table with one column
variable and at least three (or at most six) row variables.
(NOTE: you
will get a little EXTRA CREDIT for doing more than three
variables).
The percentages should be row percentages. You should write a
paragraph summarizing the results in which you correctly describe some
of the percentages. The finished product should look like this
Sample
Assignment. You should prepare the table in a word
processor, print it out and bring it to class on October 30.
You may use any variables you wish for this paper - you can search for
them with the SEARCH facility. If you do not have any idea, I
suggest
you use one of the ones selected by the New York Times for the column
variable (your dependent variable). These include premarital sex,
trust in others, frequency of prayer, marijuana legalization, exciting
life, fear at night, the afterlife, spanking children, confidence in
institutions (just select one institution), happiness, abortion (pick
one item), gun permits, happiness of marriage, newspaper readership,
x-rated movies, homosexual teachers, social class membership. You
should NOT use women & politics because we used that for the sample
assignment.
Then you should pick three to six independent variables which you will
use for row variables. These can include such things as age, sex,
religion, political party, year of interview, region of residence,
education, marital status, etc. HOWEVER, this data set has not
been
prepared for this purpose, so many variables need to be RECODED
first. To do this, click on "create variables" and
"recoding rules".
It shows how to recode AGE for this purpose. You need to
check each
of your row variables to see if it needs recoding. This
gives you the
flexibility to recode them as YOU wish rather than relying on someone
else's choices.
Then you need to do a cross tabulation for EACH of your row
variables.
You will need three for three variables, six for six. You may
wish to
print these out for your convenience or you can work on the
screen.
You take the results for all the variables and type them into one big
table for your assignment.
If you choose "year" as a variable, you should get the same results the
New York Times got. You could just select key years rather than
doing
them all.
A number of students have asked if they need to include the results for
the total sample in their tables for the multiple bivariate
assignment. The answer is YES.
Where do you get these? If you look at the bottom of any of your
cross-tabulation tables you will see them. These may vary
slightly
from table to table, however, because of missing data. Another
way to
get them, for the whole sample, is to type the name of the column
variable in your table in the ROW box on SDA with nothing in the column
box. This will get you the frequencies.
October 16 Review.
1. These notes should be helpful, but they do not offer full
explanations. For that, you need to refer to the books.
2. The Ayers book has good brief summaries at the beginning of
each chapter before it goes into the exercises. The pages
you should review carefully are:
1-3 on the probabilistic nature of social science and the nature of
variables and attributes (chapter one in Babbie)
29 -31 on theories and concepts (but not paradigms) (the second
part of chapter two in Babbie)
38
to 39 on deduction and induction, observations, generalizations
and
abstract theoretical propositions (also in chapter two of Babbie)
59 on necessary, sufficient and probabilistic causes. (the
rest of the topics are also in Chapters Four and Five of Babbie).
60 on independent and dependent variables
67 and 68 on spurious relationships, antecedent variables and
multivariate analysis
93 on units of analysis
96 on the ecological fallacy
102-105 on longitudinal studies: trend, cohort and panel
121-122 on conceptualization and measurement, definitions, indicators
123-128 on reliability and validity.
145-150 on levels of measurement and statistics.
3. Chapters Four and Five in Babbie are very important.
Read them. Also check the "main points" and "key terms" at
the end.
4. The
Statistics
Overview chart is important. We have covered everything up to
the Multiple R.
5. The statistics questions will be very similar to those on
Exercises Five and Six: You should be able to compute row,
column, and total percents and use them in a sentence. You should
understand and be able to compute expected frequencies. You
should be able to compute means, medians, ranges and standard
deviations as explained on the
Descriptive
Statistics page.
You may find the Companion Site to the Babbie book helpful, especially
for Chapters Four and Five. You can find it at
www.thomsonedu.com/sociology, by clicking on research methods and then
on our book. Or you can try this
deep
link which may go directly to it.
October 11 -
Graphs
and Charts to Communicate Statistical Data.
Exercise
Six is available for download, but the answers should be put in
SAKAI for immediate feedback. Look in the Tests and Quizzes
module on SAKAI for MethodsExercise_Six.
October 9 We will focus on Exercise Six in the Ayers
book. We will also begin reviewing for the midterm.
Assignment Six, which is in SAKAI, is designed to prepare you for the
midterm. The reading in Babbie covers the same material and may
be easier to read. Chapters four and five in Babbie are
key. We will also be covering some statistics items. The
Statistics
Overview summarizes these. We have covered the mean, median,
mode, range, standard deviation, interquartile range, observed and
expected frequencies, row, column and total percents, chi square,
ANOVA, correlation coefficient. The midterm will not cover
regression or sampling statistics. Note that ANOVA was not on the
Statistics Overview handed out in class, we will discuss it
today. The Descriptive Statistics page explains how to compute a
standard deviation. You should be able to do a mean, median and
standard deviation and row, column and total percents and expected
frequencies. You should understand the difference between
descriptive and inferential statistics.
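As a minimal sketch of the descriptive statistics listed above, the following Python computes a mean, median, range and standard deviation. The scores are made up for illustration. Both the population (divide by n) and sample (divide by n - 1) standard deviations are shown, because these notes do not say which version the Descriptive Statistics page uses.

```python
import statistics

# Hypothetical scores, invented for illustration only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)              # sum of the scores / number of scores
median = statistics.median(scores)          # middle value of the sorted scores
value_range = max(scores) - min(scores)     # highest score minus lowest score
sd_population = statistics.pstdev(scores)   # squared deviations averaged over n
sd_sample = statistics.stdev(scores)        # squared deviations averaged over n - 1

print("mean:", mean)
print("median:", median)
print("range:", value_range)
print("standard deviation (n):", sd_population)
print("standard deviation (n - 1):", round(sd_sample, 3))
```

These are all descriptive statistics: they summarize the scores in hand and make no claim about a larger population.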
The example worked in class today:
45 men like spinach
85 women like spinach
65 men do not like spinach
80 women do not like spinach
           Men   Women   Total
Like        45      85     130
Don't       65      80     145
Total      110     165     275
130 respondents like spinach
The expected frequency answers the question: if men and women did
not differ in their liking for spinach, how many men would have liked
spinach?
The easiest way to get it is to multiply the row total by the column
total and divide by the grand total.
How many men would we expect to like spinach? The row total for the
"like spinach" row is 130, the column total for the men column is 110,
and the grand total is 275.
130*110/275 = 52.0 This is not a percent, it is an
expected frequency.
What percent of the men like spinach? The number of men
who like spinach / the number of men: 45/110 = 40.9% of the men
like spinach.
What percent of the people who like spinach are men? The number
of men who like spinach / the number of people who like spinach:
45/130 = 34.6% of the people who like spinach are men.
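The spinach arithmetic can be sketched in Python as a check on hand calculations. The counts come from the example in these notes. The function and variable names are made up for illustration. Since men and women are the columns of the table, "percent of the men" is a column percent and "percent of the people who like spinach" is a row percent.

```python
# Observed counts from the spinach example in these notes.
observed = {
    "like": {"men": 45, "women": 85},
    "dont": {"men": 65, "women": 80},
}

row_totals = {row: sum(cells.values()) for row, cells in observed.items()}
col_totals = {
    col: sum(observed[row][col] for row in observed)
    for col in ("men", "women")
}
grand_total = sum(row_totals.values())


def expected(row, col):
    """Expected frequency: row total * column total / grand total."""
    return row_totals[row] * col_totals[col] / grand_total


def column_percent(row, col):
    """Percent of a column (e.g. men) that falls in the given row."""
    return 100 * observed[row][col] / col_totals[col]


def row_percent(row, col):
    """Percent of a row (e.g. people who like spinach) in the given column."""
    return 100 * observed[row][col] / row_totals[row]


def total_percent(row, col):
    """Percent of all respondents that fall in the given cell."""
    return 100 * observed[row][col] / grand_total


print(round(expected("like", "men"), 1))        # expected men who like spinach
print(round(column_percent("like", "men"), 1))  # percent of men who like spinach
print(round(row_percent("like", "men"), 1))     # percent of spinach-likers who are men
print(round(total_percent("like", "men"), 1))   # percent of everyone: men who like spinach
```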
Midterm coverage.
The syllabus has been structured around the Ayers
book; we have covered the first six chapters. It has good concise
summaries at the beginning of each chapter.
Oct 4 We will go through pages 121-129 in the Ayers manual, which
illustrate the following concepts (among others):
conceptualization
dimensions of a concept
indicators of a concept
operationalization
operational definitions
interchangeability of indicators
reliability
- test-retest method
- split-half method
- internal consistency method (best for questionnaires
with lots of items; Cronbach's alpha can be used, or item-whole
correlations)
validity
- face validity
- convergent validity (very similar to internal-consistency
tests of reliability)
- discriminant validity - indicators of each
dimension of the concept should correlate more with each other than
with indicators of other dimensions
- criterion (or predictive) validity is best if you have a
good criterion
- content validity
- construct validity, this is the most subtle and difficult
to understand.
An example: a study of UFO Abduction Status.
Oct 2 Note: Chapter 5 in Babbie covers chapters 5 and
6 in Ayers but in reverse order. It may be best to read the
Babbie book first. I will go through the topics in Babbie's order
first, then go back to Ayers to review and to do the exercises.
Conceptualization. Thinking through what our conceptions mean,
defining them.
Defining the Concepts, or Conceptualization. This means thinking
through what we mean by the words we use (concepts are expressed with
words). Some are fairly obvious, e.g., suicide. Others are
not at all, e.g., race or poverty. Is there a difference between
"sex" and "gender"? Babbie distinguishes between real
definitions (which he does not believe in, but some philosophers do,
e.g., Plato), nominal definitions, and operational definitions.
Babbie follows a nominalist philosophy as opposed to a realist one:
concepts are things we make up, not things that really "exist". This
is a useful way to approach empirical research.
An example: the measurement of romantic love.
The study defined three dimensions: affiliative and dependent need, a
predisposition to help, and an orientation of exclusiveness and
absorption.
A particularly controversial one has been the concept of
"intelligence." What does this mean? Is it one thing or does
it have multiple dimensions?
Example: Multiple Intelligence
Operationalizing the Concepts. A lot of effort goes into this.
Quantitative research means you have to measure your variables, and a
lot depends on having good measurement. Sometimes this is difficult,
e.g., measuring "intelligence" or "liberalism-conservatism" or "mental
illness" or "crime rates (various kinds)". Often we use standard
measures created by the government agencies that collect statistics.
The point of conceptualization and operationalization is to measure
things. This means putting things into categories that correspond
to attributes of variables, e.g., men and women, upper class, middle
class, lower class (or UU, LU, UM, LM, UL, LL), annual income
$37,541.23. The critical thing in social research is how we
do this and what our measures imply. We can understand this in
terms of
Levels of Measurement
The first and most important question is: is the measure continuous or
categorical? This is important because continuous variables are
required for the use of statistics such as the mean, standard
deviation, correlation and regression. With continuous measurement we
have precise distances between the items measured; with categorical
we just have them sorted into discrete categories.
If a variable is continuous, we can ask whether it is "interval" or
"ratio". Both of these have precise distance measurement between
points. In addition, ratio measures have a logically meaningful zero
point. With ratio measures, we can talk about ratios between values,
e.g., say that $50 is twice as much money as $25. With interval
variables, such as Fahrenheit temperatures, we cannot make such
statements.
If a variable is categorical, we can ask whether it is "dichotomous,"
"nominal" or "ordinal."
Dichotomous variables have only two categories. These can be two
natural categories such as "male" and "female" or they can be
artificial "dummy" variables, such as: are you a Catholic
or not? With dichotomies you can use regression and correlation.
Nominal variables have more than two categories, but not in any order
or with a measured distance between them. We can do percentages
and chi-square significance tests with nominal variables.
Nominal Measurement. Categories that could be put in any order.
Catholic, Protestant, Jewish, Moslem,
LDS, Buddhist, Episcopalian, Baptist
(variable one is category of religion, variable two is denomination).
Mental illnesses (DSM-IV), e.g., adjustment disorder, borderline
personality disorder, paranoid schizophrenia.
Crimes: burglary, assault, murder. What do these
terms mean? Look at the US Criminal Code.
Each individual should go into one and only one category on a
variable, one value on a variable. For example: for "What is your
favorite food?" we have a long list, but each person is allowed only
one.
Sorting people into categories
must be as reliable and as accurate (valid) as possible. One of the
things we do is evaluate how accurate our measurement is.
Ordinal variables have the categories in a logical order (from
"lower" to "higher"). We can compute a median and a range.
Ordinal Measurement. Here we have categories in a logical
order: very short, short, medium, tall, very tall. Often we
take continuous variables and make them ordinal.
Income: Under $20,000; $20,000 to $40,000; $40,000 to $60,000;
$60,000 plus.
Interval Measurement: temperature in Fahrenheit or Centigrade.
0 degrees is not the absence of heat. How about the day that the
"temperature doubled" in New York City?
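The "temperature doubled" puzzle can be checked with a little arithmetic. The sketch below uses two made-up readings, 40 and 80 degrees Fahrenheit (not taken from any actual New York City record), and the standard conversion formulas. It shows that the apparent ratio between two temperatures depends on which scale is used, which is why ratios are not meaningful for interval measures.

```python
def fahrenheit_to_celsius(f):
    """Standard conversion from degrees Fahrenheit to degrees Celsius."""
    return (f - 32) * 5 / 9


def fahrenheit_to_kelvin(f):
    """Kelvin has a true zero point (the absence of heat)."""
    return fahrenheit_to_celsius(f) + 273.15


# Hypothetical readings, chosen only so the Fahrenheit value "doubles".
low_f, high_f = 40.0, 80.0

ratio_f = high_f / low_f
ratio_c = fahrenheit_to_celsius(high_f) / fahrenheit_to_celsius(low_f)
ratio_k = fahrenheit_to_kelvin(high_f) / fahrenheit_to_kelvin(low_f)

print("ratio in Fahrenheit:", round(ratio_f, 2))
print("ratio in Celsius:", round(ratio_c, 2))
print("ratio in Kelvin:", round(ratio_k, 2))
```

The same two days "double" in Fahrenheit, show a very different ratio in Celsius, and differ by only a few percent in Kelvin. Only the Kelvin ratio reflects the actual amount of heat, because only that scale has a meaningful zero.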
Ratio Measurement: Income in dollars: a
continuous numerical value PLUS a meaningful zero point. Height in
inches.
In answering questions about measurement, give the highest or best
level of measurement that is justified. Any variable that meets
the criteria for a ratio variable also meets the criteria for an
interval variable, but the criteria for a ratio variable are more
stringent so we would say that it is ratio measurement. Any
ordinal variable also meets the criteria for a nominal variable, but if
it meets the criteria for ordinal we say it is ordinal.
It is important to understand that many variables can be measured at
different levels. Thus I could take height and put it into
categories such as short, medium, tall, in which case I would be using
ordinal measurement because they are in order. I could also
measure it in inches or centimeters, which would be ratio
measurement. It is also important to understand that each of the
statistics is appropriate for variables measured in some ways but not
others. Doing percentages and cross-tabulations makes sense for
nominal or ordinal data. Chi-square is for nominal or ordinal data.
Doing correlation or regression or means and standard deviations
requires interval or ratio data. We can make a broad distinction
between categorical (nominal or ordinal) and continuous (ratio or
interval) data. The dichotomy is a special case because we can
use correlation and regression with dichotomies, but we can also do
percentages, cross-tabulations and chi-squares.
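The pairing of statistics with levels of measurement described above can be written down as a small lookup table. This is only a sketch of what these notes say. The names `STATISTICS_BY_LEVEL` and `is_appropriate` are made up for illustration, and the table lists only the statistics the notes mention for each level.

```python
# Statistics these notes name for categorical and for continuous data.
CATEGORICAL = {"percentages", "cross-tabulation", "chi-square"}
CONTINUOUS = {"mean", "standard deviation", "correlation", "regression"}

STATISTICS_BY_LEVEL = {
    "nominal": CATEGORICAL,
    # The notes add the median and range once categories are ordered.
    "ordinal": CATEGORICAL | {"median", "range"},
    "interval": CONTINUOUS,
    "ratio": CONTINUOUS,
    # The dichotomy is the special case: both families can be used.
    "dichotomy": CATEGORICAL | {"correlation", "regression"},
}


def is_appropriate(statistic, level):
    """Return True if the notes list this statistic for this level."""
    return statistic in STATISTICS_BY_LEVEL[level]


print(is_appropriate("mean", "ordinal"))           # False
print(is_appropriate("chi-square", "nominal"))     # True
print(is_appropriate("regression", "dichotomy"))   # True
```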
Quality of Measurement - Reliability and Validity.
Reliability - you get the same thing over and over. Consistency.
inter-rater - two different raters get the same answer.
test-retest - if you take it twice the answers are the
same.
internal consistency - are the items on a test
consistent? This can be calculated by looking at the inter-item
correlations. Cronbach's alpha is a statistic that measures
inter-item reliability. Example: correlate the ABORT variables in
the GSS data file. We see that all the correlations are positive
and significant. We can then make an index of them by adding up
scores on the six variables.
Correlation Coefficients
N: 1603   Missing: 1229
Cronbach's alpha: 0.874
LISTWISE deletion (1-tailed test)   Significance Levels: ** =.01, * =.05

             ABORT DEF  ABORT WANT  ABORT POOR  ABORT RAPE  ABORT SING  ABORT HLTH
ABORT DEF    1.000      0.447 **    0.439 **    0.618 **    0.443 **    0.641 **
ABORT WANT   0.447 **   1.000       0.816 **    0.435 **    0.840 **    0.332 **
ABORT POOR   0.439 **   0.816 **    1.000       0.442 **    0.827 **    0.337 **
ABORT RAPE   0.618 **   0.435 **    0.442 **    1.000       0.437 **    0.636 **
ABORT SING   0.443 **   0.840 **    0.827 **    0.437 **    1.000       0.330 **
ABORT HLTH   0.641 **   0.332 **    0.337 **    0.636 **    0.330 **    1.000
Correlation Coefficients
N: 1603   Missing: 1229
Cronbach's alpha: 0.796
LISTWISE deletion (1-tailed test)   Significance Levels: ** =.01, * =.05

             ABORT INDX
ABORT DEF    0.734 **
ABORT WANT   0.858 **
ABORT POOR   0.855 **
ABORT RAPE   0.728 **
ABORT SING   0.859 **
ABORT HLTH   0.645 **
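To show how Cronbach's alpha relates to the inter-item correlations, here is a minimal Python sketch using the six-item ABORT correlation matrix above. It uses the standardized form of alpha, which needs only the number of items and their average correlation. This is an assumption made for illustration: the program that produced the 0.874 figure may have worked from the raw item scores instead, so the result can differ slightly in the third decimal place.

```python
# Inter-item correlations copied from the six-item ABORT matrix in these notes.
items = ["DEF", "WANT", "POOR", "RAPE", "SING", "HLTH"]
r = [
    [1.000, 0.447, 0.439, 0.618, 0.443, 0.641],
    [0.447, 1.000, 0.816, 0.435, 0.840, 0.332],
    [0.439, 0.816, 1.000, 0.442, 0.827, 0.337],
    [0.618, 0.435, 0.442, 1.000, 0.437, 0.636],
    [0.443, 0.840, 0.827, 0.437, 1.000, 0.330],
    [0.641, 0.332, 0.337, 0.636, 0.330, 1.000],
]

k = len(items)

# Average the correlations above the diagonal, counting each pair once.
pairs = [r[i][j] for i in range(k) for j in range(i + 1, k)]
mean_r = sum(pairs) / len(pairs)

# Standardized Cronbach's alpha from the mean inter-item correlation.
alpha = k * mean_r / (1 + (k - 1) * mean_r)

print("mean inter-item correlation:", round(mean_r, 3))
print("standardized alpha:", round(alpha, 3))
```

The formula makes the logic of internal consistency visible: alpha rises when the items correlate more strongly with each other and when there are more items in the index.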
Validity - is it "really" measuring what it is supposed to measure?
Face Validity - does it look right?
This is often related to fairness: people will object to the use of
measures that do not have face validity even though they may have
predictive validity, e.g., using the frequency of moving as a criterion
for loaning money.
Predictive or criterion validity - does it predict what we want to
predict, some "true" measure? For example, the SAT test predicts
college or law or medical school grades.
Convergent validity - do several measures give the same result?
Construct validity - does the measure perform as our theory says it
should? We use this when we have no criterion.
This is the most difficult; it is used when things are inherently
difficult to measure.
Essentially, it asks whether the
results are consistent with what we would expect based on theory and
past experience.
Camden schools report. Brim school report, see pdf page 14 for tables.
Story on Brim with graph.
An example: the
measurement of romantic love.