Writing Assignment 4, due Friday, December 2:

In Chapter 14.1 we're studying graphical descriptions of data. Bar graphs, histograms (i.e., frequency bar graphs and relative frequency bar graphs) and pie charts are topics elementary school students study; see this web site, for example. We'll also look at line graphs. It's important, especially for teachers, to know which kind of graph is appropriate for each situation. It's also important for students to become sensitive to the ways that graphs can be misleading.

So your assignment is to find an example of a misleading or inappropriate graph and write a page discussing what is misleading or inappropriate about the graph you selected. Attach a copy of the graph itself to your write-up. If possible, I would prefer that you use a graph that appeared in a print publication. You can, however, use graphs from the internet, but do not use graphs from web sites that have stockpiled examples of misleading and inappropriate graphs. You can find a lot of these sites just by googling "misleading graphs". They are useful resources for teachers, but I'd like you to find your own example.

Here's some background for this writing assignment. We went over this in class, but I've included some additional examples we didn't have time to look at during class.

Bar graphs are the most flexible kind of graph. Whenever you have a numerical value for each of a set of categories, you can use a bar graph to display the data graphically. The data could be student scores on a quiz. Here the categories could be the individual students, and the data value could be the score, if you want a graph showing how each student did on the quiz. Or the categories could be ranges of scores (like 0 to 10, 11 to 20, etc., up to 91 to 100) and the data value for each category would be the number of students whose quiz scores fell in that range.
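If it helps to see the second kind of bar graph worked out, here is a minimal Python sketch of how the bar heights could be computed. The quiz scores are made up for illustration, and the function name score_range and the choice of ranges (0-10, 11-20, and so on up to 91-100) are my own assumptions, not anything from the textbook.

```python
from collections import Counter


def score_range(score):
    """Return the label of the range that a score from 0 to 100 falls in."""
    if score <= 10:
        return "0-10"
    low = ((score - 1) // 10) * 10 + 1  # 11, 21, ..., 91
    return f"{low}-{low + 9}"


# Hypothetical quiz scores, invented only for this example.
scores = [55, 72, 88, 91, 67, 72, 100, 45, 80, 79]

# Frequency bar graph: bar height = number of students in each range.
frequency = Counter(score_range(s) for s in scores)

# Relative frequency bar graph: bar height = fraction of students in each range.
relative = {label: count / len(scores) for label, count in frequency.items()}

for label in sorted(frequency, key=lambda r: int(r.split("-")[0])):
    print(label, frequency[label], relative[label])
```

The frequency counts give the bar heights for a frequency bar graph, and dividing by the number of students gives the heights for a relative frequency bar graph.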

Pie charts are appropriate for showing data which represent parts of a whole. The slices of the pie chart should be proportional to the numerical values being graphed. Look at this pie chart; it appeared in the Friday magazine of an Italian newspaper. It shows the result of a poll (sondaggio means poll in Italian) about how Italians feel about various aspects of Italian life: 30% felt that national unity is important, 55% felt that political parties are divisive, 40% felt that the Mafia was the most shameful aspect of Italian life and 30% felt that Italians' ability to get along (or make do) was their best quality. Note that the slices of the pizza which make up the pie chart are not proportional to the data. The slice for the 55% value is way bigger than the slice for the 40% figure. Also, the percentages do not add up to 100%, so the data do not represent parts of a whole. Presumably each respondent answered each of the four questions, so some of the 40% who feel the Mafia is the most shameful aspect of Italian life are included in the 55% who feel that political parties are divisive.
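Both problems with the pizza chart can be checked with a little arithmetic. The Python sketch below uses the four percentages quoted above; the function name slice_angles and the eye color numbers are made up for illustration.

```python
def slice_angles(values):
    """Return the angle in degrees of each slice, proportional to its value."""
    total = sum(values.values())
    return {label: 360 * value / total for label, value in values.items()}


# The four percentages reported in the Italian poll.
poll = {
    "national unity is important": 30,
    "political parties are divisive": 55,
    "Mafia is the most shameful aspect": 40,
    "ability to get along is the best quality": 30,
}

# Parts of a whole must add up to 100%. These add up to 155%.
total_percent = sum(poll.values())
print(total_percent)

# If slices were proportional, the 55% slice would be 55/40 = 1.375 times
# as big as the 40% slice, not "way bigger".
print(55 / 40)

# Hypothetical data that really are parts of a whole (invented numbers).
eye_colors = {"brown": 50, "blue": 25, "green": 15, "other": 10}
print(slice_angles(eye_colors))
```

For data like the made-up eye color numbers, the slice angles add up to the full 360 degrees and each slice is proportional to its share.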

Line graphs are appropriate for showing data that has a natural order, when you want to emphasize a trend. For example, the Dow Jones Industrial Average is a weighted average of the stock prices of 30 large publicly owned companies, used as a proxy for the whole stock market. For a graph (like this one) showing the Dow Jones Industrial Average over time, you may want to emphasize how the stock market is trending over time (are stocks becoming more or less expensive). For a graph showing how long people live on average based on their height, you may want to emphasize how longevity changes as people's height increases (see the graph given in Figure 1, on the third page of this article titled "Impact of height and weight on life span" by Samara and Storms, on a National Institutes of Health web site; it shows how long baseball players lived, ordered according to their height, with the clear trend being that the taller you are, the shorter your life). When the data has no natural order, such as a graph showing the fraction of the population for each possible eye color, it makes no sense to use a line graph.

Here are various things that can make a graph misleading.

Using incomplete scales or cropping can be misleading. For example, CNN showed the results of a Gallup poll on the Terri Schiavo case using this bar graph. The poll asked whether or not respondents agreed with a judge's decision to terminate life support for a woman, Terri Schiavo, who was in a vegetative state. The graph makes it look like support is very high among Democrats and very low among Republicans and Independents, because the graph is cropped to show only the very top of the bars in the bar graph. The next day CNN published this less misleading graph.
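To see how much cropping can exaggerate a difference, compare the ratio of the visible bar heights with the ratio of the actual values. This is only a sketch: the two percentages and the 50% starting point below are hypothetical numbers chosen for easy arithmetic, not the figures from the CNN graph, and visible_ratio is a name I made up.

```python
def visible_ratio(a, b, baseline=0):
    """Ratio of the drawn heights of two bars when the axis starts at baseline."""
    return (a - baseline) / (b - baseline)


# Hypothetical poll results, in percent (not the actual Gallup numbers).
group_a = 60
group_b = 55

# With the vertical axis starting at 0, bar A is only about 1.09 times as tall as bar B.
print(visible_ratio(group_a, group_b))

# With the axis cropped to start at 50, bar A looks twice as tall as bar B.
print(visible_ratio(group_a, group_b, baseline=50))
```

A 5 point difference looks like one group having double the support of the other once everything below 50% is cut off.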

Presenting information in unusual ways can be misleading, since readers might not notice the nonstandard presentation and so interpret the data as if it were presented in the usual way. For example, look at this line graph, taken from a 2002 report on American education (see slide 3). The point of the graph is to show that 4th grade math scores (on some standardized exam) increased dramatically over the period from 1990 to 2000. By using cropping as discussed above, in which the bottom half of the graph is cut off, the increase is made to look much bigger than it really is (notice that the increase is only 15 points out of 200 over 10 years). But what's really nonstandard is that the 2-year interval from 1990 to 1992 is spaced the same way as the 4-year intervals from 1992 to 1996 and from 1996 to 2000. This makes the last two line segments look steeper than they really are. As another example, in a pie chart it is standard for the slices to be proportional to the data. In this pie chart on nuclear weapons, which appeared in Time magazine, they're not. If you just pay attention to the size of the slices, you'd think that the Russians had more nukes than we do.
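Here is a small Python sketch of the uneven spacing problem. The years are the ones on the graph, but the scores are hypothetical, chosen only so that the total rise is 15 points as in the report; segment_slopes is a name I made up.

```python
def segment_slopes(xs, ys):
    """Return the slope (rise over run) of each line segment."""
    return [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)]


years = [1990, 1992, 1996, 2000]
scores = [213, 218, 223, 228]  # hypothetical scores, total rise of 15 points

# Slopes when the years are placed where they belong on the time axis
# (points gained per year).
true_slopes = segment_slopes(years, scores)

# Slopes when the four years are drawn equally spaced, as in the report
# (points gained per equally wide slot).
positions = list(range(len(years)))
drawn_slopes = segment_slopes(positions, scores)

print(true_slopes)   # the last two segments are half as steep as the first
print(drawn_slopes)  # all three segments look equally steep
```

With these made-up scores, the equally spaced graph draws all three segments with the same steepness, even though the gain per year over the two 4-year intervals is only half the gain per year from 1990 to 1992.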

Another thing that can be misleading is the use of special effects or distracting or irrelevant details. For example, this graph appeared in the World-Herald. It shows the percentages of Nebraskans under correctional supervision in 1982 versus in 2007. It turns out the percentage has about doubled. The graph is really a bar graph; what's relevant is only the height of the thumbprints. If you pay attention to the size of the thumbprint you might have the impression that the percentage was about four times as big in 2007 as it was in 1982, since the size of the thumbprint for 2007 (in terms of area) is about 4 times as big as the one for 1982, but the area is an irrelevant detail. As another example, look at this graph from USA Today. It looks like a bar graph, but the bars are irrelevant; they're not proportional to the data. If you just pay attention to the lengths of the bars you might think that Illinois and California have the same number of athletes. Also, the balls at the end of the bars are irrelevant. The Texas data is not about just football, and the California data is not just about basketball!
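The thumbprint effect is just the fact that when a picture is enlarged in both height and width by the same factor, its area grows by the square of that factor. A minimal Python sketch, with function names that I made up for illustration:

```python
import math


def area_factor(scale):
    """Factor by which area grows when height and width are both scaled by scale."""
    return scale ** 2


def scale_for_area(area_ratio):
    """Height and width scale needed for the area to grow by area_ratio."""
    return math.sqrt(area_ratio)


# Doubling the height (and width) of the thumbprint quadruples its area.
print(area_factor(2))

# For the area itself to double, the picture should only be scaled by about 1.41.
print(scale_for_area(2))
```

So a picture drawn twice as tall and twice as wide to show a doubled percentage covers four times the area, which matches the impression the thumbprints give.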