Interpreting Categorical and Quantitative Data

Conceptual Understanding of Line of Best Fit

Here’s a conceptual question (taken from the Shell Centre) that provoked some solid responses from students:

Here are a few of the responses:


5 replies on “Conceptual Understanding of Line of Best Fit”

The word we need is correlation. And possibly causation. Maybe that’s why people in general don’t seem to understand the difference.
There must be a sample response with a LOBF that’s straight, yes? I like how this one deviates around that lovely line of points. Is this because we tell the students that a LOBF does not have to go through points, which becomes “must not go through any points”?
A reminder that the words we use are important.

I don’t care for the question. The context of the question gives no clear use for the LOBF: there is nothing obvious to predict, as you already have all of the scores for each student. You could say the data was from last year’s class, give the Test A results for the current year’s class, and ask for some predictions of the Test B results, but that is a trickier and more time-consuming question.

“To see how close a person was to the average” is close to the right idea if you think of “average” as the predicted score or the score you would expect to see.
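
The prediction scenario described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scores are invented (they are not the Shell Centre data), and `fit_line` and `predict_b` are hypothetical helper names. It fits a least-squares line to "last year's" scores, predicts a Test B score from a new Test A score, and measures how far one known student landed from the expected score.

```python
from statistics import mean

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Hypothetical scores, invented for illustration only.
last_year_a = [10, 20, 30, 40, 50]
last_year_b = [15, 22, 38, 41, 54]

slope, intercept = fit_line(last_year_a, last_year_b)

def predict_b(test_a_score):
    """Expected Test B score for a given Test A score, read off the fitted line."""
    return slope * test_a_score + intercept

# Prediction for a current student who scored 35 on Test A.
predicted = predict_b(35)

# Residual: actual Test B score minus the score the line would expect.
residual = last_year_b[2] - predict_b(last_year_a[2])

print(f"line: y = {slope:.2f}x + {intercept:.2f}")
print(f"predicted Test B for Test A = 35: {predicted:.2f}")
print(f"third student's distance from the line: {residual:.2f}")
```

The residual is the "how close a person was to the average" idea, with "average" read as the score the line predicts.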

I don’t care for the question either.
I would love a question like: “what would a best fit line y=x mean on this graph?”
That might get students thinking about slope of line of best fit in this dataset.

Can we agree on the correct answer? I think it might be:
“The line of best fit approximates an average student’s performance on Test A and Test B. In particular if the slope of the line of best fit is greater than 1, then students tended to do better on Test B than Test A. If the slope of the line is less than 1, then students tended to do worse on Test B than Test A.”

I would love to post-mortem the instruction, and the rubric used to grade all the replies to the question.
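
One way to test the slope reading proposed above is to fit a line and compare it with y = x directly. The sketch below uses invented scores (an assumption, not the data from the question) and a hypothetical `fit_line` helper. In this made-up data set the slope is less than 1, yet every student scores higher on Test B, which suggests that the comparison with y = x depends on the intercept as well as the slope.

```python
from statistics import mean

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Hypothetical scores, invented for illustration only.
test_a = [10, 20, 30, 40, 50]
test_b = [30, 36, 42, 48, 54]

slope, intercept = fit_line(test_a, test_b)

# For each observed Test A score, check whether the fitted line sits above y = x,
# i.e. whether the line predicts a higher score on Test B than on Test A.
line_above_identity = [slope * a + intercept > a for a in test_a]

print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print("line above y = x at every observed Test A score:", all(line_above_identity))
```

A slope below 1 here means the gap between the two tests narrows as Test A scores rise, not that students did worse on Test B.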

I agree with Louise’s comment about the “wavy” LOBF. We overcompensate for students’ desires to “go through the most points” and this is what happens. However, why didn’t the kid have a ruler?!

I like the student’s response that used the term “estimate” instead of “average.” I think if you combined that with the first response, it would be fairly accurate. The problem with using the term “average” is that you end up with statements like those of the third and fourth students. It is difficult to determine whether they truly do not understand or they just cannot communicate very well.

I also agree with John that this particular scatter plot lends itself quite well to an examination with y=x. Students need more opportunities with these situations where the two variables represent the “same quantities.”

In addition to everything else said, the LOBF drawn intersects the origin. In physics, students often assume they must start from zero. A similar error is happening here. The student should imagine the LOBF before drawing it, not start from the origin and figure things out as he or she goes along.
