Assessing Text Response Questions

Designing assessment profiles for a text question can become very complex depending on the richness of conceptual understanding which is being tested.  Text questions with highly defined "correct" answers consisting of just a few key words or phrases are most easily assessed; such a case is outlined in "Example 1" at the foot of this article.  In other cases, natural (written) language responses to a question might be more expansive, describing a single "correct" concept in many different yet equivalent ways.  Furthermore, there are very often multiple concepts which should be recognised when assessing answers to a question.  Sometimes several concepts are "correct" (even if one is more correct than the others).  Xorro's assessment text response tools provide considerable flexibility to identify the presence of multiple understandings while still catering for many different ways of expressing the same concepts.  A key constraint to Xorro's current text assessment is that it searches for a single assessment outcome - a single matched "Answer Term" - for a submitted text, resulting in a single associated score and feedback.

Setting the rules: 

In the first step, the author determines which of four general rules to apply to the assessment of the participants' responses: ignore case; ignore punctuation; ignore spaces; ignore additional text on either side of the term.  These rules determine the exactness with which submitted strings will be compared to the Answer Terms.  Should differences in case be ignored?  If so, then regardless of the case typed by the participant, a text string which otherwise matches one of the strings in an Answer Term will be matched to that Term.  Similarly, if the author does not set the rule to "ignore punctuation", then the string submitted by the participant must match the string in the Answer Term inclusive of any punctuation (characters such as " ' ; : - , .).  The author may also choose to ignore any spaces in the participants' responses.  Lastly, choosing to "Ignore additional text on either side of the term" allows the participant's submission to contain non-matching text, so long as some part of the submission matches a string in one of the Answer Terms.
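Xorro's internal implementation is not published, but the effect of these four rules can be sketched as a normalisation step applied to both the submitted text and the Answer Term strings before comparison.  The function name and defaults below are illustrative assumptions, not Xorro's API:

```python
import string

def normalize(text, ignore_case=True, ignore_punctuation=True,
              ignore_spaces=False):
    """Apply the author's chosen matching rules to a string
    before it is compared against an Answer Term string."""
    if ignore_case:
        text = text.lower()
    if ignore_punctuation:
        # Strip characters such as " ' ; : - , .
        text = text.translate(str.maketrans("", "", string.punctuation))
    if ignore_spaces:
        text = text.replace(" ", "")
    return text

print(normalize("Adolf Hitler!"))  # with these rules: "adolf hitler"
```

With all rules switched off, only an exact character-for-character match would succeed.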

Note that selecting "Ignore additional text on either side of the term" can, depending on the terms, greatly loosen the rigour of the assessment: a participant need only list every term which might be a possible answer, and if just one of them matches a term in the question, the response will be assessed as correct.

Having established the base rules for the matching process, the next step is for the author to prepare the list of Answer Terms.

Setting the Answer Terms: 

An Answer Term (as used in a Text Response question) is a string of text, or multiple strings, which can be compared to submitted responses in order to assess them.

When assessing a text response, Xorro will compare the submitted text string with each Answer Term marked "correct", starting from the top of the list.  If a "correct" match is not found, then Xorro will start again from the top of the list looking for the first Answer Term (not marked "correct") which has a match to the submitted text string.  A match is determined by the presence of an Answer Term string within the submitted text, always subject to the rules (as set above).

Note that each Answer Term field in fact allows for multiple possible strings, separated by commas.  Xorro will check for a match against each of these text strings, and if one is found, will assign the "correct" status and the marks for the Answer Term to the participant's response.

Important:  Only one match is established: the first match found using the process described above.  The first match found therefore determines the "correctness" and the score awarded for the response.  The purpose of having multiple Answer Terms in a text question's assessment profile is not to recognise the presence of multiple conceptual ideas in a response, but rather to determine whether it contains any one of them, and if so, which.
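The first-match process described above can be modelled as follows.  This is an illustrative sketch only, not Xorro's actual code, and the data layout (a list of dicts with assumed keys) is an assumption:

```python
def assess(response, answer_terms, ignore_extra_text=True):
    """Return the first Answer Term matching the response.

    answer_terms: list of dicts, each with a comma-separated
    'strings' field plus 'correct', 'score' and 'feedback'.
    Terms marked correct are scanned first, top to bottom.
    """
    resp = response.lower()  # assumes the "ignore case" rule is set
    for want_correct in (True, False):
        for term in answer_terms:
            if term["correct"] != want_correct:
                continue
            for s in term["strings"].split(","):
                s = s.strip().lower()
                matched = (s in resp) if ignore_extra_text else (s == resp)
                if matched:
                    return term
    return None  # no Answer Term matched
```

Note that only a single term is ever returned, mirroring the single-outcome constraint described above.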

This constraint should focus attention on the identification of expected answers, and of equivalent forms of describing these in natural (written) language.  These expected answers can then be grouped into Answer Terms where each Answer Term addresses a particular concept being expressed.  Each Answer Term can be considered as a set of text strings, the submission of any of which by a participant would indicate a particular understanding which is deserving of a specific score and feedback.  The strings within an Answer Term are separated by a comma.

Example 1:  Consider the simple text response question "Who was the Chancellor of Nazi Germany from 1933 through 1945?"

This is a very straightforward example in which only one correct answer exists.  A suitable answer term would be: "Adolf Hitler".  Entering these two words as shown would result in the response being awarded full marks (subject to use of capitals, spaces, punctuation etc as determined in the question rules).  However, depending on the question author's preferences there might be more or less flexibility applied even to such a straightforward example.

A second answer term might be: "Adolf, Hitler, Führer, Fuhrer".  A response including any of these (comma separated) strings might be regarded as a less complete answer, and might perhaps be awarded less than full marks, but might still be assessed as "correct".

A third answer term might be "Stalin, Churchill, Roosevelt, Mussolini".  These clearly incorrect answers would attract a different feedback text string and a different mark!  This example serves to highlight that it is sometimes as useful to identify and assess "wrong" answers to a question, as it is "right" answers.

Example 2:  Consider the much more demanding text response question "What is the main purpose of the lungs?"

In this case the assessment of submitted text responses needs to allow for multiple correct yet different conceptual understandings, as well as allowing for differing vocabulary used when describing each concept.  Suitable answer terms might be as follows:

  1. Respiration, inhalation, exhalation, breathing
  2. Oxygen into the blood, O2 into the blood, carbon dioxide out of the blood, CO2 out of the blood, gas exchange
  3. Bellows, suck air, suck oxygen, push carbon dioxide, push CO2, pump air, pump oxygen, pump carbon dioxide, pump CO2
  4. pH, balance pH, balance acidity, filtering, absorb impact, absorb shock, shock absorb

The above Answer Terms represent four different "correct" ideas, each deserving therefore of differing feedback strings, and perhaps differing "correctness" and/or scores.  Xorro would test a submitted text response against each of these terms in sequence, starting from the top.  As soon as a valid match is discovered, the response would be assigned the correctness status and the score applicable to the matched Answer Term.
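To make the sequencing concrete, the four Answer Terms above can be run through a simple top-to-bottom substring matcher.  The scores attached to each term here are purely hypothetical, and the matcher is an illustrative sketch rather than Xorro's implementation:

```python
# Hypothetical scores attached to the four Answer Terms above.
terms = [
    ("Respiration, inhalation, exhalation, breathing", 1.0),
    ("Oxygen into the blood, O2 into the blood, "
     "carbon dioxide out of the blood, CO2 out of the blood, gas exchange", 1.0),
    ("Bellows, suck air, suck oxygen, push carbon dioxide, push CO2, "
     "pump air, pump oxygen, pump carbon dioxide, pump CO2", 0.5),
    ("pH, balance pH, balance acidity, filtering, "
     "absorb impact, absorb shock, shock absorb", 0.25),
]

def first_match(response, terms):
    """Scan the terms top to bottom and return (strings, score)
    for the first comma-separated string found in the response."""
    resp = response.lower()
    for strings, score in terms:
        for s in strings.split(","):
            if s.strip().lower() in resp:
                return strings, score
    return None

print(first_match("The lungs enable gas exchange with the blood.", terms))
```

Here the response above skips past Answer Term 1 (no breathing-related string is present) and stops at "gas exchange" in Answer Term 2, so only that term's score and feedback would apply.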

Note that the list of Answer Terms for this example is not comprehensive:  There would certainly be other correct text responses to this question which might not include any of the strings listed within the above four Answer Terms.  Prudent authors should therefore test a new text response question with a knowledgeable audience to derive the widest possible range of answers; these could then be filtered to extract strings to be included in Answer Terms for future assessments.


See also:  Question Types; Information Questions; Multiple Choice Questions (with single selection enforced); Multiple Choice Questions (with multiple selections permitted); Numeric Response Questions; Likert Questions; Hotspot Questions (with single zone selection enforced); Hotspot Questions (with multiple zone selections permitted); Label Questions (text or numeric responses required); Label Questions (selecting from a drop-down list); Peer Assessment Questions.

Categories: Xorro-Q, Facilitators.
Tags: Question Type, scoring, assessment.