C-STAR Subjective Evaluation, Summer 2003



ATR
Grading File I.
Grading File II.
Grading File III.
Grading File IV.
Grading File V.
NLPR
Grading File I.
Grading File II.
Grading File III.
Grading File IV.
Grading File V.
UKA
Grading File I.
Grading File II.
Grading File III.
Grading File IV.
Grading File V.
IRST
Grading File I.
Grading File II.
Grading File III.
Grading File IV.
Grading File V.
ETRI
Grading File I.
Grading File II.
Grading File III.
Grading File IV.
Grading File V.

Each group participating in the C-STAR III evaluation this year has been assigned a series of translations for human grading.

The grading assignments for each group are split into five files, listed under the group name. To keep each file manageable for a single grader in a single grading session, each file contains around 100 sentences.

To start grading a file, click on its filename in the list above. You will be prompted for a group user name and password, and the grading page will then open automatically.

A sample grading turn is shown below. For each translation, the grader must supply grades for fluency and adequacy. These terms are explained in the instructions section of each of the grading pages.

IMPORTANT NOTE: Each file contains around 100 translations, all of which must be graded in a single grading session. Graders should complete the entire file and submit the grades before moving on to another file or closing the browser.

Special Characters and Null Translations: NO_TRANSLATION_FOUND indicates that the translation engine could not produce any output for the current input sentence. Non-Latin characters in the output indicate that some words from the input could not be translated.
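For anyone post-processing the system output, the two conventions above could be checked with something like the following minimal sketch. It is not part of the evaluation tooling; the function names are hypothetical, and treating "non-Latin" as "any letter whose Unicode name does not start with LATIN" is an assumption made here for illustration.

```python
import unicodedata
from typing import List

# Marker string quoted in the instructions above.
NULL_MARKER = "NO_TRANSLATION_FOUND"


def is_null_translation(hyp: str) -> bool:
    """True if the engine produced no output for this input sentence."""
    return hyp.strip() == NULL_MARKER


def untranslated_tokens(hyp: str) -> List[str]:
    """Return whitespace-separated tokens that contain a non-Latin letter.

    Assumption: a letter counts as Latin when its Unicode name starts
    with "LATIN", so accented letters such as "é" are not flagged.
    """
    flagged = []
    for token in hyp.split():
        for ch in token:
            if ch.isalpha() and not unicodedata.name(ch, "").startswith("LATIN"):
                flagged.append(token)
                break
    return flagged
```

A hypothesis for which `is_null_translation` is true has nothing to inspect further; otherwise, any tokens returned by `untranslated_tokens` are likely words passed through from the input untranslated.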

GRADING EXAMPLE, DO NOT GRADE THIS SEGMENT.

1.a Fluency: How good is the English?
Evaluate this segment: This is the test hyp.
Flawless English
Good English
Non-native English
Disfluent English
Incomprehensible
Comment:
1.b Adequacy: How much information is retained?
Reference: This is reference translation one.
Evaluate this segment: This is the test hyp.
All of the information
Most of the information
Much of the information
Little information
None of it
Comment:
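The two five-point scales in the sample turn above can be represented as ordered label-to-score tables, as in the rough sketch below. The numeric values (5 for the best label down to 1 for the worst) are an assumption for illustration; the grading pages show only the labels and their order, and the helper names are hypothetical.

```python
from typing import Iterable, Tuple

# Assumed numeric values: 5 = best, 1 = worst. Only the labels and
# their order come from the grading page.
FLUENCY_SCALE = {
    "Flawless English": 5,
    "Good English": 4,
    "Non-native English": 3,
    "Disfluent English": 2,
    "Incomprehensible": 1,
}

ADEQUACY_SCALE = {
    "All of the information": 5,
    "Most of the information": 4,
    "Much of the information": 3,
    "Little information": 2,
    "None of it": 1,
}


def score_turn(fluency_label: str, adequacy_label: str) -> Tuple[int, int]:
    """Convert one graded turn to a (fluency, adequacy) score pair."""
    return FLUENCY_SCALE[fluency_label], ADEQUACY_SCALE[adequacy_label]


def mean_scores(turns: Iterable[Tuple[str, str]]) -> Tuple[float, float]:
    """Average fluency and adequacy over a non-empty list of graded turns."""
    scores = [score_turn(f, a) for f, a in turns]
    n = len(scores)
    return sum(s[0] for s in scores) / n, sum(s[1] for s in scores) / n
```

An unknown label raises `KeyError`, which keeps a mistyped grade from being silently counted.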
Please send problem reports and suggestions related to the subjective evaluation web pages to Alicia Tribble (atribble@cs.cmu.edu).