Assessment Validity
In accordance with the QCA Code of Practice, all awarding bodies are required to ensure that their assessments are fit for purpose, valid and reliable. Validity in assessment has been defined by QCA as 'the fitness for purpose of an assessment tool or scheme'. In other words, an assessment can be deemed valid if it gives an accurate measurement of whatever it is supposed to measure. The pursuit of validity permeates all the assessment activities of an awarding body, from developing the specifications and producing the question papers or tasks and their associated mark schemes through to the marking and awarding processes. The purpose is to ensure that the results awarded provide a true measure of the knowledge, understanding, skills and aptitudes that the assessment in question is intended to measure.
There are a number of different types of validity. Those most relevant to public examinations and tests include the following:
Face validity
Face validity is a measure of the extent to which an examination looks like an examination in the subject concerned and at the appropriate level. Candidates, teachers and the public have expectations as to what an examination looks like and how it is conducted. For example, they expect question papers to be clear, error-free and written in plain, fairly formal language, and they expect examinations to be taken under controlled conditions. An awarding body would lose credibility if it chose to produce examinations which did not have face validity, even if they were valid in other ways.
Content validity
Content validity is a measure of how closely the content of an assessment matches the content of the specification it is designed to assess. Teachers and candidates prepare for an examination in the expectation that the subject content and assessment objectives (knowledge, understanding and skills) tested in the examination will reflect those set out in the specification. Awarding body specifications are required to include clear information about what is to be tested in each component. The content of draft assessments is then checked against the specification by the revisers, scrutineer, chief examiner and chair of examiners to ensure that the content is valid and acceptable.
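The idea of checking a draft assessment against the specification can be illustrated with a small sketch. Everything in it is hypothetical: the topic names, the question labels and the `check_content` helper are invented for illustration and do not represent any awarding body's actual procedure. Note also that a single paper is not necessarily expected to cover every topic in a specification, so the first list is a prompt for review rather than an error report.

```python
# Hypothetical specification content; topic names are invented for illustration.
specification_topics = {"algebra", "geometry", "statistics", "number"}

# Hypothetical draft paper: each question is tagged with the topic it tests.
draft_paper = {
    "Q1": "algebra",
    "Q2": "geometry",
    "Q3": "algebra",
    "Q4": "calculus",  # deliberately outside the specification
}


def check_content(spec_topics, paper):
    """Return (topics never tested, questions testing off-specification topics)."""
    tested = set(paper.values())
    untested = sorted(spec_topics - tested)
    off_spec = sorted(q for q, topic in paper.items() if topic not in spec_topics)
    return untested, off_spec


untested, off_spec = check_content(specification_topics, draft_paper)
print("Specification topics not tested:", untested)
print("Questions outside the specification:", off_spec)
```

A question that appears in the second list tests something candidates were never told to prepare for, which is the clearest threat to content validity.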
Construct validity
Construct validity measures the extent to which an examination actually measures what the specification says it is intended to measure. The components of the examination should seek to elicit responses which closely match the overall specification requirements. If the specification is intended to test a range of skills then the assessment should provide opportunities for that range of skills to be demonstrated. For example, modern foreign languages examinations are intended to measure candidates' skills in understanding, speaking, reading and writing the foreign language. Consequently candidates complete a number of assessments which together assess this range of skills. If a specification requires candidates to acquire practical skills, the assessment should include a practical assessment and should not just require candidates to, for example, write about how they would carry out a practical activity. A coursework component is included in many specifications because it allows skills to be tested which could not be tested in a formal written examination situation (e.g. research skills, investigative skills, practical skills).
Predictive validity
Predictive validity is the extent to which the results of an assessment can be used to predict future behaviour or achievement. Schools and colleges use candidates' GCSE grades as a measure of how suited they are to continue their studies post-16 and what sorts of courses would suit them best. Universities and HE colleges use A level grades as a tool for selection to their courses in the expectation that success in A level specifications can be used to predict suitability for further study and students' likely success. In practice, GCSEs and A levels are only limited predictors of future performance, and it should be remembered that predictive validity is not one of the main concerns of awarding bodies.
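One common way to quantify predictive validity is to correlate results on the earlier assessment with a later outcome for the same candidates. The sketch below computes a Pearson correlation coefficient by hand. The two score lists are invented purely for illustration and are not real examination data; the `pearson` helper is likewise an assumption of this example, not a method described by QCA or the awarding bodies.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented data: earlier assessment scores and a later outcome
# for the same five hypothetical candidates.
earlier_scores = [5, 6, 7, 8, 9]
later_scores = [52, 61, 58, 70, 74]

r = pearson(earlier_scores, later_scores)
print(f"Correlation between earlier and later results: {r:.2f}")
```

A coefficient near 1 would suggest the earlier results track the later outcome closely, while a value near 0 would suggest little predictive value. A sample this small is only an illustration of the calculation, not evidence about any real qualification.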
Ensuring validity
There are a variety of ways in which the awarding bodies and the regulatory authorities seek to ensure assessment validity. These include:
- The accreditation of specifications by QCA, DELLS and CCEA
- Adherence to the QCA Code of Practice
- The use of specification grids by principal examiners drafting question papers and mark schemes to ensure appropriate coverage of assessment objectives and subject content
- Checks on draft question papers by revisers, question paper evaluation committees, scrutineers and awarding body officers
- Reviewing the performance of question papers by principal examiners
- Consideration of a wide range of qualitative and quantitative information by awarding committees and chairs of examiners
- Review of all recommended grade boundaries and grading outcomes by the awarding body's accountable officer
- Seeking feedback from teachers on examinations and tests
- QCA scrutiny and monitoring procedures.
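The specification grid mentioned in the list above can be pictured as a table of questions against assessment objectives. The following is a minimal sketch under stated assumptions: the objective labels, target weightings, mark allocations and the five-percentage-point tolerance are all hypothetical, and `weighting_report` is an illustrative helper rather than a tool used by any awarding body.

```python
# Hypothetical target weightings (percentage of total marks) per assessment objective.
target_weighting = {"AO1": 40, "AO2": 40, "AO3": 20}

# Hypothetical draft paper: (question, assessment objective, marks available).
draft_questions = [
    ("Q1", "AO1", 10),
    ("Q2", "AO2", 15),
    ("Q3", "AO1", 10),
    ("Q4", "AO3", 5),
    ("Q5", "AO2", 10),
]


def weighting_report(questions, targets, tolerance=5):
    """Map each objective to (actual % of marks, within tolerance of target?).

    The tolerance is an arbitrary illustrative figure in percentage points.
    """
    total = sum(marks for _, _, marks in questions)
    report = {}
    for ao, target in targets.items():
        ao_marks = sum(marks for _, q_ao, marks in questions if q_ao == ao)
        actual = 100 * ao_marks / total
        report[ao] = (round(actual, 1), abs(actual - target) <= tolerance)
    return report


report = weighting_report(draft_questions, target_weighting)
for ao, (actual, ok) in report.items():
    status = "ok" if ok else "review"
    print(f"{ao}: {actual}% of marks (target {target_weighting[ao]}%) -> {status}")
```

In this invented draft, one objective is over-represented and another under-represented, which is the kind of imbalance a grid is meant to expose before a paper is finalised.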
Linking validity and reliability
Validity and reliability in assessment are very closely linked and in many cases interdependent. It is possible to think of cases where a valid assessment could not be conducted reliably, for example certain practical activities which produce purely ephemeral evidence. It is also possible to think of assessments which would be highly reliable but not particularly valid, for example certain multiple-choice tests, or the use of a spelling test to assess linguistic ability. However, in most cases validity and reliability are intertwined, and the examining personnel involved in devising assessments (developing specifications, drafting assessments and mark schemes) seek to ensure the maximum possible validity and reliability, with an appropriate balance between the demands of the two. The aim is an assessment which is truly fit for purpose.