The four principles of assessment are validity, reliability, beneficence and efficiency.
Validity ensures that assessment is focussed on the correct area: it is not useful or purposeful to assess work unrelated to the intention of the study. The area of study derives from a syllabus, which leads to a course outline and objectives for each lesson, and assessment is valid when it focusses on the objectives of the lesson. Validity also requires that the work set in the syllabus, and the work the student produces, is up to date and relevant. In certain cases validity is threatened when a course timescale cannot contain the whole of a syllabus: assessment is then insufficient, and the student's strategy of revision for an examination may also be insufficient, with an element of gambling about what may "come up".
Reliability ensures that assessment is consistent and repeatable. Assessment is not useful or purposeful if students cannot be compared across classes or from one year to the next. Assessment is reliable when it is the same across classes and times, so that the achievement of one student can be compared with that of another. Validity, in terms of updating knowledge and updating teaching fashion, may affect reliability over the medium and long term, and this is the subject of much debate in the media.
Being beneficial assures the student that there is a positive purpose to assessment and that he or she can use it to improve or to move on to higher study. Being beneficial is linked to making sure that the learner's own effort is involved, because plagiarism by any means will (at the very least) not benefit the student as much as translating material into his or her own output. Assessment is not useful or purposeful if the student cannot see its point, if it carries no advantage, or if the teacher cannot use it to guide the student on. It is also important that the entry level is accurately gauged: assessment will not be beneficial if it is too difficult or too easy. However, much summative assessment in subjects (usually outside adult and further education) carries a finality about it, so that from the examination on no more may be seen or heard of that subject! So formative assessment should offer feedback and guidance, and summative assessment should give certification and a path to improvement.
Being efficient ensures that the student spends a minimal amount of time away from new learning in order to give formative feedback on current learning, and that summative assessment is neither overbearing and overstressful nor, at the opposite extreme, frustrating by repetition. This is linked to sufficiency, in that efficient assessment must also be sufficient. Assessment is not useful or purposeful if the student spends too much time on formative assessment activities within the limited timescale of a course, has to produce a larger file of work than the syllabus requires, or sits more exams than would cover the syllabus. Some syllabuses turn out to be repetitive in assessment, for example where the same skills are repeated in IT between "word" processing and "text" processing. In any case, somewhere along the line more money is spent by the student or the authorities than is necessary, and too much time may be given when a shorter course would be just as efficient (being linked to the entry level and learning speed of a student).
Whether assessment is valid, reliable, beneficial or efficient depends upon how a particular subject area chooses its methods of formative and summative assessment.
In my teaching experience I have observed how the above principles of assessment have and have not been applied to subject areas and from this I propose how they ought to be applied. I also propose to go on to argue for the terminology of "evaluation" over "assessment".
In teaching Sociology my task was to carry out formative assessment. This was the purpose of the tutorials: to monitor the effectiveness of the lecture. So whilst I was responsible for generating discussion, I was also monitoring the students' comprehension. In this introductory course no work was actually handed in. Clearly this informal method relies upon the teacher and may not be very reliable, although one hoped to be even across the groups - but then the groups often generated their own discussions and concerns. So I was a tool of focus, making valid what was an informal, ongoing assessment. How beneficial this was to individuals obviously depended upon the effectiveness of the discussion sessions and upon correcting and guiding particular comments, so it was incumbent on me not to let individuals slip through the net. My preparation was therefore centred upon the syllabus as it was set for that time, with pre-emptive points to reinforce, but it relied much upon the co-operation and interest of the students, although they were required to attend. In the end the greatest focus came at the time of summative assessment, when I spoke of the questions they would face in the exam. This was extremely efficient, but had the effect of reducing the syllabus - but then the purpose was to pass Psychology students through Sociology to prove that they had "done it".
In a better arrangement there would be small pieces of work to hand in based upon the lecture content, and there would not be such revelation of examination areas. The identified work would allow me to see what each individual was and was not understanding, so that overall the objectives could be better served. Although the exam would be more uncertain for students, they would then cover the whole syllabus in their revision efforts, and they would have been able to claim a better understanding of the whole of Sociology. It was beneficial that they passed (they all did) and could move on, but any switching to Sociology, or importing that interest into their psychological work, might have been impaired, so I would have preferred that they worked to the whole syllabus.
In the Research Methods course I taught, work was being formed. A similar informal monitoring process went on within the seminars based upon my lectures, and a lot of reinforcement happened. Nevertheless each student was forming research proposals within a slightly extended essay format. The precision of each piece of work was based first on the student's own interest and then upon the focus on research methods. It was here that I monitored whether they understood the applicability of research methods. This work was often presented to the seminars by the students, in preparation form and by summary. Again it was up to my judgement whether they were learning the correct applicability of research methods. Everyone was covered in this, but some came for assessment more than others as the sessions became more informal (the intention of the course outline).
Arguably this again relies heavily upon the teacher for formative assessment. Reliability was a problem. I followed a teacher who spoonfed all details within seminars as well as lectures, whereas I moved rapidly towards more student responsibility and feedback, as would have to be the case in actually doing the research. This, I believed, was far more beneficial for their future, but there was unreliability between me and the previous teacher. The subjectivity of which I might be accused in reading individual accounts (I needed a wide knowledge of subjects!) also adds to unreliability. I was also replaced, with the change in job description and a move towards a further workshop method (the students wished me to continue!), and this would make assessment even more unreliable, as I certainly wished there to be continuity between monitoring the research proposal, monitoring the research, and monitoring the writing of the research findings. However, in charge at the time, and in negotiation with colleagues, I believed that the formative assessment was more efficient in the long run. It would be easy to tell them everything and have it told back in some form of assessment, but not useful in terms of training to become researchers, when the work they showed me proved that they had found out much by themselves.
In terms of the summative assessment, this also relied upon my subjective reading of whether they had understood the applicability of research methods in terms of their own project. Had they chosen the right combination of methods for their objectives (they had to differentiate between aims and objectives), and did they understand the methods they were using? There was some cross-marking, checking and external monitoring, but obviously one questions the reliability of this method at least. It seemed to be valid, and efficient in that it was closely tied to the research proposal, which itself was passed as suitable in each case.
However, with more time I might myself institute a separation between early research methods understanding and producing a piece of work. There could be more bitesize lectures and classwork on research methods, and more rapid assessment of the apparent learning gained. Then the students would move on into the production of the research proposal and the methods involved, with seminar emphasis on research methods for the benefit of the other students (I did this, but informally). The summative assessment would then remain based more on the individual, but with this existing back-up of teaching and learning which had been more reliable.
Rogers (1996, 220) and Curzon (1997, 388) make a distinction between assessment and evaluation. What I was carrying out was evaluation. It takes into account more in the way of value, sets other criteria and is more qualitative than a narrower definition of assessment. Perhaps the clearest form of assessment, towards quantitative testing, is objective testing, and this is criticised for producing fragmentary learning (Curzon, 1997, 413). However, evaluation of learning is less reliable than assessment.
There is still an objectives-assessment regime running throughout teaching and learning: starting with the syllabus, creating the course outline and its aims, creating each lesson with its objectives, and leading therefore to assessments that directly test those objectives, all formed within a context of league tables and funding. So there is a demand for objective assessment, even with methods of evaluation! There is a delusion inherent in making generalised quantitative statements out of qualitative evaluations. But even with more instant teaching and feedback in bite-sized chunks, and furthermore with the summative assessment, the more quantitative, reliable approach does not guarantee that learning with understanding takes place. However much different teaching processes may be used, with different psychological theories of learning behind them, the dogma of syllabus, outline, objectives and assessment is very behaviourist ("there must be evidence of outputs!").
Against this, the whole point about cognitive learning (especially in Sociology and Research Methods) is that understanding is long term, self-critical and formative: it is built over time through thinking and reflection. The objectives-assessment regime may in fact be more a system to be obeyed, because of the external quantifiable results it produces, than primarily a good system of learning. I call this here the Behaviourist Loop, because the pseudo-science of behaviourism (*) gives the sense, at least, of quantifiable outputs.
It may be that the question is asked back: what other way is there to measure learning? Teachers already use their judgement, and there are even some specific methods, for example profiling and self-assessments (Reece, Walker, 2000, 54). Yet much assessment takes into account a system of provision as much as the requirements for effective learning. It should be admitted that the system itself (especially now with league tables) has objectives which are being met that are not themselves learning but the presentation of apparent learning.
However, the behaviourist method is appropriate, on a purely learning basis, within the psychomotor setting of IT. I have taught in IT, and of course the tasks are related to the objectives and the assessment is directly linked. That the person has done the task, and that it was their own work, shows that they have indeed reached a certain level of competence. It can be marked in a book. It can be argued that, having done the tasks, a summative assessment in the form of an exam is unnecessary, although the exam conditions of individual, monitored work in some isolation are an efficient guarantee at that time - if the tasks related to the syllabus and no more - of an individual's ability to perform the work. But this only goes to emphasise that in the cognitive sphere the "ability to pass" is itself the skill being measured, and not the depth of something deeper called learning, where perhaps the aim of a lesson, and not its objectives, is more relevant.
(*) I am cross-referencing this Brief with my own additional piece of work, Psychological Theories of Learning and the Lesson Plan.
Curzon, L. B. (1997), Teaching in Further Education: an outline of principles and practice, fifth edition, Cassell.
Reece, I., Walker, S. (2000), Teaching, Training and Learning: a practical guide, Sunderland: Business Education Publishers.
Rogers, A. (1996), Teaching Adults, second edition, Buckingham: Open University Press.