How does Write & Improve know my score?

This article explains how the automated assessment in Write & Improve works.

Written by Diane
Updated over a week ago

The Write & Improve assessment engine uses a class of Artificial Intelligence (AI) known as Supervised Machine Learning. In this method, if you want to predict something, you give an algorithm (a computer program) a large amount of data (known as ‘training data’) in which the thing you want to predict is already known. From this training data, the algorithm learns to predict the same thing for any data of a similar type that it has not ‘seen’ before.
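As a rough illustration of the supervised learning idea described above, here is a minimal Python sketch. It is not the Write & Improve engine: the single feature (average sentence length), the scores, and the straight-line model are all invented for the example.

```python
# Toy supervised learning: learn from examples whose answer is already known,
# then predict the answer for an example the model has not seen.
# All numbers below are hypothetical and chosen only for illustration.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Apply the learned line to a new input."""
    slope, intercept = model
    return slope * x + intercept

# Training data: (average sentence length, score given by a human)
train_x = [8.0, 10.0, 12.0, 14.0]
train_y = [3.0, 4.0, 5.0, 6.0]

model = fit_line(train_x, train_y)   # the "learning" step
predicted = predict(model, 11.0)     # an input not in the training data
print(predicted)
```

A real scoring engine would learn from many thousands of features rather than one, but the two steps are the same: fit on data with known answers, then predict for new data.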

In the case of Write & Improve, the training data is the Cambridge Learner Corpus, a database of 30 million words of EFL learners’ essays submitted for Cambridge English exams. Each essay comes with the score it was awarded by human examiners, benchmarked to the CEFR 0-13 scale, and with its language errors marked and corrected by EFL teachers. The goal is to predict the CEFR score for any new essay based on what the algorithm knows about previous essays and their scores. Given the examiner-scored essays, the algorithm is able to identify the many thousands of characteristics (known as ‘features’) that make an essay a 7, rather than a 7.5 or an 8, for example.
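To make the idea of a ‘feature’ more concrete, the sketch below turns an essay into a few numbers. The function name, the three features, and the list of modal verbs are assumptions made for this example. The real engine's features are far more numerous and are not described in detail in this article.

```python
import re

# Hypothetical word list used only for this illustration.
MODALS = {"can", "could", "may", "might", "must", "shall", "should", "will", "would"}

def extract_features(essay):
    """Turn raw essay text into a small dictionary of numeric features."""
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    # Lower-case words made of letters and apostrophes.
    words = re.findall(r"[A-Za-z']+", essay.lower())
    return {
        "sentence_count": len(sentences),
        "avg_sentence_length": len(words) / len(sentences) if sentences else 0.0,
        "modal_count": sum(1 for w in words if w in MODALS),
    }

features = extract_features("I would like to go. She can swim very well!")
print(features)
```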

After computer analysis, each essay in the training data has a ‘feature set’: a collection of characteristics that explain the relationship between the essay and its CEFR score, that is, why it was given that particular score. Simple examples might be complex phrase structures, use of modal auxiliaries with adverbial complements, a range of narrative tenses, prepositional phrases, etc., along with the different error types and their frequencies at different levels of attainment. If a new essay has a set of features very similar to those of essays in the training data that scored 7, for example, that score is predicted for the new essay too.
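The matching step described above can be sketched with a nearest-neighbour lookup, which is one simple way to predict from similar examples. This is an assumption made for illustration and is not necessarily how the Write & Improve engine compares essays. The feature names, values, and scores are invented.

```python
import math

def distance(a, b):
    """Euclidean distance between two feature dictionaries."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))

def predict_score(new_features, training_set):
    """Return the score of the most similar training essay (1-nearest neighbour)."""
    nearest = min(training_set, key=lambda item: distance(new_features, item[0]))
    return nearest[1]

# Hypothetical training essays: (feature set, score given by a human examiner)
training_set = [
    ({"modal_verbs": 1.0, "narrative_tenses": 1.0, "errors_per_100_words": 9.0}, 4),
    ({"modal_verbs": 3.0, "narrative_tenses": 2.0, "errors_per_100_words": 5.0}, 7),
    ({"modal_verbs": 5.0, "narrative_tenses": 4.0, "errors_per_100_words": 1.0}, 10),
]

new_essay = {"modal_verbs": 3.0, "narrative_tenses": 3.0, "errors_per_100_words": 4.0}
score = predict_score(new_essay, training_set)
print(score)
```

Here the new essay's features are closest to those of the training essay that scored 7, so 7 is the predicted score.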

So, Write & Improve’s scoring engine relies on a vast range of features, from very detailed and specific language features of an essay to more general aspects such as sentence length and complexity. It is not, however, able to assess how well or completely the essay answers the question, or to determine whether the author has made their intended meaning clear. The first is a goal our research partners are actively working towards. The second is something that will perhaps always require human understanding.
