The system gives word-level feedback only when it is at least 90% 'certain' that there is an error. Certainty is measured as the percentage of times the word or words have been marked as incorrect by one of our teachers, out of all the times they occur in the data. Word-level feedback requires 90% or more. This is to avoid incorrect feedback as much as possible. When the system falls short of 90% certainty, for example because the word has been marked incorrect 80% of the times it occurs, it gives the 'suspicious word' feedback instead. That is because the word is often wrong but sometimes correct, or possibly because the human teachers do not agree on a correction. Here are two examples which make this clearer:
In this case, our teachers do not agree about whether 'OK' should always be written in capital letters, or they may sometimes have failed to notice the error. The actual counts are: 'ok' identified as an error and corrected to 'OK', 1382 times; 'ok' not identified as an error, 641 times. That is about 68% of its 2023 occurrences. As a result, the system cannot say with 90% certainty that 'ok' is incorrect, but it knows there is something 'suspicious' about it. The learner must decide. In the second example here:
the system is uncertain because 'the cars' is sometimes correct (when it refers to specific cars that have been mentioned before) and sometimes incorrect (when the learner is talking about cars in a general way), but it has been marked incorrect by the teachers often enough to create doubt. Because the system is not context-aware (it cannot tell whether 'cars' have been mentioned before, for example) but has seen 'the cars' marked incorrect a significant number of times, it gives the 'suspicious word' feedback. It is not saying that the phrase is wrong, only that it may be wrong. Learners must think for themselves and decide, based on the context and on what they have learned about using the definite article.
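For readers who find it easier to follow in code, the rule described above can be sketched roughly as follows. This is only an illustration, not the system's actual implementation. The 90% threshold and the counts for 'ok' come from the text. The lower cut-off for 'suspicious word' feedback is not stated anywhere above, so the 50% value here is purely an assumption, as are all the names (`MarkingCounts`, `feedback_for`, and so on).

```python
from dataclasses import dataclass

ERROR_THRESHOLD = 0.90       # from the text: 90% or more gives word-level feedback
SUSPICIOUS_THRESHOLD = 0.50  # ASSUMPTION: the text gives no lower cut-off


@dataclass
class MarkingCounts:
    """Hypothetical record of how teachers treated a word or phrase."""
    marked_incorrect: int   # times teachers marked it as an error
    left_uncorrected: int   # times it occurred and was not marked

    @property
    def error_rate(self) -> float:
        """Share of all occurrences that teachers marked as incorrect."""
        total = self.marked_incorrect + self.left_uncorrected
        return self.marked_incorrect / total if total else 0.0


def feedback_for(counts: MarkingCounts) -> str:
    """Choose a feedback type from the marking rate alone (no context is used)."""
    rate = counts.error_rate
    if rate >= ERROR_THRESHOLD:
        return "error"
    if rate >= SUSPICIOUS_THRESHOLD:
        return "suspicious word"
    return "no feedback"


# The 'ok' example from the text: 1382 corrections, 641 left as written.
ok_counts = MarkingCounts(marked_incorrect=1382, left_uncorrected=641)
print(f"{ok_counts.error_rate:.1%}", feedback_for(ok_counts))
# prints: 68.3% suspicious word
```

Because the decision depends only on how often a word has been marked, and not on the sentence around it, a phrase such as 'the cars' receives the same feedback whether or not it is correct in a particular sentence. That is why the final decision is left to the learner.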