Automatic Text Scoring By Neural Network

Posted by Kaiyuan Chen on December 25, 2017

Automatic Text Scoring By Neural Network

by Dimitrios Alikaniotis, Helen Yannakoudakis, Marek Rei

This paper proposes to use RNN for ATS. My initial thought is that RNN highly depends on the ability to examine their characteristics, whether they measure what is intended to be measured and whether their internal marking criteria can be interpreted in a meaningful and useful way.


ATS: systems typically exploit a large range of tex-tual features that correspond to different proper-ties of text, such as grammar, vocabulary, style, topic relevance, and discourse coherence and co-hesion.



(by Collobert and Weston) introduce a NN that learns a distributed representation for each word w in a corpus based on its local context. Embedding of a word is learned by substituting target word to a randomly sampled word in dictionary discriminativly.

S = {w1, w2 … wt … wn} S’= {w1, w2 … wc … wn} ->substitute w_target to w_complement

The model learns word embeddings by ranking the activation of the true sequence S higher than the activation of its ‘noisy’ counterpart S’. The the activation of the true sequence S higher than the activation of its ‘noisy’ counterpart S’.

Augmented C&W

Goal is to capture score-specific word embeddings(SSWE), and then use linear output unit vector to do linear regression to predict essay score. Advantage is to have same word that conveys same information to depend on only linguisitic environment. Example in paper:

words such as computer and laptop are going to be placed together with their mis-spelled counterparts cop- muter and labtop


Bidirectional: feed the embedding of each word found in each essay to the LSTM one at a time, zero-padding shorter sequences.


None < bidirectional < word2vec < SSWE Thus the winner(of course) is SSWE + two layer BLSTM with pcc 0.96 and spearsman coefficient 0.91 The quality of word vectors is visualized by “a box”.


I think the whole problem is solved rather by capturing SSWE. LSTM offers a good way to align the sequence and predicting, but what we really care in ATS is to score a text, which requires to find information that determines the score. This problem is solved by augumented C&W model.