Publisher
Springer, Berlin, Heidelberg
Authors: BinGe Cui, Xin Chen
Publish Date: 2010/8/18
Pages: 205-212
Abstract
In this paper, we propose an improved Hidden Markov Model (HMM) for extracting metadata from academic literature. We have built a dataset of 458 papers from the VLDB conferences, which records the visual features of their text blocks. Our approach is based on the assumption that text blocks on the same line share the same state (information type). This assumption holds in more than 98% of cases. Consequently, the probability of a transition between identical states is much larger within a line than across lines. Based on this observation, we add a second state transition matrix to the HMM and modify the Viterbi algorithm accordingly. Experiments show that our extraction accuracy exceeds that of existing approaches.
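The abstract does not give the details of the modified Viterbi algorithm, so the following is only a minimal sketch of one plausible reading: decoding switches between two transition matrices depending on whether the current text block lies on the same line as the previous one. The function name `viterbi_two_matrices`, the parameters `a_same` and `a_diff`, the `same_line` flags, and all probabilities in the usage example are illustrative assumptions, not values or names from the paper.

```python
import math


def _log(p):
    """Log of a probability; zero maps to -inf so impossible paths lose."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi_two_matrices(obs, same_line, pi, a_same, a_diff, b):
    """Most likely state sequence under an HMM with two transition matrices.

    obs       : non-empty list of observation symbol indices, one per text block
    same_line : same_line[t] is True if block t is on the same line as
                block t-1 (same_line[0] is ignored)
    pi        : initial state probabilities
    a_same    : transition matrix used within a line (assumed name)
    a_diff    : transition matrix used across lines (assumed name)
    b         : emission matrix, b[state][symbol]
    """
    n = len(pi)
    delta = [_log(pi[s]) + _log(b[s][obs[0]]) for s in range(n)]
    back = []  # back[t-1][j] = best predecessor of state j at step t

    for t in range(1, len(obs)):
        # The only change from standard Viterbi: pick the matrix per step.
        a = a_same if same_line[t] else a_diff
        new_delta, ptrs = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: delta[i] + _log(a[i][j]))
            ptrs.append(best_i)
            new_delta.append(
                delta[best_i] + _log(a[best_i][j]) + _log(b[j][obs[t]])
            )
        delta = new_delta
        back.append(ptrs)

    # Backtrack from the best final state.
    state = max(range(n), key=lambda s: delta[s])
    path = [state]
    for ptrs in reversed(back):
        state = ptrs[state]
        path.append(state)
    path.reverse()
    return path


# Toy example with two hypothetical states (0 = title, 1 = author).
pi = [0.9, 0.1]
a_same = [[0.99, 0.01], [0.01, 0.99]]  # same line: strongly keep the state
a_diff = [[0.3, 0.7], [0.5, 0.5]]      # new line: state changes are likely
b = [[0.6, 0.4], [0.4, 0.6]]
obs = [0, 1, 1]

print(viterbi_two_matrices(obs, [False, True, False], pi, a_same, a_diff, b))
# prints [0, 0, 1]
print(viterbi_two_matrices(obs, [False, False, False], pi, a_same, a_diff, b))
# prints [0, 1, 1]
```

In the toy example, marking the second block as sharing a line with the first keeps it in state 0 even though its observation alone favors state 1, which is the effect the same-line assumption is meant to produce.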