Evaluation within the Scope of OCR Workflows - ARIA
Stefan Pletschacher
PRImA Lab, School of Computing, Science and Engineering, University of Salford

ABSTRACT.

OCR (Optical Character Recognition) has increasingly become a synonym for intricate document analysis systems that go far beyond the core task of classifying pixel patterns. Depending on the nature of the source material, it is common practice to employ a whole range of loosely and/or tightly coupled pre-processing, recognition, and post-processing methods in order to achieve good overall results. Historical documents, for instance, often suffer from physical deterioration, which leaves artefacts in the scanned images, and can therefore benefit greatly from image enhancement as part of the pre-processing stage. For modern business documents, at the other end of the spectrum, it is easier to obtain scans of reasonable quality, but the textual content might prove difficult to recognise unless background knowledge (for example from customer databases or accounting systems) is incorporated during recognition and post-processing.

Evaluation of complex OCR workflows should therefore not only measure the accuracy of the final output but also target all intermediate stages. This can help to reveal bottlenecks and to identify further optimisation potential in the overall system. It can also provide detailed data for making informed decisions when, especially in mass digitisation, project-specific requirements regarding cost, quality and time are to be met.

Objective and reproducible evaluation of OCR workflows depends on a number of factors. One important aspect is the selection of datasets, which have to be representative of the collection and of the problem that is to be examined. In order to measure the success of a method on a given sample, it is necessary to have the corresponding ground truth (the true and/or expected result for this particular task) available. Unfortunately, the creation of ground truth is typically a manual task and therefore extremely time-consuming and costly. As a consequence, it is very difficult to produce datasets of an acceptable size, especially if only limited resources are available. Semi-automated tools for ground truth production can play an important role in overcoming this problem.

For storing ground truth and processing results, it is crucial to rely on mature formats which allow a very accurate representation (such as polygons instead of simple bounding boxes for region outlines) and which can be processed by automated evaluation tools. Workflow frameworks can then be used as experimental environments by setting up whole processing chains that include evaluation points for individual methods and for the overall system. In order to arrive at a practical interpretation of the results, it is important to put the measured facts (metrics) into the context of use scenarios (weights) that reflect the needs of concrete digitisation projects.
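
The idea of interpreting metrics through scenario-specific weights can be illustrated with a minimal sketch. It is not the evaluation method of any particular tool: the stage names, the success rates, the two scenarios and the `weighted_score` helper are all hypothetical and chosen only for illustration, and a weighted average is just one plausible way to combine the values.

```python
from typing import Dict


def weighted_score(metrics: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine per-stage success rates (0..1) into one scenario-specific score.

    A plain weighted average; this is an assumption for illustration,
    not a formula taken from the abstract.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    missing = set(weights) - set(metrics)
    if missing:
        raise KeyError(f"no measurement for stage(s): {sorted(missing)}")
    return sum(metrics[stage] * w for stage, w in weights.items()) / total_weight


# Hypothetical success rates measured at three evaluation points of a workflow.
metrics = {"binarisation": 0.9, "segmentation": 0.8, "recognition": 0.7}

# Hypothetical use scenarios: the same measurements, weighted differently.
full_text_search = {"binarisation": 1.0, "segmentation": 1.0, "recognition": 2.0}
layout_preservation = {"binarisation": 1.0, "segmentation": 3.0, "recognition": 1.0}

search_score = weighted_score(metrics, full_text_search)
layout_score = weighted_score(metrics, layout_preservation)
print(f"full-text search:    {search_score:.3f}")
print(f"layout preservation: {layout_score:.3f}")
```

With these made-up numbers the same workflow scores slightly better for the layout-oriented scenario than for the search-oriented one, because its weakest stage (recognition) carries less weight there. The measurements stay fixed; only their interpretation changes with the project's needs.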