OCRopus 0.4
OCRopus is a state-of-the-art document analysis and OCR system featuring pluggable layout analysis, pluggable character recognition, statistical natural language modeling, and multilingual capabilities.

The OCRopus engine is based on two research projects: a novel high-performance layout analysis method, and a high-performance handwriting recognizer developed in the mid-1990s and deployed by the US Census Bureau.

OCRopus development is sponsored by Google, and the system is initially intended for high-throughput, high-volume document conversion efforts.

What's New in This Release:
· image understanding code is now in a separate project (www.iulib.org)
· moved to the autoconf/automake build system
· support for beam search decoding (OpenFST is now optional)
· fixed memory leaks in the Voronoi page segmentation code
· ported the Voronoi page segmentation code to 64-bit architectures
· word segmentation
· refactored text/image segmentation code
· optional access to text/image segmentation from Leptonica
· inclusion of images and other non-text zones in the hOCR output
· optional inclusion of character boxes in the hOCR output
· improved edit distance with block movement for OCR evaluation
· scripts for evaluating layout and OCR performance
· bug fixes in bpnet (normalization of input data, correct loading of a bpnet)
· feature selection and feature size selection, a bug fix in the skeleton feature, and a new connected component segmenter for training and testing
· scripts for training and testing bpnet, with documentation
· better loggers for training and testing (confusion matrix, lineinfo, segmenter, fst, etc.)
· more accurate text line analysis
Link: http://ocropus...files/ocropus-0.4.tar.gz