Plenary Lecture
New Approach for Pre-processing and Efficient Archiving
of Scanned Documents
Professor Roumen Kountchev
Faculty of Telecommunications
Technical University of Sofia
Bulgaria
E-mail: rkountch@tu-sofia.bg
Abstract: The efficient archiving of scanned documents
is one of the major contemporary challenges in image
processing. The still-image compression standards JPEG
and JPEG 2000 are very efficient for natural images, but
they perform worse on text and graphics: at relatively
high compression ratios the quality of the restored
images deteriorates significantly. Efficient
compression, and hence archiving, of document images
requires a more flexible approach adapted to the
peculiarities of the processed images.
Additional problems arise when compound images
containing both text and pictures have to be processed.
The best solution is to process each part with the
method that yields maximum efficiency for it.
A new approach for the efficient archiving of scanned
documents comprising text and pictures is presented
in this lecture. The approach compresses the pictures
and the text in different ways: the pictures with lossy
coding based on a decomposition called the Inverse
Difference Pyramid (IDP), and the parts containing text
or graphics with lossless Adaptive Run-Length (ARL)
coding.
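To illustrate why run-length coding suits binarized text regions, here is a minimal sketch of plain (non-adaptive) run-length coding applied to one row of black-and-white pixels. It is not the ARL coder of the lecture, whose details are not given in this abstract; the function names `rle_encode` and `rle_decode` are hypothetical.

```python
from itertools import groupby


def rle_encode(row):
    """Encode a sequence of pixel values as (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(row)]


def rle_decode(pairs):
    """Rebuild the original pixel sequence from (value, run_length) pairs."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out


# Illustrative row: long white (0) runs separated by short black (1) strokes.
row = [0] * 12 + [1] * 3 + [0] * 5 + [1] * 2 + [0] * 10
encoded = rle_encode(row)

# Lossless: decoding returns exactly the input row.
assert rle_decode(encoded) == row
print(len(row), "pixels ->", len(encoded), "runs")
```

Because text pages consist mostly of long uniform runs, the number of runs is typically far smaller than the number of pixels, and the round trip loses nothing.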
The processing comprises the following main steps:
- Image preprocessing: background filtering
(aimed at noise removal), histogram analysis, and
histogram modification;
- Image segmentation: identification of text and picture
regions;
- Image compression: an adaptive approach in which
the pictures are compressed with a lossy variant of
IDP compression and the text with lossless ARL
coding.
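The segmentation and adaptive compression steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not describe the actual segmentation criterion, so the block-wise test used here (nearly all pixels close to black or white means "text"), the thresholds, and the names `classify_block` and `plan_compression` are all hypothetical.

```python
def classify_block(block, dark=64, bright=192, threshold=0.9):
    """Label a grayscale block 'text' if nearly all pixels are near black or white.

    The dark/bright/threshold values are illustrative assumptions.
    """
    pixels = [p for row in block for p in row]
    extreme = sum(1 for p in pixels if p <= dark or p >= bright)
    return "text" if extreme / len(pixels) >= threshold else "picture"


def plan_compression(image, block_size=8):
    """Split an image (list of pixel rows) into blocks and pick a coder type for each."""
    plan = {}
    height, width = len(image), len(image[0])
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            block = [r[left:left + block_size] for r in image[top:top + block_size]]
            label = classify_block(block)
            # Text goes to a lossless coder, pictures to a lossy one.
            plan[(top, left)] = "lossless" if label == "text" else "lossy"
    return plan


# Synthetic 8x16 image: left half is black/white "text", right half a mid-gray ramp.
text_half = [[255 if (x + y) % 3 else 0 for x in range(8)] for y in range(8)]
picture_half = [[100 + 4 * x + 2 * y for x in range(8)] for y in range(8)]
image = [t + p for t, p in zip(text_half, picture_half)]

plan = plan_compression(image)
print(plan)
```

In a real system each entry of the plan would be dispatched to the corresponding coder; here the plan only records which kind of coder each block would receive.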
The experimental results obtained for a large number of
example documents processed with JPEG, JPEG 2000, and the
new method demonstrate the advantages of the presented
adaptive approach. The same approach is also very
efficient for archiving old handwritten documents.
The presented approach is based on research and
patents developed by the lecturer and his team at the
Technical University of Sofia, Bulgaria.
Brief Biography of the Speaker:
Roumen Kountchev, Ph.D., D.Sc., is a professor at the
Faculty of Telecommunications at the Technical
University of Sofia, Bulgaria, and the head of the Image
Processing Laboratory.
His main areas of interest are digital image
processing, image compression, multimedia watermarking,
video communications over the Internet, pattern
recognition, and neural networks. He has published 259
papers in journals and conference proceedings, has
written 12 books and book chapters, holds 20 patents,
and has participated in 46 scientific research projects
(in 38 of them as the principal investigator).
He is the President of the Bulgarian Association for
Pattern Recognition (BAPR), a member of the International
Association for Pattern Recognition (IAPR), a member of
the editorial board of the “International Journal of
Reasoning-based Intelligent Systems” (IJRIS), a member of
the Scientific Expert Commission of the Bulgarian
Ministry of Education and Science, President of the
Technological Council of Bulgarian National Radio, and a
member of the Higher Attestation Commission of the
Council of Ministers of Bulgaria.