How to improve the recognition rate of OCR software

When keying in text, we often get customers who want printed manuscripts entered. These can be typed with Wubi or another input method, but if a scanner is available, OCR (optical character recognition) software can capture the text much faster, saving both time and effort. Some customers supply poor-quality originals, such as photocopies or dot-matrix printer drafts. Scanned and recognized with the usual settings, these give a recognition rate so low that some operators would rather type the text than use OCR. In my experience, however, as long as the OCR software is set up properly, the recognition rate stays high and can generally exceed 95%.

The following points deserve attention during operation:

1. Scan interface settings

Scan through the scanner's TWAIN interface, which is intuitive and convenient to operate. Set the resolution to 200 or 300 lines/inch (dpi). Brightness is generally left on automatic.
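To get a feel for what the two resolutions mean in practice, the following minimal sketch computes the pixel size of a scanned page. It assumes that "lines/inch" in the setting corresponds to dots per inch and that the page is A4 (210 mm x 297 mm); the function name `scan_size_pixels` is made up for illustration and is not part of any scanner software.

```python
MM_PER_INCH = 25.4


def scan_size_pixels(width_mm, height_mm, dpi):
    """Return (width, height) in pixels for a page scanned at the given dpi."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px, height_px


# A4 page at the two resolutions mentioned in the text (assumed page size).
for dpi in (200, 300):
    width_px, height_px = scan_size_pixels(210, 297, dpi)
    print(f"{dpi} dpi: {width_px} x {height_px} pixels")
```

At 300 dpi the page holds a little more than twice as many pixels as at 200 dpi, so small or faint characters get more detail at the cost of a larger image.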

2. TWAIN scan settings

Set the original type to line drawing (black-and-white line art) and set the threshold fairly high, generally between 40% and 75%. If the original is relatively clear, set the threshold between 40% and 55%. If the original is of poor quality, set it between 60% and 75%.
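The effect of the threshold can be illustrated with a small sketch. It assumes the threshold is a percentage of the 0-255 gray range and that any pixel darker than the cutoff is turned black; real TWAIN drivers may define the setting differently. The names `threshold_range` and `binarize` are hypothetical helpers, not functions of any scanner driver.

```python
def threshold_range(original_is_clear):
    """Return the (low, high) threshold range in percent suggested in the text."""
    return (40, 55) if original_is_clear else (60, 75)


def binarize(gray_rows, threshold_percent):
    """Convert rows of 0-255 gray values to black (0) and white (255).

    Assumption: a pixel darker than the cutoff becomes black, so a higher
    threshold keeps more of the faint strokes.
    """
    cutoff = 255 * threshold_percent / 100
    return [[0 if value < cutoff else 255 for value in row] for row in gray_rows]


# One row of pixels: paper, a faint gray stroke, paper.
faint_stroke = [[250, 140, 250]]
print(binarize(faint_stroke, 45))  # stroke is lost: [[255, 255, 255]]
print(binarize(faint_stroke, 70))  # stroke is kept: [[255, 0, 255]]
```

Under this assumption a faint stroke disappears at a low threshold and survives at a high one, which matches the advice to raise the threshold for poor originals. Raising it too far also turns background smudges black.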

3. Operations during OCR recognition

Once the scanned page is in the OCR software, first correct any tilt. Correction can be automatic or manual. In general, the automatic correction function is enough. In special cases, correct the tilt manually by holding down the right mouse button and dragging a straight line perpendicular to the text. Second, clean up the page by deleting dirty spots and any parts that do not need to be recognized (circle the dirty or redundant area, then delete it). Third, run layout analysis. Under normal circumstances, automatic layout analysis gives good results. For manuscripts with more complicated layouts, such as originals that mix horizontal and vertical text, manual layout analysis must be used: draw frames around the horizontal and vertical sections of the page and mark them as horizontal text and vertical text respectively. (Note: the OCR software I use is Tsinghua TH-OCR for HP.)
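The clean-up step above is done by hand in the OCR software. As a rough illustration of the same idea in code, the sketch below erases small isolated groups of black pixels from a black-and-white bitmap. This is a generic connected-component filter, not the method used by TH-OCR; the function name `remove_specks` and the `min_pixels` parameter are assumptions made for the example.

```python
from collections import deque


def remove_specks(bitmap, min_pixels=3):
    """Erase connected groups of black pixels smaller than min_pixels.

    bitmap is a list of rows where 1 = black (ink) and 0 = white (paper).
    Returns a new bitmap; the input is left unchanged.
    """
    height = len(bitmap)
    width = len(bitmap[0]) if height else 0
    cleaned = [row[:] for row in bitmap]
    seen = [[False] * width for _ in range(height)]

    for y in range(height):
        for x in range(width):
            if bitmap[y][x] != 1 or seen[y][x]:
                continue
            # Collect one 4-connected component with a breadth-first search.
            component = []
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and bitmap[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Components below the size limit are treated as dirt and erased.
            if len(component) < min_pixels:
                for cy, cx in component:
                    cleaned[cy][cx] = 0
    return cleaned


page = [
    [1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 1],
]
for row in remove_specks(page):
    print(row)
```

A size limit that is too large would also erase small punctuation marks, so any such filter needs to be tuned against the scan resolution; deleting by hand, as described above, avoids that risk.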

These steps improve the recognition rate and basically guarantee that it stays above 95%. After recognition is complete, correct any characters flagged as suspicious and any remaining errors. With that, the job is essentially done.
