Extracting Text from an Image (OCR)

Extract text from an image.

This post shows how to extract text from an image in Python using the Tesseract OCR engine and the pytesseract package.

Install tesseract.exe from here. The script also needs the pytesseract and Pillow Python packages.

First, here is an image that contains some text (saved as sample.jpg).

And here is the full code to extract the text from the above image.

# Pillow (the maintained fork of the Python Imaging Library, PIL) is a free and
# open-source library that adds support for opening, manipulating, and saving
# many different image file formats.
from PIL import Image
# Python-tesseract (pytesseract) is an optical character recognition (OCR) tool for Python.
import pytesseract

# Path to the Tesseract executable; adjust it if Tesseract is installed elsewhere.
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe'

img = "sample.jpg"
# Run OCR on the image and get the recognized text back as a string.
t1 = pytesseract.image_to_string(Image.open(img))
print(t1)

# Save the recognized text; the with block closes the file automatically.
with open('recognized.txt', 'w', encoding='utf-8') as file1:
    file1.write(t1)
print("File created successfully!")

And the output from the above program is:

The program extracts only the characters from the image. The extracted string is also stored in a text file (recognized.txt) using file handling.
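The script above hard-codes the location of tesseract.exe, which only works when Tesseract is installed in that exact folder. As a minimal sketch, the same steps could be wrapped in small functions that first look for Tesseract on the system PATH and only then fall back to the Windows path from the script. The helper names find_tesseract, save_text, and extract_text are hypothetical and not part of pytesseract; the fallback path is simply the one used in the script above.

```python
import shutil
from pathlib import Path

# Fallback location copied from the script above; an assumption about the install folder.
DEFAULT_WINDOWS_PATH = r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"


def find_tesseract(default=DEFAULT_WINDOWS_PATH):
    """Return a path to the Tesseract executable, or None if it cannot be found."""
    on_path = shutil.which("tesseract")  # look on the system PATH first
    if on_path:
        return on_path
    if Path(default).is_file():  # otherwise try the assumed default location
        return default
    return None


def save_text(text, out_path="recognized.txt"):
    """Write the recognized text to a file and return the file path."""
    Path(out_path).write_text(text, encoding="utf-8")
    return out_path


def extract_text(image_path):
    """Run OCR on an image file and return the recognized text."""
    # Imported here so the helpers above work even without the OCR packages installed.
    from PIL import Image
    import pytesseract

    exe = find_tesseract()
    if exe is None:
        raise FileNotFoundError("Tesseract executable not found")
    pytesseract.pytesseract.tesseract_cmd = exe
    with Image.open(image_path) as image:
        return pytesseract.image_to_string(image)
```

With sample.jpg in the working directory, calling save_text(extract_text("sample.jpg")) would perform the same two steps as the script: recognize the text, then store it in recognized.txt.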
