I use ShareX as my main screen capture tool because it can take care of modifying, uploading, and storing screenshots, screen recordings, and GIFs all in one place. However, I still have two different Windows hotkeys (Win+Shift+S and Ctrl+PrtSc) for taking screenshots, which is redundant. This got me thinking about other uses I could put the Snip & Sketch tool to.
![[Portable_scanner_and_OCR_(video) 1.webm]]
Title: Portable scanner and OCR (video).webm
Author: Vassia Atanassova - Spiritia
Date: 28 February 2017
Optical character recognition (OCR) is a technology that converts documents, such as scanned paper documents, PDF files, or images captured by a digital camera, into editable and searchable data. The Tesseract OCR engine, whose development Google has sponsored, is arguably the most popular out-of-the-box solution for OCR.
It’s only fitting to remedy hotkey redundancy from open-source tools with open-source OCR.
Tesseract is a free-software optical character recognition engine originally developed by Hewlett-Packard in the 1980s. It was released as open source in 2005, and its development has since been sponsored by Google. Tesseract supports Unicode and can recognize over 100 languages; it can also be trained to recognize others.
Tesseract was initially developed at Hewlett-Packard Laboratories Bristol and at Hewlett-Packard Co. in Greeley, Colorado, between 1985 and 1994. Some additional changes were made in 1996 to port the software to Windows, and again in 1998 to convert some of the code to C++.
Digitizing documents through OCR data entry has many benefits, including the ability to transfer important documents to tablets, computers, and smartphones. This can be helpful for businesses in various industries, including banking, mortgage, financial, legal, and healthcare. Some commonly digitized documents include invoices, industry articles, tax documents, payroll information, legal filings, contact information, business cards, flyers, and financial investments.
The process of copying text from an image using OCR involves three simple steps:

1. Monitoring a folder for new screen clips.
2. Passing the latest screen clip through the OCR system.
3. Copying the results from Tesseract OCR to the clipboard.
![[Tesseract_OCR_pipeline_architecture 1.png]]
This can be done with a few lines of code using the pyscreenshot, pytesseract, and pyperclip libraries.
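A minimal sketch of steps 2 and 3 follows. It assumes the Tesseract executable is installed and on the `PATH` (otherwise `pytesseract.pytesseract.tesseract_cmd` has to point at it). The helper names `clean_text`, `ocr_image_to_clipboard`, and `grab_and_copy` are made up for illustration, and the `ocr` and `copy` parameters exist only so the function can be tried out without the real libraries:

```python
def clean_text(raw: str) -> str:
    """Strip trailing whitespace and drop the blank lines OCR output often contains."""
    lines = [line.rstrip() for line in raw.splitlines()]
    return "\n".join(line for line in lines if line.strip())


def ocr_image_to_clipboard(image, ocr=None, copy=None) -> str:
    """Run OCR on `image`, copy the cleaned text to the clipboard, and return it.

    `ocr` and `copy` are hypothetical injection points; by default they fall
    back to pytesseract.image_to_string and pyperclip.copy.
    """
    if ocr is None:
        import pytesseract  # imported lazily so the helpers load without it
        ocr = pytesseract.image_to_string
    if copy is None:
        import pyperclip
        copy = pyperclip.copy

    text = clean_text(ocr(image))
    copy(text)
    return text


def grab_and_copy(bbox=None) -> str:
    """Grab the screen (or the region `bbox`) with pyscreenshot and OCR it."""
    import pyscreenshot as ImageGrab

    image = ImageGrab.grab(bbox=bbox)  # bbox is (x1, y1, x2, y2) or None for full screen
    return ocr_image_to_clipboard(image)
```

Calling `grab_and_copy()` captures the whole screen, while something like `grab_and_copy(bbox=(0, 0, 800, 600))` restricts the capture to a region. How well the text comes out depends on the image, so the cleanup in `clean_text` is deliberately conservative.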
This simple pipeline can be further automated using the Watchdog library to monitor a folder for new screenshots and the PyAutoGUI library to send the screen-clip hotkey.
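The folder-monitoring step might look roughly like the sketch below. The folder path, the set of image suffixes, the half-second grace period, and the function names (`is_image`, `latest_image`, `watch_folder`, `open_snipping_overlay`) are all assumptions for illustration. Note that PyAutoGUI is used here only to send the Win+Shift+S key combination; it does not register a global hotkey.

```python
import time
from pathlib import Path

# Assumed set of screenshot formats; adjust to whatever the capture tool saves.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp"}


def is_image(path) -> bool:
    """Return True if the file name has one of the assumed image suffixes."""
    return Path(path).suffix.lower() in IMAGE_SUFFIXES


def latest_image(folder):
    """Return the most recently modified image in `folder`, or None if there is none."""
    images = [p for p in Path(folder).iterdir() if p.is_file() and is_image(p)]
    return max(images, key=lambda p: p.stat().st_mtime, default=None)


def watch_folder(folder, on_image):
    """Call `on_image(path)` for every new image created in `folder` until Ctrl+C."""
    from watchdog.events import FileSystemEventHandler
    from watchdog.observers import Observer

    class NewClipHandler(FileSystemEventHandler):
        def on_created(self, event):
            if not event.is_directory and is_image(event.src_path):
                time.sleep(0.5)  # assumed grace period so the file is fully written
                on_image(Path(event.src_path))

    observer = Observer()
    observer.schedule(NewClipHandler(), str(folder), recursive=False)
    observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()


def open_snipping_overlay():
    """Send Win+Shift+S to bring up the Windows screen-clip overlay."""
    import pyautogui

    pyautogui.hotkey("win", "shift", "s")
```

With a hypothetical `clips_dir` pointing at the folder where screen clips are saved, `watch_folder(clips_dir, lambda p: pyperclip.copy(pytesseract.image_to_string(str(p))))` would wire the watcher to the OCR and clipboard steps, since `pytesseract.image_to_string` also accepts a file path.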
The pipeline can be extended further by using the Windows Task Scheduler to start it automatically on startup.
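One way to register such a task is to call the built-in `schtasks` command from Python, sketched below. The task name and script path are placeholders, and the `ONLOGON` trigger is an assumption: it starts the script when the user logs on rather than at system boot, which seems the better fit for a tool that needs the user's clipboard.

```python
import subprocess
import sys


def build_schtasks_command(task_name, script_path, python_exe=sys.executable):
    """Build the `schtasks /Create` argument list for a run-at-logon task."""
    # Quote both paths so that spaces in either one survive in the task's command line.
    task_run = f'"{python_exe}" "{script_path}"'
    return [
        "schtasks", "/Create",
        "/TN", task_name,   # task name shown in Task Scheduler
        "/TR", task_run,    # command the task runs
        "/SC", "ONLOGON",   # assumed trigger: at user logon
        "/F",               # overwrite an existing task with the same name
    ]


def register_startup_task(task_name, script_path):
    """Create the scheduled task; raises CalledProcessError if schtasks fails."""
    subprocess.run(build_schtasks_command(task_name, script_path), check=True)
```

Passing the path to `pythonw.exe` as `python_exe` should keep a console window from opening each time the task runs. The task can be removed again with `schtasks /Delete /TN <name>`.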
Tesseract is a powerful OCR tool for converting images to text. Combined with Snip & Sketch and a few lines of code, it becomes a simple pipeline that automates extracting text from images.