OK, knowing that you want to do OCR helps.
I'll second everything Sanwin said. OneNote 2010 has taken over the "native" OCR function in Office. Give it a try. But as Sanwin pointed out, the various tools perform differently on the same file/image, so having more than one on hand is a good idea.
One of the features I've learned to look for when trying OCR tools is how they handle lines and paragraphs of text. Many of the older / cheaper (?) / too-literal tools put each line of recovered text into a separate text box or frame. They do it to maintain absolute control over text placement. But I find that too strict an interpretation of the OCR concept, and it makes editing the result effectively impossible because each line is treated as a separate entity. I look for tools that at least have the option of recovering "text only", ignoring formatting. That often gives me the most usable extraction, because I am more interested in getting editable text than in recreating the original document in pseudo-editable form.
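If a tool only gives you one fragment per line, you can often repair the result afterwards. Here is a minimal Python sketch of that idea, assuming you have already pulled the text of each line out of its text box into a list of strings. The function name and the rules (a blank line ends a paragraph, a trailing hyphen means a word was split across lines) are my own illustrative assumptions, not how any particular OCR tool works.

```python
def merge_lines(lines):
    """Join per-line OCR fragments into paragraphs.

    Assumed rules: a blank line ends a paragraph, and a line ending
    in a hyphen is a word split across two lines.
    """
    paragraphs = []
    current = ""
    for raw in lines:
        line = raw.strip()
        if not line:
            # Blank line: close the paragraph collected so far.
            if current:
                paragraphs.append(current)
                current = ""
            continue
        if current.endswith("-"):
            # Rejoin a word that was hyphenated at the line break.
            current = current[:-1] + line
        elif current:
            current += " " + line
        else:
            current = line
    if current:
        paragraphs.append(current)
    return paragraphs


if __name__ == "__main__":
    sample = ["Each line came back in its", "own sepa-", "rate text box.", "", "Next paragraph."]
    for paragraph in merge_lines(sample):
        print(paragraph)
```

The hyphen rule is crude: it will also glue together genuinely hyphenated words that happen to fall at a line end, so check the output by eye.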
Here is some more info I've collected about doing OCR:
OCR- HANDWRITING RECOGNITION IN OFFICE
http://office.microsoft.com/en-us/help/handwriting-recognition-in-office-HA001045125.aspx
Insert_PDFs_to_Word (not exactly OCR)
http://www.officeexpander.com/insertpdf/index.php
Insert and resize multiple pages of a PDF into Word
Pages from a PDF can be inserted as individual PDFs, or as images that can be resized into a Word document.
OCR - Optical Character Recognition
One of the key features I look for when testing an OCR tool is how it handles text location. Some are overly “OCD” about line positioning, so they put every line in its own text box. That makes editing the text in a paragraph impossible. When I see that “feature” I automatically uninstall that trial.
Note: Word 2013 has built-in OCR
FYI: Word 2013 has a form of PDF OCR built in. You can open a PDF file and Word will automatically copy only the “copyable” text in the PDF into a new DOCX.
It does not capture text that is part of an image in the PDF.
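That distinction (selectable text comes through, text inside images does not) suggests a quick triage before you decide which pages still need a real OCR pass. Here is a minimal Python sketch, assuming you already have the extracted text of each page as a list of strings from whatever PDF library you use. The function name and the 20-character threshold are my own illustrative choices and have nothing to do with how Word works internally.

```python
def pages_needing_ocr(page_texts, min_chars=20):
    """Return 1-based page numbers whose extracted text is empty or nearly so.

    page_texts: one string (or None) per page, as produced by a PDF
    text-extraction step that is assumed to have run already.
    min_chars: arbitrary cutoff below which a page is treated as image-only.
    """
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if len((text or "").strip()) < min_chars
    ]


if __name__ == "__main__":
    pages = ["This page has a normal, selectable text layer.", "", "   \n"]
    print(pages_needing_ocr(pages))
```

Pages it flags are the ones to send to OneNote or another OCR tool; the rest can be opened in Word directly.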
How to Scan a Document into Word 2010 - <ALT><i><p><s>
<ALT><i><p><s> is the menu-based shortcut for the Insert menu / Picture command / From Scanner or Camera command. It does not map directly to the Ribbon. The scan command is no longer available on the Ribbon. You can add it to the QAT.
OCR anything with OneNote 2007 and 2010
http://www.howtogeek.com/howto/14595/ocr-anything-with-onenote-2007-and-2010/
Quality OCR software can often be very expensive, but you may have one already installed on your computer that you didn’t know about. Here’s how you can use OneNote to OCR anything on your computer.
Please Note: This feature is available in OneNote 2007 and 2010. OneNote 2007 is included with Office 2007 Home and Student, Enterprise, and Ultimate, while OneNote 2010 is included with all editions of Office 2010 except for Starter edition.
OCR anything
First, let’s add something to OCR into OneNote. There are many different ways you can add items to OCR into OneNote. Open a blank page or one you want to insert something into, and then follow these steps to add what you want into OneNote.
Picture
Simply drag-and-drop a picture with text into a notebook…
You can insert a picture directly from OneNote as well. In OneNote 2010, select the Insert tab, and then choose Picture.
In OneNote 2007, select the Insert menu, select Picture, and then choose From File.
Screen Clipping
There are many times we’d like to copy text from something we see onscreen, but there is no direct way to copy text from that thing. For instance, you cannot copy text from the title bar of a window, or from a Flash-based online presentation. For these cases, the Screen Clipping option is very useful. To add a clip of anything onscreen in OneNote 2010, select the Insert tab in the ribbon and click Screen Clipping.
In OneNote 2007, either click the Clip button on the toolbar or select the Insert menu and choose Screen Clipping.
Alternatively, you can take a screen clipping by pressing the Windows key + S.
When you click Screen Clipping, OneNote will minimize, your desktop will fade lighter, and your mouse pointer will change to a plus sign. Now, click and drag over anything you want to add to OneNote. Here we’re selecting the title of this article.
The section you selected will now show up in your OneNote notebook, complete with the date and time the clip was made.
Insert a file
You’re not limited to pictures; OneNote can even OCR anything in most files on your computer. You can add files directly in OneNote 2010 by selecting File Printout in the Insert tab.
In OneNote 2007, select the Insert menu and choose Files as Printout.
Choose the file you want to add to OneNote in the dialog.
Select Insert, and OneNote will pause momentarily as it processes the file.
Now your file will show up in OneNote as a printout with a link to the original file above it.
You can also send any file directly to OneNote via the OneNote virtual printer. If you have a file open, such as a PDF, that you’d like to OCR, simply open the print dialog in that program and select the “Send to OneNote” printer.
Or, if you have a scanner, you can scan documents directly into OneNote by clicking Scanner Printout in the Insert tab in OneNote 2010.
In OneNote 2007, to add a scanned document select the Insert menu, select Picture, and then choose From Scanner or Camera.
OCR the image, file, or screenshot you put in OneNote
Now that you’ve got your stuff into OneNote, let’s put it to work. OneNote automatically did an OCR scan on anything you inserted into OneNote. You can check to make sure by right-clicking on any picture, screenshot, or file you inserted. Select “Make Text in Image Searchable” and then make sure the correct language is selected.
Now, you can copy text from the Picture. Simply right-click on the picture, and select “Copy Text from Picture”.
And here’s the text that OneNote found in this picture:
OCR anything with OneNote 2007 and 2010 - Windows Live Writer
Not bad, huh? Now you can paste the text from the picture into a document or anywhere you need to use the text.
If you are instead copying text from a printout, it may give you the option to copy text from this page or all pages of the printout.
This works exactly the same way in OneNote 2007.
In OneNote 2010, you can also edit the text OneNote has saved in the image from the OCR. This way, if OneNote read something incorrectly you can change it so you can still find it when you use search in OneNote.
Additionally, you can copy only a specific portion of the text from the edit box, so it can be useful just for general copying as well. To do this, right-click on the item and select “Edit Alt Text”.
Here is the window to edit alternate text. If you want to copy only a portion of the text, simply select it and press Ctrl+C to copy that portion.
Searching
OneNote’s OCR engine is very useful for finding specific pictures you have saved in OneNote. Simply enter your search query in the search box at the top right, and OneNote will automatically find all instances of that term in all of your notebooks. Notice how it highlights the search term even in the image!
This works the same in OneNote 2007. Notice how it highlighted “How-to” in a shot of the header image in our favorite website.
In Windows Vista and 7, you can even search for things OneNote OCRed from the Start Menu search. Here the Start Menu search found the words “Windows Live Writer” in our OCR Test notebook in OneNote where we inserted the screen clip above.
Conclusion
OneNote is a very useful OCR tool, and can help you capture text from just about anything. Plus, since you can easily search everything you have stored in OneNote, you can quickly find anything you insert anytime. OneNote is one of the least-used Office tools, but we have found it very useful and hope you do too.
Converting PDF to Word Documents
http://word.tips.net/T000096_Converting_PDF_to_Word_Documents.html
PDF to OneNote
If you can’t import the PDF into OneNote, you can use the OneNote printer driver to “Print to OneNote”.
MODI on 64 bit Windows
http://social.msdn.microsoft.com/forums/en-US/isv/thread/f3cf6478-d6c8-42dd-97b3-bc54d30719d2/
aaronmartinez
I have a workaround that is annoying but it works. For 64 bit users, the image writer can be installed in a virtual machine, such as Virtual PC or Sun's free Virtualbox.
I used the XP Mode in Windows 7 Ultimate 64 bit and was able to install and use the 2003 image writer.
XPS2OneNote - Print to OneNote 2007 on 64 bit Vista or Windows 7 OS
https://xps2onenote.codeplex.com/
2003 Office Document Imaging - Part 1 – Using MODI
http://news.office-watch.com/t/n.aspx?a=261
Helen Bradley looks at Office Document Scanning, which is a small dedicated interface for scanning documents.
2003 Office Document Imaging - Part 2
http://office-watch.com/t/n.aspx?a=262&z=0
17 August 2005 - Office Watch
We look at the faux printer that comes with Office 2003 plus the Document Imaging Tool that brings the scanning tool (mentioned in part 1) and the imaging writer together into a useful trio.
Using MODI in Office 2010
http://www.brighthub.com/computing/windows-platform/articles/91749.aspx
Replace MODI in 2010
http://en.wikipedia.org/wiki/Microsoft_Office_Document_Imaging
If running Office 2010 which lacks MODI, here are a few alternatives (among others):
Missing your 2007 Microsoft Document Imaging, MODI, in Office 2010? – Get it free with SharePoint Designer 2007
https://msmvps.com/blogs/steveb/archive/2010/10/12/missing-your-microsoft-document-imaging-in-office-2010.aspx
A quick Word trick for typing text into a scanned document
http://www.techrepublic.com/blog/msoffice/a-quick-word-trick-for-typing-text-into-a-scanned-document/8092?tag=nl.e064
OCR Tools
OCR- FreeOCR
Take a look at this article about a free open source OCR tool called FreeOCR V3:
http://www.worldstart.com/tips/tips-pr.php/7225
I've only played with it for a few minutes, but for the most part I like what it does.
It is a fairly simple OCR program compared to many. It extracts text. It doesn't worry about formatting or graphics, just extracts text only. 1 page at a time. It works better if you select blocks around graphics.
It is slow, but accurate when you keep it away from graphical elements.
OCR - LibreOffice - Import PDFs with ease
In previous iterations of OpenOffice, the PDF import was a kludge at best. Although LibreOffice handles the importing of PDFs the same way (imports them as LibreOffice Draw documents), the results are far better. Once you have imported your PDF (you do so by using the Open dialog), you can edit text, images, and layouts by clicking and dragging (for images/layout) or double-clicking a line of text to edit. (You can edit only a single line of text at a time.) When you finish editing your PDF, don’t save the document; instead, export it as a PDF by clicking File | Export as PDF.
Extract Text from Images: 10 OCR Tools Compared (see the comments)
http://www.howtogeek.com/96712/extract-text-from-images-10-ocr-tool-compared/
How to extract text from images: a comparison of 10 free OCR tools
http://www.freewaregenius.com/2011/11/01/how-to-extract-text-from-images-a-comparison-of-free-ocr-tools/
Online OCR services
- **Google Docs**
- **Free Online OCR**
- **i2OCR**
- **OCRonline**
- **Online OCR**
Desktop software
- **Cuneiform OpenOCR**
- **FreeOCR**
- **gImageReader**
- **Puma.NET**
- **SimpleOCR**
iOrgSoft PDF Converter
http://www.convertapdftoword.com/
Trial limited to 5 pages. Tried 7 pdfs. 1 failed to open, 1 wasn’t very good, the other 5 were reasonable.
Each line is treated as a paragraph
Recosoft PDF2Office Personal - 5 use / 5 page limited “Free” trial.
http://www.recosoft.com/pdf-converter.htm
Convert PDF to Word. Other packages on site include Excel and PowerPoint converters too.
Has 4 modes: Flowing Text, Retain Layout, Text Only, Images only.
Refused to read the first 5 PDFs I tried (65451 file format error).
The 5 files I tried had mixed results. I tried Text flow and Text only. One of the text flow trials had a lot of embedded graphics, and the program had a problem placing them correctly in the text. Text extraction was generally done well enough. Some strange additional paragraph breaks were inserted.
Worth considering.
Boxoft PDF to Word
http://www.boxoft.com/pdf-to-word/
Boxoft PDF to Word is 100% freeware to convert Adobe PDF documents to Microsoft Word files. By using the efficient software, you can batch convert portable PDFs to editable Word files while preserving the original formatting: text, images, column and row layout. The program also provides a Hot Directory Mode to help convert PDF files written to a folder to DOC format automatically. Most of all, this program is freeware; you can use this converter for either commercial or personal purposes as you will.
FREE PDF to WORD CONVERTER
http://www.hellopdf.com/
Free PDF to Word Doc Converter is a desktop document conversion tool to convert Adobe PDF files to Microsoft Word Doc files - and it's totally FREE!
The program can extract text, images, and shapes from a PDF file to a Word Doc file and preserve the layout. It can convert all the pages, or any page range of the PDF file.
And it is a standalone program - you can convert PDF to Word Doc without Adobe Acrobat Reader or Microsoft Word installed!
Solid Converter PDF (to Word)
http://www.techrepublic.com/software/solid-converter-pdf-71-build-934-windows/3807121
Use our easy Conversion Wizard to walk through the process step-by-step, or quickly convert PDFs to DOC or RTF files with a few clicks. You can download and try out Solid Converter for a free 15-day trial. Solid Converter PDF is ideal for anyone who needs to edit and reuse content from a PDF file.
Soft Solutions Image to OCR Converter
http://products.softsolutionslimited.com/img2ocr/index.htm
Convert PDF to HTML
Many PDF to DOC converters use text boxes to control exact placement of lines on the page. Conversion to HTML does not require that exact control of line placement.
- Each line separate: Some PDF to HTML
- Paragraph maintained: PDF Online
- Breaks each PDF page into a separate HTML page: Some PDF to HTML
- Creates a single HTML document for the whole PDF: PDF Online
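To show why HTML output sidesteps the text box problem, here is a minimal Python sketch that writes a list of recovered paragraphs as one HTML document, one `<p>` per paragraph, and leaves line wrapping to the browser. The function name and default title are hypothetical, and this is not how either converter listed above works.

```python
from html import escape


def paragraphs_to_html(paragraphs, title="Converted PDF"):
    """Build a single HTML document with one <p> element per paragraph."""
    # escape() protects characters such as & and < that are special in HTML.
    body = "\n".join(f"<p>{escape(p)}</p>" for p in paragraphs)
    return (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="utf-8"><title>{escape(title)}</title></head>\n'
        f"<body>\n{body}\n</body></html>\n"
    )


if __name__ == "__main__":
    print(paragraphs_to_html(["First paragraph of recovered text.", "Second paragraph."]))
```

Because each paragraph is a single element, the result can be opened in Word and edited as flowing text.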
5 Free Online Converter to Convert PDF to Word
http://www.docxtopdf.org/5-free-online-converter-to-convert-pdf-to-word/
doc2pdf
Free File Converter
Zamzar
Convertonlinefree.com
Pdftoword.com
3 Online Converters to Convert PDF to Excel
http://www.docxtopdf.org/3-online-converter-to-convert-pdf-to-excel/
Getting Rid of a Bunch of Frames
http://word.tips.net/T001664_Getting_Rid_of_a_Bunch_of_Frames.html
How to Make Scanning Big Pictures Easy With (Freeware) Microsoft ICE – Image Composite Editor
http://www.howtogeek.com/100500/how-to-make-scanning-big-pictures-easy-with-freeware-microsoft-ice/
Scanning pics is a big enough pain, but oversize images can be a nightmare. Today, we’ll look at some tips at scanning huge images with smaller scanners, and how a bit of Microsoft freeware can make the process much easier.
You won’t see any Photoshop or even GIMP in our how-to today. HTG readers suggested this very excellent freeware, and it’ll make your life a lot easier if you ever find you have to get digital images of any of your oversize prints, posters, or photographs. We’ll cover a wealth of tips and advice for making the process easier, as well as covering making images with the free software. And, for those readers that are very experienced at scanning, tell us in the comments sections about your own tips and tricks to get great images out of your favorite brand of scanner.
Finally, Word 2013 has a form of OCR, but it is not exactly OCR. In Word 2013 you can open PDF files for editing. If they have text in a form that can be selected for copy and paste, that text will be extracted into a new Word document. Images will be imported as is, without extracting any text from them.