From: "Michael Whapples"
To: "Speakup is a screen review system for Linux."
Subject: Re: Anyone able to OCR a PDF file?
Date: Tue, 3 Jan 2012 17:38:12 -0000
Message-ID: <66B84B1D49004B7D8DCE9C34AF6CB22E@layla>
In-Reply-To: <20120103164040.GA12039@sonata.rednote.net>

I have mostly used Cuneiform for Linux myself. I cannot remember whether it
can handle PDF files natively (possibly; it can certainly read more than
TIFF). If it cannot, you could run the PDF through a conversion tool first
(from memory, I think it is called pdf2tiff).

Michael Whapples

-----Original Message-----
From: Janina Sajka
Sent: Tuesday, January 03, 2012 4:40 PM
To: speakup@braille.uwo.ca
Subject: Anyone able to OCR a PDF file?

Has anyone figured out how to get one of the Linux OCR engines (like
tesseract) to accept a graphical file (other than .tiff) as input?

In particular I'm going to be swamped with graphical PDF files this
year. Printing these just to scan them seems both wasteful and
inefficient.

I know people do this on other OS's. Has anyone suggestions on how to do
this in Linux?

All suggestions greatly appreciated.

Janina

-- 
Janina Sajka, Phone: +1.443.300.2200
sip:janina@asterisk.rednote.net

Chair, Open Accessibility, Linux Foundation
janina@a11y.org
http://a11y.org

Chair, Protocols & Formats, Web Accessibility Initiative
World Wide Web Consortium (W3C)
http://www.w3.org/wai/pf
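
For anyone wanting to try the convert-then-OCR route described in the reply, here is a minimal sketch in Python. It is only an illustration under some assumptions, not something taken from the thread: it uses pdftoppm (from poppler-utils) for the conversion step in place of the pdf2tiff tool recalled from memory, it uses tesseract as the OCR engine, and it assumes both are installed and on the PATH. The function names build_ocr_commands and ocr_pdf are made up for this example.

```python
import subprocess
from pathlib import Path


def build_ocr_commands(pdf_path, work_dir, lang="eng", dpi=300):
    """Build the commands for a two-step PDF -> TIFF -> text pipeline.

    Assumes pdftoppm (poppler-utils) and tesseract are available;
    neither tool is named as the converter in the original thread.
    """
    pdf = Path(pdf_path)
    prefix = Path(work_dir) / pdf.stem
    # pdftoppm writes one image per page, named <prefix>-<page>.tif
    convert_cmd = ["pdftoppm", "-r", str(dpi), "-tiff", str(pdf), str(prefix)]

    def ocr_cmd_for(image_path):
        image = Path(image_path)
        # tesseract takes an output base name and appends ".txt" itself
        return ["tesseract", str(image), str(image.with_suffix("")), "-l", lang]

    return convert_cmd, ocr_cmd_for


def ocr_pdf(pdf_path, work_dir, lang="eng"):
    """Convert every page of a PDF to TIFF, OCR each page, return the text."""
    work = Path(work_dir)
    work.mkdir(parents=True, exist_ok=True)
    convert_cmd, ocr_cmd_for = build_ocr_commands(pdf_path, work, lang)
    subprocess.run(convert_cmd, check=True)

    # Page numbers are expected to be zero-padded, so a plain sort
    # should keep the pages in order.
    pages = sorted(work.glob(f"{Path(pdf_path).stem}-*.tif*"))
    texts = []
    for page in pages:
        subprocess.run(ocr_cmd_for(page), check=True)
        texts.append(page.with_suffix(".txt").read_text(encoding="utf-8"))
    return "\n".join(texts)
```

Calling ocr_pdf("scan.pdf", "ocr-out") would leave the per-page TIFF and text files in ocr-out and return the combined text. The 300 dpi default is a guess at a reasonable resolution for OCR; scans with small print may need more.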