From: Janina Sajka
To: speakup@braille.uwo.ca
Date: Tue, 3 Jan 2012 11:40:45 -0500
Subject: Anyone able to OCR a PDF file?
Message-ID: <20120103164040.GA12039@sonata.rednote.net>
List-Id: "Speakup is a screen review system for Linux."

Has anyone figured out how to get one of the Linux OCR engines (like tesseract) to accept a graphical file (other than .tiff) as input?

In particular, I'm going to be swamped with graphical PDF files this year. Printing these just to scan them seems both wasteful and inefficient.

I know people do this on other operating systems. Does anyone have suggestions on how to do this in Linux?

All suggestions greatly appreciated.

Janina

-- 
Janina Sajka, Phone: +1.443.300.2200
sip:janina@asterisk.rednote.net

Chair, Open Accessibility janina@a11y.org
Linux Foundation http://a11y.org

Chair, Protocols & Formats
Web Accessibility Initiative http://www.w3.org/wai/pf
World Wide Web Consortium (W3C)