* Re: OCR software (was Re: Concerning BLinux project (fwd))
From: Lloyd G. Rasmussen
To: blinux-list
What you ask for is not likely to be available until artificial
intelligence advances much further. You are asking a computer
program which knows the *presentation* of a document to correctly
infer the *structure* of that document, or at least attempt to do so.
I recently bought Omnipage 9 for Win95 from Caere Corporation. Among
all its export formats, it includes an HTML export format. From what
I've seen so far, the objective is to make a GUI web browser display
the page with fonts, italics, and centering intact. The HTML is a
series of <p> and <br> tags with FONT and I elements and ALIGN
attributes. No structure. It even claims to conform to the HTML 3.0
DTD, and tells you that the generator is Adobe Word for Word. I know
that HTML is not SGML. But I'm not too hopeful that OCR programs will
do much better than this when they begin exporting XML.
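To illustrate the gap between presentation and structure, here is a
minimal Python sketch. It is not anything Omnipage or Duxbury does.
The class name StructureGuesser, the function guess_structure, the
sample markup, and the rule "centered plus font size 5 or larger means
heading" are all assumptions invented for this example.

```python
from html.parser import HTMLParser


class StructureGuesser(HTMLParser):
    """Guess block roles from presentational markup (hypothetical heuristic)."""

    def __init__(self):
        super().__init__()
        self.blocks = []       # list of (guessed_role, text) tuples
        self._text = []        # text fragments of the current paragraph
        self._centered = False
        self._big = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, but not values.
        attrs = dict(attrs)
        if tag == "p":
            self._flush()
            self._centered = (attrs.get("align") or "").lower() == "center"
        elif tag == "br":
            self._text.append(" ")
        elif tag == "font":
            size = attrs.get("size") or ""
            # Assumption: font size 5 or larger hints at a heading.
            if size.isdigit() and int(size) >= 5:
                self._big = True

    def handle_data(self, data):
        self._text.append(data)

    def close(self):
        super().close()
        self._flush()

    def _flush(self):
        # Emit the paragraph collected so far and reset the style flags.
        text = " ".join("".join(self._text).split())
        if text:
            role = "heading" if (self._centered and self._big) else "paragraph"
            self.blocks.append((role, text))
        self._text = []
        self._centered = False
        self._big = False


def guess_structure(html):
    parser = StructureGuesser()
    parser.feed(html)
    parser.close()
    return parser.blocks


# Made-up sample resembling a presentation-only OCR export.
sample = (
    '<P ALIGN="CENTER"><FONT SIZE="5">Chapter 1</FONT></P>'
    '<P><FONT SIZE="3">It was a dark night.</FONT><BR>'
    '<I>Really</I> dark.</P>'
    '<P ALIGN="CENTER"><FONT SIZE="5">THE END</FONT></P>'
)

for role, text in guess_structure(sample):
    print(role, "|", text)
```

On this sample the rule labels both "Chapter 1" and "THE END" as
headings, although only the first is one. The visual cues are the
same in both cases, which is why inferring structure from
presentation alone stays a guess.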
I know that Duxbury attempts to create styles in a file which it has
imported from ASCII, but this is usually just a beginning toward
correctly marking up a document. I agree that you're aiming for the
right objective, but I don't know how we're going to get there.
On Mon, 7 Dec 1998 09:28:03 +1100 (AEDT),
Jason White <jasonw@ariel.ucs.unimelb.EDU.AU> wrote:
>On the subject of freely available OCR software, currently under
>development, see http://www.socr.org/
>
>What is most needed as output is not straightforward ASCII text, but
>rather a document which has been marked up in SGML, XML or a related
>language, that preserves its structure and maintains the distinctions
>necessary for the generation of high quality braille and audio output.
>
>
>---
>Send your message for blinux-list to blinux-list@redhat.com
>Blinux software archive at ftp://leb.net/pub/blinux
>Blinux web page at http://leb.net/blinux
>To unsubscribe send mail to blinux-list-request@redhat.com
>with subject line: unsubscribe
>
-- Lloyd Rasmussen
Senior Staff Engineer, Engineering Section
National Library Service for the Blind and Physically Handicapped
Library of Congress 202-707-0535
(work) lras@loc.gov http://www.loc.gov/nls/
(home) lras@sprynet.com http://home.sprynet.com/sprynet/lras/
* Re: OCR software (was Re: Concerning BLinux project (fwd))
From: Jude Dashiell
To: Lloyd G. Rasmussen; +Cc: blinux-list
That o.c.r. site is alive. I tried three times tonight and on the
third try got through and bookmarked the page. Thanks much for the
site; however this turns out, it should be interesting.
------------------------------------------------------------------------
jude <jdashiel@clark.net>
* Re: OCR software (was Re: Concerning BLinux project (fwd))
From: Ron Marriage
To: blinux-list
Had the same problem the first time; tried a second time and got in.
Later tried again and still got a DNS failure on the first try, but it
came up again on the second try.
Ron
At 08:10 PM 12/6/98 -0500, Jude Dashiell wrote:
>That site can't be located by lynx. Either busy or nonexistent.
>On Mon, 7 Dec 1998, Jason White wrote:
>
>> On the subject of freely available OCR software, currently under
>> development, see http://www.socr.org/
Ron Marriage
Email = mailto:marriage@seidata.com
Homepage = http://www.seidata.com/~marriage/
Blind Related Links
http://www.seidata.com/~marriage/rblind.html
or
http://welcome.to/blindlinks
* Re: OCR software (was Re: Concerning BLinux project (fwd))
From: Jude Dashiell
To: blinux-list
Is that a new domain? Reason I ask is I tried a second time and still no
cigar.
------------------------------------------------------------------------
jude <jdashiel@clark.net>
* Re: OCR software (was Re: Concerning BLinux project (fwd))
From: Jack Berdeaux
To: blinux-list
It is working now http://www.socr.org/
Jason White wrote:
> On the subject of freely available OCR software, currently under
> development, see http://www.socr.org/
>
> What is most needed as output is not straightforward ASCII text, but
> rather a document which has been marked up in SGML, XML or a related
> language, that preserves its structure and maintains the distinctions
> necessary for the generation of high quality braille and audio output.
>
* Re: OCR software (was Re: Concerning BLinux project (fwd))
From: Jack Berdeaux
To: blinux-list
nonexistent!
Jude Dashiell wrote:
> That site can't be located by lynx. Either busy or nonexistent.
> On Mon, 7 Dec 1998, Jason White wrote:
>
> > On the subject of freely available OCR software, currently under
> > development, see http://www.socr.org/
> >
> > What is most needed as output is not straightforward ASCII text, but
> > rather a document which has been marked up in SGML, XML or a related
> > language, that preserves its structure and maintains the distinctions
> > necessary for the generation of high quality braille and audio output.
> >
> >
> >
>
> ------------------------------------------------------------------------
>
> jude <jdashiel@clark.net>
>
* Re: OCR software (was Re: Concerning BLinux project (fwd))
From: Jude Dashiell
To: blinux-list
That site can't be located by lynx. Either busy or nonexistent.
On Mon, 7 Dec 1998, Jason White wrote:
> On the subject of freely available OCR software, currently under
> development, see http://www.socr.org/
>
> What is most needed as output is not straightforward ASCII text, but
> rather a document which has been marked up in SGML, XML or a related
> language, that preserves its structure and maintains the distinctions
> necessary for the generation of high quality braille and audio output.
>
>
>
------------------------------------------------------------------------
jude <jdashiel@clark.net>
* OCR software (was Re: Concerning BLinux project (fwd))
From: Jason White
To: blinux-list
On the subject of freely available OCR software, currently under
development, see http://www.socr.org/
What is most needed as output is not straightforward ASCII text, but
rather a document which has been marked up in SGML, XML or a related
language, that preserves its structure and maintains the distinctions
necessary for the generation of high quality braille and audio output.