From: AARON HOWELL <a.howell@student.qut.edu.au>
To: blinux-list@goldfish.cube.net
Subject: Re: Speech in Linux and X-windows (fwd)
Date: Wed, 25 Sep 1996 21:10:23 +1000 (EST)
Message-ID: <Pine.OSF.3.93.960925210931.3832A-100000@sparrow.qut.edu.au>
This came through on another list I am subscribed to.
I thought some of you, particularly those of you with unix/x programming
experience might find this very interesting.
Aaron
-----------------------------------------------------------------------------
Aaron Howell. Q.U.T Equity Department, Technical Support/Training.
work: a.howell@qut.edu.au Linux/Networking Support.
home: a.howell@student.qut.edu.au phone +61-19-956-467
www: http://www.cnl.com.au/~aaron irc: DaRkAnGeL
MODULE disclaimer; FROM STextIO IMPORT WriteString,WriteLn; BEGIN;
WriteString("The opinions herein are mine, and do not in any way");
WriteString(" Reflect those of Q.U.T."); WriteLn; END disclaimer.
---------- Forwarded message ----------
Date: Tue, 24 Sep 1996 14:31:51 -0700 (PDT)
From: Tim Noonan <timn@ion.apana.org.au>
Reply-To: Keith Edwards <kedwards@parc.xerox.com>
To: GUISPEAK@LISTSERV.NODAK.EDU
Subject: Re: Speech in Linux and X-windows (fwd)
Originally From: Keith Edwards <kedwards@PARC.XEROX.COM>
Hello,
I'm not a regular subscriber to this list, but a message from it
recently got forwarded to me and I felt that I should follow up.
In a previous life, I was one of the developers of the UltraSonix X
screenreader software at Georgia Tech (which was called Sonic X before
that, and Mercator even before that...trademark issues and all). I
was the project manager for part of that time.
The status of the project is that work on the screenreader system at
Georgia Tech officially ended back in December of last year. Most
of the original developers have since gone their own way. My work now
is unrelated to access.
I've been in contact with the licensing folks at Tech, however, and
we've gotten clearance to make the code available to anyone who wants
it for non-commercial use. I've been *extremely* bogged down at work
and haven't actually managed to jump through all the legal hoops to
make this happen yet, but I'm going to try to do it as soon as I can.
I'll forward an announcement to this list when it's done.
Before I respond to some of Jim's specific comments, a bit of project
history might be helpful.
We started this project way back in 1991, with the goal of making a
system that required no modifications to either the X client-side
libraries or the X server. Basically this system was a
"pseudo-server" -- a program that sits between client applications and
the "real" X server and interprets the X protocol as it flies back and
forth.
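A minimal sketch of what "interpreting the X protocol as it flies by" involves at the lowest level, assuming the proxy has already relayed the connection-setup handshake and is now looking at client requests. In the core X11 protocol each request starts with a 4-byte header: a 1-byte major opcode, 1 byte of request-specific data, and a 2-byte length counted in 4-byte units. The function name `split_requests` and its return shape are hypothetical, not taken from the Georgia Tech code.

```python
import struct


def split_requests(stream: bytes, little_endian: bool = True):
    """Split buffered client-to-server bytes into (opcode, raw_request) pairs.

    Returns the complete requests found plus any trailing bytes that do
    not yet form a complete request.
    """
    header = "<BBH" if little_endian else ">BBH"
    requests = []
    offset = 0
    while offset + 4 <= len(stream):
        opcode, _detail, length_units = struct.unpack_from(header, stream, offset)
        size = length_units * 4  # request length is given in 4-byte units
        if size == 0 or offset + size > len(stream):
            # Incomplete request (or an extended-length one, which this
            # sketch does not handle): wait for more data.
            break
        requests.append((opcode, stream[offset:offset + size]))
        offset += size
    return requests, stream[offset:]
```

A proxy built this way sees only opcodes, window IDs, and drawing parameters. Nothing in the byte stream says "this is a menu" or "this is a button label", which is the gap the rest of this message describes.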
This early version of the system had three great benefits for us: (1)
it gave us a working prototype screenreader for X that we could play
with, (2) it gave us some experience with the range of design
possibilities for building X screenreaders, and (3) it generated
attention about access issues inside the X Consortium.
On item (2) above, we learned that it is *extremely* difficult to
build a screenreader for X that works by just snooping X protocol
traffic -- the X protocol is simply too low-level to build a reliable
screenreader. This is actually something we knew before we started,
but because of our design constraint of requiring no modifications to
X, we were forced to go with it.
A benefit of item (3) is that we were able to convince the X
Consortium that a set of standard modifications could be made to the X
platform that would make it easy for screenreaders (and other
applications with similar needs) to work in the X environment. We
built a second prototype screenreader based on a small set of
modifications to X and worked with the X Consortium to get these built
in to the base X distribution.
As of X11R6.1, most of these "hooks" for screenreaders are present in
the system. Unfortunately, one remaining piece of the puzzle is still
lacking, and has not been standardized by the Consortium. (I won't go
into all the grungy technical details on this piece here.)
The current screenreader is based on the standardized hooks that are
there, but because of the missing piece, it still requires a few
modifications to the X libraries (not the server) to work.
We think that our big successes with this project were
1. Actually getting the base X platform changed to incorporate hooks
for accessibility.
2. Coming up with some novel interaction techniques for navigating
through 2D graphical interfaces.
...and to a lesser degree...
3. Producing a system that uses the hooks to actually provide access
to some X apps.
I certainly won't claim that the system is bullet-proof, but the
basic machinery is all there, and (IMHO) quite a bit of it is
fairly novel and interesting (but I'm biased of course :-).
Our goal with the non-commercial release is to get the code out into
the hands of seasoned X and Unix hackers and let them bring the system
to the next level of robustness and usability.
On to Jim's specific comments...
jrebman@NETCOM.COM writes:
> They are
> integrating a text manipulation module (probably EMACSpeak) and a
> refreshable braille output module they developed called, DOTSCREEN. They
> are using a DEC Express, and because most of the development is specific
> to this, as is EMACSpeak, I assume that this will be the only synthesizer
> supported. The next most likely candidate would be the DEC PC, because
> all that would be needed to support this synth would be a unix device
> driver, and that would be much more likely to happen than a port of
> textassist.
By "they" I'm not sure if you mean the Georgia Tech folks, or the
military intelligence folks, but I'll say a bit about how I/O works in
UltraSonix.
The basic idea with the system is that UltraSonix is built around a
completely device-independent I/O framework that we did in-house.
This framework lets you plug in a variety of devices--including speech
synthesizers, external keypads, braille keyboards, and even speech
recognizers--and, with a very minimal amount of programming, have them
"just work" with the system.
In essence, the I/O framework translates low-level device-specific
events into higher level internal events used by UltraSonix. So if
you can wrap a bit of code around a device to make it do this event
translation, you're set to use it in UltraSonix.
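The event translation described above can be sketched roughly as follows. This is an illustration only: the class names (`InternalEvent`, `DeviceAdapter`, `EventHub`), the action strings, and the raw byte codes are all hypothetical and are not taken from the UltraSonix sources.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class InternalEvent:
    action: str  # device-independent meaning, e.g. "next_object"
    source: str  # name of the device that produced the event


class DeviceAdapter:
    """Wraps one device and maps its raw codes to internal actions."""

    def __init__(self, name: str, mapping: Dict[bytes, str]):
        self.name = name
        self.mapping = mapping

    def translate(self, raw: bytes) -> Optional[InternalEvent]:
        action = self.mapping.get(raw)
        return InternalEvent(action, self.name) if action else None


class EventHub:
    """Delivers translated events to handlers that never see raw codes."""

    def __init__(self) -> None:
        self.handlers: List[Callable[[InternalEvent], None]] = []

    def subscribe(self, handler: Callable[[InternalEvent], None]) -> None:
        self.handlers.append(handler)

    def feed(self, adapter: DeviceAdapter, raw: bytes) -> bool:
        event = adapter.translate(raw)
        if event is None:
            return False  # unknown code: ignored by this sketch
        for handler in self.handlers:
            handler(event)
        return True
```

With this shape, supporting a new keypad or braille terminal means writing one mapping from its codes to existing actions; the code that reacts to "next_object" stays unchanged.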
Having said that, we've built device support for the Dectalk DTC01,
Dectalk Express, Entropic TruTalk (software-only synthesis), Alva 3/20
and 3/80 Braille terminals, a Genovations external keypad, and the IN3
speech recognition software (although support for this has fallen
quite a bit out-of-date).
The code to support these isn't in the form of Unix device drivers;
it's in the form of small shared object modules that get dynamically
loaded into the running UltraSonix executable as needed.
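As a loose analogy for loading device support on demand, here is a sketch that uses Python's `importlib` in place of native shared objects. The message does not say how the real modules are located or what entry point they export, so the `create_device` factory name and the `DeviceRegistry` class are assumptions made purely for illustration.

```python
import importlib


def load_device_module(module_name: str, factory_name: str = "create_device"):
    """Import a device module by name and call its factory function."""
    module = importlib.import_module(module_name)
    factory = getattr(module, factory_name, None)
    if not callable(factory):
        raise ImportError(f"{module_name} has no callable {factory_name}()")
    return factory()


class DeviceRegistry:
    """Loads each device module the first time it is requested."""

    def __init__(self) -> None:
        self._loaded = {}

    def get(self, module_name: str):
        if module_name not in self._loaded:
            self._loaded[module_name] = load_device_module(module_name)
        return self._loaded[module_name]
```

The point of the design is that the main program needs no compile-time knowledge of any particular synthesizer or braille terminal; support is pulled in only when a device is actually used.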
> Even when all of this work is done, it will probably only be a mediocre
> screen reader at best because it will most likely only support a very
> limited number of apps. Explanation: For Ultrasonic to work effectively
> with an application, it assumes some factors in application development
> such as the use of motif, x-intrinsics toolkit, x-test extensions,
> dynamically-linked libraries, etc., etc... In other words, applications
> developers must follow a fairly rigidly-defined style guide. As with
> Windows 3.1, some developers will do this, many will not, and at the
> present time, there are no kernel-level accessibility extensions being
> proposed, at least not any that either myself, or anybody else I know,
> knows of. This is probably where the focus of the X-access project
> should have been, but hindsight is 20/20.
Some of the "factors in application development" that you talk about
are simply the *way* that people develop in X. They're not really
style-guide issues. For example, nearly all commercial (and much
non-commercial) X development is done in the Motif toolkit. If you're
using Motif, you're using the Xt Intrinsics layer on top of which it's
built. This is not an optional thing that developers can do or not
do, and it's certainly not a style guide issue. Likewise, the X Test
extension is not something developers have a choice over. Pretty much
100% of the R5 and later X servers have this extension built-in, so
UltraSonix can rely on this being there. There's nothing developers
can do to turn it off.
The accessibility extensions aren't "kernel level" (and probably
shouldn't be, since Unix kernels typically know nothing of the window
system anyway), but they are "X toolkit level," which is about as good
as you can get. Anyone using the X toolkit from the X Consortium
already has most of these extensions in their applications, and they
don't have to "do" anything to use them. In fact, they would have to
go out of their way to make their applications *not* use these
extensions. This was the whole point of our work with the X
Consortium.
There are a few factors that must be addressed by the app developers
though, specifically dynamic linking. Right now, to work with
UltraSonix, apps must be dynamically linked. This doesn't require any
code changes on the part of developers though.
I should point out that even the dynamic linking issue exists only
because not all of the required hooks are present in the standard X
release. If (when?) these are present, then dynamic linking won't be
required.
> You can get the code from the technology office at Georgia Tech, but the
> license is probably in the multi-thousands of dollars, and unless you
> are, or have access to, a unix/X expert, it will probably be nothing but
> frustration.
I'll try to make the legal department at Georgia Tech happy and get
this code out the door as soon as possible.
Unfortunately I may have to agree with you about having a Unix/X
expert handy, though. The software is a bit painful to port and
requires a bit of mastery to get the modified X libraries running.
And some things, particularly basic text navigation, are not as usable
as they could be (building full-scale text-handling code a la
commercial screenreaders is very time-consuming).
Thanks for wading through all this. If anyone has any followups,
please CC me on them since I won't get mail sent just to GUISPEAK.
-keith
----
keith edwards
xerox palo alto research center