public inbox for blinux-list@redhat.com
* Re: Technical Question (was Digital Talking Book Standard )
From: Martin G. McCormick
  To: blinux-list

	Yes, I am a bit slow, but I am catching on.  I definitely
need to understand more about time scale shifting methods and how
to accomplish them without adding distortion.

	Each digit represents a moment in time and we can make
things appear to speed up or slow down by intelligently inserting
or deleting  information.

	If we do it on a sample by sample basis, we can make the
recording appear to speed up or slow down with the expected pitch
changes.  If we do it on a wave form by wave form basis, we can
appear to keep the same pitch, but speed up the tempo or, for
that matter, we can add extra wave forms and stretch out the
syllables or whatever and slow them down.
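
The sample-by-sample case can be tried out with a few lines of Python. This is only a rough sketch: the 200 Hz test tone, the 8000 Hz rate, and the helper names are assumptions made for illustration, and no anti-alias filtering is done. Throwing away every second sample and playing the result at the unchanged rate halves the duration and roughly doubles the measured pitch.

```python
import math

def make_tone(freq_hz, rate_hz, seconds):
    """Return a list of sine-wave samples (hypothetical test signal)."""
    n = int(rate_hz * seconds)
    return [math.sin(2 * math.pi * freq_hz * i / rate_hz) for i in range(n)]

def drop_samples(samples, keep_every):
    """Keep one sample out of every `keep_every` (naive, no filtering)."""
    return samples[::keep_every]

def estimate_freq(samples, rate_hz):
    """Estimate pitch by counting upward zero crossings per second."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    return crossings * rate_hz / len(samples)

RATE = 8000                          # assumed playback rate, unchanged throughout
tone = make_tone(200.0, RATE, 1.0)   # one second of a 200 Hz tone
fast = drop_samples(tone, 2)         # half as many samples

print(len(tone), len(fast))
print(estimate_freq(tone, RATE), estimate_freq(fast, RATE))
```

The second print line shows the pitch estimate for the shortened recording coming out at about twice that of the original, which is the "Chipmunks" effect.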

	The old pitch correcting devices like the one I presently
use to read magazines butcher the sound because they aren't smart
enough to make sure the next snippet of sound starts at the same
place on the wave form that the previous one ended so we get that
characteristic gravelly sound at high rates.  I usually run mine
at maximum throttle so it is pretty bad, but with headphones, I
can still understand it.

	Thanks to both of you for giving me some more food for
thought.  I may play some more with /dev/dsp and see what weird
sounds I can come up with.

Martin McCormick

Janina Sajka writes:
>Martin:
>
>It sounds like you're extrapolating from experience with analog systems 
>that achieve similar results. Fortunately, the science has become much 
>more sophisticated. I say that's fortunate because the results can be much 
>much better than those we've heard on analog tape decks.





* Re: Technical Question (was Digital Talking Book Standard )
From: Nicolas Pitre
  To: blinux-list

On Tue, 20 Nov 2001, Martin G. McCormick wrote:

> 	Yes, I am a bit slow, but I am catching on.  I definitely
> need to understand more about time scale shifting methods and how
> to accomplish them without adding distortion.
> 
> 	Each digit represents a moment in time and we can make
> things appear to speed up or slow down by intelligently inserting
> or deleting  information.

Exactly.

> 	If we do it on a sample by sample basis, we can make the
> recording appear to speed up or slow down with the expected pitch
> changes.  If we do it on a wave form by wave form basis, we can
> appear to keep the same pitch, but speed up the tempo or, for
> that matter, we can add extra wave forms and stretch out the
> syllables or whatever and slow them down.

Right.  And with digital sound you can accurately find out where the
waveform boundaries are.  The technique consists of finding the best
correlation between the original signal and a small moving window of
the same signal, searched inside a limited range of offsets.  You then
get the exact sample position where the current waveform is likely to
start and end.  Then you only need to duplicate or remove that waveform
once in a while, at a certain ratio, to create the desired effect.
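
One possible reading of the description above, as a minimal Python sketch. The function names, the window size, the lag range, and the synthetic 160 Hz test signal are all assumptions chosen for illustration; a real player would also have to handle unvoiced sounds and would likely crossfade at the splice, which this sketch does not do.

```python
import math

def best_lag(samples, start, window, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] at which a shifted window best
    matches the window beginning at `start` (normalized correlation)."""
    ref = samples[start:start + window]
    best, best_score = min_lag, -2.0
    for lag in range(min_lag, max_lag + 1):
        cand = samples[start + lag:start + lag + window]
        if len(cand) < window:
            break  # ran off the end of the signal
        dot = sum(a * b for a, b in zip(ref, cand))
        norm = math.sqrt(sum(a * a for a in ref) * sum(b * b for b in cand))
        score = dot / norm if norm else 0.0
        if score > best_score:
            best, best_score = lag, score
    return best

def drop_one_period(samples, start, period):
    """Remove one waveform period beginning at `start` (speeds up)."""
    return samples[:start] + samples[start + period:]

def repeat_one_period(samples, start, period):
    """Play the period beginning at `start` twice (slows down)."""
    end = start + period
    return samples[:end] + samples[start:end] + samples[end:]

RATE = 8000  # assumed sample rate
# Hypothetical "voiced" test signal: 160 Hz fundamental plus one harmonic,
# so one waveform period is exactly 50 samples long.
voiced = [math.sin(2 * math.pi * 160 * i / RATE)
          + 0.5 * math.sin(2 * math.pi * 320 * i / RATE)
          for i in range(1000)]

period = best_lag(voiced, start=100, window=200, min_lag=20, max_lag=80)
shorter = drop_one_period(voiced, 100, period)
longer = repeat_one_period(voiced, 100, period)
print(period, len(voiced), len(shorter), len(longer))
```

Because the cut is exactly one period long, the sample after the splice lines up with the sample that used to be there, so the pitch is unchanged and there is no click.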

> 	The old pitch correcting devices like the one I presently
> use to read magazines butcher the sound because they aren't smart
> enough to make sure the next snippet of sound starts at the same
> place on the wave form that the previous one ended so we get that
> characteristic gravelly sound at high rates.

That's because those devices just don't care about signal periods at all, 
and tend to duplicate or remove an arbitrary fixed duration of signal.  


Nicolas





* Re: Technical Question (was  Digital Talking Book Standard )
From: Janina Sajka
  To: blinux-list

Martin:

It sounds like you're extrapolating from experience with analog systems 
that achieve similar results. Fortunately, the science has become much 
more sophisticated. I say that's fortunate because the results can be much 
much better than those we've heard on analog tape decks.

Generically, the process is known as "Time Scale Modification." I bet a 
good way to begin to come up to speed on that would be a google search on 
this phrase.

 On Mon, 19 Nov 2001, Nicolas Pitre wrote:

> On Mon, 19 Nov 2001, Martin G. McCormick wrote:
> 
> > 	My question is whether or not it is possible to sample at
> > rates that are deliberately non-standard in order to simulate the
> > effect of a continuous speed control.
> 
> You can't expect most sound cards to support an arbitrary sample rate.
> 
> > 	This may sound totally off-topic, but a digital Talking
> > Book player has to be able to vary its sampling rate in order to
> > emulate a speech compressor.
> 
> Absolutely not.  The technique to do that involves duplication and/or 
> suppression of signal patterns based on period windows.  This is perfectly 
> doable in software without altering the sample rate at all.  Since this is 
> performed numerically you can get much better results than with any 
> conventional method.
> 
> 
> Nicolas
> 
> 
> 
> 

-- 
	
				Janina Sajka, Director
				Technology Research and Development
				Governmental Relations Group
				American Foundation for the Blind (AFB)

Email: janina@afb.net		Phone: (202) 408-8175

Chair, Accessibility SIG
Open Electronic Book Forum (OEBF)
http://www.openebook.org

Will electronic books surpass print books? Read our white paper,
Surpassing Gutenberg, at http://www.afb.org/ebook.asp

Download a free sample Digital Talking Book edition of Martin Luther
King Jr's inspiring "I Have A Dream" speech at
http://www.afb.org/mlkweb.asp

Learn how to make accessible software at
http://www.afb.org/accessapp.asp





* Re: Technical Question (was  Digital Talking Book Standard )
From: Nicolas Pitre
  To: blinux-list

On Mon, 19 Nov 2001, Martin G. McCormick wrote:

> 	My question is whether or not it is possible to sample at
> rates that are deliberately non-standard in order to simulate the
> effect of a continuous speed control.

You can't expect most sound cards to support an arbitrary sample rate.

> 	This may sound totally off-topic, but a digital Talking
> Book player has to be able to vary its sampling rate in order to
> emulate a speech compressor.

Absolutely not.  The technique to do that involves duplication and/or 
suppression of signal patterns based on period windows.  This is perfectly 
doable in software without altering the sample rate at all.  Since this is 
performed numerically you can get much better results than with any 
conventional method.


Nicolas





* Technical Question (was  Digital Talking Book Standard )
From: Martin G. McCormick
  To: blinux-list

	One part of the NISO standard I read said that players
should be able to allow the user to speed up the recording while
restoring the pitch.  In other words, the players should be able
to deliver compressed speech much like what we presently have
with the variable-speed Talking Book machines and tape players
and the electronic pitch restoration devices which have existed
for several decades.

	One question I have for the group is whether or not it is
possible to even somewhat continuously vary the time base of the
sound cards found in most computers?  I know that most sound
cards can be set to sample at one of a number of different rates,
but the rates are still rather fixed at multiples of 8 kHz
and multiples of 11.025 kHz.  The 8 kHz rate is good for
communications-grade audio such as would be found on 2-way radio
and telephone systems, while the rates based on 11.025 kHz
fit neatly into the 44.1 kHz compact disc standard.

	My question is whether or not it is possible to sample at
rates that are deliberately non-standard in order to simulate the
effect of a continuous speed control.

	This would also make it possible to rescue damaged tapes
by recording them at a sampling rate that is off by enough to
compensate for a recorder that is not quite recording at the
correct speed.

	This may sound totally off-topic, but a digital Talking
Book player has to be able to vary its sampling rate in order to
emulate a speech compressor.

	There are actually two flavors of compression which have
been used in the past.  One is to speed up the tape or record and
then run the audio through a pitch correction circuit so it
doesn't sound like "The Chipmunks."  The other compression scheme
is one in which the tape is played at normal speed through a
device that has a second recorder whose tape is stopped and
started very quickly such that pauses longer than a set length
are removed.
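
On digital audio the pause-removal flavor can be sketched in a few lines of Python. This is an illustration under assumptions, not a description of how the tape devices worked: the amplitude threshold, the helper name, and the choice to shorten a long pause to the limit (rather than delete it entirely) are all made up for the example.

```python
def remove_long_pauses(samples, threshold, max_pause):
    """Shorten every run of quiet samples to at most `max_pause` samples.

    A sample counts as quiet when its absolute value is below `threshold`
    (an assumed, hand-picked silence level).
    """
    out = []
    quiet_run = 0
    for s in samples:
        if abs(s) < threshold:
            quiet_run += 1
            if quiet_run > max_pause:
                continue  # pause already at its limit: skip this sample
        else:
            quiet_run = 0
        out.append(s)
    return out

# Hypothetical recording: a burst of sound, a 10-sample pause, another burst.
speech = [0.5, -0.5] * 3 + [0.0] * 10 + [0.4, -0.4] * 2
squeezed = remove_long_pauses(speech, threshold=0.05, max_pause=4)
print(len(speech), len(squeezed))
```

The sounds themselves are untouched, so the pitch and the speaking rate within words stay the same; only the gaps shrink.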

	Of course, the pause-removal system was less popular
because somebody had to make the compressed recordings.  The
pitch corrector can be run right when it is needed and run on the
original recording.

	If sound cards can be made to slide from one sampling
rate to another, then we should be able to have both kinds of
compression on audio recordings.

	In reality, I know that a variable sampling rate is more
than likely going to be a series of small steps, but if they are
small enough, it gives the appearance of continuous variability.

	I have played in the past with the timer/counter device
that controls the pitch of the P.C.'s speaker and that pitch is
set by stuffing a 16-bit number into a counter that divides a
roughly 1 MHz clock by whatever is in the counter.  By the time
one is in the audio range, it is very hard to tell the difference
between one step and the next.  I am hoping there is something
similar in most sound cards that one can mess with to get odd
sampling rates.
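
The divider arithmetic can be worked through in Python. The clock value below is an assumption: 1,193,182 Hz is the figure commonly documented for the PC timer chip's input, standing in for the "roughly 1 MHz" mentioned above, and the function names are hypothetical.

```python
PIT_CLOCK_HZ = 1_193_182  # assumed timer input clock, "roughly 1 MHz"

def tone_hz(divisor):
    """Output frequency for a given 16-bit divisor."""
    if not 1 <= divisor <= 0xFFFF:
        raise ValueError("divisor must fit in 16 bits and be non-zero")
    return PIT_CLOCK_HZ / divisor

def nearest_divisor(target_hz):
    """Divisor whose output frequency is closest to `target_hz`, clamped to 16 bits."""
    return max(1, min(0xFFFF, round(PIT_CLOCK_HZ / target_hz)))

def step_hz(divisor):
    """Frequency change caused by moving from `divisor` to `divisor + 1`."""
    return PIT_CLOCK_HZ / divisor - PIT_CLOCK_HZ / (divisor + 1)

d = nearest_divisor(440.0)
print(d, tone_hz(d), step_hz(d))
```

Near 440 Hz the divisor is in the thousands, so adjacent divisor values differ by only a fraction of a hertz, which matches the observation that the steps are very hard to tell apart in the audio range.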

	I hope some of you experts can please fill in the vast
holes in my knowledge base, here.

Martin McCormick



