public inbox for speakup@linux-speakup.org
* html to text
@  Gregory Nowak
   ` David Poehlman
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Gregory Nowak @  UTC (permalink / raw)
  To: speakup

Hi All,

I've got an article off the web on which
I need to write a summarizing paper for class.
The easiest and most comfortable way for me to write papers is on my
good old braille 'n speak.
I was wondering if there was a utility under Linux that could
convert the html article into plain ascii for importing into the bns.
If not, then I guess I'll have to
import it into the macrosloppy internet exploder, and save it as text.
However, I would rather not admit that macroslop is superior
in some respect to Linux.
Thanks for any suggestions.
Greg





* Re: html to text
   html to text Gregory Nowak
@  ` David Poehlman
     ` Janina Sajka
     ` Amanda Lee
   ` Saqib Shaikh
   ` Charles Hallenbeck
  2 siblings, 2 replies; 16+ messages in thread
From: David Poehlman @  UTC (permalink / raw)
  To: speakup

I usually print the page to disk from lynx but it has been a long time
since I have done this.

----- Original Message -----
From: "Gregory Nowak" <romualt@megsinet.net>
To: <speakup@braille.uwo.ca>
Sent: Monday, September 03, 2001 10:38 PM
Subject: html to text


Hi All,

I've got an article off the web on which
I need to write a summarizing paper for class.
The easiest and most comfortable way for me to write papers is on my
good old braille 'n speak.
I was wondering if there was a utility under Linux that could
convert the html article into plain ascii for importing into the bns.
If not, then I guess I'll have to
import it in to the macrosloppy internet exploder, and save it as text.
However, I would rather not admit that macroslop is superior
in some respect to Linux.
Thanks for  any suggestions.
Greg



_______________________________________________
Speakup mailing list
Speakup@braille.uwo.ca
http://speech.braille.uwo.ca/mailman/listinfo/speakup




* Re: html to text
   html to text Gregory Nowak
   ` David Poehlman
@  ` Saqib Shaikh
     ` Ann Parsons
   ` Charles Hallenbeck
  2 siblings, 1 reply; 16+ messages in thread
From: Saqib Shaikh @  UTC (permalink / raw)
  To: speakup

I'm not 100% sure, but try something like this:
lynx --dump filename.html >filename.txt
saqib

----- Original Message ----- 
From: "Gregory Nowak" <romualt@megsinet.net>
To: <speakup@braille.uwo.ca>
Sent: Tuesday, September 04, 2001 3:38 AM
Subject: html to text


> Hi All,
> 
> I've got an article off the web on which
> I need to write a summarizing paper for class.
> The easiest and most comfortable way for me to write papers is on my
> good old braille 'n speak.
> I was wondering if there was a utility under Linux that could
> convert the html article into plain ascii for importing into the bns.
> If not, then I guess I'll have to
> import it in to the macrosloppy internet exploder, and save it as text.
> However, I would rather not admit that macroslop is superior
> in some respect to Linux.
> Thanks for  any suggestions.
> Greg

* Re: html to text
   html to text Gregory Nowak
   ` David Poehlman
   ` Saqib Shaikh
@  ` Charles Hallenbeck
     ` David Poehlman
  2 siblings, 1 reply; 16+ messages in thread
From: Charles Hallenbeck @  UTC (permalink / raw)
  To: speakup

Lynx does a great job. If you open a file with lynx, you can press "p"
and save it to a disk file as plain text. The links embedded in the
document are gathered at the end of the file in a section called
"References", numbered to match the bracketed numbers in the body of
the document where they first appear.

Also, the suggested --dump switch does the same thing without any
interaction. Just be sure to redirect the output from lynx to your
output file.

Chuck
 On Mon, 3 Sep 2001, Gregory Nowak wrote:

> Hi All,
> 
> I've got an article off the web on which
> I need to write a summarizing paper for class.
> The easiest and most comfortable way for me to write papers is on my
> good old braille 'n speak.
> I was wondering if there was a utility under Linux that could
> convert the html article into plain ascii for importing into the bns.
> If not, then I guess I'll have to
> import it in to the macrosloppy internet exploder, and save it as text.
> However, I would rather not admit that macroslop is superior
> in some respect to Linux.
> Thanks for  any suggestions.
> Greg

Visit me at http://www.mhonline.net/~chuckh 
The Moon is Waning Gibbous (98% of Full)




* Re: html to text
   ` Charles Hallenbeck
@    ` David Poehlman
  0 siblings, 0 replies; 16+ messages in thread
From: David Poehlman @  UTC (permalink / raw)
  To: speakup

And if you want a cleaner copy of the file, you can turn off the link
numbers. I don't know how this is done, but you can also capture the
page without the references section.
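
One possible way to get that cleaner copy from the command line is
sketched below. It assumes a lynx build whose -dump mode accepts the
-nolist option, which the lynx help text describes as disabling the
link list in dumps. The function name html_to_clean_text is made up for
this example and is not part of lynx. Newer lynx releases also document
a -nonumbers option for turning off link numbering; check "lynx -help"
on your own system before relying on either one.

```shell
#!/bin/sh
# Sketch only: dump an HTML file to plain text without the trailing
# "References" list. Assumes the local lynx accepts -nolist.
# html_to_clean_text is a hypothetical helper name, not a lynx feature.
html_to_clean_text() {
    # $1 = input HTML file, $2 = output text file
    lynx -dump -nolist "$1" > "$2"
}

# Example call (file names are placeholders):
# html_to_clean_text article.html article.txt
```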

----- Original Message -----
From: "Charles Hallenbeck" <chuckh@mhonline.net>
To: <speakup@braille.uwo.ca>
Sent: Tuesday, September 04, 2001 6:36 AM
Subject: Re: html to text


Lynx does a great job. If you open a file with lynx you can then select
"p" and save it to a disk file as plain text. The links imbedded in the
document are gathered together at the end of the file in a section called
"references" with numbers corresponding to the bracketed numbers in the
body of the document where they first appear.

Also, the suggestion to use the --dump switch does the same thing
silently. Just be sure to redirect the output from lynx to your output
file.

Chuck
 On Mon, 3 Sep 2001, Gregory Nowak wrote:

> Hi All,
>
> I've got an article off the web on which
> I need to write a summarizing paper for class.
> The easiest and most comfortable way for me to write papers is on my
> good old braille 'n speak.
> I was wondering if there was a utility under Linux that could
> convert the html article into plain ascii for importing into the bns.
> If not, then I guess I'll have to
> import it in to the macrosloppy internet exploder, and save it as text.
> However, I would rather not admit that macroslop is superior
> in some respect to Linux.
> Thanks for  any suggestions.
> Greg

Visit me at http://www.mhonline.net/~chuckh
The Moon is Waning Gibbous (98% of Full)



* Re: html to text
   ` Saqib Shaikh
@    ` Ann Parsons
       ` David Poehlman
  0 siblings, 1 reply; 16+ messages in thread
From: Ann Parsons @  UTC (permalink / raw)
  To: speakup

Hi all,

Remind me to tell you about my experience teaching in the Rochester
City Schools as a VI Student Teacher.

Ann P.

-- 
			Ann K. Parsons  
email:  akp@eznet.net 			ICQ Number:  33006854
WEB SITE:  http://home.eznet.net/~akp
"All that is gold does not glitter.  Not all those who wander are lost."  JRRT




* Re: html to text
     ` Ann Parsons
@      ` David Poehlman
         ` Ann Parsons
  0 siblings, 1 reply; 16+ messages in thread
From: David Poehlman @  UTC (permalink / raw)
  To: speakup

This is a reminder.

----- Original Message -----
From: "Ann Parsons" <akp@eznet.net>
To: <speakup@braille.uwo.ca>
Sent: Tuesday, September 04, 2001 8:07 AM
Subject: Re: html to text


Hi all,

Remind me to tell you about my experience teaching in the Rochester
City Schools as a VI Student Teacher.

Ann P.

--
Ann K. Parsons
email:  akp@eznet.net ICQ Number:  33006854
WEB SITE:  http://home.eznet.net/~akp
"All that is gold does not glitter.  Not all those who wander are lost."  JRRT



* Re: html to text
   ` David Poehlman
@    ` Janina Sajka
     ` Amanda Lee
  1 sibling, 0 replies; 16+ messages in thread
From: Janina Sajka @  UTC (permalink / raw)
  To: speakup

If you're using lynx, press p for print, and select the first option, 1, 
to write the file to disk. It will be a text file, and you'll be prompted 
to supply a file name, though lynx will offer a default suggestion.

Next, depending on how you use lynx, you may need to clean it up a little. 
What do I mean? Well, I use lynx with numbered links turned on. So, when 
my screens are written to disk, all those lovely consecutive link numbers 
are right there, one after the next. I have two choices in a circumstance 
like this:

1.)	Turn off link numbering temporarily before saving the file;

2.)	Write a little script to get rid of the stuff enclosed in 
brackets.
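
For option 2, a minimal sketch of such a script follows. It assumes the
link numbers appear as digits inside square brackets, such as [12], and
that a POSIX sed is available. The function name strip_link_numbers is
made up for illustration. Note that it will also remove any genuine
bracketed numbers in the article itself, such as footnote markers.

```shell
#!/bin/sh
# Sketch only: remove bracketed link numbers such as [12] from text
# read on standard input. strip_link_numbers is a hypothetical name.
strip_link_numbers() {
    # \[ matches a literal "[", [0-9][0-9]* matches one or more digits,
    # and the final ] is an ordinary character outside a bracket expression.
    sed 's/\[[0-9][0-9]*]//g'
}

# Example call (file names are placeholders):
# strip_link_numbers < saved-page.txt > clean-page.txt
```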



-- 
	
				Janina Sajka, Director
				Technology Research and Development
				Governmental Relations Group
				American Foundation for the Blind (AFB)

Email: janina@afb.net		Phone: (202) 408-8175

Chair, Accessibility SIG
Open Electronic Book Forum (OEBF)
http://www.openebook.org

Will electronic books surpass print books? Read our white paper,
Surpassing Gutenberg, at http://www.afb.org/ebook.asp

Download a free sample Digital Talking Book edition of Martin Luther
King Jr's inspiring "I Have A Dream" speech at
http://www.afb.org/mlkweb.asp

Learn how to make accessible software at
http://www.afb.org/accessapp.asp




* Re: html to text
       ` David Poehlman
@        ` Ann Parsons
  0 siblings, 0 replies; 16+ messages in thread
From: Ann Parsons @  UTC (permalink / raw)
  To: speakup

Hi all,

David, I am *not* in the mood, nor do I have time to do this, this
morning.  I've been sitting here typing while my roof changes over my
head and my own head rings with hammer blows delivered to the roof
above me as they replace it!  No, wait till some other time.  Keep
reminding me, though.

Besides, I don't think this message was meant for the speakup list.  I
think it got misdirected.  Sorry.


Ann P.

-- 
			Ann K. Parsons  
email:  akp@eznet.net 			ICQ Number:  33006854
WEB SITE:  http://home.eznet.net/~akp
"All that is gold does not glitter.  Not all those who wander are lost."  JRRT




* Re: html to text
   ` David Poehlman
     ` Janina Sajka
@    ` Amanda Lee
  1 sibling, 0 replies; 16+ messages in thread
From: Amanda Lee @  UTC (permalink / raw)
  To: speakup

I've been away so if someone else has responded, please press the Delete
command now!

Just press p and save the page to a file.
If you don't want to use the default name, press ctrl-u to clear the
filename and enter the name you want.
This file will be placed under your directory.

Amanda Lee


On Mon, 3 Sep 2001, David Poehlman wrote:

> I usually print the page to disk from lynx but it has been a long time
> since I have done this.
>
> ----- Original Message -----
> From: "Gregory Nowak" <romualt@megsinet.net>
> To: <speakup@braille.uwo.ca>
> Sent: Monday, September 03, 2001 10:38 PM
> Subject: html to text
>
>
> Hi All,
>
> I've got an article off the web on which
> I need to write a summarizing paper for class.
> The easiest and most comfortable way for me to write papers is on my
> good old braille 'n speak.
> I was wondering if there was a utility under Linux that could
> convert the html article into plain ascii for importing into the bns.
> If not, then I guess I'll have to
> import it in to the macrosloppy internet exploder, and save it as text.
> However, I would rather not admit that macroslop is superior
> in some respect to Linux.
> Thanks for  any suggestions.
> Greg

* Re: html to text.
   Thomas Ward
   ` Dave Hunt
   ` Luke Davis
@  ` Chuck Hallenbeck
  2 siblings, 0 replies; 16+ messages in thread
From: Chuck Hallenbeck @  UTC (permalink / raw)
  To: Speakup List

You can use lynx the cat to do that. The format is:

lynx --dump filename.html > filename.txt

Chuck
On Wed, 19 Mar 2003, Thomas Ward wrote:

>
> Hello, list. Does anyone know of a Linux tool, script, or anything for
> Linux that will take a html file, remove the tags, and break it down to a
> flat ascii text file without all the html markup?
> I've got several books in html format which I hate reading in a web
> browser, and I want the ability to remove the tags so I can read it
> anywhere I want.
> Thanks.

-- 
The Moon is Waning Gibbous (94% of Full)
So visit me sometime at http://www.mhonline.net/~chuckh
or you might reach me at chuckh on the jabber network.




* Re: html to text.
     ` Luke Davis
@      ` Dave Hunt
  0 siblings, 0 replies; 16+ messages in thread
From: Dave Hunt @  UTC (permalink / raw)
  To: speakup

Sorry, typo.  

lynx -dump file.html >file.txt
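
Since Thomas mentioned having several books, the same command could be
wrapped in a loop. This is only a sketch: the function name convert_all
is made up here, and it assumes every book is a file ending in .html
inside a single directory.

```shell
#!/bin/sh
# Sketch only: convert every .html file in a directory to a .txt file
# with the same base name. convert_all is a hypothetical helper name.
convert_all() {
    dir=${1:-.}                       # default to the current directory
    for f in "$dir"/*.html; do
        [ -e "$f" ] || continue       # skip if the glob matched nothing
        lynx -dump "$f" > "${f%.html}.txt"
    done
}

# Example call (directory name is a placeholder):
# convert_all ~/books
```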


-Dave
>>>>> "Luke" == Luke Davis <ldavis@shellworld.net> writes:

    Luke> No, it doesn't.  Lynx will not read stdin in that manner.


    Luke> On Thu, 20 Mar 2003, Dave Hunt wrote:

    >> Thomas:
    >> 
    >> Try "lynx -dump" <file.html >file.txt.



* Re: html to text.
   ` Dave Hunt
@    ` Luke Davis
       ` Dave Hunt
  0 siblings, 1 reply; 16+ messages in thread
From: Luke Davis @  UTC (permalink / raw)
  To: speakup

No, it doesn't.  Lynx will not read stdin in that manner.
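
If reading from standard input is really what is wanted, one option
that may work is the -stdin switch, which the lynx help text describes
as reading the start file from standard input. The sketch below assumes
the local lynx supports it. The function name html_stdin_to_text is
made up for this example.

```shell
#!/bin/sh
# Sketch only: read HTML on standard input and write plain text to
# standard output. Assumes the local lynx accepts the -stdin option.
# html_stdin_to_text is a hypothetical helper name.
html_stdin_to_text() {
    lynx -dump -stdin
}

# Example call (file names are placeholders):
# html_stdin_to_text < file.html > file.txt
```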


On Thu, 20 Mar 2003, Dave Hunt wrote:

> Thomas:
>
> Try "lynx -dump" <file.html >file.txt.
>
> It works great!
>
> -Dave

* Re: html to text.
   Thomas Ward
   ` Dave Hunt
@  ` Luke Davis
   ` Chuck Hallenbeck
  2 siblings, 0 replies; 16+ messages in thread
From: Luke Davis @  UTC (permalink / raw)
  To: Speakup List

lynx -dump filename.html > file.txt

On Wed, 19 Mar 2003, Thomas Ward wrote:

>
> Hello, list. Does anyone know of a Linux tool, script, or anything for
> Linux that will take a html file, remove the tags, and break it down to a
> flat ascii text file without all the html markup?
> I've got several books in html format which I hate reading in a web
> browser, and I want the ability to remove the tags so I can read it
> anywhere I want.
> Thanks.

* html to text.
   Thomas Ward
@  ` Dave Hunt
     ` Luke Davis
   ` Luke Davis
   ` Chuck Hallenbeck
  2 siblings, 1 reply; 16+ messages in thread
From: Dave Hunt @  UTC (permalink / raw)
  To: speakup

Thomas:

Try "lynx -dump" <file.html >file.txt.

It works great!

-Dave




* html to text.
@  Thomas Ward
   ` Dave Hunt
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Thomas Ward @  UTC (permalink / raw)
  To: Speakup List

Hello, list. Does anyone know of a Linux tool, script, or anything for 
Linux that will take a html file, remove the tags, and break it down to a 
flat ascii text file without all the html markup?
I've got several books in html format which I hate reading in a web 
browser, and I want the ability to remove the tags so I can read it 
anywhere I want.
Thanks.



