* Re: Grabbing An Entire Website
4 siblings, 0 replies; 11+ messages in thread
From: Xandy Johnson @ UTC (permalink / raw)
To: Janina Sajka; +Cc: ma-linux, speakup
You probably want to look into wget. It can follow links to recursively
retrieve all documents referenced by an http URL (also does ftp, but you
specifically said your needs were http). There are a lot of options (e.g.
maximum depth, spanning hosts, converting absolute links to relative ones
locally, etc.), so I suggest reading the man page and then asking more
specific questions if you have them.
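For example, the options mentioned above might be combined along these lines (the URL, depth limit, and path are placeholders, not from the original message):

```shell
# Recursively fetch a site over HTTP, at most 2 levels deep,
# refusing to ascend above the starting directory, and rewriting
# links so the local copy can be browsed offline.
wget --recursive --level=2 --no-parent --convert-links \
    http://www.example.com/docs/
```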
Yours,
Xandy
On Wed, 19 Apr 2000, Janina Sajka wrote:
> Hi:
>
> Anyone know how to auto-retrieve an entire www page hierarchy?
>
> I know software like ncftp and wuftp can tar up an entire directory
> tree, but the pages I need aren't available over ftp, only http. I'd hate
> to have to get them by hand one at a time, though.
>
>
* Re: Grabbing An Entire Website
From: ADAM Sulmicki @ UTC (permalink / raw)
To: Janina Sajka; +Cc: ma-linux, speakup
> I know software like ncftp and wuftp can tar up an entire directory
> tree, but the pages I need aren't available over ftp, only http. I'd hate
> to have to get them by hand one at a time, though.
wget
ftp.gnu.org/pub/gnu/wget
* Re: Grabbing An Entire Website
From: Janina Sajka @ UTC (permalink / raw)
To: ADAM Sulmicki; +Cc: ma-linux, speakup
Got wget. Very cool.
No, very very cool! <grin>
Thanks.
Janina
On Wed, 19 Apr 2000, ADAM Sulmicki wrote:
>
> > I know software like ncftp and wuftp can tar up an entire directory
> > tree, but the pages I need aren't available over ftp, only http. I'd hate
> > to have to get them by hand one at a time, though.
>
> wget
>
> ftp.gnu.org/pub/gnu/wget
>
>
>
--
Janina Sajka, Director
Information Systems Research & Development
American Foundation for the Blind (AFB)
janina@afb.net
* Re: Grabbing An Entire Website
From: George Lewis @ UTC (permalink / raw)
To: Janina Sajka; +Cc: ma-linux, speakup
Use wget; it has mirroring options designed for exactly this task.
There are also other applications and Perl modules that can do it, but
wget is generally excellent.
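As a concrete sketch of those mirroring options (the URL below is a placeholder, not from the original message):

```shell
# --mirror turns on recursion with unlimited depth plus timestamping,
# so re-running the command only fetches pages that have changed.
# --convert-links rewrites links so the saved copy works offline.
wget --mirror --convert-links http://www.example.com/
```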
George
Janina Sajka (janina@afb.net) wrote:
> Hi:
>
> Anyone know how to auto-retrieve an entire www page hierarchy?
>
> I know software like ncftp and wuftp can tar up an entire directory
> tree, but the pages I need aren't available over ftp, only http. I'd hate
> to have to get them by hand one at a time, though.
>
> --
>
> Janina Sajka, Director
> Information Systems Research & Development
> American Foundation for the Blind (AFB)
>
> janina@afb.net
>
--
George Lewis
http://schvin.net/
* Re: Grabbing An Entire Website
From: Brett W. McCoy @ UTC (permalink / raw)
To: Janina Sajka; +Cc: ma-linux, speakup
On Wed, 19 Apr 2000, Janina Sajka wrote:
> Anyone know how to auto-retrieve an entire www page hierarchy?
>
> I know software like ncftp and wuftp can tar up an entire directory
> tree, but the pages I need aren't available over ftp, only http. I'd hate
> to have to get them by hand one at a time, though.
Take a look at http://www.enfin.com/getweb/
Brett W. McCoy
http://www.chapelperilous.net
---------------------------------------------------------------------------
If only God would give me some clear sign! Like making a large deposit
in my name at a Swiss Bank.
- Woody Allen
* Re: Grabbing An Entire Website
From: Garrett Nievin @ UTC (permalink / raw)
To: Janina Sajka; +Cc: ma-linux, speakup
I think you can use wget for that, though I haven't done it myself.
Cheers,
Garrett
On Wed, 19 Apr 2000, Janina Sajka wrote:
> Hi:
>
> Anyone know how to auto-retrieve an entire www page hierarchy?
>
> I know software like ncftp and wuftp can tar up an entire directory
> tree, but the pages I need aren't available over ftp, only http. I'd hate
> to have to get them by hand one at a time, though.
>
>
--
Garrett P. Nievin <gnievin@gmu.edu>
Non est ad astra mollis e terris via. -- Seneca
* Re: Grabbing An Entire Website
From: Aaron @ UTC (permalink / raw)
To: Garrett Nievin; +Cc: Janina Sajka, ma-linux, speakup
Yup: wget -r www.foobar.com. Of course, that gets what a browser would
"see", not the source code behind dynamic pages (unless, of course, it's
Cold Fusion ;). If you want the source code for dynamic pages or anything
else, that would depend on the situation.
Aaron
On Wed, 19 Apr 2000, Garrett Nievin wrote:
> I think you can use wget for that, though I haven't done it myself.
>
>
> Cheers,
> Garrett
>
> On Wed, 19 Apr 2000, Janina Sajka wrote:
>
> > Hi:
> >
> > Anyone know how to auto-retrieve an entire www page hierarchy?
> >
> > I know software like ncftp and wuftp can tar up an entire directory
> > tree, but the pages I need aren't available over ftp, only http. I'd hate
> > to have to get them by hand one at a time, though.
> >
> >
>
> --
> Garrett P. Nievin <gnievin@gmu.edu>
>
> Non est ad astra mollis e terris via. -- Seneca
>