public inbox for blinux-list@redhat.com
* Convert unwrapped paragraphs to hard wrapped paragraphs when there's no blank lines.
From: Linux for blind general discussion
  To: Linux for blind general discussion

Okay, this isn't strictly an accessibility question, but I can't think
of any better place to ask and Google didn't help much.

I occasionally purchase eBooks from Smashwords as they're the only
eBook store I know of that offers plain text alongside the PDF, ePub,
and Kindle formats, which are far too prevalent for my liking.

Problem is, their plain text eBooks are typically long enough that
Firefox and Orca simply choke on them, and their paragraphs are
unwrapped, which makes reading them with nano and SBL cumbersome.
Normally, I'd just use nano's justify command to hard wrap the whole
file, but they lack blank lines between paragraphs, so nano would
treat the whole book as a single paragraph.

So, does anyone know a way to automate inserting blank lines before
and after each line in a file that's too long to fit on the screen all
at once and then hard wrap those long lines?



* Re: Convert unwrapped paragraphs to hard wrapped paragraphs when there's no blank lines.
From: Linux for blind general discussion
  To: Linux for blind general discussion

Hi,

On Fri, 27 Mar 2020 15:30:29 +0000
Linux for blind general discussion <blinux-list@redhat.com> wrote:

> Okay, this isn't strictly an accessibility question, but I can't think
> of any better place to ask and Google didn't help much.
> 
> I occasionally purchase eBooks from Smashwords as they're the only
> eBook store I know of that offers plain text alongside the PDF, ePub,
> and Kindle formats, which are far too prevalent for my liking.
> 
> Problem is, their plain text eBooks are typically long enough that
> Firefox and Orca simply choke on them, and their paragraphs are
> unwrapped, which makes reading them with nano and SBL cumbersome.
> Normally, I'd just use nano's justify command to hard wrap the whole
> file, but they lack blank lines between paragraphs, so nano would
> treat the whole book as a single paragraph.
> 
> So, does anyone know a way to automate inserting blank lines before
> and after each line in a file that's too long to fit on the screen all
> at once and then hard wrap those long lines?
> 

I don't understand how paragraphs start and end in these files. Otherwise you
can try using one of the text processing tools mentioned here:

* https://www.shlomifish.org/open-source/resources/text-processing-tools/

* https://www.computerhope.com/unix/ufold.htm

* https://en.wikipedia.org/wiki/Fmt_(Unix)

* https://en.wikipedia.org/wiki/Par_(command)

Note that you may have better luck converting EPUBs (assuming they lack
https://en.wikipedia.org/wiki/Digital_rights_management ) to plaintext using
tools such as https://pandoc.org/ ,
https://metacpan.org/search?q=html%3A%3Awikiconverter&size=20 , etc.
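
As a rough illustration of the Pandoc route, the sketch below converts a
DRM-free EPUB to plain text with blank lines between paragraphs and
hard-wrapped lines. The function name epub_to_text and the file names are
made up for this example; -f epub, -t plain, --wrap and --columns are
regular pandoc options, but check them against your installed version.

```shell
#!/bin/sh
# epub_to_text: hypothetical helper name, not part of pandoc itself.
# Usage: epub_to_text INPUT.epub OUTPUT.txt [COLUMNS]
epub_to_text() {
    # --wrap=auto hard-wraps paragraphs at the width given by --columns
    # (72 here unless a third argument is supplied).
    pandoc -f epub -t plain --wrap=auto --columns="${3:-72}" -o "$2" "$1"
}
```

For example: epub_to_text book.epub book.txt 75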

Regards,

	Shlomi Fish

> 
> _______________________________________________
> Blinux-list mailing list
> Blinux-list@redhat.com
> https://www.redhat.com/mailman/listinfo/blinux-list
> 



-- 

Shlomi Fish       https://www.shlomifish.org/
https://is.gd/MQHVF3 - The Atom Text Editor edits a 2,000,001B file

Real programmers use a nice editor and a nice programming language and get it
done in less than O(N!).
    — vanguard on Freenode’s ##programming

Please reply to list if it's a mailing list post - http://shlom.in/reply .



* Re: Convert unwrapped paragraphs to hard wrapped paragraphs when there's no blank lines.
From: Linux for blind general discussion
  To: Linux for blind general discussion

On March 27, 2020, Linux for blind general discussion wrote:
> does anyone know a way to automate inserting blank lines before
> and after each line in a file that's too long to fit on the screen
> all at once and then hard wrap those long lines?

Well, since adding a blank line after each line-break puts a blank
line before the next line, you (should?) only need to add newlines
after each line which can easily be done with sed:

  $ sed G input_file.txt > output_file_with_spaces.txt

If you want to format the lines at the same time, you can do that
with "fmt"

  $ sed G input.txt | fmt > formatted_output_with_spaces.txt

By default, GNU fmt wraps to a maximum of 75 characters per line, but
you can adjust that using

  fmt -w 80
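
To make the pipeline above reusable, here is a minimal sketch of a shell
function. The name rewrap and its optional width argument are assumptions
for illustration, not something from the original post. It also drops any
blank lines already present in the input, so paragraphs that are already
separated do not end up with two blank lines between them.

```shell
#!/bin/sh
# rewrap: hypothetical helper name.
# Reads text on stdin in which every line is one whole paragraph and
# writes blank-line-separated paragraphs wrapped to the given width.
# Usage: rewrap [WIDTH] < input.txt > output.txt
rewrap() {
    width="${1:-75}"
    # First sed expression: delete lines that are empty or whitespace-only.
    # Second expression (G): append an empty line after every remaining line.
    # fmt then refills each paragraph without joining across blank lines.
    sed -e '/^[[:space:]]*$/d' -e G | fmt -w "$width"
}
```

For example: rewrap 80 < input.txt > formatted_output_with_spaces.txt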

Hope this helps,

-tim




* Re: Convert unwrapped paragraphs to hard wrapped paragraphs when there's no blank lines.
From: Linux for blind general discussion
  To: blinux-list

An Apache/PHP server would do it, though it would need to be coded.  I used 
to love doing this, and text-crunchers were my stock-in-trade.

That was a long time ago now, but the functions are still there and it's not 
a difficult language to grasp.

RobH.

----- Original Message ----- 
From: "Linux for blind general discussion" <blinux-list@redhat.com>
To: "Linux for blind general discussion" <blinux-list@redhat.com>
Sent: Friday, March 27, 2020 3:30 PM
Subject: Convert unwrapped paragraphs to hard wrapped paragraphs when 
there's no blank lines.



* Re: Convert unwrapped paragraphs to hard wrapped paragraphs when there's no blank lines.
From: Linux for blind general discussion
  To: blinux-list

Btw:  I think Project Gutenberg still exists. They published tons of .txt 
files, well-formatted as a rule, though with miles of header material to 
wade through.

----- Original Message ----- 
From: "Linux for blind general discussion" <blinux-list@redhat.com>
To: "Linux for blind general discussion" <blinux-list@redhat.com>
Sent: Friday, March 27, 2020 4:25 PM
Subject: Re: Convert unwrapped paragraphs to hard wrapped paragraphs 
when there's no blank lines.



* Re: Convert unwrapped paragraphs to hard wrapped paragraphs when there's no blank lines.
From: Linux for blind general discussion
  To: Linux for blind general discussion

> I don't understand how paragraphs start and end in these files. Otherwise
> you
> can try using one of the text processing tools mentioned here:
>
> * https://www.shlomifish.org/open-source/resources/text-processing-tools/
>
> * https://www.computerhope.com/unix/ufold.htm
>
> * https://en.wikipedia.org/wiki/Fmt_(Unix)
>
> * https://en.wikipedia.org/wiki/Par_(command)
>
> Note that you may have better luck converting EPUBs (assuming they lack
> https://en.wikipedia.org/wiki/Digital_rights_management ) to plaintext using
> tools such as https://pandoc.org/ ,
> https://metacpan.org/search?q=html%3A%3Awikiconverter&size=20 , etc.

Of that list of programs, I'd be inclined to use Pandoc. It permits
you to write filters in (embedded) Lua, which is a quick-to-learn
programming language. For example, this Lua one-liner converts a
string ("s") to add a line break after each existing line break:

s = string.gsub(s, "<BR>", "<BR>\n<BR>")

On writing Pandoc filters with Lua, see <https://pandoc.org/lua-filters.html>.

Best regards,

Paul

-- 
[Notice not included in the above original message:  The U.S. National
Security Agency neither confirms nor denies that it intercepted this
message.]
                                                ¯\_(ツ)_/¯



* Re: Convert unwrapped paragraphs to hard wrapped paragraphs when there's no blank lines.
From: Linux for blind general discussion
  To: Linux for blind general discussion

The following assumes a newline at the end of each paragraph.

The Perl script below puts a blank line after each unwrapped paragraph, which in a text file is
just one long line terminated with a newline.
Put the Perl script in your ~/bin/ directory in a file named doublespace and chmod it to 755
(this assumes ~/bin is on your PATH).

Next, the following command breaks an unwrapped paragraph into lines of at most 75 bytes,
splitting at spaces:
fold -sbw 75 <file>

Now put the two together:

doublespace < "$1" | fold -sbw 75

Put the above command into ~/bin/dscat, chmod it to 755,
and you can do:
dscat unwrapped_file > double_spaced_file

Now here's the Perl script:
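#!/usr/bin/env perl
use strict;
use warnings;

# Double space the standard input. Expects a text file in which
# each paragraph is a single long line.
while (my $line = <STDIN>)
{
    chomp $line;
    if ($line =~ /^ *$/)
    {
        # Empty or spaces-only line: pass it through as one blank line.
        print "\n";
    }
    else
    {
        # Paragraph line: print it followed by a blank line.
        print "$line\n\n";
    }
}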
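
As a variation on the fold approach above, closer to the original request
of padding only the lines that are too long, here is a rough sketch. The
function name wrap_long_lines and its width argument are made up for this
example. It puts one blank line before and after each over-long line,
leaves short lines alone, and then lets fold break the long ones. As a
side effect it also collapses runs of blank lines in the input into one.

```shell
#!/bin/sh
# wrap_long_lines: hypothetical helper name.
# Usage: wrap_long_lines [WIDTH] < input.txt > output.txt
wrap_long_lines() {
    width="${1:-75}"
    awk -v w="$width" '
        # Print s, but never print two blank lines in a row.
        function emit(s) {
            if (s == "" && last_blank) return
            print s
            last_blank = (s == "")
        }
        BEGIN { last_blank = 1 }    # suppress a blank line at the very top
        length($0) > w { emit(""); emit($0); emit(""); next }
        { emit($0) }
    ' | fold -s -w "$width"         # break long lines at spaces
}
```

Unlike fmt, fold never joins lines, so short lines such as headings stay
on their own line.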



On Fri, Mar 27, 2020 at 02:27:02PM -0500, Linux for blind general discussion wrote:
> On March 27, 2020, Linux for blind general discussion wrote:
> > does anyone know a way to automate inserting blank lines before
> > and after each line in a file that's too long to fit on the screen
> > all at once and then hard wrap those long lines?
> 
> Well, since adding a blank line after each line-break puts a blank
> line before the next line, you (should?) only need to add newlines
> after each line which can easily be done with sed:
> 
>   $ sed G input_file.txt > output_file_with_spaces.txt
> 
> If you want to format the lines at the same time, you can do that
> with "fmt"
> 
>   $ sed G input.txt | fmt > formatted_output_with_spaces.txt
> 
> By default, GNU fmt wraps to a maximum of 75 characters per line, but
> you can adjust that using
> 
>   fmt -w 80
> 
> Hope this helps,
> 
> -tim
> 
> 
> 

-- 
Rudy Vener
Website: http://www.rudyvener.com



* Re: Convert unwrapped paragraphs to hard wrapped paragraphs when there's no blank lines.
From: Linux for blind general discussion
  To: blinux-list

sed G did the trick.



* Re: Convert unwrapped paragraphs to hard wrapped paragraphs when there's no blank lines.
From: Linux for blind general discussion
  To: Linux for blind general discussion

Hi Paul,

On Fri, 27 Mar 2020 14:43:01 -0700
Linux for blind general discussion <blinux-list@redhat.com> wrote:

> > I don't understand how paragraphs start and end in these files. Otherwise
> > you
> > can try using one of the text processing tools mentioned here:
> >
> > * https://www.shlomifish.org/open-source/resources/text-processing-tools/
> >
> > * https://www.computerhope.com/unix/ufold.htm
> >
> > * https://en.wikipedia.org/wiki/Fmt_(Unix)
> >
> > * https://en.wikipedia.org/wiki/Par_(command)
> >
> > Note that you may have better luck converting EPUBs (assuming they lack
> > https://en.wikipedia.org/wiki/Digital_rights_management ) to plaintext using
> > tools such as https://pandoc.org/ ,
> > https://metacpan.org/search?q=html%3A%3Awikiconverter&size=20 , etc.  
> 
> Of that list of programs, I'd be inclined to use Pandoc. It permits
> you to write filters in (embedded) Lua, which is a quick-to-learn
> programming language. For example, this Lua one-liner converts a
> string ("s") to add a line break after each existing line break:
> 
> s = string.gsub(s, "<BR>", "<BR>\n<BR>")
> 

Other tools may work as well. Furthermore, your HTML processing substitution
will not work if one has "<br>" or "<br />" or "<br/>" for newlines or uses the
more recommended https://developer.mozilla.org/en-US/docs/Web/HTML/Element/p
element.
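
If one does stay with a plain substitution, a pattern along these lines
covers the <br> spellings listed above, in either letter case. It is only
a sketch, written with sed rather than Lua, the function name double_br is
made up, and it still does nothing for text marked up with <p> elements.

```shell
#!/bin/sh
# double_br: hypothetical helper name.
# Reads HTML on stdin and writes each <br>, <br/> or <br /> tag twice,
# with a newline between the two copies.
double_br() {
    # "&" in the replacement stands for the matched tag; the backslash
    # followed by a literal newline inserts a line break.
    sed -E 's#<[Bb][Rr][[:space:]]*/?>#&\
&#g'
}
```

For example: double_br < chapter.html > chapter_spaced.html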

Also see:

* https://perl-begin.org/uses/text-parsing/

* https://blog.codinghorror.com/parsing-html-the-cthulhu-way/



> On writing Pandoc filters with Lua, see <https://pandoc.org/lua-filters.html>.
> 
> Best regards,
> 
> Paul
> 



-- 

Shlomi Fish       https://www.shlomifish.org/
https://is.gd/MQHVF3 - The Atom Text Editor edits a 2,000,001B file

Joel’s Generalisation: If it happens to you, it happens to everybody.
(Or: It’s never only you.)
    — Based on http://www.joelonsoftware.com/news/20020402.html

Please reply to list if it's a mailing list post - http://shlom.in/reply .

