NAME
Text::Format - various subroutines to format text.
SYNOPSIS
use Text::Format;
my $text = Text::Format->new (
{
text => [], # all
columns => 72, # format, paragraphs, center
leftMargin => 0, # format, paragraphs, center
rightMargin => 0, # format, paragraphs, center
firstIndent => 4, # format, paragraphs
bodyIndent => 0, # format, paragraphs
rightFill => 0, # format, paragraphs
rightAlign => 0, # format, paragraphs
justify => 0, # format, paragraphs
extraSpace => 0, # format, paragraphs
abbrevs => {}, # format, paragraphs
hangingIndent => 0, # format, paragraphs
hangingText => [], # format, paragraphs
noBreak => 0, # format, paragraphs
noBreakRegex => {}, # format, paragraphs
tabstop => 8, # expand, unexpand, center
}
); # these are the default values
my %abbr = (foo => 1, bar => 1);
$text->abbrevs(\%abbr);
$text->abbrevs();
$text->abbrevs({foo => 1,bar => 1});
$text->abbrevs(qw/foo bar/);
$text->text(\@text);
$text->columns(132);
$text->tabstop(4);
$text->extraSpace(1);
$text->firstIndent(8);
$text->bodyIndent(4);
$text->config({tabstop => 4,firstIndent => 0});
$text->rightFill(0);
$text->rightAlign(0);
DESCRIPTION
The format routine will format under all circumstances, even if the width
isn't enough to contain the longest words. Text::Wrap will die under these
circumstances, although I am told this is fixed. If columns is set to a small
number and words plus the leading 'whitespace' are longer than that, then
there will be a single word on each line. This will let you make a simple word
list which could be indented or right aligned. There is a chance for croaking
if you try to subvert the module. If you don't pass in text then the internal
text is worked on, though not modified.
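The word-list behaviour described above can be tried with a sketch like the following. It assumes Text::Format is installed; the sample words and the column width of 4 are invented for illustration, and both indents are set to 0 so that the available width stays positive.

```perl
use strict;
use warnings;
use Text::Format;

# Every sample word is longer than the 4 available columns.
my $fmt = Text::Format->new(
    { columns => 4, firstIndent => 0, bodyIndent => 0 }
);

# join '' yields one string whether format hands back a list of
# lines or a single string; empty lines, if any, are dropped.
my @lines = grep { length }
            split /\n/, join '', $fmt->format("formatting paragraphs justified");

# According to the description, each word should sit on its own line.
print "$_\n" for @lines;
```

Setting rightAlign or a nonzero bodyIndent on the same object would be the way to get the right-aligned or indented list the paragraph mentions.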
Text::Format is meant for more powerful text formatting than what Text::Wrap
allows. I also have a module called Text::NWrap that is meant as a direct
replacement for Text::Wrap. Text::NWrap requires Text::Format since it uses
Text::Format->format to do the actual wrapping but gives you the interface of
Text::Wrap.
The general setup is illustrated by the diagram below.
                           columns
<------------------------------------------------------------>
<----------><------><---------------------------><----------->
 leftMargin  indent  text is formatted into here  rightMargin
indent is firstIndent or bodyIndent depending on where we are in the paragraph.
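As a worked example of the layout above, the width left over for words is the total columns minus both margins and the current indent. The helper usable_width below is hypothetical (it is not part of Text::Format) and only restates that arithmetic using the default values from the SYNOPSIS.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Text::Format: the width that
# remains for words once margins and indent are subtracted.
sub usable_width {
    my (%o) = @_;
    return $o{columns} - $o{leftMargin} - $o{rightMargin} - $o{indent};
}

# Defaults taken from the SYNOPSIS.
my %defaults = (columns => 72, leftMargin => 0, rightMargin => 0);

my $first_width = usable_width(%defaults, indent => 4);   # firstIndent
my $body_width  = usable_width(%defaults, indent => 0);   # bodyIndent

print "first line: $first_width, body lines: $body_width\n";
# prints "first line: 68, body lines: 72"
```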
- format @ARRAY || \@ARRAY || [<FILEHANDLE>] || NOTHING
- Allows one to do some advanced formatting of text into a
paragraph, with indent for first line and body set separately. Can specify
total width of text, right fill with spaces or right align or justify
(align to both margins), right margin and left margin, non-breaking space,
two spaces at end of sentence, hanging indents (tagged paragraphs). Strips
all leading and trailing whitespace before proceeding. Text is first split
into words and then reassembled. If no text is passed in then the internal
text in the object is formatted.
- paragraphs @ARRAY || \@ARRAY || [<FILEHANDLE>] || NOTHING
- Considers each element of text as a paragraph. If the
indents are the same for the first line and the body, then the paragraphs
are separated by a single empty line; otherwise they follow one under the
other. If hanging indent is set then a single empty line will separate
each paragraph as well. Calls format to do the actual formatting.
If no text is passed in then the internal text in the object is formatted,
though not changed.
- center @ARRAY || NOTHING
- Centers a list of strings in @ARRAY or internal text. Empty
lines appear as, you guessed it, empty lines. Center strips all leading
and trailing whitespace before proceeding. Left margin and right margin
can be set. If no text is passed in then the internal text in the object
is formatted.
- expand @ARRAY || NOTHING
- Expand tabs in the list of text to tabstop number of spaces
in @ARRAY or internal text. Doesn't modify the internal text; it just
passes back the modified text. If no text is passed in then the internal
text in the object is used.
- unexpand @ARRAY || NOTHING
- Tabstop number of spaces are turned into tabs in @ARRAY or
internal text. Doesn't modify the internal text; it just passes back the
modified text. If no text is passed in then the internal text in the
object is used.
- new \%HASH || NOTHING
- Instantiates the object. If you pass a reference to a hash,
or an anonymous hash then it is used in setting attributes.
- config \%HASH
- Allows the configuration of all object attributes at once.
Returns the object prior to configuration. You can use it to make a clone
of your object before you change attributes.
- columns NUMBER || NOTHING
- Set width of text or retrieve width. This is total width
and includes indentation and the right and left margins.
- tabstop NUMBER || NOTHING
- Set tabstop size or retrieve tabstop size, only used by
expand, unexpand and center.
- firstIndent NUMBER || NOTHING
- Set or get indent for the first line of paragraph. This is
the number of spaces to indent.
- bodyIndent NUMBER || NOTHING
- Set or get indent for the body of paragraph. This is the
number of spaces to indent.
- leftMargin NUMBER || NOTHING
- Set or get width of left margin. This is the number of
spaces used for the margin.
- rightMargin NUMBER || NOTHING
- Set or get width of right margin. This is the number of
spaces used for the margin.
- rightFill 0 || 1 || NOTHING
- Set right fill or retrieve its value. The filling is done
with spaces. Keep in mind that if rightAlign is also set then both
rightFill and rightAlign are ignored.
- rightAlign 0 || 1 || NOTHING
- Set right align or retrieve its value. Text is aligned with
the right side of the margin. Keep in mind that if rightFill is
also set then both rightFill and rightAlign are
ignored.
- justify 0 || 1 || NOTHING
- Set justify or retrieve its value. Text is aligned with
both margins, adding extra spaces as necessary to align text with left and
right margins. Keep in mind that if either rightAlign or
rightFill is set then justify is ignored. If both are set,
then all three are ignored.
- text \@ARRAY || NOTHING
- Pass in a reference to your text, or an anonymous array of
text that you want the routines to manipulate. Returns the text held in
the object.
- hangingIndent 0 || 1 || NOTHING
- Use hanging indents in front of a paragraph. Returns the
current value of the attribute. This is also called a tagged paragraph.
- hangingText \@ARRAY || NOTHING
- The text that will be displayed in front of each paragraph.
If you call format then only the first element is used; if you call
paragraphs then paragraphs cycles through all of them. If
you have more paragraphs than elements in your array, then the remainder
of the paragraphs will not have hanging indented text. Pass a reference
to your array. This is also called a tagged paragraph.
- noBreak 0 || 1 || NOTHING
- Set whether you want to use the non-breaking space
feature.
- noBreakRegex \%HASH || NOTHING
- Pass in a reference to your hash that would hold the
regexes on which not to break. Without any arguments, it returns the hash.
For example,
{'^Mrs?\.$' => '^\S+$','^\S+$' => '^(?:S|J)r\.$'}
will not break names such as Mr. Jones, Mrs. Jones, Jones Jr.
The breaking algorithm is simple. If there should not be a break at the
current end of sentence, then a backtrack is done till there are two words
on which breaking is allowed. If no two such words are found then the end
of sentence is broken anyhow. If there is a single word on current line
then no backtrack is done and the word is stuck on the end. This is so you
can make a list of names for example.
- extraSpace 0 || 1 || NOTHING
- Add extra space after end of sentence. Normally
format would add 1 space after end of sentence; if this is set to 1
then 2 spaces are used. Abbreviations are not followed by two spaces.
There are a few internal abbreviations and you can add your own to the
object with abbrevs.
- abbrevs \%HASH || @ARRAY || NOTHING
- Add to the current abbreviations. Takes a reference to your
hash or an array of abbreviations. If called a second time, the original
reference is removed and replaced by the new one. Returns the current
INTERNAL abbreviations.
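The accessor and config descriptions above can be exercised with a short sketch. It assumes Text::Format is installed and relies only on what the entries state: an accessor called with a value sets the attribute and called without one retrieves it, and config returns the object as it was prior to configuration. The variable names are invented for illustration.

```perl
use strict;
use warnings;
use Text::Format;

my $fmt = Text::Format->new;

# With an argument the accessor sets the attribute;
# without one it retrieves the current value.
$fmt->columns(60);
my $columns = $fmt->columns;

# config changes several attributes at once. Per the config entry
# it returns the object as it was before the change.
my $before = $fmt->config({ columns => 40, tabstop => 4 });

printf "before: %d columns, now: %d columns, tabstop %d\n",
    $before->columns, $fmt->columns, $fmt->tabstop;
```

Keeping the return value of config is a convenient way to restore or compare settings later, since the original object continues with the new values.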
EXAMPLE
use Text::Format;
my $text = Text::Format->new;
$text->rightFill(1);
$text->columns(65);
$text->tabstop(4);
print $text->format("a line to format to an indented regular
paragraph using 65 character wide display");
print $text->paragraphs("paragraph one","paragraph two");
print $text->center("hello world","nifty line 2");
print $text->expand("\t\thello world\n","hmm,\twell\n");
print $text->unexpand("        hello world\n","    hmm");
$text->config({columns => 132, tabstop => 4});
$text = Text::Format->new();
print $text->format(@text);
print $text->paragraphs(@text);
print $text->center(@text);
print $text->format([<FILEHANDLE>]);
print $text->format([$fh->getlines()]);
print $text->paragraphs([<FILEHANDLE>]);
print $text->expand(@text);
print $text->unexpand(@text);
$text = Text::Format->new
({tabstop => 4,bodyIndent => 4,text => \@text});
print $text->format();
print $text->paragraphs();
print $text->center();
print $text->expand();
print $text->unexpand();
print Text::Format->new({columns => 95})->format(@text);
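The calls above use @text and FILEHANDLE without defining them. A self-contained sketch of the format call might look like this, assuming Text::Format is installed; the sample strings and the settings (40 columns, first-line indent of 2) are invented for illustration.

```perl
use strict;
use warnings;
use Text::Format;

# Sample input invented for this sketch; any list of strings works.
my @text = (
    "Text::Format splits its input into words and",
    "reassembles them into lines no wider than the column limit.",
);

my $fmt = Text::Format->new(
    { columns => 40, firstIndent => 2, bodyIndent => 0 }
);

# join '' yields one string whether format hands back a list of
# lines or a single string.
my $out = join '', $fmt->format(@text);
print $out;

# Length of the longest output line, to compare with the limit.
my ($longest) = sort { $b <=> $a } map { length } split /\n/, $out;
print "longest line: $longest characters\n";
```

Because every sample word is far shorter than 40 characters, no line should exceed the column limit here; the BUGS section covers what happens when words are too long to fit.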
BUGS
Line length can exceed the number of specified columns if columns is set to a
small number and long words plus leading whitespace exceed the specified
column length. Actually I see this as a feature since it can be used to make
up a nice word list.
AUTHOR
Gabor Egressy
gabor@vmunix.com
Copyright (c) 1998 Gabor Egressy. All rights reserved. All wrongs reversed. This
program is free software; you can redistribute and/or modify it under the same
terms as Perl itself.
Adopted and modified by Shlomi Fish, <http://www.shlomifish.org/> - all
rights disclaimed.
ACKNOWLEDGMENTS
Tom Phoenix
Found a bug with code for two spaces at the end of the sentence and provided a
code fragment for a better solution. Also some preliminary suggestions on the
design.
Brad Appleton
Suggestion and explanation of hanging indents, suggestion for non-breaking
whitespace, general suggestions with regard to interface design.
Byron Brummer
Suggestion for better interface design and object design, code for better
implementation of getting abbreviations.
H. Merijn Brand
Suggestion for a justify feature and original code for doing the justification.
I changed the code to take into account the extra space at end of sentence
feature.
TODO