doc/tta_api/api_includes/Texinfo-Convert-Unicode.texi - texinfo - Git at Google

 @node Texinfo@asis{::}Convert@asis{::}Unicode
 @chapter Texinfo::Convert::Unicode

 @node Texinfo@asis{::}Convert@asis{::}Unicode NAME
 @section Texinfo::Convert::Unicode NAME

 Texinfo::Convert::Unicode - Representation as Unicode characters

 @node Texinfo@asis{::}Convert@asis{::}Unicode SYNOPSIS
 @section Texinfo::Convert::Unicode SYNOPSIS

 @verbatim
   use Texinfo::Convert::Unicode qw(unicode_accent encoded_accents
                                    unicode_text);
   use Texinfo::Convert::Text qw(convert_to_text);

   my ($contents_element, $stack)
       = Texinfo::Common::find_innermost_accent_contents($accent);

   my $formatted_accents = encoded_accents($converter,
                  convert_to_text($contents_element), $stack, $encoding,
                         \&Texinfo::Text::ascii_accent_fallback);
 @end verbatim

 @node Texinfo@asis{::}Convert@asis{::}Unicode NOTES
 @section Texinfo::Convert::Unicode NOTES

 The Texinfo Perl module main purpose is to be used in @code{texi2any} to convert
 Texinfo to other formats.  There is no promise of API stability.

 @node Texinfo@asis{::}Convert@asis{::}Unicode DESCRIPTION
 @section Texinfo::Convert::Unicode DESCRIPTION

 @code{Texinfo::Convert::Unicode} provides methods dealing with Unicode
 representation and conversion of Unicode code points, to be used in Texinfo
 converters.

 When an encoding supported in Texinfo is given as argument of a method of the
 module, the accented letters or characters returned by the method should only
 be represented by Unicode code points if it is known that Perl should manage
 to convert the Unicode code points to encoded characters in the encoding
 character set.  Note that the actual conversion is done by Perl, not by the
 module.

 @node Texinfo@asis{::}Convert@asis{::}Unicode METHODS
 @section Texinfo::Convert::Unicode METHODS

 @table @asis
 @item $result = brace_no_arg_command($command_name, $encoding)
 @anchor{Texinfo@asis{::}Convert@asis{::}Unicode $result = brace_no_arg_command($command_name@comma{} $encoding)}
 @cindex @code{brace_no_arg_command}

 Return the Unicode representation of a command with brace and no argument
 @emph{$command_name} (like @code{@@bullet@{@}}, @code{@@aa@{@}} or @code{@@guilsinglleft@{@}}),
 or @code{undef} if the Unicode representation cannot be converted to encoding
 @emph{$encoding}.

 @item $possible_conversion = check_unicode_point_conversion($arg, $output_debug)
 @anchor{Texinfo@asis{::}Convert@asis{::}Unicode $possible_conversion = check_unicode_point_conversion($arg@comma{} $output_debug)}
 @cindex @code{check_unicode_point_conversion}

 Check that it is possible to output actual UTF-8 binary bytes
 corresponding to the Unicode code point string @emph{$arg} (such as
 @code{201D}).  Perl gives a warning and will not output UTF-8 for
 Unicode non-characters such as U+10FFFF.  If the optional
 @emph{$output_debug} argument is set, a debugging output warning
 is emitted if the test of the conversion failed.
 Returns 1 if the conversion is possible and can be attempted,
 0 otherwise.

 @item $result = encoded_accents($converter, $text, $stack, $encoding, $format_accent, $set_case)
 @anchor{Texinfo@asis{::}Convert@asis{::}Unicode $result = encoded_accents($converter@comma{} $text@comma{} $stack@comma{} $encoding@comma{} $format_accent@comma{} $set_case)}
 @cindex @code{encoded_accents}

 @emph{$encoding} is the encoding the accented characters should be encoded to.  If
 @emph{$encoding} not set, @emph{$result} is set to @code{undef}.  Nested accents and
 their content are passed with @emph{$text} and @emph{$stack}.  @emph{$text} is the text
 appearing within nested accent commands.  @emph{$stack} is an array reference
 holding the nested accents texinfo tree elements.  In general, @emph{$text} is
 the formatted contents and @emph{$stack} the stack returned by
 @ref{Texinfo@asis{::}Common ($contents_element@comma{}
 \@@accent_commands) = find_innermost_accent_contents($element),, Texinfo::Common::find_innermost_accent_contents}.  The function
 tries to convert as much as possible the accents to @emph{$encoding} starting from the
 innermost accent.

 @emph{$format_accent} is a function reference that is used to format the accent
 commands if there is no encoded character available at some point of the
 conversion of the @emph{$stack}.  @emph{$converter} is a converter object optionaly
 used by @emph{$format_accent}.  It may be @code{undef} if there is no need of
 converter object in @emph{$format_accent}.

 The @emph{$set_case} argument is optional.  If @emph{$set_case} is positive, the result
 is upper-cased, while if it is negative, the result is lower-cased.

 @item $width = string_width($string)
 @anchor{Texinfo@asis{::}Convert@asis{::}Unicode $width = string_width($string)}
 @cindex @code{string_width}

 Return the string width, taking into account the fact that some characters
 have a zero width (like composing accents) while some have a width of 2
 (most chinese characters, for example).

 @item $result = unicode_accent($text, $accent_command, $index_in_stack, $accents_stack)
 @anchor{Texinfo@asis{::}Convert@asis{::}Unicode $result = unicode_accent($text@comma{} $accent_command@comma{} $index_in_stack@comma{} $accents_stack)}
 @cindex @code{unicode_accent}

 @emph{$text} is the text appearing within an accent command.  @emph{$accent_command}
 should be a Texinfo tree element corresponding to an accent command taking
 an argument.  @emph{$index_in_stack} is the position in the @emph{$accents_stack}
 of the accent command being converted.  The function returns the Unicode
 representation of the accented character.

 @item $is_decoded = unicode_point_decoded_in_encoding($encoding, $unicode_point)
 @anchor{Texinfo@asis{::}Convert@asis{::}Unicode $is_decoded = unicode_point_decoded_in_encoding($encoding@comma{} $unicode_point)}
 @cindex @code{unicode_point_decoded_in_encoding}

 Return true if the @emph{$unicode_point} will be encoded in the encoding
 @emph{$encoding}.  The @emph{$unicode_point} should be specified as a four letter
 string describing an hexadecimal number with letters in upper case
 (such as @code{201D}).  Tables are used to determine if the @emph{$unicode_point}
 will be encoded, when the encoding does not cover the whole Unicode range.

 If the encoding is not supported in Texinfo, the result will always be false.

 @item $result = unicode_text($text, $in_code)
 @anchor{Texinfo@asis{::}Convert@asis{::}Unicode $result = unicode_text($text@comma{} $in_code)}
 @cindex @code{unicode_text}

 Return @emph{$text} with dashes and quotes corresponding, for example to @code{---} or
 @code{'}, represented as Unicode code points.  If @emph{$in_code} is set, the text is
 considered to be in code style.

 @end table

 @node Texinfo@asis{::}Convert@asis{::}Unicode AUTHOR
 @section Texinfo::Convert::Unicode AUTHOR

 Patrice Dumas, <bug-texinfo@@gnu.org>

 @node Texinfo@asis{::}Convert@asis{::}Unicode COPYRIGHT AND LICENSE
 @section Texinfo::Convert::Unicode COPYRIGHT AND LICENSE

 Copyright 2010- Free Software Foundation, Inc.  See the source file for
 all copyright years.

 This library is free software; you can redistribute it and/or modify
 it under the terms of the GNU General Public License as published by
 the Free Software Foundation; either version 3 of the License, or (at
 your option) any later version.
	@node Texinfo@asis{::}Convert@asis{::}Unicode
	@chapter Texinfo::Convert::Unicode

	@node Texinfo@asis{::}Convert@asis{::}Unicode NAME
	@section Texinfo::Convert::Unicode NAME

	Texinfo::Convert::Unicode - Representation as Unicode characters

	@node Texinfo@asis{::}Convert@asis{::}Unicode SYNOPSIS
	@section Texinfo::Convert::Unicode SYNOPSIS

	@verbatim
	use Texinfo::Convert::Unicode qw(unicode_accent encoded_accents
	unicode_text);
	use Texinfo::Convert::Text qw(convert_to_text);

	my ($contents_element, $stack)
	= Texinfo::Common::find_innermost_accent_contents($accent);

	my $formatted_accents = encoded_accents($converter,
	convert_to_text($contents_element), $stack, $encoding,
	\&Texinfo::Text::ascii_accent_fallback);
	@end verbatim

	@node Texinfo@asis{::}Convert@asis{::}Unicode NOTES
	@section Texinfo::Convert::Unicode NOTES

	The Texinfo Perl module main purpose is to be used in @code{texi2any} to convert
	Texinfo to other formats. There is no promise of API stability.

	@node Texinfo@asis{::}Convert@asis{::}Unicode DESCRIPTION
	@section Texinfo::Convert::Unicode DESCRIPTION

	@code{Texinfo::Convert::Unicode} provides methods dealing with Unicode
	representation and conversion of Unicode code points, to be used in Texinfo
	converters.

	When an encoding supported in Texinfo is given as argument of a method of the
	module, the accented letters or characters returned by the method should only
	be represented by Unicode code points if it is known that Perl should manage
	to convert the Unicode code points to encoded characters in the encoding
	character set. Note that the actual conversion is done by Perl, not by the
	module.

	@node Texinfo@asis{::}Convert@asis{::}Unicode METHODS
	@section Texinfo::Convert::Unicode METHODS

	@table @asis
	@item $result = brace_no_arg_command($command_name, $encoding)
	@anchor{Texinfo@asis{::}Convert@asis{::}Unicode $result = brace_no_arg_command($command_name@comma{} $encoding)}
	@cindex @code{brace_no_arg_command}

	Return the Unicode representation of a command with brace and no argument
	@emph{$command_name} (like @code{@@bullet@{@}}, @code{@@aa@{@}} or @code{@@guilsinglleft@{@}}),
	or @code{undef} if the Unicode representation cannot be converted to encoding
	@emph{$encoding}.

	@item $possible_conversion = check_unicode_point_conversion($arg, $output_debug)
	@anchor{Texinfo@asis{::}Convert@asis{::}Unicode $possible_conversion = check_unicode_point_conversion($arg@comma{} $output_debug)}
	@cindex @code{check_unicode_point_conversion}

	Check that it is possible to output actual UTF-8 binary bytes
	corresponding to the Unicode code point string @emph{$arg} (such as
	@code{201D}). Perl gives a warning and will not output UTF-8 for
	Unicode non-characters such as U+10FFFF. If the optional
	@emph{$output_debug} argument is set, a debugging output warning
	is emitted if the test of the conversion failed.
	Returns 1 if the conversion is possible and can be attempted,
	0 otherwise.

	@item $result = encoded_accents($converter, $text, $stack, $encoding, $format_accent, $set_case)
	@anchor{Texinfo@asis{::}Convert@asis{::}Unicode $result = encoded_accents($converter@comma{} $text@comma{} $stack@comma{} $encoding@comma{} $format_accent@comma{} $set_case)}
	@cindex @code{encoded_accents}

	@emph{$encoding} is the encoding the accented characters should be encoded to. If
	@emph{$encoding} not set, @emph{$result} is set to @code{undef}. Nested accents and
	their content are passed with @emph{$text} and @emph{$stack}. @emph{$text} is the text
	appearing within nested accent commands. @emph{$stack} is an array reference
	holding the nested accents texinfo tree elements. In general, @emph{$text} is
	the formatted contents and @emph{$stack} the stack returned by
	@ref{Texinfo@asis{::}Common ($contents_element@comma{}
	\@@accent_commands) = find_innermost_accent_contents($element),, Texinfo::Common::find_innermost_accent_contents}. The function
	tries to convert as much as possible the accents to @emph{$encoding} starting from the
	innermost accent.

	@emph{$format_accent} is a function reference that is used to format the accent
	commands if there is no encoded character available at some point of the
	conversion of the @emph{$stack}. @emph{$converter} is a converter object optionaly
	used by @emph{$format_accent}. It may be @code{undef} if there is no need of
	converter object in @emph{$format_accent}.

	The @emph{$set_case} argument is optional. If @emph{$set_case} is positive, the result
	is upper-cased, while if it is negative, the result is lower-cased.

	@item $width = string_width($string)
	@anchor{Texinfo@asis{::}Convert@asis{::}Unicode $width = string_width($string)}
	@cindex @code{string_width}

	Return the string width, taking into account the fact that some characters
	have a zero width (like composing accents) while some have a width of 2
	(most chinese characters, for example).

	@item $result = unicode_accent($text, $accent_command, $index_in_stack, $accents_stack)
	@anchor{Texinfo@asis{::}Convert@asis{::}Unicode $result = unicode_accent($text@comma{} $accent_command@comma{} $index_in_stack@comma{} $accents_stack)}
	@cindex @code{unicode_accent}

	@emph{$text} is the text appearing within an accent command. @emph{$accent_command}
	should be a Texinfo tree element corresponding to an accent command taking
	an argument. @emph{$index_in_stack} is the position in the @emph{$accents_stack}
	of the accent command being converted. The function returns the Unicode
	representation of the accented character.

	@item $is_decoded = unicode_point_decoded_in_encoding($encoding, $unicode_point)
	@anchor{Texinfo@asis{::}Convert@asis{::}Unicode $is_decoded = unicode_point_decoded_in_encoding($encoding@comma{} $unicode_point)}
	@cindex @code{unicode_point_decoded_in_encoding}

	Return true if the @emph{$unicode_point} will be encoded in the encoding
	@emph{$encoding}. The @emph{$unicode_point} should be specified as a four letter
	string describing an hexadecimal number with letters in upper case
	(such as @code{201D}). Tables are used to determine if the @emph{$unicode_point}
	will be encoded, when the encoding does not cover the whole Unicode range.

	If the encoding is not supported in Texinfo, the result will always be false.

	@item $result = unicode_text($text, $in_code)
	@anchor{Texinfo@asis{::}Convert@asis{::}Unicode $result = unicode_text($text@comma{} $in_code)}
	@cindex @code{unicode_text}

	Return @emph{$text} with dashes and quotes corresponding, for example to @code{---} or
	@code{'}, represented as Unicode code points. If @emph{$in_code} is set, the text is
	considered to be in code style.

	@end table

	@node Texinfo@asis{::}Convert@asis{::}Unicode AUTHOR
	@section Texinfo::Convert::Unicode AUTHOR

	Patrice Dumas, <bug-texinfo@@gnu.org>

	@node Texinfo@asis{::}Convert@asis{::}Unicode COPYRIGHT AND LICENSE
	@section Texinfo::Convert::Unicode COPYRIGHT AND LICENSE

	Copyright 2010- Free Software Foundation, Inc. See the source file for
	all copyright years.

	This library is free software; you can redistribute it and/or modify
	it under the terms of the GNU General Public License as published by
	the Free Software Foundation; either version 3 of the License, or (at
	your option) any later version.