.\" Automatically generated by Pod::Man v1.37, Pod::Parser v1.14 .\" .\" Standard preamble: .\" ======================================================================== .de Sh \" Subsection heading .br .if t .Sp .ne 5 .PP \fB\\$1\fR .PP .. .de Sp \" Vertical space (when we can't use .PP) .if t .sp .5v .if n .sp .. .de Vb \" Begin verbatim text .ft CW .nf .ne \\$1 .. .de Ve \" End verbatim text .ft R .fi .. .\" Set up some character translations and predefined strings. \*(-- will .\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left .\" double quote, and \*(R" will give a right double quote. | will give a .\" real vertical bar. \*(C+ will give a nicer C++. Capital omega is used to .\" do unbreakable dashes and therefore won't be available. \*(C` and \*(C' .\" expand to `' in nroff, nothing in troff, for use with C<>. .tr \(*W-|\(bv\*(Tr .ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p' .ie n \{\ . ds -- \(*W- . ds PI pi . if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch . if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch . ds L" "" . ds R" "" . ds C` "" . ds C' "" 'br\} .el\{\ . ds -- \|\(em\| . ds PI \(*p . ds L" `` . ds R" '' 'br\} .\" .\" If the F register is turned on, we'll generate index entries on stderr for .\" titles (.TH), headers (.SH), subsections (.Sh), items (.Ip), and index .\" entries marked with X<> in POD. Of course, you'll have to process the .\" output yourself in some meaningful fashion. .if \nF \{\ . de IX . tm Index:\\$1\t\\n%\t"\\$2" .. . nr % 0 . rr F .\} .\" .\" For nroff, turn off justification. Always turn off hyphenation; it makes .\" way too many mistakes in technical documents. .hy 0 .if n .na .\" .\" Accent mark definitions (@(#)ms.acc 1.5 88/02/08 SMI; from UCB 4.2). .\" Fear. Run. Save yourself. No user-serviceable parts. . \" fudge factors for nroff and troff .if n \{\ . ds #H 0 . ds #V .8m . ds #F .3m . ds #[ \f1 . ds #] \fP .\} .if t \{\ . 
ds #H ((1u-(\\\\n(.fu%2u))*.13m) . ds #V .6m . ds #F 0 . ds #[ \& . ds #] \& .\} . \" simple accents for nroff and troff .if n \{\ . ds ' \& . ds ` \& . ds ^ \& . ds , \& . ds ~ ~ . ds / .\} .if t \{\ . ds ' \\k:\h'-(\\n(.wu*8/10-\*(#H)'\'\h"|\\n:u" . ds ` \\k:\h'-(\\n(.wu*8/10-\*(#H)'\`\h'|\\n:u' . ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'^\h'|\\n:u' . ds , \\k:\h'-(\\n(.wu*8/10)',\h'|\\n:u' . ds ~ \\k:\h'-(\\n(.wu-\*(#H-.1m)'~\h'|\\n:u' . ds / \\k:\h'-(\\n(.wu*8/10-\*(#H)'\z\(sl\h'|\\n:u' .\} . \" troff and (daisy-wheel) nroff accents .ds : \\k:\h'-(\\n(.wu*8/10-\*(#H+.1m+\*(#F)'\v'-\*(#V'\z.\h'.2m+\*(#F'.\h'|\\n:u'\v'\*(#V' .ds 8 \h'\*(#H'\(*b\h'-\*(#H' .ds o \\k:\h'-(\\n(.wu+\w'\(de'u-\*(#H)/2u'\v'-.3n'\*(#[\z\(de\v'.3n'\h'|\\n:u'\*(#] .ds d- \h'\*(#H'\(pd\h'-\w'~'u'\v'-.25m'\f2\(hy\fP\v'.25m'\h'-\*(#H' .ds D- D\\k:\h'-\w'D'u'\v'-.11m'\z\(hy\v'.11m'\h'|\\n:u' .ds th \*(#[\v'.3m'\s+1I\s-1\v'-.3m'\h'-(\w'I'u*2/3)'\s-1o\s+1\*(#] .ds Th \*(#[\s+2I\s-2\h'-\w'I'u*3/5'\v'-.3m'o\v'.3m'\*(#] .ds ae a\h'-(\w'a'u*4/10)'e .ds Ae A\h'-(\w'A'u*4/10)'E . \" corrections for vroff .if v .ds ~ \\k:\h'-(\\n(.wu*9/10-\*(#H)'\s-2\u~\d\s+2\h'|\\n:u' .if v .ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'\v'-.4m'^\v'.4m'\h'|\\n:u' . \" for low resolution devices (crt and lpr) .if \n(.H>23 .if \n(.V>19 \ \{\ . ds : e . ds 8 ss . ds o a . ds d- d\h'-1'\(ga . ds D- D\h'-1'\(hy . ds th \o'bp' . ds Th \o'LP' . ds ae ae . ds Ae AE .\} .rm #[ #] #H #V #F C .\" ======================================================================== .\" .IX Title "Writer 3" .TH Writer 3 "2006-09-14" "perl v5.8.4" "User Contributed Perl Documentation" .SH "NAME" XML::SAX::Writer \- SAX2 Writer .SH "SYNOPSIS" .IX Header "SYNOPSIS" .Vb 2 \& use XML::SAX::Writer; \& use XML::SAX::SomeDriver; .Ve .PP .Vb 2 \& my $w = XML::SAX::Writer->new; \& my $d = XML::SAX::SomeDriver->new(Handler => $w); .Ve .PP .Vb 1 \& $d->parse('some options...'); .Ve .SH "DESCRIPTION" .IX Header "DESCRIPTION" .Sh "Why yet another \s-1XML\s0 Writer ?" 
.IX Subsection "Why yet another XML Writer ?" A new \s-1XML\s0 Writer was needed to match the \s-1SAX2\s0 effort because quite naturally no existing writer understood \s-1SAX2\s0. My first intention had been to start patching XML::Handler::YAWriter as it had previously been my favourite writer in the \s-1SAX1\s0 world. .PP However the more I patched it the more I realised that what I thought was going to be a simple patch (mostly adding a few event handlers and changing the attribute syntax) was turning out to be a rewrite due to various ideas I'd been collecting along the way. Besides, I couldn't find a way to elegantly make it work with \s-1SAX2\s0 without breaking the \&\s-1SAX1\s0 compatibility which people are probably still using. There are of course ways to do that, but most require user interaction which is something I wanted to avoid. .PP So in the end there was a new writer. I think it's in fact better this way as it helps keep \s-1SAX1\s0 and \s-1SAX2\s0 separated. .SH "METHODS" .IX Header "METHODS" .IP "* new(%hash)" 4 .IX Item "new(%hash)" This is the constructor for this object.  It takes a number of parameters, all of which are optional. .IP "\-\- Output" 4 .IX Item "-- Output" This parameter can be one of several things.  If it is a simple scalar, it is interpreted as a filename which will be opened for writing.  If it is a scalar reference, output will be appended to this scalar.  If it is an array reference, output will be pushed onto this array as it is generated.  If it is a filehandle, then output will be sent to this filehandle. .Sp Finally, it is possible to pass an object for this parameter, in which case it is assumed to be an object that implements the consumer interface described later in the documentation. .Sp If this parameter is not provided, then output is sent to \s-1STDOUT\s0. 
.IP "\-\- Escape" 4 .IX Item "-- Escape" This should be a hash reference where the keys are character sequences that should be escaped and the values are the escaped form of the sequence.  By default, this module will escape the ampersand (&), less than (<), greater than (>), double quote ("), and apostrophe ('). Note that some browsers don't support the &apos; escape used for apostrophes, so you should be careful when outputting \s-1XHTML\s0. .Sp If you only want to add entries to the Escape hash, you can first copy the contents of \f(CW%XML::SAX::Writer::DEFAULT_ESCAPE\fR. .IP "\-\- CommentEscape" 4 .IX Item "-- CommentEscape" Comment content often needs to be escaped differently from other content. This option works exactly as the previous one except that by default it only escapes the double dash (\-\-) and that the contents can be copied from \f(CW%XML::SAX::Writer::COMMENT_ESCAPE\fR. .IP "\-\- EncodeFrom" 4 .IX Item "-- EncodeFrom" The character set encoding in which incoming data will be provided. This defaults to \s-1UTF\-8\s0, which works for US-ASCII as well. .IP "\-\- EncodeTo" 4 .IX Item "-- EncodeTo" The character set encoding in which output should be encoded.  Again, this defaults to \s-1UTF\-8\s0. .SH "THE CONSUMER INTERFACE" .IX Header "THE CONSUMER INTERFACE" XML::SAX::Writer can receive pluggable consumer objects that will be in charge of writing out what is formatted by this module. Setting a Consumer is done by setting the Output option to the object of your choice instead of to an array, scalar, or file handle as is more commonly done (internally those in fact map to Consumer classes and are simply available as options for your convenience). .PP If you don't understand this, don't worry. You don't need it most of the time. .PP That object can be from any class, but must have two methods in its \&\s-1API\s0. 
It is also strongly recommended that it inherits from XML::SAX::Writer::ConsumerInterface so that it will not break if that interface evolves over time. There are examples at the end of XML::SAX::Writer's code. .PP The two methods that it needs to implement are: .IP "* output \s-1STRING\s0" 4 .IX Item "output STRING" (Required) .Sp This is called whenever the Writer wants to output a string formatted in \s-1XML\s0. Encoding conversion, character escaping, and formatting have already taken place. It's up to the consumer to do whatever it wants with the string. .IP "* \fIfinalize()\fR" 4 .IX Item "finalize()" (Optional) .Sp This is called once the document has been output in its entirety, during the end_document event. end_document will in fact return whatever \fIfinalize()\fR returns, and that in turn should be returned by \fIparse()\fR for whatever parser was invoked. It might be useful if you need to provide feedback of some sort. .PP Here's an example of a custom consumer. Note the extra \f(CW\*(C`$\*(C'\fR signs in front of \f(CW$self\fR; the base class is optimized for the overwhelmingly common case where only one data member is required and \f(CW$self\fR is a reference to that data member. .PP .Vb 1 \& package MyConsumer; .Ve .PP .Vb 1 \& @ISA = qw( XML::SAX::Writer::ConsumerInterface ); .Ve .PP .Vb 1 \& use strict; .Ve .PP .Vb 2 \& sub new { \& my $self = shift->SUPER::new( my $output ); .Ve .PP .Vb 1 \& $$self = ''; # Note the extra '$' .Ve .PP .Vb 2 \& return $self; \& } .Ve .PP .Vb 4 \& sub output { \& my $self = shift; \& $$self .= uc shift; \& } .Ve .PP .Vb 4 \& sub get_output { \& my $self = shift; \& return $$self; \& } .Ve .PP And here's one way to use it: .PP .Vb 2 \& my $c = MyConsumer->new; \& my $w = XML::SAX::Writer->new( Output => $c ); .Ve .PP .Vb 1 \& ## ... send events to $w ... 
.Ve .PP .Vb 1 \& print $c->get_output; .Ve .PP If you need to store more than one data member, pass in an array or hash reference: .PP .Vb 1 \& my $self = shift->SUPER::new( {} ); .Ve .PP and access it like: .PP .Vb 4 \& sub output { \& my $self = shift; \& $$self->{Output} .= uc shift; \& } .Ve .SH "THE ENCODER INTERFACE" .IX Header "THE ENCODER INTERFACE" Encoders can be plugged in to allow one to use one's favourite encoder object. Presently there are two encoders: Iconv and NullEncoder, and one based on \f(CW\*(C`Encode\*(C'\fR ought to be out soon. They need to implement two methods, and may inherit from XML::SAX::Writer::NullConverter if they wish: .IP "new \s-1FROM_ENCODING\s0, \s-1TO_ENCODING\s0" 4 .IX Item "new FROM_ENCODING, TO_ENCODING" Creates a new Encoder. The arguments are the chosen encodings. .IP "convert \s-1STRING\s0" 4 .IX Item "convert STRING" Converts that string and returns it. .SH "CUSTOM OUTPUT" .IX Header "CUSTOM OUTPUT" This module is generally used to write \s-1XML\s0 \*(-- which it does most of the time \*(-- but just like the rest of \s-1SAX\s0 it can be used as a generic framework to output data, the opposite of a non-XML \s-1SAX\s0 parser. .PP Of course there's only so much that one can abstract, so depending on your format this may or may not be useful. If it is, you'll need to know the following \s-1API\s0 (and probably to have a look inside \&\f(CW\*(C`XML::SAX::Writer::XML\*(C'\fR, the default Writer). .IP "init" 4 .IX Item "init" Called before the writing starts, it's a chance for the subclass to do some initialisation if it needs it. .IP "setConverter" 4 .IX Item "setConverter" This is used to set the proper converter for character encodings. The default implementation should suffice but you can override it. It must set \f(CW\*(C`$self\->{Encoder}\*(C'\fR to an Encoder object. Subclasses *should* call it. 
.IP "setConsumer" 4 .IX Item "setConsumer" Same as above, except that it is for the Consumer object, and that it must set \f(CW\*(C`$self\->{Consumer}\*(C'\fR. .IP "setEscaperRegex" 4 .IX Item "setEscaperRegex" Will initialise the escaping regex \f(CW\*(C`$self\->{EscaperRegex}\*(C'\fR based on what is needed. .IP "escape \s-1STRING\s0" 4 .IX Item "escape STRING" Takes a string and escapes it properly. .IP "setCommentEscaperRegex and escapeComment \s-1STRING\s0" 4 .IX Item "setCommentEscaperRegex and escapeComment STRING" These work exactly the same as the two above, except that they are meant to operate on comment contents, which often have different escaping rules than those that apply to regular content. .SH "TODO" .IX Header "TODO" .Vb 1 \& - proper UTF-16 handling .Ve .PP .Vb 4 \& - make the quote character an option. By default it is here ', but \& I know that a lot of people (for reasons I don't understand but \& won't question :-) prefer to use ". (on most keyboards " is more \& typing, on the rest it's often as much typing). .Ve .PP .Vb 1 \& - the formatting options need to be developed. .Ve .PP .Vb 1 \& - test, test, test (and then some tests) .Ve .PP .Vb 1 \& - doc, doc, doc (actually this part is in better shape) .Ve .PP .Vb 6 \& - add support for Perl 5.7's Encode module so that we can use it \& instead of Text::Iconv. Encode is more complete and likely to be \& better supported overall. This will be done using a pluggable \& encoder (so that users can provide their own if they want to) \& and detected both in Makefile.PL requirements and in the module \& at runtime. 
.Ve .PP .Vb 2 \& - remove the xml_decl and replace it with intelligent logic, as \& discussed on perl-xml .Ve .PP .Vb 2 \& - make the Consumer selecting code available in the API, to avoid \& duplicating it .Ve .PP .Vb 1 \& - add an Apache output Consumer, triggered by passing $r as Output .Ve .SH "CREDITS" .IX Header "CREDITS" Michael Koehne (XML::Handler::YAWriter) for much inspiration and Barrie Slaymaker for the Consumer pattern idea, the coderef output option and miscellaneous bugfixes and performance tweaks. Of course the usual suspects (Kip Hampton and Matt Sergeant) helped in the usual ways. .SH "AUTHOR" .IX Header "AUTHOR" Robin Berjon, robin@knowscape.com .SH "COPYRIGHT" .IX Header "COPYRIGHT" Copyright (c) 2001\-2006 Robin Berjon and Perl \s-1XML\s0 project. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself. .SH "SEE ALSO" .IX Header "SEE ALSO" XML::SAX::*
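.SH "EXAMPLES"
.IX Header "EXAMPLES"
The Escape option under \s-1METHODS\s0 and the setEscaperRegex and escape entries under \s-1CUSTOM OUTPUT\s0 both describe escaping driven by a hash whose keys are the sequences to escape and whose values are their escaped forms. The following is a minimal standalone sketch of that idea in plain Perl. It does not call XML::SAX::Writer and is not the module's actual implementation: the names \f(CW%escape\fR, \f(CW$escaper_regex\fR and \f(CWescape_string\fR are hypothetical, chosen only for illustration.
.PP
```perl
use strict;
use warnings;

# Hypothetical stand-in for an Escape hash: keys are the character
# sequences to escape, values are their escaped forms.
my %escape = (
    '&' => '&amp;',
    '<' => '&lt;',
    '>' => '&gt;',
    '"' => '&quot;',
    "'" => '&apos;',
);

# Build one alternation regex from the keys. Longer keys are placed
# first so a multi-character sequence wins over any of its prefixes.
my $alternation = join '|',
    map { quotemeta($_) }
    sort { length($b) <=> length($a) } keys %escape;
my $escaper_regex = qr/($alternation)/;

# Hypothetical helper, not part of XML::SAX::Writer's API.
# A single pass means already-escaped output is never escaped again.
sub escape_string {
    my ($string) = @_;
    $string =~ s/$escaper_regex/$escape{$1}/g;
    return $string;
}

print escape_string(q{a < b && c > "d"}), "\n";
# prints: a &lt; b &amp;&amp; c &gt; &quot;d&quot;
```
.PP
Adding an entry to the hash before the regex is built is enough to extend the set of escaped sequences, which is the same pattern the Escape option suggests when it recommends copying the default hash first.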