package IO::ScalarArray; =head1 NAME IO::ScalarArray - IO:: interface for reading/writing an array of scalars =head1 SYNOPSIS Perform I/O on strings, using the basic OO interface... use IO::ScalarArray; @data = ("My mes", "sage:\n"); ### Open a handle on an array, and append to it: $AH = new IO::ScalarArray \@data; $AH->print("Hello"); $AH->print(", world!\nBye now!\n"); print "The array is now: ", @data, "\n"; ### Open a handle on an array, read it line-by-line, then close it: $AH = new IO::ScalarArray \@data; while (defined($_ = $AH->getline)) { print "Got line: $_"; } $AH->close; ### Open a handle on an array, and slurp in all the lines: $AH = new IO::ScalarArray \@data; print "All lines:\n", $AH->getlines; ### Get the current position (either of two ways): $pos = $AH->getpos; $offset = $AH->tell; ### Set the current position (either of two ways): $AH->setpos($pos); $AH->seek($offset, 0); ### Open an anonymous temporary array: $AH = new IO::ScalarArray; $AH->print("Hi there!"); print "I printed: ", @{$AH->aref}, "\n"; ### get at value Don't like OO for your I/O? No problem. Thanks to the magic of an invisible tie(), the following now works out of the box, just as it does with IO::Handle: use IO::ScalarArray; @data = ("My mes", "sage:\n"); ### Open a handle on an array, and append to it: $AH = new IO::ScalarArray \@data; print $AH "Hello"; print $AH ", world!\nBye now!\n"; print "The array is now: ", @data, "\n"; ### Open a handle on a string, read it line-by-line, then close it: $AH = new IO::ScalarArray \@data; while (<$AH>) { print "Got line: $_"; } close $AH; ### Open a handle on a string, and slurp in all the lines: $AH = new IO::ScalarArray \@data; print "All lines:\n", <$AH>; ### Get the current position (WARNING: requires 5.6): $offset = tell $AH; ### Set the current position (WARNING: requires 5.6): seek $AH, $offset, 0; ### Open an anonymous temporary scalar: $AH = new IO::ScalarArray; print $AH "Hi there!"; print "I printed: ", @{$AH->aref}, "\n"; ### get at value And for you folks with 1.x code out there: the old tie() style still works, though this is I: use IO::ScalarArray; ### Writing to a scalar... my @a; tie *OUT, 'IO::ScalarArray', \@a; print OUT "line 1\nline 2\n", "line 3\n"; print "Array is now: ", @a, "\n" ### Reading and writing an anonymous scalar... tie *OUT, 'IO::ScalarArray'; print OUT "line 1\nline 2\n", "line 3\n"; tied(OUT)->seek(0,0); while () { print "Got line: ", $_; } =head1 DESCRIPTION This class is part of the IO::Stringy distribution; see L for change log and general information. The IO::ScalarArray class implements objects which behave just like IO::Handle (or FileHandle) objects, except that you may use them to write to (or read from) arrays of scalars. Logically, an array of scalars defines an in-core "file" whose contents are the concatenation of the scalars in the array. The handles created by this class are automatically tiehandle'd (though please see L<"WARNINGS"> for information relevant to your Perl version). For writing large amounts of data with individual print() statements, this class is likely to be more efficient than IO::Scalar. Basically, this: my @a; $AH = new IO::ScalarArray \@a; $AH->print("Hel", "lo, "); ### OO style $AH->print("world!\n"); ### ditto Or this: my @a; $AH = new IO::ScalarArray \@a; print $AH "Hel", "lo, "; ### non-OO style print $AH "world!\n"; ### ditto Causes @a to be set to the following array of 3 strings: ( "Hel" , "lo, " , "world!\n" ) See L and compare with this class. =head1 PUBLIC INTERFACE =cut use Carp; use strict; use vars qw($VERSION @ISA); use IO::Handle; # The package version, both in 1.23 style *and* usable by MakeMaker: $VERSION = "2.110"; # Inheritance: @ISA = qw(IO::Handle); require IO::WrapTie and push @ISA, 'IO::WrapTie::Slave' if ($] >= 5.004); #============================== =head2 Construction =over 4 =cut #------------------------------ =item new [ARGS...] I Return a new, unattached array handle. If any arguments are given, they're sent to open(). =cut sub new { my $proto = shift; my $class = ref($proto) || $proto; my $self = bless \do { local *FH }, $class; tie *$self, $class, $self; $self->open(@_); ### open on anonymous by default $self; } sub DESTROY { shift->close; } #------------------------------ =item open [ARRAYREF] I Open the array handle on a new array, pointed to by ARRAYREF. If no ARRAYREF is given, a "private" array is created to hold the file data. Returns the self object on success, undefined on error. =cut sub open { my ($self, $aref) = @_; ### Sanity: defined($aref) or do {my @a; $aref = \@a}; (ref($aref) eq "ARRAY") or croak "open needs a ref to a array"; ### Setup: $self->setpos([0,0]); *$self->{AR} = $aref; $self; } #------------------------------ =item opened I Is the array handle opened on something? =cut sub opened { *{shift()}->{AR}; } #------------------------------ =item close I Disassociate the array handle from its underlying array. Done automatically on destroy. =cut sub close { my $self = shift; %{*$self} = (); 1; } =back =cut #============================== =head2 Input and output =over 4 =cut #------------------------------ =item flush I No-op, provided for OO compatibility. =cut sub flush { "0 but true" } #------------------------------ =item getc I Return the next character, or undef if none remain. This does a read(1), which is somewhat costly. =cut sub getc { my $buf = ''; ($_[0]->read($buf, 1) ? $buf : undef); } #------------------------------ =item getline I Return the next line, or undef on end of data. Can safely be called in an array context. Currently, lines are delimited by "\n". =cut sub getline { my $self = shift; my ($str, $line) = (undef, ''); ### Minimal impact implementation! ### We do the fast fast thing (no regexps) if using the ### classic input record separator. ### Case 1: $/ is undef: slurp all... if (!defined($/)) { return undef if ($self->eof); ### Get the rest of the current string, followed by remaining strings: my $ar = *$self->{AR}; my @slurp = ( substr($ar->[*$self->{Str}], *$self->{Pos}), @$ar[(1 + *$self->{Str}) .. $#$ar ] ); ### Seek to end: $self->_setpos_to_eof; return join('', @slurp); } ### Case 2: $/ is "\n": elsif ($/ eq "\012") { ### Until we hit EOF (or exitted because of a found line): until ($self->eof) { ### If at end of current string, go fwd to next one (won't be EOF): if ($self->_eos) {++*$self->{Str}, *$self->{Pos}=0}; ### Get ref to current string in array, and set internal pos mark: $str = \(*$self->{AR}[*$self->{Str}]); ### get current string pos($$str) = *$self->{Pos}; ### start matching from here ### Get from here to either \n or end of string, and add to line: $$str =~ m/\G(.*?)((\n)|\Z)/g; ### match to 1st \n or EOS $line .= $1.$2; ### add it *$self->{Pos} += length($1.$2); ### move fwd by len matched return $line if $3; ### done, got line with "\n" } return ($line eq '') ? undef : $line; ### return undef if EOF } ### Case 3: $/ is ref to int. Bail out. elsif (ref($/)) { croak '$/ given as a ref to int; currently unsupported'; } ### Case 4: $/ is either "" (paragraphs) or something weird... ### Bail for now. else { croak '$/ as given is currently unsupported'; } } #------------------------------ =item getlines I Get all remaining lines. It will croak() if accidentally called in a scalar context. =cut sub getlines { my $self = shift; wantarray or croak("can't call getlines in scalar context!"); my ($line, @lines); push @lines, $line while (defined($line = $self->getline)); @lines; } #------------------------------ =item print ARGS... I Print ARGS to the underlying array. Currently, this always causes a "seek to the end of the array" and generates a new array entry. This may change in the future. =cut sub print { my $self = shift; push @{*$self->{AR}}, join('', @_) . (defined($\) ? $\ : ""); ### add the data $self->_setpos_to_eof; 1; } #------------------------------ =item read BUF, NBYTES, [OFFSET]; I Read some bytes from the array. Returns the number of bytes actually read, 0 on end-of-file, undef on error. =cut sub read { my $self = $_[0]; ### we must use $_[1] as a ref my $n = $_[2]; my $off = $_[3] || 0; ### print "getline\n"; my $justread; my $len; ($off ? substr($_[1], $off) : $_[1]) = ''; ### Stop when we have zero bytes to go, or when we hit EOF: my @got; until (!$n or $self->eof) { ### If at end of current string, go forward to next one (won't be EOF): if ($self->_eos) { ++*$self->{Str}; *$self->{Pos} = 0; } ### Get longest possible desired substring of current string: $justread = substr(*$self->{AR}[*$self->{Str}], *$self->{Pos}, $n); $len = length($justread); push @got, $justread; $n -= $len; *$self->{Pos} += $len; } $_[1] .= join('', @got); return length($_[1])-$off; } #------------------------------ =item write BUF, NBYTES, [OFFSET]; I Write some bytes into the array. =cut sub write { my $self = $_[0]; my $n = $_[2]; my $off = $_[3] || 0; my $data = substr($_[1], $n, $off); $n = length($data); $self->print($data); return $n; } =back =cut #============================== =head2 Seeking/telling and other attributes =over 4 =cut #------------------------------ =item autoflush I No-op, provided for OO compatibility. =cut sub autoflush {} #------------------------------ =item binmode I No-op, provided for OO compatibility. =cut sub binmode {} #------------------------------ =item clearerr I Clear the error and EOF flags. A no-op. =cut sub clearerr { 1 } #------------------------------ =item eof I Are we at end of file? =cut sub eof { ### print "checking EOF [*$self->{Str}, *$self->{Pos}]\n"; ### print "SR = ", $#{*$self->{AR}}, "\n"; return 0 if (*{$_[0]}->{Str} < $#{*{$_[0]}->{AR}}); ### before EOA return 1 if (*{$_[0]}->{Str} > $#{*{$_[0]}->{AR}}); ### after EOA ### ### at EOA, past EOS: ((*{$_[0]}->{Str} == $#{*{$_[0]}->{AR}}) && ($_[0]->_eos)); } #------------------------------ # # _eos # # I Are we at end of the CURRENT string? # sub _eos { (*{$_[0]}->{Pos} >= length(*{$_[0]}->{AR}[*{$_[0]}->{Str}])); ### past last char } #------------------------------ =item seek POS,WHENCE I Seek to a given position in the stream. Only a WHENCE of 0 (SEEK_SET) is supported. =cut sub seek { my ($self, $pos, $whence) = @_; ### Seek: if ($whence == 0) { $self->_seek_set($pos); } elsif ($whence == 1) { $self->_seek_cur($pos); } elsif ($whence == 2) { $self->_seek_end($pos); } else { croak "bad seek whence ($whence)" } return 1; } #------------------------------ # # _seek_set POS # # Instance method, private. # Seek to $pos relative to start: # sub _seek_set { my ($self, $pos) = @_; ### Advance through array until done: my $istr = 0; while (($pos >= 0) && ($istr < scalar(@{*$self->{AR}}))) { if (length(*$self->{AR}[$istr]) > $pos) { ### it's in this string! return $self->setpos([$istr, $pos]); } else { ### it's in next string $pos -= length(*$self->{AR}[$istr++]); ### move forward one string } } ### If we reached this point, pos is at or past end; zoom to EOF: return $self->_setpos_to_eof; } #------------------------------ # # _seek_cur POS # # Instance method, private. # Seek to $pos relative to current position. # sub _seek_cur { my ($self, $pos) = @_; $self->_seek_set($self->tell + $pos); } #------------------------------ # # _seek_end POS # # Instance method, private. # Seek to $pos relative to end. # We actually seek relative to beginning, which is simple. # sub _seek_end { my ($self, $pos) = @_; $self->_seek_set($self->_tell_eof + $pos); } #------------------------------ =item tell I Return the current position in the stream, as a numeric offset. =cut sub tell { my $self = shift; my $off = 0; my ($s, $str_s); for ($s = 0; $s < *$self->{Str}; $s++) { ### count all "whole" scalars defined($str_s = *$self->{AR}[$s]) or $str_s = ''; ###print STDERR "COUNTING STRING $s (". length($str_s) . ")\n"; $off += length($str_s); } ###print STDERR "COUNTING POS ($self->{Pos})\n"; return ($off += *$self->{Pos}); ### plus the final, partial one } #------------------------------ # # _tell_eof # # Instance method, private. # Get position of EOF, as a numeric offset. # This is identical to the size of the stream - 1. # sub _tell_eof { my $self = shift; my $len = 0; foreach (@{*$self->{AR}}) { $len += length($_) } $len; } #------------------------------ =item setpos POS I Seek to a given position in the array, using the opaque getpos() value. Don't expect this to be a number. =cut sub setpos { my ($self, $pos) = @_; (ref($pos) eq 'ARRAY') or die "setpos: only use a value returned by getpos!\n"; (*$self->{Str}, *$self->{Pos}) = @$pos; } #------------------------------ # # _setpos_to_eof # # Fast-forward to EOF. # sub _setpos_to_eof { my $self = shift; $self->setpos([scalar(@{*$self->{AR}}), 0]); } #------------------------------ =item getpos I Return the current position in the array, as an opaque value. Don't expect this to be a number. =cut sub getpos { [*{$_[0]}->{Str}, *{$_[0]}->{Pos}]; } #------------------------------ =item aref I Return a reference to the underlying array. =cut sub aref { *{shift()}->{AR}; } =back =cut #------------------------------ # Tied handle methods... #------------------------------ ### Conventional tiehandle interface: sub TIEHANDLE { (defined($_[1]) && UNIVERSAL::isa($_[1],"IO::ScalarArray")) ? $_[1] : shift->new(@_) } sub GETC { shift->getc(@_) } sub PRINT { shift->print(@_) } sub PRINTF { shift->print(sprintf(shift, @_)) } sub READ { shift->read(@_) } sub READLINE { wantarray ? shift->getlines(@_) : shift->getline(@_) } sub WRITE { shift->write(@_); } sub CLOSE { shift->close(@_); } sub SEEK { shift->seek(@_); } sub TELL { shift->tell(@_); } sub EOF { shift->eof(@_); } #------------------------------------------------------------ 1; __END__ # SOME PRIVATE NOTES: # # * The "current position" is the position before the next # character to be read/written. # # * Str gives the string index of the current position, 0-based # # * Pos gives the offset within AR[Str], 0-based. # # * Inital pos is [0,0]. After print("Hello"), it is [1,0]. =head1 WARNINGS Perl's TIEHANDLE spec was incomplete prior to 5.005_57; it was missing support for C, C, and C. Attempting to use these functions with an IO::ScalarArray will not work prior to 5.005_57. IO::ScalarArray will not have the relevant methods invoked; and even worse, this kind of bug can lie dormant for a while. If you turn warnings on (via C<$^W> or C), and you see something like this... attempt to seek on unopened filehandle ...then you are probably trying to use one of these functions on an IO::ScalarArray with an old Perl. The remedy is to simply use the OO version; e.g.: $AH->seek(0,0); ### GOOD: will work on any 5.005 seek($AH,0,0); ### WARNING: will only work on 5.005_57 and beyond =head1 VERSION $Id: ScalarArray.pm,v 1.7 2005/02/10 21:21:53 dfs Exp $ =head1 AUTHOR =head2 Primary Maintainer David F. Skoll (F). =head2 Principal author Eryq (F). President, ZeeGee Software Inc (F). =head2 Other contributors Thanks to the following individuals for their invaluable contributions (if I've forgotten or misspelled your name, please email me!): I for suggesting C. I for suggesting C. I for his offset-using read() and write() implementations. I for the IO::Handle inheritance and automatic tie-ing. =cut #------------------------------ 1;