2.7. table.pl - Table Processing Library
2.7.1. Purpose
This library provides routines for reading, processing and writing TBL files.
2.7.2. Interface
    require "table.pl";

    @TABLE_MODEL_MODEL = ...
    ($success, @records) = &TableFetch($file, $format);
    @records = &TableParse(@strings);
    @records = &TableParseUsingParams($ref_to_strings, %params);
    &TableValidate(*table, *rules);
    &TablePrint($strm, *table, %flags);
    @strings = &TableFormat(*table, %flags);
    @fields = &TableFields($format);
    %values = &TableRecSplit(*fields, $record);
    $record = &TableRecJoin(*fields, %values);
    $string = &TableRecFormat($format, %values);
    @result = &TableFilter(*table, $where, *var);
    @result = &TableDeleteFields(*table, @junk);
    @result = &TableSelectFields(*table, @new_fields);
    @result = &TableSort(*table, @by);
    %index = &TableIndex(*table, *duplicates, @by);
    %values = &TableLookup(*table, *index, @key_values);
    @flds = &TableFieldsCheck($format, $msg_type, %known);
2.7.3. Description
Tables are stored in arrays. The first element in the array is the input format specification. Remaining elements are data, one record per element.
The routines are often used together as follows:
    # Read in the table (using the default format)
    ($ok, @table) = &TableFetch($table_name);

    # Process the data records
    $format = shift @table;
    @flds = &TableFields($format);
    for $rec (@table) {
        %value = &TableRecSplit(*flds, $rec);
        $value{'Age'}++;    # say ...
        $rec = &TableRecJoin(*flds, %value);
    }
    unshift(@table, $format);

    # Output the new table (using the default flags)
    &TablePrint(STDOUT, *table);
Note: Multi-line fields are stored with a newline as the first character, so be sure to allow for this when processing them.
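The note on multi-line fields can be illustrated with a small sketch. It does not use table.pl itself: strip_multiline_marker is a hypothetical helper, and the %value hash simply stands in for what TableRecSplit might return. It is written in Perl 5 style rather than the library's own calling style.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of table.pl): removes the leading newline
# that marks a multi-line field and reports whether one was present.
sub strip_multiline_marker {
    my ($text) = @_;
    my $was_multiline = ($text =~ s/^\n//) ? 1 : 0;
    return ($text, $was_multiline);
}

# Illustrative values only, standing in for the result of TableRecSplit.
my %value = (
    'Name'  => 'Ann',
    'Notes' => "\nfirst line\nsecond line",   # multi-line: leading newline
);

for my $field (sort keys %value) {
    my ($text, $multi) = strip_multiline_marker($value{$field});
    my @lines = split(/\n/, $text);
    print "$field: ", scalar(@lines), " line(s)",
          ($multi ? " (multi-line)" : ""), "\n";
}
```

If the record is to be written back with TableRecJoin, the leading newline would presumably need to be restored first.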
TABLE_MODEL_MODEL is the model for model files.
TableFetch reads file as a table defined in TBL format. If the first data line of the file is not an input format specification, the specification can be supplied using format. success is 1 if the file is opened successfully. records is an array of records, the first of which is the format specification.
TableParse converts a list of strings into a table.
TableParseUsingParams converts a list of strings into a table using the nominated parameters. No parameters are supported yet.
TableValidate validates @table against @rules.
TablePrint outputs @table to strm. The flags supported are outlined below.
| Flag        | Description                                                            |
| TBL format: |                                                                        |
| behead      | column headings are not included at the top of the output              |
| delimited   | use delimited format - delimiter is the argument (default is tab)      |
TableFormat formats @table using flags and returns a set of strings. See TablePrint for a list of the flags supported.
TableFields returns the list of fields in format. Behaviour for custom formats is currently undefined.
TableRecSplit converts a record into a set of name-value pairs using a set of fields (typically returned from TableFields).
TableRecJoin converts a set of name-value pairs into a record using a set of fields (typically returned from TableFields).
TableRecFormat formats a set of name-value pairs into a string using a format string. Behaviour for custom formats is currently undefined.
TableFilter filters a table using an expression.
TableDeleteFields deletes a list of fields from a table.
TableSelectFields selects a list of fields from a table.
TableSort sorts a table by one or more fields. The fields to use are passed in by. If no fields are specified, all fields are used in the order they appear in the table.
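The field-selection rule for sorting can be sketched as follows. This is not table.pl's code: sort_records is a hypothetical stand-in that works on records already split into name-value pairs, and it assumes values are compared as strings, which the description above does not state.

```perl
use strict;
use warnings;

# Hypothetical stand-in (not table.pl's code): sorts hash-based records by
# the given fields, falling back to all fields in table order.
# Assumption: values are compared as strings.
sub sort_records {
    my ($fields, $records, @by) = @_;
    @by = @$fields unless @by;          # no fields given: use them all
    return sort {
        my $cmp = 0;
        for my $f (@by) {
            $cmp = $a->{$f} cmp $b->{$f};
            last if $cmp;               # first differing field decides
        }
        $cmp;
    } @$records;
}

my @fields  = ('Name', 'Dept');
my @records = (
    { Name => 'Bob', Dept => 'Dev' },
    { Name => 'Ann', Dept => 'Ops' },
    { Name => 'Ann', Dept => 'Dev' },
);

# No "by" fields: sort on Name, then Dept.
my @sorted = sort_records(\@fields, \@records);
print join(", ", map { join("/", $_->{Name}, $_->{Dept}) } @sorted), "\n";
```

With the sample data this prints Ann/Dev, Ann/Ops, Bob/Dev.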
TableIndex indexes a table by one or more fields. The fields to use are passed in by. If no fields are specified, all fields are used in the order they appear in the table. index is an associative array where:
- the key is the value of the by fields
- the data is the index in table of the matching record
For multiple-field keys, values are separated by a null character (\000). The index of the first data record is 1 (the field specification record has an index of 0). @duplicates is the list of indices which do not appear in %index. If duplicate keys are found, the highest index is stored in %index for each key.
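The indexing rules above (null-separated keys, data indices starting at 1, highest index kept for duplicate keys) can be sketched in a few lines. This is an illustration only, not table.pl's implementation: build_index is a hypothetical name, and records are assumed to be already split into name-value pairs rather than stored in TBL record form.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the behaviour described above; not table.pl's code.
# $rows->[0] stands for the format record; data records start at index 1.
sub build_index {
    my ($rows, @by) = @_;
    my (%index, @duplicates);
    for my $i (1 .. $#$rows) {
        # Multiple-field keys are joined with a null character.
        my $key = join("\000", map { $rows->[$i]{$_} } @by);
        # A later record with the same key displaces the earlier one,
        # whose index is then recorded as a duplicate.
        push(@duplicates, $index{$key}) if exists $index{$key};
        $index{$key} = $i;
    }
    return (\%index, \@duplicates);
}

my @rows = (
    undef,                                          # index 0: format record
    { Name => 'Ann', Dept => 'Ops', Age => 30 },
    { Name => 'Bob', Dept => 'Dev', Age => 41 },
    { Name => 'Ann', Dept => 'Ops', Age => 31 },    # same key as index 1
);

my ($index, $duplicates) = build_index(\@rows, 'Name', 'Dept');
my $key = join("\000", 'Ann', 'Ops');
print "Ann/Ops -> $index->{$key}\n";
print "duplicates: @$duplicates\n";
```

Here the key for Ann/Ops maps to index 3, and index 1 ends up in the duplicates list because it was displaced.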
TableLookup returns the name-value pairs for a given key. @table is the data table. %index is an index created using TableIndex. An empty associative array is returned if no matching record is found.
TableFieldsCheck is a wrapper around TableFields which checks that the fields contain no duplicates. If known is defined, its keys are used to find unknown fields, if any. Any problems found are reported using AppMsg. msg_type can be used to control the message type; error is the default.
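The two checks described for TableFieldsCheck can be sketched as below. The sketch is an assumption-laden illustration: check_fields is a hypothetical name, it takes a field list directly instead of a format string, and it returns the problem fields rather than reporting them through AppMsg.

```perl
use strict;
use warnings;

# Hypothetical stand-in (not table.pl's code): finds duplicate field names
# and, when a set of known names is supplied, names that are not in it.
sub check_fields {
    my ($fields, $known) = @_;
    my (%seen, @duplicates, @unknown);
    for my $f (@$fields) {
        my $count = ++$seen{$f};
        # Record each duplicated name once, on its second appearance.
        push(@duplicates, $f) if $count == 2;
        # Only check against %$known when it was supplied.
        push(@unknown, $f)
            if $count == 1 && defined($known) && !exists $known->{$f};
    }
    return (\@duplicates, \@unknown);
}

my @fields = ('Name', 'Age', 'Name', 'Colour');
my %known  = ('Name' => 1, 'Age' => 1);

my ($dups, $unknown) = check_fields(\@fields, \%known);
print "duplicate fields: @$dups\n";
print "unknown fields: @$unknown\n";
```

For the sample list this reports Name as a duplicate and Colour as unknown.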
2.7.4. Limitations and future directions
When validating field-names, the line number and context should be set to something meaningful. To achieve this, the line number of the format string in the file (if it's in the file, that is!) needs to be saved as part of the table.