_XML_ Library
=============

Files: _xml.ss_
Signature: _xml^_

Basic XML Data Types
====================

Document:
  This structure represents an XML document. The only useful part is
  the document-element, which contains all the content. The rest of
  the structure contains DTD information, which isn't supported, and
  processing-instructions.

Element:
  Each pair of start/end tags and everything in between is an element.
  It has the following pieces:
    a name
    attributes
    contents including sub-elements

Xexpr:
  S-expression representations of XML data.

The end of this document has more details.

Exceptions
==========

> (define-struct (exn:invalid-xexpr exn) (code))
  Raised by validate-xexpr when passed an invalid Xexpr. The code
  field contains the invalid part of the Xexpr.

Functions
=========

> read-xml : [Input-port] -> Document
  reads in an XML document from the given or current input port.
  XML documents contain exactly one element. It throws an
  xml-read:error if there isn't any element or if there is more than
  one element.

  Malformed xml is reported with source locations in the form
  `l.c/o', where l, c, and o are the line number, column number, and
  next port position, respectively, as returned by port-next-location.

  Any non-characters other than eof read from the input-port will
  appear in the document content. Such special values may only appear
  where XML content may. See make-input-port for information about
  creating ports that return non-character values.

> read-xml/element : [Input-port] -> Element
  reads an XML element from the port. The next non-whitespace
  character read must start an XML element. The input-port may
  contain other data after the element.

> syntax:read-xml : [Input-port] -> Syntax
  reads in an XML document and produces a syntax object version of an
  xexpression.

> syntax:read-xml/element : [Input-port] -> Syntax
  is just like read-xml/element except it produces a syntax version
  of an xexpression.

> write-xml : Document [Output-port] -> Void
  writes a document to the given or current output port, currently
  ignoring everything except the document's root element.

> write-xml/content : Content [Output-port] -> Void
  writes a document's contents to the given or current output port.

> display-xml : Document [Output-port] -> Void
  just like write-xml, but newlines and indentation make the output
  more readable, though less technically correct when white space is
  significant.

> display-xml/content : Content [Output-port] -> Void
  just like write-xml/content, but with indentation and newlines.

> xml->xexpr : Content -> Xexpr
  converts the interesting part of an XML document into an
  Xexpression.

> xexpr->xml : Xexpr -> Content
  converts an Xexpression into the interesting part of an XML
  document.

> xexpr->string : Xexpr -> String
  converts an Xexpression into a string representation.

> eliminate-whitespace : (listof Symbol) (Bool -> Bool) -> Element -> Element
  Some elements should not contain any text, only other tags, except
  they often contain whitespace for formatting purposes. Given a list
  of tag names and the identity function, eliminate-whitespace
  produces a function that filters out pcdata consisting solely of
  whitespace from those elements and raises an error if any
  non-whitespace text appears. Passing in the function called "not"
  instead of the identity function filters all elements which are not
  named in the list. Using void filters all elements regardless of
  the list.

> xexpr? : any -> Boolean
  Is the given thing an Xexpr?

> validate-xexpr : any -> #t
  If the given thing is an Xexpr, produce true.
  Otherwise, raise _exn:invalid-xexpr_, with the message set to
  "Expected something, given something-else", where "something" is
  what it expected and "something-else" is what it was really given,
  and with the code set to the part of the non-Xexpr that caused the
  exception.

> correct-xexpr? : any (-> a) (exn -> a) -> a
  If the given thing is an Xexpr, produce an a. Otherwise call the
  second function with an exn:invalid-xexpr. This second function may
  inspect this structure and decide to return a "correct" value. This
  is a method of extending the definition of an Xexpr and is used by
  the web-server's Xexpr/callbacks. (See for an example.)

Parameters
==========

> empty-tag-shorthand : 'always | 'never | (listof Symbol)
  Default: 'always
  This determines if the output functions should use the <empty/> tag
  notation instead of writing <empty></empty>. If the argument is
  'always, the abbreviated notation is always used, and if the
  argument is 'never, the open/close pair is always generated. If a
  list of symbols is provided, tags with names in this list will be
  abbreviated. The first form is the preferred XML notation. However,
  most browsers designed for HTML will only properly render XHTML if
  the document uses a mixture of the two formats. _html-empty-tags_
  contains the W3 consortium's recommended list of XHTML tags that
  should use the shorthand.

> collapse-whitespace : Bool
  Default: #f
  All consecutive whitespace is replaced by a single space.
  CDATA sections are not affected.

> trim-whitespace : Bool
  This parameter no longer exists. Consider using collapse-whitespace
  and eliminate-whitespace instead.

> read-comments : Bool
  Default: #f
  Comments, by definition, should be ignored by programs. However,
  interoperating with ad hoc extensions to other languages sometimes
  requires processing comments anyway.

> xexpr-drop-empty-attributes : Bool
  Default: #f
  It's easier to write functions processing Xexpressions if they
  always have a list of attributes.
  On the other hand, it's less cumbersome to write Xexpressions by
  hand without empty lists of attributes everywhere. Normally
  xml->xexpr leaves in empty attribute lists. Setting this parameter
  to #t drops them, so further editing the Xexpression by hand is
  less annoying.

Examples
========

Reading an Xexpression:
  (xml->xexpr (document-element (read-xml input-port)))

Writing an Xexpression:
  (empty-tag-shorthand html-empty-tags)
  (write-xml/content (xexpr->xml `(html (head (title ,banner))
                                        (body ((bgcolor "white"))
                                              ,text)))
                     output-port)

What this Library Doesn't Provide
=================================

  Document Type Declaration (DTD) processing
  Validation
  Expanding user-defined entities
  Reading user-defined entities in attributes
  Unicode support

XML Datatype Details
====================

Note: Users of the XML collection don't need to know most of these
  definitions.

Note: Xexpr is the only important one to understand. Even then,
  Processing-instructions may be ignored.

> Xexpr = String
        | (cons Symbol (cons (listof (list Symbol String)) (listof Xexpr)))
        | (cons Symbol (listof Xexpr)) ;; an element with no attributes
        | Symbol ;; symbolic entities such as &nbsp;
        | Number ;; numeric entities like &#20;
        | Cdata
        | Misc

> Document = (make-document Prolog Element (listof Processing-instruction))
  (define-struct document (prolog element misc))

> Prolog = (make-prolog (listof Misc) Document-type [Misc ...])
  (define-struct prolog (misc dtd misc2))
  The last field is a (listof Misc), but the maker accepts optional
  arguments instead for backwards compatibility.

> Document-type = #f | (make-document-type Symbol External-dtd #f)
  (define-struct document-type (name external inlined))

> External-dtd = (make-external-dtd/public str str)
               | (make-external-dtd/system str)
               | #f
  (define-struct external-dtd (system))
  (define-struct (external-dtd/public external-dtd) (public))
  (define-struct (external-dtd/system external-dtd) ())

> Element = (make-element Location Location Symbol (listof Attribute) (listof Content))
  (define-struct (element struct:source) (name attributes content))

> Attribute = (make-attribute Location Location Symbol String)
  (define-struct (attribute struct:source) (name value))

> Content = Pcdata
          | Element
          | Entity
          | Misc

  Misc = Comment
       | Processing-instruction

> Pcdata = (make-pcdata Location Location String)
  (define-struct (pcdata struct:source) (string))

> Cdata = (make-cdata Location Location String)
  (define-struct (cdata struct:source) (string))
  Note: The string of a cdata structure is assumed to be of the form
  "<![CDATA[content]]>" with proper quoting. If this is an incorrect
  assumption, this library will generate invalid XML.

> Entity = (make-entity Location Location (U Nat Symbol))
  (define-struct (entity struct:source) (text))

> Processing-instruction = (make-pi Location Location String String)
  (define-struct (pi struct:source) (target-name instruction))

> Comment = (make-comment String)
  (define-struct comment (text))

  Source = (make-source Location Location)
  (define-struct source (start stop))

  Location = (make-location Nat Nat Nat) | Symbol
  (define-struct location (line char offset))

  Note: read-xml records location structures, while xexpr->xml
  inserts a symbol. Other functions that must fabricate XML Locations
  without prior source location should use a sensible "comment"
  symbol.

The PList Library
=================

Files: _plist.ss_

The PList library provides the ability to read and write xml documents
which conform to the "plist" DTD, used to store 'dictionaries' of
string - value associations.
This format is typically used by Mac OS X --- the operating system
and its applications --- to store all kinds of data.

To Load
=======

(require (lib "plist.ss" "xml"))

Functions
=========

> read-plist : Port -> PLDict
  reads a plist from a port, and produces a 'dict' x-expression.

> write-plist : PLDict Port -> Void
  writes a plist to the given port. May raise the
  exn:application:type exception if the plist is badly formed.

Datatypes
=========

NB: all of these are subtypes of x-expression:

> PLDict = (list 'dict PLAssoc-pair ...)

> PLAssoc-pair = (list 'assoc-pair String PLValue)

> PLValue = String
          | (list 'true)
          | (list 'false)
          | (list 'integer Integer)
          | (list 'real Real)
          | PLDict
          | PLArray

> PLArray = (list 'array PLValue ...)

In fact, the PList DTD also defines Data and Date types, but we're
ignoring these for the moment.

Examples
========

Here's a sample PLDict:

(define my-dict
  `(dict (assoc-pair "first-key"
                     "just a string with some whitespace in it")
         (assoc-pair "second-key" (false))
         (assoc-pair "third-key" (dict))
         (assoc-pair "fourth-key"
                     (dict (assoc-pair "inner-key" (real 3.432))))
         (assoc-pair "fifth-key"
                     (array (integer 14) "another string" (true)))
         (assoc-pair "sixth-key" (array))))

Let's write it to disk:

(call-with-output-file "/Users/clements/tmp.plist"
  (lambda (port)
    (write-plist my-dict port))
  'truncate)

Let's read it back from the disk:

(define new-dict
  (call-with-input-file "/Users/clements/tmp.plist"
    (lambda (port)
      (read-plist port))))

Here's what the dictionary in that (hand-formatted) text file looks
like:

<dict>
  <key>first-key</key>
  <string>just a string with some whitespace in it</string>
  <key>second-key</key>
  <false/>
  <key>third-key</key>
  <dict/>
  <key>fourth-key</key>
  <dict>
    <key>inner-key</key>
    <real>3.432</real>
  </dict>
  <key>fifth-key</key>
  <array>
    <integer>14</integer>
    <string>another string</string>
    <true/>
  </array>
  <key>sixth-key</key>
  <array/>
</dict>
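
Because a PLDict is an ordinary x-expression, a value can be pulled
out of one with plain list operations. The sketch below assumes only
the PLDict and PLAssoc-pair shapes given under Datatypes. The names
pldict-ref and sample-dict are hypothetical and used for illustration
only; plist.ss does not provide them.

```scheme
;; pldict-ref is a hypothetical helper for illustration; it is not
;; part of plist.ss.  It returns the PLValue stored under key, or #f
;; when the key is absent.
(define (pldict-ref dict key)
  (let loop ((pairs (cdr dict)))              ; drop the leading 'dict tag
    (cond ((null? pairs) #f)                  ; key not present
          ((equal? (cadr (car pairs)) key)    ; pair is (assoc-pair key value)
           (caddr (car pairs)))
          (else (loop (cdr pairs))))))

;; A small PLDict in the same shape as the sample above.
(define sample-dict
  '(dict (assoc-pair "second-key" (false))
         (assoc-pair "fourth-key"
                     (dict (assoc-pair "inner-key" (real 3.432))))))

(pldict-ref sample-dict "second-key")     ; the PLValue (false)
(pldict-ref (pldict-ref sample-dict "fourth-key")
            "inner-key")                  ; the PLValue (real 3.432)
(pldict-ref sample-dict "no-such-key")    ; #f
```

Returning #f for a missing key is unambiguous here because #f is not
itself a PLValue; false is represented as (list 'false).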