== Calling Graphite2 ==

=== Introduction ===

The basic model for running graphite is to pass text, font and face information to create a segment. A segment consists of a linked list of slots, each of which corresponds to an output glyph. In addition, a segment holds charinfo for each character in the input text.

[source, c]
----
include::../tests/examples/simple.c[]
----
<1> The first parameter to the program is the full path to the font file to be used for rendering. This function loads the font and reads all the graphite tables, etc. If there is a fault in the font, it will fail to load and the function will return NULL.
<2> A font is merely a face at a given size in pixels per em. It is possible to support hinted advances, but this is done via a callback function.
<3> To simplify memory allocation, graphite works on characters (Unicode codepoints) rather than bytes or gr_uint16s, etc. We need to calculate the number of characters in the input string (the second parameter to the program). Very often applications already know this. If there is an error in the UTF-8, the pError variable will point to it and we just exit. But it is possible to render up to that point.
+
If your string is null terminated, then you don't necessarily have to calculate a precise number of characters. You can use a value that is greater than the number in the string and rely on graphite to stop at the terminating null. It is necessary to pass some value for the number of characters so that graphite can initialise its internal memory structures appropriately and not waste time updating them. Thus for UTF-16 and UTF-32 strings, one could simply pass the number of code units in the string. For UTF-8 it may be preferable to call gr_count_unicode_characters.
<4> Here we create a segment. A segment is the result of processing a string of text with graphite.
It contains all the information necessary for final rendering, including all the glyphs, their positions, the relationships between glyphs and underlying characters, etc.
<5> A segment primarily consists of a linked list of slots. Each slot corresponds to a glyph in the output. The information about a glyph and its relationships is queried from the slot.

Source for this program may be found in tests/examples/simple.c

Assuming that graphite2 has been built and installed, this example can be built and run on Linux using:

----
gcc -o simple simple.c -lgraphite2
LD_LIBRARY_PATH=/usr/local/lib ./simple ../fonts/Padauk.ttf 'Hello World!'
----

Running simple gives the results:

----
43(0.000000,0.000000)
72(9.859375,0.000000)
79(17.609375,0.000000)
79(20.796875,0.000000)
82(23.984375,0.000000)
3(32.203125,0.000000)
58(38.109375,0.000000)
82(51.625000,0.000000)
85(59.843750,0.000000)
79(64.875000,0.000000)
71(68.062500,0.000000)
4(76.281250,0.000000)
----

Not very pretty, but reassuring! Graphite isn't a graphical rendering engine. It merely calculates which glyphs should render where and leaves the actual process of displaying those glyphs to other libraries.

This example is pretty simple and uses a convenient way to load fonts into Graphite for testing purposes. But when integrating into real applications, that is rarely the most appropriate way. Instead, the necessary font information comes to Graphite via some other data structure with its own accessor functions. In the following example we show the same application, but using a FreeType font rather than a font file.

[source, c]
----
include::../tests/examples/freetype.c[]
----
<1> We cast the pointer to remove its const restriction. When the memory was allocated, it was passed to Graphite as a read-only memory block (via const), so it is passed back to us as a read-only memory block. But we are the owner of the block and so can do what we like with it, such as freeing it. So we are free to break the const restriction here.
<2> An operations structure first holds the size of the structure and then the pointers to the functions. Storing the size allows an older application to call a newer version of the Graphite engine which might support more function pointers. In such a case, all those newer pointers are assumed to be NULL.
<3> Pass the function pointers structure for creating the font face. The first function is called to load a table from the font and pass it back to Graphite. The second is called by Graphite to say that it no longer needs the table loaded in memory and will make no further reference to it.
<4> Pass a function pointers structure for fonts. The two functions (either can be NULL) return the horizontal or vertical advance for a glyph in pixels. Notice that fractional advances are usually preferable to grid fit advances, unless working entirely in a low resolution graphical framework.
+
The code following is virtually identical to the fileface code, apart from some housekeeping at the end.
<5> Note that FreeType works with 26.6 fixed point arithmetic, thus 1.0 is stored as 64. We therefore multiply the pointsize by 64 (or shift left by 6) and divide the resulting positions by 64 to get true floating point values.

Building and running this example gives similar, but not identical, results to the simple case. Notice that since the advances are grid fit, all the positions are integral, unlike the fractional positioning in the simple application:

----
43(0.000000,0.000000)
72(9.000000,0.000000)
79(17.000000,0.000000)
79(20.000000,0.000000)
82(23.000000,0.000000)
3(31.000000,0.000000)
58(37.000000,0.000000)
82(51.000000,0.000000)
85(59.000000,0.000000)
79(64.000000,0.000000)
71(67.000000,0.000000)
4(75.000000,0.000000)
----

=== Slots ===

The primary content of a segment is its slots. These slots are organised into a doubly linked list and each corresponds to a glyph to be rendered. The linked list is terminated at each end by a NULL.
There are also functions to get the first and last slot in a segment.

In addition to the main slot list, slots may be attached to each other. This means that two glyphs have been attached to each other in the GDL. Attached slots are held in a separate singly linked list associated with the slot to which they attach. Thus slots will be in the main linked list and may also be in an attachment linked list. Each slot in an attachment linked list has the same attachment parent, accessed via `gr_slot_attached_to()`. To get the start of the linked list of all the slots directly attached to a parent, one calls `gr_slot_first_attachment()` and then `gr_slot_next_attachment()` to walk forwards through that linked list. Given that a diacritic may attach to another diacritic, an attached slot may in its turn have a linked list of attached slots. In all cases, linked lists terminate with a NULL.

image::glyph_string.png[]

The core information held by a slot is the glyph id of the glyph the slot corresponds to (`gr_slot_gid()`); the position, relative to the start of the segment, at which the glyph is to be rendered (`gr_slot_origin_X()` and `gr_slot_origin_Y()`); and the advance for the glyph, which corresponds to the glyph metric advance as adjusted by kerning. In addition, a slot indicates whether the font designer wants to allow a cursor to be placed before this glyph or not. This information is accessible via `gr_slot_can_insert_before()`.

=== CharInfo ===

For each Unicode character in the input, there is a CharInfo structure that can be queried for such information as the code unit position in the input string, the before slot index (if we are before this character, the earliest slot we are before) and the corresponding after slot index.

=== Face ===

The `gr_face` type is the in-memory counterpart of a font. It holds the data structures corresponding to those in a font file as required to process text using that font.
In creating a `gr_face` it is necessary to pass a function by which graphite can get hold of font tables. The tables that graphite queries for must be available for the lifetime of the `gr_face`, except when a `gr_face` is created with the faceOptions of `gr_face_preloadAll`. This option loads everything from the font at `gr_face` construction time, leaving nothing further to be read from the font when the `gr_face` is used. This reduces the required lifetime of the in-memory font tables to the duration of the `gr_make_face` call. In situations where the tables are only stored for the purposes of creating a `gr_face`, it can save memory to preload everything and delete the tables.

=== Caching ===

Graphite2 has the capability to make use of a subsegmental cache. This chops each run of characters at a word break, as defined by the linebreaking pass. Each sub run is then looked up in the cache rather than calculating the values from scratch. The cache is most effective when similar runs of text are processed. For raw benchmark testing against wordlists, the cache can be slightly slower than uncached processing. But most people use real text in their documents and that has a much higher level of redundancy.

To use the cache, one simply creates a cached face, specifying the size of the cache in elements. A cache size of 5,000 to 10,000 has produced a good compromise between time and space.

In the example above, point 1 becomes:

----
gr_face *face = make_file_face_with_seg_cache(argv[1]);
----

=== Clustering ===

It is common for applications to work with simplified clusters: sequences of glyphs associated with a sequence of characters, such that these simplified clusters are as small as possible and are never reordered or split in relation to each other. In addition, a cursor may occur between simplified clusters.
The following code gives an example algorithm for calculating such clusters:

[source, c]
----
include::../tests/examples/cluster.c[]
----
<1> Create a segment as per the example in the introduction.
<2> If this slot starts before the start of this cluster, then merge this cluster with the previous one and try again until this slot is within the current cluster.
<3> If this slot starts after the end of the current cluster, then create a new cluster for it. You can't start a new cluster with a glyph that cannot take a cursor position before it. Also don't make a new cluster the first time around (i.e. at the start of the string).
<4> If this slot ends after the end of this cluster, then extend this cluster to include it.
<5> Output a line break between each cluster.

=== Line Breaking and Justification ===

Whilst most applications will convert glyphs and positions out of the gr_slot structure into some internal structure, if graphite is to be used for justification, then it is necessary to line break the text and justify it within graphite's data structures. Graphite provides two functions to help with this. The first is gr_slot_linebreak_before(), which chops the slot linked list before a given slot. The application needs to keep track of the start of each of the subsequent linked lists itself, since graphite does not do that. After line breaking, the application may call gr_seg_justify() on each line linked list.

The following example shows how this might be done in an application. Notice that this example does not take into account whitespace hanging outside the right margin.

[source, c]
----
include::../tests/examples/linebreak.c[]
----
<1> Create a segment as per the example in the introduction.
<2> Create an area to store line starts. There won't be more line starts than characters in the text. The first line starts at the start of the segment.
<3> A simplistic approach would scan forwards using gr_slot_next_sibling_attachment, thus walking the bases.
The bases are guaranteed to be ordered graphically, and so we can chop when we pass the line end. But in some cases, particularly Arabic, fonts are implemented with one base per word and all the other base characters are attached in a chain from that. So we need to walk the segment by slot, considering attached slots as well. This is not a problem, since attached slots are not going to have a wildly different position from their base and, if one leaks over the end of the line, the breakweight considerations will get us back to a good base.
<4> Scan through the slots. If we are past the end of the line, then find somewhere to chop.
<5> We use 15 (word break) as an appropriate break value.
<6> Scan backwards for a valid linebreak location.
<7> Break the line here.
<8> Update the line width for the new line based on the start of the new line.
<9> Justify each line to be width wide, and tell it to skip final whitespace (as if that whitespace were outside the width).
<10> Each line is a complete linked list that we can iterate over. We can no longer iterate over the whole segment. We have to do it line by line now.

=== Bidi ===

Bidirectional processing is complex, not so much because of the algorithms involved, but because applications tend to address bidi text processing differently. Some try to do everything themselves, inverting the text order, etc., while others do nothing, expecting the shaper to resolve all the ordering. In addition, there is the question of mirroring characters and where that is done. Graphite2 adds the complexity that it tries to enable extensions to the bidi algorithm by giving PUA characters directionality. To facilitate all these different ways of working, Graphite2 uses the `rtl` attribute to pass various bits to control bidi processing within the Graphite engine.

[width="100%",cols="^2,^3,<15",options="header"]
|=======================================================
| gr_nobidi | gr_nomirror | Description
| 0         | 0           | Runs the bidi algorithm and does all mirroring.
| 0         | 1           | Runs the bidi algorithm and mirrors those chars that don't have char replacements. It also un/remirrors anything that ends up in the opposite direction to the stated text direction on input.
| 1         | 0           | Doesn't run the bidi algorithm but does do mirroring of all characters if direction is rtl.
| 1         | 1           | Doesn't run the bidi algorithm and only mirrors those glyphs for which there is no corresponding mirroring character.
|=======================================================