Instances of PopplerStructureElement are used to describe the structure
of a PopplerDocument. To access the elements in the structure of the
document, use poppler_structure_element_iter_new()
to obtain an iterator
for the top-level PopplerStructure, and then use the
PopplerStructureElementIter methods to traverse the structure tree.
PopplerStructureElementIter *
poppler_structure_element_iter_new (PopplerDocument *poppler_document
);
Returns the root PopplerStructureElementIter for document
, or NULL
. The
returned value must be freed with poppler_structure_element_iter_free()
.
Documents may have an associated structure tree (mostly Tagged-PDF compliant documents), which can be used to obtain information about the document structure and its contents. Each node in the tree contains a PopplerStructureElement.
Here is a simple example that walks the whole tree:
static void
walk_structure (PopplerStructureElementIter *iter)
{
  do
    {
      /* Get the element and do something with it */
      PopplerStructureElementIter *child = poppler_structure_element_iter_get_child (iter);
      if (child)
        walk_structure (child);
      poppler_structure_element_iter_free (child);
    }
  while (poppler_structure_element_iter_next (iter));
}
...
{
  iter = poppler_structure_element_iter_new (document);
  walk_structure (iter);
  poppler_structure_element_iter_free (iter);
}
a new PopplerStructureElementIter, or NULL
if document
doesn't have a structure tree.
[transfer full]
Since 0.26
gboolean
poppler_structure_element_iter_next (PopplerStructureElementIter *iter
);
Sets iter
to point to the next structure element at the current level
of the tree, if valid. See poppler_structure_element_iter_new()
for more
information.
Since 0.26
PopplerStructureElementIter *
poppler_structure_element_iter_copy (PopplerStructureElementIter *iter
);
Creates a new PopplerStructureElementIter as a copy of iter
. The
returned value must be freed with poppler_structure_element_iter_free()
.
Since 0.26
void
poppler_structure_element_iter_free (PopplerStructureElementIter *iter
);
Frees iter
.
Since 0.26
PopplerStructureElementIter *
poppler_structure_element_iter_get_child
(PopplerStructureElementIter *parent
);
Returns a new iterator to the children elements of the
PopplerStructureElement associated with parent
. The returned value must
be freed with poppler_structure_element_iter_free()
.
Since 0.26
PopplerStructureElement *
poppler_structure_element_iter_get_element
(PopplerStructureElementIter *iter
);
Returns the PopplerStructureElement associated with iter
.
Since 0.26
PopplerStructureElementKind
poppler_structure_element_get_kind (PopplerStructureElement *poppler_structure_element
);
Since 0.26
gint
poppler_structure_element_get_page (PopplerStructureElement *poppler_structure_element
);
Obtains the page number in which the element is contained.
Since 0.26
gboolean
poppler_structure_element_is_content (PopplerStructureElement *poppler_structure_element
);
Checks whether an element is actual document content.
Since 0.26
gboolean
poppler_structure_element_is_inline (PopplerStructureElement *poppler_structure_element
);
Checks whether an element is an inline element.
Since 0.26
gboolean
poppler_structure_element_is_block (PopplerStructureElement *poppler_structure_element
);
Checks whether an element is a block element.
Since 0.26
gboolean
poppler_structure_element_is_grouping (PopplerStructureElement *poppler_structure_element
);
Checks whether an element is a grouping element.
Since 0.26
gchar *
poppler_structure_element_get_id (PopplerStructureElement *poppler_structure_element
);
Obtains the identifier of an element.
Since 0.26
gchar *
poppler_structure_element_get_title (PopplerStructureElement *poppler_structure_element
);
Obtains the title of an element.
Since 0.26
gchar *
poppler_structure_element_get_abbreviation
(PopplerStructureElement *poppler_structure_element
);
gchar *
poppler_structure_element_get_language
(PopplerStructureElement *poppler_structure_element
);
Obtains the language and country code for the content in an element,
in two-letter ISO format, e.g. en_ES
, or NULL
if not
defined.
Since 0.26
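As a minimal sketch of how a caller might consume a code such as en_ES, the helper below splits it into its language and country parts. The name split_language_code is hypothetical and not part of poppler-glib; it only assumes the two-letter, underscore-separated shape described above, and it treats NULL as "not defined".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of poppler-glib: splits a code shaped
 * like "en_ES" into its language and country parts. Both output
 * buffers must hold at least 3 bytes. Returns 1 on success, or 0 if
 * the input is NULL or does not have the "xx_YY" shape. */
static int
split_language_code (const char *code, char language[3], char country[3])
{
  assert (language != NULL && country != NULL);

  if (code == NULL || strlen (code) != 5 || code[2] != '_')
    return 0;

  memcpy (language, code, 2);
  language[2] = '\0';
  memcpy (country, code + 3, 2);
  country[2] = '\0';
  return 1;
}
```

In a real program, the code argument would be the string returned by poppler_structure_element_get_language().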
gchar * poppler_structure_element_get_text (PopplerStructureElement *poppler_structure_element
,PopplerStructureGetTextFlags flags
);
Obtains the text enclosed by an element, or the text enclosed by the elements in the subtree (including the element itself).
poppler_structure_element | |
flags | A PopplerStructureGetTextFlags value, or |
Since 0.26
gchar *
poppler_structure_element_get_alt_text
(PopplerStructureElement *poppler_structure_element
);
Obtains the “alternate” text representation of the element (and its child elements). This is mostly used for non-text elements like images and figures, to specify a textual description of the element.
Note that for elements containing proper text, the function
poppler_structure_element_get_text()
must be used instead.
Since 0.26
gchar *
poppler_structure_element_get_actual_text
(PopplerStructureElement *poppler_structure_element
);
Obtains the actual text enclosed by the element (and its child elements). The actual text is mostly used for non-text elements like images and figures which do have the graphical appearance of text, like a logo. For those the actual text is the equivalent text to those graphical elements which look like text when rendered.
Note that for elements containing proper text, the function
poppler_structure_element_get_text()
must be used instead.
Since 0.26
PopplerTextSpan ** poppler_structure_element_get_text_spans (PopplerStructureElement *poppler_structure_element
,guint *n_text_spans
);
Obtains the text enclosed by an element, as an array of PopplerTextSpan structures. Each item in the array is a piece of text whose characters share the same attributes, together with those attributes. The following example shows how to obtain and free the text spans of an element:
guint i, n_spans;
PopplerTextSpan **text_spans =
  poppler_structure_element_get_text_spans (element, &n_spans);
/* Use the text spans */
for (i = 0; i < n_spans; i++)
  poppler_text_span_free (text_spans[i]);
g_free (text_spans);
poppler_structure_element | |
n_text_spans | A pointer to the location where the number of elements in the returned array will be stored. | [out]
An array of PopplerTextSpan elements.
[transfer full][array length=n_text_spans][element-type PopplerTextSpan]
Since 0.26
PopplerStructurePlacement
poppler_structure_element_get_placement
(PopplerStructureElement *poppler_structure_element
);
Obtains the placement type of the structure element.
Since 0.26
PopplerStructureWritingMode
poppler_structure_element_get_writing_mode
(PopplerStructureElement *poppler_structure_element
);
Obtains the writing mode (writing direction) of the content associated with a structure element.
Since 0.26
gboolean poppler_structure_element_get_background_color (PopplerStructureElement *poppler_structure_element
,PopplerColor *color
);
Obtains the background color of the element. If this attribute is not specified, the element shall be treated as if it were transparent.
Since 0.26
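The sketch below shows one way a retrieved color could be turned into a "#rrggbb" string, for instance when exporting structure information to HTML. Color16 is a stand-in type that assumes the color carries 16-bit red, green and blue channels; color_to_hex is a hypothetical helper and not part of poppler-glib.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for a color with 16-bit channels (an assumption made for
 * this sketch, so that it compiles without the poppler headers). */
typedef struct {
  unsigned short red, green, blue;
} Color16;

/* Hypothetical helper: writes "#rrggbb" into out, which must hold at
 * least 8 bytes. Only the high byte of each 16-bit channel is kept. */
static void
color_to_hex (const Color16 *color, char out[8])
{
  assert (color != NULL && out != NULL);

  snprintf (out, 8, "#%02x%02x%02x",
            (unsigned) (color->red >> 8),
            (unsigned) (color->green >> 8),
            (unsigned) (color->blue >> 8));
}
```

The conversion only makes sense for a color that the getter actually filled in, so a caller would normally check the getter's gboolean result before using the value.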
gboolean poppler_structure_element_get_border_color (PopplerStructureElement *poppler_structure_element
,PopplerColor *colors
);
Obtains the color of the border around the element. The result values are in before-after-start-end ordering (for the typical Western left-to-right writing, that is top-bottom-left-right). If this attribute is not specified, the border color for this element shall be the current text fill color in effect at the start of its associated content.
poppler_structure_element | |
colors | An array of four PopplerColor. | [out][array fixed-size=4][element-type PopplerColor]
Since 0.26
void poppler_structure_element_get_border_style (PopplerStructureElement *poppler_structure_element
,PopplerStructureBorderStyle *border_styles
);
Obtains the border style of a structure element. The result values are in before-after-start-end ordering. For example, using Western left-to-right writing, that is top-bottom-left-right.
poppler_structure_element | |
border_styles | An array of four PopplerStructureBorderStyle elements. | [out][array fixed-size=4][element-type PopplerStructureBorderStyle]
Since 0.26
gboolean poppler_structure_element_get_border_thickness (PopplerStructureElement *poppler_structure_element
,gdouble *border_thicknesses
);
Obtains the thickness of the border of an element. The result values are in before-after-start-end ordering (for the typical Western left-to-right writing, that is top-bottom-left-right). A value of 0 indicates that the border shall not be drawn.
poppler_structure_element | |
border_thicknesses | Array with the four values of border thicknesses. | [out][array fixed-size=4][element-type gdouble]
Since 0.26
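Because a thickness of 0 means the border is not drawn, a renderer may want to know how many sides need drawing at all. The following is a minimal sketch with a hypothetical helper, count_drawn_borders, which takes the four thickness values in before-after-start-end order; it is not part of poppler-glib.

```c
#include <assert.h>
#include <stddef.h>

/* Indices into the four-value array, in before-after-start-end order. */
enum { SIDE_BEFORE = 0, SIDE_AFTER = 1, SIDE_START = 2, SIDE_END = 3 };

/* Hypothetical helper: counts the sides whose border would be drawn,
 * treating any thickness that is not greater than 0 as "not drawn". */
static unsigned
count_drawn_borders (const double thicknesses[4])
{
  unsigned drawn = 0;
  int side;

  assert (thicknesses != NULL);

  for (side = SIDE_BEFORE; side <= SIDE_END; side++)
    if (thicknesses[side] > 0.0)
      drawn++;

  return drawn;
}
```

With the real API, the array would be the one filled in by poppler_structure_element_get_border_thickness().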
void poppler_structure_element_get_padding (PopplerStructureElement *poppler_structure_element
,gdouble *paddings
);
Obtains the padding of an element (space around it). The result values are in before-after-start-end ordering. For example using Western left-to-right writing, that is top-bottom-left-right.
poppler_structure_element | |
paddings | Padding for the four sides of the element. | [out][array fixed-size=4][element-type gdouble]
Since 0.26
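To make the before-after-start-end ordering concrete, the sketch below unpacks a four-value array into named sides for the Western left-to-right case described above. SidesLtr and sides_from_array_ltr are hypothetical names introduced for illustration; other writing modes would need a different mapping, which this sketch does not attempt.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct naming the four sides for left-to-right writing. */
typedef struct {
  double top, bottom, left, right;
} SidesLtr;

/* Maps before-after-start-end values to top-bottom-left-right.
 * Valid only for left-to-right, top-to-bottom writing. */
static SidesLtr
sides_from_array_ltr (const double values[4])
{
  SidesLtr sides;

  assert (values != NULL);

  sides.top    = values[0]; /* before */
  sides.bottom = values[1]; /* after  */
  sides.left   = values[2]; /* start  */
  sides.right  = values[3]; /* end    */
  return sides;
}
```

The same mapping applies to any of the four-value getters, for example the array filled in by poppler_structure_element_get_padding().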
gboolean poppler_structure_element_get_color (PopplerStructureElement *poppler_structure_element
,PopplerColor *color
);
Obtains the color of the content contained in the element. If this attribute is not specified, the color for this element shall be the current text fill color in effect at the start of its associated content.
Since 0.26
gdouble
poppler_structure_element_get_space_before
(PopplerStructureElement *poppler_structure_element
);
Obtains the amount of empty space before the block-level structure element.
Since 0.26
gdouble
poppler_structure_element_get_space_after
(PopplerStructureElement *poppler_structure_element
);
Obtains the amount of empty space after the block-level structure element.
Since 0.26
gdouble
poppler_structure_element_get_start_indent
(PopplerStructureElement *poppler_structure_element
);
Obtains the amount of indentation at the beginning of the block-level structure element.
Since 0.26
gdouble
poppler_structure_element_get_end_indent
(PopplerStructureElement *poppler_structure_element
);
Obtains the amount of indentation at the end of the block-level structure element.
Since 0.26
gdouble
poppler_structure_element_get_text_indent
(PopplerStructureElement *poppler_structure_element
);
Obtains the amount of indentation of the text contained in the block-level structure element.
Since 0.26
PopplerStructureTextAlign
poppler_structure_element_get_text_align
(PopplerStructureElement *poppler_structure_element
);
Obtains the text alignment mode of the text contained in a block-level structure element.
Since 0.26
gboolean poppler_structure_element_get_bounding_box (PopplerStructureElement *poppler_structure_element
,PopplerRectangle *bounding_box
);
Obtains the size of the bounding box of a block-level structure element.
Since 0.26
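A bounding box is usually consumed as a width and a height. The sketch below derives both from two corner points; Rect is a stand-in type that assumes the rectangle stores its corners as x1, y1, x2, y2, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in rectangle described by two opposite corners (an assumption
 * made so that the sketch compiles without the poppler headers). */
typedef struct {
  double x1, y1, x2, y2;
} Rect;

/* Width of the rectangle, regardless of the order of the corners. */
static double
rect_width (const Rect *rect)
{
  double width;

  assert (rect != NULL);
  width = rect->x2 - rect->x1;
  return width < 0.0 ? -width : width;
}

/* Height of the rectangle, regardless of the order of the corners. */
static double
rect_height (const Rect *rect)
{
  double height;

  assert (rect != NULL);
  height = rect->y2 - rect->y1;
  return height < 0.0 ? -height : height;
}
```

As with the color getters, the rectangle is only meaningful when poppler_structure_element_get_bounding_box() reports success through its gboolean result.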
gdouble
poppler_structure_element_get_width (PopplerStructureElement *poppler_structure_element
);
Obtains the width of the block-level structure element. Note that for elements which do not specify a width, it has to be calculated, and in this case -1 is returned.
A positive value if a width is defined, or -1 if the width is to be calculated automatically.
Since 0.26
gdouble
poppler_structure_element_get_height (PopplerStructureElement *poppler_structure_element
);
Obtains the height of the block-level structure element. Note that for elements which do not specify a height, it has to be calculated, and in this case -1 is returned.
A positive value if a height is defined, or -1 if the height is to be calculated automatically.
Since 0.26
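The width and height getters use -1 to mean "calculate automatically", so callers need a fallback. A minimal sketch follows, using a hypothetical helper named resolve_extent; how the fallback value is computed is left to the application and is not specified here.

```c
#include <assert.h>

/* Hypothetical helper: returns the reported extent when one is
 * defined, or the caller-computed fallback when the getter returned
 * the -1 sentinel (any negative value is treated the same way). */
static double
resolve_extent (double reported, double computed)
{
  assert (computed >= 0.0);

  return reported < 0.0 ? computed : reported;
}
```

For example, resolve_extent (poppler_structure_element_get_width (element), layout_width) would yield the explicit width when present and layout_width otherwise, where layout_width is a value the application computed itself.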
PopplerStructureBlockAlign
poppler_structure_element_get_block_align
(PopplerStructureElement *poppler_structure_element
);
Obtains the block-alignment mode of the block-level structure element.
Since 0.26
PopplerStructureInlineAlign
poppler_structure_element_get_inline_align
(PopplerStructureElement *poppler_structure_element
);
Obtains the inline-alignment mode of the block-level structure element.
Since 0.26
void poppler_structure_element_get_table_border_style (PopplerStructureElement *poppler_structure_element
,PopplerStructureBorderStyle *border_styles
);
Obtains the table cell border style of a block-level structure element. The result values are in before-after-start-end ordering. For example, using Western left-to-right writing, that is top-bottom-left-right.
poppler_structure_element | |
border_styles | An array of four PopplerStructureBorderStyle elements. | [out][array fixed-size=4][element-type PopplerStructureBorderStyle]
Since 0.26
void poppler_structure_element_get_table_padding (PopplerStructureElement *poppler_structure_element
,gdouble *paddings
);
Obtains the padding between the table cell’s content rectangle and the surrounding border of a block-level structure element. The result values are in before-after-start-end ordering (for the typical Western left-to-right writing, that is top-bottom-left-right).
poppler_structure_element | |
paddings | Padding for the four sides of the element. | [out][array fixed-size=4][element-type gdouble]
Since 0.26
gdouble
poppler_structure_element_get_baseline_shift
(PopplerStructureElement *poppler_structure_element
);
Obtains how much the text contained in the inline-level structure element should be shifted, measuring from the baseline of the glyphs.
Since 0.26
gdouble
poppler_structure_element_get_line_height
(PopplerStructureElement *poppler_structure_element
);
Obtains the line height for the text contained in the inline-level structure element. Note that for elements which do not specify a line height, it has to be calculated, and in this case -1 is returned.
A positive value if a line height is defined, or -1 if the height is to be calculated automatically.
Since 0.26
gboolean poppler_structure_element_get_text_decoration_color (PopplerStructureElement *poppler_structure_element
,PopplerColor *color
);
Obtains the color of the text decoration for the text contained in the inline-level structure element. If this attribute is not specified, the color for this element shall be the current fill color in effect at the start of its associated content.
Since 0.26
gdouble
poppler_structure_element_get_text_decoration_thickness
(PopplerStructureElement *poppler_structure_element
);
Obtains the thickness of the text decoration for the text contained in the inline-level structure element. If this attribute is not specified, it shall be derived from the current stroke thickness in effect at the start of the element’s associated content.
Since 0.26
PopplerStructureTextDecoration
poppler_structure_element_get_text_decoration_type
(PopplerStructureElement *poppler_structure_element
);
Obtains the text decoration type of the text contained in the inline-level structure element.
Since 0.26
PopplerStructureRubyAlign
poppler_structure_element_get_ruby_align
(PopplerStructureElement *poppler_structure_element
);
Obtains the alignment for the ruby text contained in an inline-level structure element.
Since 0.26
PopplerStructureRubyPosition
poppler_structure_element_get_ruby_position
(PopplerStructureElement *poppler_structure_element
);
Obtains the position for the ruby text contained in an inline-level structure element.
Since 0.26
PopplerStructureGlyphOrientation
poppler_structure_element_get_glyph_orientation
(PopplerStructureElement *poppler_structure_element
);
Obtains the glyph orientation for the text contained in an inline-level structure element.
Since 0.26
guint
poppler_structure_element_get_column_count
(PopplerStructureElement *poppler_structure_element
);
Obtains the number of columns used to lay out the content contained in the grouping element.
Since 0.26
gdouble * poppler_structure_element_get_column_gaps (PopplerStructureElement *poppler_structure_element
,guint *n_values
);
Obtains the size of the gaps in between adjacent columns. Returns an array of elements: the first one is the size of the gap in between columns 1 and 2, second is the size between columns 2 and 3, and so on.
For elements which use a single column, NULL
is returned and n_values
is set to zero.
If the attribute is undefined, NULL
is returned and n_values
is set
to a non-zero value.
The array with the results is allocated by the function. When it is
not needed anymore, be sure to call g_free()
on it.
Array containing the values for the column gaps, or NULL
if the
array is empty or the attribute is not defined.
[transfer full][array length=n_values][element-type gdouble]
Since 0.26
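The three possible outcomes described above (single column, undefined attribute, gaps available) can be told apart from the returned pointer and n_values alone. The sketch below encodes that decision; GapsStatus and classify_column_gaps are hypothetical names, not part of poppler-glib.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of the (array, n_values) pair. */
typedef enum {
  GAPS_SINGLE_COLUMN, /* NULL array, n_values == 0 */
  GAPS_UNDEFINED,     /* NULL array, n_values != 0 */
  GAPS_AVAILABLE      /* non-NULL array            */
} GapsStatus;

static GapsStatus
classify_column_gaps (const double *gaps, unsigned n_values)
{
  /* Assumption: a non-NULL array always holds at least one gap. */
  assert (gaps == NULL || n_values > 0);

  if (gaps != NULL)
    return GAPS_AVAILABLE;

  return n_values == 0 ? GAPS_SINGLE_COLUMN : GAPS_UNDEFINED;
}
```

Only in the GAPS_AVAILABLE case is there an array to read and later release with g_free().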
gdouble * poppler_structure_element_get_column_widths (PopplerStructureElement *poppler_structure_element
,guint *n_values
);
Obtains an array with the widths of the columns.
The array with the results is allocated by the function. When it is
not needed anymore, be sure to call g_free()
on it.
Array containing widths of the columns, or NULL
if the attribute
is not defined.
[transfer full][array length=n_values][element-type gdouble]
Since 0.26
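Column widths and column gaps together determine where each column starts. The following sketch computes those start offsets. It assumes one width per column and one gap between each pair of adjacent columns (n_columns - 1 gaps), which real documents may not always provide; column_offsets is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: fills offsets[i] with the distance from the
 * start of the first column to the start of column i.
 * Assumes widths has n_columns entries and gaps has n_columns - 1. */
static void
column_offsets (const double *widths,
                const double *gaps,
                unsigned      n_columns,
                double       *offsets)
{
  double position = 0.0;
  unsigned i;

  assert (widths != NULL && offsets != NULL);
  assert (n_columns <= 1 || gaps != NULL);

  for (i = 0; i < n_columns; i++)
    {
      offsets[i] = position;
      position += widths[i];
      if (i + 1 < n_columns)
        position += gaps[i]; /* gap between column i and column i + 1 */
    }
}
```

With the real API, widths and gaps would come from poppler_structure_element_get_column_widths() and poppler_structure_element_get_column_gaps(), and both arrays would be released with g_free() afterwards.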
PopplerStructureListNumbering
poppler_structure_element_get_list_numbering
(PopplerStructureElement *poppler_structure_element
);
Obtains the list numbering style for list items.
Since 0.26
PopplerStructureFormRole
poppler_structure_element_get_form_role
(PopplerStructureElement *poppler_structure_element
);
Obtains the role of a form structure element that is part of a form, or is a form field. This hints how the control for the element is intended to be rendered.
Since 0.26
PopplerStructureFormState
poppler_structure_element_get_form_state
(PopplerStructureElement *poppler_structure_element
);
For a structure element that is a form field, obtains in which state the associated control is expected to be rendered.
Since 0.26
gchar *
poppler_structure_element_get_form_description
(PopplerStructureElement *poppler_structure_element
);
Obtains the textual description of the form element. Note that the description is for informative purposes, and it is not intended to be rendered. For example, assistive technologies may use the description field to provide an alternate way of presenting an element to the user.
The returned string is allocated by the function. When it is
not needed anymore, be sure to call g_free()
on it.
Since 0.26
guint
poppler_structure_element_get_table_row_span
(PopplerStructureElement *poppler_structure_element
);
Obtains the number of rows the table element spans.
Since 0.26
guint
poppler_structure_element_get_table_column_span
(PopplerStructureElement *poppler_structure_element
);
Obtains the number of columns the table element spans.
Since 0.26
gchar **
poppler_structure_element_get_table_headers
(PopplerStructureElement *poppler_structure_element
);
Obtains an array with the names of the table column headers. This is only useful for table header row elements.
The array with the results is allocated by the function. The number
of items in the returned array can be obtained with g_strv_length()
.
The returned value must be freed using g_strfreev()
.
Zero-terminated array of strings with the table header names,
or NULL
if the attribute is not defined.
[transfer full][array zero-terminated=1][element-type gchar*]
Since 0.26
PopplerStructureTableScope
poppler_structure_element_get_table_scope
(PopplerStructureElement *poppler_structure_element
);
Obtains the scope of a table structure element.
Since 0.26
gchar *
poppler_structure_element_get_table_summary
(PopplerStructureElement *poppler_structure_element
);
Obtains the textual summary of the contents of the table element. Note that the summary is meant for informative purposes, and it is not intended to be rendered. For example, assistive technologies may use the summary to provide an alternate way of presenting an element to the user, or a document indexer may want to scan it for additional keywords.
The returned string is allocated by the function. When it is
not needed anymore, be sure to call g_free()
on it.
Since 0.26
PopplerTextSpan *
poppler_text_span_copy (PopplerTextSpan *poppler_text_span
);
Makes a copy of a text span.
Since 0.26
void
poppler_text_span_free (PopplerTextSpan *poppler_text_span
);
Frees a text span.
Since 0.26
gboolean
poppler_text_span_is_fixed_width_font (PopplerTextSpan *poppler_text_span
);
Checks whether a text span is meant to be rendered using a fixed-width font.
Since 0.26
gboolean
poppler_text_span_is_serif_font (PopplerTextSpan *poppler_text_span
);
Checks whether a text span is meant to be rendered using a serif font.
Since 0.26
gboolean
poppler_text_span_is_bold_font (PopplerTextSpan *poppler_text_span
);
Checks whether a text span is meant to be rendered using a bold font.
Since 0.26
void poppler_text_span_get_color (PopplerTextSpan *poppler_text_span
,PopplerColor *color
);
Obtains the color in which the text is to be rendered.
Since 0.26
const gchar *
poppler_text_span_get_text (PopplerTextSpan *poppler_text_span
);
Obtains the text contained in the span.
Since 0.26
const gchar *
poppler_text_span_get_font_name (PopplerTextSpan *poppler_text_span
);
Obtains the name of the font in which the span is to be rendered.
Since 0.26
typedef struct _PopplerStructureElementIter PopplerStructureElementIter;