Input / Output¶

The dh_segment.io module implements input / output functions and classes.

Input functions for `tf.Estimator`¶

Input function

input_fn(input_data, params[, …])

Input_fn for estimator

Data augmentation

`data_augmentation_fn`(input_image, label_image)	Applies data augmentation to both images and label images.
`extract_patches_fn`(image, patch_shape, offsets)	Will cut a given image into patches.
`rotate_crop`(image, rotation[, crop, …])	Rotates and crops the images.

Resizing function

`resize_image`(image, size[, interpolation])	Resizes the image
`load_and_resize_image`(filename, channels[, …])	Loads an image from its filename and resizes it to the desired output size.

Tensorflow serving functions¶

`serving_input_filename`(resized_size)
`serving_input_image`()

PAGE XML and JSON import / export¶

PAGE classes

`PAGE.Point`(y, x)	Point (x,y) class.
`PAGE.Text`([text_equiv, alternatives, score])	Text entity produced by a transcription system.
`PAGE.Border`([coords, id])	Region containing the page.
`PAGE.TextRegion`([id, coords, text_lines, …])	Region containing text lines.
`PAGE.TextLine`([id, coords, baseline, text, …])	Region corresponding to a text line.
`PAGE.GraphicRegion`([id, coords, …])	Region containing simple graphics.
`PAGE.TableRegion`([id, coords, rows, …])	Tabular data in any form.
`PAGE.SeparatorRegion`(id[, coords, …])	Lines separating columns or paragraphs.
`PAGE.GroupSegment`([id, coords, segment_ids, …])	Set of regions that make a bigger region (group).
`PAGE.Metadata`([creator, created, …])	Metadata information.
`PAGE.Page`(**kwargs)	Class following PAGE-XML object.

Abstract classes

`PAGE.BaseElement`	Base page element class.
`PAGE.Region`([id, coords, custom_attribute])	Region base class.

Parsing and helpers

`PAGE.parse_file`(filename)	Parses the files to create the corresponding `Page` object.
`PAGE.json_serialize`(dict_to_serialize[, …])	Serialize a dictionary in order to export it.

VGG Image Annotator helpers¶

VIA objects

`via.WorkingItem`	A container for annotated images.
`via.VIAttribute`	A container for VIA attributes.

Creating masks with VIA annotations

`via.load_annotation_data`(via_data_filename)	Load the content of via annotation files.
`via.export_annotation_dict`(annotation_dict, …)	Export the annotations to json file.
`via.get_annotations_per_file`(via_dict, name_file)	From VIA json content, get annotations relative to the given name_file.
`via.parse_via_attributes`(via_attributes)	Parses the VIA attribute dictionary and returns a list of VIAttribute instances
`via.get_via_attributes`(annotation_dict[, …])	Gets the attributes of the annotated data and returns a list of VIAttribute.
`via.collect_working_items`(via_annotations, …)	Given VIA annotation input, collect all info on WorkingItem object.
`via.create_masks`(masks_dir, working_items, …)	For each annotation, create a corresponding binary mask and resize it (h = 2000).

Formatting in VIA JSON format

`via.create_via_region_from_coordinates`(…)	Formats coordinates to a VIA region (dict).
`via.create_via_annotation_single_image`(…)	Returns a dictionary item {key: annotation} in VIA format to further export to .json file

dh_segment.io.input_fn(input_data, params, input_label_dir=None, data_augmentation=False, batch_size=5, make_patches=False, num_epochs=1, num_threads=4, image_summaries=False)¶

Input_fn for estimator

Parameters

input_data (Union[str, List[str]]) – input data. It can be a directory containing the images, it can be a list of image filenames, or it can be a path to a csv file.
params (dict) – params from utils.Params object
input_label_dir (Optional[str]) – directory containing the label images
data_augmentation (bool) – boolean, if True will scale, rotate, … the images
batch_size (int) – size of the batch
make_patches (bool) – bool, whether to make patches (crop image in smaller pieces) or not
num_epochs (int) – number of epochs to cycle through data (set it to None for infinite repeat)
num_threads (int) – number of threads to use in parallel when using tf.data.Dataset.map
image_summaries (bool) – boolean, whether to make tf.Summary to watch on tensorboard
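
The three accepted forms of `input_data` can be told apart before any TensorFlow code runs. The sketch below only illustrates that dispatch and is not the library's implementation: `resolve_input_data` is a hypothetical helper, the list of image extensions is a guess, and the CSV is assumed to hold the image path in its first column.

```python
import csv
import os
from typing import List, Union

# Assumed set of image extensions; the library may accept others.
IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.png')


def resolve_input_data(input_data: Union[str, List[str]]) -> List[str]:
    """Hypothetical helper: turn any accepted `input_data` form into a list of image paths."""
    if isinstance(input_data, list):
        # Already a list of image filenames.
        return list(input_data)
    if os.path.isdir(input_data):
        # Directory: collect the image files it contains, in a stable order.
        return sorted(
            os.path.join(input_data, name)
            for name in os.listdir(input_data)
            if name.lower().endswith(IMAGE_EXTENSIONS)
        )
    if input_data.lower().endswith('.csv'):
        # CSV file: assume the first column holds the image path.
        with open(input_data, newline='') as csv_file:
            return [row[0] for row in csv.reader(csv_file) if row]
    raise ValueError('Unsupported input_data: {!r}'.format(input_data))
```

Normalising the input to a plain list early keeps the rest of an input pipeline independent of how the data was specified.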

Returns

fn

dh_segment.io.serving_input_filename(resized_size)¶

dh_segment.io.serving_input_image()¶

dh_segment.io.data_augmentation_fn(input_image, label_image, flip_lr=True, flip_ud=True, color=True)¶

Applies data augmentation to both images and label images. Includes left-right flip, up-down flip and color change.

Parameters

input_image (tensorflow.Tensor) – images to be augmented [B, H, W, C]
label_image (tensorflow.Tensor) – corresponding label images [B, H, W, C]
flip_lr (bool) – option to flip image in left-right direction
flip_ud (bool) – option to flip image in up-down direction
color (bool) – option to change color of images

Return type

(tensorflow.Tensor, tensorflow.Tensor)

Returns

the tuple (augmented images, augmented label images) [B, H, W, C]
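
For segmentation, geometric augmentations must be applied identically to an image and its label image, otherwise the labels no longer line up with the pixels. The following sketch shows that idea with nested lists standing in for tensors. `flip_pair` is a hypothetical name, and the 0.5 flip probability is an assumption rather than a documented value.

```python
import random
from typing import List, Optional, Tuple

Image = List[List[int]]  # one [H, W] channel as nested lists, a stand-in for a tensor


def flip_pair(image: Image, label: Image, flip_lr: bool = True, flip_ud: bool = True,
              rng: Optional[random.Random] = None) -> Tuple[Image, Image]:
    """Randomly flip an image and its label with the same decisions."""
    rng = rng or random.Random()
    # Assumption: each flip is applied with probability 0.5.
    if flip_lr and rng.random() < 0.5:
        # Left-right flip: reverse every row of both images.
        image = [row[::-1] for row in image]
        label = [row[::-1] for row in label]
    if flip_ud and rng.random() < 0.5:
        # Up-down flip: reverse the order of the rows of both images.
        image = image[::-1]
        label = label[::-1]
    return image, label
```

A colour change, by contrast, would be applied to the input image only, since it does not move pixels.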

dh_segment.io.rotate_crop(image, rotation, crop=True, minimum_shape=[0, 0], interpolation='NEAREST')¶

Rotates and crops the images.

Parameters

image (tensorflow.Tensor) – image to be rotated and cropped [H, W, C]
rotation (float) – angle of rotation (in radians)
crop (bool) – option to crop rotated image to avoid black borders due to rotation
minimum_shape (Tuple[int, int]) – minimum shape of the rotated image / cropped image
interpolation (str) – which interpolation to use NEAREST or BILINEAR
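
Cropping after a rotation amounts to finding an axis-aligned rectangle that contains no border pixels. The sketch below uses a standard geometric result for the largest such rectangle. It is offered as an illustration of the idea, not as the computation `rotate_crop` performs, and `largest_rotated_rect` is a hypothetical name.

```python
import math
from typing import Tuple


def largest_rotated_rect(w: float, h: float, angle: float) -> Tuple[float, float]:
    """Width and height of the largest axis-aligned rectangle that fits inside
    a w x h rectangle rotated by `angle` (in radians)."""
    if w <= 0 or h <= 0:
        return 0.0, 0.0
    width_is_longer = w >= h
    side_long, side_short = (w, h) if width_is_longer else (h, w)
    sin_a, cos_a = abs(math.sin(angle)), abs(math.cos(angle))
    if side_short <= 2.0 * sin_a * cos_a * side_long or abs(sin_a - cos_a) < 1e-10:
        # Half-constrained case: two corners of the crop touch the longer sides.
        x = 0.5 * side_short
        wr, hr = (x / sin_a, x / cos_a) if width_is_longer else (x / cos_a, x / sin_a)
    else:
        # Fully constrained case: the crop touches all four sides.
        cos_2a = cos_a * cos_a - sin_a * sin_a
        wr = (w * cos_a - h * sin_a) / cos_2a
        hr = (h * cos_a - w * sin_a) / cos_2a
    return wr, hr


# The angle is given in radians, matching the `rotation` parameter.
crop_w, crop_h = largest_rotated_rect(10, 10, math.radians(45))
```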

Return type

tensorflow.Tensor

Returns

dh_segment.io.resize_image(image, size, interpolation='BILINEAR')¶

Resizes the image

Parameters

image (tensorflow.Tensor) – image to be resized [H, W, C]
size (int) – size of the resized image (in pixels)
interpolation (str) – which interpolation to use, NEAREST or BILINEAR

Return type

tensorflow.Tensor

Returns

resized image

dh_segment.io.load_and_resize_image(filename, channels, size=None, interpolation='BILINEAR')¶

Loads an image from its filename and resizes it to the desired output size.

Parameters

filename (str) – string tensor
channels (int) – number of channels for the decoded image
size (Optional[int]) – number of desired pixels in the resized image, tf.Tensor or int (None for no resizing)
interpolation (str) – which interpolation to use, NEAREST or BILINEAR
return_original_shape – returns the original shape of the image before resizing if this flag is True

Return type

tensorflow.Tensor

Returns

decoded and resized float32 tensor [h, w, channels]
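
The `size` parameter is described as a number of pixels rather than a side length. Assuming it means the total pixel count of the output and that the aspect ratio is preserved (both are readings of the description, not confirmed behaviour), the target shape could be derived as in this sketch. `target_shape` is a hypothetical helper.

```python
import math
from typing import Tuple


def target_shape(height: int, width: int, size: int) -> Tuple[int, int]:
    """Shape (h, w) holding about `size` pixels with the original aspect ratio."""
    # Scaling both sides by sqrt(size / area) scales the area to `size`.
    ratio = math.sqrt(size / (height * width))
    return max(1, round(height * ratio)), max(1, round(width * ratio))


print(target_shape(2000, 1000, 500000))  # (1000, 500)
```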

dh_segment.io.extract_patches_fn(image, patch_shape, offsets)¶

Will cut a given image into patches.

Parameters

image (tensorflow.Tensor) – tf.Tensor
patch_shape (Tuple[int, int]) – shape of the extracted patches [h, w]
offsets (Tuple[int, int]) – offset to add to the origin of first patch top-right coordinate, useful during data augmentation to have slightly different patches each time. This value will be multiplied by [h/2, w/2] (range values [0,1])

Return type

tensorflow.Tensor

Returns

patches [batch_patches, h, w, c]
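
To make the role of `offsets` concrete, the sketch below computes where patches would start for a given image and patch shape. It assumes a non-overlapping grid and drops patches that would cross the image border. Both choices are assumptions, and `patch_origins` is a hypothetical helper rather than part of the module.

```python
from typing import List, Tuple


def patch_origins(image_shape: Tuple[int, int], patch_shape: Tuple[int, int],
                  offsets: Tuple[float, float] = (0.0, 0.0)) -> List[Tuple[int, int]]:
    """Top-left (y, x) corner of every patch that fits inside the image."""
    img_h, img_w = image_shape
    h, w = patch_shape
    # Offsets in [0, 1] are scaled by half a patch, as the description states.
    start_y = int(offsets[0] * h / 2)
    start_x = int(offsets[1] * w / 2)
    # Assumption: patches tile the image without overlap.
    return [(y, x)
            for y in range(start_y, img_h - h + 1, h)
            for x in range(start_x, img_w - w + 1, w)]


print(patch_origins((100, 100), (40, 40)))
# [(0, 0), (0, 40), (40, 0), (40, 40)]
print(patch_origins((100, 100), (40, 40), offsets=(1.0, 0.5)))
# [(20, 10), (20, 50), (60, 10), (60, 50)]
```

Drawing new random offsets for every epoch shifts the grid, so the network sees slightly different crops of the same page.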

dh_segment.io.local_entropy(tf_binary_img, sigma=3)¶

Parameters

tf_binary_img (tensorflow.Tensor) –
sigma (float) –

Return type

tensorflow.Tensor

Returns

class dh_segment.io.PAGE.BaseElement¶

Base page element class. (Abstract)

classmethod check_tag(tag)¶

classmethod full_tag()¶

Return type: str

tag = None¶

class dh_segment.io.PAGE.Border(coords=None, id=None)¶

Region containing the page. It is the border of the actual page of the document (if the scanned image contains parts not belonging to the page).

Variables: coords – coordinates of the Border region

classmethod from_dict(dictionary)¶

Return type: Border

classmethod from_xml(e)¶

Return type: Border

tag = 'Border'¶

to_dict(non_serializable_keys=[])¶

Return type: dict

to_xml()¶

Return type: Element

class dh_segment.io.PAGE.GraphicRegion(id=None, coords=None, custom_attribute=None)¶

Region containing simple graphics. Company logos for example should be marked as graphic regions.

Variables

id – identifier of the GraphicRegion
coords – coordinates of the GraphicRegion

classmethod from_dict(dictionary)¶

Creates a dictionary of the (non-serialized) attributes from a serialized dictionary

Parameters: dictionary (dict) – serialized dictionary
Return type: GraphicRegion
Returns: non serialized dictionary

classmethod from_xml(e)¶

Creates a dictionary from an XML structure in order to create the inherited objects

Parameters: etree_element – an XML etree
Return type: GraphicRegion
Returns: a dictionary with keys ‘id’ and ‘coords’

tag = 'GraphicRegion'¶

to_xml(name_element='GraphicRegion')¶

Converts a Region object to an XML structure

Parameters: name_element – name of the object (optional)
Return type: Element
Returns: an etree structure

class dh_segment.io.PAGE.GroupSegment(id=None, coords=None, segment_ids=None, custom_attribute=None)¶

Set of regions that make a bigger region (group). GroupSegment is a region containing several TextLine objects that together form a bigger region. It is used mainly to make line / column regions. Only for JSON export (no PAGE XML correspondence).

Variables

id – identifier of the GroupSegment
coords – coordinates of the GroupSegment
segment_ids – list of the regions ids belonging to the group

classmethod from_dict(dictionary)¶

Creates a dictionary of the (non-serialized) attributes from a serialized dictionary

Parameters: dictionary (dict) – serialized dictionary
Return type: GroupSegment
Returns: non serialized dictionary

class dh_segment.io.PAGE.Metadata(creator=None, created=None, last_change=None, comments=None)¶

Metadata information.

Variables

creator – name of the process or person that created the exported file
created – time of creation of the file
last_change – time of last modification of the file
comments – comments on the process

classmethod from_dict(dictionary)¶

Return type: Metadata

classmethod from_xml(e)¶

Return type: Metadata

tag = 'Metadata'¶

to_dict()¶

to_xml()¶

Return type: Element

class dh_segment.io.PAGE.Page(**kwargs)¶

Class following PAGE-XML object. This class is used to represent the information of the processed image. It is possible to export this info as PAGE-XML or JSON format.

Variables

image_filename – filename of the image
image_width – width of the original image
image_height – height of the original image
text_regions – list of TextRegion
graphic_regions – list of GraphicRegion
page_border – Border of the page
separator_regions – list of SeparatorRegion
table_regions – list of TableRegion
metadata – Metadata of the image and process
line_groups – list of GroupSegment forming lines
column_groups – list of GroupSegment forming columns

draw_baselines(img_canvas, color=(255, 0, 0), thickness=2, endpoint_radius=4, autoscale=True)¶

Given an image, draws the TextLines.baselines.

Parameters

img_canvas (ndarray) – 3 channel image in which the region will be drawn. The image is modified inplace.
color (Tuple[int, int, int]) – (R, G, B) value color
thickness (int) – the thickness of the line
endpoint_radius (int) – the radius of the endpoints of the line (first and last coordinates of the line)
autoscale (bool) – whether to scale the coordinates to the size of img_canvas. If True, it will use the dimensions provided in Page.image_width and Page.image_height to compute the scaling ratio
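
The `autoscale` option shared by the drawing methods can be pictured as rescaling page coordinates to the canvas before drawing. The sketch below shows one way such a ratio might be computed from the page dimensions and the canvas shape. `scale_points` is a hypothetical helper, and rounding to the nearest pixel is an assumption.

```python
from typing import List, Tuple


def scale_points(points: List[Tuple[int, int]],
                 page_size: Tuple[int, int],
                 canvas_size: Tuple[int, int]) -> List[Tuple[int, int]]:
    """Rescale (y, x) points from page coordinates to canvas coordinates.

    page_size is (image_height, image_width); canvas_size is (height, width).
    """
    ratio_y = canvas_size[0] / page_size[0]
    ratio_x = canvas_size[1] / page_size[1]
    return [(round(y * ratio_y), round(x * ratio_x)) for y, x in points]


# A baseline annotated on a 2000 x 1000 page, drawn on a 1000 x 500 canvas.
print(scale_points([(100, 200), (400, 800)], (2000, 1000), (1000, 500)))
# [(50, 100), (200, 400)]
```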

draw_column_groups(img_canvas, color=(0, 255, 0), fill=False, thickness=5, autoscale=True)¶

It will draw column groups (in case of a table). This is only valid when parsing JSON files.

Parameters

img_canvas (ndarray) – 3 channel image in which the region will be drawn. The image is modified inplace
color (Tuple[int, int, int]) – (R, G, B) value color
fill (bool) – whether to fill the region (True) or only draw the external contours (False)
thickness (int) – in case fill=False the thickness of the line
autoscale (bool) – whether to scale the coordinates to the size of img_canvas. If True, it will use the dimensions provided in Page.image_width and Page.image_height to compute the scaling ratio

draw_graphic_regions(img_canvas, color=(255, 0, 0), fill=True, thickness=3, autoscale=True)¶

Given an image, draws the GraphicRegions, either fills it (fill=True) or draws the contours (fill=False)

Parameters

img_canvas (ndarray) – 3 channel image in which the region will be drawn. The image is modified inplace.
color (Tuple[int, int, int]) – (R, G, B) value color
fill (bool) – whether to fill the region (True) or only draw the external contours (False)
thickness (int) – in case fill=False the thickness of the line
autoscale (bool) – whether to scale the coordinates to the size of img_canvas. If True, it will use the dimensions provided in Page.image_width and Page.image_height to compute the scaling ratio

draw_line_groups(img_canvas, color=(0, 255, 0), fill=False, thickness=5, autoscale=True)¶

It will draw line groups. This is only valid when parsing JSON files.

Parameters

img_canvas (ndarray) – 3 channel image in which the region will be drawn. The image is modified inplace.
color (Tuple[int, int, int]) – (R, G, B) value color
fill (bool) – whether to fill the region (True) or only draw the external contours (False)
thickness (int) – in case fill=False the thickness of the line
autoscale (bool) – whether to scale the coordinates to the size of img_canvas. If True, it will use the dimensions provided in Page.image_width and Page.image_height to compute the scaling ratio

draw_lines(img_canvas, color=(255, 0, 0), thickness=2, fill=True, autoscale=True)¶

Given an image, draws the polygons containing text lines, i.e. TextLines.coords

Parameters

img_canvas (ndarray) – 3 channel image in which the region will be drawn. The image is modified inplace.
color (Tuple[int, int, int]) – (R, G, B) value color
thickness (int) – the thickness of the line
fill (bool) – if True fills the polygon
autoscale (bool) – whether to scale the coordinates to the size of img_canvas. If True, it will use the dimensions provided in Page.image_width and Page.image_height to compute the scaling ratio

draw_page_border(img_canvas, color=(255, 0, 0), fill=True, thickness=5, autoscale=True)¶

Given an image, draws the page border, either fills it (fill=True) or draws the contours (fill=False)

Parameters

img_canvas (ndarray) – 3 channel image in which the region will be drawn. The image is modified inplace.
color (Tuple[int, int, int]) – (R, G, B) value color
fill (bool) – whether to fill the region (True) or only draw the external contours (False)
thickness (int) – in case fill=False the thickness of the line
autoscale (bool) – whether to scale the coordinates to the size of img_canvas. If True, it will use the dimensions provided in Page.image_width and Page.image_height to compute the scaling ratio

draw_separator_lines(img_canvas, color=(0, 255, 0), thickness=3, filter_by_id='', autoscale=True)¶

Given an image, draws the SeparatorRegion.

Parameters

img_canvas (ndarray) – 3 channel image in which the region will be drawn. The image is modified inplace.
color (Tuple[int, int, int]) – (R, G, B) value color
thickness (int) – thickness of the line
filter_by_id (str) – string to filter the lines by id. For example vertical/horizontal lines can be filtered if ‘vertical’ or ‘horizontal’ is mentioned in the id.
autoscale (bool) – whether to scale the coordinates to the size of img_canvas. If True, it will use the dimensions provided in Page.image_width and Page.image_height to compute the scaling ratio

draw_text(img_canvas, color=(255, 0, 0), thickness=5, font=cv2.FONT_HERSHEY_SIMPLEX, font_scale=1.0, autoscale=True)¶

Writes the text of the TextLine on the given image.

Parameters

img_canvas (ndarray) – 3 channel image in which the region will be drawn. The image is modified inplace
color (Tuple[int, int, int]) – (R, G, B) value color
thickness (int) – the thickness of the characters
font – the type of font (cv2 constant)
font_scale (float) – the scale of font
autoscale (bool) – whether to scale the coordinates to the size of img_canvas. If True, it will use the dimensions provided in Page.image_width and Page.image_height to compute the scaling ratio

draw_text_regions(img_canvas, color=(255, 0, 0), fill=True, thickness=3, autoscale=True)¶

Given an image, draws the TextRegions, either fills it (fill=True) or draws the contours (fill=False)

Parameters

img_canvas (ndarray) – 3 channel image in which the region will be drawn. The image is modified inplace.
color (Tuple[int, int, int]) – (R, G, B) value color
fill (bool) – whether to fill the region (True) or only draw the external contours (False)
thickness (int) – in case fill=False the thickness of the line
autoscale (bool) – whether to scale the coordinates to the size of img_canvas. If True, it will use the dimensions provided in Page.image_width and Page.image_height to compute the scaling ratio

classmethod from_dict(dictionary)¶

Return type: Page

classmethod from_xml(e)¶

Return type: Page

tag = 'Page'¶

to_json()¶

Return type: dict

to_xml()¶

Return type: Element

write_to_file(filename, creator_name='dhSegment', comments='')¶

Exports the Page object to JSON or PAGE-XML format. The format is inferred from the extension of the filename; if there is no extension, the file is exported as XML.

Parameters

filename (str) – filename of the file to be exported
creator_name (str) – name of the creator (process or person) creating the file
comments (str) – optional comment to add to the metadata of the file.

Return type

None
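
The extension rule described for `write_to_file` can be sketched as a small dispatch function. `export_format` is a hypothetical helper, and raising an error for an unknown extension is an assumption, since the behaviour in that case is not documented here.

```python
import os


def export_format(filename: str) -> str:
    """Pick 'json' or 'xml' from the filename extension, defaulting to XML."""
    extension = os.path.splitext(filename)[1].lower()
    if extension == '.json':
        return 'json'
    if extension in ('.xml', ''):
        # No extension falls back to XML, as described above.
        return 'xml'
    # Assumption: any other extension is rejected.
    raise ValueError('Unsupported extension: {!r}'.format(extension))


print(export_format('page_0001.json'))  # json
print(export_format('page_0001'))       # xml
```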

class dh_segment.io.PAGE.Point(y, x)¶

Point (x,y) class.

Variables

y – vertical coordinate
x – horizontal coordinate

classmethod array_to_list(array)¶

Converts an np.array to a list of coordinates

Parameters: array (ndarray) – an array of coordinates. Must be of shape (N, 2)
Return type: list
Returns: list of coordinates, shape (N,2)

classmethod array_to_point(array)¶

Converts an np.array to a list of Point

Parameters: array (ndarray) – an array of coordinates. Must be of shape (N, 2)
Return type: list
Returns: list of Point

classmethod cv2_to_point_list(cv2_array)¶

Converts an opencv-formatted set of coordinates to a list of Point

Parameters: cv2_array (ndarray) – opencv-formatted set of coordinates, shape (N,1,2)
Return type: List[Point]
Returns: list of Point

classmethod list_from_xml(etree_elem)¶

Converts a PAGEXML-formatted set of coordinates to a list of Point

Parameters: etree_elem (Element) – etree XML element containing a set of coordinates
Return type: List[Point]
Returns: a list of coordinates as Point

classmethod list_point_to_string(list_points)¶

Converts a list of Point to a string ‘x,y’

Parameters: list_points (List[Point]) – list of coordinates with Point format
Return type: str
Returns: a string with the coordinates
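
Because `Point` takes its arguments as (y, x) while coordinate strings are written as 'x,y', it is easy to swap the two. The sketch below uses a minimal stand-in class (not the real `PAGE.Point`) and assumes that consecutive points are separated by spaces, as in PAGE XML `points` attributes.

```python
from typing import List, NamedTuple


class Point(NamedTuple):
    """Minimal stand-in for PAGE.Point; note the (y, x) argument order."""
    y: int
    x: int


def points_to_string(points: List[Point]) -> str:
    """Write points as space-separated 'x,y' pairs."""
    return ' '.join('{},{}'.format(p.x, p.y) for p in points)


def string_to_points(text: str) -> List[Point]:
    """Parse space-separated 'x,y' pairs back into points."""
    points = []
    for pair in text.split():
        x, y = pair.split(',')
        points.append(Point(y=int(y), x=int(x)))
    return points


polygon = [Point(y=10, x=20), Point(y=30, x=40)]
print(points_to_string(polygon))  # 20,10 40,30
```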

classmethod list_to_cv2poly(list_points)¶

Converts a list of Point to opencv format set of coordinates

Parameters: list_points (List[Point]) – set of coordinates
Return type: ndarray
Returns: opencv-formatted set of points, shape (N,1,2)

classmethod list_to_point(list_coords)¶

Converts a list of coordinates to a list of Point

Parameters: list_coords (list) – list of coordinates, shape (N, 2)
Return type: List[Point]
Returns: list of Point

classmethod point_to_list(points)¶

Converts a list of Point to a list of coordinates

Parameters: points (List[Point]) – list of Points
Return type: list
Returns: list of shape (N,2)

to_dict()¶

class dh_segment.io.PAGE.Region(id=None, coords=None, custom_attribute=None)¶

Region base class. (Abstract) This is the superclass for all the extracted regions

Variables

id – identifier of the Region
coords – coordinates of the Region
custom_attribute – Any custom attribute that may be linked with the region (usually this is added in PAGEXML files, not in JSON files)

classmethod from_dict(dictionary)¶

Creates a dictionary of the (non-serialized) attributes from a serialized dictionary

Parameters: dictionary (dict) – serialized dictionary
Return type: dict
Returns: non serialized dictionary

classmethod from_xml(etree_element)¶

Creates a dictionary from an XML structure in order to create the inherited objects

Parameters: etree_element (Element) – an XML etree
Return type: dict
Returns: a dictionary with keys ‘id’ and ‘coords’

tag = 'Region'¶

to_dict(non_serializable_keys=[])¶

Converts a Region object to a dictionary.

Parameters: non_serializable_keys (List[str]) – list of keys that can’t be directly serialized and that need some internal serialization
Return type: dict
Returns: a dictionary with the attributes of the object serialized

to_xml(name_element=None)¶

Converts a Region object to an XML structure

Parameters: name_element (Optional[str]) – name of the object (optional)
Return type: Element
Returns: an etree structure

class dh_segment.io.PAGE.SeparatorRegion(id, coords=None, custom_attribute=None)¶

Lines separating columns or paragraphs. Separators are lines that lie between columns and paragraphs and can be used to logically separate different articles from each other.

Variables

id – identifier of the SeparatorRegion
coords – coordinates of the SeparatorRegion

classmethod from_dict(dictionary)¶

Creates a dictionary of the (non-serialized) attributes from a serialized dictionary

Parameters: dictionary (dict) – serialized dictionary
Return type: SeparatorRegion
Returns: non serialized dictionary

classmethod from_xml(e)¶

Creates a dictionary from an XML structure in order to create the inherited objects

Parameters: etree_element – an XML etree
Return type: SeparatorRegion
Returns: a dictionary with keys ‘id’ and ‘coords’

tag = 'SeparatorRegion'¶

to_xml(name_element='SeparatorRegion')¶

Converts a Region object to an XML structure

Parameters: name_element – name of the object (optional)
Return type: Element
Returns: an etree structure

class dh_segment.io.PAGE.TableRegion(id=None, coords=None, rows=None, columns=None, embedded_text=None, custom_attribute=None)¶

Tabular data in any form. Tabular data is represented with a table region. Rows and columns may or may not have separator lines; these lines are not separator regions.

Variables

id – identifier of the TableRegion
coords – coordinates of the TableRegion
rows – number of rows in the table
columns – number of columns in the table
embedded_text – if text is embedded in the table

classmethod from_dict(dictionary)¶

Creates a dictionary of the (non-serialized) attributes from a serialized dictionary

Parameters: dictionary (dict) – serialized dictionary
Return type: TableRegion
Returns: non serialized dictionary

classmethod from_xml(e)¶

Creates a dictionary from an XML structure in order to create the inherited objects

Parameters: etree_element – an XML etree
Return type: TableRegion
Returns: a dictionary with keys ‘id’ and ‘coords’

tag = 'TableRegion'¶

to_xml(name_element='TableRegion')¶

Converts a Region object to an XML structure

Parameters: name_element – name of the object (optional)
Return type: Element
Returns: an etree structure

class dh_segment.io.PAGE.Text(text_equiv=None, alternatives=None, score=None)¶

Text entity produced by a transcription system.

Variables

text_equiv – the transcription of the text
alternatives – alternative transcriptions
score – the confidence of the transcription output by the transcription system

to_dict()¶

Return type: dict

class dh_segment.io.PAGE.TextLine(id=None, coords=None, baseline=None, text=None, line_group_id=None, column_group_id=None, custom_attribute=None)¶

Region corresponding to a text line.

Variables

id – identifier of the TextLine
coords – coordinates of the TextLine polygon
baseline – coordinates of the TextLine baseline
text – Text class containing the transcription of the TextLine
line_group_id – identifier of the line group the instance belongs to
column_group_id – identifier of the column group the instance belongs to
custom_attribute – Any custom attribute that may be linked with the region (usually this is added in PAGEXML files, not in JSON files)

classmethod from_array(cv2_coords=None, baseline_coords=None, text_equiv=None, id=None)¶

classmethod from_dict(dictionary)¶

Creates a dictionary of the (non-serialized) attributes from a serialized dictionary

Parameters: dictionary (dict) – serialized dictionary
Return type: TextLine
Returns: non serialized dictionary

classmethod from_xml(etree_element)¶

Creates a dictionary from an XML structure in order to create the inherited objects

Parameters: etree_element (Element) – an XML etree
Return type: TextLine
Returns: a dictionary with keys ‘id’ and ‘coords’

scale_baseline_points(ratio)¶

Scales the points of the baseline by a factor ratio.

Parameters: ratio (float) – factor to rescale the baseline coordinates

tag = 'TextLine'¶

to_dict(non_serializable_keys=[])¶

Converts a Region object to a dictionary.

Parameters: non_serializable_keys (List[str]) – list of keys that can’t be directly serialized and that need some internal serialization
Returns: a dictionary with the attributes of the object serialized

to_xml(name_element='TextLine')¶

Converts a Region object to an XML structure

Parameters: name_element – name of the object (optional)
Return type: Element
Returns: an etree structure

class dh_segment.io.PAGE.TextRegion(id=None, coords=None, text_lines=None, text_equiv='', region_type=None, custom_attribute=None)¶

Region containing text lines. It can represent a paragraph or a page for instance.

Variables

id – identifier of the TextRegion
coords – coordinates of the TextRegion
text_equiv – the resulting text of the Text contained in the TextLines
text_lines – a list of TextLine objects
region_type – the type of a TextRegion (can be any string). Example : header, paragraph, page-number…
custom_attribute – Any custom attribute that may be linked with the region (usually this is added in PAGEXML files, not in JSON files)

classmethod from_dict(dictionary)¶

Creates a dictionary of the (non-serialized) attributes from a serialized dictionary

Parameters: dictionary (dict) – serialized dictionary
Return type: TextRegion
Returns: non serialized dictionary

classmethod from_xml(e)¶

Creates a dictionary from an XML structure in order to create the inherited objects

Parameters: etree_element – an XML etree
Return type: TextRegion
Returns: a dictionary with keys ‘id’ and ‘coords’

sort_text_lines(top_to_bottom=True)¶

Sorts TextLine from top to bottom according to their mean y coordinate (centroid)

Parameters: top_to_bottom (bool) – order lines from top to bottom of image, default=True
Return type: None
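
Sorting by centroid can be sketched with plain polygons. Unlike `sort_text_lines`, which returns None and therefore presumably reorders the region's lines in place, the hypothetical helper below returns a new list. Polygons are given as lists of (y, x) pairs, an assumed representation.

```python
from typing import List, Tuple

Polygon = List[Tuple[float, float]]  # list of (y, x) vertices


def sort_lines_by_centroid(lines: List[Polygon], top_to_bottom: bool = True) -> List[Polygon]:
    """Order line polygons by the mean y coordinate of their vertices."""
    def mean_y(polygon: Polygon) -> float:
        return sum(y for y, _ in polygon) / len(polygon)

    return sorted(lines, key=mean_y, reverse=not top_to_bottom)


lower = [(50, 0), (60, 100)]
upper = [(10, 0), (20, 100)]
print(sort_lines_by_centroid([lower, upper]) == [upper, lower])  # True
```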

tag = 'TextRegion'¶

to_dict(non_serializable_keys=[])¶

Converts a Region object to a dictionary.

Parameters: non_serializable_keys (List[str]) – list of keys that can’t be directly serialized and that need some internal serialization
Returns: a dictionary with the attributes of the object serialized

to_xml(name_element='TextRegion')¶

Converts a Region object to an XML structure

Parameters: name_element – name of the object (optional)
Return type: Element
Returns: an etree structure

dh_segment.io.PAGE.get_unique_tags_from_xml_text_regions(xml_filename, tag_pattern='{type:.*;}')¶

Get a list of all the values of labels/tags

Parameters

xml_filename (str) – filename of the xml file
tag_pattern (str) – regular expression pattern to look for in TextRegion.custom_attribute
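
To illustrate how a tag pattern can pull label values out of custom attributes, the sketch below runs a regular expression over a few strings. The attribute strings are invented for the example, `unique_tags` is a hypothetical helper, and its pattern is a variant of the default shown above with escaped braces and a capture group so that only the value is returned.

```python
import re
from typing import Iterable, List


def unique_tags(custom_attributes: Iterable[str],
                tag_pattern: str = r'\{type:(.*?);\}') -> List[str]:
    """Collect the distinct values captured by `tag_pattern`, sorted alphabetically."""
    tags = set()
    for attribute in custom_attributes:
        # findall returns the captured group, i.e. the text between 'type:' and ';'.
        tags.update(re.findall(tag_pattern, attribute))
    return sorted(tags)


attributes = ['structure {type:header;}', 'structure {type:paragraph;}',
              'structure {type:header;}', '']
print(unique_tags(attributes))  # ['header', 'paragraph']
```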

Returns

dh_segment.io.PAGE.json_serialize(dict_to_serialize, non_serializable_keys=[])¶

Serialize a dictionary in order to export it.

Parameters

dict_to_serialize (dict) – dictionary to serialize
non_serializable_keys (List[str]) – keys that are not directly serializable, such as Python objects

Return type

dict

Returns

the serialized dictionary
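
One plausible reading of `non_serializable_keys` is that the values stored under those keys are objects that must be converted before `json.dumps` can handle them. The sketch below follows that reading. It assumes the objects expose a `to_dict()` method, as several classes on this page do, and `serialize` and `Dot` are hypothetical names used only for illustration.

```python
from typing import Any, Dict, Sequence


class Dot:
    """Hypothetical non-serializable object exposing a to_dict() method."""

    def __init__(self, y: int, x: int):
        self.y, self.x = y, x

    def to_dict(self) -> Dict[str, int]:
        return {'y': self.y, 'x': self.x}


def serialize(dict_to_serialize: Dict[str, Any],
              non_serializable_keys: Sequence[str] = ()) -> Dict[str, Any]:
    """Replace objects stored under the given keys by their dictionary form."""
    serialized = {}
    for key, value in dict_to_serialize.items():
        if key in non_serializable_keys and value is not None:
            if isinstance(value, list):
                # A list of objects becomes a list of dictionaries.
                serialized[key] = [item.to_dict() for item in value]
            else:
                serialized[key] = value.to_dict()
        else:
            # Plain values (str, int, None, ...) are copied unchanged.
            serialized[key] = value
    return serialized


region = {'id': 'r1', 'coords': [Dot(1, 2), Dot(3, 4)], 'origin': Dot(0, 0)}
exported = serialize(region, non_serializable_keys=['coords', 'origin'])
```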

dh_segment.io.PAGE.parse_file(filename)¶

Parses the files to create the corresponding Page object. The files can be a .xml or a .json.

Parameters: filename (str) – file to parse (either json or page xml)
Return type: Page
Returns: Page object containing all the parsed elements

dh_segment.io.PAGE.save_baselines(filename, baselines, ratio=(1, 1), predictions_shape=None)¶

Parameters

filename (str) – filename to save baselines to
baselines – list of baselines
ratio (Tuple[int, int]) – ratio of prediction shape over original shape
predictions_shape (Optional[Tuple[int, int]]) – shape of the masks output by the network

Return type

Page

Returns

class dh_segment.io.via.VIAttribute¶

A container for VIA attributes.

Parameters

name (str) – The name of attribute
type (str) – The type of the annotation (dropdown, markbox, …)
options (list) – The options / labels possible for this attribute.

property name¶: Alias for field number 0

property options¶: Alias for field number 2

property type¶: Alias for field number 1
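
The "Alias for field number" entries indicate a named-tuple-style container, so fields can be read by name or by position. The sketch below builds a stand-in with the documented field order. It does not import the real class, and the attribute values are invented.

```python
from collections import namedtuple

# Stand-in with the documented field order: name (0), type (1), options (2).
VIAttribute = namedtuple('VIAttribute', ['name', 'type', 'options'])

attribute = VIAttribute(name='text_type', type='dropdown',
                        options=['header', 'paragraph'])

# Access by name and by position refer to the same fields.
print(attribute.name == attribute[0])  # True

# Positional unpacking follows the field numbers.
name, kind, options = attribute
```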

class dh_segment.io.via.WorkingItem¶

A container for annotated images.

Parameters

collection (str) – name of the collection
image_name (str) – name of the image
original_x (int) – original image x size (width)
original_y (int) – original image y size (height)
reduced_x (int) – resized x size
reduced_y (int) – resized y size
iiif (str) – iiif url
annotations (dict) – VIA ‘region_attributes’

property annotations¶: Alias for field number 7

property collection¶: Alias for field number 0

property iiif¶: Alias for field number 6

property image_name¶: Alias for field number 1

property original_x¶: Alias for field number 2

property original_y¶: Alias for field number 3

property reduced_x¶: Alias for field number 4

property reduced_y¶: Alias for field number 5

dh_segment.io.via.collect_working_items(via_annotations, collection_name, images_dir=None, via_version=2)¶

Given VIA annotation input, collect all info on WorkingItem object. This function will take care of separating images from local files and images from IIIF urls.

Parameters

via_annotations (dict) – via annotations (‘regions’ field)
images_dir (Optional[str]) – directory where to find the images
collection_name (str) – name of the collection
via_version (int) – version of the VIA tool used to produce the annotations (1 or 2)

Return type

List[WorkingItem]

Returns

list of WorkingItem

dh_segment.io.via.convert_via_region_page_text_region(working_item, structure_label)¶

Parameters

working_item (WorkingItem) –
structure_label (str) –

Return type

Page

Returns

dh_segment.io.via.create_masks(masks_dir, working_items, via_attributes, collection, contours_only=False)¶

For each annotation, create a corresponding binary mask and resize it (h = 2000). Only valid for VIA 2.0. Several annotations of the same class on the same image produce one image with several masks.

Parameters

masks_dir (str) – where to output the masks
working_items (List[WorkingItem]) – information to work with
via_attributes (List[VIAttribute]) – VIAttributes computed by get_via_attributes function.
collection (str) – name of the collection
contours_only (bool) – creates the binary masks only for the contours of the object (thickness of contours : 20 px)

Return type

dict

Returns

annotation_summary, a dictionary containing a list of labels per image
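
Resizing a mask to a fixed height of 2000 pixels implies a matching width. Assuming the aspect ratio is kept (the description does not say so explicitly), the reduced dimensions could be computed as below. `reduced_size` is a hypothetical helper whose argument names borrow from the WorkingItem fields.

```python
from typing import Tuple


def reduced_size(original_x: int, original_y: int,
                 target_height: int = 2000) -> Tuple[int, int]:
    """Return (reduced_x, reduced_y) for a fixed target height."""
    # Assumption: width is scaled by the same ratio as height.
    ratio = target_height / original_y
    return round(original_x * ratio), target_height


print(reduced_size(3000, 4000))  # (1500, 2000)
```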

dh_segment.io.via.create_via_annotation_single_image(img_filename, via_regions, file_attributes=None)¶

Returns a dictionary item {key: annotation} in VIA format to further export to .json file

Parameters

img_filename (str) – path to the image
via_regions (List[dict]) – regions in VIA format (output from create_via_region_from_coordinates)
file_attributes (Optional[dict]) – file attributes (usually None)

Return type

Dict[str, dict]

Returns

dictionary item with key and annotations in VIA format

dh_segment.io.via.create_via_region_from_coordinates(coordinates, region_attributes, type_region)¶

Formats coordinates to a VIA region (dict).

Parameters

coordinates (ndarray) – (N, 2) coordinates (x, y)
region_attributes (dict) – dictionary with keys : name of labels, values : values of labels
type_region (str) – via region annotation type (‘rect’, ‘polygon’)

Return type

dict

Returns

a region in VIA style (dict/json)
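
For orientation, a region in a VIA 2 JSON export pairs `shape_attributes` with `region_attributes`. The sketch below builds such a dictionary from (x, y) coordinates. It illustrates the file format and is not the library's code: `via_region` is a hypothetical name, and deriving a rectangle from the bounding box of the points is an assumption.

```python
from typing import Dict, List, Tuple


def via_region(coordinates: List[Tuple[int, int]], region_attributes: Dict[str, str],
               type_region: str) -> dict:
    """Build a VIA-style region dictionary from (x, y) coordinates."""
    xs = [int(x) for x, _ in coordinates]
    ys = [int(y) for _, y in coordinates]
    if type_region == 'polygon':
        shape = {'name': 'polygon', 'all_points_x': xs, 'all_points_y': ys}
    elif type_region == 'rect':
        # Assumption: the rectangle is the bounding box of the given points.
        shape = {'name': 'rect', 'x': min(xs), 'y': min(ys),
                 'width': max(xs) - min(xs), 'height': max(ys) - min(ys)}
    else:
        raise ValueError('Unsupported region type: {!r}'.format(type_region))
    return {'region_attributes': region_attributes, 'shape_attributes': shape}


triangle = via_region([(0, 0), (10, 0), (10, 5)], {'type': 'text'}, 'polygon')
```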

dh_segment.io.via.export_annotation_dict(annotation_dict, filename)¶

Export the annotations to json file.

Parameters

annotation_dict (dict) – VIA annotations
filename (str) – filename to export the data (json file)

Return type

None

Returns

dh_segment.io.via.get_annotations_per_file(via_dict, name_file)¶

From VIA json content, get annotations relative to the given name_file.

Parameters

via_dict (dict) – VIA annotations content (originally json)
name_file (str) – the file to look for (it can be a iiif path or a file path)

Return type

dict

Returns

dict

dh_segment.io.via.get_via_attributes(annotation_dict, via_version=2)¶

Gets the attributes of the annotated data and returns a list of VIAttribute.

Parameters

annotation_dict (dict) – json content of the VIA exported file
via_version (int) – either 1 or 2 (for VIA v 1.0 or VIA v 2.0)

Return type

List[VIAttribute]

Returns

A list containing VIAttributes

dh_segment.io.via.load_annotation_data(via_data_filename, only_img_annotations=False, via_version=2)¶

Load the content of via annotation files.

Parameters

via_data_filename (str) – via annotations json file
only_img_annotations (bool) – load only the images annotations (‘_via_img_metadata’ field)
via_version (int) –

Return type

dict

Returns

the content of the JSON file containing the annotated regions
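
A VIA 2 project file is a JSON document in which the per-image annotations sit under the '_via_img_metadata' field. The sketch below shows how the `only_img_annotations` flag might select that field. `load_via_project` is a hypothetical helper, and handling of VIA 1 files is left out.

```python
import json


def load_via_project(via_data_filename: str, only_img_annotations: bool = False) -> dict:
    """Load a VIA JSON file, optionally keeping only the image annotations."""
    with open(via_data_filename, 'r', encoding='utf8') as json_file:
        content = json.load(json_file)
    if only_img_annotations:
        # Keep only the per-image annotations of a VIA 2 project file.
        return content['_via_img_metadata']
    return content
```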

dh_segment.io.via.parse_via_attributes(via_attributes)¶

Parses the VIA attribute dictionary and returns a list of VIAttribute instances

Parameters: via_attributes (dict) – attributes from VIA annotation (‘_via_attributes’ field)
Return type: List[VIAttribute]
Returns: list of VIAttribute
