Input / Output¶
The dh_segment.io module implements input / output functions and classes.
Input functions for tf.Estimator¶
Input function
input_fn | Input_fn for estimator
Data augmentation
data_augmentation_fn | Applies data augmentation to both images and label images.
extract_patches_fn | Will cut a given image into patches.
rotate_crop | Rotates and crops the images.
Resizing function
resize_image | Resizes the image
load_and_resize_image | Loads an image from its filename and resizes it to the desired output size.
Tensorflow serving functions¶
serving_input_filename |
serving_input_image |
PAGE XML and JSON import / export¶
PAGE classes
Point | Point (x,y) class.
Text | Text entity produced by a transcription system.
Border | Region containing the page.
TextRegion | Region containing text lines.
TextLine | Region corresponding to a text line.
GraphicRegion | Region containing simple graphics.
TableRegion | Tabular data in any form.
SeparatorRegion | Lines separating columns or paragraphs.
GroupSegment | Set of regions that make a bigger region (group).
Metadata | Metadata information.
Page | Class following PAGE-XML object.
Abstract classes
BaseElement | Base page element class.
Region | Region base class.
Parsing and helpers
parse_file | Parses the files to create the corresponding Page object.
json_serialize | Serialize a dictionary in order to export it.
VGG Image Annotator helpers¶
VIA objects
WorkingItem | A container for annotated images.
VIAttribute | A container for VIA attributes.
Creating masks with VIA annotations
load_annotation_data | Load the content of VIA annotation files.
export_annotation_dict | Export the annotations to a JSON file.
get_annotations_per_file | From VIA JSON content, get annotations relative to the given name_file.
parse_via_attributes | Parses the VIA attribute dictionary and returns a list of VIAttribute instances.
get_via_attributes | Gets the attributes of the annotated data and returns a list of VIAttribute.
collect_working_items | Given VIA annotation input, collect all info on WorkingItem object.
create_masks | For each annotation, create a corresponding binary mask and resize it (h = 2000).
Formatting in VIA JSON format
create_via_region_from_coordinates | Formats coordinates to a VIA region (dict).
create_via_annotation_single_image | Returns a dictionary item {key: annotation} in VIA format to further export to a .json file.
-
dh_segment.io.
input_fn
(input_data, params, input_label_dir=None, data_augmentation=False, batch_size=5, make_patches=False, num_epochs=1, num_threads=4, image_summaries=False)¶ Input_fn for estimator
- Parameters
input_data (Union[str, List[str]]) – input data. It can be a directory containing the images, a list of image filenames, or a path to a csv file.
params (dict) – params from utils.Params object
input_label_dir (Optional[str]) – directory containing the label images
data_augmentation (bool) – if True, will scale, rotate, … the images
batch_size (int) – size of the batch
make_patches (bool) – whether to make patches (crop the image into smaller pieces) or not
num_epochs (int) – number of epochs to cycle through the data (set it to None for infinite repeat)
num_threads (int) – number of threads to use in parallel when using tf.data.Dataset.map
image_summaries (bool) – whether to make tf.Summary to watch on TensorBoard
- Returns
fn
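The three accepted forms of input_data can be told apart before any dataset is built. The sketch below only illustrates that dispatch and is not the code used by dh_segment: the helper name describe_input_data and the rule that a csv path is recognised by its .csv extension are assumptions.

```python
import os
from typing import List, Union


def describe_input_data(input_data: Union[str, List[str]]) -> str:
    """Classify input_data as 'list', 'directory' or 'csv' (hypothetical helper)."""
    if isinstance(input_data, list):
        return "list"
    if os.path.isdir(input_data):
        return "directory"
    # Assumption: a csv input is recognised by its file extension.
    if input_data.lower().endswith(".csv"):
        return "csv"
    raise ValueError(f"Unsupported input_data: {input_data!r}")
```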
-
dh_segment.io.
serving_input_filename
(resized_size)¶
-
dh_segment.io.
serving_input_image
()¶
-
dh_segment.io.
data_augmentation_fn
(input_image, label_image, flip_lr=True, flip_ud=True, color=True)¶ Applies data augmentation to both images and label images. Includes left-right flip, up-down flip and color change.
- Parameters
input_image (tensorflow.Tensor) – images to be augmented [B, H, W, C]
label_image (tensorflow.Tensor) – corresponding label images [B, H, W, C]
flip_lr (bool) – option to flip image in left-right direction
flip_ud (bool) – option to flip image in up-down direction
color (bool) – option to change color of images
- Return type
(tensorflow.Tensor, tensorflow.Tensor)
- Returns
the tuple (augmented images, augmented label images) [B, H, W, C]
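Geometric augmentations have to be applied identically to an image and its label image, otherwise the labels no longer line up with the pixels. The following pure-Python sketch shows that idea on small nested lists. It is not the TensorFlow implementation: the helper names and the 50% flip probability are assumptions made for illustration.

```python
import random


def flip_lr(image):
    """Mirror a 2-D image (list of rows) left to right."""
    return [row[::-1] for row in image]


def flip_ud(image):
    """Mirror a 2-D image (list of rows) top to bottom."""
    return image[::-1]


def augment_pair(image, label, rng, do_flip_lr=True, do_flip_ud=True):
    """Apply the same randomly chosen flips to an image and its label."""
    # Assumption: each enabled flip is applied with probability 0.5.
    if do_flip_lr and rng.random() < 0.5:
        image, label = flip_lr(image), flip_lr(label)
    if do_flip_ud and rng.random() < 0.5:
        image, label = flip_ud(image), flip_ud(label)
    return image, label


image = [[1, 2], [3, 4]]
label = [[10, 20], [30, 40]]
aug_image, aug_label = augment_pair(image, label, random.Random(0))
```

Because one random draw drives both flips of a pair, every label value stays at the same position as the pixel it describes.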
-
dh_segment.io.
rotate_crop
(image, rotation, crop=True, minimum_shape=[0, 0], interpolation='NEAREST')¶ Rotates and crops the images.
- Parameters
image (tensorflow.Tensor) – image to be rotated and cropped [H, W, C]
rotation (float) – angle of rotation (in radians)
crop (bool) – option to crop rotated image to avoid black borders due to rotation
minimum_shape (Tuple[int, int]) – minimum shape of the rotated image / cropped image
interpolation (str) – which interpolation to use, NEAREST or BILINEAR
- Return type
tensorflow.Tensor
- Returns
-
dh_segment.io.
resize_image
(image, size, interpolation='BILINEAR')¶ Resizes the image
- Parameters
image (tensorflow.Tensor) – image to be resized [H, W, C]
size (int) – size of the resized image (in pixels)
interpolation (str) – which interpolation to use, NEAREST or BILINEAR
- Return type
tensorflow.Tensor
- Returns
resized image
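One way to read a size given "in pixels" is as a target total pixel count, reached while keeping the aspect ratio. The sketch below computes the output shape under that reading. This interpretation, the helper name target_shape and the rounding are assumptions, not a statement of how resize_image is implemented.

```python
import math


def target_shape(height, width, size):
    """Return (new_height, new_width) with about `size` pixels, same aspect ratio."""
    # Scale both sides by the same factor so that new_h * new_w ~= size.
    ratio = math.sqrt(size / (height * width))
    return max(1, round(height * ratio)), max(1, round(width * ratio))
```

For example, a 1000 x 2000 image with size=500000 would be scaled by 0.5 on each side.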
-
dh_segment.io.
load_and_resize_image
(filename, channels, size=None, interpolation='BILINEAR')¶ Loads an image from its filename and resizes it to the desired output size.
- Parameters
filename (str) – string tensor
channels (int) – number of channels for the decoded image
size (Optional[int]) – number of desired pixels in the resized image, tf.Tensor or int (None for no resizing)
interpolation (str) –
return_original_shape – returns the original shape of the image before resizing if this flag is True
- Return type
tensorflow.Tensor
- Returns
decoded and resized float32 tensor [h, w, channels],
-
dh_segment.io.
extract_patches_fn
(image, patch_shape, offsets)¶ Will cut a given image into patches.
- Parameters
image (tensorflow.Tensor) – tf.Tensor
patch_shape (Tuple[int, int]) – shape of the extracted patches [h, w]
offsets (Tuple[int, int]) – offset to add to the origin of the first patch's top-right coordinate, useful during data augmentation to get slightly different patches each time. This value will be multiplied by [h/2, w/2] (range values [0,1])
- Return type
tensorflow.Tensor
- Returns
patches [batch_patches, h, w, c]
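The effect of offsets can be seen by listing where each patch would start. This is a plain-Python sketch rather than the TensorFlow code: the helper name patch_origins is hypothetical, and it assumes non-overlapping patches (stride equal to the patch size). The offsets are scaled by [h/2, w/2] as described above.

```python
def patch_origins(image_shape, patch_shape, offsets=(0.0, 0.0)):
    """List the (row, col) origin of every full patch inside the image."""
    img_h, img_w = image_shape
    h, w = patch_shape
    # Offsets in [0, 1] are multiplied by half the patch size.
    off_y = int(offsets[0] * h / 2)
    off_x = int(offsets[1] * w / 2)
    # Assumption: stride equals the patch size (no overlap).
    return [(y, x)
            for y in range(off_y, img_h - h + 1, h)
            for x in range(off_x, img_w - w + 1, w)]
```

Drawing a new random offset at each epoch shifts the grid, so the network sees slightly different crops of the same image.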
-
dh_segment.io.
local_entropy
(tf_binary_img, sigma=3)¶ - Parameters
tf_binary_img (tensorflow.Tensor) –
sigma (
float
) –
- Return type
tensorflow.Tensor
- Returns
-
class
dh_segment.io.PAGE.
BaseElement
¶ Base page element class. (Abstract)
-
classmethod
check_tag
(tag)¶
-
classmethod
full_tag
()¶ - Return type
str
-
tag
= None¶
-
classmethod
-
class
dh_segment.io.PAGE.
Border
(coords=None, id=None)¶ Region containing the page. It is the border of the actual page of the document (if the scanned image contains parts not belonging to the page).
- Variables
coords – coordinates of the Border region
-
tag
= 'Border'¶
-
to_dict
(non_serializable_keys=[])¶ - Return type
dict
-
to_xml
()¶ - Return type
Element
-
class
dh_segment.io.PAGE.
GraphicRegion
(id=None, coords=None, custom_attribute=None)¶ Region containing simple graphics. Company logos for example should be marked as graphic regions.
- Variables
id – identifier of the GraphicRegion
coords – coordinates of the GraphicRegion
-
classmethod
from_dict
(dictionary)¶ From a serialized dictionary, creates a dictionary of the attributes (non-serialized)
- Parameters
dictionary (dict) – serialized dictionary
- Return type
- Returns
non serialized dictionary
-
classmethod
from_xml
(e)¶ Creates a dictionary from a XML structure in order to create the inherited objects
- Parameters
etree_element – a xml etree
- Return type
- Returns
a dictionary with keys ‘id’ and ‘coords’
-
tag
= 'GraphicRegion'¶
-
to_xml
(name_element='GraphicRegion')¶ Converts a Region object to a xml structure
- Parameters
name_element – name of the object (optional)
- Return type
Element
- Returns
a etree structure
-
class
dh_segment.io.PAGE.
GroupSegment
(id=None, coords=None, segment_ids=None, custom_attribute=None)¶ Set of regions that make a bigger region (group). GroupSegment is a region containing several TextLine and that form a bigger region. It is used mainly to make line / column regions. Only for JSON export (no PAGE XML correspondence).
- Variables
id – identifier of the GroupSegment
coords – coordinates of the GroupSegment
segment_ids – list of the regions ids belonging to the group
-
classmethod
from_dict
(dictionary)¶ From a serialized dictionary, creates a dictionary of the attributes (non-serialized)
- Parameters
dictionary (dict) – serialized dictionary
- Return type
- Returns
non serialized dictionary
-
class
dh_segment.io.PAGE.
Metadata
(creator=None, created=None, last_change=None, comments=None)¶ Metadata information.
- Variables
creator – name of the process or person that created the exported file
created – time of creation of the file
last_change – time of last modification of the file
comments – comments on the process
-
tag
= 'Metadata'¶
-
to_dict
()¶
-
to_xml
()¶ - Return type
Element
-
class
dh_segment.io.PAGE.
Page
(**kwargs)¶ Class following PAGE-XML object. This class is used to represent the information of the processed image. It is possible to export this info as PAGE-XML or JSON format.
- Variables
image_filename – filename of the image
image_width – width of the original image
image_height – height of the original image
text_regions – list of TextRegion
graphic_regions – list of GraphicRegion
page_border – Border of the page
separator_regions – list of SeparatorRegion
table_regions – list of TableRegion
metadata – Metadata of the image and process
line_groups – list of GroupSegment forming lines
column_groups – list of GroupSegment forming columns
-
draw_baselines
(img_canvas, color=(255, 0, 0), thickness=2, endpoint_radius=4, autoscale=True)¶ Given an image, draws the TextLines.baselines.
- Parameters
img_canvas (ndarray) – 3 channel image in which the region will be drawn. The image is modified in place.
color (Tuple[int, int, int]) – (R, G, B) value color
thickness (int) – the thickness of the line
endpoint_radius (int) – the radius of the endpoints of lines (first and last coordinates of line)
autoscale (bool) – whether to scale the coordinates to the size of img_canvas. If True, it will use the dimensions provided in Page.image_width and Page.image_height to compute the scaling ratio
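The autoscale option of the draw_* methods amounts to mapping coordinates expressed in the original image frame onto a canvas of a different size. A minimal sketch of that mapping follows. The helper name scale_points and the rounding to the nearest pixel are assumptions; the library may compute the ratio differently.

```python
def scale_points(points, page_shape, canvas_shape):
    """Map (x, y) points from the page frame to the canvas frame.

    page_shape is (image_height, image_width); canvas_shape is (height, width).
    """
    page_h, page_w = page_shape
    canvas_h, canvas_w = canvas_shape
    ratio_y = canvas_h / page_h
    ratio_x = canvas_w / page_w
    return [(round(x * ratio_x), round(y * ratio_y)) for x, y in points]
```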
-
draw_column_groups
(img_canvas, color=(0, 255, 0), fill=False, thickness=5, autoscale=True)¶ It will draw column groups (in case of a table). This is only valid when parsing JSON files.
- Parameters
img_canvas (ndarray) – 3 channel image in which the region will be drawn. The image is modified in place.
color (Tuple[int, int, int]) – (R, G, B) value color
fill (bool) – whether to fill the region (True) or only draw the external contours (False)
thickness (int) – in case fill=False the thickness of the line
autoscale (bool) – whether to scale the coordinates to the size of img_canvas. If True, it will use the dimensions provided in Page.image_width and Page.image_height to compute the scaling ratio
-
draw_graphic_regions
(img_canvas, color=(255, 0, 0), fill=True, thickness=3, autoscale=True)¶ Given an image, draws the GraphicRegions, either fills it (fill=True) or draws the contours (fill=False)
- Parameters
img_canvas (ndarray) – 3 channel image in which the region will be drawn. The image is modified in place.
color (Tuple[int, int, int]) – (R, G, B) value color
fill (bool) – whether to fill the region (True) or only draw the external contours (False)
thickness (int) – in case fill=True the thickness of the line
autoscale (bool) – whether to scale the coordinates to the size of img_canvas. If True, it will use the dimensions provided in Page.image_width and Page.image_height to compute the scaling ratio
-
draw_line_groups
(img_canvas, color=(0, 255, 0), fill=False, thickness=5, autoscale=True)¶ It will draw line groups. This is only valid when parsing JSON files.
- Parameters
img_canvas (ndarray) – 3 channel image in which the region will be drawn. The image is modified in place.
color (Tuple[int, int, int]) – (R, G, B) value color
fill (bool) – whether to fill the region (True) or only draw the external contours (False)
thickness (int) – in case fill=False the thickness of the line
autoscale (bool) – whether to scale the coordinates to the size of img_canvas. If True, it will use the dimensions provided in Page.image_width and Page.image_height to compute the scaling ratio
-
draw_lines
(img_canvas, color=(255, 0, 0), thickness=2, fill=True, autoscale=True)¶ Given an image, draws the polygons containing text lines, i.e TextLines.coords
- Parameters
img_canvas (ndarray) – 3 channel image in which the region will be drawn. The image is modified in place.
color (Tuple[int, int, int]) – (R, G, B) value color
thickness (int) – the thickness of the line
fill (bool) – if True fills the polygon
autoscale (bool) – whether to scale the coordinates to the size of img_canvas. If True, it will use the dimensions provided in Page.image_width and Page.image_height to compute the scaling ratio
-
draw_page_border
(img_canvas, color=(255, 0, 0), fill=True, thickness=5, autoscale=True)¶ Given an image, draws the page border, either fills it (fill=True) or draws the contours (fill=False)
- Parameters
img_canvas – 3 channel image in which the region will be drawn. The image is modified in place.
color (Tuple[int, int, int]) – (R, G, B) value color
fill (bool) – whether to fill the region (True) or only draw the external contours (False)
thickness (int) – in case fill=True the thickness of the line
autoscale (bool) – whether to scale the coordinates to the size of img_canvas. If True, it will use the dimensions provided in Page.image_width and Page.image_height to compute the scaling ratio
-
draw_separator_lines
(img_canvas, color=(0, 255, 0), thickness=3, filter_by_id='', autoscale=True)¶ Given an image, draws the SeparatorRegion.
- Parameters
img_canvas (ndarray) – 3 channel image in which the region will be drawn. The image is modified in place.
color (Tuple[int, int, int]) – (R, G, B) value color
thickness (int) – thickness of the line
filter_by_id (str) – string to filter the lines by id. For example vertical/horizontal lines can be filtered if ‘vertical’ or ‘horizontal’ is mentioned in the id.
autoscale (bool) – whether to scale the coordinates to the size of img_canvas. If True, it will use the dimensions provided in Page.image_width and Page.image_height to compute the scaling ratio
-
draw_text
(img_canvas, color=(255, 0, 0), thickness=5, font=cv2.FONT_HERSHEY_SIMPLEX, font_scale=1.0, autoscale=True)¶ Writes the text of the TextLine on the given image.
- Parameters
img_canvas (ndarray) – 3 channel image in which the region will be drawn. The image is modified in place.
color (Tuple[int, int, int]) – (R, G, B) value color
thickness (int) – the thickness of the characters
font – the type of font (cv2 constant)
font_scale (float) – the scale of font
autoscale (bool) – whether to scale the coordinates to the size of img_canvas. If True, it will use the dimensions provided in Page.image_width and Page.image_height to compute the scaling ratio
-
draw_text_regions
(img_canvas, color=(255, 0, 0), fill=True, thickness=3, autoscale=True)¶ Given an image, draws the TextRegions, either fills it (fill=True) or draws the contours (fill=False)
- Parameters
img_canvas (ndarray) – 3 channel image in which the region will be drawn. The image is modified in place.
color (Tuple[int, int, int]) – (R, G, B) value color
fill (bool) – whether to fill the region (True) or only draw the external contours (False)
thickness (int) – in case fill=True the thickness of the line
autoscale (bool) – whether to scale the coordinates to the size of img_canvas. If True, it will use the dimensions provided in Page.image_width and Page.image_height to compute the scaling ratio
-
tag
= 'Page'¶
-
to_json
()¶ - Return type
dict
-
to_xml
()¶ - Return type
Element
-
write_to_file
(filename, creator_name='dhSegment', comments='')¶ Exports the Page object to JSON or PAGE-XML format. The format is inferred from the extension of the filename; if there is no extension, the file is exported as XML.
- Parameters
filename (str) – filename of the file to be exported
creator_name (str) – name of the creator (process or person) creating the file
comments (str) – optional comment to add to the metadata of the file.
- Return type
None
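The extension rule described for write_to_file can be expressed in a few lines. This is a sketch of the rule only, with a hypothetical helper name guess_export_format. What the library does with an unrecognised extension is not stated here, so raising an error in that case is an assumption.

```python
import os


def guess_export_format(filename):
    """Return 'json' or 'xml' from the filename extension (hypothetical helper)."""
    extension = os.path.splitext(filename)[1].lower()
    if extension == ".json":
        return "json"
    # No extension falls back to XML, as described for write_to_file.
    if extension in ("", ".xml"):
        return "xml"
    # Assumption: any other extension is rejected.
    raise ValueError(f"Cannot infer export format from {filename!r}")
```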
-
class
dh_segment.io.PAGE.
Point
(y, x)¶ Point (x,y) class.
- Variables
y – vertical coordinate
x – horizontal coordinate
-
classmethod
array_to_list
(array)¶ Converts an np.array to a list of coordinates
- Parameters
array (ndarray) – an array of coordinates. Must be of shape (N, 2)
- Return type
list
- Returns
list of coordinates, shape (N,2)
-
classmethod
array_to_point
(array)¶ Converts an np.array to a list of Point
- Parameters
array (ndarray) – an array of coordinates. Must be of shape (N, 2)
- Return type
list
- Returns
list of Point
-
classmethod
cv2_to_point_list
(cv2_array)¶ Converts an opencv-formatted set of coordinates to a list of Point
- Parameters
cv2_array (ndarray) – opencv-formatted set of coordinates, shape (N,1,2)
- Return type
List[Point]
- Returns
list of Point
-
classmethod
list_from_xml
(etree_elem)¶ Converts a PAGEXML-formatted set of coordinates to a list of Point
- Parameters
etree_elem (Element) – etree XML element containing a set of coordinates
- Return type
List[Point]
- Returns
a list of coordinates as Point
-
classmethod
list_point_to_string
(list_points)¶ Converts a list of Point to a string ‘x,y’
- Parameters
list_points (List[Point]) – list of coordinates with Point format
- Return type
str
- Returns
a string with the coordinates
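To make the 'x,y' string format concrete, here is a small stand-alone sketch. The Point class below is a stand-in that only mirrors the documented (y, x) constructor order, and joining the pairs with a space follows the usual PAGE XML points attribute. Both are assumptions about the actual implementation.

```python
from typing import List, NamedTuple


class Point(NamedTuple):
    """Stand-in for dh_segment.io.PAGE.Point, constructed as Point(y, x)."""
    y: int
    x: int


def points_to_string(points: List[Point]) -> str:
    """Format points as 'x1,y1 x2,y2 ...' (x first, then y)."""
    # Assumption: pairs are separated by a single space.
    return " ".join(f"{p.x},{p.y}" for p in points)


baseline = [Point(y=10, x=1), Point(y=20, x=2)]
print(points_to_string(baseline))
```

Note the swap between construction order (y, x) and output order (x, y), which is an easy source of transposed coordinates.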
-
classmethod
list_to_cv2poly
(list_points)¶ Converts a list of Point to opencv format set of coordinates
- Parameters
list_points (List[Point]) – set of coordinates
- Return type
ndarray
- Returns
opencv-formatted set of points, shape (N,1,2)
-
classmethod
list_to_point
(list_coords)¶ Converts a list of coordinates to a list of Point
- Parameters
list_coords (list) – list of coordinates, shape (N, 2)
- Return type
List[Point]
- Returns
list of Point
-
classmethod
point_to_list
(points)¶ Converts a list of Point to a list of coordinates
- Parameters
points (List[Point]) – list of Points
- Return type
list
- Returns
list of shape (N,2)
-
to_dict
()¶
-
class
dh_segment.io.PAGE.
Region
(id=None, coords=None, custom_attribute=None)¶ Region base class. (Abstract) This is the superclass for all the extracted regions
- Variables
id – identifier of the Region
coords – coordinates of the Region
custom_attribute – Any custom attribute that may be linked with the region (usually this is added in PAGEXML files, not in JSON files)
-
classmethod
from_dict
(dictionary)¶ From a serialized dictionary, creates a dictionary of the attributes (non-serialized)
- Parameters
dictionary (dict) – serialized dictionary
- Return type
dict
- Returns
non serialized dictionary
-
classmethod
from_xml
(etree_element)¶ Creates a dictionary from a XML structure in order to create the inherited objects
- Parameters
etree_element (Element) – an XML etree
- Return type
dict
- Returns
a dictionary with keys ‘id’ and ‘coords’
-
tag
= 'Region'¶
-
to_dict
(non_serializable_keys=[])¶ Converts a Region object to a dictionary.
- Parameters
non_serializable_keys (List[str]) – list of keys that can’t be directly serialized and that need some internal serialization
- Return type
dict
- Returns
a dictionary with the serialized attributes of the object
-
to_xml
(name_element=None)¶ Converts a Region object to a xml structure
- Parameters
name_element (Optional[str]) – name of the object (optional)
- Return type
Element
- Returns
a etree structure
-
class
dh_segment.io.PAGE.
SeparatorRegion
(id, coords=None, custom_attribute=None)¶ Lines separating columns or paragraphs. Separators are lines that lie between columns and paragraphs and can be used to logically separate different articles from each other.
- Variables
id – identifier of the SeparatorRegion
coords – coordinates of the SeparatorRegion
-
classmethod
from_dict
(dictionary)¶ From a serialized dictionary, creates a dictionary of the attributes (non-serialized)
- Parameters
dictionary (dict) – serialized dictionary
- Return type
- Returns
non serialized dictionary
-
classmethod
from_xml
(e)¶ Creates a dictionary from a XML structure in order to create the inherited objects
- Parameters
etree_element – a xml etree
- Return type
- Returns
a dictionary with keys ‘id’ and ‘coords’
-
tag
= 'SeparatorRegion'¶
-
to_xml
(name_element='SeparatorRegion')¶ Converts a Region object to a xml structure
- Parameters
name_element – name of the object (optional)
- Return type
Element
- Returns
a etree structure
-
class
dh_segment.io.PAGE.
TableRegion
(id=None, coords=None, rows=None, columns=None, embedded_text=None, custom_attribute=None)¶ Tabular data in any form. Tabular data is represented with a table region. Rows and columns may or may not have separator lines; these lines are not separator regions.
- Variables
id – identifier of the TableRegion
coords – coordinates of the TableRegion
rows – number of rows in the table
columns – number of columns in the table
embedded_text – if text is embedded in the table
-
classmethod
from_dict
(dictionary)¶ From a serialized dictionary, creates a dictionary of the attributes (non-serialized)
- Parameters
dictionary (dict) – serialized dictionary
- Return type
- Returns
non serialized dictionary
-
classmethod
from_xml
(e)¶ Creates a dictionary from a XML structure in order to create the inherited objects
- Parameters
etree_element – a xml etree
- Return type
- Returns
a dictionary with keys ‘id’ and ‘coords’
-
tag
= 'TableRegion'¶
-
to_xml
(name_element='TableRegion')¶ Converts a Region object to a xml structure
- Parameters
name_element – name of the object (optional)
- Return type
Element
- Returns
a etree structure
-
class
dh_segment.io.PAGE.
Text
(text_equiv=None, alternatives=None, score=None)¶ Text entity produced by a transcription system.
- Variables
text_equiv – the transcription of the text
alternatives – alternative transcriptions
score – the confidence of the transcription output by the transcription system
-
to_dict
()¶ - Return type
dict
-
class
dh_segment.io.PAGE.
TextLine
(id=None, coords=None, baseline=None, text=None, line_group_id=None, column_group_id=None, custom_attribute=None)¶ Region corresponding to a text line.
- Variables
id – identifier of the TextLine
coords – coordinates of the TextLine
baseline – coordinates of the TextLine baseline
text – Text class containing the transcription of the TextLine
line_group_id – identifier of the line group the instance belongs to
column_group_id – identifier of the column group the instance belongs to
custom_attribute – Any custom attribute that may be linked with the region (usually this is added in PAGEXML files, not in JSON files)
-
classmethod
from_array
(cv2_coords=None, baseline_coords=None, text_equiv=None, id=None)¶
-
classmethod
from_dict
(dictionary)¶ From a serialized dictionary, creates a dictionary of the attributes (non-serialized)
- Parameters
dictionary (dict) – serialized dictionary
- Return type
- Returns
non serialized dictionary
-
classmethod
from_xml
(etree_element)¶ Creates a dictionary from a XML structure in order to create the inherited objects
- Parameters
etree_element (Element) – an XML etree
- Return type
- Returns
a dictionary with keys ‘id’ and ‘coords’
-
scale_baseline_points
(ratio)¶ Scales the points of the baseline by a factor ratio.
- Parameters
ratio (float) – factor to rescale the baseline coordinates
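Rescaling a baseline is a per-point multiplication. The sketch below shows it on plain (x, y) tuples. The helper name scale_baseline and the truncation to integers are assumptions, since the rounding used by scale_baseline_points is not documented here.

```python
def scale_baseline(baseline, ratio):
    """Multiply every (x, y) coordinate of a baseline by `ratio`."""
    # Assumption: coordinates are truncated to integers after scaling.
    return [(int(x * ratio), int(y * ratio)) for x, y in baseline]
```

A typical use would be to bring baselines predicted on a downscaled image back to the original resolution.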
-
tag
= 'TextLine'¶
-
to_dict
(non_serializable_keys=[])¶ Converts a Region object to a dictionary.
- Parameters
non_serializable_keys (List[str]) – list of keys that can’t be directly serialized and that need some internal serialization
- Returns
a dictionary with the serialized attributes of the object
-
to_xml
(name_element='TextLine')¶ Converts a Region object to a xml structure
- Parameters
name_element – name of the object (optional)
- Return type
Element
- Returns
a etree structure
-
class
dh_segment.io.PAGE.
TextRegion
(id=None, coords=None, text_lines=None, text_equiv='', region_type=None, custom_attribute=None)¶ Region containing text lines. It can represent a paragraph or a page for instance.
- Variables
id – identifier of the TextRegion
coords – coordinates of the TextRegion
text_equiv – the resulting text of the Text contained in the TextLines
text_lines – a list of TextLine objects
region_type – the type of a TextRegion (can be any string). Example : header, paragraph, page-number…
custom_attribute – Any custom attribute that may be linked with the region (usually this is added in PAGEXML files, not in JSON files)
-
classmethod
from_dict
(dictionary)¶ From a serialized dictionary, creates a dictionary of the attributes (non-serialized)
- Parameters
dictionary (dict) – serialized dictionary
- Return type
- Returns
non serialized dictionary
-
classmethod
from_xml
(e)¶ Creates a dictionary from a XML structure in order to create the inherited objects
- Parameters
etree_element – a xml etree
- Return type
- Returns
a dictionary with keys ‘id’ and ‘coords’
-
sort_text_lines
(top_to_bottom=True)¶ Sorts TextLine from top to bottom according to their mean y coordinate (centroid)
- Parameters
top_to_bottom (bool) – order lines from top to bottom of image, default=True
- Return type
None
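Ordering lines by the mean y coordinate of their polygon can be sketched with plain dictionaries. Unlike sort_text_lines, which returns None and presumably reorders the region's own list, this hypothetical sort_lines helper returns a new list. The dictionary layout is also an assumption made for the example.

```python
def mean_y(coords):
    """Mean vertical coordinate of a polygon given as (x, y) tuples."""
    return sum(y for _, y in coords) / len(coords)


def sort_lines(lines, top_to_bottom=True):
    """Return the lines ordered by the mean y of their coordinates."""
    return sorted(lines,
                  key=lambda line: mean_y(line["coords"]),
                  reverse=not top_to_bottom)


lines = [
    {"id": "l2", "coords": [(0, 50), (100, 50), (100, 70), (0, 70)]},
    {"id": "l1", "coords": [(0, 10), (100, 10), (100, 30), (0, 30)]},
]
ordered_ids = [line["id"] for line in sort_lines(lines)]
```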
-
tag
= 'TextRegion'¶
-
to_dict
(non_serializable_keys=[])¶ Converts a Region object to a dictionary.
- Parameters
non_serializable_keys (List[str]) – list of keys that can’t be directly serialized and that need some internal serialization
- Returns
a dictionary with the serialized attributes of the object
-
to_xml
(name_element='TextRegion')¶ Converts a Region object to a xml structure
- Parameters
name_element – name of the object (optional)
- Return type
Element
- Returns
a etree structure
Get a list of all the values of labels/tags
- Parameters
xml_filename (str) – filename of the xml file
tag_pattern (str) – regular expression pattern to look for in TextRegion.custom_attribute
- Returns
-
dh_segment.io.PAGE.
json_serialize
(dict_to_serialize, non_serializable_keys=[])¶ Serialize a dictionary in order to export it.
- Parameters
dict_to_serialize (dict) – dictionary to serialize
non_serializable_keys (List[str]) – keys that are not directly serializable, such as Python objects
- Return type
dict
- Returns
the serialized dictionary
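The idea of a list of non-serializable keys can be illustrated as follows: values stored under those keys are objects that must first be turned into dictionaries before json can handle them. This is a sketch under assumptions, not the library's code: the Point stand-in, the shape of its to_dict output and the serialize helper are all hypothetical.

```python
import json


class Point:
    """Stand-in object that json cannot serialize directly."""

    def __init__(self, y, x):
        self.y = y
        self.x = x

    def to_dict(self):
        # Assumption: the serialized form is a plain {'x': ..., 'y': ...} dict.
        return {"x": self.x, "y": self.y}


def serialize(dict_to_serialize, non_serializable_keys=()):
    """Convert objects stored under the given keys with their to_dict method."""
    serialized = {}
    for key, value in dict_to_serialize.items():
        if key in non_serializable_keys and value is not None:
            if isinstance(value, list):
                value = [item.to_dict() for item in value]
            else:
                value = value.to_dict()
        serialized[key] = value
    return serialized


region = {"id": "r1", "coords": [Point(1, 2), Point(3, 4)]}
exported = json.dumps(serialize(region, ["coords"]))
```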
-
dh_segment.io.PAGE.
parse_file
(filename)¶ Parses the file to create the corresponding Page object. The file can be a .xml or a .json.
- Parameters
filename (str) – file to parse (either JSON or PAGE XML)
- Return type
- Returns
Page object containing all the parsed elements
-
dh_segment.io.PAGE.
save_baselines
(filename, baselines, ratio=(1, 1), predictions_shape=None)¶ - Parameters
filename (str) – filename to save baselines to
baselines – list of baselines
ratio (Tuple[int, int]) – ratio of prediction shape over original shape
predictions_shape (Optional[Tuple[int, int]]) – shape of the masks output by the network
- Return type
- Returns
-
class
dh_segment.io.via.
VIAttribute
¶ A container for VIA attributes.
- Parameters
name (str) – The name of attribute
type (str) – The type of the annotation (dropdown, markbox, …)
options (list) – The options / labels possible for this attribute.
-
property
name
¶ Alias for field number 0
-
property
options
¶ Alias for field number 2
-
property
type
¶ Alias for field number 1
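The "Alias for field number N" descriptions are the docstrings Python generates for namedtuple fields, so each attribute can be read either by name or by position. The stand-in below reproduces the documented field order (name, type, options). It is defined locally for illustration rather than imported from dh_segment.io.via, and the attribute values are invented.

```python
from collections import namedtuple

# Local stand-in with the documented field order.
VIAttribute = namedtuple("VIAttribute", ["name", "type", "options"])

attribute = VIAttribute(name="text_type", type="dropdown",
                        options=["title", "paragraph"])

# Field access by name and by index is equivalent.
same_name = attribute.name == attribute[0]
same_type = attribute.type == attribute[1]
same_options = attribute.options == attribute[2]
```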
-
class
dh_segment.io.via.
WorkingItem
¶ A container for annotated images.
- Parameters
collection (str) – name of the collection
image_name (str) – name of the image
original_x (int) – original image x size (width)
original_y (int) – original image y size (height)
reduced_x (int) – resized x size
reduced_y (int) – resized y size
iiif (str) – iiif url
annotations (dict) – VIA ‘region_attributes’
-
property
annotations
¶ Alias for field number 7
-
property
collection
¶ Alias for field number 0
-
property
iiif
¶ Alias for field number 6
-
property
image_name
¶ Alias for field number 1
-
property
original_x
¶ Alias for field number 2
-
property
original_y
¶ Alias for field number 3
-
property
reduced_x
¶ Alias for field number 4
-
property
reduced_y
¶ Alias for field number 5
-
dh_segment.io.via.
collect_working_items
(via_annotations, collection_name, images_dir=None, via_version=2)¶ Given VIA annotation input, collect all info on WorkingItem object. This function will take care of separating images from local files and images from IIIF urls.
- Parameters
via_annotations (dict) – via annotations (‘regions’ field)
images_dir (Optional[str]) – directory where to find the images
collection_name (str) – name of the collection
via_version (int) – version of the VIA tool used to produce the annotations (1 or 2)
- Return type
List[WorkingItem]
- Returns
list of WorkingItem
-
dh_segment.io.via.
convert_via_region_page_text_region
(working_item, structure_label)¶ - Parameters
working_item (WorkingItem) –
structure_label (str) –
- Return type
- Returns
-
dh_segment.io.via.
create_masks
(masks_dir, working_items, via_attributes, collection, contours_only=False)¶ For each annotation, create a corresponding binary mask and resize it (h = 2000). Only valid for VIA 2.0. Several annotations of the same class on the same image produce one image with several masks.
- Parameters
masks_dir (str) – where to output the masks
working_items (List[WorkingItem]) – infos to work with
via_attributes (List[VIAttribute]) – VIAttributes computed by the get_via_attributes function.
collection (str) – name of the collection
contours_only (bool) – creates the binary masks only for the contours of the object (thickness of contours : 20 px)
- Return type
dict
- Returns
annotation_summary, a dictionary containing a list of labels per image
-
dh_segment.io.via.
create_via_annotation_single_image
(img_filename, via_regions, file_attributes=None)¶ Returns a dictionary item {key: annotation} in VIA format to further export to .json file
- Parameters
img_filename (str) – path to the image
via_regions (List[dict]) – regions in VIA format (output from create_via_region_from_coordinates)
file_attributes (Optional[dict]) – file attributes (usually None)
- Return type
Dict[str, dict]
- Returns
dictionary item with key and annotations in VIA format
-
dh_segment.io.via.
create_via_region_from_coordinates
(coordinates, region_attributes, type_region)¶ Formats coordinates to a VIA region (dict).
- Parameters
coordinates (np.array) – (N, 2) coordinates (x, y)
region_attributes (dict) – dictionary with keys : name of labels, values : values of labels
type_region (str) – via region annotation type (‘rect’, ‘polygon’)
- Return type
dict
- Returns
a region in VIA style (dict/json)
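For orientation, a VIA region pairs a shape_attributes dictionary with a region_attributes dictionary. The sketch below builds such a dictionary from (x, y) coordinates using the field names of the VIA JSON format. The helper name via_region is hypothetical, and deriving a 'rect' from the bounding box of the points is an assumption about how the real function treats its input.

```python
def via_region(coordinates, region_attributes, type_region):
    """Build a VIA-style region dict from a list of (x, y) coordinates."""
    xs = [int(x) for x, _ in coordinates]
    ys = [int(y) for _, y in coordinates]
    if type_region == "polygon":
        shape = {"name": "polygon", "all_points_x": xs, "all_points_y": ys}
    elif type_region == "rect":
        # Assumption: the rectangle is the bounding box of the points.
        shape = {"name": "rect", "x": min(xs), "y": min(ys),
                 "width": max(xs) - min(xs), "height": max(ys) - min(ys)}
    else:
        raise ValueError(f"Unsupported region type: {type_region!r}")
    return {"shape_attributes": shape, "region_attributes": region_attributes}


triangle = via_region([(0, 0), (10, 0), (10, 5)], {"label": "text"}, "polygon")
box = via_region([(2, 3), (12, 3), (12, 8), (2, 8)], {"label": "table"}, "rect")
```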
-
dh_segment.io.via.
export_annotation_dict
(annotation_dict, filename)¶ Export the annotations to json file.
- Parameters
annotation_dict (dict) – VIA annotations
filename (str) – filename to export the data (json file)
- Return type
None
- Returns
-
dh_segment.io.via.
get_annotations_per_file
(via_dict, name_file)¶ From VIA json content, get annotations relative to the given name_file.
- Parameters
via_dict (dict) – VIA annotations content (originally json)
name_file (str) – the file to look for (it can be an IIIF path or a file path)
- Return type
dict
- Returns
dict
-
dh_segment.io.via.
get_via_attributes
(annotation_dict, via_version=2)¶ Gets the attributes of the annotated data and returns a list of VIAttribute.
- Parameters
annotation_dict (dict) – json content of the VIA exported file
via_version (int) – either 1 or 2 (for VIA v 1.0 or VIA v 2.0)
- Return type
List[VIAttribute]
- Returns
A list containing VIAttributes
-
dh_segment.io.via.
load_annotation_data
(via_data_filename, only_img_annotations=False, via_version=2)¶ Load the content of via annotation files.
- Parameters
via_data_filename (str) – via annotations json file
only_img_annotations (bool) – load only the images annotations (‘_via_img_metadata’ field)
via_version (int) –
- Return type
dict
- Returns
the content of json file containing the region annotated
-
dh_segment.io.via.
parse_via_attributes
(via_attributes)¶ Parses the VIA attribute dictionary and returns a list of VIAttribute instances
- Parameters
via_attributes (dict) – attributes from VIA annotation (‘_via_attributes’ field)
- Return type
List[VIAttribute]
- Returns
list of VIAttribute