Input / Output
The dh_segment.io module implements input / output functions and classes.
Input functions for tf.Estimator
Input function
input_fn | Input_fn for estimator
Data augmentation
data_augmentation_fn | Applies data augmentation to both images and label images.
extract_patches_fn | Will cut a given image into patches.
rotate_crop | Rotates and crops the images.
Resizing function
resize_image | Resizes the image
load_and_resize_image | Loads an image from its filename and resizes it to the desired output size.
Tensorflow serving functions
serving_input_filename |
serving_input_image |
PAGE XML and JSON import / export
PAGE classes
Point | Point (x,y) class.
Text | Text entity produced by a transcription system.
Border | Region containing the page.
TextRegion | Region containing text lines.
TextLine | Region corresponding to a text line.
GraphicRegion | Region containing simple graphics.
TableRegion | Tabular data in any form.
SeparatorRegion | Lines separating columns or paragraphs.
GroupSegment | Set of regions that make a bigger region (group).
Metadata | Metadata information.
Page | Class following PAGE-XML object.
Abstract classes
BaseElement | Base page element class.
Region | Region base class.
Parsing and helpers
parse_file | Parses the files to create the corresponding Page object.
json_serialize | Serialize a dictionary in order to export it.
VGG Image Annotator helpers
VIA objects
WorkingItem | A container for annotated images.
VIAttribute | A container for VIA attributes.
Creating masks with VIA annotations
load_annotation_data | Load the content of via annotation files.
export_annotation_dict | Export the annotations to json file.
get_annotations_per_file | From VIA json content, get annotations relative to the given name_file.
parse_via_attributes | Parses the VIA attribute dictionary and returns a list of VIAttribute instances
get_via_attributes | Gets the attributes of the annotated data and returns a list of VIAttribute.
collect_working_items | Given VIA annotation input, collect all info on WorkingItem object.
create_masks | For each annotation, create a corresponding binary mask and resize it (h = 2000).
Formatting in VIA JSON format
create_via_region_from_coordinates | Formats coordinates to a VIA region (dict).
create_via_annotation_single_image | Returns a dictionary item {key: annotation} in VIA format to further export to .json file
-
dh_segment.io.input_fn(input_data, params, input_label_dir=None, data_augmentation=False, batch_size=5, make_patches=False, num_epochs=1, num_threads=4, image_summaries=False)
Input_fn for estimator.
- Parameters
input_data (Union[str,List[str]]) – input data. It can be a directory containing the images, a list of image filenames, or a path to a csv file.
params (dict) – params from utils.Params object
input_label_dir (Optional[str]) – directory containing the label images
data_augmentation (bool) – if True, will scale, rotate, … the images
batch_size (int) – size of the batch
make_patches (bool) – whether to make patches (crop the image into smaller pieces) or not
num_epochs (int) – number of epochs to cycle through the data (set it to None for infinite repeat)
num_threads (int) – number of threads to use in parallel when using tf.data.Dataset.map
image_summaries (bool) – whether to make tf.Summary to watch on tensorboard
- Returns
fn
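The three accepted forms of input_data can be told apart as in the sketch below. This is a standard-library illustration, not the dh_segment implementation: the helper name resolve_input_files, the set of image extensions and the assumption that the csv file holds the image path in its first column are all hypothetical.

```python
import csv
import os
from typing import List, Union

# Hypothetical set of image extensions; the library may accept others.
IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.png')


def resolve_input_files(input_data: Union[str, List[str]]) -> List[str]:
    """Return image filenames from a list, a directory or a csv path."""
    if isinstance(input_data, list):
        return list(input_data)
    if os.path.isdir(input_data):
        return sorted(
            os.path.join(input_data, name)
            for name in os.listdir(input_data)
            if name.lower().endswith(IMAGE_EXTENSIONS)
        )
    if input_data.lower().endswith('.csv'):
        # Assumption: the image path is the first column of each row.
        with open(input_data, newline='') as csv_file:
            return [row[0] for row in csv.reader(csv_file) if row]
    raise ValueError('input_data must be a directory, a list or a csv file')
```

Anything that is neither a list, an existing directory nor a .csv path is rejected here; how the real function reacts to such input is not stated in this reference.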
-
dh_segment.io.serving_input_filename(resized_size)
-
dh_segment.io.serving_input_image()
-
dh_segment.io.data_augmentation_fn(input_image, label_image, flip_lr=True, flip_ud=True, color=True)
Applies data augmentation to both images and label images. Includes left-right flip, up-down flip and color change.
- Parameters
input_image (tensorflow.Tensor) – images to be augmented [B, H, W, C]
label_image (tensorflow.Tensor) – corresponding label images [B, H, W, C]
flip_lr (bool) – option to flip image in left-right direction
flip_ud (bool) – option to flip image in up-down direction
color (bool) – option to change color of images
- Return type
(tensorflow.Tensor, tensorflow.Tensor)
- Returns
the tuple (augmented images, augmented label images) [B, H, W, C]
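For segmentation, a geometric augmentation only makes sense if the image and its label image receive the same transformation. The sketch below illustrates that idea on plain nested lists instead of tensors. It assumes that one random draw decides each flip for both inputs; the function name flip_pair and the 0.5 probability are illustrative choices, not taken from the library.

```python
import random
from typing import List, Optional, Tuple

Grid = List[List[int]]


def flip_pair(image: Grid, label: Grid, flip_lr: bool = True, flip_ud: bool = True,
              rng: Optional[random.Random] = None) -> Tuple[Grid, Grid]:
    """Apply the same random flips to an image and its label image."""
    rng = rng or random.Random()
    # One draw per flip type, shared by image and label (assumption).
    if flip_lr and rng.random() < 0.5:
        image = [row[::-1] for row in image]
        label = [row[::-1] for row in label]
    if flip_ud and rng.random() < 0.5:
        image = image[::-1]
        label = label[::-1]
    return image, label
```

Color changes, by contrast, would apply to the input image only, since altering the label image would change the classes.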
-
dh_segment.io.rotate_crop(image, rotation, crop=True, minimum_shape=[0, 0], interpolation='NEAREST')
Rotates and crops the images.
- Parameters
image (tensorflow.Tensor) – image to be rotated and cropped [H, W, C]
rotation (float) – angle of rotation (in radians)
crop (bool) – option to crop rotated image to avoid black borders due to rotation
minimum_shape (Tuple[int,int]) – minimum shape of the rotated image / cropped image
interpolation (str) – which interpolation to use, NEAREST or BILINEAR
- Return type
tensorflow.Tensor
- Returns
-
dh_segment.io.resize_image(image, size, interpolation='BILINEAR')
Resizes the image.
- Parameters
image (tensorflow.Tensor) – image to be resized [H, W, C]
size (int) – size of the resized image (in pixels)
interpolation (str) – which interpolation to use, NEAREST or BILINEAR
- Return type
tensorflow.Tensor
- Returns
resized image
-
dh_segment.io.load_and_resize_image(filename, channels, size=None, interpolation='BILINEAR')
Loads an image from its filename and resizes it to the desired output size.
- Parameters
filename (str) – string tensor
channels (int) – number of channels for the decoded image
size (Optional[int]) – number of desired pixels in the resized image, tf.Tensor or int (None for no resizing)
interpolation (str) –
return_original_shape – returns the original shape of the image before resizing if this flag is True
- Return type
tensorflow.Tensor
- Returns
decoded and resized float32 tensor [h, w, channels],
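Because size is a number of pixels rather than a (height, width) pair, the output shape has to be derived from it. A minimal sketch, assuming the aspect ratio is preserved and size is the target total pixel count; the helper name target_shape is hypothetical:

```python
import math
from typing import Tuple


def target_shape(height: int, width: int, size: int) -> Tuple[int, int]:
    """Shape with about `size` pixels and the aspect ratio of (height, width)."""
    # Scaling both sides by sqrt(size / area) scales the area to `size`.
    ratio = math.sqrt(size / (height * width))
    return round(height * ratio), round(width * ratio)
```

For a 2000 x 1000 image and size=500000, the ratio is 0.5 and the resulting shape is (1000, 500).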
-
dh_segment.io.extract_patches_fn(image, patch_shape, offsets)
Will cut a given image into patches.
- Parameters
image (tensorflow.Tensor) – tf.Tensor
patch_shape (Tuple[int,int]) – shape of the extracted patches [h, w]
offsets (Tuple[int,int]) – offset to add to the origin of first patch top-right coordinate, useful during data augmentation to have slightly different patches each time. This value will be multiplied by [h/2, w/2] (range values [0,1])
- Return type
tensorflow.Tensor
- Returns
patches [batch_patches, h, w, c]
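The role of offsets can be illustrated by computing where each patch would start. This is a sketch under stated assumptions, not the library's patch extraction: it assumes non-overlapping patches (stride equal to the patch size) and measures origins from the top-left corner of the image; the function name patch_origins is hypothetical.

```python
from typing import List, Tuple


def patch_origins(image_shape: Tuple[int, int], patch_shape: Tuple[int, int],
                  offsets: Tuple[float, float]) -> List[Tuple[int, int]]:
    """Origins (y, x) of full patches, shifted by a fractional offset."""
    img_h, img_w = image_shape
    h, w = patch_shape
    # Offsets in [0, 1] are scaled by half the patch size, as documented.
    off_y = int(offsets[0] * h / 2)
    off_x = int(offsets[1] * w / 2)
    return [(y, x)
            for y in range(off_y, img_h - h + 1, h)
            for x in range(off_x, img_w - w + 1, w)]
```

Drawing a new random offset for each epoch shifts the patch grid, so the network sees slightly different crops of the same image.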
-
dh_segment.io.local_entropy(tf_binary_img, sigma=3)
- Parameters
tf_binary_img (tensorflow.Tensor) –
sigma (float) –
- Return type
tensorflow.Tensor
- Returns
-
class dh_segment.io.PAGE.BaseElement
Base page element class. (Abstract)
-
classmethod check_tag(tag)
-
classmethod full_tag()
- Return type
str
-
tag = None
-
classmethod
-
class dh_segment.io.PAGE.Border(coords=None, id=None)
Region containing the page. It is the border of the actual page of the document (if the scanned image contains parts not belonging to the page).
- Variables
coords – coordinates of the Border region
-
tag = 'Border'
-
to_dict(non_serializable_keys=[])
- Return type
dict
-
to_xml()
- Return type
Element
-
class dh_segment.io.PAGE.GraphicRegion(id=None, coords=None, custom_attribute=None)
Region containing simple graphics. Company logos, for example, should be marked as graphic regions.
- Variables
id – identifier of the GraphicRegion
coords – coordinates of the GraphicRegion
-
classmethod from_dict(dictionary)
From a serialized dictionary, creates a dictionary of the attributes (non-serialized).
- Parameters
dictionary (dict) – serialized dictionary
- Return type
- Returns
non-serialized dictionary
-
classmethod from_xml(e)
Creates a dictionary from an XML structure in order to create the inherited objects.
- Parameters
etree_element – an XML etree
- Return type
- Returns
a dictionary with keys ‘id’ and ‘coords’
-
tag = 'GraphicRegion'
-
to_xml(name_element='GraphicRegion')
Converts a Region object to an XML structure.
- Parameters
name_element – name of the object (optional)
- Return type
Element
- Returns
an etree structure
-
class dh_segment.io.PAGE.GroupSegment(id=None, coords=None, segment_ids=None, custom_attribute=None)
Set of regions that make a bigger region (group). A GroupSegment is a region that contains several TextLine elements forming a bigger region. It is used mainly to make line / column regions. Only for JSON export (no PAGE XML correspondence).
- Variables
id – identifier of the GroupSegment
coords – coordinates of the GroupSegment
segment_ids – list of the ids of the regions belonging to the group
-
classmethod from_dict(dictionary)
From a serialized dictionary, creates a dictionary of the attributes (non-serialized).
- Parameters
dictionary (dict) – serialized dictionary
- Return type
- Returns
non-serialized dictionary
-
class dh_segment.io.PAGE.Metadata(creator=None, created=None, last_change=None, comments=None)
Metadata information.
- Variables
creator – name of the process or person that created the exported file
created – time of creation of the file
last_change – time of last modification of the file
comments – comments on the process
-
tag = 'Metadata'
-
to_dict()
-
to_xml()
- Return type
Element
-
class dh_segment.io.PAGE.Page(**kwargs)
Class following PAGE-XML object. This class is used to represent the information of the processed image. This information can be exported in PAGE-XML or JSON format.
- Variables
image_filename – filename of the image
image_width – width of the original image
image_height – height of the original image
text_regions – list of TextRegion
graphic_regions – list of GraphicRegion
page_border – Border of the page
separator_regions – list of SeparatorRegion
table_regions – list of TableRegion
metadata – Metadata of the image and process
line_groups – list of GroupSegment forming lines
column_groups – list of GroupSegment forming columns
-
draw_baselines(img_canvas, color=(255, 0, 0), thickness=2, endpoint_radius=4, autoscale=True)
Given an image, draws the TextLines.baselines.
- Parameters
img_canvas (ndarray) – 3 channel image in which the region will be drawn. The image is modified in place.
color (Tuple[int,int,int]) – (R, G, B) value color
thickness (int) – the thickness of the line
endpoint_radius (int) – the radius of the endpoints of lines (first and last coordinates of line)
autoscale (bool) – whether to scale the coordinates to the size of img_canvas. If True, it will use the dimensions provided in Page.image_width and Page.image_height to compute the scaling ratio
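The autoscale option shared by the draw_* methods amounts to rescaling page coordinates to the canvas size. A minimal sketch of that computation, assuming independent ratios for each axis and (x, y) point tuples; the function name autoscale_points is hypothetical and the library's rounding may differ:

```python
from typing import List, Tuple


def autoscale_points(points: List[Tuple[int, int]],
                     page_size: Tuple[int, int],
                     canvas_size: Tuple[int, int]) -> List[Tuple[int, int]]:
    """Rescale (x, y) points from page coordinates to canvas coordinates."""
    page_w, page_h = page_size          # Page.image_width, Page.image_height
    canvas_w, canvas_h = canvas_size    # width and height of img_canvas
    ratio_x = canvas_w / page_w
    ratio_y = canvas_h / page_h
    return [(round(x * ratio_x), round(y * ratio_y)) for x, y in points]
```

With a 1000 x 2000 page drawn on a 500 x 1000 canvas, the point (100, 200) maps to (50, 100).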
-
draw_column_groups(img_canvas, color=(0, 255, 0), fill=False, thickness=5, autoscale=True)
Draws the column groups (in case of a table). This is only valid when parsing JSON files.
- Parameters
img_canvas (ndarray) – 3 channel image in which the region will be drawn. The image is modified in place.
color (Tuple[int,int,int]) – (R, G, B) value color
fill (bool) – whether to fill the region (True) or only draw the external contours (False)
thickness (int) – in case fill=False the thickness of the line
autoscale (bool) – whether to scale the coordinates to the size of img_canvas. If True, it will use the dimensions provided in Page.image_width and Page.image_height to compute the scaling ratio
-
draw_graphic_regions(img_canvas, color=(255, 0, 0), fill=True, thickness=3, autoscale=True)
Given an image, draws the GraphicRegions, either fills them (fill=True) or draws the contours (fill=False).
- Parameters
img_canvas (ndarray) – 3 channel image in which the region will be drawn. The image is modified in place.
color (Tuple[int,int,int]) – (R, G, B) value color
fill (bool) – whether to fill the region (True) or only draw the external contours (False)
thickness (int) – in case fill=False the thickness of the line
autoscale (bool) – whether to scale the coordinates to the size of img_canvas. If True, it will use the dimensions provided in Page.image_width and Page.image_height to compute the scaling ratio
-
draw_line_groups(img_canvas, color=(0, 255, 0), fill=False, thickness=5, autoscale=True)
Draws the line groups. This is only valid when parsing JSON files.
- Parameters
img_canvas (ndarray) – 3 channel image in which the region will be drawn. The image is modified in place.
color (Tuple[int,int,int]) – (R, G, B) value color
fill (bool) – whether to fill the region (True) or only draw the external contours (False)
thickness (int) – in case fill=False the thickness of the line
autoscale (bool) – whether to scale the coordinates to the size of img_canvas. If True, it will use the dimensions provided in Page.image_width and Page.image_height to compute the scaling ratio
-
draw_lines(img_canvas, color=(255, 0, 0), thickness=2, fill=True, autoscale=True)
Given an image, draws the polygons containing text lines, i.e. TextLines.coords.
- Parameters
img_canvas (ndarray) – 3 channel image in which the region will be drawn. The image is modified in place.
color (Tuple[int,int,int]) – (R, G, B) value color
thickness (int) – the thickness of the line
fill (bool) – if True fills the polygon
autoscale (bool) – whether to scale the coordinates to the size of img_canvas. If True, it will use the dimensions provided in Page.image_width and Page.image_height to compute the scaling ratio
-
draw_page_border(img_canvas, color=(255, 0, 0), fill=True, thickness=5, autoscale=True)
Given an image, draws the page border, either fills it (fill=True) or draws the contours (fill=False).
- Parameters
img_canvas – 3 channel image in which the region will be drawn. The image is modified in place.
color (Tuple[int,int,int]) – (R, G, B) value color
fill (bool) – whether to fill the region (True) or only draw the external contours (False)
thickness (int) – in case fill=False the thickness of the line
autoscale (bool) – whether to scale the coordinates to the size of img_canvas. If True, it will use the dimensions provided in Page.image_width and Page.image_height to compute the scaling ratio
-
draw_separator_lines(img_canvas, color=(0, 255, 0), thickness=3, filter_by_id='', autoscale=True)
Given an image, draws the SeparatorRegion.
- Parameters
img_canvas (ndarray) – 3 channel image in which the region will be drawn. The image is modified in place.
color (Tuple[int,int,int]) – (R, G, B) value color
thickness (int) – thickness of the line
filter_by_id (str) – string to filter the lines by id. For example vertical/horizontal lines can be filtered if ‘vertical’ or ‘horizontal’ is mentioned in the id.
autoscale (bool) – whether to scale the coordinates to the size of img_canvas. If True, it will use the dimensions provided in Page.image_width and Page.image_height to compute the scaling ratio
-
draw_text(img_canvas, color=(255, 0, 0), thickness=5, font=cv2.FONT_HERSHEY_SIMPLEX, font_scale=1.0, autoscale=True)
Writes the text of the TextLine on the given image.
- Parameters
img_canvas (ndarray) – 3 channel image in which the region will be drawn. The image is modified in place.
color (Tuple[int,int,int]) – (R, G, B) value color
thickness (int) – the thickness of the characters
font – the type of font (cv2 constant)
font_scale (float) – the scale of font
autoscale (bool) – whether to scale the coordinates to the size of img_canvas. If True, it will use the dimensions provided in Page.image_width and Page.image_height to compute the scaling ratio
-
draw_text_regions(img_canvas, color=(255, 0, 0), fill=True, thickness=3, autoscale=True)
Given an image, draws the TextRegions, either fills them (fill=True) or draws the contours (fill=False).
- Parameters
img_canvas (ndarray) – 3 channel image in which the region will be drawn. The image is modified in place.
color (Tuple[int,int,int]) – (R, G, B) value color
fill (bool) – whether to fill the region (True) or only draw the external contours (False)
thickness (int) – in case fill=False the thickness of the line
autoscale (bool) – whether to scale the coordinates to the size of img_canvas. If True, it will use the dimensions provided in Page.image_width and Page.image_height to compute the scaling ratio
-
tag = 'Page'
-
to_json()
- Return type
dict
-
to_xml()
- Return type
Element
-
write_to_file(filename, creator_name='dhSegment', comments='')
Exports the Page object to JSON or PAGE-XML format. The format is inferred from the extension of the filename; if there is no extension, the file is exported as XML.
- Parameters
filename (str) – filename of the file to be exported
creator_name (str) – name of the creator (process or person) creating the file
comments (str) – optional comment to add to the metadata of the file
- Return type
None
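The extension rule described for write_to_file can be sketched as follows. The helper name guess_export_format is hypothetical, and raising an error for any other extension is an assumption of this sketch; the reference does not say what the method does in that case.

```python
import os


def guess_export_format(filename: str) -> str:
    """Return 'json' or 'xml' from the extension; 'xml' when there is none."""
    extension = os.path.splitext(filename)[1].lower()
    if extension == '.json':
        return 'json'
    if extension in ('.xml', ''):
        return 'xml'
    # Assumption: other extensions are rejected rather than silently exported.
    raise ValueError('Unsupported extension: {}'.format(extension))
```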
-
class dh_segment.io.PAGE.Point(y, x)
Point (x,y) class.
- Variables
y – vertical coordinate
x – horizontal coordinate
-
classmethod array_to_list(array)
Converts an np.array to a list of coordinates.
- Parameters
array (ndarray) – an array of coordinates. Must be of shape (N, 2)
- Return type
list
- Returns
list of coordinates, shape (N,2)
-
classmethod array_to_point(array)
Converts an np.array to a list of Point.
- Parameters
array (ndarray) – an array of coordinates. Must be of shape (N, 2)
- Return type
list
- Returns
list of Point
-
classmethod cv2_to_point_list(cv2_array)
Converts an opencv-formatted set of coordinates to a list of Point.
- Parameters
cv2_array (ndarray) – opencv-formatted set of coordinates, shape (N,1,2)
- Return type
List[Point]
- Returns
list of Point
-
classmethod list_from_xml(etree_elem)
Converts a PAGEXML-formatted set of coordinates to a list of Point.
- Parameters
etree_elem (Element) – etree XML element containing a set of coordinates
- Return type
List[Point]
- Returns
a list of coordinates as Point
-
classmethod list_point_to_string(list_points)
Converts a list of Point to a string ‘x,y’.
- Parameters
list_points (List[Point]) – list of coordinates with Point format
- Return type
str
- Returns
a string with the coordinates
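The string form of a point list can be illustrated with a stand-in for the Point class. The sketch assumes the space-separated 'x1,y1 x2,y2 ...' layout used by the points attribute of PAGE XML; the namedtuple and both function names are illustrative and are not the library's classmethods.

```python
from collections import namedtuple
from typing import List

# Stand-in for dh_segment.io.PAGE.Point, which takes (y, x) in that order.
Point = namedtuple('Point', ['y', 'x'])


def points_to_string(points: List[Point]) -> str:
    """Format points as 'x1,y1 x2,y2 ...' (x first, then y)."""
    return ' '.join('{},{}'.format(p.x, p.y) for p in points)


def string_to_points(text: str) -> List[Point]:
    """Inverse of points_to_string."""
    points = []
    for pair in text.split():
        x, y = pair.split(',')
        points.append(Point(y=int(y), x=int(x)))
    return points
```

Note the order swap: the constructor takes y before x, while the string lists x before y.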
-
classmethod list_to_cv2poly(list_points)
Converts a list of Point to an opencv-formatted set of coordinates.
- Parameters
list_points (List[Point]) – set of coordinates
- Return type
ndarray
- Returns
opencv-formatted set of points, shape (N,1,2)
-
classmethod list_to_point(list_coords)
Converts a list of coordinates to a list of Point.
- Parameters
list_coords (list) – list of coordinates, shape (N, 2)
- Return type
List[Point]
- Returns
list of Point
-
classmethod point_to_list(points)
Converts a list of Point to a list of coordinates.
- Parameters
points (List[Point]) – list of Points
- Return type
list
- Returns
list of shape (N,2)
-
to_dict()
-
class dh_segment.io.PAGE.Region(id=None, coords=None, custom_attribute=None)
Region base class. (Abstract) This is the superclass for all the extracted regions.
- Variables
id – identifier of the Region
coords – coordinates of the Region
custom_attribute – Any custom attribute that may be linked with the region (usually this is added in PAGEXML files, not in JSON files)
-
classmethod from_dict(dictionary)
From a serialized dictionary, creates a dictionary of the attributes (non-serialized).
- Parameters
dictionary (dict) – serialized dictionary
- Return type
dict
- Returns
non-serialized dictionary
-
classmethod from_xml(etree_element)
Creates a dictionary from an XML structure in order to create the inherited objects.
- Parameters
etree_element (Element) – an XML etree
- Return type
dict
- Returns
a dictionary with keys ‘id’ and ‘coords’
-
tag = 'Region'
-
to_dict(non_serializable_keys=[])
Converts a Region object to a dictionary.
- Parameters
non_serializable_keys (List[str]) – list of keys that can’t be directly serialized and that need some internal serialization
- Return type
dict
- Returns
a dictionary with the attributes of the object serialized
-
to_xml(name_element=None)
Converts a Region object to an XML structure.
- Parameters
name_element (Optional[str]) – name of the object (optional)
- Return type
Element
- Returns
an etree structure
-
class dh_segment.io.PAGE.SeparatorRegion(id, coords=None, custom_attribute=None)
Lines separating columns or paragraphs. Separators are lines that lie between columns and paragraphs and can be used to logically separate different articles from each other.
- Variables
id – identifier of the SeparatorRegion
coords – coordinates of the SeparatorRegion
-
classmethod from_dict(dictionary)
From a serialized dictionary, creates a dictionary of the attributes (non-serialized).
- Parameters
dictionary (dict) – serialized dictionary
- Return type
- Returns
non-serialized dictionary
-
classmethod from_xml(e)
Creates a dictionary from an XML structure in order to create the inherited objects.
- Parameters
etree_element – an XML etree
- Return type
- Returns
a dictionary with keys ‘id’ and ‘coords’
-
tag = 'SeparatorRegion'
-
to_xml(name_element='SeparatorRegion')
Converts a Region object to an XML structure.
- Parameters
name_element – name of the object (optional)
- Return type
Element
- Returns
an etree structure
-
class dh_segment.io.PAGE.TableRegion(id=None, coords=None, rows=None, columns=None, embedded_text=None, custom_attribute=None)
Tabular data in any form. Tabular data is represented with a table region. Rows and columns may or may not have separator lines; these lines are not separator regions.
- Variables
id – identifier of the TableRegion
coords – coordinates of the TableRegion
rows – number of rows in the table
columns – number of columns in the table
embedded_text – if text is embedded in the table
-
classmethod from_dict(dictionary)
From a serialized dictionary, creates a dictionary of the attributes (non-serialized).
- Parameters
dictionary (dict) – serialized dictionary
- Return type
- Returns
non-serialized dictionary
-
classmethod from_xml(e)
Creates a dictionary from an XML structure in order to create the inherited objects.
- Parameters
etree_element – an XML etree
- Return type
- Returns
a dictionary with keys ‘id’ and ‘coords’
-
tag = 'TableRegion'
-
to_xml(name_element='TableRegion')
Converts a Region object to an XML structure.
- Parameters
name_element – name of the object (optional)
- Return type
Element
- Returns
an etree structure
-
class dh_segment.io.PAGE.Text(text_equiv=None, alternatives=None, score=None)
Text entity produced by a transcription system.
- Variables
text_equiv – the transcription of the text
alternatives – alternative transcriptions
score – the confidence of the transcription output by the transcription system
-
to_dict()
- Return type
dict
-
class dh_segment.io.PAGE.TextLine(id=None, coords=None, baseline=None, text=None, line_group_id=None, column_group_id=None, custom_attribute=None)
Region corresponding to a text line.
- Variables
id – identifier of the TextLine
coords – coordinates of the TextLine
baseline – coordinates of the TextLine baseline
text – Text class containing the transcription of the TextLine
line_group_id – identifier of the line group the instance belongs to
column_group_id – identifier of the column group the instance belongs to
custom_attribute – Any custom attribute that may be linked with the region (usually this is added in PAGEXML files, not in JSON files)
-
classmethod from_array(cv2_coords=None, baseline_coords=None, text_equiv=None, id=None)
-
classmethod from_dict(dictionary)
From a serialized dictionary, creates a dictionary of the attributes (non-serialized).
- Parameters
dictionary (dict) – serialized dictionary
- Return type
- Returns
non-serialized dictionary
-
classmethod from_xml(etree_element)
Creates a dictionary from an XML structure in order to create the inherited objects.
- Parameters
etree_element (Element) – an XML etree
- Return type
- Returns
a dictionary with keys ‘id’ and ‘coords’
-
scale_baseline_points(ratio)
Scales the points of the baseline by a factor ratio.
- Parameters
ratio (float) – factor to rescale the baseline coordinates
-
tag = 'TextLine'
-
to_dict(non_serializable_keys=[])
Converts a Region object to a dictionary.
- Parameters
non_serializable_keys (List[str]) – list of keys that can’t be directly serialized and that need some internal serialization
- Returns
a dictionary with the attributes of the object serialized
-
to_xml(name_element='TextLine')
Converts a Region object to an XML structure.
- Parameters
name_element – name of the object (optional)
- Return type
Element
- Returns
an etree structure
-
class dh_segment.io.PAGE.TextRegion(id=None, coords=None, text_lines=None, text_equiv='', region_type=None, custom_attribute=None)
Region containing text lines. It can represent a paragraph or a page for instance.
- Variables
id – identifier of the TextRegion
coords – coordinates of the TextRegion
text_equiv – the resulting text of the Text contained in the TextLines
text_lines – a list of TextLine objects
region_type – the type of a TextRegion (can be any string). Example: header, paragraph, page-number…
custom_attribute – Any custom attribute that may be linked with the region (usually this is added in PAGEXML files, not in JSON files)
-
classmethod from_dict(dictionary)
From a serialized dictionary, creates a dictionary of the attributes (non-serialized).
- Parameters
dictionary (dict) – serialized dictionary
- Return type
- Returns
non-serialized dictionary
-
classmethod from_xml(e)
Creates a dictionary from an XML structure in order to create the inherited objects.
- Parameters
etree_element – an XML etree
- Return type
- Returns
a dictionary with keys ‘id’ and ‘coords’
-
sort_text_lines(top_to_bottom=True)
Sorts TextLine from top to bottom according to their mean y coordinate (centroid).
- Parameters
top_to_bottom (bool) – order lines from top to bottom of image, default=True
- Return type
None
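Ordering lines by the mean y coordinate of their polygon can be sketched with plain data. Here each line is represented by a list of (x, y) tuples keyed by an id; this data layout and the function name sort_lines_by_mean_y are assumptions of the sketch, whereas the real method sorts the TextLine objects of the region in place.

```python
from typing import Dict, List, Tuple


def sort_lines_by_mean_y(lines: Dict[str, List[Tuple[int, int]]],
                         top_to_bottom: bool = True) -> List[str]:
    """Return line ids ordered by the mean y of their (x, y) coordinates."""
    def mean_y(line_id: str) -> float:
        coords = lines[line_id]
        return sum(y for _, y in coords) / len(coords)

    # Image coordinates grow downwards, so ascending y means top to bottom.
    return sorted(lines, key=mean_y, reverse=not top_to_bottom)
```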
-
tag = 'TextRegion'
-
to_dict(non_serializable_keys=[])
Converts a Region object to a dictionary.
- Parameters
non_serializable_keys (List[str]) – list of keys that can’t be directly serialized and that need some internal serialization
- Returns
a dictionary with the attributes of the object serialized
-
to_xml(name_element='TextRegion')
Converts a Region object to an XML structure.
- Parameters
name_element – name of the object (optional)
- Return type
Element
- Returns
an etree structure
Get a list of all the values of labels/tags
- Parameters
xml_filename (str) – filename of the xml file
tag_pattern (str) – regular expression pattern to look for in TextRegion.custom_attribute
- Returns
-
dh_segment.io.PAGE.json_serialize(dict_to_serialize, non_serializable_keys=[])
Serialize a dictionary in order to export it.
- Parameters
dict_to_serialize (dict) – dictionary to serialize
non_serializable_keys (List[str]) – keys that are not directly serializable, such as Python objects
- Return type
dict
- Returns
the serialized dictionary
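The purpose of non_serializable_keys can be shown with a small stand-in. This sketch assumes that the values stored under those keys expose a to_dict() method, as the PAGE classes above do; the Dummy class and the serialize function are hypothetical and do not reproduce json_serialize.

```python
import json
from typing import Any, Dict, Sequence


class Dummy:
    """Hypothetical object exposing to_dict(), like the PAGE classes."""

    def __init__(self, value: int):
        self.value = value

    def to_dict(self) -> Dict[str, Any]:
        return {'value': self.value}


def serialize(dict_to_serialize: Dict[str, Any],
              non_serializable_keys: Sequence[str] = ()) -> Dict[str, Any]:
    """Replace objects under the listed keys by their to_dict() output."""
    serialized = dict(dict_to_serialize)  # shallow copy, input untouched
    for key in non_serializable_keys:
        value = serialized.get(key)
        if isinstance(value, list):
            serialized[key] = [item.to_dict() for item in value]
        elif value is not None:
            serialized[key] = value.to_dict()
    return serialized
```

The result contains only built-in types and can be passed to json.dumps.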
-
dh_segment.io.PAGE.parse_file(filename)
Parses the file to create the corresponding Page object. The file can be a .xml or a .json.
- Parameters
filename (str) – file to parse (either JSON or PAGE XML)
- Return type
- Returns
Page object containing all the parsed elements
-
dh_segment.io.PAGE.save_baselines(filename, baselines, ratio=(1, 1), predictions_shape=None)
- Parameters
filename (str) – filename to save baselines to
baselines – list of baselines
ratio (Tuple[int,int]) – ratio of prediction shape over original shape
predictions_shape (Optional[Tuple[int,int]]) – shape of the masks output by the network
- Return type
- Returns
-
class dh_segment.io.via.VIAttribute
A container for VIA attributes.
- Parameters
name (str) – The name of attribute
type (str) – The type of the annotation (dropdown, markbox, …)
options (list) – The options / labels possible for this attribute.
-
property name
Alias for field number 0
-
property options
Alias for field number 2
-
property type
Alias for field number 1
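The "Alias for field number N" descriptions are the default docstrings of named tuple fields, which suggests a container along the following lines. This is a sketch assuming a collections.namedtuple with the three documented fields in field-number order; the example values are made up.

```python
from collections import namedtuple

# Assumption: field order follows the documented field numbers 0, 1, 2.
VIAttribute = namedtuple('VIAttribute', ['name', 'type', 'options'])

attribute = VIAttribute(name='structure', type='dropdown',
                        options=['text', 'figure'])
```

Each field can then be read either by name (attribute.options) or by position (attribute[2]).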
-
class dh_segment.io.via.WorkingItem
A container for annotated images.
- Parameters
collection (str) – name of the collection
image_name (str) – name of the image
original_x (int) – original image x size (width)
original_y (int) – original image y size (height)
reduced_x (int) – resized x size
reduced_y (int) – resized y size
iiif (str) – iiif url
annotations (dict) – VIA ‘region_attributes’
-
property annotations
Alias for field number 7
-
property collection
Alias for field number 0
-
property iiif
Alias for field number 6
-
property image_name
Alias for field number 1
-
property original_x
Alias for field number 2
-
property original_y
Alias for field number 3
-
property reduced_x
Alias for field number 4
-
property reduced_y
Alias for field number 5
-
dh_segment.io.via.collect_working_items(via_annotations, collection_name, images_dir=None, via_version=2)
Given VIA annotation input, collects all the information into WorkingItem objects. This function takes care of separating images from local files and images from IIIF urls.
- Parameters
via_annotations (dict) – via annotations (‘regions’ field)
images_dir (Optional[str]) – directory where to find the images
collection_name (str) – name of the collection
via_version (int) – version of the VIA tool used to produce the annotations (1 or 2)
- Return type
List[WorkingItem]
- Returns
list of WorkingItem
-
dh_segment.io.via.convert_via_region_page_text_region(working_item, structure_label)
- Parameters
working_item (WorkingItem) –
structure_label (str) –
- Return type
- Returns
-
dh_segment.io.via.create_masks(masks_dir, working_items, via_attributes, collection, contours_only=False)
For each annotation, create a corresponding binary mask and resize it (h = 2000). Only valid for VIA 2.0. Several annotations of the same class on the same image produce one image with several masks.
- Parameters
masks_dir (str) – where to output the masks
working_items (List[WorkingItem]) – information to work with
via_attributes (List[VIAttribute]) – VIAttributes computed by the get_via_attributes function
collection (str) – name of the collection
contours_only (bool) – creates the binary masks only for the contours of the object (thickness of contours: 20 px)
- Return type
dict
- Returns
annotation_summary, a dictionary containing a list of labels per image
-
dh_segment.io.via.create_via_annotation_single_image(img_filename, via_regions, file_attributes=None)
Returns a dictionary item {key: annotation} in VIA format to further export to a .json file.
- Parameters
img_filename (str) – path to the image
via_regions (List[dict]) – regions in VIA format (output from create_via_region_from_coordinates)
file_attributes (Optional[dict]) – file attributes (usually None)
- Return type
Dict[str,dict]
- Returns
dictionary item with key and annotations in VIA format
-
dh_segment.io.via.create_via_region_from_coordinates(coordinates, region_attributes, type_region)
Formats coordinates to a VIA region (dict).
- Parameters
coordinates (numpy array) – (N, 2) coordinates (x, y)
region_attributes (dict) – dictionary with keys: name of labels, values: values of labels
type_region (str) – via region annotation type (‘rect’, ‘polygon’)
- Return type
dict
- Returns
a region in VIA style (dict/json)
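A region in VIA's JSON format pairs a shape_attributes entry with a region_attributes entry. The sketch below builds such a dictionary from plain (x, y) tuples. The function name make_via_region is hypothetical, and deriving a rectangle from the bounding box of the coordinates is an assumption of this sketch rather than documented behavior.

```python
from typing import Dict, List, Tuple


def make_via_region(coordinates: List[Tuple[int, int]],
                    region_attributes: Dict[str, str],
                    type_region: str = 'polygon') -> dict:
    """Build a VIA-style region dict from (x, y) coordinates."""
    xs = [int(x) for x, _ in coordinates]
    ys = [int(y) for _, y in coordinates]
    if type_region == 'polygon':
        shape = {'name': 'polygon', 'all_points_x': xs, 'all_points_y': ys}
    elif type_region == 'rect':
        # A rectangle is stored as its top-left corner plus width and height.
        shape = {'name': 'rect', 'x': min(xs), 'y': min(ys),
                 'width': max(xs) - min(xs), 'height': max(ys) - min(ys)}
    else:
        raise ValueError('Unsupported region type: {}'.format(type_region))
    return {'region_attributes': region_attributes, 'shape_attributes': shape}
```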
-
dh_segment.io.via.export_annotation_dict(annotation_dict, filename)
Export the annotations to a json file.
- Parameters
annotation_dict (dict) – VIA annotations
filename (str) – filename to export the data (json file)
- Return type
None
- Returns
-
dh_segment.io.via.get_annotations_per_file(via_dict, name_file)
From VIA json content, get annotations relative to the given name_file.
- Parameters
via_dict (dict) – VIA annotations content (originally json)
name_file (str) – the file to look for (it can be an iiif path or a file path)
- Return type
dict
- Returns
dict
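Looking up the annotations of one file can be sketched as a filter over the VIA content. This assumes that via_dict maps arbitrary keys to entries carrying a 'filename' field, as in VIA image metadata; the function name annotations_for_file is hypothetical and the real function may match names differently or return a single entry.

```python
def annotations_for_file(via_dict: dict, name_file: str) -> dict:
    """Return the entries of via_dict whose 'filename' equals name_file."""
    return {key: value for key, value in via_dict.items()
            if value.get('filename') == name_file}
```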
-
dh_segment.io.via.get_via_attributes(annotation_dict, via_version=2)
Gets the attributes of the annotated data and returns a list of VIAttribute.
- Parameters
annotation_dict (dict) – json content of the VIA exported file
via_version (int) – either 1 or 2 (for VIA v 1.0 or VIA v 2.0)
- Return type
List[VIAttribute]
- Returns
A list containing VIAttributes
-
dh_segment.io.via.load_annotation_data(via_data_filename, only_img_annotations=False, via_version=2)
Load the content of via annotation files.
- Parameters
via_data_filename (str) – via annotations json file
only_img_annotations (bool) – load only the images annotations (‘_via_img_metadata’ field)
via_version (int) –
- Return type
dict
- Returns
the content of the json file containing the annotated regions
-
dh_segment.io.via.parse_via_attributes(via_attributes)
Parses the VIA attribute dictionary and returns a list of VIAttribute instances.
- Parameters
via_attributes (dict) – attributes from VIA annotation (‘_via_attributes’ field)
- Return type
List[VIAttribute]
- Returns
list of VIAttribute