cerebras.modelzoo.data.multimodal.datasets.features.Bbox
- class cerebras.modelzoo.data.multimodal.datasets.features.Bbox(XMin, YMin, XMax, YMax, ClassLabel, ClassIntID, ClassID=None, IsOccluded=None, IsTruncated=None, IsGroupOf=None, IsDepiction=None, IsInside=None, IsTrainable=None, Source=None, Confidence=None)
Bases: object
- Source: indicates how the box was made:
  - xclick: manually drawn boxes using the method presented in [1], where the annotators click on the four extreme points of the object. In V6 we release the actual 4 extreme points for all xclick boxes in train (13M).
  - activemil: boxes produced using an enhanced version of the method [2]. These are human verified to be accurate at IoU>0.7.
- LabelName: the MID of the object class this box belongs to.
- Confidence: a dummy value, always 1.
- XMin, XMax, YMin, YMax: coordinates of the box, in normalized image coordinates. XMin is in [0,1], where 0 is the leftmost pixel and 1 is the rightmost pixel in the image. Y coordinates go from the top pixel (0) to the bottom pixel (1).
- IsOccluded: indicates that the object is occluded by another object in the image.
- IsTruncated: indicates that the object extends beyond the boundary of the image.
- IsGroupOf: indicates that the box spans a group of objects (e.g., a bed of flowers or a crowd of people). We asked annotators to use this tag for cases with more than 5 instances which are heavily occluding each other and are physically touching.
- IsDepiction: indicates that the object is a depiction (e.g., a cartoon or drawing of the object, not a real physical instance).
- IsInside: indicates a picture taken from the inside of the object (e.g., a car interior or inside of a building).

For each of IsOccluded, IsTruncated, IsGroupOf, IsDepiction, and IsInside, value 1 indicates present, 0 not present, and -1 unknown.
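The coordinate convention described above (0 is the first pixel, 1 is the last pixel along each axis) can be illustrated with a small conversion to pixel coordinates. This is a minimal sketch, not part of the Bbox class: the helper name `normalized_to_pixel` and its `width`/`height` parameters are hypothetical, and scaling by `size - 1` is an assumption taken from the "0 is the leftmost pixel, 1 is the rightmost pixel" wording. Some pipelines scale by `size` instead.

```python
from typing import Tuple


def normalized_to_pixel(
    xmin: float,
    ymin: float,
    xmax: float,
    ymax: float,
    width: int,
    height: int,
) -> Tuple[float, float, float, float]:
    """Map normalized box corners in [0, 1] to pixel coordinates.

    Hypothetical helper, not part of the documented API. Assumes 0 maps to
    pixel index 0 and 1 maps to the last pixel index (size - 1).
    """
    for value in (xmin, ymin, xmax, ymax):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"normalized coordinate out of range: {value}")
    if xmin > xmax or ymin > ymax:
        raise ValueError("min corner must not exceed max corner")

    # X runs left to right, Y runs top to bottom.
    return (
        xmin * (width - 1),
        ymin * (height - 1),
        xmax * (width - 1),
        ymax * (height - 1),
    )


# A box covering the whole 640x480 image.
full_image = normalized_to_pixel(0.0, 0.0, 1.0, 1.0, width=640, height=480)
```

The function returns floats so that the caller can decide how to round; rounding policy is left open here because the source does not specify one.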
Methods
bbox_to_tensor
labelID_to_tensor
Attributes
ClassID
Confidence
IsDepiction
IsGroupOf
IsInside
IsOccluded
IsTrainable
IsTruncated
Source
XMin
YMin
XMax
YMax
ClassLabel
ClassIntID
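The Is* attributes are documented as tri-state values (1 present, 0 not present, -1 unknown), and the constructor defaults them to None. The sketch below shows one way such a value might be decoded into an optional boolean. The helper `decode_flag` is hypothetical and not part of the class, and treating None the same as -1 (unknown) is an assumption, since the source does not say how a missing value should be interpreted.

```python
from typing import Optional


def decode_flag(value: Optional[int]) -> Optional[bool]:
    """Decode a tri-state annotation flag.

    Hypothetical helper. Returns True for 1 (present), False for 0
    (not present), and None for -1 (unknown). A missing value (None)
    is also treated as unknown, which is an assumption.
    """
    if value is None or value == -1:
        return None
    if value == 1:
        return True
    if value == 0:
        return False
    raise ValueError(f"unexpected flag value: {value!r}")


# Example: a plain dict standing in for the Is* attributes of one box.
flags = {"IsOccluded": 1, "IsTruncated": 0, "IsGroupOf": -1, "IsInside": None}
decoded = {name: decode_flag(value) for name, value in flags.items()}
```

Keeping "unknown" distinct from "not present" avoids silently filtering out boxes whose annotation is simply missing.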