Augmentations (albumentations.augmentations)

Transforms

class albumentations.augmentations.transforms.Blur(blur_limit=7, always_apply=False, p=0.5)[source]

Blur the input image using a random-sized kernel.

Parameters:
  • blur_limit (int) – maximum kernel size for blurring the input image. Default: 7.
  • p (float) – probability of applying the transform. Default: 0.5.
Targets:
image
Image types:
uint8, float32
class albumentations.augmentations.transforms.VerticalFlip(always_apply=False, p=0.5)[source]

Flip the input vertically around the x-axis.

Parameters:p (float) – probability of applying the transform. Default: 0.5.
Targets:
image, mask, bboxes, keypoints
Image types:
uint8, float32
class albumentations.augmentations.transforms.HorizontalFlip(always_apply=False, p=0.5)[source]

Flip the input horizontally around the y-axis.

Parameters:p (float) – probability of applying the transform. Default: 0.5.
Targets:
image, mask, bboxes, keypoints
Image types:
uint8, float32
class albumentations.augmentations.transforms.Flip(always_apply=False, p=0.5)[source]

Flip the input either horizontally, vertically or both horizontally and vertically.

Parameters:p (float) – probability of applying the transform. Default: 0.5.
Targets:
image, mask, bboxes, keypoints
Image types:
uint8, float32
apply(img, d=0, **params)[source]

Parameters:d (int) – code that specifies how to flip the input: 0 for vertical flipping, 1 for horizontal flipping, -1 for both vertical and horizontal flipping (which can also be seen as rotating the input by 180 degrees).
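The flip codes can be illustrated with plain NumPy slicing. This is a minimal sketch of the semantics described for d, not the library's implementation; the helper name flip_sketch is hypothetical.

```python
import numpy as np


def flip_sketch(img, d):
    """Flip an H x W (x C) array according to the flip code d."""
    if d == 0:
        return img[::-1, ...]        # vertical flip: reverse the rows
    if d == 1:
        return img[:, ::-1, ...]     # horizontal flip: reverse the columns
    if d == -1:
        return img[::-1, ::-1, ...]  # both: same as a 180 degree rotation
    raise ValueError("d must be -1, 0 or 1")


img = np.arange(12).reshape(3, 4)
both = flip_sketch(img, -1)
```

Flipping along both axes gives the same array as np.rot90(img, 2).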
class albumentations.augmentations.transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225), max_pixel_value=255.0, always_apply=False, p=1.0)[source]

Divide pixel values by 255 = 2**8 - 1, subtract mean per channel and divide by std per channel.

Parameters:
  • mean (float, float, float) – mean values
  • std (float, float, float) – std values
  • max_pixel_value (float) – maximum possible pixel value
Targets:
image
Image types:
uint8, float32
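The arithmetic described for Normalize can be sketched in NumPy as follows. The function name normalize_sketch is hypothetical, and the code only mirrors the formula stated above (scale by max_pixel_value, subtract mean, divide by std); it is not taken from the library source.

```python
import numpy as np


def normalize_sketch(img, mean, std, max_pixel_value=255.0):
    """Scale to [0, 1], then standardize each channel."""
    img = img.astype(np.float32) / max_pixel_value
    mean = np.asarray(mean, dtype=np.float32)
    std = np.asarray(std, dtype=np.float32)
    return (img - mean) / std  # mean and std broadcast over the channel axis


mean = (0.485, 0.456, 0.406)
std = (0.229, 0.224, 0.225)
white = np.full((2, 2, 3), 255, dtype=np.uint8)
out = normalize_sketch(white, mean, std)
```

For an all-white uint8 image, every pixel becomes (1 - mean) / std per channel.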
class albumentations.augmentations.transforms.Transpose(always_apply=False, p=0.5)[source]

Transpose the input by swapping rows and columns.

Parameters:p (float) – probability of applying the transform. Default: 0.5.
Targets:
image, mask, bboxes
Image types:
uint8, float32
class albumentations.augmentations.transforms.RandomCrop(height, width, always_apply=False, p=1.0)[source]

Crop a random part of the input.

Parameters:
  • height (int) – height of the crop.
  • width (int) – width of the crop.
  • p (float) – probability of applying the transform. Default: 1.
Targets:
image, mask, bboxes, keypoints
Image types:
uint8, float32
class albumentations.augmentations.transforms.RandomGamma(gamma_limit=(80, 120), always_apply=False, p=0.5)[source]
Targets:
image
Image types:
uint8, float32
class albumentations.augmentations.transforms.RandomRotate90(always_apply=False, p=0.5)[source]

Randomly rotate the input by 90 degrees zero or more times.

Parameters:p (float) – probability of applying the transform. Default: 0.5.
Targets:
image, mask, bboxes, keypoints
Image types:
uint8, float32
apply(img, factor=0, **params)[source]
Parameters:factor (int) – number of times the input will be rotated by 90 degrees.
class albumentations.augmentations.transforms.Rotate(limit=90, interpolation=1, border_mode=4, value=None, mask_value=None, always_apply=False, p=0.5)[source]

Rotate the input by an angle selected randomly from the uniform distribution.

Parameters:
  • limit ((int, int) or int) – range from which a random angle is picked. If limit is a single int an angle is picked from (-limit, limit). Default: 90
  • interpolation (OpenCV flag) – flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
  • border_mode (OpenCV flag) – flag that is used to specify the pixel extrapolation method. Should be one of: cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101. Default: cv2.BORDER_REFLECT_101
  • value (list of ints [r, g, b]) – padding value if border_mode is cv2.BORDER_CONSTANT.
  • mask_value (scalar or list of ints) – padding value if border_mode is cv2.BORDER_CONSTANT applied for masks.
  • p (float) – probability of applying the transform. Default: 0.5.
Targets:
image, mask, bboxes, keypoints
Image types:
uint8, float32
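The way a single limit expands into a symmetric range can be sketched as below. The helpers as_range and sample_angle are hypothetical names used only for illustration; the sketch assumes the angle is drawn uniformly, as the description above states.

```python
import random


def as_range(limit):
    """Expand a single number into a symmetric (-limit, limit) range."""
    if isinstance(limit, (int, float)):
        return (-limit, limit)
    return tuple(limit)


def sample_angle(limit=90, rng=random):
    """Draw an angle uniformly from the range implied by limit."""
    low, high = as_range(limit)
    return rng.uniform(low, high)


angle = sample_angle(30, random.Random(0))
```

The same single-value convention is described for several other range parameters in this module, such as shift_limit and rotate_limit.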
class albumentations.augmentations.transforms.ShiftScaleRotate(shift_limit=0.0625, scale_limit=0.1, rotate_limit=45, interpolation=1, border_mode=4, value=None, mask_value=None, always_apply=False, p=0.5)[source]

Randomly apply affine transforms: translate, scale and rotate the input.

Parameters:
  • shift_limit ((float, float) or float) – shift factor range for both height and width. If shift_limit is a single float value, the range will be (-shift_limit, shift_limit). Absolute values for lower and upper bounds should lie in range [0, 1]. Default: 0.0625.
  • scale_limit ((float, float) or float) – scaling factor range. If scale_limit is a single float value, the range will be (-scale_limit, scale_limit). Default: 0.1.
  • rotate_limit ((int, int) or int) – rotation range. If rotate_limit is a single int value, the range will be (-rotate_limit, rotate_limit). Default: 45.
  • interpolation (OpenCV flag) – flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
  • border_mode (OpenCV flag) – flag that is used to specify the pixel extrapolation method. Should be one of: cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101. Default: cv2.BORDER_REFLECT_101
  • value (list of ints [r, g, b]) – padding value if border_mode is cv2.BORDER_CONSTANT.
  • mask_value (scalar or list of ints) – padding value if border_mode is cv2.BORDER_CONSTANT applied for masks.
  • p (float) – probability of applying the transform. Default: 0.5.
Targets:
image, mask, keypoints
Image types:
uint8, float32
class albumentations.augmentations.transforms.CenterCrop(height, width, always_apply=False, p=1.0)[source]

Crop the central part of the input.

Parameters:
  • height (int) – height of the crop.
  • width (int) – width of the crop.
  • p (float) – probability of applying the transform. Default: 1.
Targets:
image, mask, bboxes, keypoints
Image types:
uint8, float32

Note

It is recommended to use uint8 images as input. Otherwise the operation will require internal conversion float32 -> uint8 -> float32 that causes worse performance.
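A central crop like the one CenterCrop describes can be sketched with array slicing. The name center_crop_sketch is hypothetical, and the placement of the crop when the margin is odd is an assumption of this sketch.

```python
import numpy as np


def center_crop_sketch(img, height, width):
    """Return the central height x width region of an H x W (x C) array."""
    rows, cols = img.shape[:2]
    if height > rows or width > cols:
        raise ValueError("crop size exceeds image size")
    y1 = (rows - height) // 2  # top edge of the crop
    x1 = (cols - width) // 2   # left edge of the crop
    return img[y1:y1 + height, x1:x1 + width]


img = np.arange(6 * 8).reshape(6, 8)
crop = center_crop_sketch(img, 2, 4)
```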

class albumentations.augmentations.transforms.OpticalDistortion(distort_limit=0.05, shift_limit=0.05, interpolation=1, border_mode=4, value=None, mask_value=None, always_apply=False, p=0.5)[source]
Targets:
image, mask
Image types:
uint8, float32
class albumentations.augmentations.transforms.GridDistortion(num_steps=5, distort_limit=0.3, interpolation=1, border_mode=4, value=None, mask_value=None, always_apply=False, p=0.5)[source]
Targets:
image, mask
Image types:
uint8, float32
class albumentations.augmentations.transforms.ElasticTransform(alpha=1, sigma=50, alpha_affine=50, interpolation=1, border_mode=4, value=None, mask_value=None, always_apply=False, approximate=False, p=0.5)[source]

Elastic deformation of images as described in [Simard2003] (with modifications). Based on https://gist.github.com/erniejunior/601cdf56d2b424757de5

[Simard2003] Simard, Steinkraus and Platt, “Best Practices for Convolutional Neural Networks applied to Visual Document Analysis”, in Proc. of the International Conference on Document Analysis and Recognition, 2003.
Parameters:approximate (boolean) – Whether to smooth displacement map with fixed kernel size. Enabling this option gives ~2X speedup on large images.
Targets:
image, mask
Image types:
uint8, float32
class albumentations.augmentations.transforms.HueSaturationValue(hue_shift_limit=20, sat_shift_limit=30, val_shift_limit=20, always_apply=False, p=0.5)[source]

Randomly change hue, saturation and value of the input image.

Parameters:
  • hue_shift_limit ((int, int) or int) – range for changing hue. If hue_shift_limit is a single int, the range will be (-hue_shift_limit, hue_shift_limit). Default: 20.
  • sat_shift_limit ((int, int) or int) – range for changing saturation. If sat_shift_limit is a single int, the range will be (-sat_shift_limit, sat_shift_limit). Default: 30.
  • val_shift_limit ((int, int) or int) – range for changing value. If val_shift_limit is a single int, the range will be (-val_shift_limit, val_shift_limit). Default: 20.
  • p (float) – probability of applying the transform. Default: 0.5.
Targets:
image
Image types:
uint8, float32
class albumentations.augmentations.transforms.PadIfNeeded(min_height=1024, min_width=1024, border_mode=4, value=None, mask_value=None, always_apply=False, p=1.0)[source]

Pad the sides of the image if its height or width is less than the desired minimum (min_height, min_width).

Parameters:
  • p (float) – probability of applying the transform. Default: 1.0.
  • value (list of ints [r, g, b]) – padding value if border_mode is cv2.BORDER_CONSTANT.
  • mask_value (int) – padding value for mask if border_mode is cv2.BORDER_CONSTANT.
Targets:
image, mask, bboxes, keypoints
Image types:
uint8, float32
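How much padding each side receives can be sketched as below. The helper pad_amounts is hypothetical, and splitting the padding evenly between the two sides (with the extra pixel going to the second side) is an assumption of this sketch rather than documented behavior.

```python
def pad_amounts(size, min_size):
    """Return (before, after) padding so that size + padding >= min_size."""
    if size >= min_size:
        return 0, 0  # already large enough, nothing to pad
    total = min_size - size
    before = total // 2
    return before, total - before


top, bottom = pad_amounts(1000, 1024)
left, right = pad_amounts(1001, 1024)
```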
class albumentations.augmentations.transforms.RGBShift(r_shift_limit=20, g_shift_limit=20, b_shift_limit=20, always_apply=False, p=0.5)[source]

Randomly shift values for each channel of the input RGB image.

Parameters:
  • r_shift_limit ((int, int) or int) – range for changing values for the red channel. If r_shift_limit is a single int, the range will be (-r_shift_limit, r_shift_limit). Default: 20.
  • g_shift_limit ((int, int) or int) – range for changing values for the green channel. If g_shift_limit is a single int, the range will be (-g_shift_limit, g_shift_limit). Default: 20.
  • b_shift_limit ((int, int) or int) – range for changing values for the blue channel. If b_shift_limit is a single int, the range will be (-b_shift_limit, b_shift_limit). Default: 20.
  • p (float) – probability of applying the transform. Default: 0.5.
Targets:
image
Image types:
uint8, float32
class albumentations.augmentations.transforms.RandomBrightness(limit=0.2, always_apply=False, p=0.5)[source]
class albumentations.augmentations.transforms.RandomContrast(limit=0.2, always_apply=False, p=0.5)[source]
class albumentations.augmentations.transforms.MotionBlur(blur_limit=7, always_apply=False, p=0.5)[source]

Apply motion blur to the input image using a random-sized kernel.

Parameters:
  • blur_limit (int) – maximum kernel size for blurring the input image. Default: 7.
  • p (float) – probability of applying the transform. Default: 0.5.
Targets:
image
Image types:
uint8, float32
class albumentations.augmentations.transforms.MedianBlur(blur_limit=7, always_apply=False, p=0.5)[source]

Blur the input image using a median filter with a random aperture linear size.

Parameters:
  • blur_limit (int) – maximum aperture linear size for blurring the input image. Default: 7.
  • p (float) – probability of applying the transform. Default: 0.5.
Targets:
image
Image types:
uint8, float32
class albumentations.augmentations.transforms.GaussianBlur(blur_limit=7, always_apply=False, p=0.5)[source]

Blur the input image using a Gaussian filter with a random kernel size.

Parameters:
  • blur_limit (int) – maximum Gaussian kernel size for blurring the input image. Default: 7.
  • p (float) – probability of applying the transform. Default: 0.5.
Targets:
image
Image types:
uint8, float32
class albumentations.augmentations.transforms.GaussNoise(var_limit=(10.0, 50.0), always_apply=False, p=0.5)[source]

Apply gaussian noise to the input image.

Parameters:
  • var_limit ((float, float) or float) – variance range for noise. If var_limit is a single float, the range will be (-var_limit, var_limit). Default: (10., 50.).
  • p (float) – probability of applying the transform. Default: 0.5.
Targets:
image
Image types:
uint8
class albumentations.augmentations.transforms.CLAHE(clip_limit=4.0, tile_grid_size=(8, 8), always_apply=False, p=0.5)[source]

Apply Contrast Limited Adaptive Histogram Equalization to the input image.

Parameters:
  • clip_limit (float) – upper threshold value for contrast limiting. Default: 4.0.
  • tile_grid_size ((int, int)) – size of grid for histogram equalization. Default: (8, 8).
  • p (float) – probability of applying the transform. Default: 0.5.
Targets:
image
Image types:
uint8
class albumentations.augmentations.transforms.ChannelShuffle(always_apply=False, p=0.5)[source]

Randomly rearrange channels of the input RGB image.

Parameters:p (float) – probability of applying the transform. Default: 0.5.
Targets:
image
Image types:
uint8, float32
class albumentations.augmentations.transforms.InvertImg(always_apply=False, p=0.5)[source]

Invert the input image by subtracting pixel values from 255.

Parameters:p (float) – probability of applying the transform. Default: 0.5.
Targets:
image
Image types:
uint8
class albumentations.augmentations.transforms.ToGray(always_apply=False, p=0.5)[source]

Convert the input RGB image to grayscale. If the mean pixel value for the resulting image is greater than 127, invert the resulting grayscale image.

Parameters:p (float) – probability of applying the transform. Default: 0.5.
Targets:
image
Image types:
uint8, float32
class albumentations.augmentations.transforms.JpegCompression(quality_lower=99, quality_upper=100, always_apply=False, p=0.5)[source]

Decrease image quality by applying JPEG compression.

Parameters:
  • quality_lower (float) – lower bound on the jpeg quality. Should be in [0, 100] range
  • quality_upper (float) – upper bound on the jpeg quality. Should be in [0, 100] range
Targets:
image
Image types:
uint8, float32
class albumentations.augmentations.transforms.Cutout(num_holes=8, max_h_size=8, max_w_size=8, fill_value=0, always_apply=False, p=0.5)[source]

CoarseDropout of the square regions in the image.

Parameters:
  • num_holes (int) – number of regions to zero out.
  • max_h_size (int) – maximum height of the hole.
  • max_w_size (int) – maximum width of the hole.

Targets:
image
Image types:
uint8, float32

Reference:
https://arxiv.org/abs/1708.04552
https://github.com/uoguelph-mlrg/Cutout/blob/master/util/cutout.py
https://github.com/aleju/imgaug/blob/master/imgaug/augmenters/arithmetic.py

class albumentations.augmentations.transforms.CoarseDropout(max_holes=8, max_height=8, max_width=8, min_holes=None, min_height=None, min_width=None, fill_value=0, always_apply=False, p=0.5)[source]

CoarseDropout of the rectangular regions in the image.

Parameters:
  • max_holes (int) – Maximum number of regions to zero out.
  • max_height (int) – Maximum height of the hole.
  • max_width (int) – Maximum width of the hole.
  • min_holes (int) – Minimum number of regions to zero out. If None, min_holes is set to max_holes. Default: None.
  • min_height (int) – Minimum height of the hole. If None, min_height is set to max_height. Default: None.
  • min_width (int) – Minimum width of the hole. If None, min_width is set to max_width. Default: None.
Targets:
image
Image types:
uint8, float32

Reference:
https://arxiv.org/abs/1708.04552
https://github.com/uoguelph-mlrg/Cutout/blob/master/util/cutout.py
https://github.com/aleju/imgaug/blob/master/imgaug/augmenters/arithmetic.py
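Dropping rectangular regions can be sketched in NumPy as below. The function coarse_dropout_sketch and its seed argument are hypothetical; the sketch assumes the holes fit inside the image and that hole counts, sizes and positions are drawn uniformly, which the text above does not specify.

```python
import numpy as np


def coarse_dropout_sketch(img, max_holes=8, max_height=8, max_width=8,
                          min_holes=None, min_height=None, min_width=None,
                          fill_value=0, seed=None):
    """Fill randomly placed rectangles of img with fill_value."""
    rng = np.random.default_rng(seed)
    # A missing minimum falls back to the corresponding maximum.
    min_holes = max_holes if min_holes is None else min_holes
    min_height = max_height if min_height is None else min_height
    min_width = max_width if min_width is None else min_width

    out = img.copy()  # leave the input untouched
    rows, cols = img.shape[:2]
    for _ in range(rng.integers(min_holes, max_holes + 1)):
        h = rng.integers(min_height, max_height + 1)
        w = rng.integers(min_width, max_width + 1)
        y = rng.integers(0, rows - h + 1)
        x = rng.integers(0, cols - w + 1)
        out[y:y + h, x:x + w] = fill_value
    return out


img = np.full((64, 64, 3), 255, dtype=np.uint8)
dropped = coarse_dropout_sketch(img, seed=0)
```

With the default arguments every hole is 8 x 8 pixels, and holes may overlap.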

class albumentations.augmentations.transforms.ToFloat(max_value=None, always_apply=False, p=1.0)[source]

Divide pixel values by max_value to get a float32 output array where all values lie in the range [0, 1.0]. If max_value is None the transform will try to infer the maximum value by inspecting the data type of the input image.

See also

FromFloat

Parameters:
  • max_value (float) – maximum possible input value. Default: None.
  • p (float) – probability of applying the transform. Default: 1.0.
Targets:
image
Image types:
any type
class albumentations.augmentations.transforms.FromFloat(dtype='uint16', max_value=None, always_apply=False, p=1.0)[source]

Take an input array where all values should lie in the range [0, 1.0], multiply them by max_value and then cast the resulting value to a type specified by dtype. If max_value is None the transform will try to infer the maximum value for the data type from the dtype argument.

This is the inverse transform for ToFloat.

Parameters:
  • max_value (float) – maximum possible input value. Default: None.
  • dtype (string or numpy data type) – data type of the output. See the ‘Data types’ page from the NumPy docs. Default: ‘uint16’.
  • p (float) – probability of applying the transform. Default: 1.0.
Targets:
image
Image types:
float32
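The relationship between ToFloat and FromFloat can be sketched as below. Both function names are hypothetical, and inferring max_value with np.iinfo covers only integer types in this sketch; how the library infers it for other types is not stated here.

```python
import numpy as np


def to_float_sketch(img, max_value=None):
    """Divide by max_value to obtain float32 values in [0, 1]."""
    if max_value is None:
        # Assumption: only integer inputs are inferred in this sketch.
        max_value = np.iinfo(img.dtype).max
    return img.astype(np.float32) / max_value


def from_float_sketch(img, dtype="uint16", max_value=None):
    """Multiply by max_value and cast to dtype."""
    dtype = np.dtype(dtype)
    if max_value is None:
        max_value = np.iinfo(dtype).max
    return (img * max_value).astype(dtype)


img8 = np.array([[0, 255]], dtype=np.uint8)
as_float = to_float_sketch(img8)
back = from_float_sketch(as_float, dtype="uint8")
```

Because the cast in this sketch truncates, intermediate values may not survive the round trip exactly; the endpoints 0 and 255 do.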
class albumentations.augmentations.transforms.Crop(x_min=0, y_min=0, x_max=1024, y_max=1024, always_apply=False, p=1.0)[source]

Crop region from image.

Parameters:
  • x_min (int) – minimum upper left x coordinate
  • y_min (int) – minimum upper left y coordinate
  • x_max (int) – maximum lower right x coordinate
  • y_max (int) – maximum lower right y coordinate
Targets:
image, mask, bboxes
Image types:
uint8, float32
class albumentations.augmentations.transforms.RandomScale(scale_limit=0.1, interpolation=1, always_apply=False, p=0.5)[source]

Randomly resize the input. Output image size is different from the input image size.

Parameters:
  • scale_limit ((float, float) or float) – scaling factor range. If scale_limit is a single float value, the range will be (1 - scale_limit, 1 + scale_limit). Default: 0.1.
  • interpolation (OpenCV flag) – flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
  • p (float) – probability of applying the transform. Default: 0.5.
Targets:
image, mask, bboxes, keypoints
Image types:
uint8, float32
class albumentations.augmentations.transforms.LongestMaxSize(max_size=1024, interpolation=1, always_apply=False, p=1)[source]

Rescale an image so that maximum side is equal to max_size, keeping the aspect ratio of the initial image.

Parameters:
  • p (float) – probability of applying the transform. Default: 1.
  • max_size (int) – maximum size of the image after the transformation
Targets:
image, mask, bboxes
Image types:
uint8, float32
class albumentations.augmentations.transforms.SmallestMaxSize(max_size=1024, interpolation=1, always_apply=False, p=1)[source]

Rescale an image so that minimum side is equal to max_size, keeping the aspect ratio of the initial image.

Parameters:
  • p (float) – probability of applying the transform. Default: 1.
  • max_size (int) – maximum size of smallest side of the image after the transformation
Targets:
image, mask, bboxes
Image types:
uint8, float32
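The output size for LongestMaxSize and SmallestMaxSize follows from scaling one side to max_size while keeping the aspect ratio. A minimal sketch, where the helper scaled_size and its side argument are hypothetical and rounding to the nearest integer is an assumption:

```python
def scaled_size(height, width, max_size, side="longest"):
    """Return the (height, width) obtained by scaling one side to max_size."""
    reference = max(height, width) if side == "longest" else min(height, width)
    scale = max_size / reference
    # Assumption: the other side is rounded to the nearest integer.
    return round(height * scale), round(width * scale)


longest = scaled_size(500, 1000, 1024, side="longest")
smallest = scaled_size(500, 1000, 250, side="smallest")
```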
class albumentations.augmentations.transforms.Resize(height, width, interpolation=1, always_apply=False, p=1)[source]

Resize the input to the given height and width.

Parameters:
  • p (float) – probability of applying the transform. Default: 1.
  • height (int) – desired height of the output.
  • width (int) – desired width of the output.
  • interpolation (OpenCV flag) – flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
Targets:
image, mask, bboxes
Image types:
uint8, float32
class albumentations.augmentations.transforms.RandomSizedCrop(min_max_height, height, width, w2h_ratio=1.0, interpolation=1, always_apply=False, p=1.0)[source]

Crop a random part of the input and rescale it to some size.

Parameters:
  • min_max_height ((int, int)) – crop size limits.
  • height (int) – height after crop and resize.
  • width (int) – width after crop and resize.
  • w2h_ratio (float) – aspect ratio of crop.
  • interpolation (OpenCV flag) – flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
  • p (float) – probability of applying the transform. Default: 1.
Targets:
image, mask, bboxes, keypoints
Image types:
uint8, float32
class albumentations.augmentations.transforms.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.2, always_apply=False, p=0.5)[source]

Randomly change brightness and contrast of the input image.

Parameters:
  • brightness_limit ((float, float) or float) – factor range for changing brightness. If limit is a single float, the range will be (-limit, limit). Default: 0.2.
  • contrast_limit ((float, float) or float) – factor range for changing contrast. If limit is a single float, the range will be (-limit, limit). Default: 0.2.
  • p (float) – probability of applying the transform. Default: 0.5.
Targets:
image
Image types:
uint8, float32
class albumentations.augmentations.transforms.RandomCropNearBBox(max_part_shift=0.3, always_apply=False, p=1.0)[source]

Crop bbox from image with random shift by x,y coordinates

Parameters:
  • max_part_shift (float) – float value in (0.0, 1.0) range. Default 0.3
  • p (float) – probability of applying the transform. Default: 1.
Targets:
image
Image types:
uint8, float32
class albumentations.augmentations.transforms.RandomSizedBBoxSafeCrop(height, width, erosion_rate=0.0, interpolation=1, always_apply=False, p=1.0)[source]

Crop a random part of the input and rescale it to some size without loss of bboxes.

Parameters:
  • height (int) – height after crop and resize.
  • width (int) – width after crop and resize.
  • erosion_rate (float) – erosion rate applied on input image height before crop.
  • interpolation (OpenCV flag) – flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
  • p (float) – probability of applying the transform. Default: 1.
Targets:
image, mask, bboxes
Image types:
uint8, float32
class albumentations.augmentations.transforms.RandomSnow(snow_point_lower=0.1, snow_point_upper=0.3, brightness_coeff=2.5, always_apply=False, p=0.5)[source]

Bleach out some pixel values simulating snow.

From https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library

Parameters:
  • snow_point_lower (float) – lower bound of the amount of snow. Should be in [0, 1] range
  • snow_point_upper (float) – upper bound of the amount of snow. Should be in [0, 1] range
  • brightness_coeff (float) – a larger number will lead to more snow on the image. Should be >= 0
Targets:
image
Image types:
uint8, float32
class albumentations.augmentations.transforms.RandomRain(slant_lower=-10, slant_upper=10, drop_length=20, drop_width=1, drop_color=(200, 200, 200), blur_value=7, brightness_coefficient=0.7, rain_type=None, always_apply=False, p=0.5)[source]

Adds rain effects.

From https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library

Parameters:
  • slant_lower
  • slant_upper
  • drop_length
  • drop_width
  • drop_color
  • blur_value (int) – rainy views are blurry
  • brightness_coefficient (float) – rainy days are usually shady
  • rain_type – [None, “drizzle”, “heavy”, “torrential”]
Targets:
image
Image types:
uint8, float32
class albumentations.augmentations.transforms.RandomFog(fog_coef_lower=0.3, fog_coef_upper=1, alpha_coef=0.08, always_apply=False, p=0.5)[source]

Simulates fog for the image

From https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library

Parameters:
  • fog_coef_lower (float) – lower limit for fog intensity coefficient. Should be in [0, 1] range.
  • fog_coef_upper (float) – upper limit for fog intensity coefficient. Should be in [0, 1] range.
  • alpha_coef (float) – transparency of the fog circles. Should be in [0, 1] range.
Targets:
image
Image types:
uint8, float32
class albumentations.augmentations.transforms.RandomSunFlare(flare_roi=(0, 0, 1, 0.5), angle_lower=0, angle_upper=1, num_flare_circles_lower=6, num_flare_circles_upper=10, src_radius=400, src_color=(255, 255, 255), always_apply=False, p=0.5)[source]

Simulates Sun Flare for the image

From https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library

Parameters:
  • flare_roi (float, float, float, float) – region of the image where flare will appear (x_min, y_min, x_max, y_max)
  • angle_lower (float) –
  • angle_upper (float) –
  • num_flare_circles_lower (int) – lower limit for the number of flare circles.
  • num_flare_circles_upper (int) – upper limit for the number of flare circles.
  • src_radius (int) –
  • src_color (int, int, int) – color of the flare
Targets:
image
Image types:
uint8, float32
class albumentations.augmentations.transforms.RandomShadow(shadow_roi=(0, 0.5, 1, 1), num_shadows_lower=1, num_shadows_upper=2, shadow_dimension=5, always_apply=False, p=0.5)[source]

Simulates shadows for the image

From https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library

Parameters:
  • shadow_roi (float, float, float, float) – region of the image where shadows will appear (x_min, y_min, x_max, y_max)
  • num_shadows_lower (int) – Lower limit for the possible number of shadows.
  • num_shadows_upper (int) – Upper limit for the possible number of shadows.
  • shadow_dimension (int) – number of edges in the shadow polygons
Targets:
image
Image types:
uint8, float32
class albumentations.augmentations.transforms.Lambda(image=None, mask=None, keypoint=None, bbox=None, name=None, always_apply=False, p=1.0)[source]

A flexible transformation class for using user-defined transformation functions per target. Function signature must include **kwargs to accept optional arguments like interpolation method, image size, etc.

Parameters:
  • image (callable) – Image transformation function.
  • mask (callable) – Mask transformation function.
  • keypoint (callable) – Keypoint transformation function.
  • bbox (callable) – BBox transformation function.
  • always_apply (bool) – Indicates whether this transformation should be always applied.
  • p (float) – probability of applying the transform. Default: 1.0.
Targets:
image, mask, bboxes, keypoints
Image types:
Any
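The **kwargs requirement can be illustrated with a plain callback. In this sketch, brighten is a hypothetical user function and call_with_extras is a hypothetical stand-in for a pipeline that forwards extra keyword arguments; the argument names it passes are illustrative only.

```python
import numpy as np


def brighten(image, **kwargs):
    """Image callback; extra keyword arguments are accepted and ignored."""
    return np.clip(image.astype(np.int16) + 40, 0, 255).astype(np.uint8)


def call_with_extras(fn, image):
    # Hypothetical caller: forwards extra keyword arguments, as a pipeline might.
    return fn(image, interpolation=1, rows=image.shape[0], cols=image.shape[1])


img = np.full((2, 2, 3), 230, dtype=np.uint8)
out = call_with_extras(brighten, img)
```

Without **kwargs in its signature, brighten would raise a TypeError on the unexpected keyword arguments. Given the signature documented above, such a callback would be supplied as Lambda(image=brighten).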
class albumentations.augmentations.transforms.ChannelDropout(channel_drop_range=(1, 1), fill_value=0, always_apply=False, p=0.5)[source]

Randomly drop channels in the input image.

Parameters:
  • channel_drop_range (int, int) – range from which we choose the number of channels to drop.
  • fill_value – pixel value for the dropped channel.
  • p (float) – probability of applying the transform. Default: 0.5.
Targets:
image
Image types:
uint8, uint16, uint32, float32
class albumentations.augmentations.transforms.ISONoise(color_shift=(0.01, 0.05), intensity=(0.1, 0.5), always_apply=False, p=0.5)[source]

Apply camera sensor noise.

Parameters:
  • color_shift (float, float) – variance range for color hue change. Measured as a fraction of 360 degree Hue angle in HLS colorspace.
  • intensity ((float, float)) – multiplicative factor that controls the strength of color and luminance noise.
  • p (float) – probability of applying the transform. Default: 0.5.
Targets:
image
Image types:
uint8

Functional transforms

albumentations.augmentations.functional.add_fog(img, fog_coef, alpha_coef, haze_list)[source]

Add fog to the image.

From https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library

Parameters:
  • img (np.array) –
  • fog_coef (float) –
  • alpha_coef (float) –
  • haze_list (list) –

Returns:

albumentations.augmentations.functional.add_rain(img, slant, drop_length, drop_width, drop_color, blur_value, brightness_coefficient, rain_drops)[source]

From https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library

Parameters:
  • img (np.uint8) –
  • slant (int) –
  • drop_length
  • drop_width
  • drop_color
  • blur_value (int) – rainy views are blurry
  • brightness_coefficient (float) – rainy days are usually shady
  • rain_drops

Returns:

albumentations.augmentations.functional.add_shadow(img, vertices_list)[source]

Add shadows to the image.

From https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library

Parameters:
  • img (np.array) –
  • vertices_list (list) –

Returns:

albumentations.augmentations.functional.add_snow(img, snow_point, brightness_coeff)[source]

Bleaches out pixels, imitating snow.

From https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library

Parameters:
  • img
  • snow_point
  • brightness_coeff

Returns:

albumentations.augmentations.functional.add_sun_flare(img, flare_center_x, flare_center_y, src_radius, src_color, circles)[source]

Add sun flare.

From https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library

Parameters:
  • img (np.array) –
  • flare_center_x (float) –
  • flare_center_y (float) –
  • src_radius
  • src_color (int, int, int) –
  • circles (list) –

Returns:

albumentations.augmentations.functional.bbox_flip(bbox, d, rows, cols)[source]

Flip a bounding box either vertically, horizontally or both depending on the value of d.

Raises:ValueError – if value of d is not -1, 0 or 1.
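The dispatch on d can be sketched in pure Python. The sketch assumes bounding boxes are given as (x_min, y_min, x_max, y_max) in coordinates normalized to [0, 1], which is why the rows and cols arguments are omitted; the function names are hypothetical and the mapping of d follows the flip codes documented for Flip.apply.

```python
def bbox_hflip_sketch(bbox):
    """Mirror a normalized bounding box left to right."""
    x_min, y_min, x_max, y_max = bbox
    return (1 - x_max, y_min, 1 - x_min, y_max)


def bbox_vflip_sketch(bbox):
    """Mirror a normalized bounding box top to bottom."""
    x_min, y_min, x_max, y_max = bbox
    return (x_min, 1 - y_max, x_max, 1 - y_min)


def bbox_flip_sketch(bbox, d):
    """Flip vertically (d=0), horizontally (d=1) or both (d=-1)."""
    if d == 0:
        return bbox_vflip_sketch(bbox)
    if d == 1:
        return bbox_hflip_sketch(bbox)
    if d == -1:
        return bbox_vflip_sketch(bbox_hflip_sketch(bbox))
    raise ValueError("Invalid d value {}. Valid values are -1, 0 and 1".format(d))


box = (0.25, 0.125, 0.5, 0.75)
flipped = bbox_flip_sketch(box, -1)
```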
albumentations.augmentations.functional.bbox_hflip(bbox, rows, cols)[source]

Flip a bounding box horizontally around the y-axis.

albumentations.augmentations.functional.bbox_rot90(bbox, factor, rows, cols)[source]

Rotates a bounding box by 90 degrees CCW (see np.rot90)

Parameters:
  • bbox (tuple) – A tuple (x_min, y_min, x_max, y_max).
  • factor (int) – Number of CCW rotations. Must be in range [0;3]. See np.rot90.
  • rows (int) – Image rows.
  • cols (int) – Image cols.
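A quarter-turn rotation of a bounding box can be sketched as repeated application of one CCW step. The sketch assumes coordinates normalized to [0, 1] with the y-axis pointing down (so rows and cols are not needed); bbox_rot90_sketch is a hypothetical name.

```python
def bbox_rot90_sketch(bbox, factor):
    """Rotate a normalized bounding box CCW by factor quarter turns."""
    if factor not in (0, 1, 2, 3):
        raise ValueError("factor must be in range [0;3]")
    x_min, y_min, x_max, y_max = bbox
    for _ in range(factor):
        # One CCW quarter turn maps a point (x, y) to (y, 1 - x).
        x_min, y_min, x_max, y_max = y_min, 1 - x_max, y_max, 1 - x_min
    return (x_min, y_min, x_max, y_max)


box = (0.25, 0.125, 0.5, 0.75)
once = bbox_rot90_sketch(box, 1)
twice = bbox_rot90_sketch(box, 2)
```

Rotating three times and then once more returns the original box.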
albumentations.augmentations.functional.bbox_rotate(bbox, angle, rows, cols, interpolation)[source]

Rotates a bounding box by angle degrees

Parameters:
  • bbox (tuple) – A tuple (x_min, y_min, x_max, y_max).
  • angle (int) – Angle of rotation in degrees
  • rows (int) – Image rows.
  • cols (int) – Image cols.
  • interpolation (int) – interpolation method.
Returns:

A tuple with the rotated bounding box.
albumentations.augmentations.functional.bbox_transpose(bbox, axis, rows, cols)[source]

Transposes a bounding box along given axis.

Parameters:
  • bbox (tuple) – A tuple (x_min, y_min, x_max, y_max).
  • axis (int) – 0 - main axis, 1 - secondary axis.
  • rows (int) – Image rows.
  • cols (int) – Image cols.
albumentations.augmentations.functional.bbox_vflip(bbox, rows, cols)[source]

Flip a bounding box vertically around the x-axis.

albumentations.augmentations.functional.crop_bbox_by_coords(bbox, crop_coords, crop_height, crop_width, rows, cols)[source]

Crop a bounding box using the provided coordinates of bottom-left and top-right corners in pixels and the required height and width of the crop.
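One way to picture this operation: convert the box to pixels, shift it into the crop's frame, and normalize by the crop size. The sketch below assumes normalized input boxes and crop_coords given as (x1, y1, x2, y2) in pixels with (x1, y1) the corner nearest the origin; both are assumptions, and crop_bbox_sketch is a hypothetical name.

```python
def crop_bbox_sketch(bbox, crop_coords, crop_height, crop_width, rows, cols):
    """Express a normalized bounding box relative to a crop of the image."""
    x_min, y_min, x_max, y_max = bbox
    x1, y1, _x2, _y2 = crop_coords
    # Normalized -> pixel coordinates of the full image.
    px = (x_min * cols, y_min * rows, x_max * cols, y_max * rows)
    # Shift into the crop's frame, then normalize by the crop size.
    return (
        (px[0] - x1) / crop_width,
        (px[1] - y1) / crop_height,
        (px[2] - x1) / crop_width,
        (px[3] - y1) / crop_height,
    )


cropped = crop_bbox_sketch((0.25, 0.5, 0.5, 0.75), (40, 40, 140, 90),
                           crop_height=50, crop_width=100, rows=100, cols=200)
```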

albumentations.augmentations.functional.crop_keypoint_by_coords(keypoint, crop_coords, crop_height, crop_width, rows, cols)[source]

Crop a keypoint using the provided coordinates of bottom-left and top-right corners in pixels and the required height and width of the crop.

albumentations.augmentations.functional.elastic_transform(image, alpha, sigma, alpha_affine, interpolation=1, border_mode=4, value=None, random_state=None, approximate=False)[source]

Elastic deformation of images as described in [Simard2003] (with modifications). Based on https://gist.github.com/erniejunior/601cdf56d2b424757de5

[Simard2003] Simard, Steinkraus and Platt, “Best Practices for Convolutional Neural Networks applied to Visual Document Analysis”, in Proc. of the International Conference on Document Analysis and Recognition, 2003.
albumentations.augmentations.functional.elastic_transform_approx(image, alpha, sigma, alpha_affine, interpolation=1, border_mode=4, value=None, random_state=None)[source]

Elastic deformation of images as described in [Simard2003] (with modifications for speed). Based on https://gist.github.com/erniejunior/601cdf56d2b424757de5

[Simard2003] Simard, Steinkraus and Platt, “Best Practices for Convolutional Neural Networks applied to Visual Document Analysis”, in Proc. of the International Conference on Document Analysis and Recognition, 2003.
albumentations.augmentations.functional.grid_distortion(img, num_steps=10, xsteps=[], ysteps=[], interpolation=1, border_mode=4, value=None)[source]
Reference:
http://pythology.blogspot.sg/2014/03/interpolation-on-regular-distorted-grid.html
albumentations.augmentations.functional.iso_noise(image, color_shift=0.05, intensity=0.5, random_state=None, **kwargs)[source]

Apply Poisson noise to the image to simulate camera sensor noise.

Parameters:
  • image – Input image. Currently, only RGB uint8 images are supported.
  • intensity – Multiplication factor for noise values. Values of ~0.5 produce a noticeable yet acceptable level of noise.
  • random_state
  • **kwargs
Returns:

Noised image

albumentations.augmentations.functional.keypoint_flip(bbox, d, rows, cols)[source]

Flip a keypoint either vertically, horizontally or both depending on the value of d.

Raises:ValueError – if value of d is not -1, 0 or 1.
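
The dispatch on d can be sketched as follows. The sketch handles only the (x, y) position of a keypoint in pixel coordinates and assumes that the last column index is cols - 1 and the last row index is rows - 1. The meaning of each d code follows the description given for Flip.apply. The name flip_point_sketch is hypothetical.

```python
def flip_point_sketch(point, d, rows, cols):
    """Flip an (x, y) pixel position according to the flip code d.

    Hypothetical helper; angle and scale handling are left out.
    """
    x, y = point
    if d == 0:
        # Vertical flip: mirror the row index.
        return (x, (rows - 1) - y)
    if d == 1:
        # Horizontal flip: mirror the column index.
        return ((cols - 1) - x, y)
    if d == -1:
        # Both flips, equivalent to a 180-degree rotation.
        return ((cols - 1) - x, (rows - 1) - y)
    raise ValueError("d must be -1, 0 or 1, got {}".format(d))


print(flip_point_sketch((2, 3), 0, rows=10, cols=20))   # (2, 6)
print(flip_point_sketch((2, 3), 1, rows=10, cols=20))   # (17, 3)
print(flip_point_sketch((2, 3), -1, rows=10, cols=20))  # (17, 6)
```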
albumentations.augmentations.functional.keypoint_hflip(kp, rows, cols)[source]

Flip a keypoint horizontally around the y-axis.

albumentations.augmentations.functional.keypoint_rot90(keypoint, factor, rows, cols, **params)[source]

Rotates a keypoint by 90 degrees CCW (see np.rot90)

Parameters:
  • keypoint (tuple) – A tuple (x, y, angle, scale).
  • factor (int) – Number of CCW rotations. Must be in the range [0, 3]. See np.rot90.
  • rows (int) – Image rows.
  • cols (int) – Image cols.
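
A minimal sketch of how a pixel position moves under repeated 90-degree CCW rotations, following the index convention of np.rot90. Only the (x, y) position is handled; angle and scale are omitted. The name rot90_point_sketch is hypothetical.

```python
def rot90_point_sketch(point, factor, rows, cols):
    """Rotate an (x, y) pixel position by factor * 90 degrees CCW.

    rows and cols describe the image before rotation.
    Hypothetical helper for illustration only.
    """
    if factor not in (0, 1, 2, 3):
        raise ValueError("factor must be in the range [0, 3]")
    x, y = point
    if factor == 1:
        return (y, (cols - 1) - x)
    if factor == 2:
        return ((cols - 1) - x, (rows - 1) - y)
    if factor == 3:
        return ((rows - 1) - y, x)
    return (x, y)


# Top-right corner of an image with 3 rows and 4 columns.
corner = (3, 0)
print(rot90_point_sketch(corner, 1, rows=3, cols=4))  # (0, 0): top-left
print(rot90_point_sketch(corner, 2, rows=3, cols=4))  # (0, 2): bottom-left
print(rot90_point_sketch(corner, 3, rows=3, cols=4))  # (2, 3): bottom-right
```

Note that for factor 1 and factor 3 the rotated image has its rows and cols swapped, so the returned position refers to an image with 4 rows and 3 columns in this example.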
albumentations.augmentations.functional.keypoint_scale(keypoint, scale_x, scale_y, **params)[source]

Scales a keypoint by scale_x and scale_y.

albumentations.augmentations.functional.keypoint_vflip(kp, rows, cols)[source]

Flip a keypoint vertically around the x-axis.

albumentations.augmentations.functional.optical_distortion(img, k=0, dx=0, dy=0, interpolation=1, border_mode=4, value=None)[source]

Barrel / pincushion distortion. Unconventional augment.

Reference:
albumentations.augmentations.functional.preserve_channel_dim(func)[source]

Preserve dummy channel dim.

albumentations.augmentations.functional.preserve_shape(func)[source]

Preserve the shape of the image.

albumentations.augmentations.functional.py3round(number)[source]

Unified rounding in all python versions.
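
For background: the built-in round() resolves ties differently across Python versions. Python 2 rounds halves away from zero, while Python 3 rounds halves to the nearest even integer. The sketch below assumes that the intended unified behavior is the Python 3 one, as the function name suggests. The name round_half_even_sketch is hypothetical.

```python
def round_half_even_sketch(number):
    """Round to the nearest integer, sending ties to the even neighbor.

    Hypothetical helper; gives the same result under either version's round().
    """
    if abs(round(number) - number) == 0.5:
        # Exact tie: number / 2 is never a tie, so rounding it and
        # doubling the result always lands on the even neighbor.
        return int(2.0 * round(number / 2.0))
    return int(round(number))


print([round_half_even_sketch(v) for v in (0.5, 1.5, 2.5, 3.5)])  # [0, 2, 2, 4]
print(round_half_even_sketch(2.4), round_half_even_sketch(2.6))    # 2 3
```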

Helper functions for working with bounding boxes

albumentations.augmentations.bbox_utils.normalize_bbox(bbox, rows, cols)[source]

Normalize coordinates of a bounding box. Divide x-coordinates by image width and y-coordinates by image height.

albumentations.augmentations.bbox_utils.denormalize_bbox(bbox, rows, cols)[source]

Denormalize coordinates of a bounding box. Multiply x-coordinates by image width and y-coordinates by image height. This is an inverse operation for normalize_bbox().
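
The two operations described above can be sketched as a round trip. The helper names normalize_bbox_sketch and denormalize_bbox_sketch are hypothetical and only mirror the description: x-coordinates are scaled by the image width (cols) and y-coordinates by the image height (rows).

```python
def normalize_bbox_sketch(bbox, rows, cols):
    """Scale pixel coordinates of (x_min, y_min, x_max, y_max) into [0, 1]."""
    x_min, y_min, x_max, y_max = bbox
    return (x_min / cols, y_min / rows, x_max / cols, y_max / rows)


def denormalize_bbox_sketch(bbox, rows, cols):
    """Scale normalized coordinates back to pixels."""
    x_min, y_min, x_max, y_max = bbox
    return (x_min * cols, y_min * rows, x_max * cols, y_max * rows)


normalized = normalize_bbox_sketch((100, 50, 300, 150), rows=200, cols=400)
print(normalized)                                       # (0.25, 0.25, 0.75, 0.75)
print(denormalize_bbox_sketch(normalized, 200, 400))    # (100.0, 50.0, 300.0, 150.0)
```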

albumentations.augmentations.bbox_utils.normalize_bboxes(bboxes, rows, cols)[source]

Normalize a list of bounding boxes.

albumentations.augmentations.bbox_utils.denormalize_bboxes(bboxes, rows, cols)[source]

Denormalize a list of bounding boxes.

albumentations.augmentations.bbox_utils.calculate_bbox_area(bbox, rows, cols)[source]

Calculate the area of a bounding box in pixels.
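
A minimal sketch of the computation, assuming the input box is in normalized coordinates and is denormalized before the width and height are multiplied. The name bbox_area_sketch is hypothetical.

```python
def bbox_area_sketch(bbox, rows, cols):
    """Area in pixels of a normalized (x_min, y_min, x_max, y_max) box."""
    x_min, y_min, x_max, y_max = bbox
    width_px = (x_max - x_min) * cols
    height_px = (y_max - y_min) * rows
    return width_px * height_px


print(bbox_area_sketch((0.25, 0.25, 0.75, 0.75), rows=200, cols=400))  # 20000.0
```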

albumentations.augmentations.bbox_utils.filter_bboxes_by_visibility(original_shape, bboxes, transformed_shape, transformed_bboxes, threshold=0.0, min_area=0.0)[source]

Filter bounding boxes and return only those whose visibility after transformation is above the threshold and whose area in pixels is greater than min_area.

Parameters:
  • original_shape (tuple) – original image shape
  • bboxes (list) – original bounding boxes
  • transformed_shape (tuple) – transformed image shape
  • transformed_bboxes (list) – transformed bounding boxes
  • threshold (float) – visibility threshold. Should be a value in the range [0.0, 1.0].
  • min_area (float) – Minimal area threshold.
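
The filtering logic can be sketched as follows. This is an illustration under stated assumptions, not the library implementation: visibility is taken to be the ratio of the transformed box area to the original box area, both in pixels, and the comparisons against threshold and min_area are inclusive. All helper names are hypothetical.

```python
def _area_px(bbox, rows, cols):
    """Area in pixels of a normalized (x_min, y_min, x_max, y_max) box."""
    x_min, y_min, x_max, y_max = bbox[:4]
    return (x_max - x_min) * cols * (y_max - y_min) * rows


def filter_by_visibility_sketch(original_shape, bboxes, transformed_shape,
                                transformed_bboxes, threshold=0.0, min_area=0.0):
    """Keep transformed boxes that are still large and visible enough."""
    rows, cols = original_shape[:2]
    t_rows, t_cols = transformed_shape[:2]
    kept = []
    for bbox, t_bbox in zip(bboxes, transformed_bboxes):
        original_area = _area_px(bbox, rows, cols)
        transformed_area = _area_px(t_bbox, t_rows, t_cols)
        if original_area <= 0 or transformed_area < min_area:
            continue  # degenerate or too small (inclusive bound is an assumption)
        visibility = transformed_area / original_area  # assumed definition
        if visibility >= threshold:
            kept.append(t_bbox)
    return kept


before = [(0.0, 0.0, 0.5, 0.5), (0.5, 0.5, 1.0, 1.0)]
after = [(0.0, 0.0, 0.5, 0.5), (0.5, 0.5, 0.625, 0.625)]
print(filter_by_visibility_sketch((100, 100), before, (100, 100), after, threshold=0.5))
# [(0.0, 0.0, 0.5, 0.5)]
```

In this example the second box shrinks from 2500 to 156.25 pixels, a visibility of 0.0625, so it is dropped at a threshold of 0.5.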
albumentations.augmentations.bbox_utils.convert_bbox_to_albumentations(bbox, source_format, rows, cols, check_validity=False)[source]

Convert a bounding box from the format specified in source_format to the format used by albumentations: normalized coordinates of the bottom-left and top-right corners of the bounding box in the form [x_min, y_min, x_max, y_max], e.g. [0.15, 0.27, 0.67, 0.5].

Parameters:
  • bbox (list) – bounding box
  • source_format (str) – format of the bounding box. Should be ‘coco’ or ‘pascal_voc’.
  • check_validity (bool) – check if all boxes are valid boxes
  • rows (int) – image height
  • cols (int) – image width

Note

The coco format of a bounding box looks like [x_min, y_min, width, height], e.g. [97, 12, 150, 200]. The pascal_voc format of a bounding box looks like [x_min, y_min, x_max, y_max], e.g. [97, 12, 247, 212].

Raises:ValueError – if source_format is not equal to coco or pascal_voc.
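
The conversion described above can be sketched in a few lines. The helper name to_normalized_sketch is hypothetical, and validity checking is left out. The example uses the coco and pascal_voc boxes from the note, which describe the same rectangle.

```python
def to_normalized_sketch(bbox, source_format, rows, cols):
    """Convert a coco or pascal_voc box to normalized [x_min, y_min, x_max, y_max]."""
    if source_format == "coco":
        # coco stores the top-left corner plus width and height.
        x_min, y_min, width, height = bbox
        x_max, y_max = x_min + width, y_min + height
    elif source_format == "pascal_voc":
        x_min, y_min, x_max, y_max = bbox
    else:
        raise ValueError("unknown source_format: {}".format(source_format))
    # Divide x by the image width and y by the image height.
    return [x_min / cols, y_min / rows, x_max / cols, y_max / rows]


from_coco = to_normalized_sketch([97, 12, 150, 200], "coco", rows=400, cols=500)
from_voc = to_normalized_sketch([97, 12, 247, 212], "pascal_voc", rows=400, cols=500)
print(from_coco == from_voc)  # True
```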
albumentations.augmentations.bbox_utils.convert_bbox_from_albumentations(bbox, target_format, rows, cols, check_validity=False)[source]

Convert a bounding box from the format used by albumentations to a format, specified in target_format.

Parameters:
  • bbox (list) – bounding box with coordinates in the format used by albumentations
  • target_format (str) – required format of the output bounding box. Should be ‘coco’ or ‘pascal_voc’.
  • rows (int) – image height
  • cols (int) – image width
  • check_validity (bool) – check if all boxes are valid boxes

Note

The coco format of a bounding box looks like [x_min, y_min, width, height], e.g. [97, 12, 150, 200]. The pascal_voc format of a bounding box looks like [x_min, y_min, x_max, y_max], e.g. [97, 12, 247, 212].

Raises:ValueError – if target_format is not equal to coco or pascal_voc.
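
A minimal sketch of the reverse direction, assuming the input is a normalized [x_min, y_min, x_max, y_max] box. The helper name from_normalized_sketch is hypothetical, and validity checking is left out.

```python
def from_normalized_sketch(bbox, target_format, rows, cols):
    """Convert a normalized box to pixel coordinates in coco or pascal_voc format."""
    if target_format not in ("coco", "pascal_voc"):
        raise ValueError("unknown target_format: {}".format(target_format))
    x_min, y_min, x_max, y_max = bbox
    # Multiply x by the image width and y by the image height.
    x_min, x_max = x_min * cols, x_max * cols
    y_min, y_max = y_min * rows, y_max * rows
    if target_format == "coco":
        return [x_min, y_min, x_max - x_min, y_max - y_min]
    return [x_min, y_min, x_max, y_max]


box = [0.25, 0.125, 0.75, 0.5]
print(from_normalized_sketch(box, "pascal_voc", rows=400, cols=800))  # [200.0, 50.0, 600.0, 200.0]
print(from_normalized_sketch(box, "coco", rows=400, cols=800))        # [200.0, 50.0, 400.0, 150.0]
```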
albumentations.augmentations.bbox_utils.convert_bboxes_to_albumentations(bboxes, source_format, rows, cols, check_validity=False)[source]

Convert a list of bounding boxes from the format specified in source_format to the format used by albumentations.

albumentations.augmentations.bbox_utils.convert_bboxes_from_albumentations(bboxes, target_format, rows, cols, check_validity=False)[source]

Convert a list of bounding boxes from the format used by albumentations to a format, specified in target_format.

Parameters:
  • bboxes (list) – list of bounding boxes with coordinates in the format used by albumentations
  • target_format (str) – required format of the output bounding box. Should be ‘coco’ or ‘pascal_voc’.
  • rows (int) – image height
  • cols (int) – image width
  • check_validity (bool) – check if all boxes are valid boxes