Extracting quality square crops of images from non-square images
I need to automatically extract quality square thumbnails from non-square images. By 'quality', I mean what most people would recognize as being a good crop. I'm particularly interested in processing images of animals and plants.
e.g. if you take this tiger image, you would expect a good square crop to include the full head, and as much of the body as can fit.
I just tried using Vision Studio (https://portal.vision.cognitive.azure.com/demo/image-smart-cropping), and this is the crop it came up with, that it claims "emphasizes the images' most important areas":
Question: is there a way to get better results from this API?
Azure Computer Vision
-
dupammi 8,465 Reputation points • Microsoft Vendor
2024-05-07T15:12:44.3866667+00:00 Hi @David Ebbo
Thank you for your question.
The Azure AI Vision smart-cropping utility takes one or more aspect ratios in the range [0.75, 1.80] and returns the bounding box coordinates (in pixels) of the region(s) identified. Your app can then crop and return the image using those coordinates.
Try to define an aspect ratio (width / height) in the range of [0.75, 1.80]. This helps in controlling the shape of the bounding box generated by the smart cropping feature, ensuring that the resulting thumbnail includes the full head and as much of the body as possible.
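To make the "crop using those coordinates" step concrete, here is a minimal sketch, not the service's own code. The function names `is_supported_ratio` and `crop_to_box` are hypothetical, the image is modeled as a plain list of pixel rows rather than a real image type, and the box values are made up for the example. The only facts taken from the thread are the [0.75, 1.80] range and that the box is given in pixels. Note that a square corresponds to an aspect ratio of 1.0, which lies inside that range.

```python
# Aspect ratio (width / height) range quoted in the answer above.
MIN_RATIO, MAX_RATIO = 0.75, 1.80


def is_supported_ratio(ratio):
    """Return True if the ratio falls inside the quoted range (hypothetical helper)."""
    return MIN_RATIO <= ratio <= MAX_RATIO


def crop_to_box(pixels, x, y, w, h):
    """Return the sub-image covered by the pixel box (x, y, w, h).

    `pixels` is assumed to be a list of rows, each row a list of pixel values.
    """
    return [row[x:x + w] for row in pixels[y:y + h]]


# Dummy 6x4 image; each "pixel" stores its own (column, row) position.
image = [[(col, row) for col in range(6)] for row in range(4)]

# Made-up square box, standing in for coordinates returned by a cropping service.
square = crop_to_box(image, x=1, y=0, w=4, h=4)
print(is_supported_ratio(1.0), len(square[0]), len(square))
```

With a real image library the slicing step would look much the same, for example `img[y:y + h, x:x + w]` on an OpenCV array.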
That said, the API works well for human faces, as per the documentation:
This feature uses face detection to help determine important regions in the image. The detection does not involve distinguishing one face from another face, predicting or classifying facial attributes, or creating a facial template (a unique set of numbers generated from an image that represents the distinctive features of a face).
Experimenting with different aspect ratios and settings can help fine-tune the results to meet your specific requirements. If you have many images to crop, then the subject should be centered in those images for the aspect ratio to work.
Also consider exploring the Resize and crop thumbnail images / Azure Custom Vision service.
I hope you understand. Thank you.
-
David Ebbo 5 Reputation points
2024-05-08T13:42:13.09+00:00 Hi @dupammi . I think you misunderstood my question. I need to generate square crops, so the aspect ratio is not something that can vary.
-
dupammi 8,465 Reputation points • Microsoft Vendor
2024-05-09T01:56:03.88+00:00 Hi @David Ebbo
Thank you for the details.
Unfortunately, achieving perfect square crops might be challenging because of the aspect ratio constraints that the Azure AI Vision smart-cropping API applies by default. As a result, obtaining square crops that include the full head and much of the body within the bounding box might not be directly feasible. Consider exploring alternative solutions such as resizing and cropping thumbnail images using the Azure Custom Vision service.
In addition, you can consider writing your own custom solution using OpenCV that performs preprocessing steps including converting to grayscale, Gaussian blur, and adaptive thresholding to obtain a binary image. Then identify the largest contour in the binary image, calculate the bounding box, and extend it slightly based on predefined adjustments to ensure the entire object is covered.
Once you have the bounding box coordinates of the detected object, you can adjust them to create a square crop that includes the full head and much of the body. Below is the code I wrote, which demonstrates how to do this by extending the bounding box to create a square crop. You can refine this approach by experimenting with different adjustment values to fine-tune the cropping results for your specific images.
import cv2
from google.colab.patches import cv2_imshow

# Load image
img = cv2.imread("/content/DAK_Panthera_tigris_02a-cropped.jpg")

# Convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Preprocessing: Gaussian Blur
blurred = cv2.GaussianBlur(gray, (5, 5), 0)

# Adaptive Thresholding
thresh_gray = cv2.adaptiveThreshold(blurred, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY_INV, 11, 4)

# Find contours
contours, hierarchy = cv2.findContours(thresh_gray, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Find the largest contour
largest_contour = max(contours, key=cv2.contourArea)

# Get bounding box of the largest contour
x, y, w, h = cv2.boundingRect(largest_contour)

# Calculate distances from left top corner to edges of bounding box
dist_left = x
dist_top = y
dist_right = img.shape[1] - (x + w)
dist_bottom = img.shape[0] - (y + h)

# Define the amount of adjustment
adjustment = 10

# Extend bounding box based on these distances
x_new = max(0, x - dist_left + adjustment)
y_new = max(0, y - dist_top)
w_new = h_new = max(w + dist_left + dist_right - adjustment * 2, h + dist_top + dist_bottom - adjustment * 2)

# Calculate coordinates of all corners
top_left = (x_new, y_new)
top_right = (x_new + w_new, y_new)
bottom_right = (x_new + w_new, y_new + h_new)
bottom_left = (x_new, y_new + h_new)

# Print bounding box coordinates and side lengths
print("Bounding Box Coordinates (x, y):", x_new, y_new)
print("Bounding Box Width and Height:", w_new, h_new)

# Print coordinates of all corners
print("Top Left Corner:", top_left)
print("Top Right Corner:", top_right)
print("Bottom Right Corner:", bottom_right)
print("Bottom Left Corner:", bottom_left)

# Draw bounding box on original image
result_img = img.copy()
cv2.rectangle(result_img, (x_new, y_new), (x_new + w_new, y_new + h_new), (0, 255, 0), 2)

# Show the result
cv2_imshow(result_img)

# Save the result
cv2.imwrite("object_detection_result.png", result_img)

# Crop the original image using the bounding box coordinates and dimensions
cropped_img = img[y_new:y_new+h_new, x_new:x_new+w_new]

# Show the cropped image
cv2_imshow(cropped_img)

# Save the cropped image
cv2.imwrite("cropped_image.png", cropped_img)
Finally crop the resultant image.
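A crop that is truly square and lies entirely inside the image cannot have a side longer than the shorter image dimension. The following is a minimal sketch of that constraint, assuming a subject bounding box (x, y, w, h) is already available from some detector. The function name `square_crop_box` and the example numbers are hypothetical, and centering the square on the box is only one possible policy.

```python
def square_crop_box(img_w, img_h, x, y, w, h):
    """Return (left, top, side) of a square crop that stays inside the image.

    The side is the longer edge of the subject box, capped at the shorter
    image dimension. The square is centered on the box, then shifted as
    needed so that it does not cross the image border.
    """
    side = min(img_w, img_h, max(w, h))

    # Center of the subject box (integer pixel coordinates).
    cx = x + w // 2
    cy = y + h // 2

    # Center the square on the box, then clamp it to the image bounds.
    left = max(0, min(cx - side // 2, img_w - side))
    top = max(0, min(cy - side // 2, img_h - side))
    return left, top, side


# Made-up example: a 1200x600 image whose subject spans most of the width.
left, top, side = square_crop_box(1200, 600, 100, 50, 1000, 500)
print(left, top, side)  # 300 0 600
```

With an image array, the crop itself would then be `img[top:top + side, left:left + side]`. In this example the subject is wider than the image is tall, so part of it is necessarily cut off. Deciding which part to keep (for instance, favoring the head) needs an additional signal that this sketch does not provide.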
I hope the provided information helps. Thank you.
-
David Ebbo 5 Reputation points
2024-05-09T10:59:00.1766667+00:00 @dupammi if the top half of your square crop has the entire image, what exactly is in the bottom half that you're not showing?
-
dupammi 8,465 Reputation points • Microsoft Vendor
2024-05-09T11:25:44.0233333+00:00 Hi @David Ebbo
Thank you for your follow-up on this.
The bottom half covers the paws at the bottom, the face on the left, the tail on the right, and so on.
Due to content moderation reasons, I was unable to attach the full image.
I hope you understand. Thank you.
-
David Ebbo 5 Reputation points
2024-05-09T11:32:47.5666667+00:00 Hi @dupammi Sorry, I don't understand what you mean. The original image is much wider than it is high. Since the top half that you are showing contains most of the original image, there is no way you can have a square image unless you are filling the bottom half with some random filler. This is not what 'cropping' means.
-
dupammi 8,465 Reputation points • Microsoft Vendor
2024-05-09T11:40:13.2933333+00:00 Hi @David Ebbo
Please raise a support case through the Azure portal. This will allow Azure support to help you find a better Azure component for your use case. Please provide a detailed description, including any screenshots, when raising the support case.
I hope you understand. Thank you.
-
David Ebbo 5 Reputation points
2024-05-09T11:43:07.36+00:00 Yes, I may do that. So I take it that you misunderstood the question, and that your answer did not result in a square crop of the original image. Thank you for trying nonetheless.