From matt.brown at kitware.com  Mon Mar 12 10:59:39 2018
From: matt.brown at kitware.com (Matt Brown)
Date: Mon, 12 Mar 2018 10:59:39 -0400
Subject: [Kwiver-users] Bounding box coordinates
Message-ID:

I have an application where I would like to significantly reduce the
resolution of an image using pyramid reductions, run a detector on that
reduced-resolution image, and then warp the resulting bounding box back up
to the native-resolution coordinate system. However, I have some questions
about dealing with the bounding box coordinates and how detectors treat, or
should treat, these coordinates.

I wrote a new OCV image_pyramid process, which can be configured to apply
multiple iterations of OCV's pyrUp or pyrDown and can emit a homography
(there is actually a 0.5 pixel translation on each level, so it's not just
a scaling) representing the warping from the output image back to the
source image. I am also writing a *detected_object_bounding_box_warp* node
to apply a homography to each bounding box in a detected_object_set.
Basically, it will warp the top-left and bottom-right coordinates with the
homography. Though, I admit it is a little weird applying a homography to a
bounding box that has to remain an axis-aligned rectangle. So, maybe there
is a discussion to be had about renaming or reworking this.

My question is related to the precise definition of the bounding box,
which encodes the upper-left and lower-right coordinates of the box. Do we
have a standard enforced for how detectors populate these bounding boxes?
Let's say you have a detection covering a rectangle of pixels from
upper-left pixel indices (x1,y1) to lower-right pixel indices (x2,y2),
inclusive. I would think the bounding box should then run from upper-left
(x1-0.5,y1-0.5) to lower-right (x2+0.5,y2+0.5), which is the box in image
coordinates that completely contains the area of those pixels.
A difference of a half pixel may seem trivial, but in my case, I am
detecting on a highly reduced version of the image and then upscaling the
bounding boxes, so that half-pixel difference can be compounded many times.

Thanks,
Matt

From paul.tunison at kitware.com  Mon Mar 12 12:05:35 2018
From: paul.tunison at kitware.com (Paul Tunison)
Date: Mon, 12 Mar 2018 12:05:35 -0400
Subject: [Kwiver-users] Bounding box coordinates
In-Reply-To:
References:
Message-ID:

I don't remember the name, but the process could be constrained to accept
only the type of transform that allows translation and scaling (no
rotation, etc.).

On Mon, Mar 12, 2018 at 10:59 AM, Matt Brown wrote:

> I have an application where I would like to significantly reduce the
> resolution of an image using pyramid reductions, run a detector on that
> reduced-resolution image, and then warp the resulting bounding box back up
> to the native-resolution coordinate system. However, I have some questions
> about dealing with the bounding box coordinates and how detectors treat, or
> should treat, these coordinates.
>
> I wrote a new OCV image_pyramid process, which can be configured to apply
> multiple iterations of OCV's pyrUp or pyrDown and can emit a homography
> (there is actually a 0.5 pixel translation on each level, so it's not just
> a scaling) representing the warping from output image back to source image.
> I am also writing a *detected_object_bounding_box_warp* node to apply a
> homography to each bounding box in a detected_object_set. Basically, it
> will warp the top-left and bottom-right coordinates with the homography.
> Though, I admit it is a little weird applying a homography to a bounding
> box that has to remain an axis-aligned rectangle. So, maybe there is a
> discussion about renaming or reworking this.
>
> My question is related to the precise definition of the bounding box,
> which encodes the upper left and lower right coordinates of the box. Do we
> have a standard enforced for how detectors populate these bounding boxes?
> Let's say you have a detection covering a rectangle of pixels starting at
> upper-left pixel indices (x1,y1) to lower-right (x2,y2) inclusive, I would
> think the bounding box definition should be (x1-0.5,y1-0.5) to lower-right
> (x2+0.5,y2+0.5), which is the box in image coordinates completely
> containing the area of the pixels. A difference of a half pixel may seem
> trivial, but in my case, I am detecting on a highly reduced version of the
> image and then upscaling the bounding boxes, so that half pixel of
> difference can be compounded many times.
>
> Thanks,
> Matt
>

--
Paul Tunison
Senior R&D Engineer
Kitware, Inc.
28 Corporate Drive
Clifton Park, NY 12065 USA
Phone: (518) 371-3971 Ext.164
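
For reference, the half-pixel question raised in this thread can be sketched
numerically. The snippet below is only an illustration under stated
assumptions, not the KWIVER implementation: it assumes pixel centers sit at
integer coordinates, and that each pyramid level maps a reduced pixel onto a
2x2 block of finer pixels, which gives x_fine = 2 * x_coarse + 0.5. Whether
that matches the exact resampling done by OpenCV's pyrDown is not confirmed
here. The function names (level_homography, pyramid_homography,
pixel_box_to_edges, warp_box) are hypothetical and chosen for this example.

```python
import numpy as np


def level_homography(scale=2.0, shift=0.5):
    """One-level map from coarse to fine coordinates (assumed model)."""
    return np.array([[scale, 0.0, shift],
                     [0.0, scale, shift],
                     [0.0, 0.0, 1.0]])


def pyramid_homography(levels):
    """Compose the per-level maps: reduced image -> source image."""
    H = np.eye(3)
    for _ in range(levels):
        H = level_homography() @ H
    return H


def pixel_box_to_edges(x1, y1, x2, y2):
    """Inclusive pixel indices -> box enclosing the full pixel areas."""
    return (x1 - 0.5, y1 - 0.5, x2 + 0.5, y2 + 0.5)


def warp_box(H, box):
    """Warp all four corners, then take the axis-aligned hull.

    Using all four corners (rather than two) keeps the result valid even
    if H is not a pure scale-plus-translation.
    """
    x_min, y_min, x_max, y_max = box
    corners = np.array([[x_min, x_max, x_max, x_min],
                        [y_min, y_min, y_max, y_max],
                        [1.0, 1.0, 1.0, 1.0]])
    warped = H @ corners
    warped = warped[:2] / warped[2]  # back from homogeneous coordinates
    return (float(warped[0].min()), float(warped[1].min()),
            float(warped[0].max()), float(warped[1].max()))


# Detection covering reduced-image pixels (2,3)..(4,5), three levels down.
H = pyramid_homography(3)
with_padding = warp_box(H, pixel_box_to_edges(2, 3, 4, 5))
without_padding = warp_box(H, (2.0, 3.0, 4.0, 5.0))
print(with_padding)     # (15.5, 23.5, 39.5, 47.5)
print(without_padding)  # (19.5, 27.5, 35.5, 43.5)
```

Under these assumptions, three levels give a scale of 8 and a translation of
3.5, and omitting the half-pixel padding shrinks the warped box by 4 source
pixels on every side (0.5 times the scale factor of 8), which is the
compounding effect described in the original message.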