Confused: Visual Attention vs Segmentation
Lately I have been quite confused about the relationship between visual attention detection and segmentation. Is visual attention detection just one step after segmentation, that is, ordering the obtained segments according to some attentional criterion? If the answer is yes, what contribution is left to make in research on bottom-up visual attention detection?
Is it true that attention pops out only after the different homogeneous regions have been segmented? I suppose that is the debate between space-based attention and object-based attention. In fact, the criteria that segmentation algorithms use already consider the contrast between segments. Does that already amount to visual attention? If so, what contributions can be made in the area of bottom-up visual attention detection?
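To make the "attention as a post-processing step on segmentation" reading concrete, here is a minimal sketch. Everything in it is my own assumption for illustration: the function name `rank_segments`, the toy image and label map, and the scoring rule (area-weighted absolute difference of mean intensity between a segment and all other segments). It is not taken from any particular attention model.

```python
def rank_segments(image, labels):
    """Rank segment ids from most to least contrasting.

    image  -- 2D list of grey-level intensities
    labels -- 2D list of segment ids with the same shape as image
    """
    sums, counts = {}, {}
    for img_row, lab_row in zip(image, labels):
        for value, seg in zip(img_row, lab_row):
            sums[seg] = sums.get(seg, 0.0) + value
            counts[seg] = counts.get(seg, 0) + 1

    means = {seg: sums[seg] / counts[seg] for seg in sums}
    total = sum(counts.values())

    # Hypothetical attentional criterion: area-weighted absolute difference
    # between this segment's mean intensity and every other segment's mean.
    score = {
        seg: sum(counts[other] / total * abs(means[seg] - means[other])
                 for other in means if other != seg)
        for seg in means
    }
    order = sorted(score, key=score.get, reverse=True)
    return order, score


# Toy input: segment 0 is the dark background, segment 1 a bright patch,
# segment 2 a slightly lighter corner.
image = [
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.9, 0.1],
    [0.1, 0.9, 0.9, 0.2],
    [0.1, 0.1, 0.2, 0.2],
]
labels = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 2],
    [0, 0, 2, 2],
]
order, score = rank_segments(image, labels)
print(order)
```

If attention really were nothing more than this, all the interesting work would sit in the segmentation step and in the choice of criterion, which is exactly what I am unsure about.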
Visual attention is commonly considered to be related to contrast. So how should contrast be measured (spatially for images, temporally for video)? Current methods use either a spatial block-based center-surround difference or a scale-space center-surround difference. Both inevitably have a granularity problem: the block size, and the interval between consecutive scales in the scale space, affect the performance of attention detection. This defect raises two questions:
1. Does the granularity problem indicate that attention is hierarchical? Can we build a hierarchical structure of attention and apply it to the browsing problem?
2. Should visual attention have a different scale (granularity) at different locations? Or should it just be a hierarchical structure in which each layer uses one common scale for all locations?
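The granularity problem can be seen in a small sketch of a block-based center-surround difference. The helper names (`block_means`, `center_surround`, `saliency`), the 8-neighbour surround, and the toy image are assumptions of mine for illustration only, not a specific published method.

```python
def block_means(image, block):
    """Mean intensity of each non-overlapping block x block tile."""
    rows, cols = len(image) // block, len(image[0]) // block
    means = []
    for r in range(rows):
        row = []
        for c in range(cols):
            tile = [image[r * block + i][c * block + j]
                    for i in range(block) for j in range(block)]
            row.append(sum(tile) / len(tile))
        means.append(row)
    return means


def center_surround(means):
    """|center - mean of the neighbouring blocks| for every block."""
    rows, cols = len(means), len(means[0])
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            neigh = [means[rr][cc]
                     for rr in range(max(0, r - 1), min(rows, r + 2))
                     for cc in range(max(0, c - 1), min(cols, c + 2))
                     if (rr, cc) != (r, c)]
            surround = sum(neigh) / len(neigh) if neigh else means[r][c]
            row.append(abs(means[r][c] - surround))
        out.append(row)
    return out


def saliency(image, block):
    """Block-based center-surround saliency map for one block size."""
    return center_surround(block_means(image, block))


# Toy 8x8 image: dark background with a bright 2x2 patch at rows 2-3, cols 2-3.
image = [[0.0] * 8 for _ in range(8)]
for i in (2, 3):
    for j in (2, 3):
        image[i][j] = 1.0

fine = saliency(image, 2)    # 4x4 map, block size matches the patch
coarse = saliency(image, 4)  # 2x2 map, patch is averaged into a larger block

print(max(max(row) for row in fine))    # peak response with 2x2 blocks
print(max(max(row) for row in coarse))  # peak response with 4x4 blocks
```

With 2x2 blocks the patch fills exactly one block and gets the maximum response; with 4x4 blocks it is diluted to a quarter of that. Stacking the maps for several block sizes would give one layer per scale with a common scale for all locations (the second option in question 2), whereas picking, per location, the block size with the strongest response would be the first option.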