Biological vision uses attention to reduce visual bandwidth, which simplifies higher-level processing. This paper presents a model that emulates this powerful biological process, together with its real-time hardware architecture on a field-programmable gate array (FPGA), intended for integration in a robotic system. The model combines bottom-up saliency with top-down, task-dependent modulation. The bottom-up stream comprises local energy, orientation, color opponency, and motion maps. The most novel part of this work is the modulation of saliency by two high-level features: 1) optical flow and 2) disparity. Furthermore, the influence of each feature can be adjusted depending on the application. The proposed system reaches 180 fps at 640 × 480 resolution. Finally, an example shows the potential of this modulation for driving assistance systems.
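
The combination of adjustable feature maps with top-down modulation can be illustrated with a minimal software sketch. It is not the FPGA architecture described here, and the abstract does not give the combination rule: the normalized weighted sum, the multiplicative gate, and all names (`normalize`, `bottom_up_saliency`, `modulate`, `strength`) are assumptions chosen only for illustration.

```python
from typing import Dict, List, Tuple

Map = List[List[float]]  # a small 2-D feature map, row-major


def normalize(m: Map) -> Map:
    """Rescale a map to [0, 1]; a constant map becomes all zeros."""
    lo = min(min(row) for row in m)
    hi = max(max(row) for row in m)
    if hi == lo:
        return [[0.0 for _ in row] for row in m]
    return [[(v - lo) / (hi - lo) for v in row] for row in m]


def bottom_up_saliency(maps: Dict[str, Map], weights: Dict[str, float]) -> Map:
    """Assumed rule: weighted average of the normalized feature maps."""
    total = sum(weights[name] for name in maps)
    if total <= 0:
        raise ValueError("feature weights must sum to a positive value")
    first = next(iter(maps.values()))
    out = [[0.0] * len(first[0]) for _ in first]
    for name, m in maps.items():
        w = weights[name] / total
        for r, row in enumerate(normalize(m)):
            for c, v in enumerate(row):
                out[r][c] += w * v
    return out


def modulate(saliency: Map, gate: Map, strength: float) -> Map:
    """Assumed top-down rule: scale saliency by a gate in [0, 1].

    The gate stands in for a map derived from optical flow or disparity.
    strength = 0 leaves the saliency unchanged; strength = 1 applies the
    gate fully.
    """
    return [
        [s * ((1.0 - strength) + strength * g) for s, g in zip(s_row, g_row)]
        for s_row, g_row in zip(saliency, gate)
    ]


def most_salient(m: Map) -> Tuple[int, int]:
    """Return the (row, column) of the maximum value."""
    return max(
        ((r, c) for r in range(len(m)) for c in range(len(m[0]))),
        key=lambda rc: m[rc[0]][rc[1]],
    )


# Toy 2x2 example with two of the bottom-up features.
features = {
    "energy": [[0.0, 1.0], [0.0, 0.0]],
    "color": [[0.0, 0.0], [1.0, 0.0]],
}
weights = {"energy": 3.0, "color": 1.0}
saliency = bottom_up_saliency(features, weights)

# Hypothetical disparity-derived gate that suppresses the top-right pixel.
disparity_gate = [[1.0, 0.0], [1.0, 1.0]]
modulated = modulate(saliency, disparity_gate, strength=1.0)
```

In this toy case the bottom-up peak sits at the top-right pixel, and applying the gate moves the most salient location to the bottom-left pixel, which is the kind of task-dependent shift that top-down modulation is meant to produce.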