This paper presents a design methodology for the cost-effective, real-time implementation of nonlinear image processing algorithms. Starting from high-level functional descriptions, the proposed optimization flow eases the designer's task of achieving a low-complexity, low-power realization in CMOS technology (FPGA and/or ASIC) with low accuracy loss for the implemented algorithm. As an application case study, the paper describes the design of a system, based on a Retinex-like algorithm, that improves the visual quality of images acquired in poor lighting conditions.