In remote sensing images, objects differ significantly in scale, so any single-scale segmentation can hardly produce satisfactory results. This paper argues that an appropriate segmentation scale can be selected according to the visual complexity of the scene. Based on the Watson visual model, a method is proposed to calculate this complexity, which is then used to adapt the scale of Statistical Region Merging (SRM). In addition, SRM is improved with a dynamic merging mode and extended to multi-band images. Experiments demonstrate that the proposed adaptive-scale segmentation performs better than any single-scale segmentation.