19.1 Why Not Use Fully Connected Networks for Images?#
Consider a grayscale image of size 28×28 pixels. If we treat this as a flat input vector to a dense network, it results in 784 input nodes. For a 100x100 image, the number of inputs jumps to 10,000! Fully connected networks don’t scale well and tend to ignore spatial hierarchies like “edges” or “patterns.”
CNNs solve this by leveraging local connectivity, weight sharing, and spatial hierarchies to detect meaningful patterns in images.
Figure: Fully connected networks flatten images into vectors, losing important spatial relationships like shape, texture, and position. This may work for simple images but fails on complex ones with pixel dependencies. CNNs preserve spatial information by using filters, making them more effective for image tasks like recognition and detection.