You are confused because you think of photons as if they were little (massless) localized particles, but they are far from that. If you want to think about photons in a classical way, it is much closer to what a photon really is to think in terms of electromagnetic waves. The wavelength has nothing to do with the spatial extension of the wave. It just tells you the spatial periodicity of the wave. The extreme case is a plane wave, which extends over the entire space but has a sharp wavelength.
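To make the distinction concrete, here is a sketch with a one-dimensional Gaussian wave packet at a fixed time. This is my own illustration; the symbols ##k_0##, ##\sigma_x##, and ##\sigma_k## are assumptions introduced only for this example:

$$E(x) \propto \exp\left(-\frac{x^2}{4\sigma_x^2}\right)\cos(k_0 x), \qquad k_0=\frac{2\pi}{\lambda}.$$

The wavelength enters only through ##k_0##, while the spatial extension is set by ##\sigma_x##, and the two can be chosen independently. What is tied to the extension is the spread of wave numbers around ##k_0##: taking ##\sigma_x## and ##\sigma_k## as the standard deviations of the intensity in position and in wave number, one has ##\sigma_k = 1/(2\sigma_x)##. The plane wave is the limit ##\sigma_x \to \infty##, ##\sigma_k \to 0##.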
Now if you have a true single-photon state with pretty sharp momentum (i.e., pretty sharp frequency and wavelength), that means, heuristically, that you have an electromagnetic wave that is widely extended in space and thus goes through both slits. Because it's a single photon, on the other hand, it is either registered by a detector (e.g., a photoplate or CCD cam) as a whole or not detected at all. The reason, and that's what makes it in a way "particle like", is that the interaction of electromagnetic radiation of frequency ##\omega## with matter always occurs in integer multiples of the "energy quantum" ##E_{\omega}=\hbar \omega##. By definition a single-photon state is an energy eigenstate of the electromagnetic field of frequency ##\omega## with energy eigenvalue ##E_{\omega}##. So if the photon interacts with the matter making up the detector, it is either completely absorbed and registered at the place where the interaction took place, or it gets at most a bit scattered by the interaction but is not absorbed and not registered. There's no way to detect only a portion of the photon.
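To get a feeling for the size of the energy quantum ##E_{\omega}=\hbar \omega##, here is a small Python sketch. The function names and the 500 nm example wavelength are my own choices for illustration; the constants are the exact SI values.

```python
import math

# Exact SI values (2019 redefinition)
PLANCK_H = 6.62607015e-34        # Planck constant, J s
LIGHT_C = 299792458.0            # speed of light in vacuum, m/s
ELEM_CHARGE = 1.602176634e-19    # elementary charge, C (J per eV)

HBAR = PLANCK_H / (2.0 * math.pi)


def photon_energy_joule(wavelength_m):
    """Energy quantum hbar * omega for light of the given vacuum wavelength."""
    omega = 2.0 * math.pi * LIGHT_C / wavelength_m   # angular frequency, rad/s
    return HBAR * omega


def photon_energy_ev(wavelength_m):
    """Same energy quantum, converted from joule to electron volts."""
    return photon_energy_joule(wavelength_m) / ELEM_CHARGE


def absorbed_energy_joule(n_quanta, wavelength_m):
    """Energy taken up by matter: only whole multiples of the quantum occur."""
    if n_quanta < 0 or n_quanta != int(n_quanta):
        raise ValueError("n_quanta must be a non-negative integer")
    return n_quanta * photon_energy_joule(wavelength_m)


if __name__ == "__main__":
    # Green light, 500 nm (an arbitrary example value)
    print(photon_energy_ev(500e-9))
```

For a single-photon state the only possible outcomes are `n_quanta = 0` (not absorbed) or `n_quanta = 1` (absorbed as a whole); there is nothing in between.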
That's why you need the probability interpretation of the quantum formalism. On the one hand you have the wave picture, which describes continuous phenomena. On the other hand, if you deal with single-photon states, the photon can only be registered as a whole or not registered at all, so all you get from one photon is a single point on your detector screen (photoplate or CCD cam). The observable content of the quantum state is therefore the probability to detect a photon at the place where the detector is located.
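This can be illustrated with a toy simulation. It is only a sketch under assumptions of my own: the detection probability density is taken to be proportional to the classical far-field (Fraunhofer) two-slit intensity, and the slit geometry, wavelength, and all function names are made up for the example. Each photon yields exactly one detection position, drawn by rejection sampling.

```python
import math
import random


def two_slit_intensity(x, wavelength=500e-9, slit_sep=50e-6,
                       slit_width=10e-6, screen_dist=1.0):
    """Far-field two-slit intensity at screen position x, equal to 1 at x = 0."""
    sin_theta = x / math.hypot(x, screen_dist)
    beta = math.pi * slit_width * sin_theta / wavelength   # single-slit phase
    delta = math.pi * slit_sep * sin_theta / wavelength    # two-slit phase
    envelope = 1.0 if beta == 0.0 else (math.sin(beta) / beta) ** 2
    return envelope * math.cos(delta) ** 2


def sample_detections(n_photons, half_width=0.05, seed=0):
    """Draw one detection position per photon via rejection sampling."""
    rng = random.Random(seed)
    hits = []
    while len(hits) < n_photons:
        x = rng.uniform(-half_width, half_width)
        # Accept x with probability equal to the normalized intensity there
        if rng.random() < two_slit_intensity(x):
            hits.append(x)
    return hits


def histogram(hits, n_bins, half_width=0.05):
    """Count detections in equal-width bins across the screen."""
    counts = [0] * n_bins
    bin_width = 2.0 * half_width / n_bins
    for x in hits:
        i = min(int((x + half_width) / bin_width), n_bins - 1)
        counts[i] += 1
    return counts


if __name__ == "__main__":
    for n in (10, 100, 5000):
        counts = histogram(sample_detections(n), n_bins=50)
        print(n, counts)
```

With a handful of photons the points look random; the fringes only show up in the histogram after many single-photon detections have accumulated, which is exactly the statement that the state predicts probabilities and not the position of an individual detection event.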