Aside from the technical reasons we lose information online, I think the main reason is that we simply don't care about the information. If it's important, or particularly memorable, it probably will stick around. But so much of the content we see online is just a written version of what we talk about at the water cooler -- important in context, but as time goes by, less necessary to keep around.
We have backups of data at Conquent, but who would really mourn the lost data from some guy's random blog? Unless something happens and I attain celebrity status, it's unlikely that people will be digging through my past ramblings. The problem is that those ramblings will be gone without conscious effort to retain them.
Of course, there's lots of random stuff stuck in the corners of the Internet, and I wonder what kind of picture would emerge of me if some forensic anthropologist tried to piece the bits together later. There are lots of random photos of Anne Frank, for example, where she was in the background, and those photos surfaced only because we cared later. But what we really know about the girl is limited to what she wrote in her famous journal.
The memory of the Internet is fleeting, malleable, and constantly susceptible to sensory overload. The way we store data online is less like the great archives of a library and more like a gossip tree -- things get distorted as they pass from person to person and get re-interpreted or get processed through Photoshop.
You can't believe everything you see online because it's not the truth; it's just the bit of information we remember right now, and it's going to change again tomorrow...