I've been doing a crap ton of OpenGL programming recently, and it's standard there to pack separate image data into the 4 channels of an RGBA image. I don't know the internals of Corona, but packing does save on bandwidth and processing, since you upload just 1 texture to the GPU in one call instead of 4 and you take up only one texture unit.
However, don't overthink it. Once it's loaded there should be no difference. Do what is most comfortable for you as the artist. Comfort is the biggest speed-up you can get :]
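For anyone who hasn't seen the channel packing described above, here is a minimal sketch in plain Python. The function names `pack_rgba` and `unpack_rgba` are made up for illustration and say nothing about how Corona or 3dsMax handle it internally. It assumes four 8-bit grayscale maps of the same size, each passed as raw bytes, and interleaves them into one tightly packed RGBA buffer, which is the layout you would normally hand to OpenGL as `GL_RGBA` / `GL_UNSIGNED_BYTE` data.

```python
def pack_rgba(r, g, b, a):
    """Interleave four equal-length 8-bit maps into one RGBA byte buffer."""
    planes = (bytes(r), bytes(g), bytes(b), bytes(a))
    n = len(planes[0])
    if any(len(p) != n for p in planes):
        raise ValueError("all four maps must have the same number of pixels")
    packed = bytearray(4 * n)
    for channel, plane in enumerate(planes):
        # Every 4th byte, starting at the channel offset, belongs to this map.
        packed[channel::4] = plane
    return bytes(packed)


def unpack_rgba(packed):
    """Split an RGBA byte buffer back into its four single-channel maps."""
    if len(packed) % 4 != 0:
        raise ValueError("buffer length must be a multiple of 4")
    return tuple(packed[channel::4] for channel in range(4))
```

Which map goes in which channel is purely a convention you pick (for example roughness in R, metalness in G); the shader or material just has to read the same channel back.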
but then I heard a person say that the PNG reader in 3dsMax is single-threaded and slower than TIFF.
That is true, the PNG reader is a little disaster. I managed to break and crash 3dsMax constantly by piping 40 PNG sequences into the Corona multi map.
I don't like how long Photoshop is taking to compress PNGs.
They addressed this with a compression complexity setting, but it is indeed crazy how long an 8k by 8k image can take to compress. PNG's zip-style (DEFLATE) compression leaves a lot of room for optimization. I guess that, for web use, Photoshop wants to offer a high compression ratio by crunching through settings to find the optimal ones, to the detriment of usability. At least Photoshop does this asynchronously now and doesn't lock up the whole program like it used to in the past.
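If you want a feel for the size-versus-time trade-off behind that setting, here is a rough sketch using Python's standard `zlib` module, which implements the same DEFLATE algorithm PNG uses. This is not how Photoshop does it; the helper name `compress_at_levels` is made up, and the sketch skips the per-scanline filtering a real PNG encoder applies before compressing, so treat the numbers as a trend only.

```python
import time
import zlib


def compress_at_levels(raw, levels=(1, 6, 9)):
    """Compress the same bytes at several zlib levels.

    Returns {level: (compressed_size_in_bytes, seconds_taken)}.
    """
    results = {}
    for level in levels:
        start = time.perf_counter()
        size = len(zlib.compress(raw, level))
        results[level] = (size, time.perf_counter() - start)
    return results


if __name__ == "__main__":
    # Stand-in for image data: a 1024 x 256 horizontal gradient, 1 byte per pixel.
    pixels = bytes((x // 4) & 0xFF for _ in range(256) for x in range(1024))
    for level, (size, seconds) in compress_at_levels(pixels).items():
        print(f"level {level}: {size} bytes in {seconds:.4f} s")
```

Higher levels search harder for matches, so they usually cost more time for a smaller file. On real 8k textures the gap between the fast and the slow end gets much more noticeable than on a toy gradient like this one.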