Some answers (but not all :) )
_Do render nodes need to be as powerful as the main machine? (Probably not)
They can be as powerful, more powerful, less powerful - it's all good.
_Sometimes I find myself short on RAM due to very high-poly-count scenes. Does that mean that all the render nodes should also have 128GB of RAM? (It seems quite unreasonable)
Yes, the RAM on each machine will need to be able to handle the scene. While there may technically be some savings from not having the full GUI of the DCC open, having less other software running, etc., it is safest to assume that if your scene won't fit in 64GB on your main machine and needs 128GB, then the render nodes will also need 128GB (else they won't be able to render the scene).
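If it helps to picture it, here is a tiny sketch of that rule of thumb: a node is only usable for a scene if its RAM covers what the scene needs. The node names and RAM figures are made up for illustration, and the check deliberately ignores any small savings from running without the GUI.

```python
def nodes_that_fit(scene_ram_gb, node_ram_gb):
    """Return the names of nodes whose RAM is at least what the scene needs.

    scene_ram_gb: RAM the scene needs on the main machine (assumed known).
    node_ram_gb:  mapping of hypothetical node name -> installed RAM in GB.
    """
    return [name for name, ram in node_ram_gb.items() if ram >= scene_ram_gb]


# Hypothetical farm: one 128GB workstation, one 128GB node, one 64GB node.
farm = {"main": 128, "node-01": 128, "node-02": 64}

# A scene that needs about 100GB can only go to the 128GB machines.
print(nodes_that_fit(100, farm))
```

In this made-up farm the 64GB node would simply sit out any scene that needs more than 64GB.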
_Is it better to buy a big and powerful (and expensive) machine like a TR 5995 with lots of ram, or is it fine to have many cheaper render nodes?
Again, either works. It may affect how results are "delivered", though. Say you have Machine A, or two Machine Bs that each have half the power. If Machine A delivers one frame of an animation per minute, then with the two Machine Bs you will wait 2 minutes before you see any final result, BUT you will get 2 frames at once and not one. It's never entirely accurate, but as a general guideline you can "add up" the power of all the machines combined. That said, note that once you get into a lot of machines you may start running into network issues with all the traffic, and would need to look into that as another aspect :)
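To put rough numbers on the Machine A versus two Machine Bs example, here is a minimal sketch of the "add up the power" guideline. It assumes each machine renders whole frames on its own and ignores network and scheduling overhead, so treat the results as ballpark figures only. The `Node` class and function names are just for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    minutes_per_frame: float  # time this machine needs to render one frame alone


def farm_throughput(nodes):
    """Frames per minute when every node renders its own frame in parallel."""
    return sum(1.0 / n.minutes_per_frame for n in nodes)


def time_to_first_frame(nodes):
    """Minutes until the first finished frame appears (the fastest node wins)."""
    return min(n.minutes_per_frame for n in nodes)


def estimated_total_minutes(nodes, frame_count):
    """Rough total time for an animation, assuming an ideal split of frames."""
    return frame_count / farm_throughput(nodes)


option_a = [Node("A", 1.0)]                   # one machine, 1 frame per minute
option_b = [Node("B1", 2.0), Node("B2", 2.0)]  # two machines at half the power

for label, nodes in (("A", option_a), ("2x B", option_b)):
    print(label,
          farm_throughput(nodes),
          time_to_first_frame(nodes),
          estimated_total_minutes(nodes, 100))
```

Both options come out at the same overall throughput for a 100-frame animation; the only difference in this idealized model is that the two slower machines make you wait longer for the first frame.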