Weighing in here briefly with some thoughts that I know I've already passed to the Corona devs over the years.
I kind of disagree that getting caustics to work reliably is almost impossible in complex scenes - that's maybe an overstatement, since we make it work regularly in very complex scenes - but I totally support the sentiment: it's way, way too fiddly currently. This is principally because caustics have only very basic optimisation/adaptivity in response to the scene. You often end up with poor-looking, noisy results, have to brute-force across the entire scene to get something decent, and almost always rely on some kind of post-processing fudging. The result is enormous wasted R&D time, rendering time and cost.
The main thing we're extremely keen to see is real, solid improvement in making caustics controllable where and when you want them, with corresponding improvements in quality and speed. The include/exclude feature is purely a compositing aid and has zero effect on speed or result. That's the key thing that needs to change somehow, IMO.
Tied into that is the fact that caustics are still resolution-dependent in terms of photons. Recent major releases handle this better - the overall "brightness" and appearance of caustics now stays much closer as you scale the resolution up and down - but it's still not working how it ought to. Photons Per Pixel is inherently problematic because it's tied to resolution, so the result changes drastically when you test at, say, 2K and then go to full res. I've even found I'm simply unable to render at 9K what looks good at 2K, because it uses up too much memory. I think photons will ultimately need to move to a world-unit space, similar to how displacement can work. The visual result should always be predictable across all resolutions, with memory usage scaling more reliably too.
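To illustrate the point, here's a minimal back-of-the-envelope sketch (not Corona's actual internals - the function names, photon record size and density figures are all hypothetical) of why a per-pixel photon control makes both the look and the memory footprint drift with resolution, whereas a world-unit density would not:

```python
# Hypothetical sketch: photons-per-pixel vs world-space photon density.
# None of these numbers or names come from Corona; they only illustrate
# the scaling behaviour described above.

BYTES_PER_PHOTON = 48  # assumed storage cost per photon record


def photons_per_pixel_count(width, height, photons_per_pixel):
    """Photon count under a resolution-dependent control."""
    return width * height * photons_per_pixel


def world_space_count(caustic_area_m2, photons_per_m2):
    """Photon count under a resolution-independent, world-unit control."""
    return caustic_area_m2 * photons_per_m2


# Same photons-per-pixel setting, 2K test render vs 9K final:
test = photons_per_pixel_count(2048, 1152, 4)
final = photons_per_pixel_count(9216, 5184, 4)
print(final / test)                      # ~20x more photons at 9K
print(final * BYTES_PER_PHOTON / 2**30)  # photon map memory in GiB grows with it

# With a world-space density the count depends only on the scene,
# so a 2K test predicts the full-res result exactly:
print(world_space_count(50.0, 2_000_000) == world_space_count(50.0, 2_000_000))
```

The point of the comparison: under the per-pixel model, the photon budget (and hence noise, brightness and memory) is a function of the frame buffer, not the scene; under a world-unit model it's a property of the scene geometry alone, which is exactly what makes low-res tests trustworthy.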
Unlike other areas of CG, with caustics you currently cannot just brute-force a clean solution by throwing a tonne of photons or passes at the problem, because doing so runs into unsolvable memory issues even on 256 GB beast machines like ours. Making brute force viable would be a first, "worst" solution IMO - if we could throw computing power at the issue and get a totally reliable, clean result, that would be OK for now. The better long-term solution is a properly optimised workflow, with Ondra's original "one click and done" working the way he envisioned it some years ago.