Author Topic: Big Studio Render Farm using Corona + Deadline  (Read 4351 times)

2017-12-07, 00:52:02

Phasma

  • Active Users
  • **
  • Posts: 112
    • View Profile
Hello.

We are 40+ artists at EVE images (70+ employees). Corona is our renderer of choice, and we want a reliable network rendering solution based on Deadline.

What we are currently doing:

I wrote a custom submitter for tile jobs. It submits UHD cache jobs separately and takes care of important Corona settings such as adaptive sampling. It does a lot more (like checking external files), but in a nutshell it makes sure that tiles rendered on different machines fit together without seams. That means no bloom/glare, no camera distortion, no adaptivity, no time limit, no noise level limit, and so on. We are basically disabling a lot of cool Corona features just to make it work.

That's bad.

I recently wrote tools and defined a new lighting standard at our office. Because it simulates lenses and cameras (among other things), the farm/distribution system we are looking for has to support all of the features listed above. Deadline's normal tile/jigsaw rendering cannot produce that with Corona. Even if camera distortion works at some point, sampling adaptivity might still not work, and so on. Adaptive sampling is extremely important when rendering nice bokeh, though, so my idea is now the following:

We have to use Corona DR, as it seems capable of handling all the required features. But instead of just starting some Corona DR slaves via Deadline, the job itself (the master) should also render on the farm, not locally.

This seems a bit tricky, though. I would split our 60 nodes into DR slaves and regular Max job slaves, maybe even dynamically, so the split adapts to the number of jobs coming in (I guess this is somehow possible with some code). But then what? Every submitted Max job (with Corona DR enabled internally) would need placeholder/wildcard render nodes in its Corona DR slave list so that it renders on just a fixed number of free DR blades. "Search LAN during render" would let one job use all free DR blades and block other jobs, but fixed blade names would lead to fixed DR blade groups that are not dynamic enough.
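One way to picture the "fixed number of free DR blades per job" idea is a small allocator that hands each queued job a capped share of the idle DR pool, instead of letting the first job grab everything. This is only a sketch under assumptions: the function name `allocate_dr_blades`, the job and blade names, and the per-job cap are made up for illustration and are not part of Corona or Deadline.

```python
from typing import Dict, List


def allocate_dr_blades(
    free_blades: List[str],
    queued_jobs: List[str],
    max_per_job: int,
) -> Dict[str, List[str]]:
    """Split idle DR blades across jobs in round-robin order, capped per job."""
    allocation: Dict[str, List[str]] = {job: [] for job in queued_jobs}
    if not queued_jobs or max_per_job <= 0:
        return allocation

    remaining = list(free_blades)
    # Hand out one blade per job per pass so a single job cannot take the whole pool.
    while remaining:
        progressed = False
        for job in queued_jobs:
            if not remaining:
                break
            if len(allocation[job]) < max_per_job:
                allocation[job].append(remaining.pop(0))
                progressed = True
        if not progressed:
            break  # every job reached its cap; leftover blades stay idle
    return allocation


if __name__ == "__main__":
    # Hypothetical blade and job names, purely for illustration.
    blades = [f"blade{i:02d}" for i in range(1, 8)]
    print(allocate_dr_blades(blades, ["jobA", "jobB"], max_per_job=3))
```

The resulting per-job list would still have to be written into each job's DR slave list by whatever mechanism the submitter uses; that part is not shown here.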

I would like to find a solution that:

a: is capable of using all features of corona
b: utilizes and scales to all of the blades dynamically
c: is reliable
d: can handle 50+ jobs a day

It does not matter if this involves some coding, but it needs to be done. Also, if someone has a better idea, please feel free to share.

Thanks in advance :-)

PS. I know that Thinkbox/Amazon is working together with RenderLegion/ChaosGroup to solve some of these issues. This is all fine and good but we need a solution very soon, so sorry if you feel pushed ;-)

2017-12-07, 11:28:07
Reply #1

maru

  • Corona Team
  • Active Users
  • ****
  • Posts: 12800
  • Marcin
    • View Profile
I can only repeat the ending - we are discussing how some of the things can be improved between Corona and Deadline, and everything is going in the right direction. If you are experiencing some specific problems with Corona+Deadline, please report them to us, and we will look into this (support@corona-renderer.com).

Other than that I would advise to always use the newest version of Deadline and Corona as the compatibility is continuously being improved.
Marcin Miodek | chaos-corona.com
3D Support Team Lead - Corona | contact us

2018-01-04, 16:01:29
Reply #2

Phasma


2018-01-09, 10:25:03
Reply #3

maru

As I wrote previously, it would be best if you could contact us at support@corona-renderer.com with the specific issues that you believe we should sort out with Deadline. Right now I am not exactly sure how we could help you, as the main message contains a lot of information.

There is also this app, which you may be interested in, although I am not sure how far along it is in development: https://rendernodemonitor.com/

2018-01-30, 11:31:32
Reply #4

Phasma

Thanks, but we have spent a lot of money on Deadline and we use a lot of its features. We will not step back to a much simpler solution that, in the end, only controls Corona DR slaves, which can already be done with Deadline alone.

So most importantly, we need adaptivity, denoising, bloom/glare and camera distortion to work with Deadline tiles.

We still hope that at some point at least some of those issues will be fixed. Otherwise we will consider switching back to V-Ray. Sadly, that threat is not much of a lever for speeding up development anymore.
« Last Edit: 2018-01-30, 11:37:37 by Phasma »

2018-02-08, 14:27:14
Reply #5

maru

Hi, did you eventually contact us at support@corona-renderer.com about your issues? (or Deadline support)

2018-02-10, 00:40:16
Reply #6

Phasma

It was a while ago. I wrote to Michal directly (mostly trying to fix all these issues with the external Corona Image Editor) and he kept me informed about the status of the Thinkbox/Corona development. The last update was in August last year... "working on it".

However, I can address all these issues again in a new email to support@corona-renderer.com if that somehow helps...

2018-03-24, 21:14:30
Reply #7

cgifarm

  • Active Users
  • **
  • Posts: 55
  • Your Brand New RenderFarm
    • View Profile
    • CGIFarm
Hi,

A while ago I wrote a power management script which assigns nodes to a specific job using the Deadline Web Service API.

Here's what it does:

1. Send a signal to power on a node via SSH. This can also be implemented for cloud solutions by creating a VM using the API.

2. Assign the node to a certain job. To have a job render on a certain number of nodes without mixing them up through "search LAN", you need to specify which node should render what. Deadline supports a machine limit with a whitelist of slaves that can render a job. The list is then given to the first node (the "master node") and appended to the DR list with MAXScript.

3. When the denoising step is detected, it marks the slaves as completed and leaves just the master node rendering.

Note: to achieve this I had to modify some Deadline files as well; the last upgrade was for Deadline 10.

4. The power management script detects running jobs with queued frames and assigns them the powered-on nodes that have finished other jobs. If no jobs are queued, it shuts the nodes down after a certain number of minutes of inactivity.
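The decision logic in steps 2 and 4 can be sketched roughly as below. This is not the actual script, only an illustration under assumptions: the `Node` record, the `plan_actions` function and the "one node per queued task" rule are invented here, and the real power-on, assignment and shutdown calls (SSH, Deadline Web Service) are left out entirely.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    name: str
    powered_on: bool
    current_job: Optional[str]  # None means the node is idle
    idle_since: float           # timestamp (seconds) when it last became idle


def plan_actions(
    nodes: List[Node],
    queued_tasks: Dict[str, int],
    now: float,
    idle_timeout: float,
) -> Tuple[Dict[str, str], List[str]]:
    """Return (node -> job assignments, names of nodes to shut down)."""
    assignments: Dict[str, str] = {}
    shutdowns: List[str] = []
    # Only jobs that still have queued work are candidates (assumed: one node per task).
    pending = {job: count for job, count in queued_tasks.items() if count > 0}

    for node in nodes:
        if not node.powered_on or node.current_job is not None:
            continue  # skip nodes that are off or still busy
        if pending:
            # Give the idle node to the job with the most queued tasks.
            job = max(pending, key=lambda j: pending[j])
            assignments[node.name] = job
            pending[job] -= 1
            if pending[job] == 0:
                del pending[job]
        elif now - node.idle_since >= idle_timeout:
            # Nothing queued and the node has been idle long enough.
            shutdowns.append(node.name)
    return assignments, shutdowns
```

Keeping the planning step free of side effects like this makes it easy to test on its own before wiring it to real power and queue commands.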

To further filter your machines if they have different configurations, you can use Deadline groups assigned to your machines.
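As a rough illustration of combining a whitelist with a group, the helper below renders a fragment of a Deadline job info file. Treat it as a sketch: `build_job_info` is a hypothetical helper, the key names (`Plugin`, `Name`, `Whitelist`, `MachineLimit`, `Group`) are given as I recall them from Deadline's manual job submission documentation and should be checked against your Deadline version, and setting the machine limit to the whitelist size is an assumption. A real submission also needs a plugin info file, which is not shown.

```python
from typing import List, Optional


def build_job_info(
    name: str,
    whitelist: List[str],
    group: Optional[str] = None,
) -> str:
    """Render a key=value job info fragment restricting a job to given nodes."""
    lines = [
        "Plugin=3dsmax",
        f"Name={name}",
        # Only the listed machines may pick up the job (key name assumed).
        f"Whitelist={','.join(whitelist)}",
        # Assumption: allow as many machines as there are whitelisted nodes.
        f"MachineLimit={len(whitelist)}",
    ]
    if group:
        # Optional hardware filter via a Deadline group.
        lines.append(f"Group={group}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical job, node and group names.
    print(build_job_info("shot01_beauty", ["blade01", "blade02"], group="corona_dr"))
```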

I can help implement this in your environment as well if you are interested; it will take a couple of days to test everything properly.

If you want to implement this yourself, feel free to ask any questions here in this topic or get in touch via my Skype ID, CGIFarm.

I wish you the best!

Alex
Working on a Renderfarm Platform - checkout our website cgifarm.com and our cost calculator : https://www.cgifarm.com/renderfarm-cost-calculator

2018-03-26, 10:21:20
Reply #8

Phasma

Hi, thanks for this insight!

I will also try it again now with Deadline 10! If I run into issues, would you mind if I contact you?

2018-03-26, 12:07:14
Reply #9

cgifarm

Hi,

Feel free to contact me at any time.

There's a single-frame distributed rendering function implemented in Deadline for V-Ray, which needs to be replicated for Corona in order to assign the nodes that should render the job. Most of it is copy/paste and adding the class for Corona DBR to the main Deadline plugin for Max. There are two types of job submission through the API: command line and through the Deadline plugin. I remember we chose the Deadline plugin, which handles the pop-ups from 3ds Max for various plugin and DLL errors; it depends on how you decide to work with this. The command line could be more powerful, but there is more you need to take care of, and the guys from Deadline implemented quite a few checks in their plugin which you can take advantage of.

I also had to modify the slave Python script that runs when launching Corona DBR so that it launches with the proper 3ds Max version. If your nodes have just one 3ds Max version installed, you should be fine without that. Just the master node needs to properly load the slave list from the config file that you will be saving.

Good luck!

Alex
« Last Edit: 2018-03-28, 14:28:21 by cgifarm »

2018-03-28, 14:16:40
Reply #10

Phasma

thanks a lot! I will try it soon!