Topic: Team Render server crashing when rendering 1 frame on multiple clients

2020-07-09, 13:14:56

JPeters

Hi there,

We are running into an issue using Corona (both version 5 and the latest daily build of 6, from June 30) with a Team Render server + clients setup.
The problem is as follows:

When we try to render a C4D scene that has 1 frame (so it splits the one frame across multiple machines), the entire render server crashes. When we restart the render server, the job we just started is shown as "complete" without any results.
If we have started multiple projects, the server will be rendering a different project from the one we started when it comes back online.
We are trying to render one frame on multiple machines to speed up the rendering process.

When we take the same exact scene, give it 2 frames (so it will render 1 frame per client), the render server works fine and there are no problems.

With Corona 5, when rendering 1 frame on multiple machines, even when the render finishes there is a color difference in the image (it looks like a different exposure, or as if some lights were switched off), which does not happen when we render the same frame on 1 dedicated machine.
With Corona 6, when rendering 1 frame on multiple machines, there is no color difference anymore.

With Corona 5, when rendering 1 frame on multiple machines, the render server crashes as soon as you start a job like this.
With Corona 6, when rendering 1 frame on multiple machines, you can start 1 job safely, but as soon as you launch a second job the render server crashes.

(In all of these cases, all machines are running the exact same Corona version, so I imagine a version mismatch cannot be the issue.)

=====================

I've attached the bug report zip of the latest crash with Corona 5 (I believe), called _BugReport.zip. This is the crash that happens when you try to launch any job that splits over multiple machines.
I've also attached the bug report of the latest crash with Corona 6 (30 June), called _BugReport_Corona6.zip. This is the crash that happens when you try to launch more than one job.

As I said it happens with multiple Corona versions.

This is a rather serious issue for us, as we render a combination of animations and high-res single images most days, and having the render farm crash all the time is rather frustrating.

Cheers,

Joep

« Last Edit: 2020-07-09, 13:20:18 by JPeters »

2020-07-09, 15:17:22
Reply #1

TomG

  • Administrator
A few quick questions while we start to take a look.

- The color difference. Note that in Corona 5, the VFB uses a different color space than the Cinema 4D Picture Viewer. When you say the TR version looks different, are you comparing the result from TR to the result from the VFB? In Corona 6, there is a new option to use the C4D color space in the VFB, so you know what your image will look like when rendered using TR or the Picture Viewer (because the VFB is using the same color space). Could that be the cause?

- How many machines are in your network?

- Are you submitting to TR via the web server, or directly from inside C4D? It looks like the web server, as you mention being able to see job status etc.

EDIT: extra question

- Is the TR Server running on a machine that is also a TR Client? Or is it a standalone machine that just acts as the server and doesn't contribute to rendering?

Tom Grimes | chaos-corona.com
Product Manager | contact us

2020-07-09, 15:24:34
Reply #2

JPeters

Hi Tom,

Thanks for the reply. To answer your questions:

- The color difference is gone with Corona 6, so I imagine this was indeed the color space issue. However, I was comparing the TR result to the render written out by C4D when the render finished (not using the VFB).

- For this test, we used 1 server and 4 clients, all running the same Corona version (5, and 6 build 30 June).

- We are indeed submitting jobs via the web server, as from within C4D you can't really keep track of what people submit.

- The TR server is running on its own dedicated machine, which is not rendering or running a client.

Cheers,

2020-07-09, 15:57:09
Reply #3

TomG

I tested on 1 Server and 2 clients, with the Server as a standalone machine (no client or rendering on it).

- Submitting a job to render 1 frame with the 2 clients did not cause a crash and worked as expected.

- Submitting two jobs to render, and starting them both in the Server, did indeed cause a crash.

We would say the color difference is something that is fixed in 6 (as in, the color difference no longer appears, thanks to the use of the C4D color space in the VFB, which ensures that what you see in the VFB is what you will get from TR or the PV).

Both tests were done with the Corona 6 daily build from June 30th.

2020-07-09, 16:22:15
Reply #4

JPeters

Hi Tom,

"Good" to hear it crashed for you as well. Hopefully you are able to find a solution for the next daily build, or a workaround in the meantime.
It has improved in the Corona 6 daily build from June 30th compared to Corona 5: in Corona 5 you mostly could not even start 1 job without a crash, whereas with Corona 6 you can at least run 1 job at a time.

Cheers,

Joep

2020-07-22, 09:34:30
Reply #5

JPeters

Hi Tom,

Do you have any news on this one, or know whether a fix is being worked on?

I found a second issue with rendering a single image on multiple machines: it seems the Corona Sky is getting disabled when you render like this. This only seems to happen when you render a single image on multiple machines.

Rendering that single image on one machine works fine.

2021-04-28, 15:39:23
Reply #6

HFPatzi

Hi Tom,

I'd like to revive this thread, since we are setting up a render server right now and have the same problems with the crashing clients when starting two jobs right away. When using Cinema's standard renderer, the second job queues as expected when started. It seems like Corona is starting the second job right away. Is there anything we can do to prevent this issue, other than waiting until a job finishes and then starting the next one?
It would be really great to have the queueing working, since it is pretty complicated to always coordinate with all users, and also, as JPeters mentioned, after the client crashes the currently running job is marked as finished without any output.

Thanks in advance!

Greetings from Germany,

Moritz

2021-04-28, 16:44:53
Reply #7

TomG

See the news from today about the hotfix :)

https://forum.corona-renderer.com/index.php?topic=33106.0
"- Fixed Team Render Server crashes when starting multiple single-image renders"

Please test and let us know how it works for you.

2021-04-28, 17:16:14
Reply #8

HFPatzi

Damn, that was fast :D

All right, as soon as we have tested it, I'll get back to you.

Thank you!

2021-04-29, 09:47:45
Reply #9

mmarcotic

  • Former Corona Team Member
  • Jan - C4D QA
Hello,

Please go ahead and try it. This was actually fixed in V7 Daily Build 17-02-2021 (https://forum.corona-renderer.com/index.php?topic=30837.msg181960#msg181960) and we have backported it to the V6 hotfix. In the future, I will remember to also let users know in the individual threads rather than just in the changelog.

Thanks,
Jan
Learn how to report bugs for Corona in C4D here.

2021-04-29, 11:01:23
Reply #10

HFPatzi

Hi Tom & Jan,

We tested it and it works almost fine. Two things though:

1. Let's say I have two render clients, one job rendering on both clients, and one job in the queue. If one of the clients finishes faster than the other, it immediately grabs the next job in the queue and starts to render, which is fine. But when the other client finishes rendering the first job, it won't start rendering the next job, which has already been started by the other client. It just stays idle, and the second job is rendered only by the one client which grabbed it right away.

2. I'm not sure if there's already a thread about this, but the render results are quite different compared to when I render the same job on my local machine. The result from the render server looks pretty overexposed and seems to be way more saturated compared to the local machine render. See the attached screenshot. I did another test where I put all used image textures in the Corona Bitmap shader, but the result is the same. I uploaded my test scene via Dropbox. The filename is "showreel_wall_04.zip".
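
To make point 1 concrete, here is a toy sketch in plain Python. It is not Team Render or Corona code: the function `assign_clients`, its `join_running` flag, and the client and job names are all made up, and the finish times are arbitrary. It only models the behaviour described above, where a late client either joins the queued job that another client has already started (what I would expect) or stays idle (what we see).

```python
from collections import deque

def assign_clients(finish_times, queue, join_running):
    """Toy model of what each client does after finishing the first job.

    finish_times: hypothetical time at which each client finishes the first job.
    queue:        names of jobs waiting in the queue.
    join_running: True  -> a client may join a job another client already started (expected).
                  False -> a client only takes jobs nobody has started yet (observed).
    Returns a dict mapping each client to the job it picks up next, or None if it idles.
    """
    pending = deque(queue)  # jobs nobody has started yet
    running = []            # jobs already started by at least one client
    picked = {}
    # Clients look for new work in the order in which they finish the first job.
    for client in sorted(finish_times, key=finish_times.get):
        if pending:
            job = pending.popleft()
            running.append(job)
            picked[client] = job
        elif join_running and running:
            picked[client] = running[0]
        else:
            picked[client] = None
    return picked

finish_times = {"client_1": 10, "client_2": 25}  # arbitrary example values
observed = assign_clients(finish_times, ["job_B"], join_running=False)
expected = assign_clients(finish_times, ["job_B"], join_running=True)
print("observed:", observed)
print("expected:", expected)
```

In the "observed" case the slower client ends up with no job at all, which matches what we see on the farm.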

Thanks for your support!

Greetings from Germany,

Moritz

Edit: All computers involved are PCs running Windows 10.

2021-04-30, 17:11:44
Reply #11

HFPatzi

OK, for the second issue I found something out. It seems that the settings for photographic exposure are not really linked between the Corona Camera tag and the actual render settings. In my scene I had photographic exposure ticked on in the camera tag, but these settings (exposure, ISO, f-stop, etc.) were completely ignored by the Team Render server / Team Render clients. I had to manually set the same settings in the global render settings, although they were already set in the camera tag and the little icon behind every box says that these settings got overwritten by the camera tag. In my opinion this is pretty dangerous and tedious, especially if you have more than one camera in a scene and always have to set the exposure settings manually in the global render settings again.
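
As a rough illustration of the mismatch, here is a small Python sketch. It does not use the Corona or Cinema 4D API: `Exposure`, `CameraTag`, `effective_exposure` and the `honour_tag` flag are hypothetical names, and the numbers are made-up values. It only shows the override rule I would expect (the camera tag wins when photographic exposure is ticked) next to what Team Render appears to do (global render settings only).

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Exposure:
    iso: float
    f_stop: float
    shutter_speed: float  # expressed as 1/seconds

@dataclass
class CameraTag:
    photographic_exposure: bool  # the "photographic exposure" tick box in the tag
    exposure: Exposure

def effective_exposure(global_settings: Exposure,
                       camera_tag: Optional[CameraTag],
                       honour_tag: bool = True) -> Exposure:
    """Return the exposure values a render would use in this toy model.

    honour_tag=True  models the expected behaviour: an enabled camera tag overrides the globals.
    honour_tag=False models the observed Team Render behaviour: the tag is ignored.
    """
    if honour_tag and camera_tag is not None and camera_tag.photographic_exposure:
        return camera_tag.exposure
    return global_settings

# Made-up example values, not taken from the test scene.
global_settings = Exposure(iso=100.0, f_stop=16.0, shutter_speed=50.0)
tag = CameraTag(photographic_exposure=True,
                exposure=Exposure(iso=400.0, f_stop=8.0, shutter_speed=125.0))

local_render = effective_exposure(global_settings, tag)
team_render = effective_exposure(global_settings, tag, honour_tag=False)
print("local:", local_render)
print("team render:", team_render)
```

With more than one camera in the scene, each tag would carry its own values, which is why copying them into the global render settings by hand does not scale.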

Nevertheless, have a nice weekend! ;)

Greetings,
Moritz

2021-05-03, 09:57:44
Reply #12

mmarcotic

Thank you,
I have added the Corona Camera Tag issue to our internal site (Internal ID=683937558).

Thanks,
Jan