Author Topic: Team render issues  (Read 13838 times)

2019-02-07, 13:28:23

kizo

  • Active Users
  • **
  • Posts: 28
    • View Profile
Hi,

Like some other users, I also have issues with TR. The issues are clients disconnecting and differences in output, plus other limitations like not being able to save the CXR format.

As the beta will expire on the 21st of Feb., we are trying to prepare the workflow, but the TR issues just seem to prevent production. In its current state I feel TR is totally useless and unreliable, and in fact any node licences are worthless, which is a shame for a newly released product. I love Corona and hope to keep using it. We have NO issues working on single machines, but distributed rendering, using the full potential of our hardware, is essential.
I have been trying to get some help from official support, but unfortunately it takes way too long to get any answer, so I will try here. Maybe I will have better luck.



TEAM RENDER ISSUE: CLIENTS DISCONNECTING AND SLOW RENDER TIME

I have spent a lot of time testing things out. I'm not a network specialist, so it's rather hard to understand what's going on and whether the network itself is causing issues, but I ran a series of tests to identify how different settings change the results. Note that we used the same machines and network setup with V-Ray DR and Thea Render distributed (CPU+GPU on the network) without similar issues.

We are running a 1 Gb/s network and have a dedicated range of fixed IPs for the machines. I have adjusted all firewall settings, disabled Bonjour, and I'm using IP addresses to connect. I have followed the Cineversity series of videos on TR and made all the adjustments. The Corona License Server is running. Tests were made with the hotfix release.

I made all tests using 9 machines.

From the tests I have made so far, one conclusion is that the smaller the packet size I set in the Corona render settings, the less the nodes disconnect.

In fact, with a size of 5 or 10 and the interval always at 10 s, none of the nodes disconnected. With a size of 50 they start disconnecting, and with 100, 200 or even 500 they also disconnect, seemingly sooner. The error we get when a node disconnects is: "Frame synchronization failed: Communication error", but if I ping or test the connection the node is available, so I can only assume there were too many chunks at the same time - though I'm not sure. Then again, if that were the case, smaller chunks would mean a lot more of them at the same time... yet with smaller chunks no nodes disconnect at all.

A small chunk size seems to be OK for lower-resolution renders. The tests I made with a 1K square render, a small packet size of 10 and a 10 s interval rendered without any nodes disconnecting. With all nodes, the scene would render to the set noise level in 5-6 min (the same scene on a single 2990WX took 19 min).

On a 4K square render of the same scene, some interesting facts emerge:

A) packet size = 10 MB // render time (stamped) = 0:42:00 // no nodes disconnected

B) packet size = 100 MB // render time (stamped) = 0:21:34 // 4 nodes disconnected (3 almost at the same time, some 3-4 min after starting the render)

Test A spent around a quarter of its total time just collecting the chunks: it finished rendering in 30 min and took another 10 min to gather all the chunks.

In test B the render finishes in 21 min. Even though 4 of the machines disconnected, this is way faster - but the issue is that not all of the power is used.

I have tried many combinations, changing the interval only, the packet size only, or both. It's a very time-consuming process. In all my tests, nodes disconnect with larger packets while they don't with smaller ones - but then rendering takes way too long.


PICTURE VIEWER OUTPUT DIFFERENT FROM VFB

In all the tests made so far, whether using TR through the PV (the only possible way) or on a single machine, there is a difference: the image saved from the PV is always darker.
The C4D project settings are set to linear workflow and sRGB.

I also had a situation where the lights didn't match between the .jpg saved from the PV and the .jpg saved from the VFB. It seems the VFB is showing and saving the LightMix while the PV shows the beauty pass. That would explain why the color and intensity of the light were different. Please check the attached images.
If the above is true, how would one save a non-layered format out of the PV and have it match the VFB?

RENDER TIME INCONSISTENCY

While testing I noticed that different render times are reported for the same job. I'm not sure why that is, but it makes it harder to know the real render time and to estimate. Check the attached example.

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

It would be great to get any help from support, but I'm looking forward to user input too. Is anyone out there using TR without the above issues?

I must say I'm rather disappointed with how network rendering works. I have 3 x Corona + 3 nodes licences and one Corona + 10 nodes licence, along with 10 machines waiting to render. And I'm now facing a major slowdown in production and can't find a solution. I'm not getting any help from support either.
It feels like the beta stage should have lasted longer, even though it was already rather long. But making distributed rendering work should be part of a released product, and unfortunately Cinema's Team Render doesn't seem to be the best choice for that.

Cheers
kizo









2019-02-07, 14:05:10
Reply #1

TomG

  • Administrator
  • Active Users
  • *****
  • Posts: 5434
    • View Profile
Hi Kizo!

To me, it sounds like your network is being overloaded with traffic. 10 machines is a lot to run on a 1 Gb network, and I know there are people with fewer machines who are using a 10 GbE network, so I am thinking collisions on the network are the problem, where "keep alive" messages sent between the machines to confirm they are still there are being lost, and so either the Client concludes the Server isn't there, or vice versa.

This also fits with the fact that a larger packet size causes the drop-outs while a smaller packet size does not. A test that might help confirm this: you mention "packet size = 100 MB // render time (stamped) = 0:21:34 // 4 nodes disconnected" - it would be interesting to know whether, with the 100 MB packet size and only the workstation + 2 clients, a disconnect happens. Then workstation + 4 nodes, and so on. If the larger packet size works when fewer nodes are in use, that would also suggest the network is becoming overloaded, with some keep-alives getting lost among data collisions.
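To put rough numbers on the hypothesis - this is only a back-of-envelope sketch, where the ~110 MB/s usable figure for gigabit is my assumption, and the 9 nodes and 10 s interval come from your post:

Code:
# Rough estimate of aggregate TR traffic vs. what a gigabit link can carry.
# Assumptions (not measured): each node ships one packet per interval, and
# a 1 Gb/s link sustains roughly 110 MB/s of useful payload in practice.

USABLE_MBPS = 110.0  # assumed practical ceiling for 1 Gb ethernet, in MB/s

def aggregate_load(nodes, packet_mb, interval_s):
    """Average MB/s arriving at the server from all render nodes."""
    return nodes * packet_mb / interval_s

for packet_mb in (5, 10, 50, 100, 200, 500):
    load = aggregate_load(nodes=9, packet_mb=packet_mb, interval_s=10)
    print(f"{packet_mb:>3} MB packets: {load:6.1f} MB/s "
          f"({load / USABLE_MBPS:6.1%} of usable bandwidth)")

If those assumptions are anywhere near right, 5-10 MB packets use under a tenth of the link, 50 MB is around 40%, and 100 MB averages roughly 80% - before even accounting for bursts - which lines up rather well with where your disconnects start.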

(BTW, you should have heard all this from support already)

The other possibly interesting test would be to use Team Render with one of the native C4D render engines - I haven't looked into whether they have similar control over packet size, but if they do, you could try to replicate the same settings there and see if the same issue appears.

A search for 10gbe will show threads where people are using 10 gig ethernet networks, so it could be that this is just a lot of machines and a lot of data - more than the 1 gig network can manage. I haven't looked too far into the question for C4D in general, but I do see posts like this https://www.c4dcafe.com/ipb/forums/topic/92259-team-render-woes-bandwidth-issues/ where it is suggested that a gigabit network can do about 100 MB/s, a figure I thought was interesting since that is the packet size at which you mention things start to drop out.

We're continuing to look into this and to discuss the subject of networking here, including considering whether some home-grown networking solution may be better than C4D TR, but in the meantime this could just be a limitation of C4D TR.

If you can do those tests (100 MB packet size with fewer nodes; TR with a native render engine at similar packet sizes, if controllable), that could further confirm whether the network is the issue. Let us know!
Tom Grimes | chaos-corona.com
Product Manager | contact us

2019-02-07, 14:14:34
Reply #2

kizo

  • Active Users
  • **
  • Posts: 28
    • View Profile
Hi Tom,

I will try to do further testing.
But if the network is the issue, how would you explain that rendering with V-Ray DR or Thea Render distributed a few years back worked without any issues?

Also, if the network is the bottleneck, wouldn't it make more sense for it to get congested with far more packets being sent at the same time, as when a smaller packet size is used?

In any of these scenarios it seems all my node licences won't be of any use, and if the issue is Team Render related, I can see the Corona team pointing to Maxon to solve it.
If you do some research, you can see TR issues with other engines as well as with the C4D ones. These issues have been around for a long time - it's no news. And that makes Corona's choice to use TR even stranger.

As I stated in my emails, all I need is to be able to continue production. As it seems now, that won't be possible.

Cheers
kizo

2019-02-07, 14:25:39
Reply #3

TomG

  • Administrator
  • Active Users
  • *****
  • Posts: 5434
    • View Profile
Using TR isn't strange, in that if there is a good existing solution, it makes sense not to spend development time replacing it. Whether TR is a good solution, though - the jury is still out on that :)

TY for the further testing, it will be interesting to know the results! And as a note, we are not pointing fingers at Maxon, nor expecting them to solve it - in part because we are using TR in a somewhat different way than its native intention. What we are trying to find out is whether the way we use it supports 10 machines across a 1 gig network. If it turns out that it doesn't, we will have to consider what the next steps will be. It's certainly much rarer for people to be using that many clients, making it an edge case with much less data for us to investigate, which is why your testing is very valuable to us.
Tom Grimes | chaos-corona.com
Product Manager | contact us

2019-02-07, 14:29:28
Reply #4

TomG

  • Administrator
  • Active Users
  • *****
  • Posts: 5434
    • View Profile
The other interesting test would be changing the interval at which data is sent - this could also reduce the likelihood of collisions. That's the second part of the manual TR settings for Corona, and may be something to test too. Unfortunately, at the moment I don't have any suggested values to try, but raising it (so that machines send data less often) may also be a way to get everything working as required. Hope this helps!
Tom Grimes | chaos-corona.com
Product Manager | contact us

2019-02-07, 14:38:21
Reply #5

kizo

  • Active Users
  • **
  • Posts: 28
    • View Profile
Hi Tom,

Thanks for the further explanation.

Could you please reply to my questions:

If the network is the issue, how would you explain that rendering with V-Ray DR or Thea Render distributed a few years back worked without any issues?

Also, if the network is the bottleneck, wouldn't it make more sense for it to get congested with far more packets being sent at the same time, as when a smaller packet size is used?

Thanks
kizo


P.S. I understand 10 nodes might not be so common, and I'm willing to test anything you like just to get things working. If you need more specific tests, just let me know. I have a huge load of work, but I will find the time for that.
« Last Edit: 2019-02-07, 14:43:03 by kizo »

2019-02-07, 14:41:25
Reply #6

TomG

  • Administrator
  • Active Users
  • *****
  • Posts: 5434
    • View Profile
I can't comment on V-Ray and Thea Render, as I don't know how they handled distribution compared to how we use TR.

And not necessarily on the packet size - smaller packets sent more often leave more gaps for the keep-alives to go in between them. Think of a highway: when lots of small cars are joining, there is always a gap possible - by delaying one car you can slip in another (a keep-alive). But when a truck made of 18 connected trailers is joining the highway, there is no gap, and our keep-alive car has to sit and wait for the whole super-long truck to get onto the highway before it can join (by which time it may be too late for it to reach its destination).

(EDIT: which is why sending the super-large trucks less often may open up gaps for the keep-alives to get in, as one possibility)
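To make the analogy a bit more concrete - a quick sketch, again assuming packets serialize onto the wire one after another at roughly 110 MB/s (my assumption, not a measured figure):

Code:
# How long a keep-alive might wait behind chunks already on the wire.
# Sketch only: assumes whole packets serialize back-to-back at ~110 MB/s.

LINK_MBPS = 110.0  # assumed usable MB/s on 1 Gb ethernet

def worst_case_wait(packet_mb, nodes_sending):
    """Seconds a keep-alive waits if several nodes' packets queue ahead of it."""
    return nodes_sending * packet_mb / LINK_MBPS

for packet_mb in (10, 100, 500):
    print(f"{packet_mb:>3} MB packets, 4 nodes sending at once: "
          f"{worst_case_wait(packet_mb, 4):5.1f} s wait")

With 10 MB packets, a keep-alive stuck behind four in-flight chunks waits about 0.4 s; with 100 MB packets, about 3.6 s, which can already look like a dead connection; with 500 MB, around 18 s - by that point a timeout seems almost guaranteed.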
Tom Grimes | chaos-corona.com
Product Manager | contact us

2019-02-07, 14:47:33
Reply #7

kizo

  • Active Users
  • **
  • Posts: 28
    • View Profile
I can't comment on V-Ray and Thea Render, as I don't know how they handled distribution compared to how we use TR.

And not necessarily on the packet size - smaller packets sent more often leave more gaps for the keep-alives to go in between them. Think of a highway: when lots of small cars are joining, there is always a gap possible - by delaying one car you can slip in another (a keep-alive). But when a truck made of 18 connected trailers is joining the highway, there is no gap, and our keep-alive car has to sit and wait for the whole super-long truck to get onto the highway before it can join (by which time it may be too late for it to reach its destination).

(EDIT: which is why sending the super-large trucks less often may open up gaps for the keep-alives to get in, as one possibility)

A good description of the packet size thing, thanks - much better suited to my non-tech brain.

As for commenting on V-Ray or any other engine, that's a bit of a political answer. I don't know the ins and outs of the tech behind them, but I do know I rendered distributed on the same network with 10 and more machines and never had issues. From my POV it's a good indication that something is wrong with either TR or Corona's usage of it. BTW, none of them used TR - that might tell something.
« Last Edit: 2019-02-07, 14:53:00 by kizo »

2019-02-07, 15:34:22
Reply #8

kizo

  • Active Users
  • **
  • Posts: 28
    • View Profile
Tom, I would like to get an answer on the color difference issue too, if possible. Here is the issue again for easier following, but please check the attachments in the first post of this thread.

Thanks


"PICTURE VIEWER OUTPUT DIFFERENT FROM VFB

In all the tests made so far, whether using TR through the PV (the only possible way) or on a single machine, there is a difference: the image saved from the PV is always darker.
The C4D project settings are set to linear workflow and sRGB.

I also had a situation where the lights didn't match between the .jpg saved from the PV and the .jpg saved from the VFB. It seems the VFB is showing and saving the LightMix while the PV shows the beauty pass. That would explain why the color and intensity of the light were different. Please check the attached images.
If the above is true, how would one save a non-layered format out of the PV and have it match the VFB?
"

2019-02-07, 16:05:19
Reply #9

TomG

  • Administrator
  • Active Users
  • *****
  • Posts: 5434
    • View Profile
I'll have to leave that one to someone else, as I haven't done any research into that area :) One for Ben, or the devs!
Tom Grimes | chaos-corona.com
Product Manager | contact us

2019-02-07, 17:53:04
Reply #10

houska

  • Former Corona Team Member
  • Active Users
  • **
  • Posts: 1512
  • Cestmir Houska
    • View Profile
Tom, I would like to get an answer on the color difference issue too, if possible. Here is the issue again for easier following, but please check the attachments in the first post of this thread.

Thanks


"PICTURE VIEWER OUTPUT DIFFERENT FROM VFB

In all the tests made so far, whether using TR through the PV (the only possible way) or on a single machine, there is a difference: the image saved from the PV is always darker.
The C4D project settings are set to linear workflow and sRGB.

I also had a situation where the lights didn't match between the .jpg saved from the PV and the .jpg saved from the VFB. It seems the VFB is showing and saving the LightMix while the PV shows the beauty pass. That would explain why the color and intensity of the light were different. Please check the attached images.
If the above is true, how would one save a non-layered format out of the PV and have it match the VFB?
"

Hi kizo,

After reading your description, it seems to me that you somehow managed to save only the non-LightMix version of the image out of the PV. The functionality of the "Save as..." option of the C4D Picture Viewer depends on the current layer mode (Image vs. Single-Pass vs. Multi-Pass) and possibly on which layer you have selected.

As for the different lightness, we are aware of a very slight (almost imperceptible) difference between the PV and the VFB - probably a result of different sRGB handling. But it seems (based on your description) that you have a much bigger difference than that. Might I ask whether you have any PostProcessing filters enabled? And if so, are the results from the PV and the VFB the same after disabling PostProcessing?
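For what it's worth, the classic symptom of a missing sRGB conversion is exactly "the saved file is much darker". A tiny illustration - this is just the standard sRGB transfer function from IEC 61966-2-1, not anything Corona-specific:

Code:
# Why an image written with raw linear values looks darker than the same
# image displayed through an sRGB-aware frame buffer.

def linear_to_srgb(x):
    """Standard sRGB encode of a linear value in [0, 1]."""
    return 12.92 * x if x <= 0.0031308 else 1.055 * x ** (1 / 2.4) - 0.055

for linear in (0.05, 0.18, 0.50):
    print(f"linear {linear:.2f} -> sRGB {linear_to_srgb(linear):.2f}")

Linear mid-grey (0.18) encodes to roughly 0.46 in sRGB, so a file saved with linear values but viewed as sRGB appears dramatically darker - a much bigger difference than the slight PV/VFB mismatch mentioned above. If your saved images are off by that much, a skipped color-space conversion somewhere in the save path would be my first suspect.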

2019-02-07, 19:53:47
Reply #11

Nelaton

  • Active Users
  • **
  • Posts: 56
    • View Profile
Hello Houska,

I think kizo is saying that we cannot automatically save the result of the LightMix interactive pass as we can in the Corona frame buffer.

Moreover, besides this problem of correspondence between the two viewers, we are having exactly the same problem as kizo regarding Team Render and still rendering: clients are disconnecting when raising the packet size + interval in manual mode. So I +1 everything he says; to us, the problem is not on our side.

Regards,

Nelaton
« Last Edit: 2019-02-07, 20:01:57 by Nelaton »

2019-02-07, 19:57:53
Reply #12

TomG

  • Administrator
  • Active Users
  • *****
  • Posts: 5434
    • View Profile
As a note, raising the packet size is what may cause clients to disconnect (larger packets leave no room for keep-alives to be sent across). Raising the packet size is only recommended for high resolutions, where the default packet size may be too small (causing slow rendering). Smaller packet sizes make disconnects less likely (we believe - if it is network traffic causing this, which is what we hope to find out from the tests).

How many clients are you using to render, and what packet sizes are you setting, when you get those disconnects? That information would be very useful to the issue at hand.

Cheers!
   Tom
Tom Grimes | chaos-corona.com
Product Manager | contact us

2019-02-07, 20:19:52
Reply #13

Nelaton

  • Active Users
  • **
  • Posts: 56
    • View Profile
hi Tom,
We are using 8 clients, the value we tested is 100 MB for the packet size, and we render at A3, 144 dpi.
We also tested smaller packet sizes, but render time increased so drastically that it was preferable for us to render locally.

One remark: when rendering animations (with TR), we have no clients disconnecting and decent render times per frame (around 10 min/frame).
Not sure this is a solution, but I'm wondering why it works (clients not disconnecting) with animation and not with still-frame rendering.
 
Cheers,

Nelaton
« Last Edit: 2019-02-07, 20:35:21 by Nelaton »

2019-02-07, 20:37:30
Reply #14

TomG

  • Administrator
  • Active Users
  • *****
  • Posts: 5434
    • View Profile
Thanks Nelaton! It still points to the same issue in that case - the network getting overloaded (is it by chance 1 gig ethernet rather than 10 gig? Sorry for the questions, but all this information helps us a lot!). The fact that this is again a large number of Clients also suggests that, along with your packet size. Great info, thanks!

For animations, if you are using the Team Render Server, that would also make sense, as the Clients are not sending back packets of results, only a single image once completed - so they are "silent" across the network, other than keep-alives, while they are rendering (whereas TR to Picture Viewer has the clients frequently sending back their latest results, in whatever packet size is set).

I wonder if raising the Client Update Interval would help in these cases - larger packets can then be sent less often, which should reduce network congestion. I don't have any good figures from experimentation, but maybe 30, 60, or even 90 seconds would be good. (The progress of the render isn't so important here, as you should already have a good idea of what the final result will look like - you just want all machines working effectively to produce the final image, so only getting updates every 30 seconds shouldn't disrupt the workflow, and may ease the network traffic problem.)
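If you'd rather have a starting point than pure trial and error, the same back-of-envelope arithmetic from earlier can be inverted to suggest a minimum interval - again treating the ~110 MB/s usable gigabit figure and the one-packet-per-node-per-interval model as assumptions:

Code:
# Suggest a minimum Client Update Interval that keeps average TR traffic
# under a chosen fraction of the link. Same assumptions as the earlier
# sketch: one packet per node per interval, ~110 MB/s usable on gigabit.

USABLE_MBPS = 110.0
HEADROOM = 0.5  # aim to spend at most half the link on chunk traffic

def min_interval(nodes, packet_mb):
    """Smallest interval (s) keeping average load under the headroom target."""
    return nodes * packet_mb / (USABLE_MBPS * HEADROOM)

for nodes in (4, 8, 10):
    print(f"{nodes:>2} nodes, 100 MB packets: interval >= "
          f"{min_interval(nodes, 100):4.1f} s")

That puts 8 nodes at 100 MB packets at roughly a 15 s minimum, and 10 nodes at about 18 s - so the 30/60/90 second values above leave extra margin for bursts, which is probably wise.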

Thanks for your patience and information while this gets researched and investigated further (both you and Kizo too)
Tom Grimes | chaos-corona.com
Product Manager | contact us

2019-02-07, 21:54:13
Reply #15

kizo

  • Active Users
  • **
  • Posts: 28
    • View Profile
Tom, I would like to get an answer on the color difference issue too, if possible. Here is the issue again for easier following, but please check the attachments in the first post of this thread.

Thanks


"PICTURE VIEWER OUTPUT DIFFERENT FROM VFB

In all the tests made so far, whether using TR through the PV (the only possible way) or on a single machine, there is a difference: the image saved from the PV is always darker.
The C4D project settings are set to linear workflow and sRGB.

I also had a situation where the lights didn't match between the .jpg saved from the PV and the .jpg saved from the VFB. It seems the VFB is showing and saving the LightMix while the PV shows the beauty pass. That would explain why the color and intensity of the light were different. Please check the attached images.
If the above is true, how would one save a non-layered format out of the PV and have it match the VFB?
"

Hi kizo,

After reading your description, it seems to me that you somehow managed to save only the non-LightMix version of the image out of the PV. The functionality of the "Save as..." option of the C4D Picture Viewer depends on the current layer mode (Image vs. Single-Pass vs. Multi-Pass) and possibly on which layer you have selected.

As for the different lightness, we are aware of a very slight (almost imperceptible) difference between the PV and the VFB - probably a result of different sRGB handling. But it seems (based on your description) that you have a much bigger difference than that. Might I ask whether you have any PostProcessing filters enabled? And if so, are the results from the PV and the VFB the same after disabling PostProcessing?

Hi Houska,

Thanks for the help. I'm aware of the saving process from the PV - I was just saving the file without first going into single-layer mode and choosing a layer.

I made further tests regarding the color and light difference, and I can reproduce the issue.

Rendering on a local machine only. The scene uses LightMix. I tried both .jpg and .tif.
NO multi-pass file is set to save, so both formats save a single-layer file.

1st time: I rendered in the PV and the files were saved automatically via the render settings.
2nd time: I rendered from the VFB but left the save path set, to save automatically.
3rd time: I turned off auto-saving the render, fired it from the VFB and saved manually.

And this is where I get the difference:
the auto-saved render from the VFB and the manually saved one look different. Check the attached .jpgs; the .tifs look the same.

So this is not TR related, but I encountered it while testing TR, as I usually never render to the PV on a single machine.
I sent the scene I used for testing to Corona support 3 days ago, so you can try to reproduce it. Please let me know if you manage to test it.

Thanks

kizo


The render saved from the PV, or from the VFB when set to save in the render settings, saves the beauty pass, while saving manually from the VFB saves the LightMix. At least that's what we concluded after comparing the saved renders and the separate passes. It would need checking.
« Last Edit: 2019-02-08, 11:36:42 by kizo »

2019-02-08, 09:26:54
Reply #16

Nelaton

  • Active Users
  • **
  • Posts: 56
    • View Profile
Hello,

@TomG: So we have a 10 Gb switch, and the network cards in each computer are 1 Gb. I'm not sure, but is it the switch that handles distributing the traffic, so that its capacity is what prevails?

NB: I was mistaken when I said the packet size was at 100 Mb in manual mode for our animations.
It is set to automatic.

Cheers,
Nelaton
« Last Edit: 2019-02-08, 09:32:08 by Nelaton »

2019-02-08, 14:25:22
Reply #17

TomG

  • Administrator
  • Active Users
  • *****
  • Posts: 5434
    • View Profile
The 1 Gb cards would be the bottleneck (think of it as a highway at one speed, but with on- and off-ramps at one tenth of that speed: things get delayed on the ramps, which has the same effect of messages not arriving in time when machines ask "Hi! Are you still there?").

If you have time to test, I'd be interested to know what happens if you do use automatic - though mostly, at the moment, testing longer intervals would be the interesting thing. As another thought: if you use the standalone Team Render Server, you can still render a single image, and I *think* (but am not sure off the top of my head) that you can get all machines contributing to that single image. If that is possible, it would be interesting to know whether it has the same issues, since Team Render may manage things differently there than when driven from a live-and-running copy of Cinema 4D. I'll have a look and see if there is such a possibility with TR and the standalone server (though I can't test the network overload here, as I only have 1 WS and 2 nodes :) so I will just be checking whether "all machines rendering one image" is possible, and won't know if it has any impact on the issues you are experiencing).
Tom Grimes | chaos-corona.com
Product Manager | contact us

2019-02-08, 14:40:51
Reply #18

Nelaton

  • Active Users
  • **
  • Posts: 56
    • View Profile
OK, so in the past we tested a scenario where we have 12 cameras, one per frame. Then, as you say, the machines work on one image at a time, and the render times are no comparison with rendering stills (no animation) in TR.

But a still image rendered outside animation mode takes approximately the same time with TR as on our local machine(s), which is disappointing. On the other hand, when rendering a still image with TR (no animation mode), we see lines appearing, and it takes very long for the render to finish (and we have 8 clients running for this).

Also, we know that having 1 gig network cards in the machines is probably nonsense with a 10 gig switch, and we are aiming to change them in the near future.
Cheers,
Nelaton
 
« Last Edit: 2019-02-08, 14:45:51 by Nelaton »

2019-02-08, 15:24:32
Reply #19

TomG

  • Administrator
  • Active Users
  • *****
  • Posts: 5434
    • View Profile
Hi Nelaton,

I am not sure what you mean by "a still image calculated not in animation" - do you mean when using Team Render to Picture Viewer? Where you mention lines appearing and slow render times, this is where setting Manual mode and raising the packet size helps (if the render builds in the Picture Viewer in only thin lines, it means not enough data is being sent by the machines in a single packet, and they are spending all their time sending these small packets rather than focusing on rendering). Raising the packet size helps here - but beyond a certain size you may end up with a congested network and start losing connection to the nodes (it all depends on how many nodes you have and the speed of your network). In that second case, a test with an increased Interval would be good, as "larger packets sent less often" may resolve both the slow renders from the small default packet size and the network congestion from sending too much data over the network (but the impact of the Interval has yet to be tested by anyone experiencing this, so we don't know yet how much it will help).

So the process right now is:
a) My Team Render to Picture Viewer is slower than it should be, almost as slow as or slower than rendering locally (and I see the image build in very small strips) - raise the packet size.
b) I raised the packet size, and now nodes keep disconnecting - try increasing the Interval.

As a note, upgrading your ethernet cards from 1 gig to 10 gig should mean you can raise the packet size with less risk of network problems from the larger amount of data, and hopefully everything will then just work :)
Tom Grimes | chaos-corona.com
Product Manager | contact us

2019-02-08, 16:22:50
Reply #20

Nelaton

  • Active Users
  • **
  • Posts: 56
    • View Profile
Thanks for your kind explanation. It doesn't appear to be a solution, though: as Kizo pointed out earlier, other render engines do not suffer from traffic congestion at 1 gigabit. I'm sure you get this essential point.
So one question: why not contact the V-Ray team to see how they handle this, and implement a proper Corona DR interface?
Because right now, I see TR as very unstable with Corona.

Cheers,
« Last Edit: 2019-02-13, 13:28:23 by Nelaton »

2019-02-08, 16:23:26
Reply #21

TomG

  • Administrator
  • Active Users
  • *****
  • Posts: 5434
    • View Profile
Here is another option that may help: submitting a job to the Team Render Server rather than using Team Render to Picture Viewer. The downside is that you can't save to CXR, but other than that it should be fine.

The Server may handle sending data back and forth differently, since it is not showing ongoing progress in the Picture Viewer / VFB, and so may not experience the dropouts - it seems to be independent of the packet size, so it could have its own separate, inbuilt method for handling data transfer between Server and Client.

For me, the easiest way to run a job through the Server is:
a) Run the TR Server, the Corona License Server, and the TR Clients (including on the master machine, if you want it to contribute to rendering)
b) Use the folder icon in the top left of the TR Server UI to open the local file location where jobs are stored, and make sure you are in the "admin" folder
c) Copy that address from Windows Explorer
d) In C4D, use "Save Project with Assets" and paste in the folder location from above
e) This automatically creates a job for the TR Server
f) Open the browser UI for the TR Server using the globe icon, also near the top left of the TR Server
g) Start the job. As long as this is a single-frame job, all machines will contribute to it (if it is an animation, each machine will be given a different frame - and tests from Nelaton suggest that this mode for sure doesn't have issues with either slow rendering or nodes dropping out)

Since I don't have that many machines to test on, I can't say for sure whether the possible difference in how the Server handles server-client communication will prevent the network traffic problems, but I am cautiously optimistic :) If anyone with a large number of Clients who experiences these issues can test, I'd be interested to hear the results.
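And if the copy-paste part of steps c) through e) gets tedious, it could be scripted - a rough sketch only, where the jobs path is an example you'd need to replace with whatever the folder icon in your TR Server UI actually points to:

Code:
# Queue a job for TR Server by dropping a "Save Project with Assets" folder
# into the Server's "admin" jobs directory. The path below is an example -
# use the folder icon in the TR Server UI to find the real location.

import shutil
from pathlib import Path

JOBS_DIR = Path(r"C:\TeamRenderServer\jobs\admin")  # example path, adjust

def submit(project_folder):
    """Copy a project folder (saved with assets) into the TR Server job folder."""
    src = Path(project_folder)
    dst = JOBS_DIR / src.name
    shutil.copytree(src, dst)  # the Server should pick the new job up from here
    return dst

# submit(r"D:\projects\kitchen_4k_with_assets")

You'd still start the job from the web UI as in step g), but at least the file shuffling becomes one call.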
Tom Grimes | chaos-corona.com
Product Manager | contact us

2019-02-08, 16:24:39
Reply #22

TomG

  • Administrator
  • Active Users
  • *****
  • Posts: 5434
    • View Profile
On the V-Ray issues, two things: one, the V-Ray implementation in the past was not done by Chaos Group but by an external team. And two, they likely weren't updating the VFB in the same way we are (which is what makes our use of TR different from other engines). So, unfortunately, there isn't anything we can learn or do there in that regard, sorry.
Tom Grimes | chaos-corona.com
Product Manager | contact us

2019-02-08, 16:25:07
Reply #23

kizo

  • Active Users
  • **
  • Posts: 28
    • View Profile
Hi,

So, we have been testing various combinations and suggestions from this thread.

Raising the interval does help with the disconnecting nodes, but the overall speed is extremely poor.
These are the specs of the machines used:

      x45   192.168.178.45:5401   PC, 40x2.3GHz, 64.00 GB RAM, (Studio Client)   Windows 10, 64 Bit, Professional Edition (build 17763)   Idle   
      x44   192.168.178.44:5401   PC, 40x2.3GHz, 64.00 GB RAM, (Studio Client)   Windows 10, 64 Bit, Professional Edition (build 17763)   Idle   
      x43   192.168.178.43:5401   PC, 40x2.3GHz, 64.00 GB RAM, (Studio Client)   Windows 10, 64 Bit, Professional Edition (build 17763)   Idle   
      x42   192.168.178.42:5401   PC, 40x2.3GHz, 64.00 GB RAM, (Studio Client)   Windows 10, 64 Bit, Professional Edition (build 17763)   Idle   
      x41   192.168.178.41:5401   PC, 40x2.3GHz, 64.00 GB RAM, (Studio Client)   Windows 10, 64 Bit, Professional Edition (build 17763)   Idle   
      ws35   192.168.178.35:5401   PC, 16x3.6GHz, 64.00 GB RAM, (Studio Client)   Windows 10, 64 Bit, Professional Edition (build 17763)   Idle   
      ws34   192.168.178.34:5401   PC, 32x3.8GHz, 64.00 GB RAM, (Studio Client)   Windows 10, 64 Bit, Professional Edition (build 17763)   Idle   
      ws32   192.168.178.32:5401   PC, 32x4GHz, 64.00 GB RAM, (Studio Client)   Windows 10, 64 Bit, Professional Edition (build 17763)   Idle
      ws31   192.168.178.31:5401   PC, 64x3.8GHz, 128.00 GB RAM, (Studio Client)   Windows 10, 64 Bit, Professional Edition (build 17763)   Idle


We rendered the same 4000x4000 px image in different variations.

Rendering on the single 2990WX machine takes 14:34.

Using 8 machines from the list, we got these results (all rendered from the PV with TR; interval / packet size, disconnects, total render time as stamped, and in brackets the time spent gathering chunks and doing the post-process):

30s / 100 MB    most nodes disconnected
40s / 100 MB    1 disconnected       14:26   (6:20)
50s / 100 MB    0-1 disconnected     14:07   (6:01)
60s / 100 MB    none disconnected    15:54   (5:52)
90s / 100 MB    none disconnected    19:57   (7:28)

Using 5 machines:

30s / 100 MB    none disconnected    14:23

Using 3 machines:

30s / 100 MB    none disconnected    15:41


With a packet size higher than 100, times got a lot slower at any interval. When a node disconnected, this is the error we got:

"
Sending chunk 12/13 to the server
Frame synchronization failed: Communication Error"

And also this one:


"019/02/08 14:45:42  [Corona4D]
Sending chunk 2/5 to the server
2019/02/08 14:45:42  [Corona4D]
MEMORY_ERROR while sending data
2019/02/08 14:45:42  [Corona4D]
Frame synchronization failed: Memory Error"

With the above error, the machine list would still show the nodes rendering, but the image wasn't updating, and at over 20 min it was nowhere near done.

We also tried starting from the Server app using the web interface, this time without the 2990WX. The remaining machines rendered the image in 14:55. The save path was set, but the render wasn't saved in that location at all - it was saved in the "results" folder.
That render is in linear color space and looks extremely different from any of the tests made so far.


After this set of tests I'm even more worried. It seems that for our network and number of nodes, on this particular render, the best combination was 50s/100MB, as it rendered with 8 machines in 14:07 - but compared to the 14:34 of the single 2990WX, that's very disappointing.

   




2019-02-08, 16:26:03
Reply #24

TomG

  • Administrator
  • Active Users
  • *****
  • Posts: 5434
    • View Profile
The other note as regards stability: it is fine for most people. So far these reports relate only to yourself and Kizo, who are in the rare situation of having 8 and 10 Clients (most people have far fewer clients, and so far less network traffic).
Tom Grimes | chaos-corona.com
Product Manager | contact us

2019-02-08, 16:27:54
Reply #25

TomG

  • Administrator
  • Active Users
  • *****
  • Posts: 5434
    • View Profile
TY for the results, Kizo! What happens when you don't use all the Clients (that is, rendering with just 2 clients, just 4 clients, etc., rather than all of them)? You may get better results with fewer clients, as then there won't be network congestion and you won't need to raise the Interval (so you'll keep the speed benefit).

And what happens when using the TR Server approach to have all the machines rendering on that one job?
Tom Grimes | chaos-corona.com
Product Manager | contact us

2019-02-08, 16:32:09
Reply #26

kizo

  • Active Users
  • **
  • Posts: 28
    • View Profile
Here is another option that may help: submitting a job to the Team Render Server rather than using Team Render to Picture Viewer. The downside is that you can't save to CXR, but other than that it should be fine.

The Server may handle sending data back and forth differently, since it is not showing ongoing progress in the Picture Viewer / VFB, and so may not experience the dropouts - it seems to be independent of the packet size, so it could have its own separate, inbuilt method for handling data transfer between Server and Client.

For me, the easiest way to run a job through the Server is:
a) Run the TR Server, the Corona License Server, and the TR Clients (including on the master machine, if you want it to contribute to rendering)
b) Use the folder icon in the top left of the TR Server UI to open the local file location where jobs are stored, and make sure you are in the "admin" folder
c) Copy that address from Windows Explorer
d) In C4D, use "Save Project with Assets" and paste in the folder location from above
e) This automatically creates a job for the TR Server
f) Open the browser UI for the TR Server using the globe icon, also near the top left of the TR Server
g) Start the job. As long as this is a single-frame job, all machines will contribute to it (if it is an animation, each machine will be given a different frame - and tests from Nelaton suggest that this mode for sure doesn't have issues with either slow rendering or nodes dropping out)

Since I don't have that many machines to test on, I can't say for sure whether the possible difference in how the Server handles server-client communication will prevent the network traffic problems, but I am cautiously optimistic :) If anyone with a large number of Clients who experiences these issues can test, I'd be interested to hear the results.

Thanks for the effort, but this is a workflow killer even if it worked - and it doesn't.
Saving projects locally just to be able to render them out each time? That's just crazy,
and so is going through the web interface every time.

I mean, even if there were no issues, it's an overly long and frustrating procedure for 1 render, let alone 20 or more.
Also, there is no speed benefit in using the server.

2019-02-08, 16:33:23
Reply #27

kizo

  • Active Users
  • **
  • Posts: 28
    • View Profile
TY for the results, Kizo! What happens when you don't use all the Clients (that is, rendering with just 2 clients, just 4 clients, etc., rather than all of them)? You may get better results with fewer clients, as then there won't be network congestion and you won't need to raise the Interval (so you'll keep the speed benefit).

And what happens when using the TR Server approach to have all the machines rendering on that one job?

Hi Tom

Rendering with 2 and 4 clients + the main machine is in the above test list too - no benefit at all.

2019-02-08, 16:35:28
Reply #28

TomG

  • Administrator
  • Active Users
  • *****
  • Posts: 5434
    • View Profile
TY! The fewer-machines tests were with the raised packet size, right?

And was the "no speed benefit using the server" comment based on an actual test? (Workflow killer aside.) Cheers!
Tom Grimes | chaos-corona.com
Product Manager | contact us

2019-02-08, 16:49:20
Reply #29

TomG

  • Administrator
  • Active Users
  • *****
  • Posts: 5434
    • View Profile
BTW, all this information has been incredibly helpful! I just wanted to add that, alongside the tests and potential workarounds I've been writing about here, the developers have been continuing to think about and research the situation based on the tests everyone has done for us (which have helped point to causes, and so help in thinking about how it might be fixed). I didn't want you to think that what I write here is the only thing happening :) So, we are still looking into this and into what improvements might be possible.
Tom Grimes | chaos-corona.com
Product Manager | contact us

2019-02-08, 16:58:50
Reply #30

kizo

  • Active Users
  • **
  • Posts: 28
    • View Profile
Yes Tom, the "no benefit in using the server" is based on an actual test - one of the numerous tests we did. The render time is again almost the same as on a single 2990WX, and the machines rendering have way more power combined.

The fewer-machines tests were done with 30/100, as I wrote above.

I'm glad to hear the devs are thinking about the issues, and I'm very glad for your help. I'm only worried about how I will manage to work on a daily basis once the betas stop working.
At this point I can see any improvement will need time. I can also see I will not be able to use the nodes that I bought the licences for, and I will not be able to use all the costly hardware I have.

Very unfortunate, and it seems I will be forced to look for an alternative rendering solution, which is a shame as I love Corona very much.

The only other thing I can do is build the 10G network. That is an expensive operation, so before investing I would like to know: would that resolve the issues for sure?

kizo
« Last Edit: 2019-02-08, 17:07:10 by kizo »

2019-02-08, 17:32:44
Reply #31

jojorender

  • Active Users
  • **
  • Posts: 241
    • View Profile
I recently reported client disconnects with just 3 clients. Back when TR Server had the memory leak issue, I used only 1 client and that also had frequent disconnects.
This happened in automatic and in manual mode. I can't imagine that 1 client can cause network congestion, but who knows?

Is increasing the interval and/or # of retries when frame sync fails a way to give that “keep alive car” a chance to sneak onto the highway?

As kizo mentioned, setting up a 10 Gb network is a considerable expense, even for small networks.
The cheapest 8-port switch costs about $600, plus NICs, cables, etc.
That's about $1k+ to network 4 machines, without knowing if it will really solve the problem.
I already spent a lot of money on RAM trying to fight the TR memory leak. LOL.
I learned my lesson.

Since you know your customers best, maybe you could survey all your 10+ node customers (especially those who have nothing to report) about their TR experience and network implementation, and pass along your findings?

Also, as Stefan (formerly Mr. VrayforC4D, I believe) points out here https://forum.corona-renderer.com/index.php?topic=23033.msg140809#msg140809 TR is flawed, and he advocates for a real DR solution like the one implemented for Max.
Maybe you can pick his brain, since he must have pretty good insight into the inner workings/shortcomings of TR.

We love you Corona!

2019-02-08, 17:35:31
Reply #32

TomG

  • Administrator
  • Active Users
  • *****
  • Posts: 5434
    • View Profile
You mention "how will I manage to work on daily bassis once the betas stop working".. that made me wonder, how were you working before the betas stop working? Does this mean Team Render with lots of clients etc. was working fine with the betas, and now is no longer working with the commercial licensed release? That would be unusual if so and I can't think of a reason for it, but would be useful information to know.

On the 10 gig network question, unfortunately we can't offer any guarantees, sorry. While it may help, as it can handle much more traffic, we haven't been able to compare a 1 gig network against a 10 gig network with 10 Clients ourselves, so we can't guarantee that it would resolve the issue, or say at what point the issue would appear again (at certain packet sizes or image resolutions).
Tom Grimes | chaos-corona.com
Product Manager | contact us

2019-02-08, 17:51:02
Reply #33

kizo

  • Active Users
  • **
  • Posts: 28
    • View Profile
I'm working without TR. In fact, I never use it, as it always had issues in my tests, and the other engines we use(d) don't require it.
I have Cinema and the Corona beta installed on each machine, so I just render on single machines: big stills (6000x4000) overnight, or a range of animations on each machine at the same time. Works excellently.
Now that it's a licensed product, I wanted to avoid buying 10 licences, as that's a heavy price.


Great points, jojorender!
I agree a dedicated DR solution would be the best choice. Also, it would be really great to hear from anyone using TR with success in general, even with fewer machines.
« Last Edit: 2019-02-08, 23:31:28 by kizo »

2019-02-11, 10:16:37
Reply #34

HFPatzi

  • Active Users
  • **
  • Posts: 138
    • View Profile
Hey there, Teamrenderers ;)

I guess I'm one of the few Corona users with more than 2 machines. Would it be helpful if I kept you updated about my refresh/size settings and what happens during TR? I don't have time to do many tests right now, since I'm swamped with render jobs at the moment, but as I said, I can keep you updated on what happens with certain settings during rendering. And I will probably replicate some of the settings kizo used for his tests. Thank you very much, btw, for doing all the testing!!!

My renders will mostly be 9000 x 6000 pixels @ 72 dpi, but in the near future I'll have some really huge ones, about 15000 x 9000 pixels. Would be pretty interesting, I guess :)

Here are my Specs:

Workstation: Intel i9-7940X @3,1GHz, 64GB DDR4 RAM, M2 SSD (1TB), Win 10

Renderclients:

8 x Intel i7-7700 @3,6GHz, 32GB DDR4 RAM, Win 10
1 x Mac Pro (Late 2013), Intel Xeon E5 @3,5GHz, 32GB DDR3 RAM, OSX 10.12.6
1 x MacPro (Mid 2012), 2 x Intel Xeon @2,6GHz, 16GB DDR3 RAM, OSX 10.12.6

(Company)Network:

1 x HP2810-24G Switch
2 x Cisco SG 11D-08 Switch

My workstation is connected directly to the HP switch, as are both Mac clients. Due to a lack of free ports on the HP switch, the 8 Windows clients are split (4 each) across the 2 Cisco switches, which are connected to the HP switch.
Not the greatest setup, I know, but the only one that works right now without buying new, expensive hardware.
Also, upgrading to a 10 Gbit network would be pretty expensive for a small company like ours, where CGI isn't the main income.
In the past I experienced dropouts from basically every client, in pretty much random order. Sometimes I had a feeling that only every other Windows client stopped rendering, but that's not for sure.

I'm truly sorry that I can't provide any testing right now. I will for sure when I have more time in the future, since I love Corona and want to help improve this great product.
If you have any questions regarding my setup, feel free to ask!

Greetings,
Moritz


« Last Edit: 2019-02-11, 22:16:21 by HFPatzi »

2019-02-11, 19:54:53
Reply #35

TomG

  • Administrator
  • Active Users
  • *****
  • Posts: 5434
    • View Profile
Hi!

Thanks for the offering of testing when time allows Moritz, we welcome any data you are able to send.

@Kizo, thank you for all your testing too. When time allows, we'd be interested to know what happens if you open some large scene from the C4D Content Browser and render it with a native C4D render engine, using Team Render. This would help us see whether the problem lies with Team Render in general, or whether our specific implementation of Team Render has added problems that don't occur with the inbuilt engines and TR. Sorry to ask for more testing, but the information is invaluable to us!

And of course our own tests and investigations continue internally, too.

Thanks all!
Tom Grimes | chaos-corona.com
Product Manager | contact us

2019-02-11, 22:08:09
Reply #36

HFPatzi

  • Active Users
  • **
  • Posts: 138
    • View Profile
Guess what? There was a "small" timeframe this evening, so I took the opportunity ;)

I always rendered the same scene at the same size, 14763 x 8858 pixels, with different numbers of clients.
Every combination was rendered with two different network settings: 50s / 50MB and 100s / 100MB.

Here are the results:


Clients               Settings     Rendertime  Dropouts        Clients finished
WS only               -            1h 00min    0               1
WS + 2 Mac            50s/50MB     0h 26min    0               3
WS + 2 Mac            100s/100MB   0h 25min    0               3
WS + 2 Mac / 2 Win    50s/50MB     0h 17min    0               5
WS + 2 Mac / 2 Win    100s/100MB   0h 17min    0               5
WS + 2 Mac / 4 Win    50s/50MB     0h 16min    1 Win / 0 Mac   6
WS + 2 Mac / 4 Win    100s/100MB   0h 14min    0 Win / 1 Mac   6
WS + 2 Mac / 6 Win    50s/50MB     0h 14min    3 Win / 0 Mac   6
WS + 2 Mac / 6 Win    100s/100MB   0h 14min    3 Win / 1 Mac   5
WS + 2 Mac / 8 Win    50s/50MB     0h 14min    3 Win / 2 Mac   6
WS + 2 Mac / 8 Win    100s/100MB   0h 14min    6 Win / 0 Mac   5

In every run with dropouts, the console error was "Frame synchronization failed: Communication Error", and the dropouts happened while finishing and sending back chunks - rendering itself was already finished.



I will also upload an Excel sheet with the test results for better readability, plus the console output of my workstation, via your private uploader. I will edit this post with the uploaded filename when done.
What also concerns me is that there is no decrease in render time after a certain number of clients. I'm not sure if this is caused by the increasing number of dropouts.
The main problem I have with TR right now is that one has to do a lot of tests to find the perfect render settings / number of clients for every project. Almost all the time advantage you gain with TR is eaten up by finding the best settings for each project.
And, on a side note: before Corona, I rendered with the Physical Renderer, also via TR (same image size and same kind of scenes), and as far as I can remember, I never had any dropouts back then.

So much for my testing today ;)
If there are any questions, feel free to ask!

Greetings,
Moritz

P.S.: the workstation's console output was uploaded via your private uploader under the name: 1549919693_Workstation-console-output.txt

========

EDIT: Forgot to mention that I set a pass limit of 15 for my tests.
« Last Edit: 2019-02-12, 09:53:48 by HFPatzi »

2019-02-12, 08:25:52
Reply #37

HFPatzi

  • Active Users
  • **
  • Posts: 138
    • View Profile
Just out of curiosity, I did one last test this morning with all clients and the settings at 150s/50MB.

Render time was 13 min 30 s, and again I had 5 dropouts while finishing and sending back chunks.
Here are two example console outputs from the render clients. They are slightly different, so I don't know if that's interesting for you:

Code:
Rendering frame 0 of job 'pv'
2019/02/12 07:54:21  [Corona4D] Parsing scene
2019/02/12 07:54:21  [Corona4D] ====RENDER STARTED====
2019/02/12 07:54:21  [Corona4D] Core build timestamp: Jan 17 2019 14:36:24
Asset 'XXXXX.psd' erstellt
Lade Asset(s) herunter
Asset 'XXXXX.tif' erstellt
Asset 'XXXXX.tif' erstellt
Asset 'Brushed-Metal-1024x1024.png' erstellt
Asset 'XXXXX.png' erstellt
Asset 'XXXXX.png' erstellt
Asset 'XXXXX.tif' erstellt
Asset 'XXXXX.tif' erstellt
Asset 'MetalStainlessSteelBrushedElongated005_COL_8K_SPECULAR.jpg' erstellt
Asset 'MetalStainlessSteelBrushedElongated005_GLOSS_8K_SPECULAR.jpg' erstellt
Asset 'MetalStainlessSteelBrushedElongated005_NRM16_8K_SPECULAR.tif' erstellt
Lade Asset(s) herunter
Asset 'studio003.hdr' erstellt
2019/02/12 07:54:44  [Corona4D] [TR] Sending HELLO to server
2019/02/12 07:54:45  [Corona4D] [TR] Connection established
2019/02/12 07:54:48  [Corona4D] REPORT: unchanged: 0, Prim: 28, Inst: 0, Mtl: 0, Env-tentative: 1
2019/02/12 07:54:48  [Corona4D] Core::renderFrame before onFrame
2019/02/12 07:54:49  [Corona4D] Parsing scene took 27.69 seconds
2019/02/12 07:54:49  [Corona4D] Calculating displacement
2019/02/12 07:54:51  [Corona4D] Preparing geometry took 2.793 seconds
2019/02/12 07:54:51  [Corona4D] Preparing geometry
2019/02/12 07:54:52  [Corona4D] Embree: memory usage: 313,2 MB. Total embree build time is 537ms, top level commit took: 419ms. There are 28 embree geometry groups, 0 instances, 28/0/0/0 tri/animTri/hair/custom primitives
2019/02/12 07:54:52  [Corona4D] Core::renderFrame after onFrame
2019/02/12 07:54:52  [Corona4D] CoronaCore::renderFrame before unique materials enum
2019/02/12 07:54:52  [Corona4D] CoronaCore::renderFrame after unique materials enum
2019/02/12 07:54:52  [Corona4D] CoronaCore::renderFrame after unique materials update
2019/02/12 07:54:52  [Corona4D] Preparing geometry took 0.736 seconds
2019/02/12 07:54:52  [Corona4D] Preparing lights
2019/02/12 07:54:52  [Corona4D] TreeAdapt built: nodes 27, time 0 ms, memory 6.12 kB
2019/02/12 07:54:52  [Corona4D] CutCache initialized: records 1993496, time 29 ms, memory 79.7401 MB
2019/02/12 07:54:52  [Corona4D] CoronaCore:RenderFrame: after directLight
2019/02/12 07:54:52  [Corona4D] CoronaCore::renderFrame: after GI precompute
2019/02/12 07:54:52  [Corona4D] Preparing lights took 0.033 seconds
2019/02/12 07:54:52  [Corona4D] CoronaCore::renderFrame: before render
2019/02/12 07:54:53  [Corona4D] Rendering initial pass
2019/02/12 07:57:13  [Corona4D] [TR] Sending chunk 7/41 to the server
2019/02/12 07:59:59  [Corona4D] [TR] Sending chunk 8/41 to the server
2019/02/12 08:01:00  [Corona4D] Rendering pass 2
2019/02/12 08:02:22  [Corona4D] Rendering initial pass
2019/02/12 08:02:22  [Corona4D] [TR] Sending chunk 9/41 to the server
2019/02/12 08:04:48  [Corona4D] [TR] Sending chunk 10/41 to the server
2019/02/12 08:06:51  [Corona4D] [TR] Sending chunk 11/41 to the server
2019/02/12 08:07:11  [Corona4D] Rendering pass 2
2019/02/12 08:09:05  [Corona4D] Rendering initial pass
2019/02/12 08:09:05  [Corona4D] [TR] Sending chunk 12/41 to the server
2019/02/12 08:09:27  [Corona4D] [TR] Frame synchronization failed: Communication Error
2019/02/12 08:09:27  [Corona4D] [TR] Sending chunk 12/41 to the server
2019/02/12 08:09:28  [Corona4D] [TR] Frame synchronization failed: Communication Error
2019/02/12 08:09:28  [Corona4D] [TR] Sending chunk 12/41 to the server
2019/02/12 08:09:28  [Corona4D] [TR] Frame synchronization failed: Communication Error
2019/02/12 08:09:29  [Corona4D] [TR] Sending chunk 12/41 to the server
2019/02/12 08:09:29  [Corona4D] [TR] Frame synchronization failed: Communication Error
2019/02/12 08:09:29  [Corona4D] [TR] Sending chunk 12/41 to the server
2019/02/12 08:09:30  [Corona4D] [TR] Frame synchronization failed: Communication Error
2019/02/12 08:09:30  [Corona4D] [TR] Sending chunk 12/41 to the server
2019/02/12 08:09:30  [Corona4D] [TR] Frame synchronization failed: Communication Error
2019/02/12 08:09:31  [Corona4D] [TR] Sending chunk 12/41 to the server
2019/02/12 08:09:31  [Corona4D] [TR] Frame synchronization failed: Communication Error
2019/02/12 08:09:31  [Corona4D] CoronaCore::renderFrame: after render
2019/02/12 08:09:31  [Corona4D] Terminating DR slaves
2019/02/12 08:09:31  [Corona4D]  - terminating slave handlers
2019/02/12 08:09:31  [Corona4D]  - waiting for broadcast thread to finish
2019/02/12 08:09:31  [Corona4D]  - clearing slave handlers
2019/02/12 08:09:31  [Corona4D] Terminating DR slaves ended
2019/02/12 08:09:31  [Corona4D] Rendering took 878.941 seconds
2019/02/12 08:09:31  [Corona4D] Cleaning up
2019/02/12 08:09:31  [Corona4D] Terminating DR slaves
2019/02/12 08:09:31  [Corona4D]  - terminating slave handlers
2019/02/12 08:09:31  [Corona4D]  - waiting for broadcast thread to finish
2019/02/12 08:09:31  [Corona4D]  - clearing slave handlers
2019/02/12 08:09:31  [Corona4D] Terminating DR slaves ended
2019/02/12 08:09:31  [Corona4D] CutCache memory: 80.398 MB
2019/02/12 08:09:31  [Corona4D] Unique Primitives: 4303813
2019/02/12 08:09:31  [Corona4D] Primitives with instancing: 4303813
2019/02/12 08:09:31  [Corona4D] Area lights: 15
2019/02/12 08:09:31  [Corona4D] Geometry groups: 28
2019/02/12 08:09:31  [Corona4D] Materials: 20
2019/02/12 08:09:31  [Corona4D] Instances: 28
2019/02/12 08:09:31  [Corona4D] Portals: 0
2019/02/12 08:09:31  [Corona4D] Area lights: 15
2019/02/12 08:09:31  [Corona4D] Avg samples per pixel: 0.00105628
2019/02/12 08:09:31  [Corona4D] Avg rays per sample: 18.6773
2019/02/12 08:09:31  [Corona4D] Rays/s: 2935.24
2019/02/12 08:09:31  [Corona4D] Samples/s: 157.155
2019/02/12 08:09:31  [Corona4D] Saving + Cleaning up took 0.06 seconds
2019/02/12 08:09:31  [Corona4D] CoronaCore::exiting renderFrame
2019/02/12 08:09:31  [Corona4D] Rendered 0/0 passes
Peer-to-peer statistics:
    > CAD660023 download speed 9.69 MiB/s (1x)
    > PC01 download speed 10.45 MiB/s (8x)
    > CAD660029 download speed 33.05 MiB/s (13x)
    > CAD660026 download speed 41.52 MiB/s (1x)
    > CAD660028 download speed 88.59 MiB/s (1x)
End of peer-to-peer statistics

Code: [Select]
Rendering frame 0 of job 'pv'
2019/02/12 07:54:21  [Corona4D] Parsing scene
2019/02/12 07:54:21  [Corona4D] ====RENDER STARTED====
2019/02/12 07:54:21  [Corona4D] Core build timestamp: Jan 17 2019 14:36:24
Asset 'XXXXXX.psd' created
Downloading asset(s)
Asset 'XXXXXX.tif' created
Asset 'XXXXXX.tif' created
Asset 'Brushed-Metal-1024x1024.png' created
Asset 'XXXXXX.png' created
Asset 'XXXXXX.png' created
Asset 'XXXXXX.tif' created
Asset 'XXXXXX.tif' created
Asset 'MetalStainlessSteelBrushedElongated005_COL_8K_SPECULAR.jpg' created
Asset 'MetalStainlessSteelBrushedElongated005_GLOSS_8K_SPECULAR.jpg' created
Asset 'MetalStainlessSteelBrushedElongated005_NRM16_8K_SPECULAR.tif' created
Downloading asset(s)
Asset 'studio003.hdr' created
2019/02/12 07:54:41  [Corona4D] [TR] Sending HELLO to server
2019/02/12 07:54:41  [Corona4D] [TR] Connection established
2019/02/12 07:54:44  [Corona4D] REPORT: unchanged: 0, Prim: 28, Inst: 0, Mtl: 0, Env-tentative: 1
2019/02/12 07:54:44  [Corona4D] Core::renderFrame before onFrame
2019/02/12 07:54:45  [Corona4D] Parsing scene took 23.676 seconds
2019/02/12 07:54:45  [Corona4D] Calculating displacement
2019/02/12 07:54:48  [Corona4D] Preparing geometry took 2.863 seconds
2019/02/12 07:54:48  [Corona4D] Preparing geometry
2019/02/12 07:54:48  [Corona4D] Embree: memory usage: 313,2 MB. Total embree build time is 538ms, top level commit took: 419ms. There are 28 embree geometry groups, 0 instances, 28/0/0/0 tri/animTri/hair/custom primitives
2019/02/12 07:54:48  [Corona4D] Core::renderFrame after onFrame
2019/02/12 07:54:48  [Corona4D] CoronaCore::renderFrame before unique materials enum
2019/02/12 07:54:48  [Corona4D] CoronaCore::renderFrame after unique materials enum
2019/02/12 07:54:48  [Corona4D] CoronaCore::renderFrame after unique materials update
2019/02/12 07:54:48  [Corona4D] Preparing geometry took 0.731 seconds
2019/02/12 07:54:48  [Corona4D] Preparing lights
2019/02/12 07:54:49  [Corona4D] TreeAdapt built: nodes 27, time 0 ms, memory 6.12 kB
2019/02/12 07:54:49  [Corona4D] CutCache initialized: records 1993496, time 29 ms, memory 79.7401 MB
2019/02/12 07:54:49  [Corona4D] CoronaCore:RenderFrame: after directLight
2019/02/12 07:54:49  [Corona4D] CoronaCore::renderFrame: after GI precompute
2019/02/12 07:54:49  [Corona4D] Preparing lights took 0.033 seconds
2019/02/12 07:54:49  [Corona4D] CoronaCore::renderFrame: before render
2019/02/12 07:54:49  [Corona4D] Rendering initial pass
2019/02/12 07:57:30  [Corona4D] [TR] Sending chunk 4/41 to the server
2019/02/12 08:00:34  [Corona4D] [TR] Sending chunk 5/41 to the server
2019/02/12 08:00:58  [Corona4D] Rendering pass 2
2019/02/12 08:02:58  [Corona4D] Rendering initial pass
2019/02/12 08:02:58  [Corona4D] [TR] Sending chunk 6/41 to the server
2019/02/12 08:05:55  [Corona4D] [TR] Sending chunk 7/41 to the server
2019/02/12 08:07:09  [Corona4D] Rendering pass 2
2019/02/12 08:08:38  [Corona4D] Rendering initial pass
2019/02/12 08:08:38  [Corona4D] [TR] Sending chunk 8/41 to the server
2019/02/12 08:08:39  [Corona4D] CoronaCore::renderFrame: after render
2019/02/12 08:08:39  [Corona4D] Terminating DR slaves
2019/02/12 08:08:39  [Corona4D]  - terminating slave handlers
2019/02/12 08:08:39  [Corona4D]  - waiting for broadcast thread to finish
2019/02/12 08:08:39  [Corona4D]  - clearing slave handlers
2019/02/12 08:08:39  [Corona4D] Terminating DR slaves ended
2019/02/12 08:08:39  [Corona4D] Rendering took 830.658 seconds
2019/02/12 08:08:39  [Corona4D] Cleaning up
2019/02/12 08:08:39  [Corona4D] Terminating DR slaves
2019/02/12 08:08:39  [Corona4D]  - terminating slave handlers
2019/02/12 08:08:39  [Corona4D]  - waiting for broadcast thread to finish
2019/02/12 08:08:39  [Corona4D]  - clearing slave handlers
2019/02/12 08:08:39  [Corona4D] Terminating DR slaves ended
2019/02/12 08:08:39  [Corona4D] CutCache memory: 80.4062 MB
2019/02/12 08:08:39  [Corona4D] Unique Primitives: 4303813
2019/02/12 08:08:39  [Corona4D] Primitives with instancing: 4303813
2019/02/12 08:08:39  [Corona4D] Area lights: 15
2019/02/12 08:08:39  [Corona4D] Geometry groups: 28
2019/02/12 08:08:39  [Corona4D] Materials: 20
2019/02/12 08:08:39  [Corona4D] Instances: 28
2019/02/12 08:08:39  [Corona4D] Portals: 0
2019/02/12 08:08:39  [Corona4D] Area lights: 15
2019/02/12 08:08:39  [Corona4D] Avg samples per pixel: 0.00275053
2019/02/12 08:08:39  [Corona4D] Avg rays per sample: 18.5701
2019/02/12 08:08:39  [Corona4D] Rays/s: 8041.15
2019/02/12 08:08:39  [Corona4D] Samples/s: 433.016
2019/02/12 08:08:39  [Corona4D] Saving + Cleaning up took 0.056 seconds
2019/02/12 08:08:39  [Corona4D] CoronaCore::exiting renderFrame
2019/02/12 08:08:39  [Corona4D] Rendered 0/0 passes
2019/02/12 08:08:40  [Corona4D] [TR] Sending chunk 9/41 to the server
2019/02/12 08:08:45  [Corona4D] [TR] Finished chunks: 1/41
2019/02/12 08:08:45  [Corona4D] [TR] Sending chunk 10/41 to the server
2019/02/12 08:08:58  [Corona4D] [TR] Finished chunks: 2/41
2019/02/12 08:08:59  [Corona4D] [TR] Sending chunk 11/41 to the server
2019/02/12 08:09:08  [Corona4D] [TR] Finished chunks: 3/41
2019/02/12 08:09:08  [Corona4D] [TR] Sending chunk 12/41 to the server
2019/02/12 08:09:16  [Corona4D] [TR] Finished chunks: 4/41
2019/02/12 08:09:17  [Corona4D] [TR] Sending chunk 13/41 to the server
2019/02/12 08:09:38  [Corona4D] [TR] Frame synchronization failed: Communication Error
2019/02/12 08:09:38  [Corona4D] [TR] Sending chunk 13/41 to the server
2019/02/12 08:09:38  [Corona4D] [TR] Frame synchronization failed: Communication Error
2019/02/12 08:09:39  [Corona4D] [TR] Sending chunk 13/41 to the server
2019/02/12 08:09:39  [Corona4D] [TR] Frame synchronization failed: Communication Error
2019/02/12 08:09:39  [Corona4D] [TR] Sending chunk 13/41 to the server
2019/02/12 08:09:39  [Corona4D] [TR] Frame synchronization failed: Communication Error
2019/02/12 08:09:40  [Corona4D] [TR] Sending chunk 13/41 to the server
2019/02/12 08:09:40  [Corona4D] [TR] Frame synchronization failed: Communication Error
2019/02/12 08:09:40  [Corona4D] [TR] Sending chunk 13/41 to the server
2019/02/12 08:09:41  [Corona4D] [TR] Frame synchronization failed: Communication Error
2019/02/12 08:09:41  [Corona4D] [TR] Sending chunk 13/41 to the server
2019/02/12 08:09:41  [Corona4D] [TR] Frame synchronization failed: Communication Error
Peer-to-peer statistics:
    > CAD660024 download speed 4.38 MiB/s (1x)
    > CAD660026 download speed 9.15 MiB/s (1x)
    > PC01 download speed 12.24 MiB/s (7x)
    > CAD660029 download speed 38.18 MiB/s (13x)
    > CAD660027 download speed 50.73 MiB/s (1x)
    > CAD660025 download speed 53.85 MiB/s (1x)
End of peer-to-peer statistics

And here is one example of a machine without errors:

Code: [Select]
Rendering frame 0 of job 'pv'
2019/02/12 07:54:22  [Corona4D] Parsing scene
2019/02/12 07:54:22  [Corona4D] ====RENDER STARTED====
2019/02/12 07:54:22  [Corona4D] Core build timestamp: Jan 17 2019 14:36:24
Asset 'XXXXXX.psd' created
Downloading asset(s)
Asset 'XXXXXX.tif' created
Asset 'XXXXXX.tif' created
Asset 'Brushed-Metal-1024x1024.png' created
Asset 'XXXXXX.png' created
Asset 'XXXXXX.png' created
Asset 'XXXXXX.tif' created
Asset 'XXXXXX.tif' created
Asset 'MetalStainlessSteelBrushedElongated005_COL_8K_SPECULAR.jpg' created
Asset 'MetalStainlessSteelBrushedElongated005_GLOSS_8K_SPECULAR.jpg' created
Asset 'MetalStainlessSteelBrushedElongated005_NRM16_8K_SPECULAR.tif' created
Asset 'studio003.hdr' created
2019/02/12 07:54:42  [Corona4D] [TR] Sending HELLO to server
2019/02/12 07:54:42  [Corona4D] [TR] Connection established
2019/02/12 07:54:45  [Corona4D] REPORT: unchanged: 0, Prim: 28, Inst: 0, Mtl: 0, Env-tentative: 1
2019/02/12 07:54:45  [Corona4D] Core::renderFrame before onFrame
2019/02/12 07:54:45  [Corona4D] Parsing scene took 23.556 seconds
2019/02/12 07:54:45  [Corona4D] Calculating displacement
2019/02/12 07:54:48  [Corona4D] Preparing geometry took 2.851 seconds
2019/02/12 07:54:48  [Corona4D] Preparing geometry
2019/02/12 07:54:49  [Corona4D] Embree: memory usage: 315,2 MB. Total embree build time is 532ms, top level commit took: 414ms. There are 28 embree geometry groups, 0 instances, 28/0/0/0 tri/animTri/hair/custom primitives
2019/02/12 07:54:49  [Corona4D] Core::renderFrame after onFrame
2019/02/12 07:54:49  [Corona4D] CoronaCore::renderFrame before unique materials enum
2019/02/12 07:54:49  [Corona4D] CoronaCore::renderFrame after unique materials enum
2019/02/12 07:54:49  [Corona4D] CoronaCore::renderFrame after unique materials update
2019/02/12 07:54:49  [Corona4D] Preparing geometry took 0.726 seconds
2019/02/12 07:54:49  [Corona4D] Preparing lights
2019/02/12 07:54:49  [Corona4D] TreeAdapt built: nodes 27, time 0 ms, memory 6.12 kB
2019/02/12 07:54:49  [Corona4D] CutCache initialized: records 1993496, time 29 ms, memory 79.7401 MB
2019/02/12 07:54:49  [Corona4D] CoronaCore:RenderFrame: after directLight
2019/02/12 07:54:49  [Corona4D] CoronaCore::renderFrame: after GI precompute
2019/02/12 07:54:49  [Corona4D] Preparing lights took 0.033 seconds
2019/02/12 07:54:49  [Corona4D] CoronaCore::renderFrame: before render
2019/02/12 07:54:50  [Corona4D] Rendering initial pass
2019/02/12 07:57:29  [Corona4D] [TR] Sending chunk 3/41 to the server
2019/02/12 08:00:32  [Corona4D] [TR] Sending chunk 4/41 to the server
2019/02/12 08:00:53  [Corona4D] Rendering pass 2
2019/02/12 08:03:24  [Corona4D] Rendering initial pass
2019/02/12 08:03:24  [Corona4D] [TR] Sending chunk 5/41 to the server
2019/02/12 08:06:22  [Corona4D] [TR] Sending chunk 6/41 to the server
2019/02/12 08:06:58  [Corona4D] Rendering pass 2
2019/02/12 08:08:17  [Corona4D] Rendering initial pass
2019/02/12 08:08:17  [Corona4D] [TR] Sending chunk 7/41 to the server
2019/02/12 08:08:20  [Corona4D] CoronaCore::renderFrame: after render
2019/02/12 08:08:20  [Corona4D] Terminating DR slaves
2019/02/12 08:08:20  [Corona4D]  - terminating slave handlers
2019/02/12 08:08:20  [Corona4D]  - waiting for broadcast thread to finish
2019/02/12 08:08:20  [Corona4D]  - clearing slave handlers
2019/02/12 08:08:20  [Corona4D] Terminating DR slaves ended
2019/02/12 08:08:20  [Corona4D] Rendering took 810.947 seconds
2019/02/12 08:08:20  [Corona4D] Cleaning up
2019/02/12 08:08:20  [Corona4D] Terminating DR slaves
2019/02/12 08:08:20  [Corona4D]  - terminating slave handlers
2019/02/12 08:08:20  [Corona4D]  - waiting for broadcast thread to finish
2019/02/12 08:08:20  [Corona4D]  - clearing slave handlers
2019/02/12 08:08:20  [Corona4D] Terminating DR slaves ended
2019/02/12 08:08:20  [Corona4D] CutCache memory: 80.3913 MB
2019/02/12 08:08:20  [Corona4D] Unique Primitives: 4303813
2019/02/12 08:08:20  [Corona4D] Primitives with instancing: 4303813
2019/02/12 08:08:20  [Corona4D] Area lights: 15
2019/02/12 08:08:20  [Corona4D] Geometry groups: 28
2019/02/12 08:08:20  [Corona4D] Materials: 20
2019/02/12 08:08:20  [Corona4D] Instances: 28
2019/02/12 08:08:20  [Corona4D] Portals: 0
2019/02/12 08:08:20  [Corona4D] Area lights: 15
2019/02/12 08:08:20  [Corona4D] Avg samples per pixel: 0.0077253
2019/02/12 08:08:20  [Corona4D] Avg rays per sample: 18.5262
2019/02/12 08:08:20  [Corona4D] Rays/s: 23079.2
2019/02/12 08:08:20  [Corona4D] Samples/s: 1245.76
2019/02/12 08:08:20  [Corona4D] Saving + Cleaning up took 0.066 seconds
2019/02/12 08:08:20  [Corona4D] CoronaCore::exiting renderFrame
2019/02/12 08:08:20  [Corona4D] Rendered 0/0 passes
2019/02/12 08:08:21  [Corona4D] [TR] Sending chunk 8/41 to the server
2019/02/12 08:08:22  [Corona4D] [TR] Finished chunks: 1/41
2019/02/12 08:08:23  [Corona4D] [TR] Sending chunk 9/41 to the server
2019/02/12 08:08:25  [Corona4D] [TR] Finished chunks: 2/41
2019/02/12 08:08:25  [Corona4D] [TR] Sending chunk 10/41 to the server
2019/02/12 08:08:29  [Corona4D] [TR] Finished chunks: 3/41
2019/02/12 08:08:29  [Corona4D] [TR] Sending chunk 11/41 to the server
2019/02/12 08:08:49  [Corona4D] [TR] Finished chunks: 4/41
2019/02/12 08:08:49  [Corona4D] [TR] Sending chunk 12/41 to the server
2019/02/12 08:09:00  [Corona4D] [TR] Finished chunks: 5/41
2019/02/12 08:09:01  [Corona4D] [TR] Sending chunk 13/41 to the server
2019/02/12 08:09:12  [Corona4D] [TR] Finished chunks: 6/41
2019/02/12 08:09:12  [Corona4D] [TR] Sending chunk 14/41 to the server
2019/02/12 08:09:22  [Corona4D] [TR] Finished chunks: 7/41
2019/02/12 08:09:22  [Corona4D] [TR] Sending chunk 15/41 to the server
2019/02/12 08:09:27  [Corona4D] [TR] Finished chunks: 8/41
2019/02/12 08:09:28  [Corona4D] [TR] Sending chunk 16/41 to the server
2019/02/12 08:09:31  [Corona4D] [TR] Finished chunks: 9/41
2019/02/12 08:09:32  [Corona4D] [TR] Sending chunk 17/41 to the server
2019/02/12 08:09:36  [Corona4D] [TR] Finished chunks: 10/41
2019/02/12 08:09:37  [Corona4D] [TR] Sending chunk 18/41 to the server
2019/02/12 08:09:44  [Corona4D] [TR] Finished chunks: 11/41
2019/02/12 08:09:45  [Corona4D] [TR] Sending chunk 19/41 to the server
2019/02/12 08:09:50  [Corona4D] [TR] Finished chunks: 12/41
2019/02/12 08:09:51  [Corona4D] [TR] Sending chunk 20/41 to the server
2019/02/12 08:10:04  [Corona4D] [TR] Finished chunks: 13/41
2019/02/12 08:10:04  [Corona4D] [TR] Sending chunk 21/41 to the server
2019/02/12 08:10:07  [Corona4D] [TR] Finished chunks: 14/41
2019/02/12 08:10:08  [Corona4D] [TR] Sending chunk 22/41 to the server
2019/02/12 08:10:14  [Corona4D] [TR] Finished chunks: 15/41
2019/02/12 08:10:15  [Corona4D] [TR] Sending chunk 23/41 to the server
2019/02/12 08:10:19  [Corona4D] [TR] Finished chunks: 16/41
2019/02/12 08:10:19  [Corona4D] [TR] Sending chunk 24/41 to the server
2019/02/12 08:10:25  [Corona4D] [TR] Finished chunks: 17/41
2019/02/12 08:10:26  [Corona4D] [TR] Sending chunk 25/41 to the server
2019/02/12 08:10:30  [Corona4D] [TR] Finished chunks: 18/41
2019/02/12 08:10:31  [Corona4D] [TR] Sending chunk 26/41 to the server
2019/02/12 08:10:40  [Corona4D] [TR] Finished chunks: 19/41
2019/02/12 08:10:41  [Corona4D] [TR] Sending chunk 27/41 to the server
2019/02/12 08:10:44  [Corona4D] [TR] Finished chunks: 20/41
2019/02/12 08:10:45  [Corona4D] [TR] Sending chunk 28/41 to the server
2019/02/12 08:10:48  [Corona4D] [TR] Finished chunks: 21/41
2019/02/12 08:10:49  [Corona4D] [TR] Sending chunk 29/41 to the server
2019/02/12 08:10:51  [Corona4D] [TR] Finished chunks: 22/41
2019/02/12 08:10:52  [Corona4D] [TR] Sending chunk 30/41 to the server
2019/02/12 08:10:54  [Corona4D] [TR] Finished chunks: 23/41
2019/02/12 08:10:55  [Corona4D] [TR] Sending chunk 31/41 to the server
2019/02/12 08:11:02  [Corona4D] [TR] Finished chunks: 24/41
2019/02/12 08:11:02  [Corona4D] [TR] Sending chunk 32/41 to the server
2019/02/12 08:11:05  [Corona4D] [TR] Finished chunks: 25/41
2019/02/12 08:11:05  [Corona4D] [TR] Sending chunk 33/41 to the server
2019/02/12 08:11:08  [Corona4D] [TR] Finished chunks: 26/41
2019/02/12 08:11:08  [Corona4D] [TR] Sending chunk 34/41 to the server
2019/02/12 08:11:11  [Corona4D] [TR] Finished chunks: 27/41
2019/02/12 08:11:11  [Corona4D] [TR] Sending chunk 35/41 to the server
2019/02/12 08:11:13  [Corona4D] [TR] Finished chunks: 28/41
2019/02/12 08:11:13  [Corona4D] [TR] Sending chunk 36/41 to the server
2019/02/12 08:11:14  [Corona4D] [TR] Finished chunks: 29/41
2019/02/12 08:11:15  [Corona4D] [TR] Sending chunk 37/41 to the server
2019/02/12 08:11:16  [Corona4D] [TR] Finished chunks: 30/41
2019/02/12 08:11:16  [Corona4D] [TR] Sending chunk 38/41 to the server
2019/02/12 08:11:22  [Corona4D] [TR] Finished chunks: 31/41
2019/02/12 08:11:23  [Corona4D] [TR] Sending chunk 39/41 to the server
2019/02/12 08:11:25  [Corona4D] [TR] Finished chunks: 32/41
2019/02/12 08:11:25  [Corona4D] [TR] Sending chunk 40/41 to the server
2019/02/12 08:11:27  [Corona4D] [TR] Finished chunks: 33/41
2019/02/12 08:11:27  [Corona4D] [TR] Sending chunk 0/41 to the server
2019/02/12 08:11:31  [Corona4D] [TR] Finished chunks: 34/41
2019/02/12 08:11:32  [Corona4D] [TR] Sending chunk 1/41 to the server
2019/02/12 08:11:34  [Corona4D] [TR] Finished chunks: 35/41
2019/02/12 08:11:34  [Corona4D] [TR] Sending chunk 2/41 to the server
2019/02/12 08:11:35  [Corona4D] [TR] Finished chunks: 36/41
2019/02/12 08:11:36  [Corona4D] [TR] Sending chunk 3/41 to the server
2019/02/12 08:11:39  [Corona4D] [TR] Finished chunks: 37/41
2019/02/12 08:11:40  [Corona4D] [TR] Sending chunk 4/41 to the server
2019/02/12 08:11:41  [Corona4D] [TR] Finished chunks: 38/41
2019/02/12 08:11:42  [Corona4D] [TR] Sending chunk 5/41 to the server
2019/02/12 08:11:45  [Corona4D] [TR] Finished chunks: 39/41
2019/02/12 08:11:46  [Corona4D] [TR] Sending chunk 6/41 to the server
2019/02/12 08:11:49  [Corona4D] [TR] Finished chunks: 40/41
2019/02/12 08:11:50  [Corona4D] [TR] Sending chunk 7/41 to the server
2019/02/12 08:11:51  [Corona4D] [TR] Finished chunks: 41/41
Peer-to-peer statistics:
    > CAD660022 download speed 10.09 MiB/s (1x)
    > PC01 download speed 10.62 MiB/s (8x)
    > CAD660029 download speed 34.58 MiB/s (13x)
    > CAD660023 download speed 41.25 MiB/s (1x)
    > CAD660025 download speed 48.63 MiB/s (1x)
End of peer-to-peer statistics

I only censored some asset names.

Hope that helps!

Greetings,
Moritz

2019-02-12, 09:07:22
Reply #38

HFPatzi

  • Active Users
  • **
  • Posts: 138
    • View Profile
OK... one last thing :D

I rendered the whole scene with Physical Renderer and TR for comparison.

Render time: 24 min, no dropouts.
Here is one example client console log:

Code: [Select]
Grabbed Render-Job 'pv' from machine 'PC01'
Asset 'XXXXXX.c4d' created
Downloading asset(s)
Downloading asset(s)
Downloading asset(s)
Downloaded Asset(s) in 9.771 seconds
Start Rendering for Machine PC01
Asset 'XXXXXX.psd' created
Downloading asset(s)
Asset 'XXXXXX.tif' created
Asset 'XXXXXX.tif' created
Asset 'Brushed-Metal-1024x1024.png' created
Asset 'XXXXXX.png' created
Asset 'XXXXXX.tif' created
Asset 'XXXXXX.tif' created
Asset 'MetalStainlessSteelBrushedElongated005_GLOSS_8K_SPECULAR.jpg' created
Asset 'MetalStainlessSteelBrushedElongated005_NRM16_8K_SPECULAR.tif' created
Asset 'studio003.hdr' created
Rendering frame 0 of job 'pv'
Peer-to-peer statistics:
    > CAD660027 download speed 4.84 MiB/s (1x)
    > PC01 download speed 22.14 MiB/s (9x)
    > CAD660026 download speed 30.18 MiB/s (1x)
    > CAD660028 download speed 35.49 MiB/s (10x)
    > CAD660024 download speed 77.72 MiB/s (1x)
End of peer-to-peer statistics

2019-02-12, 10:15:27
Reply #39

kizo

  • Active Users
  • **
  • Posts: 28
    • View Profile
Hello Moritz,

It's great to see you are testing this too. The tests seem to show the same pattern: the faster the render gets, the more clients drop out. It makes me wonder how great and fast it would be if none of them disconnected.
Getting no disconnects but a slow render time is frustrating when you have more power available.

"The main Problem i have with TR right now is, that one has to do a lot of tests to find out the perfect render settings/ number of clients for every project. Almost all time advantage you have with TR is eaten up by finding the best settings for each project"

Yes, I agree that's a big problem for me too. It is scene- and resolution-dependent, but the number of nodes used also plays a role, so to use the clients optimally you have to do extensive testing before rendering, and that takes way too much time.
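
Just to put rough numbers on it (all values below are hypothetical examples, purely to show how the combinations multiply, not recommendations):

Code: [Select]
# Back-of-the-envelope cost of the per-project TR tuning described above.
# All values are hypothetical examples, not recommendations.
from itertools import product

packet_sizes = [5, 10, 50, 100, 200]    # MB
intervals    = [10, 20, 30]             # seconds
node_counts  = [2, 4, 6, 9]

combos = list(product(packet_sizes, intervals, node_counts))
minutes_per_test = 20                   # roughly one 4k test render

print(f"{len(combos)} combinations -> "
      f"~{len(combos) * minutes_per_test / 60:.0f} hours of test renders")
# prints: 60 combinations -> ~20 hours of test renders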

@Tom

I tested the "Living Room.c4d" scene from the content browser and got clients disconnecting. I have only managed one test using all clients so far, but I will try to do more and let you know.

Cheers

kizo

2019-02-12, 14:38:11
Reply #40

TomG

  • Administrator
  • Active Users
  • *****
  • Posts: 5434
    • View Profile
Wow, thanks guys for the tests, very much appreciated!

Moritz, what was the render limit: noise or passes? Any chance you made a note of how many passes were done? One thing to note about distributed rendering (and this applies to Max too) is that it may "go over" the render limit. Say the target is 4% noise: the master receives updates of 30 passes from the nodes while the actual noise is 4.1%, so it continues rendering; then, just as it hits 4%, it receives another 30 passes from the clients and the noise drops to 3%. So you can end up with more passes and a cleaner image than on a single workstation, which can obscure some of the time savings. The more nodes in use, the more likely this is to happen, and by a larger margin; it still shouldn't be massive, but it may play a part.
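
To picture why, here is a toy simulation (hypothetical numbers, and emphatically not our actual scheduling code): clients report passes in batches, the master can only call a stop after an incoming update pushes the merged total past the limit, and passes already rendered elsewhere still arrive and get merged afterwards. The overshoot grows with the number of clients:

Code: [Select]
# Toy model of the distributed-render "overshoot" described above.
# All numbers are hypothetical; this illustrates the idea only and is
# not Corona's actual scheduler.
import random

def simulate(pass_limit=15, num_clients=9, batch=3):
    merged = 0                             # passes the master has merged
    in_flight = [0] * num_clients          # rendered but not yet reported
    while merged < pass_limit:
        c = random.randrange(num_clients)  # some client finishes a pass
        in_flight[c] += 1
        if in_flight[c] == batch:          # clients report in batches
            merged += in_flight[c]
            in_flight[c] = 0
    # The master now tells everyone to stop, but passes that were already
    # rendered elsewhere still arrive and are merged into the final image.
    return merged + sum(in_flight)

print([simulate() for _ in range(5)])  # typically low 20s for a limit of 15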

Interesting that the Win machines seem to drop out more than the Mac machines. (EDIT: though there are more Win clients, so a random draw of which machine drops out is more likely to land on a Win machine... so, reading further, maybe there isn't a difference, or not a big one. :) )

Also interesting, Kizo, that a native render engine and scene also seem to show the dropouts; that could point to a general flaw in TR itself rather than in how we implement it. Not to say there isn't anything we can do if that's the case (we'd certainly still look into it), but it's good to know where to look and which solutions to consider and investigate.

Thanks!
« Last Edit: 2019-02-12, 14:45:22 by TomG »
Tom Grimes | chaos-corona.com
Product Manager | contact us

2019-02-12, 15:08:12
Reply #41

HFPatzi

  • Active Users
  • **
  • Posts: 138
    • View Profile
You're right, I forgot to mention that :) In all tests, the pass limit was set to 15 and no denoising was activated.
I also noticed the "overshoot" when a pass limit is set, e.g. 26 of 15 passes.
I will test the "Living Room.c4d" scene too ASAP and report the results.

2019-02-12, 16:13:03
Reply #42

HFPatzi

  • Active Users
  • **
  • Posts: 138
    • View Profile
"Any chance you made a note of how many passes were done?"

You mean how many more passes than the 15 it was set up for? If so, unfortunately I haven't. But I can tell you it was always more than 15 (maybe between 5 and 15 more) and not always the same.

2019-02-12, 16:19:03
Reply #43

TomG

  • Administrator
  • Active Users
  • *****
  • Posts: 5434
    • View Profile
NP on not having the exact number; it's good to know what the render limit was, though. And yes, that expected "overrun" will account for some of the time differences: render time may not drop quite as much as expected, but that's because more passes are actually being done and a less noisy render is being produced.
Tom Grimes | chaos-corona.com
Product Manager | contact us

2019-02-16, 09:30:54
Reply #44

Gnorkie

  • Active Users
  • **
  • Posts: 21
    • View Profile
Same issues here in the company I work for.

We have already bought seven WS+5-node licenses and one WS+10-node license.

We could potentially use 50 clients to render, and yes... 1 Gbit Ethernet.

Last night I tried a 5000 px A3 image, a quite simple file, and clients were constantly getting disconnected; I kept restarting them manually from the main workstation.

Sorry, guys at Corona, but this is quite disappointing since you are selling this product, and not every company has time to run as many tests as Kizo is doing here; some companies might not even read the forum.

As written in one of the posts, I definitely suggest you develop a dedicated DR server. VRAYforC4D's was (and is) working smoothly, using the vrayserver from Chaos. Now that Chaos has acquired them, let's see what comes.
The company I work for used Maxwell Render before switching to Corona (we are trying to...), and we never had such issues, even when using 20 or 30 nodes to calculate a single image.

I might sound harsh, but you are selling a product that you didn't develop yourselves and that, it seems, you didn't test properly. I'm not talking about Corona itself but about the Team Render solution.


EDIT:

Reading the other post, I see people complaining about the vrayserver restrictions. Well, the later a proper DR solution is developed, the more users will feel the "loss" of native C4D shader usability.
« Last Edit: 2019-02-16, 09:35:53 by Gnorkie »

2019-02-16, 11:25:41
Reply #45

Gnorkie

  • Active Users
  • **
  • Posts: 21
    • View Profile
OK, now I'm trying with only 5 nodes + WS, and they don't seem to disconnect. Last night I was using 10 nodes and they were constantly disconnecting.

I'll try to keep these 6 computers busy for testing... mainly to check the best number of nodes to use (WS + 2, 3, 4, or 5) to get a reasonable speedup.

EDIT: adding the 5th node causes disconnections. With 4 nodes it was working fine for more than 1 hour.
« Last Edit: 2019-02-16, 11:43:45 by Gnorkie »

2019-02-19, 18:35:53
Reply #46

Gnorkie

  • Active Users
  • **
  • Posts: 21
    • View Profile
The Object ID pass is not working with Team Render; it renders completely black.

2019-02-19, 19:30:58
Reply #47

TomG

  • Administrator
  • Active Users
  • *****
  • Posts: 5434
    • View Profile
"The Object ID pass is not working with Team Render; it renders completely black."

Currently, each machine involved in TR assigns its own colors to each object, which leads to problems when the results are merged. This was reported at https://forum.corona-renderer.com/index.php?topic=23411 and has been logged for us to investigate and fix :) (as noted in that other thread, the internal tracking number for the issue is 313390528).
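
To illustrate the nature of the problem (a generic sketch only, not our actual code or the planned fix): if each node picks colors in its own scene-traversal order, the chunks they send back disagree; deriving the color from a stable identifier would make every node produce the same color for the same object.

Code: [Select]
# Generic sketch of one way to make Object ID colors consistent across
# machines: derive the color from a stable identifier (here, a hash of
# the object name) instead of per-machine assignment order. This is an
# illustration of the idea, not Corona's actual implementation; the
# object name "Cube_042" is a made-up example.
import hashlib

def object_id_color(object_name: str) -> tuple:
    """Deterministically map an object name to an RGB color in [0, 1]."""
    digest = hashlib.md5(object_name.encode("utf-8")).digest()
    return tuple(b / 255.0 for b in digest[:3])

# Every machine that computes the color this way agrees on it:
print(object_id_color("Cube_042"))
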
Tom Grimes | chaos-corona.com
Product Manager | contact us