Chaos Corona Forum

Chaos Corona for 3ds Max => [Max] Daily Builds => Topic started by: maru on 2018-10-24, 11:37:14

Title: DR server needs to be restarted issue - V3 daily builds update
Post by: maru on 2018-10-24, 11:37:14
We have received a few reports from users having an issue with the DR server where it gets stuck and has to be restarted to continue rendering. Unfortunately, we have never been able to reproduce this problem, and investigation based on your reports and minidumps has not yet led us to the root cause of this bug. This makes us suspect that the issue might be related to a specific hardware configuration, network setup, or 3rd party plugins and applications (however, we are not saying it's not our fault!).

We have identified some other bugs which we believe may be related to this problem, and we hope that fixing them fixes this one as well. Some of the fixes have already been released in recent builds, and some will be released in upcoming ones.

We have also added a "Restart 3ds Max after each render" option to the DR server based on your requests. We hope this will serve as a workaround for this problem until we can identify the real cause and fix it properly.

Please let us know if you are still experiencing this "DR server needs to be restarted" issue with the "Restart 3ds Max" checkbox both enabled and disabled, as this will greatly help us improve distributed rendering further.

*Update: note that the new version of the DR server application is installed into C:\Program Files\Corona\DR Server\DrServer.exe while the old one was installed into C:\Program Files\Corona\DrServer.exe. Make sure you are launching the correct version of the DR server application. It must have "DrServer | 3 (Release Candidate X)" text printed in its title bar.
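
A minimal sketch for checking which of the two install locations mentioned above is present on a machine. The two paths are the ones from this post; the function names `find_dr_servers` and `advice` are made up for illustration and are not part of Corona.

```python
import os

# Paths quoted in the post above.
NEW_PATH = r"C:\Program Files\Corona\DR Server\DrServer.exe"
OLD_PATH = r"C:\Program Files\Corona\DrServer.exe"


def find_dr_servers(exists=os.path.isfile):
    """Report which of the two known install locations is present.

    `exists` is injectable so the check can be tried without a real install.
    """
    return {"new": exists(NEW_PATH), "old": exists(OLD_PATH)}


def advice(found):
    """Turn the result of find_dr_servers() into a short hint."""
    if found["new"] and found["old"]:
        return "Both copies found: launch the one in the 'DR Server' subfolder."
    if found["new"]:
        return "Only the new DR Server location was found."
    if found["old"]:
        return "Only the old location was found: the new DR server is not installed."
    return "No DrServer.exe found in either location."


if __name__ == "__main__":
    print(advice(find_dr_servers()))
```

Even when the right file is found, the title bar text mentioned above is still the final check.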

The newest daily build can be downloaded from https://coronarenderer.freshdesk.com/support/solutions/articles/5000570015
Feel free to share your feedback in this forum thread or through https://coronarenderer.freshdesk.com/support/tickets/new

Thank you in advance for testing and for your patience, and sorry for this inconvenience.

Update: 09.11.2018 - V3 RC5 released with yet another DR-related fix. Please try it.

Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: Dionysios.TS on 2018-10-25, 16:32:37
Unfortunately the master PC was stuck again this morning. I hoped the problem had disappeared, but I was wrong.
I can't figure out why it is happening.

It seems if I use the DR servers from my local PC everything works fine.
Using Backburner and sending the job to another server + DR randomly blocks the main server, while the DR ones continue to calculate I don't even know what?!
Their task manager shows CPU at 100% while the main server is blocked.

Dionysios -
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: maru on 2018-10-25, 16:35:28
Quote
Unfortunately the master PC was stuck again this morning. I hoped the problem had disappeared, but I was wrong.
Did you try with the "Restart 3ds Max" option on and off? Was it the same in both cases?
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: Dionysios.TS on 2018-10-25, 16:37:27
Quote
Unfortunately the master PC was stuck again this morning. I hoped the problem had disappeared, but I was wrong.
Did you try with the "Restart 3ds Max" option on and off? Was it the same in both cases?

We have just activated it. We have a very long list of renders to do, so we will know by tomorrow morning whether it works or not.
For sure I'll let you know! :)

Dionysios -
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: Dionysios.TS on 2018-10-26, 10:19:15
No good news, the system got blocked 2 times tonight, again... That's so sad...

Last night at 11pm I had to log in from home via TeamViewer to the Master PC, and the desktop was frozen. Fortunately I could start the task manager and kill the 3ds Max process.
In a second, everything went back to normal. Backburner sent the same job again, it started successfully, and the job finished during the night.
After that, another job got done 3 hours later, and the next one started, but now that I look at the desktop the process seems alive but everything in the Corona UI is sooooo slow.
This is what happens before the process gets blocked! If I click the Post, Stats, History, DR, or LightMix tabs, all of them respond very slowly.

This is all I can give for now, and I don't know how you guys are going to resolve this issue. It is getting really annoying...
We need to find a solution ASAP. Let me know if you need any extra data from us.

Thanks,

Dionysios -
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: maru on 2018-10-26, 14:19:42
Thanks for testing Dionysios, and sorry to hear about your results. We will definitely do our best to fix this.
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: Dionysios.TS on 2018-10-26, 14:24:16
Quote
Thanks for testing Dionysios, and sorry to hear about your results. We will definitely do our best to fix this.

I know, it's not your fault guys; I wish I could actually help...
I have an update: I wrote this morning that the system was frozen, and it was true, but I left it there for a while and in the end it finished the render job and went on with the rest. So no crash for now.
We have 8 more jobs to be done in Backburner, so I'll let you know.

We have the "Restart 3ds Max" option ON now.

Thanks,

Dionysios -
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: maru on 2018-10-26, 14:25:57
Quote
the system was frozen, it was true; I left it there for a while and in the end it finished the render job and went on with the rest
Does it mean that you left the computer frozen, and then it unfroze and continued rendering?? Or maybe I misunderstood your message?
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: Dionysios.TS on 2018-10-26, 14:30:59
Quote
the system was frozen, it was true; I left it there for a while and in the end it finished the render job and went on with the rest
Does it mean that you left the computer frozen, and then it unfroze and continued rendering?? Or maybe I misunderstood your message?

Yes I confirm!

It was frozen, left it there and went on!
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: maru on 2018-10-26, 14:34:34
Quote
It was frozen, left it there and went on!
Woah, I don't think we've ever had a similar report. Not sure if it's a good thing, as it might potentially make the issue even more confusing. :/
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: Dionysios.TS on 2018-10-26, 14:39:02
Quote
It was frozen, left it there and went on!
Woah, I don't think we've ever had a similar report. Not sure if it's a good thing, as it might potentially make the issue even more confusing. :/

I know... :(
But what could it be that freezes the 3ds Max / Corona process so much? And by the way, it never happens at the start of the rendering process but ALWAYS near the end.
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: Dionysios.TS on 2018-10-30, 11:44:30
I have some updates but I am not sure if they may help. Better share them with you in any case.

Let's start:

- Yesterday I sent 6 jobs to Backburner.
- At 10pm I checked the DR Master via TeamViewer to see how the jobs were doing.
- One of the jobs was frozen in this state: the Corona VFB was open but frozen; the noise threshold had been reached, so the job was 100% complete, but Corona was just there, waiting and doing nothing.
- I decided to check what the other DR servers were doing at this point.
- DR Server 03 was in standby mode! Not working at all, which is normal if you consider that the rendering process had reached 100%.
- I then connected to my personal workstation, which I also use as a DR server during the night, and guess what??? The DR server was in rendering mode!!! WTF! :)
- I checked the logs and found that at that specific moment it was calculating passes and couldn't send the file to the DR Master!!!
- I closed the DR server on my machine at this point, and you know what? The DR Master finally completed the job instantly and saved the file...
- Last thing: I opened the task manager on my PC and found another 3ds Max process still running, even though I had closed the DR server a while before. It seems that a second 3ds Max process was running and maybe caused problems for the DR process at the end? I don't know guys.
- I started the DR server on my machine again, and so far all the remaining Backburner jobs are going on without problems.

Excuse me for the long message, but I am trying to help and be as detailed as I can.

Thanks,

Dionysios -
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: maru on 2018-10-30, 19:02:31
Thanks for your message. It may be crucial that there is a 2nd instance of 3ds Max running. We have recently identified a similar issue.
We will investigate this - stay tuned for updates.
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: Dung (Ivan) on 2018-10-30, 21:54:13
Hi, are you using Corona DR? Or Backburner servers? I want to set up the same scenario, and so far I have only been able to set it up with Backburner servers when using Backburner Monitor.

Thank you for your patience :)
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: Dionysios.TS on 2018-10-31, 10:33:44
Quote
Hi, are you using Corona DR? Or Backburner servers? I want to set up the same scenario, and so far I have only been able to set it up with Backburner servers when using Backburner Monitor.

Thank you for your patience :)

Hi and I am glad to help!
 
I also have some updates but first I'll answer your question.
This is our render farm setup:

Workstation 1 (used as DR server during the night and during the day uses the DR Servers below if available for rendering tests)
Workstation 2 (used as DR server during the night and during the day uses the DR Servers below if available for rendering tests)
DR Server 04  (used as Backburner Manager and DR Master machine during the Backburner process)
DR Server 03  (only DR Server)

So basically, DR Server 04 receives the jobs from Backburner and renders them using all the above computers in DR mode.

Here is today's update:

The freeze happened again tonight, and I saw it remotely from my phone this morning.
Again, DR Server 04 was frozen, but this time it wasn't my workstation that was stuck on the saving process but DR Server 03.
"Magically", turning off the DR Server process on DR Server 03 resolved the problem in the overall DR process: the image was saved instantly and Backburner moved on to the next job.

I took a screenshot of the DR Server's task manager after I closed the DR Server process, and I can see a Corona process still running!!! What is this? (See image below)
Sorry for the size of the screenshot, but it was taken with my phone.

So in general, one of the DR servers (the workstations or DR Server 03) has difficulty saving the passes during the Backburner process and goes on forever, even if the passes on the master machine are 100% done. And the whole DR process freezes.

This is all I can report for now.

Thanks,

Dionysios -

Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: Dionysios.TS on 2018-10-31, 18:18:13
I would like to share some thoughts with you guys.

This issue is very strange. Something is really blocking the process, and I've noticed that the freeze is not always there. Sometimes the DR Master just seems sluggish and slow, and the estimated remaining render time is extremely long, while the same scene opened on a local workstation using DR at the same time renders everything perfectly.

Sometimes I think the problem could be the Nvidia AI Denoiser, but I am not sure. We had freezes even when the denoiser was the Corona one, but the sluggish behaviour is almost the same as having 2 sessions of 3ds Max open locally and rendering an image with Nvidia AI on in one of them. It is sluggish as hell.

Maybe it is a .NET Framework bug; actually, it could be hundreds of things.

It would be perfect if we could replicate the issue on your network as well, but I don't know how we can do that.

I have a scene that lately always gives us problems when it is rendered via Backburner + DR and the job is sent to DR Server 04 rather than rendered locally.
It is quite heavy, as it is full of glass bricks and caustics are enabled on their glass. I don't know if this could help you or not. In that particular scene Corona gives us strange remaining render time estimates when Backburner is used, while locally + DR works fine.

Let me know if we should prepare the scene for you.

Thanks,

Dionysios -
Title: DR sending masking samples to slaves timed out
Post by: Giorgos Zacharioudakis on 2018-11-04, 22:48:05
Hi,

Has anyone encountered the above message? Every time I use DR it freezes and I have to close the DR server to get the master working. Then I get the error “sending masking samples to slaves timed out”

Corona 3 RC4
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: Giorgos Zacharioudakis on 2018-11-04, 23:14:39
Increasing the timeout from 60 to 120 seems to work. But why is this happening in Corona 3?
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: maru on 2018-11-05, 13:58:42
@Dionysios:

I'd like to sum up what we have so far. Please let me know if any of the below is not correct, or if you would like to add anything:

Main issue:
Rendering getting "stuck" when using Corona's distributed rendering with Autodesk Backburner.

Additional issues:
Sometimes the VFB becomes very slow.

Additional notes:
Quote
It seems if I use the DR servers from my local PC everything works fine.
Does this mean that if you use ONLY the Corona's distributed rendering (without Backburner), then everything is working fine, and the issue never appears?

Quote
next one started but now that I see the desktop the process seems alive but everything in the Corona UI is sooooo slow
So this means that you can see the VFB on the node computer, and that the VFB works very slowly. Right?
Is this happening when using Corona's DR + Backburner, or also when using Corona's DR only (without Backburner)?

-Sometimes the process gets un-stuck if you wait long enough - is this correct?

-When the rendering is stuck, are there always 2 or more instances of 3dsmax.exe in the task manager? Or is it sometimes stuck with just 1 3dsmax.exe in the task manager?

-Do you have all Windows Updates installed, including the newest Spectre patch?
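
For the question about how many 3dsmax.exe instances are running, a rough sketch like the one below can take the count on each machine instead of reading it off the task manager by eye. It parses the CSV output of the standard Windows `tasklist /FO CSV /NH` command. The helper names and the sample rows (PIDs, memory figures) are made up for illustration.

```python
import csv
import io
import subprocess


def count_instances(tasklist_csv, image_name="3dsmax.exe"):
    """Count rows of `tasklist /FO CSV /NH` output whose image name matches."""
    rows = csv.reader(io.StringIO(tasklist_csv))
    wanted = image_name.lower()
    return sum(1 for row in rows if row and row[0].lower() == wanted)


def count_running(image_name="3dsmax.exe"):
    """Query the local machine. Windows only, since it shells out to tasklist."""
    out = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    ).stdout
    return count_instances(out, image_name)


# Made-up sample in the same column layout tasklist uses:
# image name, PID, session name, session number, memory usage.
SAMPLE = '''"System Idle Process","0","Services","0","8 K"
"3dsmax.exe","4120","Console","1","5,201,344 K"
"DrServer.exe","7312","Console","1","48,120 K"
"3dsmax.exe","9876","Console","1","1,024,000 K"
'''

if __name__ == "__main__":
    print("3dsmax.exe instances in sample:", count_instances(SAMPLE))
```

Running `count_running()` on a node at the moment rendering is stuck would give the number asked about above.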



@CloundN9:
Can you please contact us about this issue here https://coronarenderer.freshdesk.com/support/tickets/new and provide your full DR and Backburner logs (even if you are not using BB)? Here is how to get them: https://coronarenderer.freshdesk.com/support/solutions/articles/12000002065
Thanks.
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: Dionysios.TS on 2018-11-05, 14:46:54
Thanks for getting back, Maru. I'll answer your questions:

Quote
Does this mean that if you use ONLY the Corona's distributed rendering (without Backburner), then everything is working fine, and the issue never appears?
It happened only via Backburner; locally we have never had any issues so far, so I can confirm that DR locally seems to work fine.

Quote
So this means that you can see the VFB on the node computer, and that the VFB works very slowly. Right?
Is this happening when using Corona's DR + Backburner, or also when using Corona's DR only (without Backburner)?
Only when I use DR + Backburner, it starts working very slowly and after a while the process freezes.

Quote
Sometimes the process gets un-stuck if you wait long enough - is this correct?
Yes confirm!

Quote
When the rendering is stuck, are there always 2 or more instances of 3dsmax.exe in the task manager? Or is it sometimes stuck with just 1 3dsmax.exe in the task manager?
Unfortunately that's random, but in most cases I saw a second instance; some other times there was only 1.

Quote
Do you have all Windows Updates installed, including the newest Spectre patch?
I don't know this info, I need to ask our IT manager. Every week we receive updates from our main server, so if it is important I can check with him right away.

Thanks!
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: maru on 2018-11-05, 14:58:25
Thanks for the replies.

Quote
Do you have all Windows Updates installed, including the newest Spectre patch?
I don't know this info, I need to ask our IT manager. Every week we receive updates from our main server, so if it is important I can check with him right away.
It would be great if you could check this. There are some random issues after applying the spectre/meltdown fix, usually they are related to CPU usage and general performance when rendering, but who knows...

One more question:
Are you running only one job on the network, and then all nodes are working on this single job?
Or are you submitting multiple jobs, and various nodes pick up various jobs?
When there are a few masters and a few nodes on one network, and various jobs are submitted, then some issues may appear.
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: Dionysios.TS on 2018-11-05, 15:45:11
I am 100% sure we didn't make any microcode updates, so on the HW side things are the same as 1 year ago. As for the Windows OS side, we receive updates every 2 or 3 weeks, so I don't know for now whether that update is already installed. I checked the update history but can't see anything for this.

As for the network, I run 1 job on the network and all the nodes are working on this. The network Master is only 1.
When we have issues we use, as explained before, our local workstations + DR.
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: Dionysios.TS on 2018-11-07, 11:07:09
I have a clue for you guys!!!

Right now we've got a freeze issue again.
Job sent a while ago via Backburner to our DR Master.

The job started successfully; after a while the Corona UI on the DR Master froze and the calculation became very, very slow.
I checked all the DR servers, and in this case one of them, which is one of our workstations, was giving an error when sending the EXR dump file.
The CPUs were at 100%, but the Corona process was frozen there as well.

I forced the shutdown of the DR Server application and the job in Backburner went back to normal!

At that moment, my assistant had opened a 3ds Max session to work on a simple scene while her PC was part of the DR process. Could that be the reason?
That is something random btw. I found the same problem with the normal DR Servers being frozen because they can't send the EXR dump file and they block the whole process, and guess what? Another 3ds Max process was running.

Here is the error I saw in the DR log:

2018-11-07 09:58:11   Finished EXR dump after 2 s
2018-11-07 09:58:11   Received sampling focus mask (region 0 2378 6100 2460)
2018-11-07 09:59:16   Started EXR dump (206 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump8989.exr)
2018-11-07 09:59:19   Finished EXR dump after 2 s
2018-11-07 09:59:19   Received sampling focus mask (region 0 2460 6100 2542)
2018-11-07 10:00:24   Started EXR dump (206 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump8990.exr)
2018-11-07 10:01:30   Started EXR dump (205 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump8991.exr)
2018-11-07 10:02:38   Started EXR dump (205 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump8992.exr)
2018-11-07 10:03:44   Started EXR dump (206 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump8993.exr)
2018-11-07 10:04:51   Started EXR dump (205 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump8994.exr)
2018-11-07 10:05:58   Started EXR dump (201 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump8995.exr)
2018-11-07 10:07:04   Started EXR dump (186 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump8996.exr)
2018-11-07 10:08:10   Started EXR dump (185 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump8997.exr)
2018-11-07 10:09:17   Started EXR dump (184 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump8998.exr)
2018-11-07 10:10:24   Started EXR dump (184 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump8999.exr)
2018-11-07 10:10:24   Sending file to remote side failed
2018-11-07 10:11:30   Started EXR dump (183 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9000.exr)
2018-11-07 10:12:35   Started EXR dump (182 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9001.exr)
2018-11-07 10:13:41   Started EXR dump (181 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9002.exr)
2018-11-07 10:14:47   Started EXR dump (181 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9003.exr)
2018-11-07 10:15:54   Started EXR dump (180 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9004.exr)
2018-11-07 10:17:00   Started EXR dump (178 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9005.exr)
2018-11-07 10:18:05   Started EXR dump (173 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9006.exr)
2018-11-07 10:19:11   Started EXR dump (128 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9007.exr)
2018-11-07 10:20:15   Started EXR dump (114 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9008.exr)
2018-11-07 10:20:25   Sending file to remote side failed
2018-11-07 10:21:20   Started EXR dump (114 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9009.exr)
2018-11-07 10:22:24   Started EXR dump (114 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9010.exr)
2018-11-07 10:23:29   Started EXR dump (110 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9011.exr)
2018-11-07 10:24:33   Started EXR dump (106 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9012.exr)
2018-11-07 10:25:38   Started EXR dump (105 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9013.exr)
2018-11-07 10:26:42   Started EXR dump (104 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9014.exr)
2018-11-07 10:27:46   Started EXR dump (81 455 470 bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9015.exr)
2018-11-07 10:28:53   Started EXR dump (207 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9016.exr)
2018-11-07 10:30:00   Started EXR dump (207 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9017.exr)
2018-11-07 10:30:25   Sending file to remote side failed
2018-11-07 10:31:07   Started EXR dump (207 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9018.exr)
2018-11-07 10:32:13   Started EXR dump (207 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9019.exr)
2018-11-07 10:33:21   Started EXR dump (207 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9020.exr)
2018-11-07 10:34:27   Started EXR dump (207 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9021.exr)
2018-11-07 10:35:34   Started EXR dump (208 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9022.exr)
2018-11-07 10:36:40   Started EXR dump (208 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9023.exr)
2018-11-07 10:37:57   Started EXR dump (207 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9024.exr)
2018-11-07 10:39:04   Started EXR dump (207 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9025.exr)
2018-11-07 10:40:11   Started EXR dump (207 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9026.exr)
2018-11-07 10:40:25   Sending file to remote side failed
2018-11-07 10:41:18   Started EXR dump (207 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9027.exr)
2018-11-07 10:42:25   Started EXR dump (207 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9028.exr)
2018-11-07 10:43:32   Started EXR dump (206 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9029.exr)
2018-11-07 10:44:39   Started EXR dump (206 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9030.exr)
2018-11-07 10:45:45   Started EXR dump (205 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9031.exr)
2018-11-07 10:46:52   Started EXR dump (206 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9032.exr)
2018-11-07 10:47:59   Started EXR dump (206 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9033.exr)
2018-11-07 10:49:06   Started EXR dump (206 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9034.exr)
2018-11-07 10:50:15   Started EXR dump (206 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9035.exr)
2018-11-07 10:50:26   Sending file to remote side failed
2018-11-07 10:51:22   Started EXR dump (207 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9036.exr)
2018-11-07 10:52:28   Started EXR dump (207 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump9037.exr)

As you can see, till hour 09:58:08 everything was fine.
After that, something happens and it gives "Sending file to remote side failed".
Then why do we have all those "Started EXR dump" messages from 10:00 on??? And why do we always get the "Sending file to remote side failed" error?
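
To make this pattern easier to spot in a long log, here is a small sketch that counts the three message types seen above. It assumes that each "Started EXR dump" should normally be followed by a "Finished EXR dump", which is what the first lines of the log suggest but is not confirmed by the developers. The function name `summarize_dr_log` is made up; the sample lines are copied from the log above.

```python
import re
from datetime import datetime

# Timestamp, then whitespace, then the message text.
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+(.*)$")


def summarize_dr_log(text):
    """Count started/finished dumps and send failures in a DR server log."""
    started = finished = failed = 0
    last_finished = first_failed = None
    for raw in text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # skip blank or unrecognised lines
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        message = match.group(2)
        if message.startswith("Started EXR dump"):
            started += 1
        elif message.startswith("Finished EXR dump"):
            finished += 1
            last_finished = stamp
        elif message.startswith("Sending file to remote side failed"):
            failed += 1
            if first_failed is None:
                first_failed = stamp
    return {
        "started": started,
        "finished": finished,
        "unfinished": started - finished,  # dumps with no "Finished" line
        "send_failures": failed,
        "last_finished": last_finished,
        "first_failure": first_failed,
    }


SAMPLE = """\
2018-11-07 09:59:16   Started EXR dump (206 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump8989.exr)
2018-11-07 09:59:19   Finished EXR dump after 2 s
2018-11-07 10:00:24   Started EXR dump (206 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump8990.exr)
2018-11-07 10:01:30   Started EXR dump (205 M bytes, filename: C:/Users/ABAGATELLA/AppData/Local/CoronaRenderer/DrData/dump8991.exr)
2018-11-07 10:10:24   Sending file to remote side failed
"""

if __name__ == "__main__":
    for key, value in summarize_dr_log(SAMPLE).items():
        print(key, value)
```

On the sample, the last "Finished EXR dump" is at 09:59:19 and two later dumps never report finishing, which matches what the full log shows from 10:00 onward.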

Hope all this can help!

Thanks,

Dionysios -
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: Dionysios.TS on 2018-11-07, 14:57:41
Happened again now and this time with 2 PCs (DR Servers) but no second 3ds Max instance was open on purpose or found in the task manager...

Dionysios -
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: Dionysios.TS on 2018-11-07, 16:04:59
Another weird thing today!

The DR Master appears in the DR Servers list without having the DR Server service open!
See attached file!
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: maru on 2018-11-08, 16:26:49
What's "workstation" and "DR master" - shouldn't it be the same thing?
Maybe there is some IP conflict in your network, and two computers are getting the same IP?
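
A quick way to check for this, sketched under assumptions: collect the address each machine reports (for example from `ipconfig` on each PC) and look for duplicates. The host names and addresses below are invented placeholders, and the function names are made up. The code only compares the addresses it is given; it cannot detect a conflict on the network by itself.

```python
import socket
from collections import defaultdict


def find_ip_conflicts(host_to_ip):
    """Return {ip: [hosts...]} for every address claimed by more than one host."""
    by_ip = defaultdict(list)
    for host, ip in host_to_ip.items():
        by_ip[ip].append(host)
    return {ip: sorted(hosts) for ip, hosts in by_ip.items() if len(hosts) > 1}


def resolve(hostnames):
    """Optional helper: look names up via DNS / the hosts file."""
    return {name: socket.gethostbyname(name) for name in hostnames}


# Invented example data, not taken from the thread.
EXAMPLE = {
    "WORKSTATION-1": "192.168.1.21",
    "WORKSTATION-2": "192.168.1.22",
    "DR-SERVER-03": "192.168.1.30",
    "DR-SERVER-04": "192.168.1.21",
}

if __name__ == "__main__":
    print(find_ip_conflicts(EXAMPLE))
```

An empty result means no two entries in the list share an address.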
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: maru on 2018-11-09, 14:36:08
We have just released RC5, which fixes yet another issue (Fixed DR server sometimes getting stuck when restarting slave 3ds Max). If anyone with the issue where DR server needs to be restarted to trigger rendering is reading this, please test the newest RC and report to us whether there is an improvement: https://coronarenderer.freshdesk.com/support/solutions/articles/5000570015
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: Dionysios.TS on 2018-11-13, 10:21:25
Quote
What's "workstation" and "DR master" - shouldn't it be the same thing?
Maybe there is some IP conflict in your network, and two computers are getting the same IP?

I thought the same thing, but I don't see any conflicts here.
Anyway, I'll install RC5 now and let you know.

Thanks!

Dionysios
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: Dionysios.TS on 2018-11-13, 15:34:11
I installed RC5 and unfortunately we got the freeze problem again just now!
It is very difficult to work under such circumstances and I really don't know how to help more.

My workstation was blocking the process this time, but in the DR log I didn't see any of the EXR sending errors that usually happened last week. When I closed the DR Server process on my workstation, everything started to work perfectly again.

Are we sure the .NET Framework doesn't block the process of the system??? Or the Nvidia Denoiser? Or who knows what else...

Actually, we had some problems with the .NET Framework in the past.

For now the only thing I can say is that working like this is quite impossible.
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: maru on 2018-11-13, 16:02:07
Quote
When I closed the DR Server process on my workstation, everything started to work perfectly again.
Can you please explain this sentence? Your DR setup has been a bit confusing to me all this time.
By "workstation" we usually mean a computer where a person is sitting and doing some stuff in 3ds Max. It is also the computer where you click "render" inside 3ds Max and then it either renders with the help of nodes, or sends the job to BB.
By "nodes" we usually mean computers where 3ds Max is running in command line mode, without its UI exposed, and no one is using those computers for working with 3ds Max (they may even not have monitors).

Is it the same for you, or are you using your computers in some different way?

Also, is there ever a situation for you where:
-You are using more than 1 instance of 3ds Max on the same PC
-You are running Backburner Manager/Server/Monitor and Corona's DR server on one PC?
-You are running 3ds Max and Corona's DR server on one PC?
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: Dionysios.TS on 2018-11-13, 16:19:26
Maru, can I ask you something?

If I go home for the night, can my workstation become a DR node for the time I am not in front of it, until I get back the next morning?

If I go for lunch and, let's say, for 90 minutes I want my workstation to contribute as a DR node in the general process, can I start the DR Server as well?

I have been in the industry for 20 years, I helped with mental ray and iray development for years, and I know what the terms "node" and "workstation" mean. No offense, please.

We have 4 PCs.
2 of them are Nodes.
2 of them are Workstations.

We use Backburner for the final production images and when we don't use our workstations we use them as nodes.

I don't understand where the problem is.

The Backburner Manager and DR master are always one and the same PC; it is one of the nodes, and it connects to any of the above PCs that is running the DR server.

I see a simple setup here.

When the Corona process freezes, it is always the fault of one of the nodes connected to the DR master, and it happens randomly.
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: maru on 2018-11-13, 16:30:43
Yes, of course, no offense. I only wanted to clarify those things. The reason is that we are dealing with many reports every day, and it is impossible to memorize all the details. Threads may get pretty lengthy, so it is good to have all the info sorted and stored in one place, in as simple a form as possible. Sorry if I sometimes sound dumb, but the ideal description of an issue is what others often call "explain like I'm five".
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: Dionysios.TS on 2018-11-13, 16:36:51
No worries! :)
I am not angry, just want to help! For you guys, the community and us.

You asked and I answer:

Quote
-You are using more than 1 instance of 3ds Max on the same PC
No, only one instance.

Quote
-You are running Backburner Manager/Server/Monitor and Corona's DR server on one PC?
The Backburner Manager PC, which is the DR master, runs Backburner Manager and Backburner Server at the same time, and that's it.

Quote
-You are running 3ds Max and Corona's DR server on one PC?
I don't quite understand this question. When the nodes work as nodes, only the DR Server is open and that's it.

Thanks!

Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: maru on 2018-11-13, 16:41:54
Thanks for the response. Just one more thing:

Quote
I force the shutdown of DR Server application and the job on Backburner turned out normal again!
^So this does not mean that the Corona DR Server and Backburner were working on one PC? They were running on two different PCs?

Quote
At that moment, my assistant had opened a 3ds Max session in the same time to work on a simple scene while her PC was in the DR process. Could be the reason?
^So in this case there was DR Server and 3ds Max running on the same PC? Or 2 instances of 3ds Max?

Quote
I found the same problem with the normal DR Servers being freezed cause they can't sent the EXR Dump file and they block all the process and guess what? Another 3ds Max process was on.
^Here you observed more than one 3ds Max process on a single PC?
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: Dionysios.TS on 2018-11-13, 17:39:23


Quote
I force the shutdown of DR Server application and the job on Backburner turned out normal again!
Quote
^So this does not mean that the Corona DR Server and Backburner were working on one PC? They were running on two different PCs?
Yes, the Backburner Manager receives the jobs, it starts, and it uses any DR Server node found available on the LAN.

Quote
At that moment, my assistant had opened a 3ds Max session in the same time to work on a simple scene while her PC was in the DR process. Could be the reason?
Quote
^So in this case there was DR Server and 3ds Max running on the same PC? Or 2 instances of 3ds Max?
This is an unusual situation, which always worked fine in the previous version of Corona. When we need some more power during the day, we add our workstations to the DR process while we work, so they become nodes, and so it sometimes happens that 2 3ds Max instances are open. We noticed that in this case the Backburner machine and the DR process on that specific workstation become less responsive. But the freezing problem also happened without 2 3ds Max instances open, like the freeze that happened today. Is this clear? If not, let me know.

Quote
I found the same problem with the normal DR Servers being freezed cause they can't sent the EXR Dump file and they block all the process and guess what? Another 3ds Max process was on.
Quote
^Here you observed more than one 3ds Max process on a single PC?
No extra instances were open in this case, only the one belonging to the rendering process.
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: maru on 2018-11-13, 17:42:22
The reason I asked was that these scenarios are not recommended, and various issues may appear when using them:
Quote
-You are using more than 1 instance of 3ds Max on the same PC
-You are running Backburner Manager/Server/Monitor and Corona's DR server on one PC?
-You are running 3ds Max and Corona's DR server on one PC?
Obviously we are trying to improve that, and it looks like it could help in your case.
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: Dionysios.TS on 2018-11-13, 17:45:18
I imagined so, and having noticed the problem we now avoid doing that, but the freezes still happen.
The second instance I found in some cases on the nodes was not started by us; I just found it in Task Manager under the Processes tab.
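
For anyone who wants to check a node for an unexpected extra 3ds Max instance without opening Task Manager, here is a minimal sketch in Python. It is not part of Corona or Backburner. It only parses the output of the standard Windows `tasklist /FO CSV /NH` command, and it assumes the 3ds Max process is named `3dsmax.exe`. The function names `count_processes` and `running_instances` are made up for this example.

```python
import csv
import io
import subprocess


def count_processes(tasklist_csv: str, image_name: str) -> int:
    """Count rows of `tasklist /FO CSV /NH` output whose image name matches.

    The first CSV column is the image name; matching is case-insensitive.
    """
    rows = csv.reader(io.StringIO(tasklist_csv))
    wanted = image_name.lower()
    return sum(1 for row in rows if row and row[0].lower() == wanted)


def running_instances(image_name: str = "3dsmax.exe") -> int:
    """Query the local process list (Windows only) and count matches.

    "3dsmax.exe" is an assumed process name; adjust it if yours differs.
    """
    out = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return count_processes(out, image_name)
```

Calling `running_instances()` on a node while it is rendering and getting a value above 1 would point to the kind of extra 3ds Max process described above. It would not explain where that process came from.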
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: Dionysios.TS on 2018-11-30, 18:59:28
The problem continues here as before.
We had no time to install the official 3.0 and we are still using RC7, but I guess nothing will change.

The DR process freezes here and there randomly. When I find which DR Server is blocking the process and force-close its DR process, everything goes back to normal again.

Have a nice weekend, everybody.

Dionysios
Title: Re: DR server needs to be restarted issue - V3 daily builds update
Post by: maru on 2018-12-03, 10:16:02
We will do our best to fix this, and will contact you (and others who reported this problem) in case more information is needed.