Author Topic: 1.6 DR  (Read 34190 times)

2017-04-02, 10:25:51
Reply #105

tolgahan

  • Active Users
  • **
  • Posts: 229
    • View Profile
    • Architectural visualization & Graphic design
Same problem again...

When I send a job to Backburner, it doesn't start rendering.

On screen it looks as if it started, but Max doesn't run. When I restart the job in Backburner, it starts rendering.


And the second biggest problem:

When it gets to the part where the rendering is saved, the frame buffer window crashes every time. Backburner is very important to us.

When I go back to the December 13 version I get no problems.

And what are the grey-on-grey entries in the DR tab?

"Search LAN" still doesn't find all slaves.

Nothing is fixed, although these issues were said to have been fixed :(

We can't use this version, even though it is full of beauties :(
Imagination is more important than knowlege

2017-04-02, 11:46:43
Reply #106

ihabkal

  • Active Users
  • **
  • Posts: 253
    • View Profile

When it gets to the part where the rendering is saved, the frame buffer window crashes every time.

I mentioned this a few times about a month ago :)

2017-04-02, 12:01:01
Reply #107

Dionysios.TS

  • Active Users
  • **
  • Posts: 766
    • View Profile
    • Evolvia Imaging

When it gets to the part where the rendering is saved, the frame buffer window crashes every time.

I mentioned this a few times about a month ago :)

All these issues with DR are so strange!

I have now rendered everything with the latest build: 6 images at 7.100px resolution, all without issues.
The only thing I noticed is that when the process reaches 100%, the master PC takes too long to save the final images.

Otherwise it seems OK.

2017-04-02, 13:52:57
Reply #108

tolgahan

  • Active Users
  • **
  • Posts: 229
    • View Profile
    • Architectural visualization & Graphic design

It's been an hour and it still hasn't saved.
Maybe it can't save over the network.

I am trying with 20 slave machines.
There is no time to try different combinations and find the problem.
Short tests sometimes give good results, but 3-4 hour renders crash like this.
I don't know what to do. It takes time to go back and set up the older daily build.

Imagination is more important than knowlege

2017-04-02, 14:07:22
Reply #109

Dionysios.TS

  • Active Users
  • **
  • Posts: 766
    • View Profile
    • Evolvia Imaging

To tell the truth, the renders that were successful for me were under 3 hours of calculation.

2017-04-02, 14:41:27
Reply #110

tolgahan

  • Active Users
  • **
  • Posts: 229
    • View Profile
    • Architectural visualization & Graphic design

3 hours is enough to get good results.

I am doing another test; I will send a 6.5k screenshot.

I think I have found where the problem comes from.
The PUBLIC sharing folder was off on one of the slave machines. I will test again after I turn it on. I think it crashes when it can't connect to the users.
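
A disabled share like the one described above could be caught before a long render with a small pre-flight check. The following is only a sketch of that idea, not anything Corona or Backburner provides: the function names `can_write_to` and `unreachable_folders` are made up for illustration, and it assumes that being able to create a file in a folder is a good enough stand-in for "the share is reachable".

```python
import os
import tempfile


def can_write_to(folder: str) -> bool:
    """Return True if a small temporary file can be created in `folder`."""
    try:
        # Actually creating a file is more telling than os.access() on
        # network shares; the file is deleted again when the block exits.
        with tempfile.NamedTemporaryFile(dir=folder, prefix="dr_write_test_"):
            pass
        return True
    except OSError:
        # Covers a missing folder, denied access and unreachable shares.
        return False


def unreachable_folders(folders):
    """Return the folders from `folders` that failed the write test."""
    return [folder for folder in folders if not can_write_to(folder)]
```

Called with a list of share paths (for example hypothetical ones such as `\\node-01\Public`), `unreachable_folders` returns the ones worth checking by hand before sending the job.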

Imagination is more important than knowlege

2017-04-03, 05:45:14
Reply #111

ihabkal

  • Active Users
  • **
  • Posts: 253
    • View Profile


3 hours is enough to get good results.

I am doing another test; I will send a 6.5k screenshot.

I think I have found where the problem comes from.
The PUBLIC sharing folder was off on one of the slave machines. I will test again after I turn it on. I think it crashes when it can't connect to the users.


Cool project!

2017-04-03, 12:57:11
Reply #112

Frood

  • Active Users
  • **
  • Posts: 1926
    • View Profile
    • Rakete GmbH
After a while with no chance to test daily builds (DBs), I just tried 17-03-31 with 2 random nodes, Max 2016 SP4 | Windows 10. I have to say it's not yet usable for us at the moment :[ Here are the issues:

1. DrServer.exe still consumes CPU when idle. Once started, it puts a permanent load of about 3% on a dual Xeon and 12-13% on an i7; one logical processor is always busy.

2. While the DrServer window correctly reports the memory status and the Max process state when rendering, the master cVFB doesn't receive any status messages and instead always reports memory 0/0 and "not running". However, the nodes do render and contribute, and the "Updates" and "Passes" values are correct.

Edit: This is/was caused by the automatic conversion of the existing node list in a scene, which here consists of node names, not IP addresses. Corona 1.6 puts the node entries from older scenes into the "IP" column regardless of whether an entry is a name or an IP address. DR still works this way, but apparently without receiving any status messages. I was not able to correct the node list manually (why the heck am I not allowed to edit it?), but "search lan" created a new one. However, I was again not able to sort the resulting list, so it's messy now and there are slaves missing. I did not try the "from file" option, my last hope of getting a sane list of nodes again.

Edit2: Additionally, I cannot even see the slave number in the DR tab, because "search lan" seems to use the fully qualified domain name of the node. What's so important about IP addresses? :) (see image)

3. I also get a MAXScript popup from post.ms when closing the DrServer window, with line 19 highlighted.

4. I got a "Cannot bind to discovery port (UDP 19668)" in the DrServer log on my box (strangely, not the error box at startup), and thus it does not spawn any Max instance and refuses to work. The port seems to be bound by the (new) licensing server; at least "netstat -aon" reveals the PID of licensingserver.exe here. So it's not possible for me to run drserver.exe and licensingserver.exe simultaneously. When I shut down the license server I can use DrServer.

What I've tried so far:

- Applied all available updates (one slave + master only)
- Switched Firewalls off
- Rebooted nodes
- Different user accounts for drserver
- Checked logs - nothing suspicious
- Changed master <-> slave role


Good Luck



« Last Edit: 2017-04-03, 17:01:40 by Frood »
Never underestimate the power of a well placed level one spell.

2017-04-03, 16:49:12
Reply #113

sevecek

  • Former Corona Team Member
  • Active Users
  • **
  • Posts: 197
    • View Profile
And the second biggest problem:

When it gets to the part where the rendering is saved, the frame buffer window crashes every time. Backburner is very important to us.


Could you send me the minidump from the crash?

2017-04-03, 16:55:14
Reply #114

sevecek

  • Former Corona Team Member
  • Active Users
  • **
  • Posts: 197
    • View Profile

Edit: This is/was caused by the automatic conversion of the existing node list in a scene, which here consists of node names, not IP addresses. Corona 1.6 puts the node entries from older scenes into the "IP" column regardless of whether an entry is a name or an IP address. DR still works this way, but apparently without receiving any status messages. I was not able to correct the node list manually (why the heck am I not allowed to edit it?), but "search lan" created a new one. However, I was again not able to sort the resulting list, so it's messy now and there are slaves missing. I did not try the "from file" option, my last hope of getting a sane list of nodes again.


You can edit nodes just like you edit file names, by two clicks (not double-click).

2017-04-03, 17:14:27
Reply #115

Frood

  • Active Users
  • **
  • Posts: 1926
    • View Profile
    • Rakete GmbH
You can edit nodes just like you edit file names, by two clicks (not double-click).

I can edit the IP column, yes, but not the name. I presume you just resolve the IP and fill in the name you get. But I'd like to use the hostname in the IP column to have a sorted list (by node name, not by IPs, which are dynamic to some extent here anyway because they are DHCP leases). It appears a little too complicated to me now. The old system (using names when they can be resolved, or IPs if not) combined with the new checkboxes would have been perfect (?). But I'm sure you have your reasons.

The other way round: why does the slave status not show up inside the cVFB when I use a node list like

"node-01
node-02
node-03"

as IP addresses? As mentioned, DR works, but then I get no status from the slaves in the DR tab.


Good Luck

Never underestimate the power of a well placed level one spell.

2017-04-03, 17:28:30
Reply #116

sevecek

  • Former Corona Team Member
  • Active Users
  • **
  • Posts: 197
    • View Profile

I can edit the IP column, yes, but not the name. I presume you just resolve the IP and fill in the name you get. But I'd like to use the hostname in the IP column to have a sorted list (by node name, not by IPs, which are dynamic to some extent here anyway because they are DHCP leases). It appears a little too complicated to me now. The old system (using names when they can be resolved, or IPs if not) combined with the new checkboxes would have been perfect (?). But I'm sure you have your reasons.

The other way round: why does the slave status not show up inside the cVFB when I use a node list like

"node-01
node-02
node-03"

as IP addresses? As mentioned, DR works, but then I get no status from the slaves in the DR tab.


Good Luck

Yep, I'll fix the status reporting when hostnames are used instead of IP addresses. The second column in the table holds the resolved names; IMHO it doesn't make sense to edit them.
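
To illustrate the hostname-versus-IP distinction discussed here, the following sketch tells the two kinds of node list entries apart and sorts each group. It is not how Corona parses its node list; the helper names (`is_ip_address`, `natural_key`, `split_and_sort`) and the sample entries in the usage note are made up for the example.

```python
import ipaddress
import re


def is_ip_address(entry: str) -> bool:
    """Return True if `entry` parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(entry.strip())
        return True
    except ValueError:
        return False


def natural_key(name: str):
    """Sort key that compares digit runs as numbers ("node-2" < "node-10")."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


def ip_key(entry: str):
    """Sort key for addresses: by IP version, then by numeric value."""
    address = ipaddress.ip_address(entry)
    return (address.version, int(address))


def split_and_sort(entries):
    """Split node list entries into sorted hostnames and sorted addresses."""
    cleaned = [entry.strip() for entry in entries if entry.strip()]
    hostnames = sorted((e for e in cleaned if not is_ip_address(e)), key=natural_key)
    addresses = sorted((e for e in cleaned if is_ip_address(e)), key=ip_key)
    return hostnames, addresses
```

For a mixed list such as `["node-10", "192.168.1.20", "node-2", "192.168.1.3"]`, `split_and_sort` gives the hostnames in node order and the addresses in numeric order, which is roughly the sorted view being asked for in this thread.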

2017-04-03, 18:20:14
Reply #117

Frood

  • Active Users
  • **
  • Posts: 1926
    • View Profile
    • Rakete GmbH
Yep, I'll fix the status reporting when hostnames are used instead of IP addresses. The second column in the table holds the resolved names; IMHO it doesn't make sense to edit them.

Cool if you are able to fix the status system when only hostnames are used; I feared the resolved name was somehow required for it to run. Thank you! This way we will be able to have a sorted node list again and to read the names in the DR tab. Others who just press "Search Lan" may not, though.

IMHO the IP address could be completely removed from the DR tab so that only the resolved hostname is visible (and in this case it can even be the FQHN - enough space there).


Good Luck


Never underestimate the power of a well placed level one spell.

2017-04-03, 18:44:15
Reply #118

tolgahan

  • Active Users
  • **
  • Posts: 229
    • View Profile
    • Architectural visualization & Graphic design
And the second biggest problem:

When it gets to the part where the rendering is saved, the frame buffer window crashes every time. Backburner is very important to us.


Could you send me the minidump from the crash?


As I said before, the only thing I can think of is that Backburner was crashing when the render finished and it collected the data from the slave machines.

So I connected from the Backburner machine to all the slave machines. Only one of them asked for a username and password.

It seems we can now render fine on Backburner. Since I never had such problems with the December 13 daily build, I had assumed Corona was causing them. When I turned sharing on, the problem disappeared.

Imagination is more important than knowlege

2017-04-03, 22:50:19
Reply #119

Frood

  • Active Users
  • **
  • Posts: 1926
    • View Profile
    • Rakete GmbH
Yep, I'll fix the status reporting when using hostname instead of IP addresses.

Damn - you are fast! All nodes are sitting idle now, with no CPU load at all. Node names are clearly readable when using DR, and after cancelling a job and re-rendering the nodes get back to work faster than ever before.

I am awestruck, congrats, especially for fixing the idle load issue, which prevented us from using DBs in production. It almost feels like an RC1 now.

What's left to suggest is that the columns could be swapped in the "Distributed Rendering" flyout so that "Resolved Name" comes first - IPs are for nerds 8]


Good Luck


Never underestimate the power of a well placed level one spell.