Show Posts

This section allows you to view all posts made by this member. Note that you can only see posts made in areas you currently have access to.


Messages - cgifarm

Pages: [1] 2 3 4
1
[Max] General Discussion / Re: DR render problem.
« on: 2018-09-03, 11:27:02 »
Hi Alex,
thanks for the reply.
After changing the switch to 10/100/1000 I noticed an improvement.
I will check the other aspects you suggested.

You're welcome!

2
Hey Tom, thanks for the answer! I just wanted to write it down because I've forgotten about this so many times, and I always remember when a client has an issue.

Alex

3
[Max] General Discussion / Re: DR render problem.
« on: 2018-08-29, 23:17:22 »
Hi all,
it seems I have a similar problem with DR.
Workstation: Ryzen Threadripper 1950X on Windows 10, 64-bit.
3 nodes: i7 2700K on Windows 7, 64-bit.

The contribution of the 3 nodes is lower than it should be, and sometimes I also notice long parsing times.

Hi,

Long parsing times can happen when all 3 nodes are copying files from the network at the same time. Please check your network speed; a slow hard drive or a lack of RAM on the network storage can make things worse.

If the master is also acting as the network share while it renders, that can hurt performance a lot. The best setup is a dedicated sharing server for the assets and at least a 1 Gbit network connection, if not 10 Gbit fiber optics, to make sure the network is not the bottleneck when doing DR.

To test network performance, start with 1 extra node and check the parsing time, then add the other 2, and so on.
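If you want to put a number on that network check, a quick raw-read timing from each node against a big asset file on the share tells you a lot. Just a minimal Python sketch; point it at whatever large file sits on your asset server:

```python
import os
import time

def read_throughput_mb_s(path, chunk_size=4 * 1024 * 1024):
    """Read a file once and return the observed throughput in MB/s."""
    size = os.path.getsize(path)
    start = time.perf_counter()
    with open(path, "rb") as f:
        # Read the whole file in chunks so we measure sustained speed,
        # not just the first burst.
        while f.read(chunk_size):
            pass
    elapsed = time.perf_counter() - start
    return size / elapsed / (1024 * 1024)
```

Run it from the master and from each node; if one node reads far slower than the others, that's where your bottleneck is.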

You also need to take into account that a render node will start a 3ds Max instance and load the scene from scratch. On your workstation the scene is already loaded, so you might be deceived into thinking it starts rendering right away while the others are still parsing.

Not sure what your real problem is, but these are just some thoughts based on the little info provided.

Good luck!

Alex

4
[Max] General Discussion / Re: DR render problem.
« on: 2018-08-29, 23:08:17 »
Makes sense, but when you only have two PCs connected in the same room through a small (but fast) switch, I'd expect to see a pretty instant speed improvement when the node kicks in, especially when the node is twice as fast as the host at rendering a single frame. I'm also rendering images which take 5+ hours.

Admittedly, I need to find a day where I can leave a render going on one machine until completion, then render again with DR on and do a full comparison. It's just not always practical in a production environment. It was obvious in V-Ray where you could see the buckets; it's guesswork in Corona (or any progressive renderer) without actually sitting there for a full render to complete as described above.

The best thing is to see how many passes your DR node did and subtract that from the total amount. Then you will know which node rendered how much of the image.

You need to keep an eye on the stats while it renders or setup a screenshot program to take screenshots every minute or so.

It's a pain to test, but yes, V-Ray with buckets shows the machine name while rendering. This is different, because here each machine gets the full image to work on for a full pass.

It's better to set up test scenes which are known to render in, for example, 1 hour on one machine, then test them with DR. In my experience you get about 80% of the capacity of the second machine because of the extra waiting time for network traffic and job management from the master. It takes time to do these tests, but it's good to know your tools and what to expect, so you know how to set deadlines and costs for clients. For me, an extra 30% on top of what I estimate is my golden number, which I add every time; even if I do a great job planning, there's always something which requires more attention if it's a new project.
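To make the pass-count and planning numbers concrete, here's a tiny sketch of the arithmetic I mean (the 80% efficiency and 30% buffer are just my rules of thumb from above, not measured constants):

```python
def node_contribution(node_passes, total_passes):
    """Fraction of the image a DR node rendered, judged by pass counts."""
    return node_passes / total_passes

def estimated_dr_hours(solo_hours, node_speed_factor,
                       dr_efficiency=0.8, planning_buffer=0.3):
    """Rough DR render-time estimate for a master plus one node.

    node_speed_factor: node speed relative to the master (1.0 = same speed).
    dr_efficiency: how much of the node's capacity survives network and
        job-management overhead (~80% in my experience).
    planning_buffer: extra margin added when quoting deadlines (my 30%).
    """
    effective_speed = 1.0 + node_speed_factor * dr_efficiency
    return solo_hours / effective_speed * (1.0 + planning_buffer)
```

So a 5-hour frame with one equal-speed node comes out around 3.6 quoted hours, not 2.5, once the overhead and the safety margin are in.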

I would also recommend making the master machine the strongest one if it's also rendering; otherwise it won't have enough cores to distribute the job to the slaves, and a slave might wait much longer for a job.

In some situations, if you use more DR nodes, you can set up the master machine to not render at all, so it just spawns jobs for the nodes and composes the final image.

Good luck!

5
Hey guys,

I run Corona on our farm, and sometimes a client uses just the noise limit to stop a frame, and we get some frames finishing faster than they should.

I discussed this issue with Thomas M. Grimes a few days ago at D2 in Vienna; he mentioned that he was able to reproduce it and that it's a known bug which has not been addressed so far.

My suggestion: instead of sampling the noise every X passes (5 is the default), sample it every pass, and every X passes take an average of the noise collected from the sampled data. That way, if there's a big difference between sample 4 and 5, it can be filtered out by doing math on the curve, so it should give fewer false stop signals.

I'm not sure what the cost of sampling every pass is, but at least it would not stop frames unexpectedly, and people could rely on the noise level for their animations or still renders.
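As a rough sketch of the averaging idea (just an illustration of the suggestion, not how Corona actually samples):

```python
def should_stop(noise_history, noise_limit, window=5):
    """Stop only when the average of the last `window` per-pass noise
    samples is under the limit, so a single low estimate can't end the
    frame early."""
    if len(noise_history) < window:
        return False
    recent = noise_history[-window:]
    return sum(recent) / window < noise_limit
```

With this, one outlier sample (say 0.01 in a run of 0.1s) no longer triggers the stop; only a sustained drop below the limit does.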

Let me know what you think.

Thanks!

Alex

6
[Max] General Discussion / Re: DR render problem.
« on: 2018-08-29, 20:30:50 »
Hey guys,

I see your render times are very small, so here is my experience with distributed rendering on our render farm:

We use distributed rendering on frames which take longer than 30 minutes to render. I specify "render" because it might take 12-20 minutes just to load the scene into memory, and that doesn't count as rendering.

If you are working with a scene which has lots of assets, all those assets must be loaded into memory and transported over the network (the network can be a bottleneck as well; you mentioned you are using only 100 Mbit switches).

My recommendation is to use distributed rendering for scenes which take longer to render; otherwise, for 6-10 minute frames, I recommend submitting 1 frame per machine.

When using the DBR server, you should not expect rendering to be twice as fast. That's because the master machine sends "pixel" coordinates for each node to render, then receives the data and adds it to the main image. This happens over the network for each job sent to the nodes. While this is very fast on a local PC with the CPU communicating with RAM, over the network it's a different story.

Good luck!

Alex

7
Hi,

Feel free to contact me at any time.

There's a single-frame distributed function implemented in Deadline for V-Ray which needs to be replicated for Corona as well, in order to assign the nodes which should render the job. Most of it is copy/paste, plus adding a class for Corona DBR into the main Deadline plugin for Max. There are two ways to submit jobs through the API: the command line and the Deadline plugin. I remember we chose the Deadline plugin, which handles the pop-ups from 3ds Max for various plugin and DLL errors; it depends on how you decide to work with this. The command line could be more powerful, but there's more you need to take care of, and the Deadline guys have implemented quite a few checks in their plugin which you can take advantage of.

I also had to modify the slave Python script that runs when launching Corona DBR so it launches the proper 3ds Max version. If your nodes have just one 3ds Max version installed, you should be fine without that. The master node just needs to properly load the slave list from the config file that you save.
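For the version-picking part, the idea behind that slave script tweak is basically this (the install paths are hypothetical examples; adjust for your nodes):

```python
import os

# Hypothetical install locations, newest first; adjust for your nodes.
MAX_CANDIDATES = [
    r"C:\Program Files\Autodesk\3ds Max 2018\3dsmax.exe",
    r"C:\Program Files\Autodesk\3ds Max 2016\3dsmax.exe",
    r"C:\Program Files\Autodesk\3ds Max 2014\3dsmax.exe",
]

def pick_max_executable(candidates=MAX_CANDIDATES):
    """Return the first (newest) installed 3ds Max executable, or None."""
    for exe in candidates:
        if os.path.isfile(exe):
            return exe
    return None
```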

Good luck!

Alex

8
Hi,

Did you try doing a regular workstation render on that node with the scene you are trying to render? This might be an obvious question, but does the machine have enough RAM to render the scene?




9
Hi,

A while ago I wrote a power management script which assigns nodes to a specific job using the Deadline web service API.

Here's what it does:

1. Send a signal to power on a node via SSH. This can also be implemented for cloud solutions by creating a VM through the provider's API.

2. Assign the node to a certain job. To have a job render on a certain number of nodes without mixing them up via the LAN search, you need to specify which node should render what. Deadline supports machine limits, letting you make a white list of slaves that can render a job. The list is then given to the first node (the "master node") and appended with MAXScript to the DR list.

3. When the denoising step is detected, it marks the slaves as completed and leaves just the master node rendering.

Note: to achieve this I had to modify some Deadline files as well; the last upgrade was for Deadline 10.

4. The power management script detects if there are any running jobs with queued frames and assigns powered-on nodes which finished other jobs. If no jobs are queued, it shuts the nodes down after a certain number of minutes of inactivity.

To further filter your machines if they have different configurations, you can use Deadline groups assigned to your machines.
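Steps 1-4 above could be sketched roughly like this. Everything here is a stand-in: the SSH targets and the Deadline web-service calls are hypothetical callables, not real API signatures:

```python
import subprocess
import time

IDLE_SHUTDOWN_MINUTES = 10

def power_on(node):
    # Wake a node via SSH to a management host (hypothetical setup);
    # for cloud nodes this would instead create a VM through the
    # provider's API.
    subprocess.run(["ssh", "power-controller", "wake", node], check=True)

def power_off(node):
    subprocess.run(["ssh", node, "shutdown", "-s", "-t", "60"], check=True)

def manage(get_queued_jobs, assign_node, idle_since, nodes):
    """One tick of the power-management loop.

    get_queued_jobs / assign_node stand in for the Deadline web-service
    job query and the machine-limit (white list) update.
    """
    jobs = get_queued_jobs()
    for node in nodes:
        if jobs:
            assign_node(node, jobs[0])  # add node to the job's white list
            idle_since[node] = None
        elif idle_since.get(node) is None:
            idle_since[node] = time.time()  # start the idle timer
        elif time.time() - idle_since[node] > IDLE_SHUTDOWN_MINUTES * 60:
            power_off(node)  # idle too long, shut it down
```

The real script also watches for the denoising step (point 3) and frees the slaves then; that part depends on your log parsing, so it's left out here.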

I can help implement this in your environment as well if you are interested; it would take a couple of days to properly test everything.

If you want to implement this yourself, feel free to ask any questions here on this topic or get in touch on my skype id CGIFarm .

I wish you the best!

Alex

10
Thank you Alex. I will look at it!

You are welcome! Let me know if something isn't clear or if you have any issues.

Alex

11
Hi! I'm thinking about using a render farm, but I don't want to make expensive mistakes (I lost almost €200 on an overnight render job that went wrong on Rebus Farm and ended up with nothing. Rookie mistake, I reckon, but nonetheless...). So, my question is, since I'm running some tests on Ranch Computing and haven't got the expected results yet: can your farm manage LightMix and CXR outputs the same way my frame buffer does?

Hi, thanks for your interest in testing our renderfarm. Here's the answer to your question:

The CGIFarm system will output the files which get saved automatically when rendering finishes. There's no way right now to interact with the VFB from our machines, so the scene needs to be set up with the proper output paths and extensions before uploading it. Here's a list of things you should check when preparing a job for rendering:

https://www.cgifarm.com/blog/post/cgifarm-render-submission-checklist/

Our farm will save the CXR file; the only downside is that these are pretty big (8-10 GB per frame for a 4K image), but if you really need them, you can export them.

Make sure render output is enabled and your render elements have a proper filename and extension set.
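As an illustration of that kind of pre-flight check (the extension list is just an example, not our actual validation code):

```python
import os

# Example set of extensions a renderer can typically save; illustrative only.
VALID_EXTENSIONS = {".exr", ".cxr", ".png", ".tif", ".tga", ".jpg"}

def check_output_path(path):
    """Return a list of problems with a render output path (empty = OK)."""
    problems = []
    root, ext = os.path.splitext(path)
    if not ext:
        problems.append("no file extension set")
    elif ext.lower() not in VALID_EXTENSIONS:
        problems.append("unexpected extension %r" % ext)
    if not os.path.basename(root):
        problems.append("no filename set")
    return problems
```

Running something like this over the main output and every render element path before uploading catches the most common "nothing got saved" mistakes.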

Before committing to long renders, our system allows you to render at a small resolution, or render just a region, by exporting your render settings and posting them when creating a job for a package. You do not need to re-upload the package to test different render settings. Here's a video about that:


Let me know if you have other questions, you can reply to this thread or add me on skype with the id CGIFArm .

Alex

12
Hello,

We released version 1 of our CGIFarm plugin. We added support for anima 3.0.1 asset gathering and marked the version as stable for Max 2014 up to 2018.

If you encounter any issues, let me know.

Alex

13
Hi,

I had a client today whose 3ds Max scene was set to render a region, and the slave nodes were constantly crashing 3ds Max. In the Corona logs I was getting messages about the region being received.

After setting the area to render to "view", single-frame distributed rendering worked fine with the slave nodes.

Alex

14
Hi there!

I have created a video on how to use XRef to render multiple cameras on CGIFarm. You can use the same geometry file XRef-ed into your master scenes, where you can hide objects and layers, choose different render settings, etc. Check this video out:

Here's the video with our last update of the CGIFarm Sync application:

There are still some things to fix/implement, but we are hopefully getting into beta in a couple of weeks.

Getting your opinion about our workflow always helps. Even if you just test the farm with the $20 of free credits we are offering, it helps us find issues so we can advance faster.

Enjoy your week!

15
We updated our 3ds Max plugin and the CGIFarm Sync desktop application. Frames are now automatically downloaded for each job, and you can select which jobs to download from the CGIFarm Sync interface. More to come for this application, such as controlling the download mode ("balanced" vs. "first in, first out"), so you can maximize your workflow by prioritizing which packages arrive for rendering first or which download first.

Watch this video for more details on the update we just made:

If you have any questions, don't hesitate to contact me.

Alex
