
Seems there is no control on how many jobs an host can cache #10

Open
rdemaria opened this issue Jul 24, 2017 · 8 comments

Comments

@rdemaria

Computing preferences in the client allow one to specify "Store at least n days of work" and "Store up to an additional n days of work". However, on the server we have the option

 <max_wus_in_progress> 2 </max_wus_in_progress>

which is multiplied by the number of CPUs.

We still observe several hosts that have many more than 2 tasks per CPU. This can be reproduced by increasing "Store at least n days of work" and "Store up to an additional n days of work" to 1 on a machine with 1 CPU.
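For reference, a sketch of where this limit sits in the project's config.xml (option names are from the BOINC ProjectOptions wiki; the surrounding structure and values here are illustrative, not our actual settings):

```xml
<boinc>
  <config>
    <!-- Cap on in-progress jobs per host; the server multiplies this
         by the host's CPU count -->
    <max_wus_in_progress> 2 </max_wus_in_progress>
    <!-- Related option: cap the CPU count used in that multiplication
         (example value only) -->
    <max_ncpus> 4 </max_ncpus>
  </config>
</boinc>
```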

@rdemaria
Author

See also config_aux.html in https://boinc.berkeley.edu/trac/wiki/ProjectOptions. Maybe this works...

@rdemaria rdemaria changed the title Seems there is no control on how many task an host can cache Seems there is no control on how many jobs an host can cache Jul 25, 2017
@Toby-Broom

My computers will buffer a lot of work even if I have "store 0.5 days" and "0.01 days extra". I think the FLOPS estimate is used in the calculation?

@rdemaria
Author

I am also seeing the same issue with my clients, and I don't know the cause. We were thinking of addressing it by decreasing the timeout, so that pending tasks are removed sooner.
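To illustrate the timeout idea: a workunit's deadline is its delay_bound, which can be set in the workunit input template when jobs are submitted. A minimal sketch (file names and the value are hypothetical; check the exact template syntax against your server version):

```xml
<input_template>
    <file_info>
        <number>0</number>
    </file_info>
    <workunit>
        <file_ref>
            <file_number>0</file_number>
            <open_name>input_file</open_name>
        </file_ref>
        <!-- Shorter deadline, in seconds: tasks sitting in an oversized
             cache time out and get resent sooner. 2 days is an example
             value, not a recommendation. -->
        <delay_bound>172800</delay_bound>
    </workunit>
</input_template>
```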

@Toby-Broom

Maybe you could ask David Anderson? It seems to me the setting you used should have worked. I see now that I have 75 tasks on a 4-core machine; this should have been only 8. For Theory they lowered max_ncpus and it worked: my hosts only got 10 WUs even when I had many unused cores, so I asked for it to be removed. Maybe this also has to be set to enable the limits? Then I would expect the max number of WUs to be 128. I think I have seen more during the big submission a while back, but I don't know if the server settings were enabled then.

@amereghe

I suspect it is related to a wrong fpops_estimate, which is then used together with the peak CPU FLOPS to estimate the time required for each task... Toby, what is the expected time for the tasks that you receive? Could you report a couple of cases together with their names?

@Toby-Broom

Workunit name: Dtwo_70_hlbbo_2222_1.1_0.75__1__s__62.309_60.3119__2_4__6__81_1_sixvf_boinc3628
Estimated app speed: 5.60 GFLOPs/sec
Estimated task size: 180,000 GFLOPs
CPU time: 01:02:00
Estimated time remaining: 07:27:21

Workunit name: w-c0_job_helhc_n10060__59__s__62.28_60.31__2_3__6__49.5_1_sixvf_boinc14665
Estimated app speed: 11.34 GFLOPs/sec
Estimated task size: 180,000 GFLOPs
CPU time: 02:11:59
Elapsed time: 02:12:38
Estimated time remaining: 00:18:32
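The numbers above are consistent with runtime being estimated as task size divided by app speed, which is also where a wrong fpops estimate would inflate the cache. A back-of-the-envelope sketch (my own arithmetic, not the actual client scheduler code; the 10x speed figure is hypothetical):

```python
# Sketch (assumption, NOT the real BOINC scheduler): estimated task
# runtime is task size divided by the host's estimated app speed, and
# the cache preferences are expressed in days of work.

def estimated_runtime_s(task_gflops: float, app_speed_gflops: float) -> float:
    """Estimated wall-clock seconds for one task."""
    return task_gflops / app_speed_gflops

def tasks_to_cache(min_buf_days: float, extra_buf_days: float,
                   runtime_s: float, ncpus: int = 1) -> float:
    """Tasks needed to keep ncpus busy for the requested buffer."""
    buffer_s = (min_buf_days + extra_buf_days) * 86400
    return ncpus * buffer_s / runtime_s

# Numbers reported above: 180,000 GFLOPs at 5.60 GFLOPs/sec
rt = estimated_runtime_s(180_000, 5.60)
print(rt / 3600)  # about 8.9 hours, close to CPU time + time remaining

# If app speed were overestimated 10x (56 GFLOPs/sec, hypothetical),
# tasks look 10x shorter and the client fetches far more for the same buffer:
print(tasks_to_cache(0.5, 0.01, estimated_runtime_s(180_000, 56.0)))
```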

@Toby-Broom

These aren't from the 75-task PC, as those tasks were all crunched this morning.
