Mapping of BOINC Device to Card Number #8

Open
Ricks-Lab opened this issue Jan 25, 2020 · 16 comments
Labels
enhancement New feature or request

Comments

@Ricks-Lab
Owner

Ricks-Lab commented Jan 25, 2020

Need to find a robust way to automate the mapping of BOINC Device numbers to physical card by Linux card number or pcie ID.

@Ricks-Lab Ricks-Lab added the enhancement New feature or request label Jan 25, 2020
@Ricks-Lab
Owner Author

I am working to get device_num and opencl_device_index from BOINC's coproc_info.xml. Since I don't have any Nvidia cards, can an Nvidia user post the contents of their file here?

@KeithMyers

I figured I'd better not use my coproc_info.xml from my spoofed daily driver. That would certainly confuse you. So here is the file from my dedicated Einstein host.

    <coprocs>
    <have_cuda>1</have_cuda>
    <cuda_version>10020</cuda_version>
<coproc_cuda>
   <count>1</count>
   <name>GeForce GTX 1070 Ti</name>
   <available_ram>4160749568.000000</available_ram>
   <have_cuda>1</have_cuda>
   <have_opencl>0</have_opencl>
   <peak_flops>8186112000000.000000</peak_flops>
   <cudaVersion>10020</cudaVersion>
   <drvVersion>44048</drvVersion>
   <totalGlobalMem>4294967295.000000</totalGlobalMem>
   <sharedMemPerBlock>49152.000000</sharedMemPerBlock>
   <regsPerBlock>65536</regsPerBlock>
   <warpSize>32</warpSize>
   <memPitch>2147483647.000000</memPitch>
   <maxThreadsPerBlock>1024</maxThreadsPerBlock>
   <maxThreadsDim>1024 1024 64</maxThreadsDim>
   <maxGridSize>2147483647 65535 65535</maxGridSize>
   <clockRate>1683000</clockRate>
   <totalConstMem>65536.000000</totalConstMem>
   <major>6</major>
   <minor>1</minor>
   <textureAlignment>512.000000</textureAlignment>
   <deviceOverlap>1</deviceOverlap>
   <multiProcessorCount>19</multiProcessorCount>
<pci_info>
   <bus_id>7</bus_id>
   <device_id>0</device_id>
   <domain_id>0</domain_id>
</pci_info>
</coproc_cuda>
<coproc_cuda>
   <count>1</count>
   <name>GeForce GTX 1070 Ti</name>
   <available_ram>4160749568.000000</available_ram>
   <have_cuda>1</have_cuda>
   <have_opencl>0</have_opencl>
   <peak_flops>8186112000000.000000</peak_flops>
   <cudaVersion>10020</cudaVersion>
   <drvVersion>44048</drvVersion>
   <totalGlobalMem>4294967295.000000</totalGlobalMem>
   <sharedMemPerBlock>49152.000000</sharedMemPerBlock>
   <regsPerBlock>65536</regsPerBlock>
   <warpSize>32</warpSize>
   <memPitch>2147483647.000000</memPitch>
   <maxThreadsPerBlock>1024</maxThreadsPerBlock>
   <maxThreadsDim>1024 1024 64</maxThreadsDim>
   <maxGridSize>2147483647 65535 65535</maxGridSize>
   <clockRate>1683000</clockRate>
   <totalConstMem>65536.000000</totalConstMem>
   <major>6</major>
   <minor>1</minor>
   <textureAlignment>512.000000</textureAlignment>
   <deviceOverlap>1</deviceOverlap>
   <multiProcessorCount>19</multiProcessorCount>
<pci_info>
   <bus_id>9</bus_id>
   <device_id>0</device_id>
   <domain_id>0</domain_id>
</pci_info>
</coproc_cuda>
<coproc_cuda>
   <count>1</count>
   <name>GeForce GTX 1070 Ti</name>
   <available_ram>4160749568.000000</available_ram>
   <have_cuda>1</have_cuda>
   <have_opencl>0</have_opencl>
   <peak_flops>8186112000000.000000</peak_flops>
   <cudaVersion>10020</cudaVersion>
   <drvVersion>44048</drvVersion>
   <totalGlobalMem>4294967295.000000</totalGlobalMem>
   <sharedMemPerBlock>49152.000000</sharedMemPerBlock>
   <regsPerBlock>65536</regsPerBlock>
   <warpSize>32</warpSize>
   <memPitch>2147483647.000000</memPitch>
   <maxThreadsPerBlock>1024</maxThreadsPerBlock>
   <maxThreadsDim>1024 1024 64</maxThreadsDim>
   <maxGridSize>2147483647 65535 65535</maxGridSize>
   <clockRate>1683000</clockRate>
   <totalConstMem>65536.000000</totalConstMem>
   <major>6</major>
   <minor>1</minor>
   <textureAlignment>512.000000</textureAlignment>
   <deviceOverlap>1</deviceOverlap>
   <multiProcessorCount>19</multiProcessorCount>
<pci_info>
   <bus_id>10</bus_id>
   <device_id>0</device_id>
   <domain_id>0</domain_id>
</pci_info>
</coproc_cuda>
   <nvidia_opencl>
      <name>GeForce GTX 1070 Ti</name>
      <vendor>NVIDIA Corporation</vendor>
      <vendor_id>4318</vendor_id>
      <available>1</available>
      <half_fp_config>0</half_fp_config>
      <single_fp_config>191</single_fp_config>
      <double_fp_config>63</double_fp_config>
      <endian_little>1</endian_little>
      <execution_capabilities>1</execution_capabilities>
      <extensions>cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer cl_khr_int64_base_atomics cl_khr_int64_extended_atomics</extensions>
      <global_mem_size>8513978368</global_mem_size>
      <local_mem_size>49152</local_mem_size>
      <max_clock_frequency>1683</max_clock_frequency>
      <max_compute_units>19</max_compute_units>
      <nv_compute_capability_major>6</nv_compute_capability_major>
      <nv_compute_capability_minor>1</nv_compute_capability_minor>
      <amd_simd_per_compute_unit>0</amd_simd_per_compute_unit>
      <amd_simd_width>0</amd_simd_width>
      <amd_simd_instruction_width>0</amd_simd_instruction_width>
      <opencl_platform_version>OpenCL 1.2 CUDA 10.2.115</opencl_platform_version>
      <opencl_device_version>OpenCL 1.2 CUDA</opencl_device_version>
      <opencl_driver_version>440.48.02</opencl_driver_version>
      <device_num>0</device_num>
      <peak_flops>8186112000000.000000</peak_flops>
      <opencl_available_ram>4160749568.000000</opencl_available_ram>
      <opencl_device_index>0</opencl_device_index>
      <warn_bad_cuda>0</warn_bad_cuda>
   </nvidia_opencl>
   <nvidia_opencl>
      <name>GeForce GTX 1070 Ti</name>
      <vendor>NVIDIA Corporation</vendor>
      <vendor_id>4318</vendor_id>
      <available>1</available>
      <half_fp_config>0</half_fp_config>
      <single_fp_config>191</single_fp_config>
      <double_fp_config>63</double_fp_config>
      <endian_little>1</endian_little>
      <execution_capabilities>1</execution_capabilities>
      <extensions>cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer cl_khr_int64_base_atomics cl_khr_int64_extended_atomics</extensions>
      <global_mem_size>8510701568</global_mem_size>
      <local_mem_size>49152</local_mem_size>
      <max_clock_frequency>1683</max_clock_frequency>
      <max_compute_units>19</max_compute_units>
      <nv_compute_capability_major>6</nv_compute_capability_major>
      <nv_compute_capability_minor>1</nv_compute_capability_minor>
      <amd_simd_per_compute_unit>0</amd_simd_per_compute_unit>
      <amd_simd_width>0</amd_simd_width>
      <amd_simd_instruction_width>0</amd_simd_instruction_width>
      <opencl_platform_version>OpenCL 1.2 CUDA 10.2.115</opencl_platform_version>
      <opencl_device_version>OpenCL 1.2 CUDA</opencl_device_version>
      <opencl_driver_version>440.48.02</opencl_driver_version>
      <device_num>1</device_num>
      <peak_flops>8186112000000.000000</peak_flops>
      <opencl_available_ram>4160749568.000000</opencl_available_ram>
      <opencl_device_index>1</opencl_device_index>
      <warn_bad_cuda>0</warn_bad_cuda>
   </nvidia_opencl>
   <nvidia_opencl>
      <name>GeForce GTX 1070 Ti</name>
      <vendor>NVIDIA Corporation</vendor>
      <vendor_id>4318</vendor_id>
      <available>1</available>
      <half_fp_config>0</half_fp_config>
      <single_fp_config>191</single_fp_config>
      <double_fp_config>63</double_fp_config>
      <endian_little>1</endian_little>
      <execution_capabilities>1</execution_capabilities>
      <extensions>cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer cl_khr_int64_base_atomics cl_khr_int64_extended_atomics</extensions>
      <global_mem_size>8513978368</global_mem_size>
      <local_mem_size>49152</local_mem_size>
      <max_clock_frequency>1683</max_clock_frequency>
      <max_compute_units>19</max_compute_units>
      <nv_compute_capability_major>6</nv_compute_capability_major>
      <nv_compute_capability_minor>1</nv_compute_capability_minor>
      <amd_simd_per_compute_unit>0</amd_simd_per_compute_unit>
      <amd_simd_width>0</amd_simd_width>
      <amd_simd_instruction_width>0</amd_simd_instruction_width>
      <opencl_platform_version>OpenCL 1.2 CUDA 10.2.115</opencl_platform_version>
      <opencl_device_version>OpenCL 1.2 CUDA</opencl_device_version>
      <opencl_driver_version>440.48.02</opencl_driver_version>
      <device_num>2</device_num>
      <peak_flops>8186112000000.000000</peak_flops>
      <opencl_available_ram>4160749568.000000</opencl_available_ram>
      <opencl_device_index>2</opencl_device_index>
      <warn_bad_cuda>0</warn_bad_cuda>
   </nvidia_opencl>
<warning>NVIDIA library reports 3 GPUs</warning>
<warning>ATI: libaticalrt.so: cannot open shared object file: No such file or directory</warning>
    </coprocs>

@Ricks-Lab
Owner Author

This is not what I expected, and I am confused. For AMD, there is only an <ati_opencl> entry for each card, which contains a <device_num> and an <opencl_device_index> but definitely no pcie id info.

Your file shows <nvidia_opencl> entries that are similar, but it also has <coproc_cuda> entries that have no <device_num> but do contain pcie id information. I am not sure what to do with the <coproc_cuda> entries, as there is no <device_num> to link them to anything else. Do you think these could be ignored?

@JStateson

JStateson commented Jan 25, 2020

====nvidia linux
https://stateson.net/images/h110btc_coproc.xml
https://stateson.net/images/tb85_coproc.xml
=====amd windows
https://stateson.net/images/s9x00_coproc.xml

The following are defective coproc_info.xml files
that were discussed on the BOINC forums around the dates
shown. They reported phantom GPUs, exactly 2x as
many as actually existed, and were fixed by
?reinstall of driver after cleaning with ddu
?revert driver to older
?moved boards to different slot

12/30/2019
https://stateson.net/images/bad_coproc_info_NV_441_66.xml

4/7/2019
https://stateson.net/images/coproc_info_10_nfg.xml

pretty sure the following was also defective but don't remember
but I could put a pair in crossfire (or SLI) to see the difference
2/19/2019 difference in clinfo when crossfire is enabled and
includes the coproc.xml file in the zip
https://stateson.net/images/gpuinfo.zip

@KeithMyers

That is why I gave the coproc_info.xml from a stock system. NO perturbations from any spoofing going on. Why can't you use both sections of the file? The CUDA section for each device shows the PCIE busID number and the OpenCL section shows the BOINC number.

AFAIK no Nvidia card system ever had the doubled card issue that the AMD cards had. That was fixed in the client in the last six months or so. It had to do with the AMD API that BOINC reads. They had to fix the FLOPS readback too, because the cards were shown as 1000X more powerful than they actually are, which caused task estimation times to be completely unrealistic.

I am running from the 7.16.3 client branch with most of the latest bug fixes.
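
The idea of using both sections of the file could be sketched roughly as follows. This is only an illustration, not benchMT code: the function names are hypothetical, the sample XML is a trimmed-down stand-in for the file posted above (only the fields used here are kept), and it rests on two unconfirmed assumptions: that the Nth <coproc_cuda> entry and the Nth <nvidia_opencl> entry describe the same card, and that <bus_id> and <device_id> are written as decimal numbers.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for the coproc_info.xml posted above (only the fields used here).
SAMPLE = """<coprocs>
<coproc_cuda><name>GeForce GTX 1070 Ti</name>
<pci_info><bus_id>7</bus_id><device_id>0</device_id><domain_id>0</domain_id></pci_info>
</coproc_cuda>
<coproc_cuda><name>GeForce GTX 1070 Ti</name>
<pci_info><bus_id>9</bus_id><device_id>0</device_id><domain_id>0</domain_id></pci_info>
</coproc_cuda>
<coproc_cuda><name>GeForce GTX 1070 Ti</name>
<pci_info><bus_id>10</bus_id><device_id>0</device_id><domain_id>0</domain_id></pci_info>
</coproc_cuda>
<nvidia_opencl><device_num>0</device_num><opencl_device_index>0</opencl_device_index></nvidia_opencl>
<nvidia_opencl><device_num>1</device_num><opencl_device_index>1</opencl_device_index></nvidia_opencl>
<nvidia_opencl><device_num>2</device_num><opencl_device_index>2</opencl_device_index></nvidia_opencl>
</coprocs>"""


def pcie_id(pci_info):
    """Format a <pci_info> element as the bus:device.function string lspci prints."""
    # ASSUMPTION: bus_id and device_id are decimal in the file; lspci shows hex.
    bus = int(pci_info.findtext("bus_id"))
    dev = int(pci_info.findtext("device_id"))
    # The file carries no function number; function 0 is assumed for a GPU.
    return f"{bus:02x}:{dev:02x}.0"


def map_nvidia_devices(xml_text):
    """Return {device_num: pcie_id} by pairing the CUDA and OpenCL sections."""
    root = ET.fromstring(xml_text)
    cuda_entries = root.findall("coproc_cuda")
    ocl_entries = root.findall("nvidia_opencl")
    mapping = {}
    # ASSUMPTION: both sections list the cards in the same order.
    for cuda, ocl in zip(cuda_entries, ocl_entries):
        mapping[int(ocl.findtext("device_num"))] = pcie_id(cuda.find("pci_info"))
    return mapping


print(map_nvidia_devices(SAMPLE))
```

If the ordering assumption does not hold on some hosts, the pairing would need another key, and nothing in the files posted so far shows one.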

@JStateson

Got another system: this has a pair of different AMD and a single nvidia 3d processor, all on linux and working.
https://stateson.net/images/z400_coproc.xml

@Ricks-Lab
Owner Author

> Got another system: this has a pair of different AMD and a single nvidia 3d processor, all on linux and working.
> https://stateson.net/images/z400_coproc.xml

From your file, it seems that NV cards always have two kinds of entries (coproc_cuda and nvidia_opencl), while AMD cards only have one (ati_opencl). Summarizing device_num and opencl_device_index:

  • AMD RX570, device_num = 0, opencl_device_index = 0
  • AMD RX560, device_num = 1, opencl_device_index = 1
  • NV P106, device_num = 0, opencl_device_index = 0

Since the device_num fields are not unique, it would seem that what is called device_num in this file is not necessarily the same as the final boinc device. Do you have actual device numbers for this system? Maybe it is sequential with one vendor followed by another...
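
One way to keep the entries apart while this is open is to key each device on the vendor as well as device_num. The sketch below is only an illustration under that assumption: the sample XML is a stand-in built from the three values summarized above (not the actual z400 file), and the function name and vendor labels are hypothetical.

```python
import xml.etree.ElementTree as ET

# Stand-in built from the three devices summarized above, not the real file.
SAMPLE = """<coprocs>
<ati_opencl><name>RX570</name><device_num>0</device_num><opencl_device_index>0</opencl_device_index></ati_opencl>
<ati_opencl><name>RX560</name><device_num>1</device_num><opencl_device_index>1</opencl_device_index></ati_opencl>
<nvidia_opencl><name>P106</name><device_num>0</device_num><opencl_device_index>0</opencl_device_index></nvidia_opencl>
</coprocs>"""

# Section tag in coproc_info.xml -> vendor label used in the key (labels are illustrative).
VENDOR_TAGS = {"ati_opencl": "AMD", "nvidia_opencl": "NVIDIA"}


def devices_by_vendor(xml_text):
    """Return {(vendor, device_num): info}; device_num alone is not unique."""
    root = ET.fromstring(xml_text)
    devices = {}
    for tag, vendor in VENDOR_TAGS.items():
        for entry in root.findall(tag):
            key = (vendor, int(entry.findtext("device_num")))
            devices[key] = {
                "name": entry.findtext("name"),
                "opencl_device_index": int(entry.findtext("opencl_device_index")),
            }
    return devices


for key, info in sorted(devices_by_vendor(SAMPLE).items()):
    print(key, info)
```

This does not answer how BOINC numbers the devices in the end; it only avoids collisions between the AMD and NV entries when reading the file.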

@Ricks-Lab
Owner Author

@KeithMyers Your posting with the latest version in the Energy Metrics verification thread indicated that the coproc_info.xml file was not found:
coproc_info.xml file not found: [None]
Where is this file located on your system? Currently, benchMT is looking in boinc_home.

@KeithMyers

boinc_home is set in the BenchCFG file.
#Specify path for BOINC
mode boinc_home /home/keith/Desktop/BOINC/

@Ricks-Lab
Owner Author

> boinc_home is set in the BenchCFG file.
> #Specify path for BOINC
> mode boinc_home /home/keith/Desktop/BOINC/

The latest version on master prints the full path name when it reports this error, so the next time you run it, it should be clearer where it is looking.
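
The lookup described here amounts to something like the sketch below. It is not the actual benchMT source: find_coproc_file is a hypothetical helper, and the only things taken from this thread are the coproc_info.xml file name and the boinc_home path.

```python
import os


def find_coproc_file(boinc_home, file_name="coproc_info.xml"):
    """Return the full path to the coproc file, or None if it is missing."""
    path = os.path.join(boinc_home, file_name)
    if os.path.isfile(path):
        return path
    # Report the full path so it is obvious which directory was searched.
    print(f"coproc_file [{path}] does not exist")
    return None


# Example with the boinc_home value quoted above.
find_coproc_file("/home/keith/Desktop/BOINC/")
```

Printing the joined path rather than just the file name shows right away whether the configured boinc_home or a default is being searched.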

@KeithMyers

So why does it always complain about the boinc_home location?

Invalid mode specified in CFG file: [boinc_home][/home/keith/Desktop/BOINC/]

keith@Serenity:~/Downloads/benchMT-master$ ./benchMT --gpu_devices 0 --energy --debug --devmap 0:0,1:1,2:2
Using python: 3.6.9
Using Linux Kernel: 5.3.0-26-generic
mb_const.boinc_home: [/home/boinc/BOINC/]
mb_const.cpu_app_subdir: [APPS_CPU/]
mb_const.gpu_app_subdir: [APPS_GPU/]
mb_const.ref_app_subdir: [APPS_REF/]
mb_const.ref_results_subdir: [REF_RESULTS/]
mb_const.wu_subdir: [WU_test/]
mb_const.std_signal_subdir: [WU_std_signal/]
mb_const.testdata_subdir: [testData/]
mb_const.workdir_subdir: [workdir/]
mb_const.slots_subdir: [Slots/]
mb_const.command_line_filename: [BenchCFG]
mb_const.boinccmd: [boinccmd]
mb_const.template_file: [init_data.xml.template]
mb_const.coproc_file_name: [coproc_info.xml]
mb_const.wu_cmp: [rescmpv5_l]
mb_const.suspend_args: [['boinccmd --set_gpu_mode never 172800', 'boinccmd --set_run_mode never 172800']]
mb_const.resume_args: [['boinccmd --set_gpu_mode never 1', 'boinccmd --set_run_mode never 1']]
mb_const.activeWU: [work_unit.sah]
mb_const.activeAPWU: [in.dat]
mb_const.DEBUG: [True]
mb_const.noBS: [False]
mb_const.env: [<main.BENCH_ENV object at 0x7feb06de3978>]
mb_const.card_root: [/sys/class/drm/]
mb_const.hwmon_sub: [hwmon/hwmon]
mb_const.cmd_lspci: [/usr/bin/lspci]
mb_const.cmd_lshw: [/usr/bin/lshw]
mb_const.cmd_lscpu: [/usr/bin/lscpu]
mb_const.cmd_clinfo: [/usr/bin/clinfo]
mb_const.cmd_time: [/usr/bin/time]
mb_const.cmd_lsb_release: [/usr/bin/lsb_release]
mb_const.cmd_nvidia_smi: [/usr/bin/nvidia-smi]
benchMT workdir Path [ /home/keith/Downloads/benchMT-master/workdir/ ] does not exist, making...
TestData Path [ /home/keith/Downloads/benchMT-master/testData/ ] does not exist, making...
coproc_file [/home/boinc/BOINC/coproc_info.xml] does not exist
Invalid mode specified in CFG file: [boinc_home][/home/keith/Desktop/BOINC/]
Read valid mode from CFG file: [num_repetitions][1]
Read valid mode from CFG file: [max_threads][4]
Read valid mode from CFG file: [max_gpus][3]
Invalid mode specified in CFG file: [gpu_devices][0,1,2]
Invalid mode specified in CFG file: [devmap][0:0,2:2,2:2]

Initial app list
┌────┬────┬───┬────────────────────────────────────────────────────────────┬────────┬────────┬───────────┬────────┐
│Job#│Slot│xPU│app_name │ start │ finish │tot_time │ state │
│ │ │ │app_args │wu_name │
├────┼────┼───┼────────────────────────────────────────────────────────────┼────────┬────────┬───────────┬────────┤
│0 │ NA │GPU│setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda102 │ NA │ NA │ NA │PENDING │
│ │ │ │ │not assigned │
└────┴────┴───┴────────────────────────────────────────────────────────────┴──────────────────────────────────────┘
CFG_mode: yes None
CFG_mode: run_name None
CFG_mode: boinc_home /home/keith/Desktop/BOINC/
CFG_mode: noBS None
CFG_mode: display_compact None
CFG_mode: display_slots None
CFG_mode: num_repetitions 1
CFG_mode: max_threads 4
CFG_mode: max_gpus 3
CFG_mode: gpu_devices 0,1,2
CFG_mode: devmap 0:0,2:2,2:2
CFG_mode: std_signals None
CFG_mode: no_ref None
CFG_mode: force_ref None
CFG_mode: energy None
CFG_mode: astropulse None
ocl_device_name [GeForce RTX 2080]
ocl_device_version [OpenCL 1.2 CUDA]
ocl_pcie_id [08:00.0]
cl_index: ['GeForce RTX 2080', 'OpenCL 1.2 CUDA', '0']
ocl_device_name [GeForce RTX 2080]
ocl_device_version [OpenCL 1.2 CUDA]
ocl_pcie_id [0a:00.0]
cl_index: ['GeForce RTX 2080', 'OpenCL 1.2 CUDA', '1']
ocl_device_name [GeForce GTX 1080]
ocl_device_version [OpenCL 1.2 CUDA]
ocl_pcie_id [0b:00.0]
cl_index: ['GeForce GTX 1080', 'OpenCL 1.2 CUDA', '2']
{'08:00.0': ['GeForce RTX 2080', 'OpenCL 1.2 CUDA', '0'], '0a:00.0': ['GeForce RTX 2080', 'OpenCL 1.2 CUDA', '1'], '0b:00.0': ['GeForce GTX 1080', 'OpenCL 1.2 CUDA', '2']}
coproc_info.xml file not found: [None]
{}
{}
Found 3 GPUs
GPU: 08:00.0
['08:00.0 VGA compatible controller: NVIDIA Corporation TU104 [GeForce RTX 2080 Rev. A] (rev a1)', '\tSubsystem: eVga.com. Corp. Device 2184', '\tKernel driver in use: nvidia', '\tKernel modules: nvidiafb, nouveau, nvidia_drm, nvidia', '']
hw_file_search: []
Power reading for 08:00.0: 41.7 W
GPU: 0a:00.0
['0a:00.0 VGA compatible controller: NVIDIA Corporation TU104 [GeForce RTX 2080 Rev. A] (rev a1)', '\tSubsystem: eVga.com. Corp. Device 2184', '\tKernel driver in use: nvidia', '\tKernel modules: nvidiafb, nouveau, nvidia_drm, nvidia', '']
hw_file_search: []
Power reading for 0a:00.0: 56.94 W
GPU: 0b:00.0
['0b:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1080] (rev a1)', '\tSubsystem: eVga.com. Corp. GP104 [GeForce GTX 1080]', '\tKernel driver in use: nvidia', '\tKernel modules: nvidiafb, nouveau, nvidia_drm, nvidia', '']
hw_file_search: []
Power reading for 0b:00.0: 46.37 W
GPU_ITEM: uuid: d59a941a57574e6fa702393e9ea0eb6f
pcie_id: 08:00.0
model: NVIDIA Corporation TU104 [GeForce RTX 2080 Rev. A] (rev a1)
vendor: NVIDIA
driver: nvidiafb, nouveau, nvidia_drm, nvidia
openCL Device: GeForce RTX 2080
openCL Version: OpenCL 1.2 CUDA
openCL Index: 0
card number: 0
BOINC Device number: None
card path: /sys/class/drm/card0/device
hwmon path: None
Compute compatible: True
Energy compatible: True
GPU_ITEM: uuid: f2033af3e22548058e51bb5521bb02e9
pcie_id: 0a:00.0
model: NVIDIA Corporation TU104 [GeForce RTX 2080 Rev. A] (rev a1)
vendor: NVIDIA
driver: nvidiafb, nouveau, nvidia_drm, nvidia
openCL Device: GeForce RTX 2080
openCL Version: OpenCL 1.2 CUDA
openCL Index: 1
card number: 1
BOINC Device number: None
card path: /sys/class/drm/card1/device
hwmon path: None
Compute compatible: True
Energy compatible: True
GPU_ITEM: uuid: dc5c304a92cc4fc08ee45571f89c26f7
pcie_id: 0b:00.0
model: NVIDIA Corporation GP104 [GeForce GTX 1080] (rev a1)
vendor: NVIDIA
driver: nvidiafb, nouveau, nvidia_drm, nvidia
openCL Device: GeForce GTX 1080
openCL Version: OpenCL 1.2 CUDA
openCL Index: 2
card number: 2
BOINC Device number: None
card path: /sys/class/drm/card2/device
hwmon path: None
Compute compatible: True
Energy compatible: True
GPU_ITEM: uuid: d59a941a57574e6fa702393e9ea0eb6f
pcie_id: 08:00.0
model: NVIDIA Corporation TU104 [GeForce RTX 2080 Rev. A] (rev a1)
vendor: NVIDIA
driver: nvidiafb, nouveau, nvidia_drm, nvidia
openCL Device: GeForce RTX 2080
openCL Version: OpenCL 1.2 CUDA
openCL Index: 0
card number: 0
BOINC Device number: None
card path: /sys/class/drm/card0/device
hwmon path: None
Compute compatible: True
Energy compatible: True
GPU_ITEM: uuid: f2033af3e22548058e51bb5521bb02e9
pcie_id: 0a:00.0
model: NVIDIA Corporation TU104 [GeForce RTX 2080 Rev. A] (rev a1)
vendor: NVIDIA
driver: nvidiafb, nouveau, nvidia_drm, nvidia
openCL Device: GeForce RTX 2080
openCL Version: OpenCL 1.2 CUDA
openCL Index: 1
card number: 1
BOINC Device number: None
card path: /sys/class/drm/card1/device
hwmon path: None
Compute compatible: True
Energy compatible: True
GPU_ITEM: uuid: dc5c304a92cc4fc08ee45571f89c26f7
pcie_id: 0b:00.0
model: NVIDIA Corporation GP104 [GeForce GTX 1080] (rev a1)
vendor: NVIDIA
driver: nvidiafb, nouveau, nvidia_drm, nvidia
openCL Device: GeForce GTX 1080
openCL Version: OpenCL 1.2 CUDA
openCL Index: 2
card number: 2
BOINC Device number: None
card path: /sys/class/drm/card2/device
hwmon path: None
Compute compatible: True
Energy compatible: True
devmap: {0: 0, 1: 1, 2: 2}
Specified gpu_devices: [0]
Mismatch with allocated GPUS [3], reset allocated devices to 1
init_data.xml_template file [/home/keith/Downloads/benchMT-master/workdir/init_data.xml.template] does not exist, creating...
GPU_ITEM: uuid: d59a941a57574e6fa702393e9ea0eb6f
pcie_id: 08:00.0
model: NVIDIA Corporation TU104 [GeForce RTX 2080 Rev. A] (rev a1)
vendor: NVIDIA
driver: nvidiafb, nouveau, nvidia_drm, nvidia
openCL Device: GeForce RTX 2080
openCL Version: OpenCL 1.2 CUDA
openCL Index: 0
card number: 0
BOINC Device number: 0
card path: /sys/class/drm/card0/device
hwmon path: None
Compute compatible: True
Energy compatible: True
GPU_ITEM: uuid: f2033af3e22548058e51bb5521bb02e9
pcie_id: 0a:00.0
model: NVIDIA Corporation TU104 [GeForce RTX 2080 Rev. A] (rev a1)
vendor: NVIDIA
driver: nvidiafb, nouveau, nvidia_drm, nvidia
openCL Device: GeForce RTX 2080
openCL Version: OpenCL 1.2 CUDA
openCL Index: 1
card number: 1
BOINC Device number: 1
card path: /sys/class/drm/card1/device
hwmon path: None
Compute compatible: True
Energy compatible: True
GPU_ITEM: uuid: dc5c304a92cc4fc08ee45571f89c26f7
pcie_id: 0b:00.0
model: NVIDIA Corporation GP104 [GeForce GTX 1080] (rev a1)
vendor: NVIDIA
driver: nvidiafb, nouveau, nvidia_drm, nvidia
openCL Device: GeForce GTX 1080
openCL Version: OpenCL 1.2 CUDA
openCL Index: 2
card number: 2
BOINC Device number: 2
card path: /sys/class/drm/card2/device
hwmon path: None
Compute compatible: True
Energy compatible: True
GPU Energy metrics enabled for 3 GPUs.
GPU_ITEM: uuid: d59a941a57574e6fa702393e9ea0eb6f
pcie_id: 08:00.0
model: NVIDIA Corporation TU104 [GeForce RTX 2080 Rev. A] (rev a1)
vendor: NVIDIA
driver: nvidiafb, nouveau, nvidia_drm, nvidia
openCL Device: GeForce RTX 2080
openCL Version: OpenCL 1.2 CUDA
openCL Index: 0
card number: 0
BOINC Device number: 0
card path: /sys/class/drm/card0/device
hwmon path: None
Compute compatible: True
Energy compatible: True
GPU_ITEM: uuid: f2033af3e22548058e51bb5521bb02e9
pcie_id: 0a:00.0
model: NVIDIA Corporation TU104 [GeForce RTX 2080 Rev. A] (rev a1)
vendor: NVIDIA
driver: nvidiafb, nouveau, nvidia_drm, nvidia
openCL Device: GeForce RTX 2080
openCL Version: OpenCL 1.2 CUDA
openCL Index: 1
card number: 1
BOINC Device number: 1
card path: /sys/class/drm/card1/device
hwmon path: None
Compute compatible: True
Energy compatible: True
GPU_ITEM: uuid: dc5c304a92cc4fc08ee45571f89c26f7
pcie_id: 0b:00.0
model: NVIDIA Corporation GP104 [GeForce GTX 1080] (rev a1)
vendor: NVIDIA
driver: nvidiafb, nouveau, nvidia_drm, nvidia
openCL Device: GeForce GTX 1080
openCL Version: OpenCL 1.2 CUDA
openCL Index: 2
card number: 2
BOINC Device number: 2
card path: /sys/class/drm/card2/device
hwmon path: None
Compute compatible: True
Energy compatible: True
Hostname: Serenity
Run Name:
APP Mode: MultiBeam
benchMT version: v2.0.0
Platform: Linux 5.3.0-26-generic
OS Description: Ubuntu 18.04.3 LTS
CPU Model: AMD Ryzen 9 3950X 16-Core Processor
CPU MHz: 4200.0000
CPU Cores: 16
CPU Threads: 32
GPU Count: 3
GPU Threads: 3
Specified GPU Device List: [0]
Devices Map: {0: 0, 1: 1, 2: 2}
BOINC Device List: [0, 1, 2]
GPU Card Number List: [0, 1, 2]
GPU Details:
Card0, PCIE ID: 08:00.0, Model: NVIDIA Corporation TU104 [GeForce RTX 2080 Rev. A] (rev a1)
Card1, PCIE ID: 0a:00.0, Model: NVIDIA Corporation TU104 [GeForce RTX 2080 Rev. A] (rev a1)
Card2, PCIE ID: 0b:00.0, Model: NVIDIA Corporation GP104 [GeForce GTX 1080] (rev a1)
Current Dir: /home/keith/Downloads/benchMT-master
Slots Dir: /home/keith/Downloads/benchMT-master/workdir/Slots/
TimeNow: Sun Jan 26 02:44:50 2020
TimeNowShort: 0126_024450
CPU App Path: /home/keith/Downloads/benchMT-master/APPS_CPU/
GPU App Path: /home/keith/Downloads/benchMT-master/APPS_GPU/
REF App Path: /home/keith/Downloads/benchMT-master/APPS_REF/
Reference Results Path: /home/keith/Downloads/benchMT-master/APPS_REF/REF_RESULTS/
STD Signal WU Path: /home/keith/Downloads/benchMT-master/WU_std_signal/
WU Path: /home/keith/Downloads/benchMT-master/WU_test/
Test Data Path: /home/keith/Downloads/benchMT-master/testData/
BOINC Home: /home/keith/Desktop/BOINC/
Repetitions: 1
Allocated CPU Threads: 4
Allocated GPU Threads: 1
Mode yes: False
Mode noBS: False
Mode std_signals: False
Mode display_slots: False
Mode display_compact: False
Mode no_ref: False
Mode force_ref: False
Mode energy: True
Mode astropulse: False

Initial WU list
┌────┬────┬───┬────────────────────────────────────────────────────────────┬────────┬────────┬───────────┬────────┐
│Job#│Slot│xPU│app_name │ start │ finish │tot_time │ state │
│ │ │ │app_args │wu_name │
├────┼────┼───┼────────────────────────────────────────────────────────────┼────────┬────────┬───────────┬────────┤
│0 │ NA │UNK│ │ NA │ NA │ NA │PENDING │
│ │ │ │ │21no18aa.19740.24238.14.41.27.wu │
└────┴────┴───┴────────────────────────────────────────────────────────────┴──────────────────────────────────────┘

Reference Job List
┌────┬────┬───┬────────────────────────────────────────────────────────────┬────────┬────────┬───────────┬────────┐
│Job#│Slot│xPU│app_name │ start │ finish │tot_time │ state │
│ │ │ │app_args │wu_name │
├────┼────┼───┼────────────────────────────────────────────────────────────┼────────┬────────┬───────────┬────────┤
│0 │ NA │REF│ref-cpu.setiathome_8.00_x86_64-pc-linux-gnu │SKIPPED │SKIPPED │SKIPPED │COMPLETE│
│ │ │ │ --nographics │21no18aa.19740.24238.14.41.27.wu │
└────┴────┴───┴────────────────────────────────────────────────────────────┴──────────────────────────────────────┘

Final Job List
┌────┬────┬───┬────────────────────────────────────────────────────────────┬────────┬────────┬───────────┬────────┐
│Job#│Slot│xPU│app_name │ start │ finish │tot_time │ state │
│ │ │ │app_args │wu_name │
├────┼────┼───┼────────────────────────────────────────────────────────────┼────────┬────────┬───────────┬────────┤
│0 │ NA │GPU│setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda102 │ NA │ NA │ NA │PENDING │
│ │ │ │ │21no18aa.19740.24238.14.41.27.wu │
└────┴────┴───┴────────────────────────────────────────────────────────────┴──────────────────────────────────────┘
For 0 CPU jobs and 1 GPU slots. Allocated Threads reduced to 1
List of Initialized Slots
SlotNum | platform | device | state | job | SlotDir
-0------| GPU | 0 | EMPTY | None| /home/keith/Downloads/benchMT-master/workdir/Slots/0

1 total slots

Pending jobs (CPU/GPU): 0 / 1
Pending reference jobs: 0
Execute listed jobs? [y/N]y

1 of 2 jobs complete

┌────┬────┬───┬────────────────────────────────────────────────────────────┬────────┬────────┬───────────┬────────┐
│Job#│Slot│xPU│app_name │ start │ finish │tot_time │ state │
│ │ │ │app_args │wu_name │
├────┼────┼───┼────────────────────────────────────────────────────────────┼────────┬────────┬───────────┬────────┤
│0 │ NA │GPU│setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda102 │ NA │ NA │ NA │PENDING │
│ │ │ │ │21no18aa.19740.24238.14.41.27.wu │
└────┴────┴───┴────────────────────────────────────────────────────────────┴──────────────────────────────────────┘
boinc is not running, skip suspend
Platform: CPU Total Pending Jobs: 1
Pending jobs (CPU/GPU/REF): 0 / 1 / 0
Available slots (CPU/GPU): 0 / 1

Platform: GPU Total Pending Jobs: 1
Pending jobs (CPU/GPU/REF): 0 / 1 / 0
Available slots (CPU/GPU): 0 / 1

SlotNum: 0 JobUUID: 6c1250feffbb4aa1b46363cae49c7de5
Device 0 maps to card 0
/home/keith/Downloads/benchMT-master/APPS_GPU/setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda102 -device 0

Run Details
Final command: /usr/bin/time -f "Real=%e|User=%U|System=%S|MaxMem=%M|SwapNum=%W|CtxSwt=%c|MajPF=%F" /home/keith/Downloads/benchMT-master/APPS_GPU/setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda102 -device 0
Execution directory: /home/keith/Downloads/benchMT-master/workdir/Slots/0

1 of 2 jobs complete

┌────┬────┬───┬────────────────────────────────────────────────────────────┬────────┬────────┬───────────┬────────┐
│Job#│Slot│xPU│app_name │ start │ finish │tot_time │ state │
│ │ │ │app_args │wu_name │
├────┼────┼───┼────────────────────────────────────────────────────────────┼────────┬────────┬───────────┬────────┤
│0 │ 0 │GPU│setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda102 │02:47:01│ NA │ NA │ACTIVE │
│ │ │ │ -device 0 │21no18aa.19740.24238.14.41.27.wu │
└────┴────┴───┴────────────────────────────────────────────────────────────┴──────────────────────────────────────┘
gCudaDevProps.multiProcessorCount = 46
Work data buffer for fft results size = 320864256
MallocHost G=67108864 T=33554432 P=18874368 (16)
MallocHost tmp_PoTP=16777216
MallocHost tmp_PoTP2=16777216
MallocHost tmp_PoTT=16777216
MallocHost tmp_PoTG=4194304
MallocHost best_PoTP=16777216
MallocHost bestPoTG=4194304
Allocating tmp data buf for unroll 1
MallocHost tmp_smallPoT=524288
MallocHost PowerSpectrumSumMax=1572864
CUDA stream priority range: low 0 and high: -1
GPSF 1.845660 2 3.518231
Gauss: start 2 stop 62 len 60
AcIn 16779264 AcOut 33558528
Mallocing blockSums 24576 bytes
before async chirp
after fft plans
bBbB...............Power reading for 08:00.0: 160.6 W
Cumulative Energy for 08:00.0: 0.000135 kWh
.......................Power reading for 08:00.0: 164.08 W
Cumulative Energy for 08:00.0: 0.000272 kWh
........................Power reading for 08:00.0: 163.04 W
Cumulative Energy for 08:00.0: 0.000409 kWh
........................Power reading for 08:00.0: 164.12 W
Cumulative Energy for 08:00.0: 0.000546 kWh
........................Power reading for 08:00.0: 163.84 W
Cumulative Energy for 08:00.0: 0.000683 kWh
........................Power reading for 08:00.0: 164.99 W
Cumulative Energy for 08:00.0: 0.000821 kWh
........................Power reading for 08:00.0: 164.52 W
Cumulative Energy for 08:00.0: 0.000959 kWh
........................Power reading for 08:00.0: 163.62 W
Cumulative Energy for 08:00.0: 0.001096 kWh
........................Power reading for 08:00.0: 158.6 W
Cumulative Energy for 08:00.0: 0.001229 kWh
..................bB........Power reading for 08:00.0: 162.61 W
Cumulative Energy for 08:00.0: 0.001365 kWh
...................bB...bBP...Power reading for 08:00.0: 169.16 W
Cumulative Energy for 08:00.0: 0.001507 kWh
..........................Power reading for 08:00.0: 159.43 W
Cumulative Energy for 08:00.0: 0.00164 kWh
................FFtLen : spike gauss autocorr triplet pulse
1: 0 0 0 0 0
2: 0 0 0 0 0
4: 0 0 0 0 0
8: 15 15 0 15 7
16: 29 29 0 29 15
32: 57 57 0 57 29
64: 115 115 0 115 57
128: 229 229 0 229 115
256: 459 459 0 459 229
512: 919 919 0 919 459
1024: 1837 1837 0 1837 919
2048: 3673 3673 0 3673 1837
4096: 7347 7347 0 7347 0
8192: 14695 14695 0 0 0
16384: 29389 29389 0 0 0
32768: 13525 0 0 0 0
65536: 16229 0 0 0 0
131072: 64917 0 64918 0 0

Best scores written
Out file closed
Cuda free done
Cuda device reset done
Power reading for 08:00.0: 61.28 W
Cumulative Energy for 08:00.0: 0.001691 kWh
Time output: Real=40.08|User=7.72|System=2.26|MaxMem=691596|SwapNum=0|CtxSwt=147|MajPF=0
Copy2: /home/keith/Downloads/benchMT-master/workdir/Slots/0/results.sah /home/keith/Downloads/benchMT-master/testData/Serenity_benchMT__0126_024450/result.setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda102.21no18aa.19740.24238.14.41.27.wu.6c1250feffbb4aa1b46363cae49c7de5.sah
Copy2: /home/keith/Downloads/benchMT-master/workdir/Slots/0/stderr.txt /home/keith/Downloads/benchMT-master/testData/Serenity_benchMT__0126_024450/stderr.setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda102.21no18aa.19740.24238.14.41.27.wu.6c1250feffbb4aa1b46363cae49c7de5.txt
Angle Range: [0.73613926793513]
Results compare command: /home/keith/Downloads/benchMT-master/rescmpv5_l /home/keith/Downloads/benchMT-master/testData/Serenity_benchMT__0126_024450/result.setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda102.21no18aa.19740.24238.14.41.27.wu.6c1250feffbb4aa1b46363cae49c7de5.sah /home/keith/Downloads/benchMT-master/APPS_REF/REF_RESULTS/ref-result.setiathome_8.00_x86_64-pc-linux-gnu.21no18aa.19740.24238.14.41.27.wu.sah 2>/dev/null
Results compare output: Result : Strongly similar, Q= 99.72%
Clean slots removed: /home/keith/Downloads/benchMT-master/workdir/Slots/0/boinc_finish_called
Clean slots removed: /home/keith/Downloads/benchMT-master/workdir/Slots/0/state.sah

2 of 2 jobs complete

┌────┬────┬───┬────────────────────────────────────────────────────────────┬────────┬────────┬───────────┬────────┐
│Job#│Slot│xPU│app_name │ start │ finish │tot_time │ state │
│ │ │ │app_args │wu_name │
├────┼────┼───┼────────────────────────────────────────────────────────────┼────────┬────────┬───────────┬────────┤
│0 │ NA │GPU│setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda102 │02:47:01│02:47:44│0:00:42.185│COMPLETE│
│ │ │ │ -device 0 │21no18aa.19740.24238.14.41.27.wu │
└────┴────┴───┴────────────────────────────────────────────────────────────┴──────────────────────────────────────┘
0

2 of 2 jobs complete

┌────┬────┬───┬────────────────────────────────────────────────────────────┬────────┬────────┬───────────┬────────┐
│Job#│Slot│xPU│app_name │ start │ finish │tot_time │ state │
│ │ │ │app_args │wu_name │
├────┼────┼───┼────────────────────────────────────────────────────────────┼────────┬────────┬───────────┬────────┤
│0 │ NA │GPU│setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda102 │02:47:01│02:47:44│0:00:42.185│COMPLETE│
│ │ │ │ -device 0 │21no18aa.19740.24238.14.41.27.wu │
└────┴────┴───┴────────────────────────────────────────────────────────────┴──────────────────────────────────────┘
Slot list:
SlotNum | platform | device | state | job | SlotDir
-0------| GPU | 0 | EMPTY | None| /home/keith/Downloads/benchMT-master/workdir/Slots/0

1 total slots

boinc is not running, skip resume
BenchMT run complete. Results location: /home/keith/Downloads/benchMT-master/testData/Serenity_benchMT__0126_024450

keith@Serenity:~/Downloads/benchMT-master$

To find the coproc_info.xml file, you really need to use the boinc_home location given in the command line parameters, or the boinc_home location specified in the BenchCFG file.
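A minimal sketch of that lookup, in case it helps. This is not benchMT's actual code: the function name, its arguments, and the rule that the command line value wins over the BenchCFG value are all assumptions made for illustration.

```python
import os
from typing import Optional


def resolve_coproc_info(cli_boinc_home: Optional[str],
                        cfg_boinc_home: Optional[str]) -> Optional[str]:
    """Return the expected path of coproc_info.xml, or None if no boinc_home is known.

    Hypothetical helper: assumes the command line value takes precedence
    over the value read from the BenchCFG file.
    """
    for home in (cli_boinc_home, cfg_boinc_home):
        if home:
            # Expand a leading "~" so a value such as "~/BOINC" also works.
            return os.path.join(os.path.expanduser(home), "coproc_info.xml")
    return None


if __name__ == "__main__":
    # Example directories are placeholders, not real benchMT defaults.
    print(resolve_coproc_info(None, "/var/lib/boinc"))
```

Printing the resolved path before reading the file makes it obvious which boinc_home is actually in use.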

@Ricks-Lab
Owner Author

Apologies, but I just realized that I didn't actually push my latest update. Just pushed it. It will give the full path benchMT is using and suppress app output.

Not sure if you are downloading each time or using git. git would be much easier:
Download once using:
git clone https://github.com/Ricks-Lab/benchMT.git
Update with:
git pull

@JStateson

JStateson commented Feb 3, 2020

Some topography mapping I put together using MS Word. Not sure how useful it is.
https://stateson.net/images/mb_topography.docx

@KeithMyers

James, the URL you provided goes to Rick's repo and not yours.

@JStateson

JStateson commented Feb 4, 2020

Thanks
Was trying to make some sense of the bus ID as it relates to which slot a card is in. Going to try CrossFire and SLI when I get a chance, to see if that makes a difference. I see really different values for Windows and Linux on the same motherboard, a "TB85".
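For comparing bus IDs across tools, a small parser can help. The sketch below splits an ID such as the `08:00.0` seen in the benchMT power readings into its bus, device, and function numbers. The names `PciAddress` and `parse_pci_address` are made up for this example, and treating a missing domain as 0 is an assumption.

```python
import re
from typing import NamedTuple


class PciAddress(NamedTuple):
    domain: int
    bus: int
    device: int
    function: int


# Accepts "08:00.0" or the longer "0000:08:00.0" form; all fields are hexadecimal.
_PCI_RE = re.compile(
    r"^(?:([0-9a-fA-F]{4}):)?([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$"
)


def parse_pci_address(text: str) -> PciAddress:
    """Parse a PCI address string; a missing domain is assumed to be 0."""
    match = _PCI_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a PCI address: {text!r}")
    domain, bus, device, function = match.groups()
    return PciAddress(int(domain or "0", 16), int(bus, 16),
                      int(device, 16), int(function, 16))


if __name__ == "__main__":
    print(parse_pci_address("08:00.0"))
```

Parsed values compare cleanly even when one tool prints the domain prefix and another omits it. Whether the bus number says anything reliable about the physical slot is a separate question this does not answer.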

BTW, I broke down and ordered my first RTX board, an RTX 2080 Super from Gigabyte. They repaired two of my "blown out" RX 570s at no cost, and that made me select them instead of EVGA.

@KeithMyers

Happy to hear that Gigabyte stepped up. I didn't have much luck with them back in the day with my GTX 460, which had a fan die. Had to buy an aftermarket fan myself and kludge it in.

Have had no issues getting EVGA cards replaced or repaired under warranty. No reason to change since they have always treated me well.
