[BuildRules] Use hipcc to link product which has hip.cc sources #8274

Merged
smuzaffar merged 3 commits into IB/CMSSW_13_0_X/master from smuzaffar-patch-12
Jan 31, 2023


smuzaffar
Contributor

@smuzaffar smuzaffar commented Jan 30, 2023

  • Use hipcc for linking products (libs, bin/tests) which have hip.cc files
  • Bug fix: avoid duplicate source files, e.g. file="*.cc *.hip.cc" can list hip.cc files twice (once matched by *.cc and once by *.hip.cc)
  • Fix _rocm.a build order
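
The duplicate-source bug in the second bullet can be illustrated with a minimal Python sketch. This is not the SCRAM build-rule code; the function names `expand_patterns_naive` and `expand_patterns` are hypothetical, and the file names are taken from the log further down in this thread. It only shows why overlapping patterns list a file twice and how an order-preserving de-duplication avoids that.

```python
from fnmatch import fnmatchcase


def expand_patterns_naive(patterns, filenames):
    """Collect matches pattern by pattern, with no de-duplication."""
    return [name
            for pattern in patterns
            for name in filenames
            if fnmatchcase(name, pattern)]


def expand_patterns(patterns, filenames):
    """Collect matches, keeping each file once in first-match order."""
    seen = set()
    matched = []
    for pattern in patterns:
        for name in filenames:
            if fnmatchcase(name, pattern) and name not in seen:
                seen.add(name)
                matched.append(name)
    return matched


files = ["ROCmTestDeviceAdditionAlgo.hip.cc", "ROCmTestDeviceAdditionModule.cc"]
patterns = ["*.cc", "*.hip.cc"]

# The .hip.cc file matches both patterns, so the naive expansion lists it twice.
naive = expand_patterns_naive(patterns, files)
unique = expand_patterns(patterns, files)
print(naive)
print(unique)
```

Keeping first-match order matters here because the resulting list ends up on a linker command line, where a stable order makes the build reproducible.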

@cmsbuild
Contributor

A new Pull Request was created by @smuzaffar (Malik Shahzad Muzaffar) for branch IB/CMSSW_13_0_X/master.

@cmsbuild, @smuzaffar, @aandvalenzuela, @iarspider can you please review it and eventually sign? Thanks.
@perrotta, @dpiparo, @rappoccio you are the release manager for this.
cms-bot commands are listed here

@smuzaffar
Contributor Author

please test with cms-sw/cmssw#40619

@cmsbuild
Contributor

Pull request #8274 was updated.

@smuzaffar
Contributor Author

please test with cms-sw/cmssw#40605

@fwyzard
Contributor

fwyzard commented Jan 30, 2023

@smuzaffar I found the next problem with the HIPCC linking: it looks like some input files are listed twice, resulting in "duplicate symbol" errors.

For example, when linking HeterogeneousCore/ROCmTestDevice from cms-sw/cmssw#40637 I get

>> Building edm plugin tmp/el8_amd64_gcc11/src/HeterogeneousCore/ROCmTestDevice/plugins/HeterogeneousCoreROCmTestDevicePlugins/libHeterogeneousCoreROCmTestDevicePlugins.so
/cvmfs/patatrack.cern.ch/externals/x86_64/rhel8/amd/rocm-5.4.2/bin/hipcc -fgpu-rdc --offload-arch=gfx900 --target=x86_64-redhat-linux-gnu --gcc-toolchain=/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02770/el8_amd64_gcc11/external/gcc/11.2.1-f9b9dfdd886f71cd63f5538223d8f161 -O2 -pthread -pipe -Werror=main -Werror=pointer-arith -Werror=overlength-strings -Wno-vla -Werror=overflow -std=c++1z -ftree-vectorize -Werror=array-bounds -Werror=type-limits -fvisibility-inlines-hidden -fno-math-errno --param vect-max-version-for-alias-checks=50 -Xassembler --compress-debug-sections -msse3 -felide-constructors -fmessage-length=0 -Wall -Wno-long-long -Wreturn-type -Wextra -Wpessimizing-move -Wclass-memaccess -Wno-cast-function-type -Wno-unused-but-set-parameter -Wno-ignored-qualifiers -Wno-deprecated-copy -Wno-unused-parameter -Wunused -Wparentheses -Wno-deprecated -Werror=return-type -Werror=missing-braces -Werror=unused-value -Werror=unused-label -Werror=address -Werror=format -Werror=sign-compare -Werror=write-strings -Werror=delete-non-virtual-dtor -Werror=strict-aliasing -Werror=narrowing -Werror=reorder -Werror=unused-variable -Werror=conversion-null -Wnon-virtual-dtor -Werror=switch -fdiagnostics-show-option -Wno-unused-local-typedefs -Wno-attributes -Wno-psabi -Wno-c99-extensions -Wno-c++11-narrowing -D__STRICT_ANSI__ -Wno-unused-private-field -Wno-unknown-pragmas -Wno-unused-command-line-argument -Wno-unknown-warning-option -ftemplate-depth=512 -Wno-error=potentially-evaluated-expression -Wno-tautological-type-limit-compare -fsized-deallocation -Wno-error=unused-variable -DBOOST_DISABLE_ASSERTS -shared -Wl,-E -Wl,-z,defs tmp/el8_amd64_gcc11/src/HeterogeneousCore/ROCmTestDevice/plugins/HeterogeneousCoreROCmTestDevicePlugins/ROCmTestDeviceAdditionAlgo.hip.cc.o tmp/el8_amd64_gcc11/src/HeterogeneousCore/ROCmTestDevice/plugins/HeterogeneousCoreROCmTestDevicePlugins/ROCmTestDeviceAdditionModule.cc.o 
tmp/el8_amd64_gcc11/src/HeterogeneousCore/ROCmTestDevice/plugins/HeterogeneousCoreROCmTestDevicePlugins/ROCmTestDeviceAdditionAlgo.hip.cc.o -o tmp/el8_amd64_gcc11/src/HeterogeneousCore/ROCmTestDevice/plugins/HeterogeneousCoreROCmTestDevicePlugins/libHeterogeneousCoreROCmTestDevicePlugins.so -Wl,-E -Wl,--hash-style=gnu -L/data/user/fwyzard/CMSSW_13_0_X_2023-01-29-1100/biglib/el8_amd64_gcc11 -L/data/user/fwyzard/CMSSW_13_0_X_2023-01-29-1100/lib/el8_amd64_gcc11 -L/data/user/fwyzard/CMSSW_13_0_X_2023-01-29-1100/external/el8_amd64_gcc11/lib -L/cvmfs/cms-ib.cern.ch/sw/x86_64/week0/el8_amd64_gcc11/cms/cmssw/CMSSW_13_0_X_2023-01-29-1100/biglib/el8_amd64_gcc11 -L/cvmfs/cms-ib.cern.ch/sw/x86_64/week0/el8_amd64_gcc11/cms/cmssw/CMSSW_13_0_X_2023-01-29-1100/lib/el8_amd64_gcc11 -L/cvmfs/cms-ib.cern.ch/sw/x86_64/week0/el8_amd64_gcc11/cms/cmssw/CMSSW_13_0_X_2023-01-29-1100/external/el8_amd64_gcc11/lib -L/data/user/fwyzard/CMSSW_13_0_X_2023-01-29-1100/static/el8_amd64_gcc11 -L/cvmfs/cms-ib.cern.ch/sw/x86_64/week0/el8_amd64_gcc11/cms/cmssw/CMSSW_13_0_X_2023-01-29-1100/static/el8_amd64_gcc11 -lHeterogeneousCoreROCmTestDevice_rocm -lFWCoreFramework -lFWCoreCommon -lFWCoreServiceRegistry -lDataFormatsCommon -lFWCoreParameterSet -lFWCoreMessageLogger -lDataFormatsProvenance -lFWCorePluginManager -lFWCoreReflection -lDataFormatsStdDictionaries -lFWCoreConcurrency -lFWCoreUtilities -lFWCoreVersion -lHeterogeneousCoreROCmTestDevice -lHeterogeneousCoreROCmUtilities -lTree -lNet -lThread -lMathCore -lRIO -lCore -lboost_thread -lboost_date_time -lpcre -lbz2 -luuid -ltbb -llzma -lz -lfmt -lcms-md5 -lamdhip64 -lcrypt -ldl -lrt -lstdc++fs -ltinyxml2
lld: error: duplicate symbol: HeterogeneousCoreROCmTestDevicePlugins::kernel_add_vectors_f(float const*, float const*, float*, unsigned long)
>>> defined in /tmp/fwyzard/ROCmTestDeviceAdditionAlgo-126b16/ROCmTestDeviceAdditionAlgo-gfx900.o
>>> defined in /tmp/fwyzard/ROCmTestDeviceAdditionAlgo-fcd91c/ROCmTestDeviceAdditionAlgo-gfx900.o
clang-15: error: amdgcn-link command failed with exit code 1 (use -v to see invocation)

tmp/el8_amd64_gcc11/src/HeterogeneousCore/ROCmTestDevice/plugins/HeterogeneousCoreROCmTestDevicePlugins/ROCmTestDeviceAdditionAlgo.hip.cc.o is listed twice, hence the error.
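
A quick way to spot this kind of problem in a long link line is to count the object-file arguments. The following is a minimal Python sketch, assuming the command is available as a single string; `duplicate_objects` is a hypothetical helper, and the command below is a shortened stand-in for the real one, with made-up short paths.

```python
import shlex
from collections import Counter


def duplicate_objects(command):
    """Return the .o arguments that appear more than once, in first-seen order."""
    args = shlex.split(command)
    counts = Counter(arg for arg in args if arg.endswith(".o"))
    return [obj for obj, n in counts.items() if n > 1]


# Shortened, illustrative command line (not the full hipcc invocation above).
link_cmd = (
    "hipcc -fgpu-rdc -shared "
    "plugins/ROCmTestDeviceAdditionAlgo.hip.cc.o "
    "plugins/ROCmTestDeviceAdditionModule.cc.o "
    "plugins/ROCmTestDeviceAdditionAlgo.hip.cc.o "
    "-o libHeterogeneousCoreROCmTestDevicePlugins.so -lamdhip64"
)

dups = duplicate_objects(link_cmd)
print(dups)
```

The `.o` suffix check skips the `-o` flag and the `.so` output name, so only object files are counted.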

If I remove the duplicate entry, I move on to the next error:

/cvmfs/patatrack.cern.ch/externals/x86_64/rhel8/amd/rocm-5.4.2/bin/hipcc -fgpu-rdc --offload-arch=gfx900 --target=x86_64-redhat-linux-gnu --gcc-toolchain=/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02770/el8_amd64_gcc11/external/gcc/11.2.1-f9b9dfdd886f71cd63f5538223d8f161 -O2 -pthread -pipe -Werror=main -Werror=pointer-arith -Werror=overlength-strings -Wno-vla -Werror=overflow -std=c++1z -ftree-vectorize -Werror=array-bounds -Werror=type-limits -fvisibility-inlines-hidden -fno-math-errno --param vect-max-version-for-alias-checks=50 -Xassembler --compress-debug-sections -msse3 -felide-constructors -fmessage-length=0 -Wall -Wno-long-long -Wreturn-type -Wextra -Wpessimizing-move -Wclass-memaccess -Wno-cast-function-type -Wno-unused-but-set-parameter -Wno-ignored-qualifiers -Wno-deprecated-copy -Wno-unused-parameter -Wunused -Wparentheses -Wno-deprecated -Werror=return-type -Werror=missing-braces -Werror=unused-value -Werror=unused-label -Werror=address -Werror=format -Werror=sign-compare -Werror=write-strings -Werror=delete-non-virtual-dtor -Werror=strict-aliasing -Werror=narrowing -Werror=reorder -Werror=unused-variable -Werror=conversion-null -Wnon-virtual-dtor -Werror=switch -fdiagnostics-show-option -Wno-unused-local-typedefs -Wno-attributes -Wno-psabi -Wno-c99-extensions -Wno-c++11-narrowing -D__STRICT_ANSI__ -Wno-unused-private-field -Wno-unknown-pragmas -Wno-unused-command-line-argument -Wno-unknown-warning-option -ftemplate-depth=512 -Wno-error=potentially-evaluated-expression -Wno-tautological-type-limit-compare -fsized-deallocation -Wno-error=unused-variable -DBOOST_DISABLE_ASSERTS -shared -Wl,-E -Wl,-z,defs tmp/el8_amd64_gcc11/src/HeterogeneousCore/ROCmTestDevice/plugins/HeterogeneousCoreROCmTestDevicePlugins/ROCmTestDeviceAdditionAlgo.hip.cc.o tmp/el8_amd64_gcc11/src/HeterogeneousCore/ROCmTestDevice/plugins/HeterogeneousCoreROCmTestDevicePlugins/ROCmTestDeviceAdditionModule.cc.o -o 
tmp/el8_amd64_gcc11/src/HeterogeneousCore/ROCmTestDevice/plugins/HeterogeneousCoreROCmTestDevicePlugins/libHeterogeneousCoreROCmTestDevicePlugins.so -Wl,-E -Wl,--hash-style=gnu -L/data/user/fwyzard/CMSSW_13_0_X_2023-01-29-1100/biglib/el8_amd64_gcc11 -L/data/user/fwyzard/CMSSW_13_0_X_2023-01-29-1100/lib/el8_amd64_gcc11 -L/data/user/fwyzard/CMSSW_13_0_X_2023-01-29-1100/external/el8_amd64_gcc11/lib -L/cvmfs/cms-ib.cern.ch/sw/x86_64/week0/el8_amd64_gcc11/cms/cmssw/CMSSW_13_0_X_2023-01-29-1100/biglib/el8_amd64_gcc11 -L/cvmfs/cms-ib.cern.ch/sw/x86_64/week0/el8_amd64_gcc11/cms/cmssw/CMSSW_13_0_X_2023-01-29-1100/lib/el8_amd64_gcc11 -L/cvmfs/cms-ib.cern.ch/sw/x86_64/week0/el8_amd64_gcc11/cms/cmssw/CMSSW_13_0_X_2023-01-29-1100/external/el8_amd64_gcc11/lib -L/data/user/fwyzard/CMSSW_13_0_X_2023-01-29-1100/static/el8_amd64_gcc11 -L/cvmfs/cms-ib.cern.ch/sw/x86_64/week0/el8_amd64_gcc11/cms/cmssw/CMSSW_13_0_X_2023-01-29-1100/static/el8_amd64_gcc11 -lHeterogeneousCoreROCmTestDevice_rocm -lFWCoreFramework -lFWCoreCommon -lFWCoreServiceRegistry -lDataFormatsCommon -lFWCoreParameterSet -lFWCoreMessageLogger -lDataFormatsProvenance -lFWCorePluginManager -lFWCoreReflection -lDataFormatsStdDictionaries -lFWCoreConcurrency -lFWCoreUtilities -lFWCoreVersion -lHeterogeneousCoreROCmTestDevice -lHeterogeneousCoreROCmUtilities -lTree -lNet -lThread -lMathCore -lRIO -lCore -lboost_thread -lboost_date_time -lpcre -lbz2 -luuid -ltbb -llzma -lz -lfmt -lcms-md5 -lamdhip64 -lcrypt -ldl -lrt -lstdc++fs -ltinyxml2
lld: error: undefined hidden symbol: cms::rocmtest::add_vectors_f(float const*, float const*, float*, unsigned long)
>>> referenced by lto.tmp:(HeterogeneousCoreROCmTestDevicePlugins::kernel_add_vectors_f(float const*, float const*, float*, unsigned long))
>>> referenced by lto.tmp:(HeterogeneousCoreROCmTestDevicePlugins::kernel_add_vectors_f(float const*, float const*, float*, unsigned long))

The reason seems to be that the _rocm.a library was not generated.
If I build it by hand:

ar rcs tmp/el8_amd64_gcc11/src/HeterogeneousCore/ROCmTestDevice/src/HeterogeneousCoreROCmTestDevice/libHeterogeneousCoreROCmTestDevice_rocm.a tmp/el8_amd64_gcc11/src/HeterogeneousCore/ROCmTestDevice/src/HeterogeneousCoreROCmTestDevice/DeviceAddition.hip.cc.o
cp tmp/el8_amd64_gcc11/src/HeterogeneousCore/ROCmTestDevice/src/HeterogeneousCoreROCmTestDevice/libHeterogeneousCoreROCmTestDevice_rocm.a static/el8_amd64_gcc11/

then the same linker command works.
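
The build-order requirement can be sketched as a dependency graph: the plugin's link step depends on the `_rocm.a` archive, which in turn depends on its own object file. The Python sketch below is an illustration only, not how the SCRAM GMake rules express it. The dependency table is an assumption reconstructed from the commands in this comment, and it uses `graphlib` from the standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

ARCHIVE = "libHeterogeneousCoreROCmTestDevice_rocm.a"
PLUGIN = "libHeterogeneousCoreROCmTestDevicePlugins.so"

# Each target maps to the set of targets that must be built before it.
# This table is an assumption based on the log, not the real rule set.
deps = {
    "DeviceAddition.hip.cc.o": set(),
    ARCHIVE: {"DeviceAddition.hip.cc.o"},
    "ROCmTestDeviceAdditionAlgo.hip.cc.o": set(),
    "ROCmTestDeviceAdditionModule.cc.o": set(),
    PLUGIN: {
        "ROCmTestDeviceAdditionAlgo.hip.cc.o",
        "ROCmTestDeviceAdditionModule.cc.o",
        ARCHIVE,
    },
}

# static_order() yields every target after all of its prerequisites.
order = list(TopologicalSorter(deps).static_order())
for target in order:
    print(target)
```

If the archive is missing from the plugin's prerequisites, nothing forces it to exist when the link step runs, which matches the undefined-symbol failure above.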

@smuzaffar
Contributor Author

@fwyzard, cms-sw/cmssw#40605 and cms-sw/cmssw#40619 work with the latest build rules. I cannot build cms-sw/cmssw#40637:

fatal error: HeterogeneousCore/ROCmUtilities/interface/requireDevices.h: No such file or directory

Do I need any other PR to go with it?

@fwyzard
Contributor

fwyzard commented Jan 30, 2023

Yes, cms-sw/cmssw#40637 needs cms-sw/cmssw#40619.

@smuzaffar
Contributor Author

@fwyzard, the issue with the duplicate object file is fixed (scram was adding the hip.cc file twice, as it was matched by both *.cc and *.hip.cc). I have also fixed the rocm.a build order issue, but I still get the error:

Singularity> ls -l static/el8_amd64_gcc11/
total 56
-rw-r--r--. 1 muzaffar zh 30510 Jan 30 17:23 libHeterogeneousCoreROCmTestDevicePlugins_rocm.a
-rw-r--r--. 1 muzaffar zh 22202 Jan 30 17:23 libHeterogeneousCoreROCmTestDevice_rocm.a
Singularity> scram b -v tmp/el8_amd64_gcc11/src/HeterogeneousCore/ROCmTestDevice/plugins/HeterogeneousCoreROCmTestDevicePlugins/libHeterogeneousCoreROCmTestDevicePlugins.so
...
>> Building edm plugin tmp/el8_amd64_gcc11/src/HeterogeneousCore/ROCmTestDevice/plugins/HeterogeneousCoreROCmTestDevicePlugins/libHeterogeneousCoreROCmTestDevicePlugins.so
/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02770/el8_amd64_gcc11/external/rocm/5.0.2-95c215630c939706b0552e3eee38861c/bin/hipcc -fgpu-rdc --offload-arch=gfx900 --target=x86_64-redhat-linux-gnu --gcc-toolchain=/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02770/el8_amd64_gcc11/external/gcc/11.2.1-f9b9dfdd886f71cd63f5538223d8f161 -O2 -pthread -pipe -Werror=main -Werror=pointer-arith -Werror=overlength-strings -Wno-vla -Werror=overflow -std=c++1z -ftree-vectorize -Werror=array-bounds -Werror=type-limits -fvisibility-inlines-hidden -fno-math-errno --param vect-max-version-for-alias-checks=50 -Xassembler --compress-debug-sections -msse3 -felide-constructors -fmessage-length=0 -Wall -Wno-long-long -Wreturn-type -Wextra -Wpessimizing-move -Wclass-memaccess -Wno-cast-function-type -Wno-unused-but-set-parameter -Wno-ignored-qualifiers -Wno-deprecated-copy -Wno-unused-parameter -Wunused -Wparentheses -Wno-deprecated -Werror=return-type -Werror=missing-braces -Werror=unused-value -Werror=unused-label -Werror=address -Werror=format -Werror=sign-compare -Werror=write-strings -Werror=delete-non-virtual-dtor -Werror=strict-aliasing -Werror=narrowing -Werror=reorder -Werror=unused-variable -Werror=conversion-null -Wnon-virtual-dtor -Werror=switch -fdiagnostics-show-option -Wno-unused-local-typedefs -Wno-attributes -Wno-psabi -Wno-c99-extensions -Wno-c++11-narrowing -D__STRICT_ANSI__ -Wno-unused-private-field -Wno-unknown-pragmas -Wno-unused-command-line-argument -Wno-unknown-warning-option -ftemplate-depth=512 -Wno-error=potentially-evaluated-expression -Wno-tautological-type-limit-compare -fsized-deallocation -Wno-error=unused-variable -DBOOST_DISABLE_ASSERTS -shared -Wl,-E -Wl,-z,defs tmp/el8_amd64_gcc11/src/HeterogeneousCore/ROCmTestDevice/plugins/HeterogeneousCoreROCmTestDevicePlugins/ROCmTestDeviceAdditionAlgo.hip.cc.o tmp/el8_amd64_gcc11/src/HeterogeneousCore/ROCmTestDevice/plugins/HeterogeneousCoreROCmTestDevicePlugins/ROCmTestDeviceAdditionModule.cc.o -o 
tmp/el8_amd64_gcc11/src/HeterogeneousCore/ROCmTestDevice/plugins/HeterogeneousCoreROCmTestDevicePlugins/libHeterogeneousCoreROCmTestDevicePlugins.so -Wl,-E -Wl,--hash-style=gnu -L/build/muz/del/CMSSW_13_0_X_2023-01-29-1100/biglib/el8_amd64_gcc11 -L/build/muz/del/CMSSW_13_0_X_2023-01-29-1100/lib/el8_amd64_gcc11 -L/cvmfs/cms-ib.cern.ch/sw/x86_64/week0/el8_amd64_gcc11/cms/cmssw/CMSSW_13_0_X_2023-01-29-1100/biglib/el8_amd64_gcc11 -L/cvmfs/cms-ib.cern.ch/sw/x86_64/week0/el8_amd64_gcc11/cms/cmssw/CMSSW_13_0_X_2023-01-29-1100/lib/el8_amd64_gcc11 -L/cvmfs/cms-ib.cern.ch/sw/x86_64/week0/el8_amd64_gcc11/cms/cmssw/CMSSW_13_0_X_2023-01-29-1100/external/el8_amd64_gcc11/lib -L/build/muz/del/CMSSW_13_0_X_2023-01-29-1100/static/el8_amd64_gcc11 -L/cvmfs/cms-ib.cern.ch/sw/x86_64/week0/el8_amd64_gcc11/cms/cmssw/CMSSW_13_0_X_2023-01-29-1100/static/el8_amd64_gcc11 -lHeterogeneousCoreROCmTestDevice_rocm -lFWCoreFramework -lFWCoreCommon -lFWCoreServiceRegistry -lDataFormatsCommon -lFWCoreParameterSet -lFWCoreMessageLogger -lDataFormatsProvenance -lFWCorePluginManager -lFWCoreReflection -lDataFormatsStdDictionaries -lFWCoreConcurrency -lFWCoreUtilities -lFWCoreVersion -lHeterogeneousCoreROCmTestDevice -lHeterogeneousCoreROCmUtilities -lTree -lNet -lThread -lMathCore -lRIO -lCore -lboost_thread -lboost_date_time -lpcre -lbz2 -luuid -ltbb -llzma -lz -lfmt -lcms-md5 -lamdhip64 -lcrypt -ldl -lrt -lstdc++fs -ltinyxml2
lld: error: undefined hidden symbol: cms::rocmtest::add_vectors_f(float const*, float const*, float*, unsigned long)
>>> referenced by lto.tmp:(HeterogeneousCoreROCmTestDevicePlugins::kernel_add_vectors_f(float const*, float const*, float*, unsigned long))
>>> referenced by lto.tmp:(HeterogeneousCoreROCmTestDevicePlugins::kernel_add_vectors_f(float const*, float const*, float*, unsigned long))
clang-14: error: amdgcn-link command failed with exit code 1 (use -v to see invocation)
gmake: *** [config/SCRAM/GMake/Makefile.rules:1739: tmp/el8_amd64_gcc11/src/HeterogeneousCore/ROCmTestDevice/plugins/HeterogeneousCoreROCmTestDevicePlugins/libHeterogeneousCoreROCmTestDevicePlugins.so] Error 1
gmake: *** [There are compilation/build errors. Please see the detail log above.] Error 2

Can you please try cmssw-config tag V07-05-02? This should fix the issues with duplicate objects and the rocm.a build order.

@fwyzard
Contributor

fwyzard commented Jan 30, 2023

Thanks, I'll check it.

@cmsbuild
Contributor

Pull request #8274 was updated.

@fwyzard
Contributor

fwyzard commented Jan 30, 2023

OK, finally this seems to be working:

22:44:56 Done installation via rpm for rocm
22:44:56 Done external+rocm+5.4.2-37e239458a900c22235b86bfbe0dd1a0

@cmsbuild
Contributor

-1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-1e4b84/30259/summary.html
COMMIT: 786ab37
CMSSW: CMSSW_13_0_X_2023-01-30-1100/el8_amd64_gcc11
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmsdist/8274/30259/install.sh to create a dev area with all the needed externals and cmssw changes.

External Build

I found compilation error when building:

ERROR: no such package '@tf_runtime//': java.io.IOException: Error downloading [http://mirror.tensorflow.org/github.com/tensorflow/runtime/archive/b570a1921c9e55ac53c8972bd2bfd37cd0eb510d.tar.gz, https://github.com/tensorflow/runtime/archive/b570a1921c9e55ac53c8972bd2bfd37cd0eb510d.tar.gz] to /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc11/external/tensorflow-sources/2.6.4-9f1be03565af434ebe39d0eacdd0154e/build/5f8913d9a58ba3308d4327e8feaa665f/external/tf_runtime/temp577731506896425596/b570a1921c9e55ac53c8972bd2bfd37cd0eb510d.tar.gz: Checksum was 653f631e961a0e885e5ee805d65cd6409350759ab1d7145aed9b89123b760e19 but wanted 01295fc2a90aa2d665890adbe8701e2ae2372028d3b8266cba38ceddccb42af6
INFO: Elapsed time: 5.668s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)
FAILED: Build did NOT complete successfully (0 packages loaded)
error: Bad exit status from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/tmp/rpm-tmp.FcV6Xb (%build)


RPM build errors:
line 37: It's not recommended to have unversioned Obsoletes: Obsoletes: external+tensorflow-sources+2.6.4-9f1be03565af434ebe39d0eacdd0154e
Bad exit status from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/tmp/rpm-tmp.FcV6Xb (%build)


@fwyzard
Contributor

fwyzard commented Jan 30, 2023

wtf

@fwyzard
Contributor

fwyzard commented Jan 30, 2023

please test with #8273,cms-sw/cmssw#40619,cms-sw/cmssw#40637,cms-sw/cmssw#40605

@cmsbuild
Contributor

-1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-1e4b84/30262/summary.html
COMMIT: 786ab37
CMSSW: CMSSW_13_0_X_2023-01-30-1100/el8_amd64_gcc11
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmsdist/8274/30262/install.sh to create a dev area with all the needed externals and cmssw changes.

External Build

I found compilation error when building:

ERROR: no such package '@tf_runtime//': java.io.IOException: Error downloading [http://mirror.tensorflow.org/github.com/tensorflow/runtime/archive/b570a1921c9e55ac53c8972bd2bfd37cd0eb510d.tar.gz, https://github.com/tensorflow/runtime/archive/b570a1921c9e55ac53c8972bd2bfd37cd0eb510d.tar.gz] to /pool/condor/dir_43994/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc11/external/tensorflow-sources/2.6.4-9f1be03565af434ebe39d0eacdd0154e/build/e36e10f9b432475838502db74c081938/external/tf_runtime/temp8131256082103605744/b570a1921c9e55ac53c8972bd2bfd37cd0eb510d.tar.gz: Checksum was 653f631e961a0e885e5ee805d65cd6409350759ab1d7145aed9b89123b760e19 but wanted 01295fc2a90aa2d665890adbe8701e2ae2372028d3b8266cba38ceddccb42af6
INFO: Elapsed time: 3.681s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)
FAILED: Build did NOT complete successfully (0 packages loaded)
error: Bad exit status from /pool/condor/dir_43994/jenkins/workspace/ib-run-pr-tests/testBuildDir/tmp/rpm-tmp.krj0o0 (%build)


RPM build errors:
line 37: It's not recommended to have unversioned Obsoletes: Obsoletes: external+tensorflow-sources+2.6.4-9f1be03565af434ebe39d0eacdd0154e
Bad exit status from /pool/condor/dir_43994/jenkins/workspace/ib-run-pr-tests/testBuildDir/tmp/rpm-tmp.krj0o0 (%build)


@smuzaffar
Contributor Author

test parameters:

  • addpkg = FWCore,HeterogeneousCore

@smuzaffar
Contributor Author

please test

@smuzaffar
Contributor Author

+externals
This only updates the build rules for ROCm code. It does not break any existing build rules, so this can go in.

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next IB/CMSSW_13_0_X/master IBs after it passes the integration tests. This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@smuzaffar smuzaffar merged commit 2405ef1 into IB/CMSSW_13_0_X/master Jan 31, 2023
@cmsbuild
Contributor

-1

Failed Tests: UnitTests
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-1e4b84/30264/summary.html
COMMIT: 786ab37
CMSSW: CMSSW_13_0_X_2023-01-30-2300/el8_amd64_gcc11
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmsdist/8274/30264/install.sh to create a dev area with all the needed externals and cmssw changes.

Unit Tests

I found errors in the following unit tests:

---> test test-das-selected-lumis had ERRORS

Comparison Summary

Summary:

  • You potentially removed 19 lines from the logs
  • Reco comparison results: 20 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3555486
  • DQMHistoTests: Total failures: 933
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3554531
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: -0.007 KiB( 48 files compared)
  • DQMHistoSizes: changed ( 11634.0,... ): -0.001 KiB HLT/Filters
  • Checked 211 log files, 162 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

@smuzaffar smuzaffar deleted the smuzaffar-patch-12 branch February 15, 2023 14:12