Upstream of AOCL 2.2.1 changes. #448

Merged
merged 274 commits into from
Nov 1, 2020
3e16033
Updates to docs/Multithreading.md.
fgvanzee Mar 18, 2019
2d1cd32
Renamed --enable-export-all to --export-shared=[].
fgvanzee Mar 18, 2019
b503627
Adjusted cache blocksizes for zen subconfig.
fgvanzee Mar 19, 2019
c6793be
Added docs/Performance.md and docs/graphs subdir.
fgvanzee Mar 19, 2019
25db903
Fixed broken section links in docs/Performance.md.
fgvanzee Mar 19, 2019
6385a3e
Minor fixes to docs/Performance.md.
fgvanzee Mar 19, 2019
cd81a6a
Very minor tweaks to Performance.md.
fgvanzee Mar 19, 2019
14bc42f
ReleaseNotes.md update in advance of next version.
fgvanzee Mar 19, 2019
38e2180
CHANGELOG update (0.5.2)
fgvanzee Mar 19, 2019
366c4b1
More minor tweaks to docs/Performance.md.
fgvanzee Mar 19, 2019
a1c8b11
Added Eigen support to test/3 Makefile, runme.sh.
fgvanzee Mar 20, 2019
a9270be
Allow disabling of BLAS prototypes at compile-time.
fgvanzee Mar 21, 2019
686aa86
Fix clang version detection (#305)
isuruf Mar 25, 2019
728e966
Add more support for Eigen to drivers in test/3.
fgvanzee Mar 26, 2019
b495ca9
Link to Eigen BLAS for non-gemm drivers in test/3.
fgvanzee Mar 27, 2019
aa9d8e4
Test with shared on windows (#306)
isuruf Mar 27, 2019
bd6cdd8
Fixed mislabeled eigen output from test/3 drivers.
fgvanzee Mar 27, 2019
ee44719
Added ability to plot with Eigen in test/3/matlab.
fgvanzee Mar 27, 2019
3e94a0f
Added Eigen results to performance graphs.
fgvanzee Mar 27, 2019
cb45eb9
Minor text updates (Eigen) to docs/Performance.md.
fgvanzee Mar 27, 2019
22768bf
Updated Eigen results in docs/graphs with 3.3.90.
fgvanzee Mar 28, 2019
231a4b7
Use pthreads on MinGW and Cygwin (#307)
isuruf Apr 2, 2019
f061c75
Renamed armv8a gemm kernel filename.
fgvanzee Apr 2, 2019
9b16d8e
Use void_fp for function pointers instead of void*.
fgvanzee Apr 2, 2019
959d8d9
Minor update to docs/HardwareSupport.md document.
fgvanzee Apr 2, 2019
c5b447e
Minor bugfix to flatten-headers.py.
fgvanzee Apr 9, 2019
184ba1b
GNU-like handling of installation prefix et al.
fgvanzee Apr 11, 2019
f205eea
Applied forgotten variable rename from 89a70cc.
fgvanzee Apr 16, 2019
f73cef4
Support row storage in Eigen gemm test/3 driver.
fgvanzee Apr 17, 2019
e253bfc
make unix friendly archives on appveyor (#310)
isuruf Apr 27, 2019
4f08619
Implemented gemm on skinny/unpacked matrices.
fgvanzee Apr 27, 2019
7d54fc5
Fixed typo in --disable-sup-handling macro guard.
fgvanzee Apr 27, 2019
5c4bb0c
Ceased use of BLIS_ENABLE_SUP_MR/NR_EXT macros.
fgvanzee Apr 28, 2019
20391ab
Minor bugfixes in sup dgemm implementation.
fgvanzee Apr 28, 2019
e394c74
Define _POSIX_C_SOURCE in bli_system.h.
fgvanzee May 2, 2019
cd8e74a
add info about CXX in configure (#311)
May 14, 2019
fb305d0
Minor build system housekeeping.
fgvanzee May 23, 2019
d488d7a
Increased MT sup threshold for double to 180.
fgvanzee May 23, 2019
73970bf
Added BLIS theading info to Performance.md.
fgvanzee May 23, 2019
defe789
Minor rewording of language around mt env. vars.
fgvanzee May 23, 2019
c90a1a8
Inadvertantly hidden xerbla_() in blastest (#313).
fgvanzee May 28, 2019
5e03ca6
Increased MT sup threshold for double to 201.
fgvanzee May 31, 2019
55e7b04
Added sup performance graphs/document to 'docs'.
fgvanzee Jun 3, 2019
d5903a8
Minor edits to docs/PerformanceSmall.md.
fgvanzee Jun 3, 2019
2ae8faa
ReleaseNotes.md update in advance of next version.
fgvanzee Jun 3, 2019
5052917
CHANGELOG update (0.6.0)
fgvanzee Jun 3, 2019
ecfc223
Minor tweaks to test/sup.
fgvanzee Jun 4, 2019
bb4a01f
Added BLASFEO results to docs/PerformanceSmall.md.
fgvanzee Jun 4, 2019
df67302
Tweaked language in README.md related to sup/AMD.
fgvanzee Jun 5, 2019
2bf1ad1
Fixed formatting/typo in docs/PerformanceSmall.md.
fgvanzee Jun 6, 2019
78adbe9
Added missing #include "bli_family_thunderx2.h".
fgvanzee Jun 7, 2019
6e9b1e0
Adjust -fopenmp-simd for icc's preferred syntax.
fgvanzee Jun 7, 2019
428921c
Fixed typo in README.md's MixedDatatypes.md link.
fgvanzee Jun 8, 2019
d192d47
Trivial change to MixedDatatypes.md link text.
fgvanzee Jun 8, 2019
66c43ca
Updated BLASFEO results in PerformanceSmall.md.
fgvanzee Jun 19, 2019
7366bf2
Fixed thrinfo_t printing bug for small problems.
fgvanzee Jun 24, 2019
b3974da
New cntx_t blksz "set" functions + misc tweaks.
fgvanzee Jul 16, 2019
0e3f0ce
More updates to comments in testsuite modules.
fgvanzee Jul 16, 2019
ba4a771
Updated -march flags for sandybridge, haswell.
fgvanzee Jul 19, 2019
d44c42d
Updated haswell MC cache blocksizes.
fgvanzee Jul 19, 2019
06c5a5c
Added test/1m4m driver directory.
fgvanzee Jul 22, 2019
99c7d15
Added "Education and Learning" section to README.
fgvanzee Jul 23, 2019
9034c88
Added "Education and Learning" ToC entry to README.
fgvanzee Jul 23, 2019
f85d336
CHANGELOG update (0.3.0)
fgvanzee Feb 23, 2018
d9c0b8b
Re-enabling Zen optimized cache block sizes for config target zen
nisanthmpamd Mar 19, 2018
ca6f5b7
Re-enabling the small matrix gemm optimization for target zen
nisanthmpamd Mar 19, 2018
d56ca14
small matrix trsm intrinsics optimization code for AX=B and XA'=B
BiplabRaut Jun 6, 2018
bc9dbce
AMD Copyright information changed to 2018
BiplabRaut Jun 6, 2018
73ddc58
Small TRSM optimization changes :- 1) single precision small trsm ker…
BiplabRaut Oct 1, 2018
d805fdf
This is a fix to floating-point exception error for BLIS SGEMM with l…
kvaragan Oct 4, 2018
1720efe
Update version number to 1.2
pradeeptrgit Oct 23, 2018
2752b51
Fix on EPYC machine for multi instance performance issue,
BiplabRaut Dec 18, 2018
d6bb56d
Fixed BLAS test failures of small matrix SYRK for single and double p…
BiplabRaut Dec 19, 2018
016acd3
Merged BLIS Release 1.3
kvaragan Mar 5, 2019
34c2c22
Disabled BLIS_ENABLE_ZEN_BLOCK_SIZES in bli_family_zen.h for ROME tuning
kvaragan Mar 6, 2019
d605a19
config_registry: New AMD zen2 architecture configuration added.
kvaragan May 20, 2019
b5eb348
config/zen/bli_cntx_init_zen.c: removed BLIS_ENBLE_ZEN_BLOCK_SIZES ma…
kvaragan May 21, 2019
874aee6
Adding threshold condition to dgemm small matrix kernels, defining t…
kiran-amd May 23, 2019
2e9b5c3
make checkblis fails for matrix dimension check at the begining hence…
kiran-amd May 23, 2019
3f88a44
Implemented TRSM for small matrices for cases where A is on the right
Meghana-vankadari May 23, 2019
ec907c3
Defined small matrix thresholds for TRSM for various cases for NAPLES…
Meghana-vankadari May 27, 2019
c4368c6
This check in has changes w.r.t Copyright information, which is chan…
kiran-amd May 27, 2019
c195d9a
CPP template wrapper implementation done for all BLASroutines
chsankar Aug 28, 2019
ce0b1ca
Added Doxygen Comment to all functions; Fixed Review comments; Modifi…
chsankar Sep 5, 2019
ea25ba2
Added back BLIS_ENABLE_ZEN_BLOCK_SIZES macro to zen configuration, th…
kvaragan May 31, 2019
be25ec0
CPP Implementtaion of dsdot included. Test application refactored to …
chsankar Sep 20, 2019
851589c
Return typename corrected in dot function
chsankar Sep 30, 2019
9777b8e
Merge branch 'amd-staging-rome2.1' of ssh://git.amd.com:29418/cpulibr…
chsankar Oct 1, 2019
95d6e2b
test folder files reverted to previous commit
chsankar Oct 3, 2019
a000c61
test/Makefile reverted to correct version to retain copyright informa…
chsankar Oct 3, 2019
574bdae
Modified cblas.hh not to include cblas.h ,as this file gets generated…
chsankar Oct 3, 2019
b2479b1
Merge branch 'amd-staging-rome2.1' of ssh://git.amd.com:29418/cpulibr…
kiran-amd Oct 7, 2019
4158e7f
missed changes while rebasing field's SUP code
kiran-amd Oct 23, 2019
97a4236
Matrices are not initialized when inputs dimensions are fed through f…
kvaragan Oct 24, 2019
c3d4464
Removed extra 'endif' statement causing build failures for zen config…
kvaragan Oct 24, 2019
d21c726
update version 2.1
Oct 30, 2019
b5475f5
Adding context initialisation for SUP kernels in zen2 architecture
kiran-amd Nov 12, 2019
5f04fdd
CPP Templatee test files update
pradeeptrgit Nov 20, 2019
49c2704
Instll CPP Template headers
pradeeptrgit Nov 20, 2019
d63f9b7
checkcpp test rule in Makefile
pradeeptrgit Nov 20, 2019
3d20128
Merge branch 'amd-staging-rome2.1' of ssh://git.amd.com:29418/cpulibr…
pradeeptrgit Nov 20, 2019
5560f75
Modified makefiles for zen and zen2 to pick up compiler flags based o…
Meghana-vankadari Nov 21, 2019
ba86a38
Merge branch 'amd-staging-rome2.1' of ssh://git.amd.com:29418/cpulibr…
pradeeptrgit Nov 21, 2019
c63a078
Fixed segemntation fault in trsm_small kernels for cases XAuB, XAltB,…
Meghana-vankadari Nov 21, 2019
33648bb
CPP Test comparison util function fix
pradeeptrgit Nov 21, 2019
27fe3d2
Merge "Fixed segemntation fault in trsm_small kernels for cases XAuB,…
kvaragan Nov 21, 2019
85fa9e4
resolved merge conflicts when merged with public repo master branch
kiran-amd Nov 25, 2019
764d6f4
changed configure script to support AOCC
Meghana-vankadari Nov 25, 2019
37badee
Updated build infra to use python detected by auto config.
dzambare Nov 22, 2019
e6e66fb
Fixed reentrancy issues with bli_sgemm_small() and bli_dgemm_small().
dzambare Nov 27, 2019
c4047e4
Merge branch 'amd-blis-nov-mergetest' into amd-staging-rome2.1
kiran-amd Nov 29, 2019
e0fb039
Merge branch 'amd' of https://github.com/flame/blis into amd-blis-nov…
pradeeptrgit Nov 30, 2019
13249e8
Replace bli_thread_init_rntm with bli_rntm_init_from_global in zen sm…
pradeeptrgit Nov 30, 2019
d72b509
Pass actual enum type to bli_mem_set_buf_type function if C++
pradeeptrgit Nov 30, 2019
b074c5e
Added a macro MATRIX_INITIALISATION for matrix initialisation in tes…
kiran-amd Dec 1, 2019
fb75044
Removed zen and zen2 configurations from amd64 family
Meghana-vankadari Dec 2, 2019
cef1852
Fixed Segmentation fault in trsm_small kernels for the case AlXB.
Meghana-vankadari Nov 27, 2019
31bfe89
re-enabling the boundary check condition for bli_dtrsm_small_AlXB. It…
Meghana-vankadari Dec 3, 2019
af94ba2
Added sup support for sgemm under zen and related frame work changes.
BhaskarNallani Dec 4, 2019
17b3a26
Made some improvements to trsm_small kernels
Meghana-vankadari Dec 5, 2019
27d2b5a
Merge "Made some improvements to trsm_small kernels" into amd-staging…
kvaragan Dec 6, 2019
3192914
change in threshold condition for SUP and small kernels
kiran-amd Dec 8, 2019
9b6c04d
Merge " change in threshold condition for SUP and small kernels" into…
kvaragan Dec 9, 2019
82ec21f
Fix for CPUPL-541,When threading is enabled blis-mt library gets gene…
kvaragan Dec 10, 2019
44edee7
Added support to handle 7x16,8x16,9x16 efficiently in 6x16n kernel
BhaskarNallani Dec 10, 2019
edc8f04
Merge "Fix for CPUPL-541,When threading is enabled blis-mt library ge…
kvaragan Dec 11, 2019
e4a6af3
Merge Selective Packing code from amd branch flame/blis
kiran-amd Dec 12, 2019
dc4e7d1
Fix for CPUPL-550: AOCC clang compiler error. Resolved: Duplicate bac…
BhaskarNallani Dec 12, 2019
1650bcb
Revert " Merge Selective Packing code from amd branch flame/blis"
kvaragan Dec 13, 2019
10a26a7
Merge "Fix for CPUPL-550: AOCC clang compiler error. Resolved: Duplic…
BhaskarNallani Dec 13, 2019
21224e8
Merge "Revert " Merge Selective Packing code from amd branch flame/bl…
kiran-amd Dec 13, 2019
a8af07f
Added support to handle unsupported storage formats in sgemmsup using…
BhaskarNallani Dec 13, 2019
1fe8edb
"Merge Selective Packing code from amd branch flame/blis"
kiran-amd Dec 16, 2019
8eb264f
Change in threshold condition for trsm_small kernels
Meghana-vankadari Dec 16, 2019
62e00b4
Merge "Change in threshold condition for trsm_small kernels" into amd…
Meghana-vankadari Dec 18, 2019
72f4a7a
Increased pool buffer size to accommodate packing buffers needed in s…
dzambare Dec 16, 2019
b3e2938
Fix for CPUPL-549: TRSM for AlXB case results in NaN values
Meghana-vankadari Dec 21, 2019
f965b95
CPUPL-587: Corrected condition for A packing in sgemm_small
dzambare Jan 24, 2020
cc98047
Made framework changes to initialize specific cache block sizes for T…
Meghana-vankadari Feb 12, 2020
e0c95d7
Beta Zero Checks for sgemm_small
BhaskarNallani Mar 6, 2020
83745c7
Beta Zero Check for sgemm small. Core Software Group SWLCSG-137 BLIS-…
BhaskarNallani Mar 9, 2020
1a28482
Support multithreading within the sup framework.
fgvanzee Feb 17, 2020
574de9e
Fixed bug(s) in mt sup when single-threaded.
fgvanzee Feb 17, 2020
04fc9d3
Merge "Fixed bug(s) in mt sup when single-threaded." into amd-staging…
kiran-amd Mar 13, 2020
a7c5723
Skip building thrinfo_t tree when mt is disabled.
fgvanzee Feb 18, 2020
efe85b3
Added missing return to bli_thread_partition_2x2().
fgvanzee Mar 14, 2020
c20c96d
Made some critical changes to small_gemm kernels
Meghana-vankadari Mar 18, 2020
ddcb3d8
Modified test_trsm.c file in test folder to read input sizes from a file
Meghana-vankadari Mar 13, 2020
b5fe75e
Closing input and output files in test_gemm.c and test_trsm.c
Meghana-vankadari Mar 24, 2020
d40edf7
Execution and Debug trace support.
dzambare Apr 7, 2020
e56cf63
Optimized "bli_dotv_zen_int10" kernels
Meghana-vankadari Apr 10, 2020
489d501
Merge "Execution and Debug trace support." into amd-staging-rome-2.2
Apr 15, 2020
80086fa
Modified function definition for AXPY BLAS interface
Meghana-vankadari Apr 17, 2020
80de43a
Disable execution and debug trace by default.
dzambare Apr 17, 2020
f7bb291
Merge "Disable execution and debug trace by default." into amd-stagin…
dzambare Apr 20, 2020
b846059
Added opt kernels for SWAPV
Meghana-vankadari Apr 17, 2020
0fdb539
Fixed CPUPL-845 - expert interfaces consistent with other interfaces …
kvaragan Apr 20, 2020
139fbbb
Merge "Added opt kernels for SWAPV" into amd-staging-rome-2.2
Meghana-vankadari Apr 21, 2020
1c76723
Block parameters tuning to improve sgemm performance on Rome
kiran-amd Apr 16, 2020
138bc75
Modified function definition for AXPY CBLAS interface
Meghana-vankadari Apr 20, 2020
f80e21c
Modified Function definition for BLAS and CBLAS interfaces of I?AMAX API
Meghana-vankadari Apr 22, 2020
ea3865f
JIRA: CPUPL-853: Fix for the redefinition of _unsigned int __get_cpui…
BhaskarNallani Apr 23, 2020
ba00f75
Merge "JIRA: CPUPL-853: Fix for the redefinition of _unsigned int __g…
BhaskarNallani Apr 24, 2020
4caee59
Adding a simd kernel for copyv function
kiran-amd Apr 16, 2020
4ad5b1a
Update zen2 kernel context with number of level1 kernels
kiran-amd Apr 24, 2020
49cd7a9
CPUPL-866: ZenDNN gtest cases failing with blis 2.1 and later releases
BhaskarNallani May 3, 2020
28bb28b
Modified Function definition for BLAS and CBLAS interfaces of DOTV an…
Meghana-vankadari Apr 30, 2020
830f1a4
CPUPL-849: BLIS SGEMM general stride test cases fails for smaller mat…
BhaskarNallani May 5, 2020
d6db8d1
Merge "Modified Function definition for BLAS and CBLAS interfaces of …
Meghana-vankadari May 6, 2020
6f33fd6
Modified Function definition for BLAS and CBLAS interfaces of ?SC…
kiran-amd May 5, 2020
884f2fe
Revert "Block parameters tuning to improve sgemm performance on Rome"
kiran-amd May 14, 2020
b3a308b
CPUPL-948: Selective Packing changes are imlplemented in sgemm sup
BhaskarNallani May 17, 2020
310dda9
CPUPL-709: Improve Complex GEMM performance - Level 1 Optimization
May 16, 2020
af1ad80
CPUPL-929: Improve Complex GEMM performance - Support all storage for…
May 18, 2020
4fcc4e4
Optimized DGEMV kernel and changed BLAS interface call
Meghana-vankadari May 14, 2020
9ea0472
Replaced all the instances of zen_basic with zen_ref_c
Meghana-vankadari May 19, 2020
f630b3f
CPUPL-929:Improve Complex GEMM performance
May 20, 2020
718b648
Add prototypes for POWER9 reference kernels (#365)
nicholaiTukanov Dec 4, 2019
afee36b
Annoted missing thread-related symbols for export.
fgvanzee Dec 6, 2019
d988a5b
Fixed bugs in cblas_sdsdot(), sdsdot_().
fgvanzee Dec 16, 2019
dd54e79
fix link to docs
jeffhammond Jan 2, 2020
afc57ad
blacklist Intel 19+
jeffhammond Jan 3, 2020
570d514
blacklist ICC 18 for knl/skx due to test failures
Jan 4, 2020
38ecda4
Updated 1m draft article link in README.md.
fgvanzee Jan 6, 2020
99da76f
Fixed 'configure' breakage introduced in 6433831.
fgvanzee Jan 6, 2020
291ee5f
Fix parsing in vpu_count on workstation SKX (#351)
loveshack Jan 6, 2020
142df1b
CREDITS file update.
fgvanzee Jan 14, 2020
51f87f3
Removed 'attic/windows' (to prevent confusion).
fgvanzee Jan 14, 2020
d6496d5
ReleaseNotes.md update in advance of next version.
fgvanzee Jan 14, 2020
b3c0309
CHANGELOG update (0.6.1)
fgvanzee Jan 14, 2020
08709d4
Removed sorting on LDFLAGS in common.mk (#373).
fgvanzee Jan 15, 2020
2096f41
Updates to octave scripts in test/sup[mt]/octave.
fgvanzee Feb 27, 2020
6d36953
Updated sup[mt] Makefiles for variable dim ranges.
fgvanzee Mar 2, 2020
c7faae9
Merged test/sup, test/supmt into test/sup.
fgvanzee Mar 10, 2020
6a957d7
List Gentoo under supported external packages.
fgvanzee Feb 24, 2020
9e76059
Renamed bli_thread_obarrier(), _obroadcast().
fgvanzee Feb 25, 2020
b325f1e
Warn user when auto-detection returns 'generic'.
fgvanzee Mar 26, 2020
3597284
Updates, tweaks to runme.sh in test/1m4m.
fgvanzee Mar 28, 2020
d560d10
OSX: specify the full path to the location of libblis.dylib (#390)
balay Mar 31, 2020
4a5e76e
Minor updates/elaborations to RELEASING file.
fgvanzee Apr 6, 2020
27b2911
Rename more bli_thread_obarrier(), _obroadcast().
fgvanzee Apr 6, 2020
052a3c5
ReleaseNotes.md update in advance of next version.
fgvanzee Apr 7, 2020
4907b32
CHANGELOG update (0.7.0)
fgvanzee Apr 7, 2020
93023d0
README.md update to promote supmt dgemm.
fgvanzee Apr 7, 2020
8e3f143
Adding missing conjy to her2/syr2 in typed API doc.
fgvanzee Apr 18, 2020
562b9ee
Update KernelsHowTo.md (#395)
YingboMa Apr 27, 2020
66ec227
New kernel set for Arm SVE using assembly (#396)
docularxu Apr 29, 2020
994a2d8
Documented Perl prerequisite for build system.
fgvanzee May 5, 2020
f973f00
Defined netlib equivalent of xerbla_array().
fgvanzee May 8, 2020
72443e7
avoid loading twice in armv8a gemm kernel (#403)
docularxu May 20, 2020
11570db
CPUPL-929:Improve Complex GEMM performance
May 22, 2020
8ce6e49
Added file and copyright header for aoclflist.c file.
dzambare May 22, 2020
154bedc
CPUPL-929:Improve Complex GEMM performance
May 22, 2020
9b09dd7
CPUPL-929:Improve Complex GEMM performance
May 23, 2020
739803a
DGEMM Packing Kernels for Native DGEMM implementation
kvaragan May 22, 2020
bb7eeec
Change loop test expression in bli_packm_zen_int.c
pradeeptrgit May 30, 2020
0c52aae
Merge branch 'ref/heads/amd-staging-rome-2.2' of ssh://git.amd.com:29…
pradeeptrgit May 30, 2020
f8ddd48
Code Clean-up in DGEMM packing kernels
kvaragan May 30, 2020
711f261
Update AMD BLIS version to 2.2
pradeeptrgit May 31, 2020
3ebd5f8
Code cleanup in 6xk DGEMM pack Kernel
kvaragan May 30, 2020
2413c31
CPUPL-923: Implemented dot Product Kernels in SGEMM SUP for transpose…
BhaskarNallani May 31, 2020
5e0ad13
Code Cleanup and replaced vzeroall with vxorps
BhaskarNallani Jun 1, 2020
c8f3cec
Merge "Code cleanup in 6xk DGEMM pack Kernel" into amd-staging-rome-2.2
kvaragan Jun 1, 2020
f7bc37e
CPUPL-929: Improve Complex GEMM performance - Support all storage for…
Jun 1, 2020
6f01cd2
Fix for sblat3.x failure in make check
BhaskarNallani Jun 1, 2020
b4e599e
CPUPL-929: Improve Complex GEMM performance - Support all storage for…
Jun 2, 2020
5d57d67
Checking for zero dimension is moved to bli_gemm_xx call.
dzambare Jun 3, 2020
f4d2bb2
Enabled AOCC specific flags for all versions of AOCC compiler
Meghana-vankadari Jun 2, 2020
8a367c9
Merge "Checking for zero dimension is moved to bli_gemm_xx call." int…
dzambare Jun 4, 2020
9fce1ec
Optimized SGEMV kernel and changed BLAS interface call
Meghana-vankadari Jun 3, 2020
305c744
Added traces in dgemm and sgemm paths.
dzambare Jun 6, 2020
3620e47
Replace back major version number variable in Makefile
pradeeptrgit Jun 10, 2020
dad7e2f
Added support multiple trace levels & optimization of file size requi…
dzambare Jun 10, 2020
80b3127
Added support for logging gemm input values.
dzambare Jun 12, 2020
ccf0772
BLIS library porting on to Windows:
Jun 16, 2020
32365b3
Ensure random objects' 1-norms are non-zero.
fgvanzee Jun 17, 2020
f59d4be
Added framework support and interface APIs for GEMMT
Meghana-vankadari Jun 30, 2020
6a0a65e
Added sup kernels and code path for gemmt similar to GEMM.GEMMT now a…
Meghana-vankadari Jul 10, 2020
af1f9ab
BLIS: 'zdotc_' API modified to support Fortran invocation in flang en…
Jun 22, 2020
6896f92
Fixed bug in SUP code path
Meghana-vankadari Jul 15, 2020
eeea264
Annotating prototype of bli_abort with BLIS_EXPORT_BLIS
Meghana-vankadari Jul 17, 2020
89245a7
set the gemmt slot to the default gemmt sup handler for reference ker…
Meghana-vankadari Jul 20, 2020
3d35af3
Using weighted thread range partitioning for GEMMT
Meghana-vankadari Jul 24, 2020
0b38efc
Added support to handle col-major storage of C in SUP kernel
Meghana-vankadari Jul 26, 2020
f80a9c9
Added some optimizations for gemmt default path
Meghana-vankadari Jul 20, 2020
25f5a4e
Merge "Added some optimizations for gemmt default path" into amd-stag…
Meghana-vankadari Jul 27, 2020
12b1215
Added testsuite for gemmt APIs.
dzambare Jul 20, 2020
725bf5a
CPUPL-1059: Failures seen in DGEMM SUP for specific size is fixed
Jul 28, 2020
ac90bac
Revert "CPUPL-1059: Failures seen in DGEMM SUP for specific size is f…
Jul 28, 2020
434b018
BLIS library porting on to Windows:
Jul 29, 2020
Instll CPP Template headers
Change-Id: Ib15dc9bda08d1f3fdc68e31520daee90a287357c
pradeeptrgit committed Nov 20, 2019
commit 49c27040d1f92a03bce017b4ef323f3bbe0ac0e4
3 changes: 2 additions & 1 deletion Makefile
@@ -249,7 +249,8 @@ ifeq ($(MK_ENABLE_CBLAS),yes)
HEADERS_TO_INSTALL += $(CBLAS_H_FLAT)
endif


# Install BLIS CPP Template header files
HEADERS_TO_INSTALL += $(CPP_HEADER_DIR)/*.hh

#
# --- public makefile fragment definitions -------------------------------------
1 change: 1 addition & 0 deletions common.mk
@@ -283,6 +283,7 @@ LIB_DIR := lib
INCLUDE_DIR := include
BLASTEST_DIR := blastest
TESTSUITE_DIR := testsuite
CPP_HEADER_DIR := cpp

# The filename suffix for reference kernels.
REFNM := ref
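
The Makefile hunk in this commit appends a glob pattern, `$(CPP_HEADER_DIR)/*.hh`, to `HEADERS_TO_INSTALL`. GNU make stores such a pattern literally at assignment time and expands it only later, when the variable is used as a target or prerequisite, or when the shell sees it in a recipe. The standalone fragment below is a minimal sketch contrasting that behavior with an explicit `$(wildcard ...)` call. `HEADERS_GLOB`, `HEADERS_LIST`, and the `show-headers` target are hypothetical names used only for illustration and are not part of BLIS.

```make
# Hypothetical stand-alone fragment; only CPP_HEADER_DIR mirrors the diff.
CPP_HEADER_DIR := cpp

# Stored literally as "cpp/*.hh": make does not expand globs in assignments.
HEADERS_GLOB := $(CPP_HEADER_DIR)/*.hh

# Expanded immediately to the matching file names (empty if nothing matches).
HEADERS_LIST := $(wildcard $(CPP_HEADER_DIR)/*.hh)

# Invented target that prints both forms. The double quotes keep the shell
# from expanding the pattern in the first echo.
.PHONY: show-headers
show-headers:
	@echo "glob: $(HEADERS_GLOB)"
	@echo "list: $(HEADERS_LIST)"
```

One practical difference: if the directory contains no `.hh` files, the glob form leaves the unmatched pattern in place wherever it is used, while the `$(wildcard ...)` form yields an empty list.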
Expand Down