testsuite reports failure for GEMM for Randomize vectors and matrices using powers of 2 in narrow precision range #413

kvaragan · 2020-06-17T15:20:08Z

The testsuite reports the following failures:
blis_sgemm_cn_ccc 1 1 1 0.00 -nan FAILURE
blis_cgemm1m_hc_ccc 1 1 1 0.00 -nan FAILURE
blis_dgemm_nc_ccc 1 1 1 0.00 -nan FAILURE
blis_dgemm_cc_ccc 1 1 1 0.00 -nan FAILURE
blis_dgemm_ct_ccc 1 1 1 0.00 -nan FAILURE
blis_dgemm_ht_ccc 1 1 1 0.00 -nan FAILURE
blis_dgemm_hh_ccc 1 1 1 0.00 -nan FAILURE
blis_zgemm1m_tc_ccc 1 1 1 0.00 -nan FAILURE

In input.general, we set the option "Randomize vectors and matrices using powers of 2 in narrow precision range" to 1.
However, this error doesn't occur when we use real values on [-1,1].

devinamatthews · 2020-06-17T15:31:31Z

@kvaragan can you explain a little more how the initialization is done (or @fgvanzee if this is something built-in)?

kvaragan · 2020-06-17T15:45:05Z

Initialization is built into testsuite.

devinamatthews · 2020-06-17T17:16:30Z

@fgvanzee ?

fgvanzee · 2020-06-17T19:56:43Z

@kvaragan Thanks for this bug report, Kiran. I've isolated the issue.

The problem is that the current implementation of both scalar randomization functions (regular or narrow powers of 2) can sometimes return 0. For the narrow powers of 2, this happens much more frequently because only a handful of values are possible. Why is this of concern? When I wrote the narrow powers of 2 randomizer, I figured that 0 should be a valid value in our test matrices just like any other value in the range. The problem is that it can sometimes cause the numerical test to malfunction because it will attempt to normalize by the largest value in the matrix, even if it's 0. (So the nans were presumably the result of computing 0/0.) This can very easily happen for 1x1 matrices when using the narrow powers of 2 randomization.
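To make the 0/0 failure mode concrete, here is a minimal sketch in C. It is not the testsuite's actual residual code; `max_abs` and `normalized_resid` are hypothetical names, and the check (largest difference scaled by the largest expected entry) is only an assumed stand-in for the normalization described above.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical helper: largest absolute value among n doubles. */
static double max_abs(const double *x, size_t n)
{
    double m = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double v = fabs(x[i]);
        if (v > m) m = v;
    }
    return m;
}

/* Hypothetical residual check: max|computed - expected| scaled by the
   largest expected entry. There is deliberately no guard against a
   zero denominator, which is the failure mode described above. */
static double normalized_resid(const double *computed,
                               const double *expected, size_t n)
{
    double diff = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double d = fabs(computed[i] - expected[i]);
        if (d > diff) diff = d;
    }
    return diff / max_abs(expected, n);
}
```

With a 1x1 object whose only entry is 0.0, both the numerator and the denominator are 0.0, so under IEEE 754 arithmetic the result is NaN even though the computed value is exactly right.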

So, in retrospect, this problem can manifest for randomization on the real range [-1,1], but never does in practice because it is highly improbable.

At first I was going to simply disallow 0 as a valid random value--likely for both types of randomization--but then I realized that the problem is not zeroes, per se, but rather objects that are completely zero. So instead, I think I will check the randomized object's 1-norm, and if it's zero, re-randomize until it's not zero.
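The re-randomize-until-nonzero idea can be sketched roughly as follows. This is an illustration under assumptions, not the BLIS code: `rand_np2`, `norm1`, and `randv_nonzero` are hypothetical names, and the set of values the randomizer can emit (0 and signed powers of two from 1 down to 1/8) is made up for the example.

```c
#include <math.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical narrow randomizer: returns 0.0 or a signed power of two
   in {1, 1/2, 1/4, 1/8}. The real BLIS value set may differ. */
static double rand_np2(void)
{
    int r = rand() % 9;                       /* 0..8 */
    if (r == 0) return 0.0;
    double mag = ldexp(1.0, -((r - 1) % 4));  /* 1, 1/2, 1/4, 1/8 */
    return (r > 4) ? -mag : mag;
}

/* Vector 1-norm: sum of absolute values. */
static double norm1(const double *x, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; ++i) s += fabs(x[i]);
    return s;
}

/* Fill x with random values, retrying while the whole vector is zero.
   Individual zero entries are still allowed. */
static void randv_nonzero(double *x, size_t n)
{
    if (n == 0) return;  /* an empty vector always has norm 0 */
    do {
        for (size_t i = 0; i < n; ++i) x[i] = rand_np2();
    } while (norm1(x, n) == 0.0);
}
```

The design point is that zeros remain valid entries; only an object that is entirely zero is rejected, so the value distribution is barely changed for anything larger than a few elements.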

Details:
- Fixed an innocuous bug that manifested when running the testsuite on extremely small matrices with randomization via the "powers of 2 in narrow precision range" option enabled. When the randomization function emits a perfect 0.0 to fill a 1x1 matrix, the testsuite will then compute 0.0/0.0 during the normalization process, which leads to NaN residuals. The solution entails smarter implementations of randv, randnv, randm, and randnm, each of which will compute the 1-norm of the vector or matrix in question. If the object has a 1-norm of 0.0, the object is re-randomized until the 1-norm is not 0.0. Thanks to Kiran Varaganti for reporting this issue (#413).
- Updated the implementation of randm_unb_var1() so that it loops over a call to the randv_unb_var1() implementation directly rather than calling it indirectly via randv(). This was done to avoid the overhead of multiple calls to norm1v() when randomizing the rows/columns of a matrix.
- Updated comments.
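The second item in the commit message (looping over the unchecked vector kernel so the norm is computed once per matrix, not once per column) might look roughly like this. It is a sketch under assumptions: `randv_raw`, `norm1m`, and `randm_nonzero` are hypothetical stand-ins rather than the BLIS functions, the matrix is assumed column-major with leading dimension m, and the random value set is invented for the example.

```c
#include <math.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical unchecked kernel: fills x with values from
   {-1, -1/2, 0, 1/2, 1}. It may return an all-zero vector. */
static void randv_raw(double *x, size_t n)
{
    static const double vals[5] = { -1.0, -0.5, 0.0, 0.5, 1.0 };
    for (size_t i = 0; i < n; ++i) x[i] = vals[rand() % 5];
}

/* Matrix 1-norm of a column-major m x n matrix: largest column sum
   of absolute values. */
static double norm1m(const double *a, size_t m, size_t n)
{
    double best = 0.0;
    for (size_t j = 0; j < n; ++j) {
        double s = 0.0;
        for (size_t i = 0; i < m; ++i) s += fabs(a[j * m + i]);
        if (s > best) best = s;
    }
    return best;
}

/* Randomize column by column with the unchecked kernel, then test the
   norm once for the whole matrix and retry only if it is zero. */
static void randm_nonzero(double *a, size_t m, size_t n)
{
    if (m == 0 || n == 0) return;  /* nothing to fill */
    do {
        for (size_t j = 0; j < n; ++j) randv_raw(a + j * m, m);
    } while (norm1m(a, m, n) == 0.0);
}
```

Calling a checked vector routine per column would both pay for a norm on every column and force every column to be nonzero, which is stricter than the whole-object condition the fix is after.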

fgvanzee · 2020-06-18T18:35:01Z

@kvaragan Please try out b5b604e. It should fix the test failures (which were false to begin with).

kvaragan · 2020-06-19T09:19:40Z

It's resolved now. Thanks, Field.

kvaragan closed this as completed Jun 19, 2020
