Diego Garrido Martín <dgarrimar@gmail.com>
Mon 16/11/2020 21:11
Hi Pierre,

I hope everything is going well. Is there any plan in the short term to enhance CompQuadForm with a multiple-precision library? Separately, I think I found a small bug in the farebrother method: the break out of the loop runs directly after ifault: 4, even when the required accuracy has been achieved.

Please let me know!

Have a nice week,

Diego


---------- Forwarded message ---------
From: Lafaye de Micheaux Pierre <lafaye@dms.umontreal.ca>
Date: Sat, 21 Jan 2017 at 11:25
Subject: Re: CompQuadForm questions
To: Diego Garrido Martín <dgarrimar@gmail.com>


Dear Diego,

I know about this problem. The problem comes from numerical precision. I
should consider using a multiple-precision library (see for example
www.mpfr.org or www.boost.org/doc/libs/1_60_0/libs/multiprecision) in my
C code. But for the moment, I really do not have time for this. I am in
the process of moving from France to Australia. If you know how to do
this and are willing to do it, I could incorporate your code in my
package (with due credit of course). If not, I'm afraid you will have to
wait for a few months.

Best regards,
Pierre
On Friday 20 January 2017 at 17:50 +0100, Diego Garrido Martín
wrote:
> Dear Professor Lafaye,
>
>
> I am a current user of your R package "CompQuadForm". Until recently I
> was using version 1.4.1, and I just realized there is a newer version. I
> essentially use it to compute p-values and find it really useful.
> One of my major concerns is the precision limit. I am performing millions
> of tests, and the smallest p-values are bounded by 10^-15. I would
> like to achieve smaller values. I had a look at the C++ code, but did
> not manage to achieve more precision. How would this be possible?
> Could you please help me with this?
>
> Additionally, I tried the different methods and found some problems
> with extreme values, especially with Davies, which frequently outputs 2.
> Farebrother also sometimes returns a value of 1 when Imhof works
> properly, and Imhof can give rise to negative p-values. I wonder whether
> these issues are corrected in the newest version, or will be
> corrected in the future, as I am interested in using the package as
> part of other packages.
>
>
> Many thanks in advance,
>
>
> Best,
>
>
> Diego Garrido

############################################################
From:	N.Xu@lumc.nl
To:	lafaye@DMS.UMontreal.CA, duchesne@DMS.UMontreal.CA
Cc:	J.J.Goeman@lumc.nl


Dear Dr. Lafaye de Micheaux and Dr. Duchesne,
 
I am writing to present an example for which functions in your package CompQuadForm, especially imhof() and farebrother(), do not converge. I’m Ningning Xu, a PhD student at Leiden University Medical Center in the Netherlands, supervised by Prof. Jelle Goeman, who first spotted this problem.
 
Our current project is based on computing the distribution function of a linear combination of quadratic forms of standard normal variables. We were happy to find that your package CompQuadForm is designed for exactly this goal. However, we also found some problems when using it to compute P[Q>q] for large-scale lambdas.
 
Attached is the R file where you can check 1) how we generate the data, 2) how we create the value point q and the non-negative weights vector lambda, 3) the results of the functions in the package and 4) the results of our own functions. The following table compares the results for P[Q>q]:
 
Package CompQuadForm            Our functions
imhof():        0.5             pImhof():    3.395724e-10
liu():          9.609379e-20    pPearson():  9.609379e-20
farebrother():  8               pRP():       1e-12  [Robbins & Pitman, 1949]
davies():       0
 
 
Ideally, for the same q and lambda, these functions should give similar results rather than such diverse ones. farebrother() even returns a probability of *8*. It seems that imhof() does not converge to the true value, which is expected to be close to 0 (as q is far greater than the sum of lambda).
 
Please let me know if I did anything incorrectly. We hope our findings can help improve the package and make users aware of these “potential risks”.
 
Best wishes,
 
Ningning Xu
Jelle Goeman
 
## An example that imhof() in package "CompQuadForm" does not provide a converged result

library(CompQuadForm)

n = 200 ## observations
m = 250 ## covariates
X = matrix(0, n, m, byrow = TRUE)
for ( i in 1:n){
  set.seed(1234+i)
  X[i,] =  as.vector(arima.sim(model = list(order = c(1, 0, 0), ar = 0.89), n = m) )
}
y = rbinom(n,1,0.6) ## binary response
X[which(y==1),1:120] = X[which(y==1),1:120] + 0.8
X[which(y==1),180:250] = X[which(y==1),180:250] + 0.98

library(stringr)
xs = str_replace_all(paste(rep("x",m),seq(1,m,1)),fixed(" "), "")
colnames(X) = xs

sqrW = sqrt(mean(y)*(1-mean(y)) )
WIHX = sqrW *(sweep(X,2,colMeans(X))) 
IHX = WIHX/sqrW 

Tf = sum((y%*%IHX[,xs,drop=FALSE])^2) ## Tf = 480335.9
Lamf = round(eigen(WIHX[,xs,drop=FALSE]%*%t(WIHX[,xs,drop=FALSE]),symmetric = TRUE,only.values = TRUE)$values,8)
## Lamf is a non-negative numeric vector of length 200, sum = 60976.23

imhof(Tf,Lamf)$Qq ## 0.5


farebrother(Tf,Lamf)$Qq ## 8 


davies(Tf,Lamf)$Qq  ## 0 

liu(Tf, Lamf)  ##9.609379e-20


## Not run
## our own function
#pPearson(Tf, Lamf) ## 9.609379e-20
#pImhof(Tf, Lamf)$value ## 3.395724e-10
#pRPw(Tf, Lamf)  ## 1e-12

From:	Pierre Duchesne <duchesne@dms.umontreal.ca>
To:	N.Xu@lumc.nl, lafaye@DMS.UMontreal.CA
Cc:	J.J.Goeman@lumc.nl

Hello Ningning,
 
Thank you for using our package. Reproducing your example:
 
> farebrother(Tf,Lamf)
$`dnsty`
[1] 0
 
$ifault
[1] -200
 
$Qq
[1] 8
 
Please note that there is an error code: $ifault is -200, meaning that the 200th eigenvalue is zero, as the complete output clearly indicates.
 
> Lamf[200]
[1] 0
> Lamf[200]==0
[1] TRUE
 
Please check the documentation carefully.
 
Best regards,
Pierre
 
------------------
Pierre Duchesne

Dear Dr. Duchesne,
 
Thanks for pointing out my mistakes.
 
However, after removing the zeros in Lamf, I found that imhof() still returns 0.5, which means that it has still not converged and is far away from liu() = 9.609379e-20. Maybe the two would not be exactly the same order of magnitude, but I think both should be close to 0.
 
Best wishes,
Ningning  
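
The clean-up step mentioned in this exchange (removing zero eigenvalues before calling the package functions) could look roughly like the sketch below. The helper name drop_zero_weights and the relative tolerance are hypothetical and not part of CompQuadForm; the toy vector only stands in for Lamf.

```r
# Hypothetical helper (not part of CompQuadForm): drop eigenvalues that are
# zero or negligible relative to the largest one before passing them on.
drop_zero_weights <- function(lambda, rel_tol = 1e-8) {
  stopifnot(is.numeric(lambda), length(lambda) > 0)
  keep <- abs(lambda) > rel_tol * max(abs(lambda))
  lambda[keep]
}

# Toy weights standing in for Lamf; the last one is exactly zero.
lam <- c(3.2, 1.5, 0.4, 0)
lam_clean <- drop_zero_weights(lam)
print(lam_clean)
```

The cleaned vector would then be passed on, e.g. farebrother(Tf, lam_clean), and $ifault inspected before $Qq is used, since a nonzero fault code such as the -200 above signals a problem.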

#######################################################################

I am using the imhof and the farebrother methods intensively for 
computing the p-value of a certain statistical test. Thus I detected 
that, in some cases, both methods have some difficulties when the 
quantiles are extreme:

- imhof sometimes gives negative values. For example:
     > imhof(1.729879e+01,0.5)
     $Qq
     [1] -0.01641256

     $abserr
     [1] 0.0004728838

- farebrother gives the value 3 if the quantile is zero. (Since the 
distribution of the underlying test statistic is asymptotically given by 
those of a quadratic form, the value zero indeed occurs.)
     > farebrother(0,0.5)
     $lambda
     [1] 0.5

     $h
     [1] 1

     $delta
     [1] 0

     $r
     [1] 1

     $q
     [1] 0

     $mode
     [1] 1

     $maxit
     [1] 100000

     $eps
     [1] 1e-10

     $dnsty
     [1] 0

     $ifault
     [1] 2

     $res
     [1] 3

Do you have an idea what the reason could be?

Best regards,
Stefan Aulbach
Email: stefan.aulbach@uni-wuerzburg.de
http://www.statistik-mathematik.uni-wuerzburg.de/en/mitarbeiter/stefan_aulbach/
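
For the two single-weight cases reported above, an exact reference value is available without any numerical inversion, which may help when judging the package output. The sketch assumes one weight lambda > 0 with one degree of freedom and no non-centrality (h = 1, delta = 0, as in the printed farebrother output); the name ref_tail is hypothetical.

```r
# With a single weight lambda > 0 and one degree of freedom, Q = lambda * Z^2,
# so P[Q > q] = P[chi-square(1) > q / lambda]. This gives an exact reference.
ref_tail <- function(q, lambda) {
  stopifnot(length(lambda) == 1, lambda > 0)
  pchisq(q / lambda, df = 1, lower.tail = FALSE)
}

p_extreme <- ref_tail(1.729879e+01, 0.5)  # the case where imhof() went negative
p_zero    <- ref_tail(0, 0.5)             # the case where farebrother() gave 3
print(c(p_extreme, p_zero))
```

The second value is exactly 1, since P[Q > 0] = 1 for a positive weight, and the first is a small positive number, so neither -0.0164 nor 3 can be a valid answer.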

##################################################################################


Dear  Dr Pierre Lafaye de Micheaux,

I am a postdoc at the Netherlands Cancer Institute and currently work on non-nested hypothesis testing. In a recent work (http://arxiv.org/abs/1210.4584v2), the test statistic which I propose is asymptotically distributed as a weighted sum of chi-squares. There are no non-centrality parameters involved; however, the weights can be negative. I use this test in a multiple testing scenario and therefore need an approximation which is also accurate for very small p-values.

I have recently read your very interesting paper on a comparison of several methods and have a question concerning your R package CompQuadForm:

Which function would you recommend for my problem (computing accurate, small p-values based on a weighted sum of chi-squares with positive and negative weights)? If I understand correctly, the farebrother approach requires positive weights. I tried davies and imhof. I have the impression that I can obtain smaller p-values (e.g. of the order of 10^-10) with imhof, whereas davies gives me a value of zero.

I would be very grateful for any thoughts or help!

Best wishes from Amsterdam,

Nicolas
"n.stadler" <staedler.n@gmail.com>

##################################################################################

Dear P. Lafaye de Micheaux,

I am using your R package CompQuadForm to calculate p-values. However, the p-values can be very small, e.g., less than 10^{-8}. Is it possible to compute such small p-values with an accuracy better than 10^{-9}?

I tried an example using three methods and got different results as follows:
Q=200
> davies(Q,c(6,3,1))$Qq
[1] 0
> farebrother(Q,c(6,3,1),eps=10^{-13})$res
[1] 1.224786e-08
> imhof(Q,c(6,3,1),epsabs=10^(-13))$Qq
[1] -1.092451e-07
So which method gives a correct value? And why is imhof's p-value negative?

Thank you.

Best regards,
ChangJiang


In Mathematica, I obtain a positive value:
0.5+NIntegrate[Sin[(ArcTan[6*u]+ArcTan[3*u]+ArcTan[u]-200*u)/2]/(u*(1+36*u^2)^(1/4)*(1+9*u^2)^(1/4)*(1+u^2)^(1/4)),{u,0,+Infinity}]/Pi

The problem comes from numerical precision. I should consider using a multiple-precision library (see www.mpfr.org or www.boost.org/doc/libs/1_60_0/libs/multiprecision) in my C code.

See also:
https://cran.r-project.org/web/packages/Rmpfr/vignettes/Rmpfr-pkg.pdf
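
As a rough plausibility check on the three values reported above, the tail probability can be bracketed with two ordinary chi-square tails. This is only a crude sketch that assumes all weights are positive with one degree of freedom each and no non-centrality; the name tail_bracket is hypothetical, and the bracket says nothing about which digits are correct.

```r
# Crude bracket for P[Q > q], Q = sum(lambda_i * Z_i^2), all lambda_i > 0.
# Since max(lambda) * Z_k^2 <= Q <= max(lambda) * sum(Z_i^2):
#   P[chisq(1) > q / max(lambda)] <= P[Q > q] <= P[chisq(n) > q / max(lambda)]
tail_bracket <- function(q, lambda) {
  stopifnot(all(lambda > 0))
  x <- q / max(lambda)
  c(lower = pchisq(x, df = 1, lower.tail = FALSE),
    upper = pchisq(x, df = length(lambda), lower.tail = FALSE))
}

b <- tail_bracket(200, c(6, 3, 1))
print(b)

# Reported values from the message above, checked against the bracket.
reported <- c(davies = 0, farebrother = 1.224786e-08, imhof = -1.092451e-07)
inside <- reported >= b[["lower"]] & reported <= b[["upper"]]
print(inside)
```

Only the farebrother value falls inside the bracket. The lower bound is strictly positive, so both 0 and a negative value are ruled out as exact answers.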

####################################################################################


My name is Zhonghua Liu, and I am a student in biostatistics at Harvard. I am using your CompQuadForm R package to compute p-values which might require high accuracy.

I ran the following R code and obtained this output:

> davies(200,c(1,2,3),lim=900000,acc =0.0000000000001)
$trace
[1] 0.000000e+00 0.000000e+00 0.000000e+00 3.141588e-02 8.425265e+04 5.436841e-05 5.600000e+01

$ifault
[1] 1

$Qq
[1] 2


Qq should lie within [0,1]. Could you let me know how to fix this? Thanks a lot!

"Zhonghua Liu" <zhl618@mail.harvard.edu>

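Until the underlying precision issue is resolved, a caller could guard against results like the one above. The following is a minimal sketch, assuming a result list with the fields Qq and ifault as printed in this message and assuming that ifault equal to 0 means no error; the name safe_tail_prob is hypothetical.

```r
# Hypothetical guard: accept a result list shaped like the output shown above
# (fields Qq and ifault) and return NA when it cannot be a valid probability.
safe_tail_prob <- function(res) {
  p <- res$Qq
  ok <- isTRUE(res$ifault == 0) && length(p) == 1 &&
    is.finite(p) && p >= 0 && p <= 1
  if (ok) p else NA_real_
}

# The output reported in this message, re-entered by hand.
reported <- list(ifault = 1, Qq = 2)
print(safe_tail_prob(reported))
print(safe_tail_prob(list(ifault = 0, Qq = 0.03)))
```

Returning NA instead of the raw value keeps an invalid "probability" such as 2 from silently entering downstream multiple-testing corrections.
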
####################################################################

Dear Professor Lafaye de Micheaux,

For educational reasons, I am currently trying to re-compute Durbin and Watson's (1951, II) critical values for the lower and upper bounds dL and dH of their test statistic (and also table extensions such as that of Savin and White (1977)). I know that this has no practical relevance, as we can compute exact p-values, but as mentioned, it is purely for educational purposes. The only promising sources I found on the web were your R package CompQuadForm and a 2002 handbook publication, "Computing the Distribution of a quadratic form in normal variables", by Professor Farebrother. However, I am stuck in section 4, in which he writes that one can determine the bounds "by setting g=1 and solving the equation

Pr[ sum( (lambda - c) * z ²) ] = alpha         (4.1)

for c". My problem is that I can evaluate the left-hand side of (4.1) using pan/gradsol, but I do not know how to solve (4.1) for c. The only (and maybe naive) ideas that I have are (i) to reverse the pan/gradsol algorithm so that it returns quantiles (which seems quite tedious) or (ii) to do a binary search over the values of the pan/gradsol function (which is computationally inefficient). How exactly are the quantile tables of dL and dH generated?
My other problem is using the CompQuadForm functions, or the original Imhof/Koerts/Abrahamse algorithm "fquad", for the above purpose, as it does not seem to evaluate the same quadratic form as pan/gradsol does. How would I have to use or modify this algorithm so that it is fully compatible with pan/gradsol?

I know that these questions may be trivial to you, and I am really sorry to bother you with them, but I see no other solution than to contact you directly. You would do me a great favour if you could give me some brief help.

Thank you very much in advance.
Yours sincerely,

Sönke Hoffmann, Germany

-----------
Dr. rer. pol. Sönke Hoffmann
Otto-von-Guericke University
Faculty of Economics and Management
Chair of Economic Policy (VWL III)
"Hoffmann, Soenke" <soenke.hoffmann@ovgu.de>

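The search over c mentioned in the message above can be delegated to a standard root finder rather than a hand-written bisection. The sketch below is illustrative only: it assumes that the probability in (4.1) is that of the event sum((lambda - c) * z^2) < 0 and that it is nondecreasing in c, and it replaces the pan/gradsol evaluation with a stand-in whose exact answer is known. The names solve_for_c and prob_fn are hypothetical.

```r
# Sketch: solve prob_fn(cval) = alpha for cval by root finding with uniroot().
# prob_fn is assumed to be nondecreasing in cval on [lower, upper].
solve_for_c <- function(prob_fn, alpha, lower, upper) {
  uniroot(function(cval) prob_fn(cval) - alpha,
          lower = lower, upper = upper, tol = 1e-10)$root
}

# Stand-in for the pan/gradsol evaluation (an assumption for illustration):
# for weights (1, 0), Z1^2 / (Z1^2 + Z2^2) follows a Beta(1/2, 1/2) law, so
# P[sum((lambda_i - cval) * Z_i^2) < 0] equals pbeta(cval, 0.5, 0.5).
prob_fn <- function(cval) pbeta(cval, 0.5, 0.5)

c_hat <- solve_for_c(prob_fn, alpha = 0.05, lower = 0, upper = 1)
print(c(root = c_hat, exact = qbeta(0.05, 0.5, 0.5)))
```

In practice prob_fn would wrap whichever routine evaluates the left-hand side of (4.1), and the search interval would be the range of the eigenvalues. Each uniroot() iteration costs one evaluation of that routine.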