A list of our published papers can be found at the
GIS Google Scholar profile
(goo.gl/KsjbmE).
Projects:
Bayesian Meta-analysis.
The role of meta-analysis is to summarize, through statistical methods, the
published studies on a specific problem. It is becoming increasingly important with
the advancement of science and the growth in the number of publications. Synthesizing
the available information facilitates understanding and yields more robust conclusions.
Glass (1976) defines meta-analysis as an analysis of analyses, that is, a statistical
analysis that combines results already obtained in previous analyses of different
studies addressing the same question. A meta-analysis combines studies carried out
under different conditions, with different levels of precision, by research groups
from different regions and backgrounds. Thus, its conclusions are expected to be
broader than those obtained by any of the individual studies that constitute the
systematic review (Fagard et al., 1996). Meta-analysis also offers the opportunity to
reconcile differences between regions, countries and groups, and it provides estimates
of the average effect by combining the results of the included studies.
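As a minimal sketch of how an average effect can be obtained from several studies, the snippet below pools study estimates with inverse-variance weights. This is the classical fixed-effect calculation, not necessarily the method used in this project; under a normal likelihood with known variances and a flat prior on the common effect, the posterior mean and standard deviation coincide with these two numbers. The function name and the three study values are hypothetical and chosen only for illustration.

```python
import math


def pool_fixed_effect(estimates, std_errors):
    """Inverse-variance weighted average of study estimates.

    Returns the pooled estimate and its standard error.
    """
    if not estimates or len(estimates) != len(std_errors):
        raise ValueError("need one standard error per estimate")
    # More precise studies (smaller standard error) get larger weights.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical effect estimates and standard errors from three studies.
effects = [0.10, 0.30, 0.20]
ses = [0.10, 0.20, 0.10]
pooled, pooled_se = pool_fixed_effect(effects, ses)
print(f"pooled effect = {pooled:.4f}, standard error = {pooled_se:.4f}")
```

The pooled standard error is smaller than that of any single study, which is the sense in which the synthesis is more robust than its parts.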
Estimation of the reliability of coherent systems.
The estimation of the reliability of coherent systems is of great importance
in engineering, but is not limited to it. As far as we know, there is no
complete solution to the problem yet; we have been working on it and
advancing the state of the art. Many ideas developed for reliability can
be used to solve problems in survival analysis. The best-known system in
reliability theory is the bridge system (Barlow and Proschan, 1981). It is
a coherent system whose properties have been studied and discussed.
However, the statistical properties of estimators of the system reliability
function, and especially of the reliability functions of its components,
have not been fully studied yet. Relating this problem to survival
analysis, important references include Cox (1972), Breslow and Crowley
(1974) and Kaplan and Meier (1958). Examples of developments in competing
risks (series system) are Aalen (1976), Tsiatis (1975), Peterson (1977)
and Salinas-Torres et al. (2002). For the parallel system,
Polpo and Pereira (2009) presented a Bayesian nonparametric solution to
the estimation of the distribution of the components of the system.
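To make the bridge system concrete, the sketch below computes its reliability from known component reliabilities by enumerating all component states. It assumes independent components and the common textbook labeling in which component 3 is the bridge and the minimal path sets are {1,4}, {2,5}, {1,3,5} and {2,3,4}; the labeling and function names are assumptions made for illustration. It does not address the harder statistical problem described above, namely estimating these reliabilities from data.

```python
from itertools import product

# Minimal path sets of the five-component bridge system
# (assumed labeling: component 3 is the bridge).
MIN_PATHS = [(1, 4), (2, 5), (1, 3, 5), (2, 3, 4)]


def structure_function(state):
    """Return 1 if the system works, given state[c] in {0, 1} per component."""
    return int(any(all(state[c] for c in path) for path in MIN_PATHS))


def system_reliability(p):
    """Reliability of the bridge system for independent components.

    p maps each component number (1..5) to its probability of working.
    """
    components = sorted(p)
    total = 0.0
    # Sum the probabilities of all component states in which the system works.
    for bits in product((0, 1), repeat=len(components)):
        state = dict(zip(components, bits))
        if structure_function(state):
            prob = 1.0
            for c in components:
                prob *= p[c] if state[c] else 1.0 - p[c]
            total += prob
    return total


r = system_reliability({c: 0.9 for c in range(1, 6)})
print(f"bridge reliability with p = 0.9 for every component: {r:.5f}")
```

Enumeration is exponential in the number of components, which is acceptable for five components but not for large systems.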
Functional data analysis.
Functional data are those in which each observation is a real function rather than
a simple vector or scalar. This kind of data has become more common since the
development of real-time measurement devices. It is a recent area in statistics, and
the main procedures in use were defined by Ramsay and Dalzell (1991) and
Ramsay and Silverman (1997). This project was motivated by a physiotherapy problem
concerning human gait. Studying gait is important for understanding whether a
person's movement is normal and for developing methods to prevent or recover from
changes in normal movement. Olshen et al. (1989) suggested a model to obtain
confidence regions for functional data using the bootstrap procedure.
Genomic Analysis.
A common procedure in Statistics and Machine Learning when dealing with data sets
of thousands of variables is to sort these variables according to some measure
of how important they are for predicting and/or retrospectively understanding
a certain target variable (or, equivalently, an indicator that tells to which group
or population each sample belongs). Classical examples of such a procedure are
Student's t-test and the Wilcoxon rank-sum test, also known as the Mann-Whitney
U test (Demsar, 2006; Fay and Proschan, 2010; Mann and Whitney, 1947), whose
statistics are often used to sort variables into some order of importance. Arguably,
they are the most commonly used methods for this problem in biomedical applications,
in part because of their ready availability and ease of use. A typical scenario is to
have gene expression data of cancer patients and a class variable that identifies
whether the patient relapsed or not (in other words, whether the cancer came back
after treatment/surgery or not). The ability to sort variables in some meaningful
order has a range of applications in many fields, and can also be seen as a means of
performing feature selection (Mitchell, 1997; Witten et al., 2011).
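The ranking procedure described above can be sketched as follows: compute a two-sample statistic for every variable and sort the variables by its absolute value. The example uses Welch's version of the t statistic (unequal variances), which is one possible choice and not necessarily the one used in this project. The gene names, expression values and labels are made up for illustration.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch two-sample t statistic (assumes non-zero sample variances)."""
    return (mean(a) - mean(b)) / math.sqrt(
        variance(a) / len(a) + variance(b) / len(b)
    )


def rank_features(data, labels):
    """Sort feature names by |t|, from most to least discriminative.

    data maps a feature name to its values (one per sample);
    labels holds the 0/1 class of each sample.
    """
    scores = {}
    for name, values in data.items():
        group0 = [v for v, y in zip(values, labels) if y == 0]
        group1 = [v for v, y in zip(values, labels) if y == 1]
        scores[name] = abs(welch_t(group1, group0))
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical expression values; label 1 = relapsed, 0 = did not relapse.
labels = [0, 0, 0, 1, 1, 1]
data = {
    "gene_a": [1.0, 1.2, 0.9, 3.0, 3.1, 2.8],
    "gene_b": [2.0, 2.5, 1.5, 2.1, 2.4, 1.6],
    "gene_c": [1.0, 2.0, 3.0, 2.0, 3.0, 4.0],
}
ranking = rank_features(data, labels)
print(ranking)
```

With thousands of variables, the top of such a ranking is also subject to multiple-comparison effects, so the ordering is best read as a screening tool rather than as evidence for any single variable.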
Quantile regression.
There are many studies that deal with the quantile regression model under
non-parametric and semi-parametric approaches for right-censored data (see,
for example, BuHamra et al. (2004); Fung et al. (2012); Koenker (2008);
Lin et al. (2012)). Semi-parametric models such as Cox’s (1972) proportional
hazards model and linear transformation models (Cheng et al., 1995) are very
popular for modeling effects of covariates on a survival response. Several
authors, including Ying et al. (1995), gave compelling arguments in favor of
focusing on the quantiles of the survival time for modeling and reporting data
analysis results. The many semi-parametric and non-parametric approaches are
mostly based on self-consistency and martingale arguments, which yield estimating
equations for median regression (Cheng et al., 1997; Portnoy, 2003; Peng and Huang, 2008).
Carroll and Ruppert (1984) and Fitzmaurice et al. (2007) propose parametric
versions of a Box-Cox transform-both-sides regression model, considering only
uncensored continuous responses, the original Box-Cox transformation, and the
normal distribution for the error.
Regression models for categorical data.
The development of the generalized linear
model brought an important tool to the regression of categorical data (Nelder and
Wedderburn, 1972). The most popular link functions are the logit and the probit
(McCullagh and Nelder, 1989). Many studies have discussed the limitations of these
symmetric links. It is well accepted that, when the probability of a binary
response approaches zero at a different rate than it approaches one, a symmetric
link may not be appropriate (Chen et al., 1999). Many parametric classes of link
functions have been developed. Works on one-parameter link functions include
Aranda Ordaz (1981), Chen et al. (1999) and Guerrero and Johnson (1982); works on
two-parameter link functions include Stukel (1988), Prentice (1976) and Czado (1994).
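To illustrate the difference between symmetric and asymmetric links, the sketch below implements the inverse of the logit link and of the one-parameter asymmetric family of Aranda Ordaz (1981), g(p) = log(((1 - p)^(-lam) - 1) / lam) with lam > 0. This family reduces to the logit at lam = 1 and approaches the complementary log-log link as lam goes to zero. The function names are ours, and the snippet is only an illustration of the idea, not the models studied in this project.

```python
import math


def logit_inverse(eta):
    """Inverse logit link: symmetric, p(eta) + p(-eta) = 1."""
    return 1.0 / (1.0 + math.exp(-eta))


def cloglog_inverse(eta):
    """Inverse complementary log-log link: asymmetric."""
    return 1.0 - math.exp(-math.exp(eta))


def aranda_ordaz_inverse(eta, lam):
    """Inverse of the asymmetric Aranda-Ordaz link, for lam > 0."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    return 1.0 - (1.0 + lam * math.exp(eta)) ** (-1.0 / lam)


for eta in (-2.0, 0.0, 2.0):
    print(
        eta,
        round(logit_inverse(eta), 4),
        round(aranda_ordaz_inverse(eta, 0.5), 4),
        round(cloglog_inverse(eta), 4),
    )
```

Because lam enters as a free parameter, the data can indicate how far the fitted curve departs from the symmetric logit shape.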