Huang, X.H., Pan, W., Han, X.Q., Chen, Y.J., Miller, L.W. and Hall, J. (2005) Borrowing information from relevant microarray studies for sample classification using weighted partial least squares. Computational Biology and Chemistry, 29(3), 204-211.
- Listed: 2 June 2026, 15:51
Description
When you skim through the ever‑growing literature of **computational biology**, one citation often catches the eye of data scientists and **bioinformaticians** alike: the 2005 paper by Huang et al. that introduced a clever way to “borrow information” from existing **microarray studies** to improve **sample classification**. If you’ve ever wrestled with noisy gene‑expression data, struggled to find a robust **machine‑learning** model, or simply wondered how to make the most of previously published datasets, this landmark study offers a practical roadmap.
### Why borrowing information matters
In the early 2000s, microarray technology exploded, producing thousands of gene-expression profiles across cancers, developmental stages, and drug-response experiments. Yet each study often suffered from a limited number of samples, making statistical inference shaky. Huang and colleagues recognized that **relevant microarray studies** (those that share biological context, platform, or experimental design) hold untapped knowledge that can be transferred to a new classification problem. Their central insight: rather than treating each dataset in isolation, we can **integrate** them, weighting each source according to its relevance. When the borrowed studies really are relevant, the larger effective sample size can reduce variance and over-fitting and so improve predictive accuracy; a poorly matched study given too much weight can instead introduce bias.
### Weighted Partial Least Squares (WPLS) explained
The authors paired the borrowing concept with **Weighted Partial Least Squares (WPLS)**, a variation of classic **partial least squares regression**. Standard PLS finds latent variables (linear combinations of genes) that capture the maximum covariance between the predictor matrix (gene expression) and the response (class labels). WPLS extends this by attaching a weight to each external study, applied to that study's samples, which tells the algorithm how much to trust each source relative to the target data.
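To make the role of the weights concrete, here is a minimal NumPy sketch of a single-response weighted PLS fit. It is an illustration under stated assumptions, not the authors' published algorithm: the function name `weighted_pls1` is hypothetical, and the weighting is done with the standard device of weighted centering followed by scaling each row by the square root of its weight.

```python
import numpy as np

def weighted_pls1(X, y, sample_weight, n_components=2):
    """Sketch of PLS1 with per-sample weights (hypothetical helper).

    Returns (x_mean, y_mean, coef) so that a prediction is
    y_mean + (X_new - x_mean) @ coef.
    """
    w = np.asarray(sample_weight, dtype=float)
    w = w / w.sum()                       # normalise weights to sum to 1
    x_mean = w @ X                        # weighted column means
    y_mean = w @ y
    root_w = np.sqrt(w)
    # Scaling rows by sqrt(weight) turns weighted covariances into ordinary ones.
    Xk = (X - x_mean) * root_w[:, None]
    yk = (y - y_mean) * root_w
    W, P, q = [], [], []
    for _ in range(n_components):
        direction = Xk.T @ yk             # direction of maximum weighted covariance with y
        norm = np.linalg.norm(direction)
        if norm < 1e-12:                  # nothing left to explain
            break
        direction = direction / norm
        t = Xk @ direction                # latent scores
        tt = t @ t
        p = Xk.T @ t / tt                 # X loadings
        q_k = (yk @ t) / tt               # y loading
        Xk = Xk - np.outer(t, p)          # deflate X and y
        yk = yk - q_k * t
        W.append(direction)
        P.append(p)
        q.append(q_k)
    W = np.array(W).T
    P = np.array(P).T
    coef = W @ np.linalg.solve(P.T @ W, np.array(q))
    return x_mean, y_mean, coef
```

With class labels coded 0/1, a new sample can be scored as `y_mean + (x_new - x_mean) @ coef` and assigned to class 1 when the score exceeds 0.5. Setting a sample's weight to 0 removes its influence entirely, and a weight of 1 treats it like a target sample.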
Key steps in the WPLS pipeline include:
1. **Selection of relevant external studies** – similarity is assessed using correlation of gene‑expression patterns or shared phenotype descriptors.
2. **Computation of study‑specific weight factors** – studies that are more biologically aligned receive higher weights.
3. **Construction of a combined predictor matrix** – samples from all studies are stacked row-wise over a shared set of genes, so the number of features stays the same while the number of training samples grows; each row carries its study's weight.
4. **Model training and validation** – the weighted matrix feeds into a PLS model that yields robust **sample classification** results.
The outcome is a classifier that not only learns from the target dataset but also leverages the collective wisdom of related experiments.
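Steps 1 to 3 above can be sketched in a few lines of NumPy. This is one possible reading, with loudly stated assumptions: relevance is measured as the Pearson correlation between per-gene mean profiles, negative correlations are clipped to a weight of 0, target samples always get weight 1, and all studies have already been mapped to the same gene columns. The helper names `study_weight` and `stack_with_weights` are hypothetical.

```python
import numpy as np

def study_weight(target_X, external_X):
    """Assumed relevance score: correlation of per-gene mean profiles, clipped to [0, 1]."""
    r = np.corrcoef(target_X.mean(axis=0), external_X.mean(axis=0))[0, 1]
    return float(np.clip(r, 0.0, 1.0))

def stack_with_weights(target, externals):
    """Stack (X, y) pairs row-wise and return X, y, and one weight per sample.

    `target` is an (X, y) pair; `externals` is a list of (X, y) pairs
    measured on the same gene columns as the target.
    """
    X_parts = [target[0]]
    y_parts = [target[1]]
    w_parts = [np.ones(len(target[1]))]       # target samples get full weight
    for X_ext, y_ext in externals:
        w = study_weight(target[0], X_ext)
        X_parts.append(X_ext)
        y_parts.append(y_ext)
        w_parts.append(np.full(len(y_ext), w))  # every sample inherits its study's weight
    return np.vstack(X_parts), np.concatenate(y_parts), np.concatenate(w_parts)
```

The returned weight vector is what a weighted PLS fit (step 4) would consume. Other relevance scores, such as ones based on shared phenotype descriptors, would slot into `study_weight` without changing the stacking logic.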
### Real‑world impact and modern relevance
The idea at the core of the Huang et al. framework, pooling related datasets with relevance weights, remains directly applicable to **bioinformatics** tasks such as **gene-expression analysis**, **cancer sub-typing**, and **drug-response prediction**. Nothing in the borrowing strategy depends on the measurement platform, so in principle it carries over to RNA-seq, proteomics, and metabolomics data as well.
In today’s **big‑data** era, where repositories like **GEO** and **ArrayExpress** host millions of microarray and sequencing profiles, the WPLS methodology aligns perfectly with **data‑driven** research. By automatically extracting and weighting relevant information, scientists can accelerate **biomedical discovery**, reduce experimental costs, and improve the reliability of **machine‑learning** models in **personalized medicine**.
### Take‑away tips for practitioners
– **Curate your external pool carefully** – Use ontology terms (e.g., MeSH, GO) to filter studies that share disease type, tissue, or treatment.
– **Validate weighting schemes** – Experiment with correlation‑based, distance‑based, or Bayesian weighting to find the best fit for your data.
– **Combine WPLS with modern algorithms** – Pair the weighted matrix with **support vector machines**, **random forests**, or deep learning for hybrid models.
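– **Leverage open-source packages** – R packages like `pls` and `mixOmics` implement PLS and PLS-based discriminant analysis; check each package's documentation to see whether the version you use supports observation weights, and fall back to rescaling the data yourself if it does not.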
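One way to act on the "validate weighting schemes" tip is to pick the external-study weight by leave-one-out cross-validation on the target samples only. The sketch below assumes a single external study and, to stay short, uses a weighted nearest-centroid classifier as a stand-in for WPLS; the function names and the weight grid are illustrative choices, not part of the original paper.

```python
import numpy as np

def weighted_centroids(X, y, w):
    """Weighted mean expression profile for each class label."""
    return {c: np.average(X[y == c], axis=0, weights=w[y == c]) for c in np.unique(y)}

def predict_one(centroids, x):
    """Assign x to the class with the nearest centroid (Euclidean distance)."""
    return min(centroids, key=lambda c: np.linalg.norm(x - centroids[c]))

def loo_accuracy(Xt, yt, Xe, ye, ext_weight):
    """Leave one target sample out at a time; external samples always stay in training.

    Assumes every class keeps at least one target sample in each fold,
    so class weights never sum to zero when ext_weight is 0.
    """
    hits = 0
    for i in range(len(yt)):
        keep = np.arange(len(yt)) != i
        X = np.vstack([Xt[keep], Xe])
        y = np.concatenate([yt[keep], ye])
        w = np.concatenate([np.ones(keep.sum()), np.full(len(ye), ext_weight)])
        hits += int(predict_one(weighted_centroids(X, y, w), Xt[i]) == yt[i])
    return hits / len(yt)

def pick_weight(Xt, yt, Xe, ye, grid=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Return the grid value with the best target-only accuracy, plus all scores."""
    scores = {g: loo_accuracy(Xt, yt, Xe, ye, g) for g in grid}
    return max(scores, key=scores.get), scores
```

Scoring only on held-out target samples matters: the goal is accuracy on the new study, so an external dataset that disagrees with the target should end up with a low weight rather than dominate the evaluation.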
### Final thoughts
The 2005 citation may look like just another dense reference entry, but beneath it lies a timeless lesson: **knowledge is cumulative**. By borrowing information from relevant microarray studies and applying **Weighted Partial Least Squares**, researchers can build classifiers that are more accurate and more stable than the target dataset alone would allow. Whether you're a seasoned computational biologist or a newcomer to **gene-expression data mining**, revisiting Huang et al.'s approach can sharpen your analytical toolkit for **computational biology**, **bioinformatics**, and **precision health**.
*Keywords: microarray, weighted partial least squares, sample classification, computational biology, bioinformatics, gene expression analysis, machine learning, data integration, biomedical research, statistical methods.*