U. Sarkans, H. Parkinson, G. G. Lara, A. Oezcimen, A. Sharma, N. Abeygunawardena, S. Contrino, E. Holloway, P. Rocca-Serra, G. Mukherjee, M. Shojatalab, M. Kapushesky, S. A. Sansone, A. Farne, T. Rayner and A. Brazma. (2005) The ArrayExpress gene expression database: a software engineering and implementation perspective. Bioinformatics 21(8): 1495-1501.

  • Listed: 22 May 2026 23 h 24 min


**U. Sarkans, H. Parkinson, G. G. Lara, A. Oezcimen, A. Sharma, N. Abeygunawardena, S. Contrino, E. Holloway, P. Rocca‑Serra, G. Mukherjee, M. Shojatalab, M. Kapushesky, S. A. Sansone, A. Farne, T. Rayner and A. Brazma. (2005) The ArrayExpress gene expression database: a software engineering and implementation perspective. *Bioinformatics* 21(8): 1495‑1501.**

### Introduction: Why ArrayExpress Still Matters in 2024

When the 2005 *Bioinformatics* paper described **ArrayExpress**, it presented more than another data repository: it made a case for treating **gene expression databases** as a **software engineering** problem in the life sciences. Nearly two decades later, researchers still turn to ArrayExpress for high‑quality **microarray** and **RNA‑Seq** datasets, underscoring the paper’s lasting influence on **bioinformatics**, **functional genomics**, and **open science**. In this post we’ll unpack the key innovations described by Sarkans et al., explore how the platform has evolved, and highlight why its engineering principles remain a benchmark for modern **data sharing** solutions.

### A Software‑Centric Vision for Biological Data

The authors framed ArrayExpress as a **software engineering challenge** rather than a purely biological one. Their design goals of **scalability**, **interoperability**, and **robust metadata handling** anticipated the explosion of high‑throughput experiments that would follow. A **modular architecture** let the team support the **MIAME** (Minimum Information About a Microarray Experiment) standard, so that every dataset carried the contextual information needed for reproducibility.

Key engineering takeaways include:

1. **Layered Architecture** – Separation of the presentation, business logic, and data storage layers made it easier to upgrade components without disrupting user access.
2. **Metadata‑Driven Indexing** – Rich, searchable annotations allowed scientists to locate relevant experiments quickly, a feature that today’s **FAIR** (Findable, Accessible, Interoperable, Reusable) initiatives still champion.
3. **Open‑Source Toolkit** – The release of the underlying codebase encouraged community contributions, fostering a collaborative ecosystem that later projects like **EBI’s Expression Atlas** could build upon.
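To make the idea of metadata‑driven indexing concrete, here is a minimal Python sketch of an inverted index over experiment annotations. It is an illustration only, not the ArrayExpress implementation: the accessions (`E-DEMO-1` and so on), the field names, and the `build_index` and `search` helpers are all hypothetical.

```python
from collections import defaultdict


def build_index(experiments):
    """Map each (field, value) annotation pair to the accessions that carry it."""
    index = defaultdict(set)
    for accession, annotations in experiments.items():
        for field, value in annotations.items():
            # Lower-case values so that lookups are case-insensitive.
            index[(field, value.lower())].add(accession)
    return index


def search(index, **criteria):
    """Return the accessions that match every field=value criterion."""
    matches = [
        index.get((field, value.lower()), set())
        for field, value in criteria.items()
    ]
    return set.intersection(*matches) if matches else set()


# Hypothetical annotations; real records carry far richer metadata.
experiments = {
    "E-DEMO-1": {"organism": "Homo sapiens", "assay": "microarray"},
    "E-DEMO-2": {"organism": "Mus musculus", "assay": "microarray"},
    "E-DEMO-3": {"organism": "Homo sapiens", "assay": "RNA-Seq"},
}

index = build_index(experiments)
print(sorted(search(index, organism="homo sapiens", assay="microarray")))
# prints ['E-DEMO-1']
```

Because the index is keyed on annotation pairs rather than on the data files themselves, a query touches only the metadata, which is why consistent annotation matters so much for findability.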

### From Microarrays to Multi‑Omics: Evolution of the Platform

While the 2005 article focused primarily on **microarray** data, the underlying infrastructure proved flexible enough to accommodate emerging technologies. By around 2010, ArrayExpress had begun accepting **RNA‑Seq** datasets, and it later added support for other sequencing‑based assays such as **ChIP‑Seq** and **single‑cell transcriptomics**. This adaptability reflects the **software engineering principles** highlighted by Sarkans et al.:

- **Extensible Data Model** – New assay types could be added by extending the existing schema rather than redesigning it from scratch.
- **API‑First Approach** – Robust RESTful services enable programmatic access, powering downstream tools such as **Bioconductor**, **Galaxy**, and custom **machine‑learning pipelines**.
- **Continuous Integration/Deployment (CI/CD)** – Automated testing pipelines ensure that updates do not compromise data integrity, a practice that aligns with modern **DevOps** standards.
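The extensible data model can be sketched in a few lines of Python. This is a hedged illustration of the general pattern (a shared base record plus a registry of assay‑specific subclasses), not the schema ArrayExpress actually uses: the class names, the fields, and the `register` and `load_assay` helpers are assumptions made for the example.

```python
from dataclasses import dataclass

# Registry of assay types; new types are added without editing existing ones.
ASSAY_TYPES = {}


def register(name):
    """Class decorator that records an assay class under a type name."""
    def decorator(cls):
        ASSAY_TYPES[name] = cls
        return cls
    return decorator


@dataclass
class Assay:
    """Fields shared by every assay type (hypothetical)."""
    accession: str
    organism: str


@register("microarray")
@dataclass
class MicroarrayAssay(Assay):
    array_design: str


@register("rna-seq")
@dataclass
class RnaSeqAssay(Assay):
    library_layout: str


def load_assay(record):
    """Build the matching Assay subclass from a plain dict with a 'type' key."""
    fields = dict(record)  # copy so the caller's dict is left untouched
    cls = ASSAY_TYPES[fields.pop("type")]
    return cls(**fields)


assay = load_assay({
    "type": "rna-seq",
    "accession": "E-DEMO-3",
    "organism": "Homo sapiens",
    "library_layout": "paired",
})
print(type(assay).__name__)  # prints RnaSeqAssay
```

Supporting a new technology then means adding one decorated class, while code that only relies on the shared `Assay` fields keeps working unchanged.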

### Impact on the Bioinformatics Community

The paper’s citation count (reportedly over 2,500) reflects its role as a cornerstone reference for anyone building **biological data repositories**. Researchers benefit from:

- **High‑Quality Curated Datasets** – Consistent metadata improves statistical power in meta‑analyses and cross‑study comparisons.
- **Reproducibility** – Transparent versioning and provenance tracking make it easier to replicate published findings, addressing a major concern in contemporary science.
- **Education & Training** – The open‑source code and detailed documentation serve as teaching material for bioinformatics curricula, illustrating best practices in **software engineering for life sciences**.

### Looking Forward: Lessons for Future Databases

What can new projects learn from the ArrayExpress blueprint?

1. **Prioritize Standards Early** – Aligning with community standards (e.g., **FAIR**, **MIAME**, **MINSEQE**) avoids costly retrofits later.
2. **Design for Extensibility** – A modular codebase accommodates novel assay types without breaking existing functionality.
3. **Invest in Community** – Open‑source licensing and clear contribution guidelines attract developers who can extend the platform’s capabilities.
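As a small illustration of the first lesson, a submission pipeline can reject incomplete records before they enter the database. The sketch below checks a submission against the six elements that MIAME asks for (raw data, processed data, sample annotation, experimental design, array annotation, and protocols). The dictionary keys and the `missing_fields` helper are hypothetical names chosen for this example, not part of any real submission format.

```python
# The six MIAME elements, expressed as hypothetical dictionary keys.
MIAME_STYLE_FIELDS = (
    "raw_data",
    "processed_data",
    "sample_annotation",
    "experimental_design",
    "array_annotation",
    "protocols",
)


def missing_fields(submission, required=MIAME_STYLE_FIELDS):
    """Return the required fields that are absent or empty in the submission."""
    return [name for name in required if not submission.get(name)]


submission = {
    "raw_data": ["sample1.cel"],
    "processed_data": ["matrix.tsv"],
    "sample_annotation": {"sample1": {"organism": "Homo sapiens"}},
    "experimental_design": "case-control",
    "protocols": [],  # present but empty, so it is reported as missing
}

print(missing_fields(submission))
# prints ['array_annotation', 'protocols']
```

Running a check like this at submission time is far cheaper than curating incomplete records after the fact, which is the retrofit cost the lesson warns about.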

By embracing these principles, the next generation of **gene expression databases** can achieve the same longevity and relevance that ArrayExpress enjoys today.

### Conclusion

The 2005 *Bioinformatics* article by Sarkans et al. remains a seminal work that married **software engineering rigor** with the needs of **functional genomics**. Its foresight in building a scalable, interoperable, and metadata‑rich system set the stage for today’s thriving **bioinformatics ecosystem**. Whether you’re a bench scientist searching for expression data, a bioinformatician developing analysis pipelines, or a software architect designing a new data repository, the lessons from ArrayExpress are as valuable now as they were in 2005.

*Keywords: ArrayExpress, gene expression database, bioinformatics, software engineering, microarray, RNA‑Seq, functional genomics, data sharing, open science, FAIR data, metadata, reproducibility, data repository, high‑throughput sequencing.*
