Similarity of gene expression patterns is widely used to classify gene functions and to elucidate gene regulatory mechanisms, because genes involved in the same or related biological functions often show similar expression patterns. Large-scale gene expression data from various experimental conditions therefore play an important role in evaluating the statistical similarity of genome-wide gene expression patterns. Such data also allow us to construct a gene expression network (GEN).

To elucidate gene functions and their transcription factors in rice, we have added GEN information to OryzaExpress. The GEN in OryzaExpress was constructed by the correspondence analysis (CA) method (Yano et al. 2006), together with other statistical indices of expression-pattern similarity between gene pairs: Pearson correlation coefficients (PCCs), Mutual Ranks (MRs; see the ATTED-II database) and partial correlation coefficients (PACs).
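As an illustration of how PCCs and MRs relate, the following is a minimal Python sketch, not the R code used for OryzaExpress. It assumes the ATTED-II definition of MR as the geometric mean of the two directional PCC ranks of a gene pair. The function names and the toy probe names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mutual_ranks(expr):
    """Return {(a, b): MR} for every probe pair in expr.

    expr maps a probe name to its list of expression values.
    MR(a, b) = sqrt(rank of b in a's list * rank of a in b's list),
    where each list is ordered by descending PCC (assumed definition).
    """
    probes = sorted(expr)
    rank = {}
    for a in probes:
        others = [b for b in probes if b != a]
        # Most similar partner gets rank 1.
        others.sort(key=lambda b: pearson(expr[a], expr[b]), reverse=True)
        rank[a] = {b: i + 1 for i, b in enumerate(others)}
    return {
        (a, b): math.sqrt(rank[a][b] * rank[b][a])
        for i, a in enumerate(probes)
        for b in probes[i + 1:]
    }
```

A small MR means that each gene is among the other's closest partners, so MR is less sensitive than a raw PCC to genes that correlate weakly with almost everything.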

CA is a powerful tool for summarizing large-scale expression data and removing the effect of sample redundancy (for example, many replicates). CA provides coordinates (scores) for genes and samples in a low-dimensional space. The distance between a gene pair reflects the degree of similarity of their expression patterns: a short distance means similar expression patterns, and a long distance means different expression patterns. Distances can therefore be used as an index of expression-pattern similarity. In addition, CA takes from a few minutes to about 30 minutes, even for large-scale data with more than 50,000 probes and 600 samples. PCC calculation is simple, but it requires a long computation time (days to weeks). The PAC directly measures the strength of linear association between a gene pair. However, the highest-order PACs cannot be calculated on common analysis servers. If PACs are used to construct a GEN, the strength of linear association for each gene pair should be statistically tested using all first-order PACs. CA makes it possible to construct GENs correctly and quickly in any organism.
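To make the notion of a first-order PAC concrete, here is a minimal Python sketch using the standard formula for the correlation between x and y after controlling for a single third variable z. It is an illustration only, not the OryzaExpress implementation. The function names are hypothetical, and the statistical test mentioned above is not included.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def first_order_pac(x, y, z):
    """Partial correlation of x and y given z:

    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))

    Undefined (division by zero) if z is perfectly correlated with x or y.
    """
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

For a gene pair whose apparent correlation is driven entirely by a third gene, the first-order PAC given that gene is close to zero even when the PCC is high. This is why testing all first-order PACs for a pair can help filter out indirect associations.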

The microarray data for the GEN were collected from the NCBI GEO database. CEL files from the 'Affymetrix Rice Genome Array' platform are normalized and analyzed by CA, and distances between genes are calculated. The other indices (PCCs, MRs and PACs) are also obtained for each gene pair. The GEN is constructed from these statistical indices. In OryzaExpress, the rice GEN is accessible through an interactive graphical viewer. To allow comparison of GENs between rice and Arabidopsis, OryzaExpress also provides information on the Arabidopsis GEN (ATTED-II). The GEN data and integrated annotations in OryzaExpress allow users to extract information of interest effectively.

The current version of the Rice Gene Expression Network (RGEN) has been constructed from 1,893 samples in the NCBI GEO database.

GEN analyses

Collection of gene expression data from NCBI GEO
(Affymetrix GeneChip Rice Genome Array)
Normalization of microarray data using the RMA method
(Bioconductor and R)
CA and calculation of distances between each gene pair; calculation of Pearson's correlation coefficients (PCCs), Mutual Ranks (MRs) and partial correlation coefficients (PACs) between the expression patterns of each gene pair, using R
Construction of gene expression networks
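
The last two steps can be sketched as follows, in Python rather than the R used for the actual analysis. The sketch assumes that CA has already produced low-dimensional coordinates for each probe, that the distance is Euclidean, and that an edge is drawn when the distance falls below a cutoff. The function name, the cutoff and the probe names are hypothetical, and the thresholds used in OryzaExpress are not stated here.

```python
import math
from itertools import combinations


def build_network(scores, max_distance):
    """Return a list of edges (probe_a, probe_b, distance).

    scores maps a probe name to its CA coordinates (a tuple of floats).
    A pair becomes an edge when its Euclidean distance is at most
    max_distance (an assumed, illustrative rule).
    """
    edges = []
    for a, b in combinations(sorted(scores), 2):
        d = math.dist(scores[a], scores[b])
        if d <= max_distance:
            edges.append((a, b, d))
    return edges


# Toy coordinates in a two-dimensional CA space (made-up values).
toy_scores = {
    "p1": (0.0, 0.0),
    "p2": (0.3, 0.4),
    "p3": (3.0, 4.0),
}
toy_edges = build_network(toy_scores, max_distance=1.0)
```

With these toy values only the pair (p1, p2) is close enough to be connected; p3 stays isolated.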

Database construction

The GEN database was constructed using MySQL (http://www.mysql.com/) and PHP: Hypertext Preprocessor (http://www.php.net/) on a Linux web server.

Search functions

From "Search" in the top menu, GEN information can be searched by gene (probe) identifiers (e.g. Os.23107.1.A1.at) or biological annotation keywords (e.g. DNA binding). The search results are shown on a web page that lists the genes with similar expression profiles. The database also provides a search by similarity index (CA, PCCs, MRs and PACs). The functional annotations for each probe are easily accessed in OryzaExpress.
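
As a rough picture of what a probe-identifier search with a similarity cutoff might look like behind the web form, here is a small Python sketch. It uses an in-memory SQLite database as a stand-in for MySQL. The table name, column names, partner probe names and values are all hypothetical and do not reflect the actual OryzaExpress schema or data.

```python
import sqlite3


def search_partners(conn, probe_id, max_mr):
    """Return (partner_probe, mr) rows for probe_id with MR <= max_mr,
    ordered from most to least similar. Schema is hypothetical."""
    rows = conn.execute(
        "SELECT probe_a, probe_b, mr FROM gene_pair "
        "WHERE (probe_a = ? OR probe_b = ?) AND mr <= ? "
        "ORDER BY mr",
        (probe_id, probe_id, max_mr),
    ).fetchall()
    # Report the other member of each pair, whichever column it is in.
    return [(b if a == probe_id else a, mr) for a, b, mr in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gene_pair (probe_a TEXT, probe_b TEXT, pcc REAL, mr REAL)"
)
# Made-up pairs; only the query probe ID comes from the text above.
conn.executemany(
    "INSERT INTO gene_pair VALUES (?, ?, ?, ?)",
    [
        ("Os.23107.1.A1.at", "probe_B", 0.91, 2.0),
        ("probe_C", "Os.23107.1.A1.at", 0.85, 7.5),
        ("probe_B", "probe_C", 0.40, 120.0),
        ("Os.23107.1.A1.at", "probe_D", 0.20, 300.0),
    ],
)
hits = search_partners(conn, "Os.23107.1.A1.at", max_mr=10)
```

Storing each pair once and matching on either column keeps the table half the size of a symmetric layout, at the cost of a slightly more involved query.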


Search from IDs

Search IDs from Annotation

Search result

Viewers for GEN

Browsing gene pairs with similar expression patterns

From "Browsing" in the top menu, the numbers of probe pairs with similar expression patterns are shown. Red and blue font colors in the table indicate similar and reciprocal patterns, respectively.

Network Viewer

In the interactive viewer for the GEN, probes with similar expression patterns are shown in a network graph consisting of nodes (genes, probes) and edges (similarities). In the network graph, red and blue edges indicate similar and reciprocal patterns, respectively. Brief annotations of the genes in the network are given on the page. More detailed annotations are available through internal links to the detailed pages of OryzaExpress.

What's new

The data of 'Gene Expression Networks' were updated.
The microarray samples used to construct networks increased from 871 samples to 1,893 samples.


