Research Article

Open Access
Microarray Analysis of Differentially Expressed Genes Between Diabetes vs Healthy
Allam Appa Rao , Shyambabu M , Srinubabu Gedela*
*Corresponding authors: Dr. Srinubabu Gedela,
Phone   : +91-891-2844204,
Fax        : +91-891-2747969,  
Email    :  srinubabuau6@gmail.com
Received April 20, 2008; Accepted May 15, 2008; Published May 25, 2008
Citation: Allam AR, Shyambabu M, Srinubabu G (2008) Microarray Analysis of Differentially Expressed Genes Between Diabetes vs Healthy. J Proteomics Bioinform S1: S055-S084. doi:10.4172/jpb.s1000010
 
Copyright: © 2008 Allam AR, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
 
Abstract
Expression profiling of diabetic versus healthy subjects is a method of identifying genes potentially involved in the pathogenic process. Microarray analysis enables one to determine the relative level of expression of practically all genes in a genome, allowing the cellular plans for protein synthesis to be predicted. We therefore used microarray analysis to provide a list of genes that are differentially expressed between diabetic and healthy subjects. Statistical methodologies are employed for the interpretation of the microarray results. The present chapter introduces microarray analysis and the associated statistical methods, and then applies them in our present study of differentially expressed genes in diabetes vs. healthy.

Introduction
Completion of the human genome project, and the availability of complete genomes for model organisms, provided unprecedented prospects for the scientific community to investigate the greater mysteries of life at the molecular level, i.e. “from the bottom”. The availability of several genomic blueprints has allowed new approaches that are based on comprehensive molecular analyses (and which enhance the understanding of biological systems) to be devised, especially for biomedical applications. These new approaches offer the potential to describe specific types of genetic changes as well as patterns of altered gene expression and function that define, for instance, actual medical problems in the context of, but not entirely based on, symptoms. It is anticipated that these new methods will lead to the identification of previously unknown features of individual disease characteristics and will profile progression and response to treatment on a molecular basis. One of the most powerful tools that has been developed as a vehicle for carrying out such comprehensive analyses is the DNA microarray, or “gene chip”, which consists of a flat solid support with multiple probes that can be used to yield analytical signals (Suzuki et al., 2007).

Since its inception, DNA microarray technology has gained widespread popularity for several reasons, including the fact that it provides a global snapshot of an organism’s gene expression at a given point in time. This is important because it is widely believed that thousands of genes and their products in a given living organism function in concert, in a complicated and well-coordinated way, to support its activities. Thus, a technology that provides such a global picture enhances the understanding of the molecular-level biology of an organism and is highly desirable from that perspective. Traditional molecular biology methods have generally worked on a single-experiment basis, determining the functions of a specific gene under given physiological, chemical and/or biochemical conditions, which means that the throughput is very limited and a comprehensive picture is hard to obtain.

Biological Background

Perhaps one of the most fundamental biological precepts is the crucial role played by proteins as functional molecules of living cells. They are known to be responsible for energy production, biosynthesis of macromolecular components, maintenance of the structural architecture of the cell and response to external stimuli. Specialization of cellular functions occurs when certain specific proteins are produced to direct the essential activities of a given cell type. These proteins may be synthesized when the need arises for non-routine functions such as response to environmental insults. On the other hand, “housekeeping proteins” are required for basic processes such as replication, transcription, translation, protein folding and primary metabolism. Recently, Xi et al. (2007) used a high-throughput method called DNase-chip to identify 3,904 DNaseI HS sites from six cell types across 1% of the human genome. A significant number (22%) of DNaseI HS sites from each cell type are ubiquitously present among all cell types studied.


Although it is not clear at what levels housekeeping proteins are produced, there is general agreement that specialized proteins are produced in fluctuating concentrations, and are important for influencing many of the unique cellular dynamics. Thus, understanding the well-orchestrated molecular networks that control the synthesis, stability and degradation of these proteins is important in appreciating most vital biological functions of cells (Sato and Brivinalov, 2006). Understanding these regulatory networks provides insight into possible molecular interventions in cases of cellular malfunction. This is the driving force behind the advent of studies culminating in the recent high-throughput technologies in general molecular biology.

Consequently, two options are available for investigating the molecular dynamics of the cell: (i) analyzing the complete set of proteins in the cell (proteomics) or (ii) studying the variation in transcription of genetic information that leads to the production of these proteins (transcriptomics). While proteomics provides a snapshot of the status of the current molecular machinery of a cell, transcriptomics allows one to identify the cell’s strategy for protein synthesis in the conditions under which it is being investigated. The goals of both transcriptomics and proteomics are most often met using high-throughput technologies such as DNA microarrays and mass spectrometry. Although protein array technology has also been developed for proteomics, its use is currently not as widespread as its DNA counterpart. DNA microarrays enable one to determine the relative level of expression of practically all genes in a genome, allowing the cellular plans for protein synthesis to be predicted. The greater goal of genomics is to determine the functional pathways influenced by the interactions of all the expressed genes in a genome under a specific set of conditions. Unfortunately, this goal has not been met in the past, perhaps due to the lack of technological ability to survey a large number of gene transcripts or proteins simultaneously, and the scarcity of genes whose DNA sequences had been determined (Romero et al., 2006).

The genome is a blueprint for the biology of cells and its transcription is a regulatory step leading to cellular functional diversity. A genome is defined as the entire repertoire of genes in an organism’s chromosomes, while genes are described as sequences of DNA nucleotides capable of encoding biological information. Some genes encode proteins, others encode functional RNAs such as ribosomal RNAs and transfer RNAs, which are required for the translation process itself. Fluctuations in the amount of expressed genetic information lead to a cascade of events influencing the cell’s function. If such functions are routine, then it would be expected that the amount of genetic information expressed would stay relatively stable. In principle, for any given expressed gene in a cell, it is possible that a protein, whose function is required by the cell, will be synthesized. In practice, however, the quantitative correlation between gene expression and protein synthesis is quite poor due to differences in mRNA stabilities and translational efficiencies.

However, a comprehensive evaluation of whole-genome expression is expected to be very informative with respect to cell dynamics. Consequently, by evaluating fluctuations in the levels of thousands of expressed genes, greater confidence can be placed on inferences concerning the functional needs of the cell.

The molecular transmission of information in eukaryotes follows a pathway from DNA to RNA to proteins. The biological information provided in the DNA nucleotide sequence of a gene is transcribed into mRNA, which is ultimately translated into protein. The mRNA primary transcript is complementary to the DNA template sequence and must be correctly spliced to remove non-coding intronic sequences in order to yield the mature mRNA, which consists of the information-coding (exonic) segments of a gene. In addition to the coding region, the mature transcript contains a 5’ untranslated region (UTR), a 3’ UTR and a polyadenylation signal which specifies the addition of a polyadenosine tail to the 3’ end. Translation into proteins is performed on ribosomes and starts at an initiator methionine codon (AUG in the mRNA, written ATG in the DNA sequence). An initiator transfer RNA (tRNA) forms a complex that results in the beginning of the nascent peptide. Sequentially, complexes are formed between codons and the appropriately charged tRNAs, and amino acids are added (with the ribosome moving from codon to codon along the mRNA) until a stop codon is encountered. The order of amino acids added during translation is determined by the order of codons on the mRNA between a start codon and a stop codon, known as an open reading frame (ORF). A caricature of this process is shown in Figure 1.1.

Figure 1.1: Molecular transmission of information in eukaryotes starts from the transcription of a gene to RNA, followed by splicing to eliminate the non-information-coding introns, and ultimately the translation of mRNA to a functional protein.
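The notion of an open reading frame described above can be illustrated with a minimal Python sketch. It works on the coding-strand DNA sequence (hence ATG rather than AUG), considers only the first ATG it finds, and ignores splicing and UTRs entirely; the function name `first_orf_codons` and the example sequence are hypothetical and not part of the original study.

```python
# Standard stop codons, written in DNA (coding-strand) notation.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_orf_codons(dna):
    """Return the codons of the first open reading frame in a DNA string.

    Scans for the first ATG, then reads triplets in frame until a stop
    codon is met. Returns [] if there is no start codon or no in-frame
    stop codon.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    if start == -1:
        return []
    codons = []
    for i in range(start, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        codons.append(codon)
        if codon in STOP_CODONS:
            return codons  # start codon ... stop codon, inclusive
    return []  # ran off the end without meeting a stop codon


# Illustrative sequence only (not real data).
example_orf = first_orf_codons("ccATGGCTTAAgg")
```

Each codon between the start and the stop codon would correspond to one amino acid in the translated protein.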


Single – Channel Microarrays
Single-channel microarrays represent perhaps some of the best known commercial platforms for DNA microarray technology, epitomized by the Affymetrix GeneChips (Downey et al., 2006). These are made by synthesizing, in situ, thousands of short nucleotide sequences based on ESTs, cDNAs or genomic DNA on silicon wafers. For purposes of expression monitoring, fluorescently labeled cDNAs are hybridized to the array to allow probe-target interactions through base-pairing.

Although these arrays have a number of positive features, there are also several drawbacks. Perhaps chief among these is cost, since the technology is currently proprietary and therefore not subject to market influences. Another important limitation is that the availability of the arrays is restricted to a small number of specific organisms that have been extensively sequenced and that are of general interest. Layout designs are standardized, although custom arrays can be produced at a cost. The requirement of knowledge of exact DNA sequences for the probes has also put these arrays at a relative disadvantage in terms of the discovery of novel genes. In addition, due to the short lengths of the probes, it is anticipated that, when attached to a surface, the bases nearest to the surface will be sterically inaccessible for duplex formation with complementary molecules in the mixture.

Although single-channel arrays are widely used, the focus of our present study will be on two-channel arrays, which appear to have established themselves to a greater extent in research and academic laboratories because of their lower cost and greater flexibility.

Two-Channel Microarrays
The basis of two-channel microarray platforms is the comparison of mRNA abundance in similar cell samples under two distinct physiological conditions on a single chip (Sjogren et al., 2007). The approach for accomplishing this can be described in four individual steps. First, mRNA from samples under two conditions, where one condition is taken to be the reference (e.g. the normal physiological state), is independently extracted. The amount of mRNA in the two samples is usually normalized through absorbance measurements of the total RNA. In the second step, mRNA from the two extracts is separately copied into cDNA in vitro, using an enzymatic reaction known as reverse transcription. During this synthesis, a deoxynucleotide triphosphate labeled with one of two fluorophores (red or green), or an aminoallyl deoxynucleotide triphosphate that is subsequently chemically coupled to a fluorophore, is added to the respective reaction mixtures and is incorporated into the synthesized cDNA. Third, equal aliquots of the two labeled cDNAs are co-hybridized onto a single array containing single-stranded DNA probes. Finally, fluorescence signals emitted by the targets are collected when the array is scanned with lasers set at wavelengths corresponding to the excitation frequencies of the two fluorophores. For every hybridization experiment, the emitted fluorescence is captured and stored as a 16-bit tagged image file format (TIFF) image. The relative abundance of mRNAs in the two samples is calculated as the ratio of the fluorescence intensities of the two dye-labeled cDNAs that hybridized with each probe. The general experimental setup is represented in Figure 1.2.

Figure 1.2: Spotted microarray experimental setup. mRNA extracts (targets) from cells under two distinct physiological conditions are obtained, reverse transcribed to cDNA and then labeled with different (Cy3 and Cy5) dyes. Equal aliquots of the dye-labeled nucleic acid extracts are combined and applied to a glass substrate onto which single-stranded cDNA or oligonucleotide complements (probes) of the mRNA (dye-labeled cDNA) are immobilized.


The theory is that each probe will recognize and bind all of its complementary partners in the sample through base pairing, since the probes are in relative excess. The non-hybridized transcripts are subsequently washed off so that the emitted fluorescence is exclusively due to hybridized targets.

The principle of co-hybridization of transcripts and determination of relative rather than absolute amounts of transcripts is a consequence of the practical aspects of the experimental setup for the spotted microarray platform. Relating the measured fluorescence intensity of hybridized transcripts to absolute gene expression levels is impractical because (a) the concentration and length of probes among spots on a slide are variable, (b) probe attachment is susceptible to aberrations that lead to non-uniform spot morphologies, and (c) reference standards containing known amounts of transcription products are not generally available. Regarding (a), variation in the amount of probe can occur when the probes are obtained from a library of expressed genes that vary in length. While (b) is not a concern with in situ synthesized microarrays, it is a fundamental problem in spotted microarrays. Spotting of probes is performed robotically using pins (print heads) that pick up DNA from 96- or 384-well microtitre plates by capillary action. These deposit probe aliquots sequentially onto many glass microarray slides. Due either to non-uniform surface properties of the glass slides, or to wear of the print heads over time, the shapes of the spots may vary across a slide and among slides. Thus, when the fluorescence intensity is evaluated for each spot, it is common for such morphological anomalies to result in high signal variability. Finally, in view of (c), the lack of reference standards leads to the situation where one of the physiological conditions from which the two cell samples are derived must be considered as a reference state or control.

This allows transcriptional re-adjustments in the cells under perturbed chemical or physical environments to be evaluated based on this reference. Thus, analysis of two-channel microarrays involves computing the relative fluorescence intensities of the two dyes for each probe, where the reference sample acts as an internal standard. Ratios are believed to alleviate potential experimental variability resulting from unequal concentrations of probe, cross-hybridization and micro-spotting anomalies. Although this may mitigate some of the variability, an awareness of the other sources of error is important in appreciating the context in which two-color microarrays are measured and analyzed.

One of the most widely used methods for ratio calculation  is the ratio of medians. This is a method where differential expression is measured as a ratio of the median of pixel intensities  within a spot mask for both dyes. The median is intended to represent  the center for the distribution of pixel intensities comprised  in the spot mask. Perhaps one of the major advantages of this  approach is that the measured ratios are robust to influence from  a few pixels with extreme values at either end of the distribution.  Unfortunately, when spots are characterized by substantial regions  (>50%) of low intensity pixels, as in the case of “donuts”, it  is anticipated that the low intensity pixels will dominate the spot  mask and result in ratios with a high uncertainty.

Another common measure of differential expression involves evaluating the ratio of the means of pixel intensities within the spot mask. Calculation of mean values is straightforward and less affected by extended regions of low-intensity fluorescence, but means are more susceptible to the influence of extreme values at either end of a population, i.e., outliers in the pixel population. For this reason, the ratio of means is generally less robust. A less frequently used approach to measuring the relative fluorescence is to calculate pixel-by-pixel ratios of intensities across the spot and then report the differential expression as the arithmetic mean or median of the ratios. This is referred to as the “mean of ratios” or “median of ratios”, respectively (Bakewell and Wit, 2005). A major drawback of this approach, especially when using means, is the high sensitivity of the summary statistic to individual pixel values.
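The four spot summaries discussed above can be compared side by side in a short Python sketch. The pixel values are made up for illustration, the function name `spot_ratios` is hypothetical, and the sketch assumes background-subtracted, strictly positive intensities for pixels already assigned to the spot mask.

```python
from statistics import mean, median


def spot_ratios(test_pixels, ref_pixels):
    """Summarize one spot from paired pixel intensities of the two dyes."""
    if len(test_pixels) != len(ref_pixels) or not test_pixels:
        raise ValueError("need two non-empty pixel lists of equal length")
    # Pixel-by-pixel ratios (assumes no reference pixel is zero).
    pixel_ratios = [t / r for t, r in zip(test_pixels, ref_pixels)]
    return {
        "ratio_of_medians": median(test_pixels) / median(ref_pixels),
        "ratio_of_means": mean(test_pixels) / mean(ref_pixels),
        "mean_of_ratios": mean(pixel_ratios),
        "median_of_ratios": median(pixel_ratios),
    }


# Made-up spot: four well-behaved pixels and one saturated test pixel.
test_pixels = [200, 210, 190, 205, 4000]
ref_pixels = [100, 105, 95, 100, 100]
summary = spot_ratios(test_pixels, ref_pixels)
```

In this toy spot the single extreme pixel pulls both mean-based summaries far away from the median-based ones, which is the robustness argument made in the text.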

Experimental Design Issues
One of the unfortunate consequences of the technical and conceptual simplicity of microarray technology is its capacity to yield data sets that are biased by inadequate design considerations. In the absence of well-established experimental designs for microarrays, poorly designed experiments continue to yield multiply-confounded data with which one is unable to answer the question for which the experiment was conducted. The general objective of designing an experiment is to curtail the effects of confounding factors by generating data that span rich and diverse sample spaces, have minimum effects of unwanted variation and provide the potential for maximum efficiency for probing the hypotheses under investigation. Yet, in microarrays, there is often the false hope that, due to the volume of data generated per experiment, confounding factors and unwanted variation will be somewhat mitigated. The focus of interest in microarray studies is typically genes that are differentially expressed in different subjects, different tissues, cells exposed to varying physical or biochemical conditions, or those undergoing growth, development, and degeneration. Some of the common reasons for evaluating these variables are to discover the roles of genes in an organism, to group genes according to common functions, to understand the relationships among genes in a biological system (systems biology), to classify biological specimens (e.g. tumor cells) on the basis of gene expression, and to identify important biomarkers in disease progression.

Thus, analysis of these experiments involves identification of genes that display uncharacteristic tendencies of increased or decreased expression, and achieving this goal requires careful experimental design to avoid spurious observations confounded by unrelated experimental variables at multiple levels. Microarray experiments can be regarded as multilayered in the sense that they involve several nested levels at which variability may be introduced. In general, microarray experiments must be designed in three layers: (1) the selection of experimental units, (2) the design of mRNA extraction, labeling and hybridization, and (3) the arrangement of probes on the glass slides. Whereas the first layer controls the span of the biological design space, the second and third layers account for the analytical (technical) variability at the lower levels of the experimental process and will be the focus of this section.

Higher Level Data Analysis
At the primary level of data analysis, which might be considered as data preprocessing from a chemometrics perspective, the steps are largely the same from one application to another: gridding and segmentation, flagging, image processing, background subtraction, ratio calculation and normalization. Although the details of these steps may differ, in the end the usual result is a vector of ratios and their associated gene identifiers for a series of samples, forming a two-way data matrix for further analysis. At this stage, a variety of methods can be used to coax the desired information from the data, depending on the nature of the experiment. Typical goals include: (1) the identification of genes exhibiting differential expression (up- or down-regulation) relative to some reference state, (2) the clustering or classification of genes based on their expression across multiple samples, (3) the identification of genes that may be used as biological markers (e.g. for a mutation, a disease, or resistance to some medication), and (4) elucidation of gene function and mechanisms of interaction, i.e. gene networks. In these studies, the term “expression profile” is generally used to describe the normalized ratio (test / reference) or log-ratio of signals across all genes for a sample represented on a particular microarray. From a chemometrics point of view, it could be considered a kind of “genetic spectrum”, except that there is no naturally contiguous ordering of channels. Relative changes, not absolute ratios, are important in time course experiments. In other words, a change of 0.5 to 1 is equivalent to a change of 1 to 2. Nonetheless, a consistent point of reference should be chosen. It is also important to note that, due to the proportional error structure, it becomes more useful to determine the normalization factor, a, through a regression of the ratios on the log scale using the model:

$$\log_2 y_i = \log_2 a + \log_2 x_i + \varepsilon_i$$

where $\varepsilon_i$ is the error term for the i-th gene.
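A minimal sketch of how the normalization factor could be estimated from the log-scale model is given below. It assumes that x and y are the paired intensities of the two channels for the same probes (the source does not spell this out) and that the slope is fixed at one, in which case the least-squares estimate of log2 a is simply the mean log-ratio. The function name and the intensities are illustrative.

```python
import math
from statistics import mean


def normalization_factor(x, y):
    """Least-squares estimate of a in log2(y) = log2(a) + log2(x) + error.

    With the slope fixed at 1, the estimate of log2(a) is the mean of
    log2(y_i) - log2(x_i); a is obtained by back-transforming.
    """
    offsets = [math.log2(yi) - math.log2(xi) for xi, yi in zip(x, y)]
    return 2 ** mean(offsets)


# Made-up intensities in which channel y is uniformly 1.5 times channel x.
x = [100.0, 200.0, 400.0, 800.0]
y = [150.0, 300.0, 600.0, 1200.0]
a = normalization_factor(x, y)
normalized_y = [yi / a for yi in y]

# On the log scale a change from 0.5 to 1 equals a change from 1 to 2.
log_change_low = math.log2(1.0 / 0.5)
log_change_high = math.log2(2.0 / 1.0)
```

Working on the log scale treats a halving and a doubling symmetrically, which matches the proportional error structure mentioned in the text.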

Methods Employed in the present study
1.1 Mahalanobis distance
The shape and the size of multivariate data are quantified by the covariance matrix, which is taken into account in the Mahalanobis distance. Thus, for a multivariate sample $x_{ij}$, where i = 1, 2, 3, ..., n (number of genes) and j = 1, 2, 3, ..., p (number of samples), the Mahalanobis distance of the i-th gene is defined as

$$MD_i = \left( (x_i - m)^T C^{-1} (x_i - m) \right)^{0.5} \qquad (1)$$

where $x_i = (x_{i1}, \ldots, x_{ip})^T$ is the vector of expression values of gene i, m is the estimated multivariate location parameter and C is the estimated covariance matrix. For multivariate normal data, the squared MD values are approximately chi-square distributed with p degrees of freedom. Multivariate outliers can now be defined as the observations having large (squared) MD values. A quantile of the chi-square distribution can be fixed (say 95%) and the observations with squared MD values greater than the chi-square cut-off at 95% are considered as outliers. The location and covariance parameters are estimated using robust estimation methods. One of the well-known estimation methods, the Minimum Covariance Determinant (MCD), has been used in the study.
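The outlier rule in equation (1) can be sketched for the bivariate case (p = 2) using only the Python standard library. Note the substitutions: this sketch uses the classical sample mean and covariance rather than the robust MCD estimates used in the study, and it relies on the closed-form chi-square quantile that holds for two degrees of freedom. The function name and the toy data are hypothetical.

```python
import math
from statistics import mean


def mahalanobis_outliers(points, quantile=0.95):
    """Flag bivariate outliers by squared Mahalanobis distance.

    points: list of (x, y) pairs. Uses the classical mean and covariance
    (NOT the MCD estimator) and a chi-square cut-off with 2 d.f.
    Returns (squared_distances, cutoff, flags).
    """
    n = len(points)
    mx = mean(p[0] for p in points)
    my = mean(p[1] for p in points)
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    squared = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # (dx, dy) C^-1 (dx, dy)^T with the 2x2 inverse written out.
        squared.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    # Chi-square quantile for 2 d.f.: CDF(t) = 1 - exp(-t / 2).
    cutoff = -2.0 * math.log(1.0 - quantile)
    flags = [d > cutoff for d in squared]
    return squared, cutoff, flags


# Toy data: 19 points close to the line y = x plus one clear outlier.
points = [(float(i), float(i) + 0.1 * (-1) ** i) for i in range(19)]
points.append((9.0, 20.0))
squared, cutoff, flags = mahalanobis_outliers(points)
```

With classical estimates a single extreme point inflates the covariance and can partly mask itself, which is one reason a robust estimator such as MCD is preferable in practice.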

Methodology
This analysis was performed at Ocimum Biosolutions (www.ocimumbio.com), Hyderabad, using Genowiz™, their proprietary microarray and pathway analysis tool. Genowiz™ is a gene expression analysis and tracking tool that enables researchers to analyze microarray data in an intuitive and comprehensive bioenvironment. It includes novel quantification matrices and algorithms that facilitate expression pattern analysis and give an insight into metabolic pathways. It offers an easy-to-use customizable interface and allows integration of biotools and a laboratory information management system. Genowiz™ incorporates a broad range of analysis tools for the efficient analysis of microarray data.

Analysis Requirements
Identification of differentially expressed genes for Diabetic vs. Healthy.
Identification of differentially expressed genes for Diabetic vs. Obese.
Identification of differentially expressed genes for Diabetic with family history vs. Diabetic without family history.

Analysis Performed
Six samples were hybridized on the Human 40 K OchiChip Array. Gene expression values were obtained after quantification of the TIFF images. The data comprise 40,320 × 6 data points (40,320 probes for each of the six samples). Empty spots and control probes were removed before proceeding with data analysis.

Analysis Process Involved
1. Differential expression analysis.
2. Functional classification of differentially  expressed genes.

Differential Expression Analysis
The primary objective of any microarray study is to assess the mRNA transcript levels of samples under different experimental conditions. A fundamental question is which of the thousands of genes show a significant difference in expression level across the samples. When the number of replicates for each condition is adequate, the identification of differentially expressed genes is meaningful. However, in the majority of experiments, there are no or only limited replicates due to practical constraints of cost and feasibility. In that case, appropriate statistical techniques are required to furnish realistic information on the differentially expressed (DE) genes.

For experiments with a single sample in each condition, we assume that the log intensity values of gene expression for the two samples are linearly related, following a bivariate normal distribution contaminated with outliers. In a contaminated bivariate distribution, the main body of the data is characterized by the bivariate normal distribution and constitutes the regular observations. The non-regular observations, described as outliers, represent systematic deviations. These outliers are often suspected as possible candidates for differentially expressed genes (Loguinov et al., 2004; Zao et al., 2004). Here we use an exploratory two-stage approach: first detecting outliers from the bivariate population and then determining differentially expressed candidates from these outliers. The approach provides a fold-change value that takes the scatter of the observations into account and thereby provides the up- and down-regulated genes across the samples. In the present context, there are six individuals: one from each of the categories healthy (H), healthy with obesity (H&O), obesity only (O) and diabetes with parental history (D&PH), and two individuals having diabetes without parental history (D&NPH1 and D&NPH2). The expression levels of 39400 genes for each individual were obtained and compared pairwise, resulting in fifteen combinations. The analysis was carried out for each of these combinations independently, following the approach stated above. Prior to analysis, the data for each combination were normalized using loess normalization. Below we present the analysis for each combination along with the interpretations.
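The count of fifteen pairwise comparisons follows directly from choosing two of the six individuals, as the short Python sketch below illustrates. The labels are the abbreviations used in the text; the variable names are arbitrary.

```python
from itertools import combinations

# The six individuals, labelled with the abbreviations used in the text.
samples = ["H", "H&O", "O", "D&PH", "D&NPH1", "D&NPH2"]

# Every unordered pair of individuals: 6 choose 2 comparisons.
pairs = list(combinations(samples, 2))
number_of_comparisons = len(pairs)
```

Five of these pairs involve the healthy individual (H) as reference; the remaining ten compare the other individuals with each other.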

1. Healthy (reference) vs. Healthy with Obesity (test sample)  (H vs H&O)
In the first step, the log intensity values of the gene expression for the two samples were preprocessed using the loess method, in order to remove any measurement bias in the experiment (Figure 3.1.1).

Figure  3.1.1:  MA-plots showing scatter of expression  values before and after loess normalization for healthy  vs. healthy with obesity comparison.
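For readers unfamiliar with MA-plots, the sketch below shows how the plotted quantities are conventionally derived from a pair of intensity vectors: M is the log2 ratio and A is the average log2 intensity. The loess fit itself is not implemented here; in loess normalization a smooth curve of M against A is estimated and subtracted from M. The function name and the intensities are illustrative assumptions.

```python
import math


def ma_values(test, ref):
    """Return (M, A) lists for paired, strictly positive intensities.

    M = log2(test) - log2(ref)          (log ratio, vertical axis)
    A = (log2(test) + log2(ref)) / 2    (average log intensity, horizontal axis)
    """
    m = [math.log2(t) - math.log2(r) for t, r in zip(test, ref)]
    a = [0.5 * (math.log2(t) + math.log2(r)) for t, r in zip(test, ref)]
    return m, a


# Made-up intensities: the first gene is four-fold higher in the test sample.
m_values, a_values = ma_values([1024.0, 256.0], [256.0, 256.0])
```

After a successful normalization the bulk of the M values would be expected to scatter around zero across the whole range of A.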


Upon normalizing the expression values for the two samples, the scatter plot of log intensity values was obtained, as shown in Figure 3.1.2.

Figure  3.1.2:   Scatter plot of log intensities for healthy vs.  healthy with obesity comparison after loess normalization.


The scatter plot shows the bivariate distribution along with the contaminating observations (genes), i.e. the outliers. The Mahalanobis distance measure was used to identify outliers for p = 0.10. Thus, out of 39400 genes, 3940 genes (10%) were identified as outliers, as indicated by the red spots in Figure 3.1.3.

Figure  3.1.3:   Bivariate outliers based on Mahalanobis distance  measure for p=0.10 for healthy vs. healthy with obesity comparison.


The distribution of log fold change values was obtained and the outliers were detected for the optimum cut-off value (c*) (Figure 3.1.4).

Figure  3.1.4:  The thresholds for 2.36 and 2 fold change  values. The green spots are the differentially expressed outlier genes for healthy vs. healthy with obesity comparison.


On similar lines, the analysis was carried out for the  remaining fourteen comparisons. The figures for each comparison  are given below followed by the table showing the  percentage of differentially expressed genes for the modified  fold change and the conventional 2-fold change.
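The second stage, in which outlier genes are split into up- and down-regulated candidates by a fold-change threshold, can be sketched as follows. The source does not state how the optimum cut-off c* is computed, so the sketch simply takes a fold-change value as input and assumes it is applied symmetrically on the log2 scale; the function name and the log ratios are made up.

```python
import math


def classify_by_fold_change(log2_ratios, fold_cutoff):
    """Split genes into up- and down-regulated by a symmetric fold cut-off.

    log2_ratios: log2(test / reference) per gene.
    fold_cutoff: e.g. 2 for the conventional 2-fold rule.
    Returns (indices_up, indices_down).
    """
    limit = math.log2(fold_cutoff)
    up = [i for i, m in enumerate(log2_ratios) if m >= limit]
    down = [i for i, m in enumerate(log2_ratios) if m <= -limit]
    return up, down


# Made-up log2 ratios for five outlier genes.
ratios = [1.5, -1.1, 0.2, -1.4, 1.1]
up_2fold, down_2fold = classify_by_fold_change(ratios, 2.0)
up_strict, down_strict = classify_by_fold_change(ratios, 2.36)
```

A cut-off above 2, such as the 2.36 reported for the first comparison, necessarily selects no more genes than the conventional 2-fold rule, which is consistent with the smaller gene counts reported for the modified fold change.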

2. Healthy (reference) vs Obesity (test sample) [H vs O]

Figure 3.2.1: MA-plots showing scatter of expression values before and after loess normalization for healthy vs. obesity comparison.


Figure  3.2.2: Scatter plot of log intensities for healthy vs.  obesity comparison after loess normalization.


Figure  3.2.3:  Bivariate outliers based on Mahalanobis distance  measure for p=0.10 for healthy vs. obesity comparison.


Figure 3.2.4: The thresholds for 2.94 and 2 fold change values. The green spots are the differentially expressed outlier genes for healthy vs. obesity comparison.


The distribution of log fold change values was obtained and the outliers were detected for the optimum cut-off value (c*). Figure 3.2.4 shows the thresholds for 2.94-fold change, thereby providing the up- and down-regulated genes. Out of 3940 outlier genes, 962 were detected as up-regulated, while 989 were detected as down-regulated genes with respect to the healthy (H) individual. Thus, for the healthy vs. obesity comparison, 1951 genes were found to be differentially expressed out of 39400, which amounts to 4.9% of the total genes under study. This is 6% less than the number of genes obtained for 2-fold change thresholds.

3. Healthy (reference) vs Diabetic with no Parental History  (1) (test sample) [H vs D&NPH1]

Figure  3.3.1: MA-plots showing scatter of expression values before and after loess normalization for healthy vs. diabetic with no parental history (1) comparison.


Figure 3.3.2: Scatter plot of log intensities for healthy vs. diabetic with no parental history (1) comparison after loess normalization.


Figure  3.3.3: Bivariate outliers based on Mahalanobis distance  measure for p=0.10 for healthy vs. diabetic with no parental history (1) comparison.


Figure  3.3.4: The thresholds for 2.37 and 2 fold change values.  The green spots are the differentially expressed outlier genes  for healthy vs. diabetic with no parental history (1).


The distribution of log fold change values was obtained and the outliers were detected for the optimum cut-off value (c*). Figure 3.3.4 shows the thresholds for 2.37-fold change, thereby providing the up- and down-regulated genes. Out of 3940 outlier genes, 1249 were detected as up-regulated, while 477 were detected as down-regulated genes with respect to the healthy (H) individual. Thus, for the healthy vs. diabetic with no parental history (1) comparison, 1726 genes were found to be differentially expressed out of 39400, which amounts to 4.3% of the total genes under study. This is 3.3% less than the number of genes obtained for 2-fold change thresholds.

4. Healthy (reference) vs Diabetic with no Parental History (2) (test sample) [H vs D&NPH2]

Figure  3.4.1: MA-plots showing scatter of expression values  before and after loess normalization for healthy vs. diabetic  with no parental history (2) comparison.


Figure  3.4.2: Scatter plot of log intensities for healthy vs. diabetic  with no parental history (2) comparison after loess normalization.


Figure  3.4.3: Bivariate outliers based on Mahalanobis distance  measure for p=0.10 for healthy vs. diabetic with no parental history (2) comparison.


The distribution of log fold change values was obtained and the outliers were detected for the optimum cut-off value (c*). Figure 3.4.4 shows the thresholds for 2.96-fold change, thereby providing the up and down regulated genes. Out of 3940 outlier genes, 861 were detected as up-regulated, while 356 were detected as down-regulated genes with respect to the healthy (H) individual. Thus, for healthy vs. diabetic with no parental history (2) comparison, 1217 genes were found to be differentially expressed out of 39400, which amounts to 3% of the total genes under study. This is 7% less than the number of genes obtained for 2-fold change thresholds.
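The bivariate outlier step named in the figure captions (Mahalanobis distance, p=0.10) can be sketched as below. The exact selection rule is not spelled out in this section, so the sketch makes an assumption: it flags the fraction p of genes with the largest squared Mahalanobis distance in the plane of the two log intensities. That choice is consistent with 3940 outliers out of 39400 genes in every comparison. A common alternative is a chi-square cut-off, which for two dimensions equals -2 ln(p). All names are hypothetical.

```python
def mahalanobis_sq(x, y):
    """Squared Mahalanobis distance of each (x, y) point from the sample mean."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / (n - 1)
    syy = sum((b - my) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    det = sxx * syy - sxy ** 2  # determinant of the 2x2 covariance matrix
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    d2 = []
    for a, b in zip(x, y):
        dx, dy = a - mx, b - my
        # (dx, dy) times the inverse covariance times (dx, dy) transposed
        d2.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return d2


def flag_outliers(x, y, p=0.10):
    """Indices of the fraction p of points with the largest distances."""
    d2 = mahalanobis_sq(x, y)
    n_out = int(round(p * len(d2)))
    order = sorted(range(len(d2)), key=lambda i: d2[i], reverse=True)
    return set(order[:n_out])
```

Genes flagged this way lie far from the main diagonal cloud of the scatter plot, so they are the candidates to which the fold-change thresholds are then applied.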

Figure 3.4.4: The thresholds for 2.96 and 2 fold change values. The green spots are the differentially expressed outlier genes for healthy vs. diabetic with no parental history (2) comparison.


5. Healthy (reference) vs Diabetic with Parental History (test sample) [H vs D&PH]

Figure  3.5.1: MA-plots showing scatter of expression values  before and after loess normalization for healthy vs. diabetic  with parental history comparison.


Figure  3.5.2: Scatter plot of log intensities for healthy vs.  diabetic with parental history comparison after loess normalization.


Figure  3.5.3: Bivariate outliers based on Mahalanobis distance measure  for p=0.10 for healthy vs. diabetic with parental history comparison.


The distribution of log fold change values was obtained and the outliers were detected for the optimum cut-off value (c*). Figure 3.5.4 shows the thresholds for 2.36-fold change, thereby providing the up and down regulated genes. Out of 3940 outlier genes, 1211 were detected as up-regulated, while 368 were detected as down-regulated genes with respect to the healthy (H) individual. Thus, for healthy vs. diabetic with parental history comparison, 1579 genes were found to be differentially expressed out of 39400, which amounts to 4% of the total genes under study. This is 2.73% less than the number of genes obtained for 2-fold change thresholds.

Figure 3.5.4: The thresholds for 2.36 and 2 fold change values. The green spots are the differentially expressed outlier genes for healthy vs. diabetic with parental history comparison.


6. Healthy with Obesity (reference) vs Obesity (test sample)  [H&O vs O]

Figure 3.6.1: MA-plots showing scatter of expression values before and after loess normalization for healthy with obesity vs. obesity comparison.


Figure  3.6.2: Scatter plot of log intensities for healthy with  obesity vs. obesity comparison after loess normalization.


Figure  3.6.3: Bivariate outliers based on Mahalanobis distance  measure for p=0.10 for healthy with obesity vs. obesity comparison.


The distribution of log fold change values was obtained and the outliers were detected for the optimum cut-off value (c*). Figure 3.6.4 shows the thresholds for 2.38-fold change, thereby providing the up and down regulated genes. Out of 3940 outlier genes, 814 were detected as up-regulated, while 1469 were detected as down-regulated genes with respect to the healthy individual with obesity (H&O). Thus, for healthy with obesity vs. obesity comparison, 2283 genes were found to be differentially expressed out of 39400, which amounts to 5.8% of the total genes under study. This is 2.6% less than the number of genes obtained for 2-fold change thresholds.

Figure  3.6.4: The thresholds for 2.38 and 2 fold change values.  The green spots are the differentially expressed outlier genes  for healthy with obesity vs. obesity comparison.


7. Healthy with Obesity (reference) vs Diabetic with no  Parental History (1) (test sample) [H&O vs D&NPH1]

Figure 3.7.1: MA-plots showing scatter of expression values before and after loess normalization for healthy with obesity vs. diabetic with no parental history (1) comparison.


Figure 3.7.2: Scatter plot of log intensities for healthy with obesity vs. diabetic with no parental history (1) comparison after loess normalization.


Figure  3.7.3: Bivariate outliers based on Mahalanobis distance  measure for p=0.10 for healthy with obesity vs. diabetic with no  parental history (1) comparison.


The distribution of log fold change values was obtained and the outliers were detected for the optimum cut-off value (c*). Figure 3.7.4 shows the thresholds for 2.14-fold change, thereby providing the up and down regulated genes. Out of 3940 outlier genes, 539 were detected as up-regulated, while 1058 were detected as down-regulated genes with respect to the healthy individual with obesity (H&O). Thus, for healthy with obesity vs. diabetic with no parental history (1) comparison, 1597 genes were found to be differentially expressed out of 39400, which amounts to 4% of the total genes under study. This is 1.2% less than the number of genes obtained for 2-fold change thresholds.

Figure  3.7.4: The thresholds for 2.14 and 2 fold change values.  The green spots are the differentially expressed outlier genes  for healthy with obesity vs. diabetic with no parental history (1)  comparison.


8. Healthy with Obesity (reference) vs Diabetic with no Parental History (2) (test sample) [H&O vs D&NPH2]

Figure  3.8.1: MA-plots showing scatter of expression  values before and after loess normalization for healthy with  obesity vs. diabetic with no parental history (2) comparison.


Figure  3.8.2: Scatter plot of log intensities for healthy with  obesity vs. diabetic with no parental history (2) comparison  after loess normalization.


Figure  3.8.3: Bivariate outliers based on Mahalanobis distance  measure for p=0.10 for healthy with obesity vs. diabetic with no  parental history (2) comparison.


The distribution of log fold change values was obtained and the outliers were detected for the optimum cut-off value (c*). Figure 3.8.4 shows the thresholds for 2.43-fold change, thereby providing the up and down regulated genes. Out of 3940 outlier genes, 541 were detected as up-regulated, while 672 were detected as down-regulated genes with respect to the healthy individual with obesity (H&O). Thus, for healthy with obesity vs. diabetic with no parental history (2) comparison, 1213 genes were found to be differentially expressed out of 39400, which amounts to 3% of the total genes under study. This is 2.75% less than the number of genes obtained for 2-fold change thresholds.

Figure  3.8.4: The thresholds for 2.43 and 2 fold change values.  The green spots are the differentially expressed outlier genes  for healthy with obesity vs. diabetic with no parental history (2)  comparison.


9. Healthy with Obesity (reference) vs Diabetic with  Parental History (test sample) [H&O vs D&PH]

Figure  3.9.1: MA-plots showing scatter of expression  values before and after loess normalization for healthy with  obesity vs. diabetic with parental history comparison.


Figure  3.9.2: Scatter plot of log intensities for healthy with obesity  vs. diabetic with parental  history comparison after loess normalization.


Figure  3.9.3: Bivariate outliers based on Mahalanobis distance  measure for p=0.10 for healthy with obesity vs. diabetic with parental  history comparison.


The distribution of log fold change values was obtained and the outliers were detected for the optimum cut-off value (c*). Figure 3.9.4 shows the thresholds for 2.07-fold change, thereby providing the up and down regulated genes. Out of 3940 outlier genes, 502 were detected as up-regulated, while 822 were detected as down-regulated genes with respect to the healthy individual with obesity (H&O). Thus, for healthy with obesity vs. diabetic with parental history comparison, 1324 genes were found to be differentially expressed out of 39400, which amounts to 3.3% of the total genes under study. This is 0.05% less than the number of genes obtained for 2-fold change thresholds.

Figure  3.9.4: The thresholds for 2.07 and 2 fold change values.  The green spots are the differentially expressed outlier genes  for healthy with obesity vs. diabetic with parental history  comparison.


10. Obesity (reference) vs Diabetic with no Parental  History (1) (test sample) [O vs D&NPH1]

Figure  3.10.1: MA-plots showing scatter of expression  values before and after loess normalization for obesity vs.  diabetic with no parental history (1) comparison.


Figure 3.10.2: Scatter plot of log intensities for obesity vs. diabetic with no parental history (1) comparison after loess normalization.


Figure  3.10.3: Bivariate outliers based on Mahalanobis distance  measure for p=0.10 for obesity vs. diabetic with no parental  history (1) comparison.


The distribution of log fold change values was obtained and the outliers were detected for the optimum cut-off value (c*). Figure 3.10.4 shows the thresholds for 2.07-fold change, thereby providing the up and down regulated genes. Out of 3940 outlier genes, 1479 were detected as up-regulated, while 1333 were detected as down-regulated genes with respect to the individual with obesity (O). Thus, for obesity vs. diabetic with no parental history (1) comparison, 2812 genes were found to be differentially expressed out of 39400, which amounts to 7.1% of the total genes under study. This is 0.06% less than the number of genes obtained for 2-fold change thresholds.

Figure  3.10.4: The thresholds for 2.07 and 2 fold change  values. The green spots are the differentially expressed outlier  genes for obesity vs. diabetic with no parental history (1)  comparison.


11. Obesity (reference) vs Diabetic with no Parental History (2) (test sample) [O vs D&NPH2]

Figure 3.11.1: MA-plots showing scatter of expression values before and after loess normalization for obesity vs. diabetic with no parental history (2) comparison.


Figure  3.11.2: Scatter plot of log intensities for obesity vs. diabetic  with no parental history (2) comparison after loess normalization.


Figure  3.11.3: Bivariate outliers based on Mahalanobis distance measure for p=0.10 for obesity vs. diabetic with no parental  history (2) comparison.


The distribution of log fold change values was obtained and the outliers were detected for the optimum cut-off value (c*). Figure 3.11.4 shows the thresholds for 2.39-fold change, thereby providing the up and down regulated genes. Out of 3940 outlier genes, 1534 were detected as up-regulated, while 813 were detected as down-regulated genes with respect to the individual with obesity (O). Thus, for obesity vs. diabetic with no parental history (2) comparison, 2347 genes were found to be differentially expressed out of 39400, which amounts to 5.95% of the total genes under study. This is 2.6% less than the number of genes obtained for 2-fold change thresholds.

Figure  3.11.4: The thresholds for 2.39 and 2 fold change  values. The green spots are the differentially expressed outlier  genes for obesity vs. diabetic with no parental history (2)  comparison.


12. Obesity (reference) vs Diabetic with Parental History (test  sample) [O vs D&PH]

Figure  3.12.1: MA-plots showing scatter of expression values before and after loess normalization for obesity vs.  diabetic with parental history comparison.


Figure  3.12.2: Scatter plot of log intensities for obesity vs.  diabetic with parental history comparison after loess normalization.


Figure  3.12.3:  Bivariate outliers based on Mahalanobis distance  measure for p=0.10 for obesity vs. diabetic with parental  history comparison.


The distribution of log fold change values was obtained and the outliers were detected for the optimum cut-off value (c*). Figure 3.12.4 shows the thresholds for 2.17-fold change, thereby providing the up and down regulated genes. Out of 3940 outlier genes, 1338 were detected as up-regulated, while 1002 were detected as down-regulated genes with respect to the individual with obesity (O). Thus, for obesity vs. diabetic with parental history comparison, 2340 genes were found to be differentially expressed out of 39400, which amounts to 5.93% of the total genes under study. This is 1% less than the number of genes obtained for 2-fold change thresholds.

Figure  3.12.4: The thresholds for 2.17 and 2 fold change  values. The green spots are the  differentially expressed outlier genes for obesity vs.  diabetic with parental history comparison.


13. Diabetic with no Parental History 1 (reference) vs Diabetic with no Parental History 2 (test sample) [D&NPH1 vs D&NPH2]

Figure 3.13.1: MA-plots showing scatter of expression values before and after loess normalization for diabetic with no parental history (1) vs. diabetic with no parental history (2) comparison.


Figure 3.13.2: Scatter plot of log intensities for diabetic with no parental history (1) vs. diabetic with no parental history (2) comparison after loess normalization.


Figure 3.13.3: Bivariate outliers based on Mahalanobis distance measure for p=0.10 for diabetic with no parental history (1) vs. diabetic with no parental history (2) comparison.


The distribution of log fold change values was obtained and the outliers were detected for the optimum cut-off value (c*). Figure 3.13.4 shows the thresholds for 2.18-fold change, thereby providing the up and down regulated genes. Out of 3940 outlier genes, 948 were detected as up-regulated, while 662 were detected as down-regulated genes with respect to the individual with diabetes and no parental history (1). Thus, for diabetic with no parental history (1) vs. diabetic with no parental history (2) comparison, 1610 genes were found to be differentially expressed out of 39400, which amounts to 4% of the total genes under study. This is 1.5% less than the number of genes obtained for 2-fold change thresholds.

Figure 3.13.4: The thresholds for 2.18 and 2 fold change values. The green spots are the differentially expressed outlier genes for diabetic with no parental history (1) vs. diabetic with no parental history (2) comparison.


14. Diabetic with no Parental History 1 (reference) vs Diabetic with Parental History (test sample) [D&NPH1 vs D&PH]

Figure 3.14.1: MA-plots showing scatter of expression values before and after loess normalization for diabetic with no parental history (1) vs. diabetic with parental history comparison.


Figure 3.14.2: Scatter plot of log intensities for diabetic with no parental history (1) vs. diabetic with parental history comparison after loess normalization.


Figure 3.14.3: Bivariate outliers based on Mahalanobis distance measure for p=0.10 for diabetic with no parental history (1) vs. diabetic with parental history comparison.


The distribution of log fold change values was obtained and the outliers were detected for the optimum cut-off value (c*). Figure 3.14.4 shows the thresholds for 2-fold change, thereby providing the up and down regulated genes. Out of 3940 outlier genes, 686 were detected as up-regulated, while 682 were detected as down-regulated genes with respect to the individual with diabetes and no parental history (1). Thus, for diabetic with no parental history (1) vs. diabetic with parental history comparison, 1368 genes were found to be differentially expressed out of 39400, which amounts to 3.4% of the total genes under study.

Figure 3.14.4: The thresholds for 2-fold change values. The green spots are the differentially expressed outlier genes for diabetic with no parental history (1) vs. diabetic with parental history comparison. Here the modified threshold was the same as the conventional 2-fold change.


15. Diabetic with no Parental History 2 (reference) vs Diabetic with Parental History (test sample) [D&NPH2 vs D&PH]

Figure 3.15.1: MA-plots showing scatter of expression values before and after loess normalization for diabetic with no parental history (2) vs. diabetic with parental history comparison.


Figure 3.15.2: Scatter plot of log intensities for diabetic with no parental history (2) vs. diabetic with parental history comparison after loess normalization.


Figure  3.15.3: Bivariate outliers based on Mahalanobis distance  measure for p=0.10 for diabetic with no parental history  (2) vs. diabetic with parental history comparison.


The distribution of log fold change values was obtained and the outliers were detected for the optimum cut-off value (c*). Figure 3.15.4 shows the thresholds for 2-fold change, thereby providing the up and down regulated genes. Out of 3940 outlier genes, 676 were detected as up-regulated, while 979 were detected as down-regulated genes with respect to the individual with diabetes and no parental history (2). Thus, for diabetic with no parental history (2) vs. diabetic with parental history comparison, 1655 genes were found to be differentially expressed out of 39400, which amounts to 4.2% of the total genes under study.

Figure 3.15.4: The thresholds for 2-fold change values. The green spots are the differentially expressed outlier genes for diabetic with no parental history (2) vs. diabetic with parental history comparison. Here the modified threshold was the same as the conventional 2-fold change.


Functional Classification of Differentially Expressed Genes
To determine the biological significance of the differentially expressed genes, functional classification was performed using Gene Ontology. Gene Ontology reports, along with z-scores, are provided in the supplementary material. Numbers in parentheses indicate the number of up-regulated/down-regulated genes and the total number of genes (in the uploaded data) present in that particular ontology, respectively. Z-scores give statistical significance, indicating the relative representation of up-regulated/down-regulated genes in each function. To determine the pathways associated with the differentially expressed genes, pathway analysis was performed. Pathway reports are provided in the supplementary material. Numbers in parentheses indicate the number of up-regulated/down-regulated genes and the total number of genes (in the uploaded data) present in that particular pathway, respectively.
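The z-score formula used for the reports is not given in this section. As an illustration only, the sketch below uses a form commonly applied to Gene Ontology over-representation: the observed number of regulated genes in a term is compared with the number expected under a hypergeometric model and scaled by the corresponding standard deviation. The function name and argument names are hypothetical.

```python
import math


def enrichment_z_score(r, n, R, N):
    """Hypergeometric z-score for one ontology term or pathway.

    r: regulated genes annotated to the term
    n: all measured genes annotated to the term
    R: regulated genes in the whole data set
    N: all measured genes in the whole data set
    """
    expected = n * R / N
    variance = n * (R / N) * (1 - R / N) * (N - n) / (N - 1)
    if variance <= 0:
        raise ValueError("variance is zero; z-score is undefined")
    return (r - expected) / math.sqrt(variance)
```

A positive value means the term contains more regulated genes than expected by chance, and a negative value means fewer. For example, `enrichment_z_score(5, 10, 10, 100)` compares 5 observed genes with 1 expected.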

Results

Gene Ontology Analysis

1. Diabetes with Parental History Vs Normal (D&PH Vs H)
1. Molecular Function: Genes involved in NADH dehydrogenase (ubiquinone) activity, glutamate dehydrogenase [NAD(P)+] activity, CDP-diacylglycerol-glycerol-3-phosphate 3-phosphatidyltransferase activity are upregulated in D&PH with respect to H.

Genes involved in protein kinase B binding, enzyme inhibitor activity, acyl-CoA oxidase activity, phosphatidylinositol transporter activity, acyltransferase activity are downregulated in D&PH with respect to H.

2. Biological Process: Genes involved in synaptic vesicle membrane organization and biogenesis, polysaccharide metabolic process, regulation of growth rate, nucleosome assembly are upregulated in D&PH with respect to H.

Genes involved in immune response, regulation of glycolysis are downregulated in D&PH with respect to H.

3. Cellular Component: Genes localized in cohesin core heterodimer, oligosaccharyl transferase complex, nucleosome, respiratory chain complex II are upregulated in D&PH with respect to H. Genes localized in isoamylase complex, protein kinase CK2 complex, proteasome activator complex, 6-phosphofructokinase complex are downregulated in D&PH with respect to H.

2. Diabetes without Parental History Vs Normal (D&NPH1 Vs H)
1. Molecular Function: Genes involved in hydroxyacylglutathione hydrolase activity, NADH dehydrogenase (ubiquinone) activity, GABA-B receptor activity, glutamate dehydrogenase [NAD(P)+] activity, CDP-diacylglycerol-glycerol-3-phosphate 3-phosphatidyltransferase activity are upregulated in D&NPH1 with respect to H. Genes involved in MHC class II receptor activity, structural constituent of ribosome, Hsp70 protein binding, L-tyrosine transporter activity, cyclin binding, arachidonate 5-lipoxygenase activity are downregulated in D&NPH1 with respect to H.

2. Biological Process: Genes involved in synaptic vesicle  membrane organization and biogenesis, polysaccharide metabolic process, regulation of growth rate, regulation of pH are  upregulated in D&NPH1 with respect to H. Genes involved in establishment of cellular localization, cell activation, immune  response are downregulated in D&NPH1 with respect to H.

3. Cellular Component: Genes localized in vacuolar lumen, chromosome, nucleosome, proteasome activator complex are upregulated in D&NPH1 with respect to H. Genes localized in ferritin complex, proton-transporting ATP synthase complex, coupling factor F(o), ribosome, eukaryotic translation elongation factor 1 complex, ubiquitin conjugating enzyme complex are downregulated in D&NPH1 with respect to H.

3. Diabetes without Parental History Vs Normal (D&NPH2 Vs H)
1. Molecular Function: Genes involved in asparaginase activity, creatine:sodium symporter activity, phosphomannomutase activity, glutamate dehydrogenase [NAD(P)+] activity, basic amino acid transporter activity, adenylosuccinate synthase activity are upregulated in D&NPH2 with respect to H. Genes involved in structural constituent of ribosome, MHC class II receptor activity, MHC class I receptor activity, L-tyrosine transporter activity, N-acylmannosamine kinase activity are downregulated in D&NPH2 with respect to H.

2. Biological Process: Genes involved in polysaccharide metabolic process, regulation of pH, aromatic compound biosynthetic process, regulation of growth rate, lipid glycosylation are upregulated in D&NPH2 with respect to H. Genes involved in establishment of cellular localization, immune response, ribosome biogenesis and assembly are downregulated in D&NPH2 with respect to H.

3. Cellular Component: Genes localized in 4-aminobutyrate transaminase complex, oligosaccharyl transferase complex are upregulated in D&NPH2 with respect to H. Genes localized in ribosome, Arp2/3 protein complex, eukaryotic translation elongation factor 1 complex, small ribosomal subunit, ferritin complex, mitochondrial outer membrane translocase complex are downregulated in D&NPH2 with respect to H.

4. Obese Vs Normal (O Vs H)
1. Molecular Function: Genes involved in peptide deformylase activity, NADH dehydrogenase (ubiquinone) activity, glutamate dehydrogenase [NAD(P)+] activity, phosphomannomutase activity, transposase activity, carboxylic ester hydrolase activity, glutamate decarboxylase activity, mannosyltransferase activity, transforming growth factor beta binding are upregulated in O with respect to H. Genes involved in glycolipid transporter activity, glycolipid binding, 3-hydroxyisobutyrate dehydrogenase activity, 25-hydroxycholecalciferol-24-hydroxylase activity are downregulated in O with respect to H.

2. Biological Process: Genes involved in regulation of  isoprenoid metabolic process, polysaccharide metabolic process, regulation of pH are upregulated in O with respect to H. Genes involved in synaptic vesicle membrane organization and biogenesis, cellular macromolecule catabolic process, locomotion during locomotory behavior are downregulated in O with respect to H.

3. Cellular Component: Genes localized in CAAX-protein geranylgeranyltransferase complex, intracellular organelle are upregulated in O with respect to H. Genes localized in vesicle, eukaryotic translation elongation factor 1 complex, perikaryon, Golgi transport complex are downregulated in O with respect to H.

5. Diabetes Vs Obese (D&PH Vs O)
1. Molecular Function: Genes involved in glycolipid transporter activity, calmodulin inhibitor activity, glycolipid binding, interleukin-22 receptor activity, oxygen transporter activity, antigen binding, L-lactate dehydrogenase activity, glyoxylate reductase (NADP) activity, 25-hydroxycholecalciferol-24-hydroxylase activity, glycerate dehydrogenase activity, ubiquinol-cytochrome-c reductase activity are upregulated in D&PH with respect to O. Genes involved in amylo-alpha-1,6-glucosidase activity, 4-alpha-glucanotransferase activity, interleukin-8 receptor activity are downregulated in D&PH with respect to O.

2. Biological Process: Genes involved in synaptic vesicle  membrane organization and biogenesis, response to stimulus, cellular macromolecule catabolic process are upregulated in  D&PH with respect to O. Genes involved in regulation of isoprenoid metabolic process, blastocyst growth, regulation of glycolysis are downregulated in D&PH with respect to O.

3. Cellular Component: Genes localized in vesicle, hemoglobin complex, perikaryon, Golgi transport complex are upregulated in D&PH with respect to O. Genes localized in isoamylase complex, CAAX-protein geranylgeranyltransferase complex, NADPH oxidase complex, protein kinase CK2 complex, MHC class I peptide loading complex, proteasome activator complex are downregulated in D&PH with respect to O.

6. Common to Diabetes and Obesity
1. Molecular Function: Genes involved in NADH dehydrogenase (ubiquinone) activity, glutamate dehydrogenase [NAD(P)+] activity, transposase activity, guanylate cyclase inhibitor activity are upregulated in diabetes and obesity. Genes involved in hypoxanthine phosphoribosyltransferase activity, structural constituent of ribosome, NADP binding, histone deacetylase activity are downregulated in diabetes and obesity.

2. Biological Process: Genes involved in polysaccharide  metabolic process, regulation of pH, tissue development, diuresis are upregulated in diabetes and obesity. Genes  involved in regulation of hormone biosynthetic process, opsonization are downregulated in diabetes and obesity.

3. Cellular Component: Genes localized in oligosaccharyl  transferase complex, cytoplasmic vesicle, ribosome are upregulated in diabetes and obesity.

Genes localized in small ribosomal subunit, proton-transporting  ATP synthase complex, coupling factor F(o)  are downregulated in diabetes and obesity.

7. Obese Vs Tendency Towards Obesity (O Vs HO)
1. Molecular Function: Genes involved in transforming growth factor beta binding, sodium:amino acid symporter activity, adenosylhomocysteinase activity, transferase activity, transferring acyl groups, caspase activator activity, NAD(P)H oxidase activity, steroid 21-monooxygenase activity, malate dehydrogenase (oxaloacetate-decarboxylating) (NADP+) activity, glutamate decarboxylase activity are upregulated in O Vs HO. Genes involved in creatine:sodium symporter activity, glycolipid transporter activity, glycolipid binding, 3-hydroxyisobutyrate dehydrogenase activity, leukemia inhibitory factor receptor activity, superoxide-generating NADPH oxidase activity, chemokine receptor activity, interleukin-22 receptor activity are downregulated in O Vs HO.

2. Biological Process: Genes involved in establishment of cellular localization, cuticle biosynthetic process, hydrogen peroxide biosynthetic process, vesicle docking are upregulated in O Vs HO. Genes involved in synaptic vesicle membrane organization and biogenesis, response to stimulus, anatomical structure development are downregulated in O Vs HO.

3. Cellular Component: Genes localized in CAAX-protein geranylgeranyltransferase complex are upregulated in O Vs HO. Genes localized in Golgi transport complex, vesicle, oncostatin-M receptor complex, perikaryon are downregulated in O Vs HO.

Diabetes with History Vs Diabetes without History

8. D&PH Vs D&NPH1
1. Molecular Function: Genes involved in MHC class II receptor activity, gamma-aminobutyric acid:hydrogen symporter activity, chemokine receptor activity, interleukin-4 receptor activity, interleukin-7 receptor activity, arachidonate 5-lipoxygenase activity, complement receptor activity are upregulated in D&PH Vs D&NPH1. Genes involved in ammonia ligase activity, transaldolase activity, 4-alpha-glucanotransferase activity, choline:sodium symporter activity, interleukin-8 receptor activity are downregulated in D&PH Vs D&NPH1.

2. Biological Process: Genes involved in cell activation, macromolecule  biosynthetic process, hydrogen peroxide biosynthetic process, immune response, regulation of glycolysis are  upregulated in D&PH Vs D&NPH1. Genes involved in  blastocystal growth, aromatic compound biosynthetic process, nitric oxide biosynthetic process, regulation of glycolysis are downregulated in D&PH Vs D&NPH1.

3. Cellular Component: Genes localized in ribonucleoside-diphosphate reductase complex, interleukin-18 receptor complex, interleukin-1 receptor complex, mitochondrion, interleukin-5 receptor complex are upregulated in D&PH Vs D&NPH1. Genes localized in proteasome activator complex, isoamylase complex, CAAX-protein geranylgeranyltransferase complex, protein kinase CK2 complex, oxoglutarate dehydrogenase complex, MHC class I peptide loading complex are downregulated in D&PH Vs D&NPH1.

9. D&PH Vs D&NPH2
1. Molecular Function: Genes involved in structural constituent of ribosome, MHC class II receptor activity, ferroxidase activity, NAD(P)H oxidase activity are upregulated in D&PH Vs D&NPH2. Genes involved in 4-alpha-glucanotransferase activity, phosphomannomutase activity, receptor signaling protein tyrosine kinase activity are downregulated in D&PH Vs D&NPH2.

2. Biological Process: Genes involved in intracellular sequestering of iron ion, ribosome biogenesis and assembly, hydrogen peroxide biosynthetic process are upregulated in D&PH Vs D&NPH2. Genes involved in hemostasis, developmental growth, lipid glycosylation, regulation of glycolysis are downregulated in D&PH Vs D&NPH2.

3. Cellular Component: Genes localized in ribosome, ferritin  complex are upregulated in D&PH Vs D&NPH2. Genes  localized in CAAX-protein geranylgeranyltransferase complex,  isoamylase complex, apolipoprotein B mRNA editing enzyme  complex, lipopolysaccharide receptor complex, proteasome activator  complex are downregulated in D&PH Vs D&NPH2.

Pathway Analysis

1. Diabetes Vs Normal (D&PH Vs H)
Genes involved in Inositol phosphate metabolism, Starch  and sucrose metabolism, Nitrogen metabolism, Oxidative phosphorylation,  Androgen and estrogen metabolism, Glycan biosynthesis  and metabolism pathways, Metabolism of cofactors and  vitamins pathways, MAPK signaling pathway, ECM-receptor  interaction, Neuroactive ligand-receptor interaction, Regulation  of actin cytoskeleton, Cell communication pathways, Nervous  system pathways, Neurodegenerative disorders pathways  are upregulated in D&PH Vs H.

Genes involved in Glycolysis / Gluconeogenesis, Propanoate  metabolism, Carbon fixation, Biosynthesis of steroids, Fatty acid metabolism, Histidine metabolism, Phenylalanine metabolism,  Tyrosine metabolism, Urea cycle and metabolism of amino  groups, Cell cycle, Insulin signaling pathway, PPAR signaling  pathway, Antigen processing and presentation are downregulated  in D&PH Vs H.

2. Diabetes without Parental History Vs Normal (D&NPH1 Vs H)
Genes involved in Carbohydrate metabolism pathways, Metabolism of cofactors and vitamins pathways, Ubiquitin mediated proteolysis, Signal transduction pathways, ECM-receptor interaction, Neuroactive ligand-receptor interaction, Regulation of actin cytoskeleton, Cell cycle, Endocrine system pathways, Nervous system pathways, Huntington’s disease are upregulated in D&NPH1 Vs H. Genes involved in Cell adhesion molecules (CAMs), Antigen processing and presentation are downregulated in D&NPH1 Vs H.

3. Diabetes without Parental History Vs Normal (D&NPH2 Vs H)
Genes involved in Carbohydrate metabolism pathways, Lipid  metabolism pathways, Glycan biosynthesis and metabolism pathways, Metabolism of cofactors and vitamins pathways, Ubiquitin  mediated proteolysis, Signal transduction pathways, Signaling  molecules and interaction pathways, PPAR signaling pathway,  GnRH signaling pathway, Nervous system pathways,  Development pathways, Neurodegenerative disorders pathways  are upregulated in D&NPH2 Vs H. Genes involved in Insulin signaling pathway, Immune system pathways are  downregulated in D&NPH2 Vs H.

4. Obese Vs Normal (O Vs H)
Genes involved in Carbohydrate metabolism pathways, Lipid metabolism pathways, Amino acid metabolism pathways, Glycan biosynthesis and metabolism pathways, Metabolism of cofactors and vitamins pathways, Ubiquitin mediated proteolysis, Signal transduction pathways, Neuroactive ligand-receptor interaction, Nervous system pathways, Neurodegenerative disorders pathways are upregulated in O Vs H.

Genes involved in Cell adhesion molecules (CAMs), Cytokine-cytokine receptor interaction, Insulin signaling pathway, Immune system pathways are downregulated in O Vs H.

5. Diabetes Vs Obese (D&PH Vs O)
Genes involved in Inositol phosphate metabolism, Oxidative phosphorylation, Amino acid metabolism pathways, Ubiquinone biosynthesis, Signal transduction pathways, Signaling molecules and interaction pathways, Nervous system pathways are upregulated in D&PH Vs O.

Diabetes with History Vs Diabetes without History

6. D&PH Vs D&NPH1
Genes involved in signal transduction, Regulation of actin cytoskeleton, Antigen processing and presentation, Complement and coagulation cascades, Axon guidance, Neurodegenerative disorders pathways are upregulated in D&PH Vs D&NPH1. Genes involved in carbohydrate pathways are downregulated in D&PH Vs D&NPH1.

7. D&PH Vs D&NPH2
Genes involved in Oxidative phosphorylation, Metabolism of cofactors and vitamins pathways, Immune system pathways, Nervous system pathways, Metabolic disorders pathways are upregulated in D&PH Vs D&NPH2.

Genes involved in Lipid metabolism pathways, Amino acid metabolism pathways, Glycan biosynthesis and metabolism pathways, Ubiquitin mediated proteolysis, Signal transduction pathways, Signaling molecules and interaction pathways, Insulin signaling pathway, PPAR signaling pathway are downregulated in D&PH Vs D&NPH2.
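The comparison lists above can be cross-checked mechanically. The following is a minimal sketch, not part of the original analysis, that stores the upregulated pathway categories of the three diabetes Vs healthy comparisons (1 to 3) as sets and intersects them. The dictionary layout and the helper name `shared_pathways` are illustrative assumptions. Matching is by exact category name, so categories reported at different levels of detail (for example, MAPK signaling pathway versus Signal transduction pathways) are treated as distinct and the overlap is probably understated.

```python
# Upregulated pathway categories transcribed from comparisons 1-3 above.
# The data layout and helper name are illustrative, not the authors' code.
UPREGULATED = {
    "D&PH Vs H": {
        "Inositol phosphate metabolism",
        "Starch and sucrose metabolism",
        "Nitrogen metabolism",
        "Oxidative phosphorylation",
        "Androgen and estrogen metabolism",
        "Glycan biosynthesis and metabolism pathways",
        "Metabolism of cofactors and vitamins pathways",
        "MAPK signaling pathway",
        "ECM-receptor interaction",
        "Neuroactive ligand-receptor interaction",
        "Regulation of actin cytoskeleton",
        "Cell communication pathways",
        "Nervous system pathways",
        "Neurodegenerative disorders pathways",
    },
    "D&NPH1 Vs H": {
        "Carbohydrate metabolism pathways",
        "Metabolism of cofactors and vitamins pathways",
        "Ubiquitin mediated proteolysis",
        "Signal transduction pathways",
        "ECM-receptor interaction",
        "Neuroactive ligand-receptor interaction",
        "Regulation of actin cytoskeleton",
        "Cell cycle",
        "Endocrine system pathways",
        "Nervous system pathways",
        "Huntington's disease",
    },
    "D&NPH2 Vs H": {
        "Carbohydrate metabolism pathways",
        "Lipid metabolism pathways",
        "Glycan biosynthesis and metabolism pathways",
        "Metabolism of cofactors and vitamins pathways",
        "Ubiquitin mediated proteolysis",
        "Signal transduction pathways",
        "Signaling molecules and interaction pathways",
        "PPAR signaling pathway",
        "GnRH signaling pathway",
        "Nervous system pathways",
        "Development pathways",
        "Neurodegenerative disorders pathways",
    },
}


def shared_pathways(comparisons):
    """Return the category names that appear in every comparison's set."""
    sets = list(comparisons.values())
    return set.intersection(*sets) if sets else set()


shared = shared_pathways(UPREGULATED)
print(sorted(shared))
# ['Metabolism of cofactors and vitamins pathways', 'Nervous system pathways']
```

By exact name, only two categories are upregulated in all three diabetes Vs healthy comparisons. The same helper can be applied to the downregulated lists or to any other subset of the seven comparisons.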

Table 3.1: Number of up regulated and down regulated genes in each treatment category.


References
  1. Bakewell DJ, Wit E (2005) Weighted analysis of microarray gene expression using maximum-likelihood. Bioinformatics 21: 723-729.

  2. Downey T (2006) Analysis of a multifactor microarray study using Partek genomics solution. Methods Enzymol 411: 256-70.

  3. Filzmoser P, Garrett RG, Reimann C (2005) Multivariate outlier detection in exploration geochemistry. Computers and Geosciences 31: 579-587.

  4. Sridhar GR (2007) Psychiatric co-morbidity & diabetes. Indian J Med Res 125: 311-320.

  5. Garrett RG (1989) The chi-square plot: A tool for multivariate outlier recognition. Journal of Geochemical Exploration 32: 319-341.

  6. Gervini D (2003) A robust and efficient adaptive reweighted estimator of multivariate location and scatter. Journal of Multivariate Analysis 84: 116-144.

  7. Kathryn EW, Gökhan SH (2005) Inflammation, stress, and diabetes. J Clin Invest 115: 1111-1119.

  8. Loguinov AV, Saira Mian I, Vulpe CD (2004) Exploratory differential gene expression analysis in microarray experiments with no or limited replication. Genome Biology 5: R18.

  9. Paturi VR, Xinfang L, Patrick P, Mark T, Nandgaonkar S (2004) Gene expression profiles of peripheral blood cells type 2 diabetes and nephropathy in Asian Indians. Genome Biology 5: P9.

  10. Romero R, Espinoza J, Gotsch F, Kusanovic JP, Friel LA (2006) The use of high-dimensional biology (genomics, transcriptomics, proteomics, and metabolomics) to understand the preterm parturition syndrome. BJOG 113 Suppl 3: 118-35.

  11. Rousseeuw PJ, Van ZBC (1990) Unmasking multivariate outliers and leverage points. Journal of American Statistical Association 85: 633-651.

  12. Sato N, Brivanlou AH (2006) Microarray approach to identify the signaling network responsible for self-renewal of human embryonic stem cells. Methods Mol Biol 331: 267-83.

  13. Sjogren A, Kristiansson E, Rudemo M, Nerman O (2007) Weighted analysis of general microarray experiments. BMC Bioinformatics 8: 387-394.

  14. Suzuki T, Tian QB, Kuromitsu J, Kawai T, Endo S (2007) Characterization of mRNA species that are associated with postsynaptic density fraction by gene chip microarray analysis. Neurosci Res 57: 61-85.

  15. Téni GE, Christophe H, Dong Y, Olivier BB, Barend M (2006) NADPH Oxidase-Derived Overproduction of Reactive Oxygen Species Impairs Postischemic Neovascularization in Mice with Type 1 Diabetes. Am J Pathol 169: 719-728.

  16. Xi H, Shulha HP, Lin JM, Vales TR, Fu Y (2007) Identification and characterization of cell type-specific and ubiquitous chromatin regulatory structures in the human genome. PLoS Genet 17: 3-8.

  17. Zao HY, Yue PY, Fang KT (2004) Identification of differentially expressed genes with multivariate outlier analysis. J Biopharm Stat 14: 629-646.
