

POSTGRADUATE CORNER 



Year : 2010 | Volume : 1 | Issue : 2 | Page : 126-127

Data transformation
S Manikandan
Assistant Editor, JPP, Pondicherry, India
Date of Web Publication: 10-Nov-2010
Correspondence Address: S Manikandan Department of Pharmacology, Indira Gandhi Medical College and Research Institute, Kadirkamam, Pondicherry India
Source of Support: None, Conflict of Interest: None
DOI: 10.4103/0976-500X.72373
How to cite this article: Manikandan S. Data transformation. J Pharmacol Pharmacother 2010;1:126-7
Preparing the data facilitates statistical analysis. Preparation includes data checking, computing derived data from the original values, statistically adjusting for outliers and data transformation. The first three methods have been explained previously in this series. ^{[1]} Data transformation also forms part of the initial preparation of data before statistical analysis.
When to do Transformation?   
The pattern of values obtained when a variable is measured in a large number of individuals is called a distribution. ^{[2]} Distributions can be broadly classified as normal and non-normal. The normal distribution is also called the 'Gaussian distribution', named after K.F. Gauss, who described it. It is called the normal distribution because most biological parameters (such as weight, height and blood sugar) follow it. A few biological parameters do not follow the normal distribution, for example antibody titre and number of episodes of diarrhoea. Beginners should not be confused by the term 'normal': it does not necessarily imply clinical normality, and there is nothing abnormal about 'non-normal' distributions.
One of the assumptions of the parametric statistical tests used for hypothesis testing is that the data are samples from a normal distribution.^{[3]} Hence it becomes essential to identify whether a distribution is skewed or normal. There are some simple ways to detect skewness. ^{[4]}
 If the mean is less than twice the standard deviation, then the distribution is likely to be skewed. This rule applies to measurements that cannot take negative values.
 If the population follows normal distribution, then the mean and the standard deviation of the samples are independent. This fact can be used for detecting skewness. If the standard deviation increases as the mean increases across groups from a population, then it is a skewed distribution.
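The two simple checks above can be sketched in Python as follows. This is a minimal illustration and not part of the original article: the function names `likely_skewed` and `sd_rises_with_mean` and the sample values are hypothetical, and the first check is meaningful only for measurements that cannot be negative.

```python
import statistics

def likely_skewed(values):
    """Rule of thumb: for data that cannot be negative, a mean smaller
    than twice the standard deviation suggests a skewed distribution."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return mean < 2 * sd

def sd_rises_with_mean(groups):
    """Return True if, after ordering the groups by their means, the
    standard deviations increase strictly from group to group."""
    summary = sorted((statistics.mean(g), statistics.stdev(g)) for g in groups)
    sds = [sd for _, sd in summary]
    return all(a < b for a, b in zip(sds, sds[1:]))

# Hypothetical data: a titre-like variable and a height-like variable.
titres = [10, 20, 20, 40, 40, 80, 160, 640]
heights_cm = [158, 162, 165, 167, 170, 172, 175, 181]

print(likely_skewed(titres))
print(likely_skewed(heights_cm))
print(sd_rises_with_mean([[1, 2, 3], [10, 20, 30], [100, 200, 300]]))
```

Both functions work from summary statistics only, so they are screening aids rather than formal tests of normality.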
Apart from these simple methods, normality can be verified by statistical tests like the Kolmogorov-Smirnov test.
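As a rough sketch of what the Kolmogorov-Smirnov test measures, the code below computes only the test statistic D, the largest gap between the empirical cumulative distribution of the sample and a fitted normal curve. The function name `ks_statistic_normal` and the generated data are hypothetical. No p-value is calculated: when the mean and standard deviation are estimated from the same sample, the standard Kolmogorov-Smirnov critical values are no longer exact, so a statistical package should be used for the actual test.

```python
import math
import statistics

def ks_statistic_normal(values):
    """Kolmogorov-Smirnov distance D between the empirical distribution of
    `values` and a normal distribution with the sample mean and SD."""
    xs = sorted(values)
    n = len(xs)
    fitted = statistics.NormalDist(statistics.mean(xs), statistics.stdev(xs))
    d = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = fitted.cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d

# Hypothetical samples built from evenly spaced normal quantiles:
# one roughly normal, one positively skewed (its antilog).
n = 50
standard = statistics.NormalDist(0, 1)
symmetric = [standard.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
skewed = [math.exp(x) for x in symmetric]

d_symmetric = ks_statistic_normal(symmetric)
d_skewed = ks_statistic_normal(skewed)
print(f"D (roughly normal sample) = {d_symmetric:.3f}")
print(f"D (skewed sample)         = {d_skewed:.3f}")
```

A larger D indicates a poorer fit to the normal curve.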
Once skewness is identified, every attempt should be made to convert the data to a normal distribution, so that the more powerful parametric tests can be applied for analysis. This can be accomplished by transformation.
Transformations can also be done for ease of comparison and interpretation. The classical example of a variable which is always reported after logarithmic transformation is the hydrogen ion concentration, which is reported as pH (the negative logarithm of the concentration). Another example where transformation helps in the comparison of data is the logarithmic transformation of the dose-response curve. When the dose-response relationship is plotted, it is curvilinear. When the same response is plotted against log dose (log dose-response plot), it gives an elongated S-shaped curve. The middle portion of this curve is a straight line, and comparing two straight lines (by measuring their slopes) is easier than comparing two curves. Hence transformation can assist in the comparison of data.
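Both examples can be illustrated with a short Python sketch. The pH calculation follows the standard definition. The dose-response function is an assumption made purely for illustration: a hypothetical hyperbolic (Emax) model with arbitrary `emax` and `ec50` values, which the article itself does not specify.

```python
import math

def ph(hydrogen_ion_molar):
    """pH is the negative base-10 logarithm of the hydrogen ion
    concentration expressed in mol/L."""
    return -math.log10(hydrogen_ion_molar)

def response(dose, emax=100.0, ec50=10.0):
    """Hypothetical hyperbolic (Emax) dose-response model; an assumption
    for illustration, not a model taken from the article."""
    return emax * dose / (ec50 + dose)

print(f"pH at 40 nmol/L hydrogen ions = {ph(4e-8):.2f}")

# Doses spaced by equal ratios are equally spaced on the log scale.
doses = [1, 3.16, 10, 31.6, 100]
for dose in doses:
    print(f"dose={dose:>6}  log10(dose)={math.log10(dose):5.2f}  "
          f"response={response(dose):5.1f}")
```

Under this assumed model, the responses at doses an equal ratio below and above `ec50` add up to `emax`, which is why the curve looks symmetric and S-shaped once the dose axis is logarithmic.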
In a nutshell, transformation can be carried out to make the data follow normal distribution or at times for ease of interpretation/comparison.
Which Type of Transformation to Use?
Often, the transformation which makes the distribution normal also makes the variances equal. Although there are many transformations, such as logarithm, square root, reciprocal, cube root and square, the first three are the most commonly used. The following are guidelines for selecting a method of transformation. ^{[5]}
 If the standard deviation is proportional to the mean, the distribution is positively skewed and logarithmic transformation is the ideal one.
 If the variance is proportional to the mean, square root transformation is preferred. This happens more in case of variables which are measured as counts e.g., number of malignant cells in a microscopic field, number of deaths from swine flu, etc.
 If the standard deviation is proportional to the mean squared, a reciprocal transformation can be performed. Reciprocal transformation is carried out for highly variable quantities such as serum creatinine.
Among these three transformations, logarithmic transformation is commonly used as it is meaningful on back transformation (antilog). ^{[3],[6]}
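One possible way to apply the three guidelines is to look at how the standard deviation changes with the mean across groups. The sketch below fits the slope of log(SD) against log(mean) and picks the guideline whose slope is nearest: about 1 when the SD is proportional to the mean, about 0.5 when the variance is proportional to the mean, and about 2 when the SD is proportional to the mean squared. This nearest-slope rule, the function names and the `TRANSFORMS` table are illustrative assumptions, not a procedure given in the article, and all means and SDs must be positive.

```python
import math

def sd_mean_slope(group_means, group_sds):
    """Least-squares slope of log(SD) on log(mean) across groups."""
    xs = [math.log(m) for m in group_means]
    ys = [math.log(s) for s in group_sds]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

def suggest_transformation(group_means, group_sds):
    """Return the guideline whose expected slope is closest to the
    observed one (a hypothetical heuristic, not a formal test)."""
    slope = sd_mean_slope(group_means, group_sds)
    options = {0.0: "none", 0.5: "square root", 1.0: "logarithm", 2.0: "reciprocal"}
    nearest = min(options, key=lambda target: abs(slope - target))
    return options[nearest]

# The transformations themselves, keyed by the names returned above.
TRANSFORMS = {
    "none": lambda x: x,
    "square root": math.sqrt,
    "logarithm": math.log10,
    "reciprocal": lambda x: 1.0 / x,
}

# Hypothetical group summaries in which the SD is half the mean.
choice = suggest_transformation([2, 8, 32], [1, 4, 16])
print(choice)
print([round(TRANSFORMS[choice](v), 3) for v in (10, 100, 1000)])
```

With only a few groups the fitted slope is imprecise, so the suggestion should be checked against a plot of the transformed data.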
Caution
A small cautionary note for the beginners performing transformation is that all calculations should be done in the transformed scale and back transformation should be done only at the end.
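The order of operations can be sketched as follows for a logarithmic transformation. The mean and confidence limits are computed entirely on the log scale and the antilog is taken only in the last step, which yields the geometric mean. The function name `geometric_mean_ci` and the data are hypothetical, and the multiplier `z=1.96` is a large-sample normal approximation assumed here for simplicity; for small samples the appropriate value from the t distribution should be used instead.

```python
import math
import statistics

def geometric_mean_ci(values, z=1.96):
    """Mean and approximate 95% CI computed on the log10 scale and
    back-transformed (antilog) only in the final step."""
    logs = [math.log10(v) for v in values]
    mean_log = statistics.mean(logs)
    se_log = statistics.stdev(logs) / math.sqrt(len(logs))
    lower_log = mean_log - z * se_log
    upper_log = mean_log + z * se_log
    return 10 ** lower_log, 10 ** mean_log, 10 ** upper_log

# Hypothetical antibody titres.
titres = [10, 20, 20, 40, 40, 80, 160, 640]
lower, gm, upper = geometric_mean_ci(titres)
print(f"geometric mean = {gm:.1f} (approx. 95% CI {lower:.1f} to {upper:.1f})")
print(f"arithmetic mean of raw data = {statistics.mean(titres):.1f}")
```

The back-transformed interval is not symmetric around the geometric mean, which is expected because the calculation was done on the log scale.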
Many researchers think that transformation of data is 'data deceiving'. They can be assured that transformation is a statistically accepted and valid method.
How to Report?   
When reporting the results, the summary statistics of the raw data should be given. The transformation used should be stated clearly, along with the reason for it. One should not forget to mention that all the statistical analyses were carried out on the transformed data. ^{[7]} Finally, the back-transformed values (especially for the 95% confidence interval) should also be reported.
References   
1.  Manikandan S. Preparing to analyse data. J Pharmacol Pharmacother 2010;1:64-5.
2.  Altman DG, Bland JM. Statistics notes: The normal distribution. BMJ 1995;310:298.
3.  Bland JM, Altman DG. The use of transformation when comparing two means. BMJ 1996;312:1153.
4.  Altman DG, Bland JM. Detecting skewness from summary information. BMJ 1996;313:1200.
5.  
6.  Bland JM, Altman DG. Transformations, means and confidence intervals. BMJ 1996;312:1079.
7.  Swinscow TD, Campbell MJ. Statistics at square one. 10th ed. (Indian). New Delhi: Viva Books Private Limited; 2003.





