Machine Learning Classification of Attention-Deficit/Hyperactivity Disorder Using Structural MRI Data

Background: Clinical symptoms-based ADHD diagnosis is considered “subjective”. Machine learning (ML) classifiers have been explored to develop objective diagnosis of ADHD using magnetic resonance imaging (MRI) biomarkers.
Methods: We reviewed previous literature and developed ensemble classifiers using the ENIGMA-ADHD dataset, with data balancing to control for age, sex, diagnostic groups, and sample sites, and with a held-out test set for independent evaluation.
Results: Our review showed that classification accuracies reported previously using cross-validation (CV) samples were inflated and did not generalize well to independent test samples. Our results showed a significant discrimination between ADHD and control samples for both adults and children, but the accuracies were modest (area under the receiver operating characteristic curve (AUC): 66% and 67%, respectively). We found that child samples were informative for predicting adult ADHD, and vice versa. The most important brain MRI structure for prediction was intracranial volume (ICV), followed by surface area and some subcortical volumes. The cortical thickness measurements were the least useful.
Conclusions: Although previous ML classification studies reported overly optimistic accuracies and suffered from methodological limitations, our results suggest that clinically useful classification of ADHD may be possible with larger samples. In contrast to prior reports of ENIGMA-ADHD studies, our work finds ADHD-related sMRI differences in adults and shows that the brain differences between cases and controls seen in youth can be useful in discriminating adults with and without ADHD. This provides additional evidence for the continuity of ADHD’s pathophysiology from childhood to adulthood.


Potential conflicts of Interest:
Yanli Zhang-James, Emily C Helminen, Jinru Liu and Martine Hoogman declare no conflict of interest. Barbara Franke has received educational speaking fees from Shire and Medice. Dr. Stephen Faraone received income, travel expenses and/or research support from and/or has been on an Advisory Board for Pfizer, Ironshore, Shire, Akili Interactive Labs, CogCubed, Alcobra, VAYA Pharma, Neurovance, Impax, NeuroLifeSciences and research support from the National Institutes of Health (NIH). With his institution, he has US patent US20130217707 A1 for the use of sodium-hydrogen exchange inhibitors in the treatment of ADHD. In previous years, he received consulting fees or was on Advisory Boards or participated in continuing medical education programs sponsored by: Shire, Alcobra, Otsuka, McNeil, Janssen, Novartis, Pfizer and Eli Lilly. Dr. Faraone receives royalties from books published by Guilford Press: Straight Talk about Your Child's Mental Health, Oxford University Press: Schizophrenia: The Facts and Elsevier, ADHD: Non-Pharmacologic Treatments.

Introduction
Clinicians diagnose attention-deficit/hyperactivity disorder (ADHD) by evaluating symptoms and impairments. Despite the concurrent and predictive validity of clinical diagnosis (1, 2), many have raised concerns about the possibility of over-diagnosing ADHD in the community (3, 4) because it relies on clinicians' "subjective" evaluation of responses from patients, parents, and/or informants. Concerns also exist about the underdiagnosis of ADHD (5, 6), especially in girls and women. The misdiagnosis of ADHD is also a serious concern, with an estimated misdiagnosis rate as high as 20% in the US (7). Those who are inappropriately diagnosed with the disorder may be unnecessarily exposed to chronic use of medications. Those who have ADHD and are not diagnosed will continue to have impaired functioning, leading to increased risks for other health and social problems (8). When people who have ADHD are incorrectly diagnosed with another disorder, they may be exposed to unnecessary treatments and still face many impairments associated with ADHD.
In response to such concerns, researchers have sought to develop objective measures. Measures examined in past years included peripheral biochemical markers (9, 10), measures of oxidative stress (11), neuropsychological measures (12), electroencephalography (EEG) (13), actigraphy (14), eye vergence (15), interactive gaming (16) and continuous performance tests (CPTs) (e.g. 17, 18, 19). Although many significantly differentiated subjects according to their ADHD diagnosis, none met the criteria for a "useful" biomarker defined by the World Federation of ADHD. According to those criteria, a biomarker must exceed 80% sensitivity and specificity, be reliable, reproducible, inexpensive, non-invasive, easy to use, and confirmed by at least two independent studies (20).
Magnetic resonance imaging (MRI) data have also been examined for their potential to provide objective biomarkers for ADHD (21). The enthusiasm was further kindled by the ADHD-200 competition, which provided an opportunity for researchers to compete for the best diagnostic classifier using a dataset much larger than any existing neuroimaging biomarker study at that time, consisting of 776 children (63% healthy controls, 37% ADHD) contributed from eight sites (23)(24)(25). Although no predictive biomarkers were observed from those studies (26), the ADHD-200 dataset continues to be used by researchers to look for better brain-based biomarkers (27). It was later incorporated into a larger consortium by the Enhancing Neuro Imaging Genetics Through Meta Analysis (ENIGMA) ADHD Working Group. By Aug 2017, the ENIGMA-ADHD dataset contained 3,377 subjects, including >1,000 adults, from 23 participating sites. The initial report from ENIGMA-ADHD found small but significant and widespread regional volumetric differences between ADHD patients and healthy controls for children but not adults. These differences included reductions in intracranial volume (ICV), in the volumes of the amygdala, caudate nucleus, nucleus accumbens, and hippocampus, and in cortical surface areas from many brain regions (28). The largest effect was found for total surface area (Cohen's d = -0.21, pFDR=<0.001), and the effect was larger in the youngest tertile (4-9 years, d = -0.35, pFDR=<0.001) (29).
The present study had three main objectives. We first sought to perform a systematic review of studies seeking to develop clinically useful classifiers for ADHD based on MRI data. Second, we applied ML to the ENIGMA-ADHD data with the goal of developing an improved neuroimaging classifier. Third, we used ML to test the hypothesis of continuity between childhood and adult ADHD (30-32). This idea has been challenged by recent studies (33). Given that symptoms and impairments persist into adulthood for a majority of children with ADHD (36), we hypothesized that ADHD-related brain structure differences in adults would be consistent with those in children and that ML methods may help uncover those differences.

Literature search and review
We searched PubMed using the keywords 'ADHD AND (classif*[ti] OR biomarker [ti]) AND machine learning' (up to May 1st, 2018) to identify studies that used neuroimaging to discriminate ADHD and non-ADHD groups. Additional studies were extracted by examining their cited references. We examined the relationship of the reported logit-transformed accuracies (percent of correct classifications) with their cross-validation or testing methods and with sample sizes, using a linear regression model and Pearson's correlation in STATA15. The analyses were weighted by the training sample sizes of the contributing studies.
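The weighted regression on logit-transformed accuracies can be illustrated with a minimal sketch. The original analysis was run in STATA15; the Python below is only an illustration of the idea. The accuracy values, sample sizes, and the helper names `logit` and `weighted_ols` are hypothetical and are not taken from the reviewed studies.

```python
import math

def logit(p):
    """Log-odds of a proportion p in the open interval (0, 1)."""
    return math.log(p / (1.0 - p))

def weighted_ols(x, y, w):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    return my - b * mx, b

# Hypothetical studies: (reported accuracy, 1 if held-out test else 0, training n)
studies = [(0.90, 0, 60), (0.85, 0, 120), (0.80, 0, 200),
           (0.62, 1, 500), (0.60, 1, 650), (0.58, 1, 400)]
y = [logit(acc) for acc, _, _ in studies]   # outcome: logit accuracy
x = [held for _, held, _ in studies]        # predictor: evaluation method
w = [n for _, _, n in studies]              # weights: training sample size

intercept, slope = weighted_ols(x, y, w)
print(f"intercept={intercept:.3f}, slope={slope:.3f}")
```

With a binary predictor, the intercept is the weighted mean logit accuracy of the cross-validation group and the slope is the difference between the two groups, so a negative slope corresponds to lower accuracies in held-out tests.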

MRI Samples
The ENIGMA-ADHD project provided T1-weighted structural MRI (sMRI) data from 3,377 subjects from 23 participating sites (by Aug 2017) to the current study. Images were processed using the consortium's standard segmentation algorithms in FreeSurfer (V5.1 and V5.3) (28). Variables used included 72 cortical surface area and thickness measurements from each hemisphere, 14 subcortical regions, and intracranial volume (ICV). Subjects missing more than 50% of variables were removed. Remaining missing values and outliers (outside of 1.5 times the interquartile range (IQR 1.5)) were replaced with imputed values using multiple imputation with chained equations in STATA15. Four sites that provided only cases with no controls were excluded from ML model training. However, they were used in an additional test to assess model generalizability. The final ML dataset consisted of 48.7% non-ADHD controls (n=1320, male to female ratio (m/f) = 1.44) and 51.4% ADHD participants (n=1393, m/f=2.52). Ages ranged from four to 63 years; 62.8% were children (age<18 years) and 37.2% were adults (age≥18 years, Table 1).
Subjects were randomly assigned to training (~70%), validation (~15%), and test (~15%) subsets. The random splitting was carried out within each diagnosis, sex, age subgroup and site. Next, we balanced the case and control groups within each site and age group by random oversampling of under-represented diagnostic groups, a procedure commonly used to deal with class imbalance.
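A minimal sketch of the stratified split and the random oversampling step follows. It is an illustration under stated assumptions, not the pipeline used in the study: the subject records, the field names (`dx`, `sex`, `age_group`, `site`), and the helper names are hypothetical, and the sketch balances only the training subset, which the text does not specify.

```python
import random
from collections import defaultdict

def stratified_split(subjects, keys, fractions=(0.70, 0.15), seed=0):
    """Randomly split subjects into train/valid/test within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for s in subjects:
        strata[tuple(s[k] for k in keys)].append(s)
    train, valid, test = [], [], []
    for members in strata.values():
        rng.shuffle(members)
        n_train = round(fractions[0] * len(members))
        n_valid = round(fractions[1] * len(members))
        train += members[:n_train]
        valid += members[n_train:n_train + n_valid]
        test += members[n_train + n_valid:]   # remainder (~15%)
    return train, valid, test

def oversample(subjects, group_keys, label_key="dx", seed=0):
    """Within each group, resample the smaller class with replacement
    until both diagnostic classes have the same count."""
    rng = random.Random(seed)
    groups = defaultdict(lambda: defaultdict(list))
    for s in subjects:
        groups[tuple(s[k] for k in group_keys)][s[label_key]].append(s)
    balanced = []
    for by_label in groups.values():
        target = max(len(v) for v in by_label.values())
        for members in by_label.values():
            balanced += members
            balanced += rng.choices(members, k=target - len(members))
    return balanced

# Hypothetical subject records (not ENIGMA data).
rng = random.Random(1)
subjects = [{"id": i,
             "dx": rng.choice(["ADHD", "ADHD", "control"]),
             "sex": rng.choice(["M", "F"]),
             "age_group": rng.choice(["child", "adult"]),
             "site": rng.choice(["A", "B"])} for i in range(400)]

train, valid, test = stratified_split(subjects, ["dx", "sex", "age_group", "site"])
train_bal = oversample(train, ["site", "age_group"])
print(len(train), len(valid), len(test), len(train_bal))
```

Splitting before oversampling keeps duplicated subjects from appearing in more than one subset, which would otherwise leak information into the validation or test data.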

Feature preprocessing
Principal factors factor analysis (PFFA) with varimax rotation on sMRI features on the training set identified 16 factors that explained >80% of the variance. Factor scores were computed for all subjects based on the training set PFFA. Outliers of the factor scores (IQR 1.5) were replaced with their closest values. We included age and sex as predictors because: 1) they are readily available, and 2) given the known effects of age and sex on brain structures, they may interact with sMRI features and improve predictive accuracy. All input features were scaled based on the training set's minimum and maximum values.
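The outlier handling and scaling steps can be sketched as follows. This is an illustration only. It assumes that "replaced with their closest values" means clipping to the nearest IQR fence and that the fences, like the minimum and maximum, are estimated on the training set; neither detail is stated in the text. The factor scores and helper names are hypothetical.

```python
import statistics

def iqr_fences(values, k=1.5):
    """Lower and upper fences at k times the IQR beyond the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def clip(values, low, high):
    """Replace values outside [low, high] with the nearest fence."""
    return [min(max(v, low), high) for v in values]

def minmax_apply(values, lo, hi):
    """Scale with a minimum and maximum estimated elsewhere (training set)."""
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical scores for one factor.
train_scores = [-0.4, -0.1, 0.0, 0.2, 0.3, 0.5, 6.0]
test_scores = [-3.0, 0.1, 0.9]

low, high = iqr_fences(train_scores)            # fences from training data only
train_clipped = clip(train_scores, low, high)
test_clipped = clip(test_scores, low, high)

lo, hi = min(train_clipped), max(train_clipped)  # scaling range from training data
train_scaled = minmax_apply(train_clipped, lo, hi)
test_scaled = minmax_apply(test_clipped, lo, hi)
print(train_scaled, test_scaled)
```

Because the scaling range comes from the training set alone, validation and test values can fall slightly outside the interval from 0 to 1; that is expected and avoids using held-out data during preprocessing.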

ML Algorithms
We implemented an ensemble classifier in Scikit-learn (37) by combining support vector machine (SVM), random forest (RF), K-Nearest Neighbors (KNN), and gradient boosting (GB) classifiers. The mathematical bases of the individual classifiers have been described previously (38-41). We estimated each individual classifier using Scikit-learn's grid search function to find the best hyperparameters, including C and gamma for SVM; the number of features, maximum depth and number of estimators for RF; the number of neighbors (parameter k), p and the leaf size for KNN; and the learning rate, number of estimators and maximum depth for GB. A second grid search was performed on the ensemble model with the best hyperparameters from the individual classifiers to fine-tune their combinations and the weights for the individual classifiers. We used the area under the curve (AUC) statistic from the training and validation subsets for model optimization during this two-tiered grid search. The final models were selected based on the highest validation AUCs (to improve accuracy) and the smallest difference between the training and validation AUCs (to avoid overfitting), with preference for hyperparameters favoring less overfitting. Receiver operating characteristic (ROC) curves and AUC statistics from the test subsets were reported. We also plotted learning curves using the training and test scores from different fractions of the training set to evaluate model overfitting and the effect of sample size.
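The two-tiered grid search can be sketched with Scikit-learn as follows. This is a simplified illustration on synthetic data, not the configuration used in the study. The hyperparameter grids, candidate weights, and split sizes are hypothetical, soft voting is an assumption, and the sketch selects on validation AUC alone rather than also minimizing the gap between training and validation AUCs.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, VotingClassifier)
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, PredefinedSplit
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic stand-in for 16 factor scores plus age and sex (18 features).
X, y = make_classification(n_samples=600, n_features=18,
                           n_informative=6, random_state=0)
n_train, n_valid = 410, 90
X_dev, y_dev = X[:n_train + n_valid], y[:n_train + n_valid]
X_test, y_test = X[n_train + n_valid:], y[n_train + n_valid:]

# -1 marks rows used only for training; 0 marks the fixed validation fold.
fold = np.r_[np.full(n_train, -1), np.zeros(n_valid, dtype=int)]
split = PredefinedSplit(fold)

base = {
    "svm": (SVC(probability=True, random_state=0),
            {"C": [0.1, 1], "gamma": ["scale", 0.01]}),
    "rf": (RandomForestClassifier(random_state=0),
           {"max_features": [4, 8], "max_depth": [3, 5], "n_estimators": [100]}),
    "knn": (KNeighborsClassifier(),
            {"n_neighbors": [5, 15], "p": [1, 2], "leaf_size": [30]}),
    "gb": (GradientBoostingClassifier(random_state=0),
           {"learning_rate": [0.05, 0.1], "n_estimators": [100], "max_depth": [2, 3]}),
}

# Tier 1: tune each classifier separately on validation AUC.
tuned = []
for name, (estimator, grid) in base.items():
    search = GridSearchCV(estimator, grid, scoring="roc_auc", cv=split, refit=False)
    search.fit(X_dev, y_dev)
    tuned.append((name, estimator.set_params(**search.best_params_)))

# Tier 2: tune the voting weights of the ensemble of tuned classifiers.
ensemble = VotingClassifier(tuned, voting="soft")
weight_grid = {"weights": [[1, 1, 1, 1], [2, 1, 1, 1], [1, 2, 1, 2]]}
search = GridSearchCV(ensemble, weight_grid, scoring="roc_auc", cv=split, refit=False)
search.fit(X_dev, y_dev)
ensemble.set_params(**search.best_params_)

# Fit on the training rows only and report the held-out test AUC.
ensemble.fit(X[:n_train], y[:n_train])
test_auc = roc_auc_score(y_test, ensemble.predict_proba(X_test)[:, 1])
print(f"test AUC = {test_auc:.3f}")
```

Passing `return_train_score=True` to `GridSearchCV` adds training scores to `cv_results_`, which would allow the train-validation gap to be inspected when choosing among candidates.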

Testing ADHD Hypotheses about the Continuity of ADHD from Children to Adults
Our analysis pipeline started with three base models that classify ADHD in children, adults or combined samples. The base models used data from the corresponding age groups during the model training and validation phase and were also tested on data from their corresponding age groups. Therefore, we referred to them as "Child, Child, Child", "Adult, Adult, Adult" and "Both, Both, Both", denoting their corresponding training, validation and test sets.
Next, we tested if the ADHD vs. control sMRI differences seen in adults would be useful in predicting ADHD in children. To do that, we used the model that was trained and validated on the adult data to predict the child test set. We refer to this result as "Adult, Adult, Child". If the adult data are irrelevant to the child data, they will result in reduced or non-significant accuracy. We hypothesized that the AUC for this model would be statistically significant.
In like manner, we tested if the ADHD vs. control sMRI differences seen in children are useful in predicting ADHD in adults. To do that, we used the model that was trained and validated on the child data to predict the adult data. We refer to this result as "Child, Child, Adult". If the child data are irrelevant to the adult data, they will result in reduced or non-significant accuracy. We hypothesized that the AUC for this model would be statistically significant.
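The logic of these cross-age tests can be sketched as computing an AUC for one age group from scores produced by a model trained on the other, then checking whether its 95% confidence interval excludes 0.5. The sketch below is an illustration only: the labels and scores are hypothetical, and the percentile bootstrap is an assumption, since the text does not state how the confidence intervals were obtained.

```python
import random

def auc(labels, scores):
    """AUC as the probability that a random case outscores a random
    control, with ties counted as one half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    indices = range(len(labels))
    stats = []
    while len(stats) < n_boot:
        sample = rng.choices(indices, k=len(labels))
        ls = [labels[i] for i in sample]
        if 0 < sum(ls) < len(ls):          # need both classes in the resample
            stats.append(auc(ls, [scores[i] for i in sample]))
    stats.sort()
    return stats[int(alpha / 2 * n_boot)], stats[int((1 - alpha / 2) * n_boot) - 1]

# Hypothetical adult test labels (1 = ADHD) and scores from a child-trained model.
adult_labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
adult_scores = [0.9, 0.8, 0.7, 0.45, 0.3, 0.6, 0.4, 0.35, 0.2, 0.1]

point = auc(adult_labels, adult_scores)
low, high = bootstrap_ci(adult_labels, adult_scores)
print(f"AUC = {point:.2f}, 95% CI ({low:.2f}, {high:.2f})")
```

Under this reading, a "Child, Child, Adult" result supports the continuity hypothesis when the lower bound of the interval lies above 0.5.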

Model evaluation
We applied a softmax function (42) in our final models to generate a continuous brain risk score (BRS), which assesses the probability for each individual to be diagnosed with ADHD. Cohen's d effect sizes were computed using the BRSs. In addition, we assessed the clinical utility of the model with sensitivity, specificity, positive predictive power, and negative predictive power using various cut-points on the BRSs.
The overall accuracies (percentage of correct predictions) were also reported and stratified by age, sex, and diagnostic groups. We used logistic regression to determine if prediction errors were significantly influenced by age, sex, diagnostic status, and MRI acquisition site. Pearson's correlation was used to determine if the subgroup accuracies were associated with the sample sizes in their corresponding training sets.
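The cut-point analysis and the effect size computation can be sketched as follows. The brain risk scores and labels are hypothetical, the helper names are illustrative, and treating a score at or above the cut-point as a positive prediction is an assumption.

```python
import statistics

def cutpoint_metrics(labels, brs, cut):
    """Sensitivity, specificity, positive and negative predictive power
    when subjects with BRS >= cut are classified as ADHD."""
    tp = sum(1 for l, s in zip(labels, brs) if l == 1 and s >= cut)
    fn = sum(1 for l, s in zip(labels, brs) if l == 1 and s < cut)
    fp = sum(1 for l, s in zip(labels, brs) if l == 0 and s >= cut)
    tn = sum(1 for l, s in zip(labels, brs) if l == 0 and s < cut)
    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppp": tp / (tp + fp) if tp + fp else nan,
        "npp": tn / (tn + fn) if tn + fn else nan,
    }

def cohens_d(cases, controls):
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(cases), len(controls)
    pooled_var = ((n1 - 1) * statistics.variance(cases)
                  + (n2 - 1) * statistics.variance(controls)) / (n1 + n2 - 2)
    return (statistics.mean(cases) - statistics.mean(controls)) / pooled_var ** 0.5

# Hypothetical labels (1 = ADHD) and brain risk scores.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
brs = [0.8, 0.7, 0.55, 0.4, 0.6, 0.45, 0.3, 0.2]

default = cutpoint_metrics(labels, brs, cut=0.5)
strict = cutpoint_metrics(labels, brs, cut=0.65)
d = cohens_d([s for l, s in zip(labels, brs) if l == 1],
             [s for l, s in zip(labels, brs) if l == 0])
print(default, strict, round(d, 2))
```

Raising the cut-point trades sensitivity for specificity, which is the mechanism behind choosing different thresholds for different subgroups.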

Evaluating model generalizability
We applied our model to 168 samples from four sites that had provided only cases. To prepare these samples for the test, we computed the 16 scaled factor scores based on the PFFA of the training set. AUC statistics could not be computed due to the lack of control subjects. Only prediction accuracies (equivalent to sensitivity) were reported.

Feature importance
Importance scores were computed for the 16 MRI factors, age and sex from the RF and GB models (the other classifiers do not have a method for computing importance) (43). The importance of age and sex as features was further assessed by comparing the AUCs of models with and without age and sex. For each brain MRI measurement, we computed a composite score by summing the products of its factor loadings with the corresponding factors' feature importance scores for both the RF and GB classifiers. We used the composite scores to assess how the different hemispheres and classes of MRI features contributed to predictive accuracy using a linear regression model. The four classes of MRI features were cortical surface areas, cortical thicknesses, subcortical volumes and total intracranial volume (ICV).
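The composite score can be sketched as a sum, over both models and all factors, of each measurement's factor loading multiplied by that factor's importance. The loadings, importances, and region names below are hypothetical and use three factors instead of 16 for brevity. The sketch uses the loadings as given; whether absolute values were taken is not stated in the text.

```python
def composite_scores(loadings, importances_by_model):
    """Sum loading * importance over all factors and over all models.

    loadings: {measurement: [loading on factor 1..F]}
    importances_by_model: {model name: [importance of factor 1..F]}
    """
    return {
        measurement: sum(
            loading * importance
            for importances in importances_by_model.values()
            for loading, importance in zip(row, importances)
        )
        for measurement, row in loadings.items()
    }

# Hypothetical factor loadings for two measurements on three factors.
loadings = {
    "ICV": [0.9, 0.1, 0.0],
    "left caudate": [0.2, 0.7, 0.1],
}
# Hypothetical factor importances from the two models.
importances = {
    "RF": [0.5, 0.3, 0.2],
    "GB": [0.6, 0.3, 0.1],
}

scores = composite_scores(loadings, importances)
print(scores)
```

A measurement that loads heavily on a highly ranked factor receives a high composite score, which allows importance at the factor level to be mapped back to individual brain measurements.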

Literature Review
Nine studies were retained from the literature search (Table 2), among which three main methods were used to assess model accuracies: 1) Various k-fold cross-validation (CV) methods: Seven studies used either 10-fold, 70/30, or predefined CV. For the first two types, the training sets were randomly split each time, and the models were trained many times on either 90% or 70% of the data and validated on the remaining portion. Predefined CV used fixed sets of samples as training and validation sets. 2) Leave-one-out CV (LOOCV): In six studies, training was performed many times on the training set excluding one randomly selected sample for validation. Because the k-fold CV and LOOCV studies did not differ from one another in accuracy (F (1, 13) = 2.69, p=0.13), we combined them into one CV group. Among the above 14 studies, cross-validation accuracies were the only reported results in nine studies (64.3%). Only five studies used additional held-out test samples that were not used in the model training and validation. 3) Held-out tests: A total of nine studies reported accuracy evaluation using a held-out test sample that had not been used for either training or validation. Significantly lower accuracies were reported using held-out test samples than using CV methodology (F (1, 18) = 26.38, p<0.0001, Figure 1). Training sample sizes did not have any significant effect on accuracy (F (1, 18) = 0.33, p=0.57). The correlations between accuracy and training sample size were 0.10 for the cross-validation results (p=0.71) and 0.25 for the held-out tests (p=0.38).

Accuracy of Predictions
The validation AUCs of the ensemble classifiers and their constituent classifiers are listed in Supplementary Table 1 for the best of the three base models. The child data performed better than the adult data for each constituent classifier as well as for the ensemble classifier. The same pattern was also observed for the corresponding test AUCs. Figure 2A shows the test set AUCs (as dots) and their 95% confidence intervals (as horizontal lines) for the three base models, which are denoted by their training, validation and test samples. The vertical line at an AUC of 0.5 indicates a chance level of diagnostic accuracy. The "Child, Child, Child" model had the highest AUC (0.67, 95%CI: 0.60, 0.73), and its confidence interval did not overlap with the 0.5 line, indicating significant predictive accuracy. The combined model ("Both, Both, Both") had a lower AUC (0.61, 95%CI: 0.56, 0.67), which was significantly different from 0.5, but not significantly different from the child model. The "Adult, Adult, Adult" model yielded the lowest test AUC (0.54, 95%CI: 0.45, 0.64), which did not differ significantly from 0.5 and was significantly lower than the Child AUC (χ²(1) = 4.53, p = 0.03).

Tests of Hypotheses
When the Adult model was tested on the child samples ("Adult, Adult, Child"), we obtained a significant AUC (0.59, 95%CI: 0.52, 0.66) that was slightly lower than, but not significantly different from, the "Child, Child, Child" or "Both, Both, Child" AUCs.

Model Evaluation
The receiver operating characteristic (ROC) curves (Figure 3) show that the "Child, Child" model predicts child and adult ADHD equally well.
Examining sensitivity and specificity in separate sex groups (Supplementary Figure 1), we found that at the default BRS cut-point of 0.5, the female subgroup had low sensitivity and high specificity. By contrast, the male subgroup had high sensitivity and low specificity. We can obtain similar sensitivities of ~20% with high specificities for both sexes (female 83.9% and male 87.4%, Table 4) if we shift the cut-points in opposite directions, i.e. to 0.415 for females and 0.71 for males.

Feature importance
Feature importance scores from RF and GB models were significantly correlated (Pearson's r=0.66, p=0.003, Supplementary Figure 2). The features ranked by the average scores of both models are listed in Supplementary File 1. Factor 1 ranked first and sex ranked last. Age ranked higher than sex but was still among the lowest. When we excluded age and sex from the model, we obtained an AUC of 0.65 for the child samples (95%CI: 0.58, 0.71), which was similar to the AUC of the child set with age and sex included as features. Excluding age and sex resulted in an AUC of 0.59 (95%CI: 0.49, 0.68) for the adult samples, which also was not significantly different from that of the adult model with age and sex included. This suggests that the predictive information afforded by age and sex is redundant with the predictive information afforded by the sMRI features.
The ranked brain features and their corresponding importance scores are in Supplementary File 1. The importance scores were similar between left and right brain hemispheres but differed significantly across feature types (F [3,153] = 134, p<0.0001). ICV had the highest score (Figure 4B). The mean scores for surface area measures (0.125, 95%CI: 0.122, 0.127) and subcortical volumes (0.125, 95%CI: 0.117, 0.134) were similar and both were significantly higher than those for cortical thickness (0.102, 95%CI: 0.099, 0.104).
Figure 5 plots the learning curves for the prediction in children (Left) and adults (Right), showing similar converging trends for training and testing AUCs as the sample size increases. Final test AUCs did not reach the training AUCs when all samples were used. The characteristics of the learning curves suggest some degree of overfitting and that increasing sample size should improve performance and reduce overfitting. By extrapolating the training and test accuracies, we would predict that collecting more data would improve accuracy to about 0.75. Further improvement would likely require additional predictive features, such as functional MRI data.

Discussion
We achieved three main goals. First, our review found that many prior studies seeking to develop clinically useful classifiers for ADHD based on MRI data did not use a held-out test set and reported overly optimistic assessments of classification accuracy. Second, our results from ENIGMA ADHD data suggest that clinically useful classification may be possible, although achieving that will still require larger samples and, perhaps, additional predictive features. Third, we used ML in an innovative manner to provide supporting evidence for the continuity of ADHD's pathophysiology from childhood to adulthood.

What Makes Our Study Different From Prior Studies
Almost 40% of prior studies of ADHD MRI classifiers only reported cross-validation (CV) accuracies (23, 24, 44-50). Samples used in iterations of cross-validation influence the hyperparameter estimation. Therefore, CV results may overestimate actual accuracy, as our results showed. The sample size of the ENIGMA ADHD data allowed us to properly estimate model accuracies from true test sets that were not involved in hyperparameter tuning.
Other concerns regarding many previous studies were confounding factors such as sites and case-control imbalance. For example, the ADHD-200 dataset comprises data from many different sites and has more controls (63%) than ADHD cases (51). Learning algorithms can be confounded by the base rate of the disorder and by differences among sites. In our study, we applied oversampling of under-represented groups in each age subgroup and site. In addition, we used the AUC statistic for model evaluation instead of the commonly used accuracy (percentage correct), which can be influenced by case-control imbalances in data sets. Our results indeed showed that we removed the confounding effects of different acquisition sites and age groups. Although our accuracies were modest (61.2% for children and 62.1% for adults), they were at the high end of prior results with held-out test sets (Figure 1). Two studies reported higher test accuracies than ours (52, 53). Both studies utilized functional MRI in addition to sMRI data as features. More importantly, both reported significant site variations. One estimated that the site information alone resulted in 66% prediction accuracy (52). Our results, in contrast, generalized well to different sites.
Another factor that has contributed to our models' generalizability was the rigorous regularization. Hyperparameters that favor less overfitting were selected so that models had training AUCs close to the validation AUCs. Learning curve analysis also helped us to assess how well our models learned with increasing sample sizes, and whether our models were overfit. No previous studies of ADHD have implemented and reported these measures.

Clinical Utility
Examining different accuracies in subgroups of sex and diagnosis, we found that male ADHD and female control groups had the highest accuracies for both children and adults (73.3% ~ 89.2%). The same pattern was observed in samples from the excluded sites. Conditional probability analysis suggests that sex-specific BRS thresholds should be considered. For example, by shifting BRS cut-points in opposite directions for male and female groups, we achieved ~20% sensitivity and specificity >84% for both sexes. In this case, males also had a high PPP (71.1%), meaning that our model was correct most of the time when predicting a male as having ADHD, but it was wrong ~58% of the time when classifying someone as non-ADHD. The lower PPP (40%) for girls means that we are often wrong (60% of the time) when classifying a girl as having ADHD, although we are often correct (65.6%) when classifying a girl as non-ADHD. Although the current levels of accuracy do not support the use of our models in clinical practice, our learning curve analysis indicates that increasing sample sizes could improve model performance (54, 55), particularly with more samples from the underrepresented sex and diagnostic groups, i.e. female ADHD and male control samples.

Machine Learning Tests of Hypotheses
Our results support the hypotheses about the continuity of child and adult ADHD pathophysiology and extend the results of prior ENIGMA ADHD studies (28,29). Firstly, consistent with prior ENIGMA reports, we found that adult ADHD could not be successfully discriminated from controls when using only adult data. This could be due to the smaller sample size, or to larger variation in brain differences in adults, which would make discrimination more difficult. However, we show that by using child data to train and validate the model, we can significantly improve the adult ADHD prediction. This suggests that the ADHD vs. control differences observed in children provide information relevant to adult ADHD and argues against the recent hypothesis that adult ADHD is etiologically distinct from childhood ADHD (33). Indeed, our BRS estimated similar case-control effect sizes (Cohen's d) for children and adults. Both were two to three times greater than those of the individual regions reported in prior ENIGMA studies (Cohen's d 0.09 ~0.25) (28, 29). Secondly, some main features in our prediction models were consistent with preceding ENIGMA reports, for example, our most important feature, ICV (28). Previously, total surface area was identified as the most significant measure with the highest Cohen's d effect size (28). We also found high scores for many surface areas in our model. One caution in interpreting importance scores is that one feature may mask the importance of others because of high correlations, particularly for interconnected brain structures.

Limitations
First, although we eliminated the confounding effects of age and sites, we still observed sex and sample size differences. Future studies with more samples from under-represented groups will help improve model performance and generalizability. Second, we only used sMRI data. Incorporating other imaging modalities could help improve classification accuracy. Finally, we used pre-defined structures from the ENIGMA standard image processing pipeline as features. It is possible that other methods, such as a convolutional neural network using 3D images as input features, would uncover useful features leading to increased classification accuracy.
In conclusion, our application of ML to the ENIGMA ADHD data suggests that clinically useful classification may be possible, although achieving that will require larger samples. ML can uncover ADHD vs. control structure differences in adults that were not detected in prior ENIGMA ADHD reports using standard statistical methods. These analyses show that sMRI differences associated with ADHD are similar for adults and youth, which supports the continuity of ADHD's pathophysiology from childhood to adulthood.

Table 1. Sample characteristics

Table 2. Machine learning literature on ADHD neuroimaging data.

Figure 1. Blue dots are results from held-out test set methodology and red triangles are results from cross-validation methodology. Higher accuracy scores were found for cross-validation compared with held-out test methodology.

Figure 2. Test set performance of the best ensemble models.
Area under the receiver operating characteristic curve (AUC) accuracy statistics for the held-out test results were plotted (as dots) with their 95% confidence intervals (as horizontal lines). The models were defined based on what samples were used for training, validation, and testing (Train, Valid, Test). The vertical line at an AUC of 0.5 indicates a chance level of diagnostic accuracy. If the 95%CI does not overlap with the 0.5 vertical line, it indicates significant predictive accuracy.
Three base models are plotted in the top group: the "Adult, Adult, Adult", the "Both, Both, Both", and the "Child, Child, Child" models. These used only the adult samples, both samples, or only the child samples, respectively, to train, validate, and test. The middle portion plots the adult and child subset AUCs from the "Both, Both, Both" model. The bottom portion shows the test AUC for the adult samples using the child model ("Child, Child, Adult") and the test AUC for the child samples using the adult model ("Adult, Adult, Child").

Figure 3. ROC curves for ADHD prediction in adults and children.
Receiver operating characteristic (ROC) curves for our best model were compared for the test set prediction results in adults (red ROC) and children (blue ROC).

Figure 4.
A. The importance scores were derived from the two models that provide such scores: Random Forests (RF) and Gradient Boosting (GB). The features used are 16 brain MRI factors, age, and sex. These scores indicate the degree to which each feature contributed in predicting ADHD diagnostic status. The scores from two models were significantly correlated. Factor 1 ranked highest and sex ranked lowest.
B. The composite importance scores for MRI brain features were plotted to show mean differences across four main classes. The scores were generated by summing the products of the importance scores of the 16 factors in RF and GB models and the factor loading of the individual brain regions in each factor.

Figure 5. Learning curves for model prediction of the child ADHD (Left) and adult ADHD (Right).
The learning curves plot the training (red line) and test (green line) AUCs achieved for increasing training sample sizes. The whole training data were randomly split into eight parts. We started the training with 1/8th of the total data and repeated the process in increments of 1/8th each time. For both graphs, a converging trend of training and testing AUCs was observed, although the final test AUCs did not reach the training AUCs when all the samples were used. The converging pattern and the gap indicate the presence of overfitting and suggest that more samples are needed to improve model performance.

Supplementary Figure 1. Sensitivity and Specificity Analysis of Different Sex groups.
Classification sensitivity and specificity were computed and plotted for males and females separately at various probability cut-offs of what we referred to as the brain risk score (BRS), which dichotomizes cases and controls.

… deficit/hyperactivity disorder (ADHD) based on functional and structural imaging. Eur Child Adolesc Psychiatry. 24:1279-1289.
24. Hart H, Chantiluke K, Cubillo AI, Smith AB, Simmons A, Brammer MJ, et al. (2014): Pattern classification of response inhibition in ADHD: toward the development of neurobiological markers for ADHD. Hum Brain Mapp. 35:3083-3094.
Brown MR, Sidhu GS, Greiner R, Asgarian N, Bastani M, Silverstone PH, et al. (2012): ADHD-200 Global Competition: diagnosing ADHD using personal characteristic data can outperform resting state fMRI measurement