Fusion of heart rate variability and salivary cortisol for stress response identification based on adverse childhood experience

Adverse childhood experiences have been suggested to cause changes in physiological processes and can determine the magnitude of the stress response which might have a significant impact on health later in life. To detect the stress response, biomarkers that represent both the Autonomic Nervous System (ANS) and Hypothalamic-Pituitary-Adrenal (HPA) axis are proposed. Among the available biomarkers, Heart Rate Variability (HRV) has been proven as a powerful biomarker that represents ANS. Meanwhile, salivary cortisol has been suggested as a biomarker that reflects the HPA axis. Even though many studies used multiple biomarkers to measure the stress response, the results for each biomarker were analyzed separately. Therefore, the objective of this study is to propose a fusion of ANS and HPA axis biomarkers in order to classify the stress response based on adverse childhood experience. Electrocardiograph, blood pressure (BP), pulse rate (PR), and salivary cortisol (SCort) measures were collected from 23 healthy participants; 11 participants had adverse childhood experience while the remaining 12 acted as the no adversity control group. HRV was then computed from the ECG and the HRV features were extracted. Next, the selected HRV features were combined with the other biomarkers using Euclidean distance (ed) and serial fusion, and the performance of the fused features was compared using Support Vector Machine. From the result, HRV-SCort using Euclidean distance achieved the most satisfactory performance with 80.0% accuracy, 83.3% sensitivity, and 78.3% specificity. Furthermore, the performance of the stress response classification of the fused biomarker, HRV-SCort, outperformed that of the single biomarkers: HRV (61% Accuracy), Cort (59.4% Accuracy), BP (78.3% accuracy), and PR (53.3% accuracy). From this study, it was proven that the fused biomarkers that represent both ANS and HPA (HRV-SCort) able to demonstrate a better classification performance in discriminating the stress response. Furthermore, a new approach for classification of stress response using Euclidean distance and SVM named as ed-SVM was proven to be an effective method for the HRV-SCort in classifying the stress response from PASAT. The robustness of this method is crucial in contributing to the effectiveness of the stress response measures and could further be used as an indicator for future health. Graphical abstract ᅟ


Introduction
Stress is an individual's response to external and internal challenges that is regulated at the systemic and cellular level. In the resistance stage, stress heightens awareness, increases mental alertness, and leads to superior cognitive and behavioral performance such as seen in sports. However, stress in the long term will cause an individual to enter its exhaustion stage and could eventually lead to the exhaustion of hormonal, cardiovascular, neural and muscular system due to insufficient time for recovery and repair [105]. In addition, stress has been proven to be one of the main risk factors for serious illnesses and thus earlier mortality [109]. Due to the significant impact of stress on health, the study of stress with its various modalities has drawn the attention of many researchers. For example, the effects of stress and its relationship with certain diseases have been widely discussed, and a number of treatments have been proposed [105,109].

Adverse Childhood Experience
Early life exposure to stress has been suggested to be predictive of a range of negative health outcomes [9,39]. For example, alcohol addiction, obesity, depression, smoking, substance abuse, sexual behavior problems [10] and also serious illnesses such as cardiovascular disease [14,36] depression [14,34,61], irritable bowel syndrome [61] and cancer [57]. Motivated by these findings, recent work in the area of childhood trauma discovered that adverse childhood exposure might lead to changes in physiological processes such as cardiovascular stress reactivity. For example, individuals with a history of one or more significant adverse events earlier in their lives showed altered cortisol and heart rate responses to acute psychological stress [70,69]. Children from aggressive family environments have also been shown to have altered cardiovascular reactivity to a psychological stress task [71]. This evidence suggests that stress reactivity might, in part, be determined through events occurring at quite an early age, and that early exposure to stress can determine the magnitude of stress reactivity which might have significant impact on health later in life [21,37]. For this reason, detection of this altered stress response associated with adverse childhood experience is important as a potential indicator of future health outcome.

Stress Classification
The detection and classification of stress has become popular recently [114,112,[126][127]132]. Results and innovations that spring from such studies are very important in providing information by which accurate and reliable treatment can be given to patients. Apart from this, the results can be also used for the development of an early detection tool for certain diseases. Stress classification can be a big challenge since certain biomarkers are too complex to be processed such as, HRV. However, this complex biomarker has been proven able to represent important information of the heart as well as autonomic function in responding to the stress [2,8,119,121]. Throughout the years, the method of processing this signal has been improved to a satisfying stage. However, as the health problems grow rapidly, the demand of more advanced treatment, diagnosis as well as health monitoring has increased. These demands lead to the investigation and development of a new processing method for specific application, in this case, stress classification for the development of a health indicator and potentially may provide better health care. The use of a physiological biomarker for stress classification is in high demand because of its strong ability to provide information, not only for mental health, but also for general well-being as well as to predict health conditions. Some of the well-known biomarkers used for stress classification are EEG, HR, BP, HRV and cortisol hormone [77,113]. The physiological process of stress gives a deeper understanding of the reason for selecting certain biomarkers to detect the presence of stress. These markers include salivary cortisol [52,54], α-Amylase enzyme level [83,117], skin responses [116], finger plethysmography [81], voice [32], static facial information [67], Electromyography (EMG) [127], Electroencephalography (EEG) [85,131], heart rate [31,120], blood pressure [55,94], and HRV [17,121].
When the body is under stress, there are two primary systems that are particularly involved in adapting to the situation; the Autonomic Nervous System (ANS) and the Hypothalamic Pituitary Adrenal (HPA) axis [11,80]. The activation of the ANS caused by stress, mainly affects the cardiovascular system, such as the BP and HR. These cardiovascular biomarkers, also known as the cardiovascular reactivity have been chosen by many studies as they indicate significant results, yet are easy to collect and there is risk of less error [25,[41][42][97][98]. Thus far, the majority of the studies into the cardiovascular reactivity to acute stress have used measurements such as Pulse Rate (PR), HR, BP [25,42,71,96,23] and Heart Rate Variability (HRV) [29,35,45,78,[122][123]. Among these measurements, HRV, which is defined as the variation in time between consecutive heartbeats, has been proven to be one of the most valuable markers of cardiovascular responsivity [2,8,121].
Compared to the HRV, the measure of HR is the most fundamental measurement of cardiovascular reactivity and thus only yields limited analytical that are prone to poor interpretation. Additionally, the HRV analysis provides a much more detailed and accurate investigation of the functional regulatory characteristics of the ANS. While it is less practical to measure some cardiovascular markers during a test for example, the BP and HR during mental stress test, the measurement of the HRV has the advantage over the other measurement in that it is able to continuously measure the response throughout the testing [8]. Meanwhile when compared with EEG, where both EEG and HRV have been proven to contain rich information compared to other biomarkers, the EEG has a drawback in terms of its complexity, either during the data collection or signal processing. The HRV, on the other hand, is easier to collect and less complex, yet has a direct connection with the ANS, where changes in the ANS directly affect cardiovascular reactivity.
In addition, it has been proven that HRV provides predictive information for patients with autonomic function-related diseases and is a valuable biomarker for the investigation of sympathetic and parasympathetic functions, including diabetes [13,76] and cardiovascular disease [64]. HRV is also used to assess the autonomic response to physiological differences or activities under different conditions such as meditation [64,92,99], mental stress [124,133], physical exercise [22,50] and acupuncture [51].
Time and frequency domain analyses of HRV are the conventional approaches applied by many previous studies in HRV analysis during stress classification [35,60,82]. However, due to the nonstationary nature of HRV, valuable information embedded within the signal might not be completely extracted by these conventional methods. Therefore, more advanced analysis such as Time Frequency Distribution (TFD) and Wavelet transform (WT) are proposed [1,18,28,84,125]. However, in studies related to adverse childhood experience and stress reactivity, very few used HRV to measure the stress response and only conventional frequency domain method was used [27,87,129]. Therefore, HRV analysis, using both conventional and more advanced approaches for assessing stress response in individuals who have had adverse childhood experiences could give insight into impact of early life on alterations in physiological responsivity.
As HPA plays a significant role in the stress response, the measurement of the cortisol level is also regarded as biomarker in stress related studies. It has been found that the salivary cortisol concentrations is closely correlated to the serum cortisol concentration and thus it has been suggested as a possible biomarker that indirectly reflects the HPA system [52,86]. Furthermore, the salivary cortisol is also regarded as a more practical assessment tool, the sample collection is easier and stress-free compared to the collection of blood and urine samples for stress research [53,117].
Cardiovascular reactivity representing the ANS has been widely used in the process for the selection of a biomarker for stress response as it is able to provide convincing and significant outcomes, whilst biomarkers that represent the HPA still show inconsistent results [53]. However, as HPA has an essential function in the stress response by releasing cortisol into the blood circulation [52], it has been recommended that both biomarkers, namely HRV and salivary cortisol levels, be used [79][80]. It has been proven that the HRV is moderately associated to the cortisol level [80,115]. In addition, the measurement of the HRV alone may have its own limitations because the HRV can be easily influenced by other physiological conditions, such as respiration [15,65]. The redundancy of other factors may result in the loss of vital information [87,103]. Even though it can be seen that many previous studies used multiple biomarkers in order to measure the stress response, the results for each biomarker were analysed separately [24,26,40], and only a few slightly related studies were found to have used a combination of the biomarkers to investigate the stress response as a combined measure [56,88].

Fusion of Biomarkers
Data fusion is the process of combining information originating from different sources to produce a robust, more specific, and comprehensive description of a process of interest that has been observed [104]. Generally, the fusion of the biomarker information can occur at different stages of a classification system, namely as a feature level fusion and decision level fusion. In a feature level fusion, the features extracted from multiple biomarkers are fused. Meanwhile, in a decision level fusion, the final results of multiple classifiers are combined. Since the feature set itself contains richer information about the biomarkers than the output decision of a classifier, a feature level fusion is believed and suggested to be more effective than a decision level fusion [44,130]. This specifically means that the feature level fusion is able to derive the most discriminatory information from the original multiple feature sets involved in the fusion, and is also able to eliminate the redundant information resulting from the correlation between distinct feature sets [130] . Based on this description, fusion at the feature level is expected to provide better recognition results [75].
In feature level fusion, the more well-known fusion technique is the serial fusion which simply concatenates two sets of feature vectors into a single combined vector [68]. If the source of the first feature vector, x, is m-dimensional and the source of the second feature vector, y, is n-dimensional, then the fused feature vector, z, is (m+n)-dimensional [130]. A recent method, known as parallel fusion, combines two sets of feature vectors into a complex vector z = x+iy (i being an imaginary unit). Note that if the dimensions of the two vectors that are to be combined are different, the one with the lower dimension is padded with zeros [130]. This fusion technique has practical significance with wide applicability, and has more intuitive physical meaning when applied to some real-life problems [130].
Combinations of biomarkers have previously been used to give a better measure of psychological stress. However, most of the actual stress classifications rely on the evidence of a single-source biomarker. More recently, several studies have started to fuse multiple sources of biomarker information to develop a more reliable classification [56,59,88,107]. There are not many related studies which used fusion of biomarkers for stress response classification between two groups of people. The most related studies found were on classification between 'stress' and 'non stress' conditions which are: EEG and EMG [114,107], BVP, GSR, ST, PD [132] and EEG and Peripheral signals (BVP, SC, Resp) [56]. For a better review, Table 1 summarizes these studies.

Table 1
Review on fusion method for stress classification From Table 1, very few studies were found to have used fused HRV, for stress classification. However, some related studies were found using HRV in the fusion method as a sensor for detecting false alarms during patient monitoring [19,38]. It should be noted that there has been no study that used salivary cortisol in the fusion. Additionally, none of the above studies mentioned in detail how the biomarkers were fused, but it is understood that they used the conventional serial fusion method, where the data were just concatenated and fed to the classifier. Therefore, this study aimed to fuse the selected biomarkers and compare the classical method of serial fusion to the newer method of parallel fusion.
On top of that, it can be seen that the performance in all studies in Table 1 reached approximately 90% which is considered very high. However, all these studies were based on detecting the differences between 'stress' and 'non-stress' condition on the same individual, therefore the high classification performance was expected. In addition, the objective and focus of these studies are apparently different from the current study, where the current study discriminates the stress response between two groups of people. However, finding from those studies are still beneficial in proving that the fusion of biomarkers is a robust approach for stress classification. Furthermore, the higher performance of SVM compared with other classifiers also suggests this machine learning as an effective classifier for stress classification Since the ANS and HPA axis are two major bodily systems that play important role in stress response regulation, combination of the biomarkers that represent both the ANS and HPA is recommended [79][80]. In addition, information from both biological systems are highly related with each other in their own pathological way and the combination of these information was expected to yield a robust stress response classification [79][80]115,66]. However, even though many studies used multiple biomarkers to measure the stress response, the results for each biomarker have generally been analysed separately [42,93,98] without considering the direct correlation underlie between each biomarker. Hence, the combination or fusion of those biomarkers as a single measure, need to be done. Consequently, in this study, an approach to the fusion of ANS and HPA biomarkers for stress response classification in relation to adverse childhood experience is proposed. Figure 1 illustrates the implementation of the method used in the present work, and each part is elaborated in detail in the following sections.

Data Acquisition
This section, presents the data acquisition process. The process started with the recruitment of the participants, followed by laboratory session including the mental stress test, and finally, the protocol of procedures involved.

Participants
A total of 609 participants, comprised of 511 females and 98 males (age, M = 19.2, SD = 2.99 years), who were the students of University of Birmingham, United Kingdom, were screened via online questionnaire using the lifetime adversity inventory based on C-DIS-IV items, as conducted by Lovallo et al. (2012) and the Childhood Traumatic Questionnaire (CTQ) [16]. The C-DIS-IV questionnaire covers two different aspects of adversity which are physical or sexual adversity and emotional adversity. Meanwhile, the CTQ measured five different aspects; emotional, physical, and sexual abuse as well as emotional and physical neglect. This measure summarizes the picture of the severity, duration, and frequency of abuse during childhood. Participants' informed consent was obtained during the online screening, where the participants agreed to be invited to a further laboratory session if they fulfilled the selection criteria.
All individuals with scores of 3 to 5 on the lifetime adversity (C-DIS-IV) measure and those with scores of 2 or higher on the CTQ were invited to the laboratory session as the adverse childhood experience group while the control group was comprised of those who scored 0 on the lifetime adversity (C-DIS-IV) and CTQ [16,70]. From the invited participants, only 23 participants (age, M = 19.4, SD = 2.17 years), 11 high adversity and 12 low adversity participants agreed and attended for the laboratory testing. The high adversity participants had a mean CTQ score of 7.7 (3.03) and mean C-DIS-IV adversity score of 2.4 (0.67). The participants were non-smokers with no history of cardiovascular disease, an acute infection or other chronic illnesses, a current endocrine or immune disorder, and who were not taking any medication. Given that the control group were drawn from the same population as the adversity group, they were not significantly different on age or socio-economic status (p < .05) but the high adversity group did include more non-white individuals (82% versus 8% p = .01). Ideally we would have liked to match the groups on ethnicity but childhood adversity is commonly higher among non-White groups in the UK. However, our previous studies have revealed no differences in the magnitude of cardiovascular or cortisol reactivity to the PASAT by ethnic group thus reducing the likelihood of ethnicity acting as a confounding variable. The study was approved by the appropriate University of Birmingham Ethics Committee.

Mental Stress Test: The Paced Auditory Serial Addition Test (PASAT)
Paced Auditory Serial Addition Test (PASAT) developed by Gronwall (1997) is one of the more well organized, established and documented test. It is a psychological stress task that measures the cognitive function that assesses auditory information processing speed and flexibility, as well as calculation ability. This mental stress test has been shown in numerous studies to reliably perturb both the cardiovascular system and salivary cortisol [95,108] and to demonstrate good test-retest reliability [128,47]. Therefore, PASAT was used in this study as the mental stress test in order to investigate the physiological changes of groups of individual who have had adverse childhood experiences. To ensure the standardization of the stimulation procedure, the test was executed using an audio compact disk. A series of single-digit numbers were presented to the participants, and they were required to add each new numbers to the number that they had previously heard, and to say aloud their answers. In order to increase the level of difficulty of the test, the time intervals between the numbers were shortened towards the end of the task. Throughout the test, the participants had to look at the television screen, which showed live filming of own image. The test took 10 minutes to complete. After completion the stress task, participants were administered stress perceptions question. Participants were asked to rate the stress task from 0 to 6, where 0 for "not at all" and 6 for "extremely". The mean stress rating for the group was x.x, and this did not significantly differ between the high and low adversity groups.

Procedure
Before attending the laboratory session, the participants were asked to refrain from smoking and caffeine two hours before the session, not to consume alcohol for at least 12 hours, not to consume any meal for one hour, and not to engage in any vigorous exercise 4 hours before the laboratory session. These details were reconfirmed before the testing began. On entering the laboratory, the participants were asked to sit on a chair facing the television set. Next, the BP cuff and Lead II ECG, with a standard three-lead ECG placement, were attached to the participants. The ECG was then captured continuously, amplified and bandpass filtered (0.1 to 100 Hz) by an AC amplifier (P511, Grass), and converted to a digital signal using LabChart Software (AD Instruments) at a sampling rate of 1kHz, and was used for later analysis. The participants were reminded that no talking was allowed throughout the testing. The recording session took place between 2.00pm and 6.00pm to take into account the diurnal rhythmicity of cortisol.
The session started with and adaptation phase for 10 minutes for the participant to acclimatise to the laboratory, followed by a further formal resting baseline period of another 10minutes. Three saliva samples were obtained from each participant using salivettes (Sarstedt Ltd.). The first saliva sample was collected at the final minute of the baseline period. Before starting the next phase, the participants were read the instructions for the mental stress task, and given a brief opportunity to rehearse the task to ensure their understanding. The participants then completed the 10-minute PASAT. Immediately after the test, a saliva sample was again collected, and this was followed by a 10-minute recovery period. The last saliva sample was obtained after the 10minutes of recovery. The systolic blood pressure (SBP), Dystolic blood pressure (DBP), and HR were measured every two minutes using blood pressure monitor , four times for each phase, while the ECG was recorded throughout the testing. Figure 2 illustrates the procedure.

Saliva Sampling and Cortisol Assays
To collect the saliva samples, a salivette (Sarstedt Ltd.) dental swab was placed in the mouth and was gently chewed by the participant for one minute. The swab was then returned to the salivette tube. In this study, three stimulated saliva samples were collected, one at the end of the resting baseline, one immediately after the stress task, and one at the end of recovery. At the end the laboratory session, the salivettes were centrifuged at 3500 rpm for five minutes at room temperature, and the saliva was then aliquoted and frozen at -20 degrees until the time to be assayed. The cortisol was assayed from the saliva samples in duplicate by an enzyme-linked immunosorbent assay (ELISA) using a commercial kit (DRG Diagnostics). This assay is based on the competition principle and microplate separation. An unknown amount of cortisol present in the sample, and a fixed amount of cortisol conjugated with horseradish peroxidase, competed for the binding sites of a polyclonal cortisolor DHEA-antiserum coated onto wells. After one hour of incubation, the microtiterplate was washed to stop the competition reaction. A substrate solution was added and incubated for 15 minutes at room temperature. After a stop solution was added, the optical density was measured at 450 nm. The concentration of cortisol (nmol/l) was then determined from the measurement of the optical density using a regression program (KC Junior).

Pre-Processing: From ECG to HRV
The 10 minutes recorded ECG signals were pre-processed to quantify the HRV using MATLAB software. The 50 Hz power line interference was removed using a notch filter. The QRS waves were then detected using Pan and Tompkin's algorithm. This method was chosen, because it has been proven to detect 99.3% of adult QRS using the MIT/BIH database [91].

HRV Feature Extraction
In this study, time, frequency, non-linear time-frequency and Wavelet analysis were used to extract the HRV features.

Time-Domain Analysis
Among the various measures for time-domain analysis, the measures recommended by the Task Force (1996) were computed. The measures included: 1) The Standard Deviation of the Normal-to-Normal intervals (SDNN) which is the mean of all the standard deviations of NN (normal RR) intervals for all windows of length.
2) The Standard Deviation of the Average of Normal-to-Normal intervals (SDANN). This index was calculated over short periods (5 minutes in this study), which is an estimate of the changes in heart rate due to cycles longer than 5 min.
3) The Root Mean Square Successive Difference (RMSSD) of intervals [119]. 4) HRV triangular index which is determined by dividing integral of the density distribution (the number of all NN intervals) with the maximum of the density distribution [119].

Frequency Domain Analysis
For the frequency domain analysis, the parametric AR spectral analysis was used to extract the HRV features [3,118]. An AR spectral analysis can provide the number, center frequency, and associated power of oscillatory components in a time series [118]. The power spectral density (PSD) can be estimated from a variety of AR methods including autocorrelation, covariance, modified covariance and Burg's estimation [6]. In spectral estimation, the accuracy of the estimated spectrum is critically dependent upon the selected model order. In this study the AR model was computed using the Akaike Information Criterion (AIC) [5]. The optimum model order was observed at p = 36.

Time-Frequency Domain Analysis
In this study, the Modified B-Distribution (MBD) was used to extract the time-frequency features of the HRV as suggested by Barkat and Boashash (2001) and Aimie-Salleh et al. (2015). TF mapping using MBD is presented in Figure 4.

TFD-Based Nonlinear Entropy Analysis
The Shannon and Rényi entropies are two entropy measures that have been introduced to estimate the information content and complexity of the TFD [73][74]46]. The Shannon entropy is based on information theory evaluation, which has specific properties that reflect the complexity changes of the HRV [48,111]. Initially, the Rényi entropy was used to evaluate the complexity of a signal in the time frequency plane [20].
In this study, the TFD-based Shannon and Renyi entropy from the estimated TFD were computed for feature extraction. The entropy features were computed from the overall TFD and the entropy within the LF and HF range [73,4].

Wavelet Transform Analysis
A Wavelet analysis provides both time and frequency localizations and the resultant wavelet coefficients can be used as features in classifiers. To decompose the HRV signal, wavelet family ψ a,b , a basis function generated by dilatations and translations of a unique admissible mother wavelet ψ(t) is defined as, (1) where a denotes the scale and b the location [43,63].
To decompose a signal, the choice of the mother wavelet must be similar to the signal being analysed. In this study, a discrete wavelet transform using the eight-order Daubechies mother wavelet (db8) up to five levels was applied to the HRV to create a five-level wavelet decomposition of the signal [1,43]. The Daubechies wavelet was selected due to the similarity between the wavelet function and the HRV signal [58].
Next, the Detail (D) and the Approximate (A) coefficients were reconstructed according to the recommended HRV frequency bands [119]. VLF range (<0·04 Hz) was reconstructed from the A5 coefficient, the LF range (0·04-0·15 Hz) from the D5 and D4 coefficients and the HF range from the D3, D2 and D1 coefficients. All the reconstructed features were denoted as W VLF , W LF and W HF respectively. From the reconstructed signals (W VLF , W LF and W HF ) including each single coefficient (A5 and D5-D1), Energy (E), the square of the DWT coefficient of the HRV signal, Approximate Entropy (ApEn), Sample Entropy (SampEn), Kurtosis and skewness were computed [1].
The ApEn is a nonlinear method that is used to measure the regularity and unpredictability of signal variations [100] . This method has a potential application in heart rate data analysis [101].
Meanwhile, the SampEn is a modification of the approximate entropy used for the assessment of the complexity and regularity of physiological time-series [106]. Unlike the ApEn, the SampEn is independent of data length and performs well consistently.
Kurtosis and skewness are used to quantify the probability distributions of the signal series [102]. Kurtosis indicates whether the data is peaked or flat relative to the normal distribution. Skewness measures the asymmetry of the tails of distribution. The kurtosis (Kur) and skewness (Skw) are defined as where X is the probability distribution of the signal, μ is the mean value of the data set, and σ represents the standard deviation of the data set [33].

HRV Feature Selection
The main objective of a feature selection process is to select the optimum number of features that can produce the highest classification performance. From the feature extraction process, a vector of 83 features may contain irrelevant and redundant features. Therefore, a feature selection technique was required to identify the most representative features so as to improve the performance of the classification. In this study, the genetic algorithm (GA) was exploited as a feature selection method. GA involves an iteration process, which manipulates the chromosomes to produce a new population using genetic functions such as crossovers and mutations. According to the theory, the fittest species will survive the evolution process, while the weak will perish. For GA to start its process, an initial population of chromosomes was randomly created. The chromosome is the encoded bit strings representing the sequence of the features, while the gene represents each feature. A combinatorial set of genes or features in the current population, known as individual was evaluated by fitness function once the GA started the iteration.
For the current work, the fitness function (FitFunc) was computed using kNN-based classification error with k = 3 [89][90]7] which is defined as: (4) where α denotes kNN-based classification error and N f is cardinality of the selected features. Table 2 summarizes the parameters used in the GA along with their selected values. From this table, the initial population was set to 40, which means that there were 40 individuals (40 sets of chromosomes) in the initial population. The length of each chromosome was 83 which presented the number of extracted feature in this work. To create a new population, the GA performs elitism, crossover and mutation in a sequential order. The number of elitism used in this study was 2, indicating that two children or individuals with the lowest error rate were pulled out for the next generation. This means, the remaining 38 individuals (40 -2) continued the iteration to produce crossover and mutation kids. In this study, the crossover probability was set to 0.9 where the number of crossover children is round(0.9 x 40) = 36. Meanwhile, the number of mutation children was 40-36-2 = 2.

Table 2
Genetic Algorithm's operators used in this study

Feature fusion
In order to yield reactivity measure, the features were computed according to the normalization equation below [88,130]: (5) where Ŷ denotes the mean of sample feature vector, and ŝ denote the standard deviation of the feature vector. This equation (Equation 5) in the same time will counteract the numerical imbalance between the features and gain a satisfactory fusion performance. The serial fusion was achieved by concatenating the feature vectors of the biomarkers, namely h obtained from the HRV feature extraction process, and b, p and c representing the BP, PR and salivary cortisol respectively. The aim of the fusion was to combine information from both the ANS and HPA in order to obtain a better information representation. Therefore, for comparison, the biomarkers that represent the ANS namely the HRV, BP and PR, and the one that represents the HPA, namely cortisol, were fused. For better understanding, the studied biomarker combinations for Serial Fusion (f s ) are summarized as follows: Meanwhile, parallel fusion was achieved by combining the feature vectors into a complex vector. To represent the combined features in a complex vector, let a and b be two different feature vectors of the same sample S, such that the complex vector, c = a + ib, where i is an imaginary unit. However, two kinds of feature vectors can be formed either a + ib or b + ia. The question maybe raised whether both combinations are the same. This can be proven by referring to the Pythagorean Theorem, which is also known as Pythagoras's theorem ( Figure 4). From this theorem the modulus or absolute value, |z| of a complex number z = a + ib is its distance from the origin, which also define the Euclidean distance (e d ). From Figure 5, it can be seen that z = a + ib, then the modulus |z| is (10) while for the second type of combination: (11) Therefore, As with the serial combination, the biomarkers that represent the ANS and the one that represents the HPA axis i.e. cortisol, were fused and the resulting Parallel Fusion or Euclidian distance combinations (f e ) are given as below: (13) f e2 = b + ic (14) f e3 = p + ic (15) For the fourth combination, which was a fusion of the HRV, BP, PR and cortisol, the biomarkers ware grouped into two feature vectors [56] where one feature vector represented the ANS (HRV, BP, PR) and the other one represented the HPA axis (salivary cortisol). Therefore, the fused parallel feature for this combination was defined as: Let the feature vector of the ANS be, Hence,

Classification
A classifier is needed to predict the output class of an input where the data is already in existence. In this study, the stress response between two groups of people, one group with adverse childhood experiences and the other as a normal control group, was classified using a SVM classifier.

Support Vector Machine
The SVM, which is a classifier that is being extensively used in the biomedical field, constructs a separating hyperplane in a feature space which splits the training data into two stress response classes. An SVM classifies the data by searching the best hyperplane that splits all the data points of two classes. The hyperplane with the largest margin between the two classes is selected as the best hyperplane for an SVM. The margin means the maximal width of the slab parallel to the hyperplane that has no interior data points. The support vectors are the data points that are closest to the separating hyperplane. These points are on the boundary of the slab. The following figure ( Figure 6) illustrates these definitions [49].

Classifier Performance Evaluation
The performance of classifiers can be estimated through a number of different approaches. One of the most commonly used performance measures is the accuracy of the classification. Usually, the classifier is trained by using a set of training data, and the performance measure is calculated from a set of test data.
In this study, the performance measures derived from the confusion matrix, as shown in Figure 7, were used. From this confusion matrix, accuracy (Acc), sensitivity (Sen) and specificity (Spe) were implemented as the performance measures [62,72].

Fig. 7 Confusion Matrix
The overall performance of the classifier in terms of its classification accuracy is defined as the percentage of the test sample that is correctly identified by the classifier, and it is expressed as: (18) where TN denotes the true negative, which is the number of correctly rejected non-events, and TP denotes the true positive, which is the number of correctly detected events. Meanwhile, FN, is the number of missed events and FP, the number of falsely detected events. Both denote a false negative and a false positive, respectively.
Sensitivity is the probability that an event tested will receive a positive test result [72]. It also defined as the ratio of the TP to the total number of events, and is given by: (19) Meanwhile, the specificity, (also known as the true negative rate) is the probability that an event that is tested will receive a negative test result [72]. It is also defined as the ratio of the TN (true negatives) to the total number of non-events, and is given by:

HRV Feature Extraction
In total, 83 features were extracted from the HRV signal. The summary of the extracted features are presented in Table 3. These features were selected to be used in this study since it has been proven to provide significant results in discriminating HRV reactivity in response to the mental stress task. Previous studies showed that the features employed for HRV characterization are mostly based on conventional approaches which are time and frequency analysis [17,35,82]; on the other hand, several studies presented the use of more advanced feature analysis such as time frequency, wavelet, and nonlinear [1,12,110] . Interestingly, this study used the combination of both conventional and latest advanced analysis to extract the HRV features.

Table 3
The Total Number of Extracted HRV features

Feature Selection: Genetic Algorithm
From the GA feature selection, 12 best features were finally selected and are presented in Table 4. These selected HRV features were then fused with other biomarkers and the result is presented in following section.

Table 4
Feature Extracted from feature selection by Genetic Algorithm

Feature Fusion
After the best HRV features were selected using the GA feature selection, the features were then fused with the other biomarker collected in this study i.e. BP, PR and SCort. At the beginning, parallel fusion, also identified as Euclidean distance (e d ) was applied for the biomarker fusion. Serial fusion, which is the most common and simple method of fusion was then implemented for comparison. Conceptually, combination of the biomarkers that represent ANS and HPA was implemented in this study as it was suggested that information from both biological systems are highly related with each other in their own pathological way and the combination of these information was expected to yield a robust stress response index [79][80]115,66]. As explained earlier, the biomarkers that represent ANS are PR, BP and HRV meanwhile for HPA is SCort. Each biomarker that represents ANS was fused with SCort which is biomarker that represents HPA. Therefore, three combinations were produced, HRV-SCort, BP-SCort and PR-SCort. Finally all biomarkers for ANS were fused with SCort (HBP-SCort). Table 5 presents the performance of fused biomarkers using linear SVM in terms of accuracy, sensitivity and specificity. From the table, HRV-SCort using Euclidean distance achieved the most satisfactory performance with 80.0% accuracy, 83.3% sensitivity and 78.3% specificity. The table also showed that, the fusion of all biomarkers i.e. PR-BP-HRV with SCort had lower performance compared with HRV-SCort. This result proved that HRV is a powerful biomarker that can represent ANS.

Table 5
Performance of fused biomarker by SVM classifier Next, the performance of this new fused feature was then compared with performance of stress response classification by using a single biomarker as presented in Table 6. From the table, the stress response classification of the fused biomarker outperformed that of the single biomarker. In addition, it can be seen that the fusion of BP and SCort nearly achieve satisfactory performance with 78.3% accuracy, 65.0% sensitivity and 90.0% specificity which can be further investigated in future work. However, the use of BP alone has drawback where it is impossible to measure the throughout the stress testing [8].

Table 6
Comparison of the performance between fused and single biomarker for stress response classification by using SVM This study proposed HRV and salivary cortisol named as HRV-SCort as the most effective measure for stress response classification compared to either different combinations of biomarkers or single biomarkers. It is important to be noted that, this combined biomarker represents both ANS and HPA, which are two main body systems that are activated during stress. Therefore, this new approach to measurement is believed to be more reliable and accurate compared to the existing measures as it is able to assess both essential systems simultaneously and thus minimize the errors and limitations caused by single biomarkers. In addition, a new approach for classification of stress response based on HRV-SCort biomarker using Euclidean distance and SVM named as e d -SVM is proposed. The robustness of this method is crucial in contributing to the effectiveness of the proposed stress response measures.
Since SVM indicated highest performance in classifying the fused biomarker (HRV-SCort), SVM model is suggested could be further implemented for the computation of stress response index. The stress response index differentiates how the individual responds to stress by classifying between irregular stress response and normal stress response. This index can also be used for an indicator for future health. The detection of this stress response irregularity is important so that preventive measures can be taken and if needed, further thorough diagnosis can be done. This perhaps might improve the health care management during adulthood.

Conclusion
This study presented the fusion of ANS and HPA biomarkers in order to classify the stress response based on adverse childhood experience. From the results, it can be concluded that the fusion of HRV and salivary cortisol named as HRV-SCort was proven as the most significant measure which represents both the ANS and the HPA axis for stress response classification when compared to different combinations of biomarkers and single biomarkers. Next, the present study showed that the fusion approach using Euclidean distance is an effective fusion method for HRV-SCort in classifying the stress response to the Paced Auditory Serial Addition Test (PASAT) between individual who had adverse childhood experience and those who had no adverse childhood experience.