A combined genomic/metabolomic approach to differentiation of the geographical origins of natural products: deer antlers as an example

 

Abstract

Deer antlers have been used as medicinal or health products for hundreds of years in East Asian countries. As the positive physiological effects of deer antlers correlate strongly with their chemical components, which can vary widely according to geographical origin, correct identification of those origins is essential to their quality control. In the present study, we applied both genomic and metabolomics approaches to the origin identification of samples from Canada, New Zealand and Korea. The genomic approach employing mitochondrial DNA sequencing provided the distribution of the deer species in each country, but failed to categorize all the samples, due to the presence of identical species in different countries. The nuclear magnetic resonance (NMR)-based metabolomics approach revealed a clean differentiation between the New Zealand and Korean samples, but showed ambiguities for the Canadian samples. We then applied the metabolomics approach to the identical-species samples that could not be differentiated by DNA sequencing. This yielded clean separations of all of the analyzed samples, and compounds specific to each country also were identified. The validity of the metabolomic method for differentiating identical species was confirmed by correct prediction of the blind samples. In summary, as the genomic approach provided unambiguous read-outs for different species, and the metabolomics approach cleanly distinguished between identical species from different countries, their combined use could be an especially robust method for the identification of origins even in difficult cases. We believe the method to be generally applicable to many herbal medicinal products for which various species are grown internationally.

Introduction

In East Asian countries, deer antlers have been used as important medicinal ingredients and dietary supplements for hundreds of years. In fact, much research has shown that deer antlers can provide beneficial effects such as anti-aging [1] and anti-inflammatory [2] as well as blood-regenerating and blood-pressure-lowering effects [3]. These salutary results are due to the many chemicals and other constituents that have been found to be present in deer antlers: amino acids, nucleic acids, polyamines, vitamins, and additional organic and inorganic acids. As with any other natural substances, deer antler compositions vary widely according to their geographical origins. Prices also vary, by as much as tenfold. Almost inevitably then, fraud and illicit trade have ensued, causing socioeconomic problems in antler-producing countries and medical malpractice issues in antler-consuming countries. As various species of deer are cultivated in different countries, classification of the origins of deer antlers is essential for their legal marketing and correct use.

Studies have differentiated deer or antlers using genetic, morphological and metabolomic approaches. A phylogenetic analysis of the antler-growing genus Cervus was performed using the entire cytochrome b gene [4], and a world deer phylogeny was constructed on the basis of mitochondrial DNA (mtDNA) gene sequence data [5, 6]. Classification of deer species according to their genetic variation has been conducted by karyotyping [7], repetitive DNA sequencing [8, 9], RFLP analysis of mtDNA [10], and gene sequencing of mtDNA [11]. Deer have also been classified on a morphological basis. The European red deer (C. elaphus), the wapiti (the C. elaphus subspecies of Asia and North America), and the sika deer (Cervus nippon) are monophyletic [4]. In terms of overall morphology, they are very similar to one another, except in body size and antler morphology [4, 12, 13]. Lastly, a metabolomic approach combined with principal component analysis (PCA) has been employed for classification of antlers [14]. However, the number of samples in one group was quite small (about six), and PCA, being an unsupervised method, does not by itself provide a model for predicting the origins of unknown samples.

Although the above approaches have their respective merits, there are drawbacks that limit their general use: genetic approaches cannot differentiate antlers from the same species; morphological studies lack objective criteria; metabolomic approaches require sizeable sample sizes for their statistical reliability. In the present study, we employed a combined genomic/metabolomic approach to the determination of various species' origins. Metabolomically, we used an approximately five-times-larger sample size, for one country group, than did a previous study [14]. Whereas neither individual approach proved fully satisfactory, their combination enabled accurate and reliable differentiation of deer antlers by origin from among the various species. Origin prediction proved effective even for antlers from the same species from different countries. Our method, moreover, can be applied to the discrimination not only of deer antlers but also, more generally, of other East Asian medicines, such as herbs and other plants.

Results and Discussion

Genomic approach

We analyzed the DNA sequences of 101 deer antler samples collected from Canada, New Zealand and Korea. The deer species for all of the samples were determined by DNA sequencing of the base pairs in the 439 - 450 sequence region of the D-loop of the mitochondrial DNA (Figure 1B). Among the 40 Canadian samples, 24 were found to be C. e. nelsoni, and 13 and 3 samples C. e. manitobensis and C. e. canadensis, respectively (Supplementary Table S1). Among the 30 New Zealand samples, 27 were C. elaphus (25 samples showed 100% sequence homology, and 2 samples showed 97% sequence homology), and the remaining 3 belonged to C. e. nelsoni, C. e. macneilli (92%), and C. e. barbarus (96%), respectively. Among the 31 Korean samples, 10 belonged to C. elaphus, 11 to C. e. canadensis, and 10 to C. e. nelsoni. Therefore, C. e. nelsoni was the predominant Canadian species but only a very minor one in New Zealand. Conversely, the predominant New Zealand species, C. elaphus, was not found among the Canadian samples. These two species, C. e. nelsoni and C. elaphus, were almost equally prevalent in the Korean samples. It is known that aboriginal Korean deer became almost extinct during the 1940s, and that in response deer were imported from foreign countries. Our data indicate that many of the deer currently grown in Korea came from Canada and New Zealand. The data also indicate that the DNA sequencing (genomic) approach can be used effectively to identify the origins of some species (for example, C. e. manitobensis, which was present only in the Canadian samples), or to exclude Canada as the origin of C. elaphus. However, it could not differentiate the countries of origin for all of the samples, even though the read-out of the DNA sequence itself was almost unambiguous.

NMR-based metabolomics approach

As the genomic approach alone did not provide information sufficient to reveal the origins of some of the antler samples, we employed another approach that can address the environmental or growth conditions of the deer. Specifically, we obtained NMR spectra from the antler extracts and analyzed them using a metabolomics approach (Figure 2). The spectra featured many signals from sugar-containing compounds in the 3 - 4 ppm region as well as signals from methyl groups, probably from branched-chain amino acids [17]. Although the representative spectra of the samples from each country seemingly differed, they could not resolve the question of intra-group variation. Therefore, we performed a multivariate statistical analysis of the entire NMR data set. We applied Partial Least Squares-Discriminant Analysis (PLS-DA) to determine if the metabolic profiles could be used to differentiate the origins and to find specific signals belonging to each country group (Figure 3). The results showed that the PLS-DA model could reliably differentiate the New Zealand antlers from the Korean antlers. However, the Canadian samples exhibited some overlap with both the New Zealand and Korean samples. The quite tight clustering of the Korean samples might reflect the similar growth conditions of the deer in that country, which is much smaller than the other two. The results also indicated that the antlers' metabolic profiles might be affected more by environmental or growth conditions than by the DNA sequences, as there was no noticeable grouping within the Korean samples for the three equally populated species.
Overall, the metabolomics approach, though in some ways effective, is inadequate for differentiating all the samples at once. In fact, the encouraging data from a previous report on origin differentiation by NMR-based metabolomics might have been skewed by the much smaller sample sizes for each country (compared with ours) [14].

Combined genomic/metabolomic approach

As both of the approaches showed at least some utility in discriminating antler origins, we reasoned that a combined genomic/metabolomic strategy might effect a significant improvement. First, the genomic approach, without further experimental analysis, can be used to differentiate the species present in only one country, for example, C. e. manitobensis in Canada, or C. e. macneilli (92% sequence homology) and C. e. barbarus (96%) in New Zealand. Second, we used metabolomics for the species that are present in large numbers in more than one country. We analyzed C. elaphus, which is found in both the New Zealand (25 samples) and Korean (10 samples) samples, and C. e. nelsoni, which is prevalent in both the Canadian (24 samples) and Korean (10 samples) samples, as determined by the DNA analysis (see Supplementary Table S1). We performed a multivariate Orthogonal Projections to Latent Structures-Discriminant Analysis (OPLS-DA) on the NMR metabolic profile data for each species. The differentiations were achieved with one predictive component and one orthogonal component for C. elaphus and one predictive component and two orthogonal components for C. e. nelsoni. The results showed that the origins of each species could be clearly differentiated (Figures 4A and 4B). Overall, the results demonstrate that categorizing the species by DNA sequencing first, and then analyzing the identical species (i.e., those that could not be differentiated genetically) by means of metabolomics, can comprehensively discriminate the origins of the antler samples.
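The two-step logic described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used in the study: the function name `route` and the dictionary layout are hypothetical, and the counts are simply those reported in the genomic results (Supplementary Table S1 as summarized in the text). A species seen in only one country is assigned by DNA alone; any species seen in more than one country is passed on to the metabolomics step.

```python
# Species counts per country, as reported in the genomic results in the text.
# The data layout and the function name below are illustrative assumptions.
SPECIES_COUNTS = {
    "C. e. nelsoni": {"Canada": 24, "New Zealand": 1, "Korea": 10},
    "C. e. manitobensis": {"Canada": 13},
    "C. e. canadensis": {"Canada": 3, "Korea": 11},
    "C. elaphus": {"New Zealand": 27, "Korea": 10},
    "C. e. macneilli": {"New Zealand": 1},
    "C. e. barbarus": {"New Zealand": 1},
}


def route(species):
    """Decide which step can assign the origin of a sample of this species.

    Returns ("genomic", [country]) when the species was observed in one
    country only, ("metabolomic", [candidate countries]) when it was
    observed in several, and ("unknown", []) for an unlisted species.
    """
    observed = SPECIES_COUNTS.get(species, {})
    countries = sorted(c for c, n in observed.items() if n > 0)
    if not countries:
        return ("unknown", [])
    if len(countries) == 1:
        return ("genomic", countries)
    return ("metabolomic", countries)


total_samples = sum(n for per_country in SPECIES_COUNTS.values()
                    for n in per_country.values())

for name in SPECIES_COUNTS:
    step, candidates = route(name)
    print(f"{name}: {step} -> {', '.join(candidates)}")
```

Note that this naive rule lists all three countries as candidates for C. e. nelsoni because of the single New Zealand sample, whereas the metabolomic models in the text were built for the two countries in which each species is present in large numbers.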

Statistical validation

To rule out the possibility that the clear separations might have occurred by chance, we performed statistical validation using Y-scrambling [15, 16, 18]. We randomly permuted the values of the Y variable (class membership) for 200 rounds, rebuilding and re-evaluating the model after each permutation. We observed a substantial decrease in both the R2 and Q2 parameters for each model (Supplementary Figures S1A and S1B), with the extrapolated values of the Q2 regression lines being about -0.2 and -0.3, respectively.
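The idea behind Y-scrambling can be illustrated with a minimal Python sketch. Several things here are assumptions made only for illustration: the data are synthetic, a nearest-centroid classifier scored by leave-one-out accuracy stands in for the OPLS-DA model and its Q2 parameter, and all function names are hypothetical. What the sketch does share with the validation described above is the core step: the class labels are shuffled 200 times, the model is rebuilt each time, and the score obtained with the true labels is compared with the scores obtained with scrambled labels.

```python
import random


def centroid(rows):
    """Column-wise mean of a list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def loo_accuracy(X, y):
    """Leave-one-out accuracy of a nearest-centroid classifier
    (a simple stand-in for a cross-validated score such as Q2)."""
    correct = 0
    for i in range(len(X)):
        train = [(x, lab) for j, (x, lab) in enumerate(zip(X, y)) if j != i]
        cents = {lab: centroid([x for x, l in train if l == lab])
                 for lab in sorted({l for _, l in train})}
        pred = min(cents, key=lambda lab: sq_dist(X[i], cents[lab]))
        correct += (pred == y[i])
    return correct / len(X)


def y_scramble(X, y, rounds=200, seed=0):
    """Score the true labels, then re-score after each random permutation."""
    rng = random.Random(seed)
    true_score = loo_accuracy(X, y)
    permuted_scores = []
    for _ in range(rounds):
        y_perm = list(y)
        rng.shuffle(y_perm)          # scramble Y, keep X untouched
        permuted_scores.append(loo_accuracy(X, y_perm))
    hits = sum(s >= true_score for s in permuted_scores)
    p_value = (hits + 1) / (rounds + 1)   # empirical permutation p-value
    return true_score, permuted_scores, p_value


# Synthetic two-feature data: two well-separated groups of 10 samples each.
data_rng = random.Random(42)
X = ([[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(10)]
     + [[5 + data_rng.uniform(-1, 1), 5 + data_rng.uniform(-1, 1)]
        for _ in range(10)])
y = ["A"] * 10 + ["B"] * 10

true_score, permuted_scores, p_value = y_scramble(X, y)
print("score with true labels:", true_score)
print("mean score with scrambled labels:",
      sum(permuted_scores) / len(permuted_scores))
print("empirical p-value:", p_value)
```

A model that captures real group structure should score clearly better with the true labels than with scrambled ones, which is the behaviour the decrease in R2 and Q2 reflects.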

Marker compound identification and verification

To obtain an idea of which metabolites contributed to the differentiation of the same species grown in different countries, we constructed S-plots based on the two OPLS-DA models, one for each species (see Figures 5A and 5B). Regarding the C. elaphus samples from New Zealand and Korea, the signals at 2.6537 and 0.9804 ppm were higher in the New Zealand samples, whereas the 1.3237 ppm signal was higher in the Korean samples. Based on a comparison with the standard samples and a two-dimensional spectral analysis (HMBC, DQF-COSY, TOCSY, HSQC) [17], we identified those signals as coming from methionine, valine and lactate, respectively. Other signals from the identified compounds also were confirmed by the NMR analysis (data not shown). To further test for the actual biased presence of the marker metabolites, we plotted the intensities of their signals in the New Zealand, Canadian and Korean samples and compared them using an independent Student's t-test (Figure 6). The result confirmed that these metabolites were significantly biased in one of the groups, contributing to the separation.
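For readers unfamiliar with the test, the statistic behind an independent two-sample Student's t-test can be computed as in the following Python sketch. The function name `student_t` and the input values are hypothetical and are not data from this study; the sketch assumes equal variances in the two groups (the classical pooled-variance form) and returns only the t statistic and the degrees of freedom, leaving the p-value to a t-distribution table or a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Independent two-sample Student's t statistic (pooled variance).

    Returns (t, degrees_of_freedom). Assumes equal group variances.
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


# Hypothetical signal intensities for one metabolite in two groups.
group_1 = [1.0, 2.0, 3.0]
group_2 = [2.0, 3.0, 4.0]

t_stat, dof = student_t(group_1, group_2)
print(f"t = {t_stat:.4f}, degrees of freedom = {dof}")
```

The absolute value of t is then compared with the critical value for the chosen significance level and the given degrees of freedom.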

As concerns the identified marker compounds, it was interesting to see that the same aliphatic amino acids, methionine and valine, were higher in both the New Zealand and Canadian samples than in the Korean samples, regardless of the species. In addition, maleate and lactate, common organic acids, were higher in the Korean samples than in those from the other two countries. We recently reported that feeding conditions can affect the metabolites detected in deer antlers. It is likely therefore that the differential metabolic profiles of single-species antlers from deer living in different countries reflect the growth, food, and environmental differences in those respective countries. Moreover, our results confirmed that identical species can be differentiated based on their metabolic profiles.

Prediction of the origins for a single species

An important practical consideration in differentiating the origins of natural products, including deer antlers, is whether a given method can be used to predict unknown samples correctly. As our genomic (DNA) test could deliver unambiguous species differentiation, we tested our metabolomics model to see if it could predict the origins of unknown samples from a single species. We randomly removed approximately 30% of the samples from the entire dataset as a test set, rebuilt an OPLS-DA model using the remaining samples, and carried out the prediction test with this model, treating the test set as unknown samples. Specifically, we removed a total of 11 samples (8 New Zealand and 3 Korean) representing C. elaphus, along with 11 samples (8 Canadian and 3 Korean) representing C. e. nelsoni. As shown in Figures 7A and 7B, all 11 of the C. elaphus test samples and all 11 of the C. e. nelsoni test samples were predicted correctly using an a priori cut-off value of 0.5, thereby confirming the robustness of the metabolomics antler differentiation model for same-species samples.
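The hold-out procedure can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the model used in the study: the data are synthetic, the group sizes (25 and 10, with 8 and 3 held out) merely mirror the C. elaphus case, and a projection onto the line joining the two class centroids stands in for the OPLS-DA prediction. Like a discriminant model with classes coded 0 and 1, the stand-in yields a predicted value near 0 for one class and near 1 for the other, so the same a priori cut-off of 0.5 can be applied. All function names are hypothetical.

```python
import random


def centroid(rows):
    """Column-wise mean of a list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def fit(train, class0, class1):
    """Build a scorer giving ~0 at the class0 centroid and ~1 at class1.

    A centroid-projection stand-in for the predicted Y of a discriminant
    model; it is NOT an OPLS-DA implementation.
    """
    c0 = centroid([x for x, lab in train if lab == class0])
    c1 = centroid([x for x, lab in train if lab == class1])
    direction = [b - a for a, b in zip(c0, c1)]
    norm2 = sum(d * d for d in direction)

    def score(x):
        return sum((xi - ci) * di
                   for xi, ci, di in zip(x, c0, direction)) / norm2

    return score


def split(samples, n_test_per_class, rng):
    """Randomly hold out a fixed number of samples from each class."""
    train, test = [], []
    for label, n_test in n_test_per_class.items():
        group = [s for s in samples if s[1] == label]
        rng.shuffle(group)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


rng = random.Random(7)
# Synthetic two-feature "profiles" for two origins (illustrative only).
samples = ([([rng.uniform(-0.5, 0.5), rng.uniform(-0.5, 0.5)], "Korea")
            for _ in range(10)]
           + [([5 + rng.uniform(-0.5, 0.5), 5 + rng.uniform(-0.5, 0.5)],
               "New Zealand") for _ in range(25)])

train, test = split(samples, {"New Zealand": 8, "Korea": 3}, rng)
score = fit(train, class0="Korea", class1="New Zealand")

CUTOFF = 0.5  # a priori cut-off, as in the text
predictions = [("New Zealand" if score(x) > CUTOFF else "Korea", label)
               for x, label in test]
n_correct = sum(pred == label for pred, label in predictions)
print(f"{n_correct} of {len(test)} held-out samples predicted correctly")
```

The essential point is that the held-out samples play no part in building the model, so their predicted values are a fair test of how the model would treat genuinely unknown samples.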

Significance and Conclusions

Natural and agricultural products, including animal and herbal varieties, used very widely as foods and dietary supplements, are important economic commodities in all countries. The values of these commodities vary significantly according to their origins. Therefore, correct determination of the countries of origin is important not only for accurate economic valuation but also for quality control. For products obtained from species that are differentially present in different countries, the genomic approach can provide a very reliable means of identification. However, this approach cannot be applied to the differentiation of identical species from different countries. This limitation is a particularly important one, in that seeds for herbal products are traded internationally more often and widely than ever before, increasing the likelihood of the same species being grown in various countries. Indeed, some herbal species are intentionally grown in other-than-native countries where cultivation costs are significantly lower. In these cases, though, differing cultivation techniques and environmental conditions lead almost inevitably to a wide range of product quality. As we showed here, the metabolomics approach can help differentiate the origins under those difficult circumstances.
Although the metabolomics approach alone could not differentiate the three origins for all of the deer antler samples, it was reliably effective in differentiating identical-species samples from different countries. Thus far, many metabolomics studies, including ours, have reported the differentiation of origins by means of either NMR or mass spectrometric methods [15, 18-22]. By contrast, examples of origin discrimination for samples confirmed to be a single species by DNA sequencing are hardly seen in the literature. We suggest that the combined genomic/metabolomic approach can improve the reliability of natural-product origin differentiation.

