DNA barcodes of the vascular flora of the Altai Mountain Country: type material of the Herbarium ALTB

Summary. The article presents the first results of DNA barcoding of type specimens held in the ALTB Herbarium (Barnaul, Russia). The sequences obtained for the ITS, trnL-trnF, and trnH-psbA DNA markers were deposited in NCBI GenBank.


Introduction
Identification, naming, and classification of living organisms at the species level are the foundation of all biology and have become indispensable in biodiversity analysis and management, conservation, and breeding (Vu, Le, 2019). "DNA barcoding" is a genetic analysis technology based exclusively on DNA. In the global biodata infrastructure, it plays a major role, primarily in solving fundamental problems of biodiversity. It is a diagnostic technique that uses short DNA sequence(s) for effective and accurate identification of different groups of organisms, including unknown species (Ankola et al., 2021). Unified protocols for DNA isolation and analysis significantly increase the efficiency of research and, therefore, the relevance of the results obtained, while public data repositories (NCBI, EMBL-EBI, GBIF, DDBJ) make the results useful in other natural sciences, not only in biodiversity, ecology, and genetics.
The development of DNA analysis methods led to the creation of "The Consortium for the Barcode of Life" (CBOL) and "The Barcode of Life Data System" (BOLD). These repositories, which store sequences of individual markers (barcodes), are also in demand for taxon identification.
DNA barcoding also has a wide and expanding range of practical applications, including the protection of biodiversity and rare species and the prevention of their illegal collection and sale; the control of plant raw materials, herbal teas, honey, and other commercial products; the control of weeds, invasive species, allergy-causing plants, etc. (Koltunova et al., 2019; Shneyer, Rodionov, 2019); and the genotyping of both cultivated (Chinnappareddy et al., 2013; Mitrova et al., 2015) and wild plants (Sinitsyna et al., 2016; Smirnov et al., 2017).
In global DNA barcoding, the question of an approved set of markers specifically for plants remains unresolved (Shneyer, Rodionov, 2019). So far, the nuclear-encoded ribosomal internal transcribed spacer (ITS) region and the chloroplast intergenic spacer trnH-psbA have emerged as candidates for barcoding plants, followed by others, including coding sequences from the plastid genes rbcL and matK, the two loci now most commonly used for plants (Kress, Erickson, 2007; Yao et al., 2010; Loera-Sánchez et al., 2020; Guo et al., 2022). These markers can be used separately or in combination with other markers or spacers. Since the choice of a standard plant barcode is complicated by the trade-off between high sequence variability and high primer conservation, it is recommended to use more than one marker simultaneously as a compromise that best matches the barcoding criteria (Lahaye et al., 2008; Shneyer, Rodionov, 2019; Guo et al., 2022).
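The recommendation to combine markers can be illustrated with a minimal Python sketch. The taxon names and sequences below are hypothetical toy data, not results of this study, and `resolved_taxa` is an illustrative helper rather than a published method. The sketch counts which taxa are distinguished by each marker alone and by the two markers together.

```python
# Hypothetical toy data (NOT real sequences): taxon -> marker -> sequence.
toy_barcodes = {
    "taxon_A": {"ITS": "ACGTAC", "trnH-psbA": "TTGACA"},
    "taxon_B": {"ITS": "ACGTAC", "trnH-psbA": "TTGGCA"},
    "taxon_C": {"ITS": "ACGTTC", "trnH-psbA": "TTGACA"},
}


def resolved_taxa(barcodes, markers):
    """Return the taxa whose combined sequence over `markers` is unique."""
    # Join the chosen markers into one key per taxon.
    combined = {
        taxon: "|".join(seqs[m] for m in markers)
        for taxon, seqs in barcodes.items()
    }
    # Count how many taxa share each combined key.
    counts = {}
    for key in combined.values():
        counts[key] = counts.get(key, 0) + 1
    return sorted(t for t, key in combined.items() if counts[key] == 1)


its_only = resolved_taxa(toy_barcodes, ["ITS"])
psba_only = resolved_taxa(toy_barcodes, ["trnH-psbA"])
both = resolved_taxa(toy_barcodes, ["ITS", "trnH-psbA"])

print(its_only)   # ['taxon_C']
print(psba_only)  # ['taxon_B']
print(both)       # ['taxon_A', 'taxon_B', 'taxon_C']
```

In this toy case each marker alone separates only one of the three taxa, whereas the combination separates all of them. Real data would require sequence alignment and distance-based comparison rather than exact string matching.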
Thus, DNA barcoding is considered a powerful and promising tool of molecular taxonomy that allows taxonomists and conservation biologists worldwide to discover new species by analyzing unknown DNA sequences against a DNA barcode database in combination with key morphological evidence (Ankola et al., 2021).
The Altai Mountain Country (AMC, Flora Altaica. http://altaiflora.asu.ru) is the highest modern uplift among the continental mountain countries of Siberia, as well as of Northern and Central Asia in general (Kamelin, 1998). The area occupies about 550 000 km² and includes the Chinese, Kazakh, Mongolian, and Russian Altai, as delimited by R. V. Kamelin (Kamelin, 2005; Vaganov et al., 2019). In 2002, David Olson and Eric Dinerstein singled out the Altai-Sayan territory as one of the 200 priority ecoregions of the world for global conservation of biodiversity in their work "The global 200 Priority ecoregions for global conservation" (Olson, Dinerstein, 2002). More than 2700 plant species, 300 of which are endemic, grow within the territory of the AMC (Vaganov et al., 2021). A list of 42 world scientific depositories that hold information on findings of animals, plants, and fungi from the AMC and are published in the Global Biodiversity Information Facility (GBIF) has been compiled (Vaganov et al., 2019).
Plant biodiversity remains a potential source of novel human benefits, which calls for both the discovery of new taxa and closer study of known ones (Erst et al., 2022). Endemic species, those restricted in their distribution to a relatively small geographic area, are the most vulnerable to extinction (Chichorro et al., 2019; Erst et al., 2022). The type material of herbarium collections can play a key role in DNA barcoding studies of new plant species, which in most cases are endemic.
The main collection of the ALTB Herbarium (South Siberian Botanical Garden, Barnaul, Russia) holds more than 450 000 sheets, including 334 items of type material (as of 20.11.2022). The publication of data on DNA sequences and on the distribution of AMC plants in gene banks and GBIF is one of the indicators of active work in genetics and biodiversity informatics at the level of modern standards. In 2022, within the framework of the RSF project "Study of Phytodiversity and Genetic Resources of the Altai Mountainous Country Based on Big Data", DNA barcoding of the type material of the ALTB collection was started, and work was carried out to digitize the collection (http://altb.asu.ru).
Turczaninowia 25, 4: 5-11 (2022)
Accordingly, the purpose of our work was to sequence the main DNA markers used as DNA barcodes for type specimens of the ALTB Herbarium. At the first stage, we chose three popular markers: the ITS region of nrDNA, the trnH-psbA intergenic spacer, and the trnL-trnF intergenic spacer together with the trnL intron of plastid DNA.

Materials and methods
For the molecular genetic study, we took material (a small fragment of the dried plant) from 110 specimens representing 72 type taxa of different taxonomic rank (species, subspecies, nothospecies, varieties, etc.) from 16 families held in the ALTB Herbarium, mainly from the territory of the AMC. After revision of the type material selected for analysis, the genera with the most representatives were Alchemilla L., Veronica L., Potentilla L., and Gagea Salisb.
DNA isolation and amplification were conducted in the Laboratory of Bioengineering of the South Siberian Botanical Garden of Altai State University according to standard techniques (Kutsev, 2009). DNA was isolated using the DiamondDNA kit (LLC "ABT", Russia) according to the manufacturer's instructions.
The concentration of the DNA sample was determined fluorometrically with a NanoPhotometer P360 (Implen, Hamburg, Germany), as well as by electrophoresis in 1.5 % agarose gel using the Step50plus DNA ladder (BioLabMix). PCR products were purified using CleanMag DNA magnetic beads (Evrogen, Russia) according to the manufacturer's instructions. Purified products were sequenced by the Sanger method at the SB RAS Genomics Core Facility (Institute of Chemical Biology and Fundamental Medicine SB RAS, Novosibirsk, Russia).
The sequences obtained were analyzed in Chromas 2.6.4 and then checked in BLAST to confirm the identity of the sample. The resulting sequences were submitted to the international NCBI GenBank database (see Table).

Results and discussion
The result of the first stage of DNA barcoding of the ALTB type material was the publication of molecular data on plant species described relatively recently, mainly from the AMC territory, a significant proportion of which are rare and endemic.
In total, 102 nuclear and chloroplast DNA sequences were obtained for 60 taxa of vascular plants from the ALTB type material: 29 fragments (28 taxa) of the nuclear-encoded ribosomal internal transcribed spacer (ITS) region, 28 fragments (27 taxa) of the chloroplast intergenic spacer trnL-trnF with the trnL intron, and 45 fragments (44 taxa) of the trnH-psbA spacer. DNA sequences were not obtained equally successfully for all taxa. In some samples, the concentration of the PCR product was insufficient for sequencing. As a rule, the success of DNA extraction and subsequent amplification depended on the quality of the herbarium material.
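The totals above can be checked with a short tally. This Python sketch uses only the counts stated in the text; the variable names are illustrative.

```python
# Per-marker counts as reported in the text: (fragments, taxa).
marker_counts = {
    "ITS": (29, 28),
    "trnL-trnF + trnL intron": (28, 27),
    "trnH-psbA": (45, 44),
}

total_fragments = sum(fragments for fragments, _ in marker_counts.values())
summed_taxa = sum(taxa for _, taxa in marker_counts.values())

print(total_fragments)  # 102, matching the reported total
print(summed_taxa)      # 99

# For each marker, fragments exceed taxa by one, i.e. one taxon per marker
# is represented by two sequences.
extra_per_marker = {m: f - t for m, (f, t) in marker_counts.items()}
print(extra_per_marker)
```

The per-marker taxon counts sum to 99, which exceeds the 60 taxa reported overall, because many taxa were sequenced for more than one marker.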
The length of the ITS region in the data set was from 617 bp in Neogaillonia botschantzevii Lincz. Each nucleotide sequence obtained was uploaded to GenBank and identified by BLAST. In most cases, the percent identity was 90-100 %. A lower value meant that the taxon was absent from the database. Such results are typical of barcoding studies (Hebert et al., 2004; Erst et al., 2022).
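The percent identity criterion can be sketched as follows. This is a minimal Python illustration that assumes the two sequences are already aligned (gaps written as "-") and computes identity in the BLAST manner, as identical positions divided by alignment length. The function names, the toy sequences, and the use of 90 % as a hard cut-off are assumptions made for the example; the study itself relied on BLAST output.

```python
MIN_IDENTITY = 90.0  # lower bound of the 90-100 % range mentioned in the text


def percent_identity(aligned_query, aligned_subject):
    """Identical columns as a percentage of the alignment length."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_query:
        raise ValueError("empty alignment")
    matches = sum(
        1
        for q, s in zip(aligned_query, aligned_subject)
        if q == s and q != "-"  # a gap never counts as a match
    )
    return 100.0 * matches / len(aligned_query)


def taxon_probably_in_database(identity):
    """Interpret the best-hit identity as in the text (assumed hard threshold)."""
    return identity >= MIN_IDENTITY


# Hypothetical toy alignment: 9 identical positions out of 10.
toy_identity = percent_identity("ACGTACGTAC", "ACGTACGTTC")
print(toy_identity, taxon_probably_in_database(toy_identity))
```

A best hit below the threshold would, following the reasoning in the text, suggest that the taxon is not yet represented in the database rather than that the identification is wrong.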
The sequences were prepared and deposited in GenBank, where each was assigned a unique accession number (Table).
The first column of the Table gives the barcode of the type specimen in the ALTB Herbarium (Virtual Herbarium ALTB. http://altb.asu.ru). The second column gives the names of the type specimens as written on the herbarium labels. If a name is now treated as a synonym, the current name under which the taxon is registered in NCBI is given in brackets. All nomenclature was verified using the POWO service (https://powo.science.kew.org/). The taxa for which both nuclear and chloroplast DNA sequences were obtained were combined into a dataset and published in the Global Biodiversity Information Facility (Vaganov et al., 2022) through the Integrated Publishing Toolkit (IPT) data publisher's operator node (http://altb.asu.ru/ipt). The dataset "DNA barcodes of the vascular flora of the Altai Mountain Country: type material of the Herbarium ALTB" contains information on DNA sequences (the term "associatedSequences" of the Darwin Core specification), data on the collection localities of the type material ("decimalLatitude", "decimalLongitude"), links to digitized images of the herbarium specimens on the Internet, and other information, including labels.
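The structure of such a record can be sketched in Python. The term names are standard Darwin Core terms, but every value below is a placeholder and not taken from the published dataset. The list of terms treated as required and the helper functions are assumptions made for illustration only.

```python
import csv
import io

# Terms treated as required in this sketch (an assumption, not a GBIF rule).
REQUIRED_TERMS = (
    "occurrenceID",
    "scientificName",
    "associatedSequences",
    "decimalLatitude",
    "decimalLongitude",
)

# Hypothetical record: all values are placeholders.
example_record = {
    "occurrenceID": "ALTB-EXAMPLE-0001",
    "institutionCode": "ALTB",
    "basisOfRecord": "PreservedSpecimen",
    "scientificName": "Genus species Author",
    "typeStatus": "holotype",
    "decimalLatitude": "50.0",
    "decimalLongitude": "87.0",
    "associatedSequences": "https://www.ncbi.nlm.nih.gov/nuccore/XX000000",
    "associatedMedia": "https://example.org/specimen-image.jpg",
}


def validate_record(rec):
    """Return a list of problems found in a Darwin Core-style record."""
    errors = []
    for term in REQUIRED_TERMS:
        if not str(rec.get(term, "")).strip():
            errors.append(f"missing {term}")
    # Coordinates must be numeric and within the valid geographic range.
    for term, limit in (("decimalLatitude", 90.0), ("decimalLongitude", 180.0)):
        raw = str(rec.get(term, "")).strip()
        if not raw:
            continue  # already reported as missing
        try:
            value = float(raw)
        except ValueError:
            errors.append(f"{term} is not numeric")
            continue
        if abs(value) > limit:
            errors.append(f"{term} out of range")
    return errors


def to_csv(records):
    """Serialize records to CSV text with Darwin Core terms as the header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


print(validate_record(example_record))  # [] for a well-formed record
print(to_csv([example_record]))
```

A check of this kind before publication through an IPT node may help catch empty sequence links or impossible coordinates, although the IPT and GBIF apply their own validation.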

Conclusion
The results of the study combine molecular genetics and digital technologies, and the end-to-end numbering of the type collection of the ALTB Herbarium is integrated into the biodata architecture of GenBank and GBIF. In the future, this approach will make it possible to obtain objective results in studies of the biodiversity, evolution, and ecology of endemic and other promising plant species. Open access to the original data of the study will allow taxa to be identified and changes in their ranges to be traced more reliably and accurately. In the absence of other evidence, DNA barcoding generates hypotheses about new species rather than outright discovering them (Taylor, Harris, 2012; Guo et al., 2022). It should be noted, however, that barcoding must supplement morphological data in species description (Guo et al., 2022). In applied terms, the identification of plant material directly affects the solution of social problems of environmental safety, is part of the food and health agenda, and is no less significant for nature protection in the transboundary territory of Russia, Kazakhstan, China, and Mongolia.