Multilocus phylogeny reveals taxonomic misidentification of the Schizopora paradoxa (KUC8140) representative genome

Abstract
Schizopora paradoxa, current name Xylodon paradoxus, is a white-rot fungus with certain useful biotechnological properties. The representative genome of Schizopora paradoxa strain KUC8140 was published in 2015 as part of the 1000 Fungal Genomes Project. Multilocus phylogenetic analyses, based on three nuclear regions (ITS, LSU and rpb2), confirmed that S. paradoxa strain KUC8140 was misidentified and should be identified as Xylodon ovisporus. This misidentification explains the unexpected geographical distribution of S. paradoxa, since this species has a European distribution, whereas strain KUC8140 was recorded from Korea, Eastern Asia.


Introduction
The genus Schizopora Velen., currently synonymous with Xylodon (Pers.) Fr. (Riebesehl and Langer 2017), includes white-rot fungi that play an important role in ecosystem processes as wood decomposers. The description and identification of Xylodon (=Schizopora) species, based on morphological characters, have led to inaccuracies due to a lack of clear diagnostic characters, and it has been assumed that many Xylodon species have a worldwide distribution (Paulus et al. 2000). However, during the last decade, it has been pointed out that fungal cosmopolitanism could be the result of the application of a morphological species recognition criterion and not the result of an actual biogeographical pattern (Taylor et al. 2006). Moreover, phylogenetic analyses have revealed an undescribed species diversity masked by the morphological species recognition approach (Taylor et al. 2000).
The representative genome of Schizopora paradoxa strain KUC8140, current name Xylodon paradoxus (Schrad.) Chevall., was sequenced in 2015 as part of the 1000 Fungal Genomes Project (http://jgi.doe.gov/fungi) (Min et al. 2015); this strain was collected from an oak forest in Korea. Xylodon paradoxus is usually associated with late stages of wood decay, mainly in deciduous forests, and shows useful biotechnological properties for bioremediation, such as tolerance to heavy metals or dye decolourising activity (Lee et al. 2014). It has been recorded around the world; however, available genetic data point to a European distribution (Paulus et al. 2000). Within the framework of a broader study of Xylodon through molecular approaches, the taxonomic identity of strain KUC8140 has been assessed.

Materials and methods
In order to infer the taxonomic position of the strain KUC8140, phylogenetic relationships of six Xylodon species were addressed. DNA from specimens of X. paradoxus, X. quercinus (Table 1). Three specimens of the sister genus Lyomyces P. Karst. were included as outgroup in the phylogenetic analyses (Table 1). DNA isolation was performed using the DNeasy™ Plant Mini Kit (Qiagen, Valencia, California, USA) following the manufacturer's instructions. Three nuclear regions were amplified and sequenced: nuclear ribosomal internal transcribed spacer (ITS, fungal barcoding; Schoch et al. 2012), nuclear large ribosomal subunit (LSU) and the second largest subunit of RNA polymerase II (rpb2). Direct polymerase chain reactions (PCRs) were performed to obtain sequences from ITS and LSU with the primer pairs ITS5/ITS4 (White et al. 1990) and LR0R/LR5 (Rehner and Samuels 1994), respectively. Nested PCRs were performed to amplify rpb2 fragments, using RPB2-5F/RPB2-7.1R (Liu et al. 1999, Matheny 2005) for the first amplification, followed by RPB2-6F/RPB2-7R2 (Matheny et al. 2007) with 1 μl of the first PCR product as target DNA. Amplifications were undertaken using illustra™ PuReTaq™ Ready-To-Go™ PCR beads (GE Healthcare, Buckinghamshire, UK) as described in Winka et al. (1998), following the thermal cycling conditions in Martín and Winka (2000). Negative controls lacking fungal DNA were run for each experiment to check for contamination. Amplifications were assayed by gel electrophoresis in 2% Pronadisa D-1 Agarose (Lab. Conda, Torrejón de Ardoz, Spain). Amplified DNA fragments were purified from the agarose gel using the Wizard SV Gel and PCR Clean-Up System (Promega Corporation, Madison, WI, USA) and sent to Macrogen Korea (Seoul, Korea) for sequencing. The primers used for sequencing were the same as those used for PCR amplification.
Additional searches for the six Xylodon species in EMBL/GenBank/DDBJ databases were performed in order to complete the molecular information available for this genus.
Raw sequence data were processed and assembled with Geneious version 9.0.2 (Kearse et al. 2012). Two individual datasets, ITS-LSU concatenated and rpb2, were created to compare the KUC8140 strain with other Xylodon species. For each dataset, the novel, GenBank and KUC8140 sequences were aligned in Geneious 9.0.2 with the MAFFT nucleotide sequence alignment function (Katoh and Standley 2013). The automatic alignments were then reviewed manually in Geneious 9.0.2.
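The concatenation step for the ITS-LSU dataset can be sketched in a few lines of Python. This is only an illustration of joining two per-locus alignments by specimen, not the Geneious workflow used here; the function names, the gap-padding rule for a specimen missing one locus, and the toy sequences are all assumptions made for the example.

```python
def _aligned_length(alignment):
    """Return the shared column count of an alignment, or raise if rows differ."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("sequences in an alignment must share one length")
    return lengths.pop()


def concatenate_alignments(its, lsu, gap="-"):
    """Join two aligned loci per specimen (hypothetical helper).

    A specimen missing from one locus is padded with gap characters for
    that locus, so every row of the result has the same length.
    """
    its_len = _aligned_length(its)
    lsu_len = _aligned_length(lsu)
    combined = {}
    for name in sorted(set(its) | set(lsu)):
        combined[name] = its.get(name, gap * its_len) + lsu.get(name, gap * lsu_len)
    return combined


# Toy data only: these are not sequences from the study.
toy_its = {"specimen_A": "ACGT-A", "specimen_B": "ACGTTA"}
toy_lsu = {"specimen_A": "GGC", "specimen_C": "GGT"}
toy_matrix = concatenate_alignments(toy_its, toy_lsu)
```

Padding with gaps keeps specimens that have only one locus in the matrix; whether such specimens should instead be dropped is a design choice the sketch leaves open.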
Phylogenetic tree estimation for each alignment was performed using Maximum Likelihood (ML) and Bayesian Inference (BI). ML and bootstrapping analyses were conducted in RAxML (Stamatakis 2006), using default parameters established in the CIPRES web portal (http://www.phylo.org/portal2/; Miller et al. 2010) and calculating bootstrap statistics from 1000 replicates. Bayesian inference analyses were implemented in BEAST v2.4.3 (Drummond and Rambaut 2007). Site model partition was selected using jModelTest2 (Darriba et al. 2012) and defined using the BEAUti v2.4.3 interface. HKY and GTR substitution models were selected for the ITS+LSU and rpb2 alignments, respectively, as the closest available in BEAST to the results obtained in jModelTest2. We used relative timing with an uncorrelated lognormal relaxed clock, calibrating the tree with a value of 1 at the root of the Xylodon clade. A Birth-Death model was used as the tree prior. One MCMC run was specified for 50 million generations, sampling every 5000th generation. Results were visualised in Tracer v.1.6 (Rambaut et al. 2018) to evaluate whether the effective sample size (ESS) values were above 200. The trees obtained were summarised in a maximum clade credibility tree by TreeAnnotator v.1.7 with a burn-in of 5000.
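The sampling arithmetic of the MCMC run can be checked with a short Python sketch. The chain length and sampling interval come from the text; the unit of the burn-in is not stated, so the example assumes, purely for illustration, that the burn-in of 5000 counts sampled trees. The function name is hypothetical, and any initial state the sampler may also log is ignored.

```python
def retained_samples(chain_length, sample_every, burnin_samples):
    """Return (total sampled states, states kept after burn-in).

    Assumption: burn-in is expressed as a number of sampled states
    (trees), not as a number of MCMC generations.
    """
    if chain_length <= 0 or sample_every <= 0:
        raise ValueError("chain_length and sample_every must be positive")
    total = chain_length // sample_every
    if not 0 <= burnin_samples <= total:
        raise ValueError("burn-in must lie between 0 and the number of samples")
    return total, total - burnin_samples


# Values from the text: 50 million generations, sampled every 5000th.
total, kept = retained_samples(50_000_000, 5_000, 5_000)
print(total, kept)  # prints: 10000 5000
```

Under the alternative reading, in which the burn-in counts generations, 5000 generations would correspond to a single sampled state, so the two interpretations differ substantially.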

Results and discussion
The ITS+LSU dataset was 1193 characters long (ITS = 594; LSU = 599) and the rpb2 dataset was 647 characters long. The results of the phylogenetic analyses of the ITS+LSU and rpb2 datasets are summarised in Fig. 1, which was drawn with the phytools R package (Revell 2012). Each phylogram represents the best tree produced by the RAxML analysis. Effective sample sizes from the BEAST analyses were higher than 200 for all parameters. Clades with maximum likelihood bootstrap (MLB) percentages ≥ 75% and Bayesian posterior probabilities (BPP) ≥ 0.99 are marked with empty circles in Fig. 1. The remaining support values are shown above the branches (MLB/BPP); specimen vouchers and species names are provided on the tip labels.
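The rule for displaying node support can be expressed as a small Python sketch. The two thresholds are those stated above; the function name, the "circle" marker string and the number formatting of the fallback label are assumptions for illustration, and the figure itself was drawn in R rather than with this code.

```python
MLB_MIN = 75.0   # maximum likelihood bootstrap threshold (percent)
BPP_MIN = 0.99   # Bayesian posterior probability threshold


def support_label(mlb, bpp):
    """Return "circle" for well-supported nodes, else an "MLB/BPP" label.

    A node gets the circle only when BOTH thresholds are met.
    """
    if mlb >= MLB_MIN and bpp >= BPP_MIN:
        return "circle"
    return f"{mlb:.0f}/{bpp:.2f}"


# Hypothetical support values, not taken from Fig. 1.
print(support_label(100, 1.0))   # prints: circle
print(support_label(80, 0.95))   # prints: 80/0.95
```

Requiring both criteria means a node with high bootstrap support but a posterior probability below 0.99 is still labelled numerically.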
Our phylogenetic analyses confirmed the misidentification of S. paradoxa strain KUC8140, since sequences of this strain grouped in the X. ovisporus clade, showing a different evolutionary history from X. paradoxus. Therefore, S. paradoxa strain KUC8140, from Korea, must be identified as Xylodon ovisporus, reported from Asia and West Pacific areas (Wu 2000, Hattori 2003). The new identity of the strain KUC8140 is also supported by geographical data, since S. paradoxa has a European distribution. This rectification helps to explain the biogeographical patterns of Xylodon and also supports the idea that "not everything is everywhere" for wood-decay fungi (Lumbsch et al. 2008).
According to our phylogenetic analyses, Xylodon ovisporus is the sister species of X. flaviporus, and morphological characters confirm this relationship. The two species can be discriminated by spore size, which is shorter in the former (Hattori 2003). This example accords with studies that warn about misidentifications or mislabelled vouchers in public sequence databases (Bidartondo 2008). It has been estimated that around 20% of fungal DNA sequences in the GenBank repository may have erroneous lineage assignments (Bridge et al. 2003, Nilsson et al. 2006). Assessing accuracy in GenBank and other DNA repositories is a key stage for species identification in current biodiversity analyses based on similarity of DNA sequences (Hibbett et al. 2016). It is especially important in cases like Xylodon paradoxus, with useful biotechnological properties, since, according to Bortolus (2008), a wrong taxonomy could lead not only to inaccurate knowledge of nature, but also to important economic losses.