Wheat (Triticum spp., particularly T. aestivum L.) is an essential cereal facing increasing demand for human and animal nutrition. Achieving the necessary increases in productivity will therefore require enhancing wheat yield and genetic gain by using modern breeding technologies alongside proven methods. These modern technologies allow breeders to develop improved wheat cultivars more quickly and efficiently. This review highlights the emerging technological trends used worldwide in wheat breeding, with a focus on enhancing wheat yield. The key technologies for introducing variation (hybridization among species, synthetic wheat, and genetically modified wheat, both transgenic and gene-edited), inbreeding (doubled haploid (DH) and speed breeding (SB)), selection and evaluation (marker-assisted selection (MAS), genomic selection (GS), and machine learning (ML)), and hybrid wheat are discussed to highlight the current opportunities in wheat breeding and for the development of future wheat cultivars.
Wheat (Triticum spp.) is a vital cereal grain with a global production of over 770 million tons [1]. Wheat plays a crucial role in global food security, with cultivation on more than 217 million hectares annually [2]. It was the first domesticated crop to become a staple food globally [3]. Presently, wheat contributes 20% of the dietary calories and protein consumed by humans [4]. Wheat is also an important source of dietary fiber [5, 6, 7]. Globally, hunger affects more than 720 million people, and the lack of a healthy diet affects more than three billion people [8]. Therefore, wheat production, accessibility, and availability are crucial for food security [9]. Demand for wheat is expected to increase by 60% by 2050, as the global population grows to a predicted 9.4 billion people who will also be wealthier and consume more [10, 11].
However, wheat production is influenced by both biotic and abiotic constraints [12]. Biotic stresses (insects, nematodes, and fungal, viral, and bacterial diseases) and abiotic stresses (heat, drought, cold, and salinity) are potential risks to global wheat production [13]. Climate change, including global warming, threatens food security and is predicted to lower wheat productivity and nutritional value [14, 15]. To cope with these biotic and abiotic challenges to food security and with the predicted increase in global demand, the global wheat yield needs to increase [16] by 1.4 to 1.7% annually [17]. The environment can trigger phenotypic changes through biotic and abiotic stresses (including climate change), which can significantly impact crop yield.
The importance of the environment on production and how wheat cultivars grow in the environment is best summarized by the following equation:
$$P = E + G + G \times E$$
where P is the phenotype, E is the environment, G is the genotype, and G × E is the genotype-by-environment interaction.
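As an illustration of how a phenotype can be partitioned into environment, genotype, and genotype-by-environment (G × E) terms, the following minimal sketch decomposes a small table of plot means. The line names, environments, and yield values are hypothetical, and the effects are expressed as deviations from the overall mean, which is one common convention rather than necessarily the one intended by the equation above.

```python
# Hypothetical mean yields (t/ha) for three lines grown in two environments.
yields = {
    "Line A": {"Dry": 4.0, "Irrigated": 8.0},
    "Line B": {"Dry": 5.0, "Irrigated": 5.0},
    "Line C": {"Dry": 3.0, "Irrigated": 5.0},
}


def decompose(table):
    """Split each cell into overall mean + G + E + G x E (effects as deviations)."""
    genotypes = list(table)
    environments = list(next(iter(table.values())))
    n_g, n_e = len(genotypes), len(environments)

    grand_mean = sum(table[g][e] for g in genotypes for e in environments) / (n_g * n_e)
    # Genotype effect: line mean across environments minus the overall mean.
    g_effect = {g: sum(table[g].values()) / n_e - grand_mean for g in genotypes}
    # Environment effect: environment mean across lines minus the overall mean.
    e_effect = {e: sum(table[g][e] for g in genotypes) / n_g - grand_mean
                for e in environments}
    # Interaction: what is left after removing the mean and both main effects.
    gxe = {(g, e): table[g][e] - grand_mean - g_effect[g] - e_effect[e]
           for g in genotypes for e in environments}
    return grand_mean, g_effect, e_effect, gxe


grand_mean, g_effect, e_effect, gxe = decompose(yields)
for (line, env), value in gxe.items():
    print(f"{line} in {env}: G x E = {value:+.1f}")
```

In this made-up table, Line A has the highest average yield but a negative interaction in the dry environment, where Line B outyields it, which is the kind of rank change that makes G × E important to breeders.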
The environment includes largely uncontrolled, weather-related aspects (temperature, moisture, and solar radiation). Other, controlled aspects are often called management (M) [18] and include grower decisions such as tillage, fertilizer, irrigation, and pesticide use or non-use. Agronomic research develops improved management techniques to optimize the phenotype. Hence, the previous equation can be expanded to:
where E, G, and G
To increase wheat yield, different breeding strategies have been used to
evaluate G, G
In a pure-line breeding program that develops inbred cultivars, the breeder
identifies and chooses the traits that need to be improved. Then, the breeder
needs to (1) introduce variation for those traits; (2) allow the variation to
segregate through inbreeding; (3) select elite lines (improved for the target
traits) for extensive evaluation and potential release (Fig. 1) [20]. The time
from when a cross is made to when a new wheat cultivar is released is between 6
and 13 years but often depends on the growth habit of the specific wheat (spring
wheat or winter wheat, the latter requiring vernalization for flowering and
producing grain). This review will consider the selection during segregating
generations separate from the evaluation, which uses highly inbred/homozygous
lines. At a segregating locus (Aa, where A and a are alleles), in the F
The phases and time for each phase of plant breeding to create new wheat cultivars. The values under years refer to the number of years for each phase. For example, in introducing genetic variation, a cross would be made in the first year and additional crosses (three-way or double crosses) in the second year. The number of years in this phase ranges from one to two to accommodate the different types of crosses that start a breeding cycle. The total number of years from the cross to the cultivar release ranges from 6 to 13 years. In this example, selection is performed concurrently with the inbreeding phase.
To evaluate and compare the predicted outcomes of different breeding systems, plant breeders use “the breeder’s equation” [21] to estimate genetic gain, which in various breeding systems is the response to selection. There are many forms of genetic gain equations depending upon the breeding strategy, but a common one for phenotypic selection is:
R = kh
where R is the genetic gain per cycle, k is the selection intensity (related to
the proportion of the population selected for advancement), h
R
where R
h
where
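To make the response-to-selection idea concrete, the sketch below assumes the widely used phenotypic-selection form of the breeder's equation, R = k · h² · σp (selection intensity × narrow-sense heritability × phenotypic standard deviation), and divides by cycle length to obtain gain per year. The numerical inputs are hypothetical, and the selection intensity is derived for truncation selection on a normally distributed trait, which is an assumption of this example.

```python
from statistics import NormalDist


def selection_intensity(proportion_selected):
    """Selection intensity k for truncation selection on a normal trait."""
    std_normal = NormalDist()
    # Truncation point above which the selected fraction lies.
    truncation_point = std_normal.inv_cdf(1.0 - proportion_selected)
    # Mean of the selected tail, in phenotypic standard deviation units.
    return std_normal.pdf(truncation_point) / proportion_selected


def gain_per_cycle(proportion_selected, heritability, phenotypic_sd):
    """Assumed form of the breeder's equation: R = k * h^2 * sigma_p."""
    return selection_intensity(proportion_selected) * heritability * phenotypic_sd


def gain_per_year(gain, years_per_cycle):
    """Genetic gain per unit time for a cycle of the given length."""
    return gain / years_per_cycle


# Hypothetical trait: h^2 = 0.4, phenotypic SD = 0.5 t/ha, top 10% selected.
r_cycle = gain_per_cycle(0.10, 0.4, 0.5)
print(f"k = {selection_intensity(0.10):.3f}, R per cycle = {r_cycle:.3f} t/ha")
for years in (13, 6):
    print(f"{years}-year cycle: {gain_per_year(r_cycle, years):.3f} t/ha per year")
```

With the same selection intensity and heritability, shortening the cycle from 13 to 6 years roughly doubles the gain per year, which is why the inbreeding technologies discussed later target cycle time.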
Genetic variation in wheat can be introduced through conventional and modern approaches. The traditional breeding approach introduces improved traits from other elite lines, and also from wild relatives and landraces when the elite line parents do not have genetic variation for important or desired traits. Conversely, modern approaches for increasing genetic variation include mutation breeding, transgenic wheat (often used synonymously with genetically modified wheat), and gene-edited wheat.
Genetic variation in wheat can be achieved by crossing two or more distinct
lines or parents. Introducing genetic variation is considered critical to the
overall success of the breeding effort [23, 24]. Except for random mating
populations involving genetic or cytoplasmic male sterility, all crosses are
biparental (Line 1
In considering the origin of tetraploid wheat (T. durum L., 2n = 4x = 28, genomes AABB) and hexaploid wheat (T. aestivum L., 2n = 6x = 42, genomes AABBDD), the major bottleneck is often considered to be the late addition of the diploid D-genome (2n = 2x = 14) to the tetraploid progenitor. The tetraploid progenitor is estimated to have arisen between 0.3 and 0.8 million years ago [27, 28, 29, 30]. However, the addition of the D-genome to create hexaploid wheat occurred only 8500–9000 years ago [29, 30]. Compared to tetraploid wheat, hexaploid wheat has therefore had less evolutionary time for crosses to progenitor species, intermating within the new species, and the accumulation of mutations. To overcome this bottleneck and expand the diversity within hexaploid wheat, synthetic wheat has been developed by artificially hybridizing tetraploid wheat species (AABB) with the diploid species Aegilops tauschii (DD) to recreate hexaploid wheat [31]. This technique enables the breeder to produce fertile synthetic hexaploids, which can also be crossed with hexaploid wheat. Synthetic wheat is used as a bridge to transfer traits from either progenitor parent into hexaploid wheat [31, 32] while greatly expanding the genetic diversity of wheat [33]. There are many examples of synthetic wheat being used to improve hexaploid wheat for traits such as salt and drought tolerance [34, 35, 36, 37] and resistance to diseases and pests [38]. Since synthetic wheat expands the diversity of hexaploid wheat, different mating (syn. crossing) strategies have been proposed [39].
Hence, wild relatives are a good source for increasing genetic variation by reintroducing genes lost during domestication [40]. Breeders widely use wild relatives to create new genetic combinations for disease resistance (as the pathogen or pest and the wild relative host plant have co-evolved over millennia) and for abiotic stress resistance (as the wild relative has evolved in harsh environments). Many of the introduced genes are dominant and relatively easy to select [41, 42]. These traits make landraces and wild relatives a valuable resource for increasing biodiversity and sustainability that must be protected [43].
Nevertheless, breeders have often been reluctant to use wild relatives due to the added complexity of their ploidy level and the linkage drag of unfavorable genes linked with favorable genes [44]. Wild species have not been selected for agronomic potential, so it can be difficult to identify useful genetic variation, maintain genetic gain, and recover viable cultivars even when deleterious genes are not linked to the gene of interest. When diploid and tetraploid relatives of wheat are used as a source of genetic variation for the improvement of bread wheat, there are often wide hybridization barriers [45, 46] as well as sterility and recombination issues that require many backcrosses. In backcrossing, the progeny of a cross between a donor parent and a recurrent parent are crossed repeatedly to the recurrent parent, with the objective of introgressing one or a few genes into the recurrent parent.
The genetic gain per cycle (R = kh
If elite germplasm lacks genetic variation for a key trait, the breeder may generate new genetic variation using mutation breeding [48, 49] or transgenic approaches. Examples of successful mutations used in breeding include herbicide resistance traits to manage weeds and nutritional value improvements.
Transgenic or genetically modified (GM) crops are produced by inserting one or more genes from other species into their DNA. Wheat can be genetically transformed using Agrobacterium tumefaciens [50, 51, 52, 53]. For example, HB4 (IND-ØØ412-7) is a drought-tolerant GM wheat that was recently approved for cultivation in Brazil and Argentina [54, 55]. Transgenic wheat with the ZmDof1 transcription factor, derived from maize (Zea mays L.), has enhanced nitrogen use efficiency and yield [56]. The CspA and CspB genes, which encode cold shock proteins derived from bacteria, also enhanced drought tolerance in transgenic maize and wheat [57, 58].
Transgenic wheat is an effective way to breed wheat cultivars for resistance and tolerance to biotic and abiotic stresses by adding new genes, silencing host or pathogen genes, and pyramiding/stacking genes and resistance/tolerance approaches [59]. These efforts have the potential to transform agriculture by incorporating novel genes or developing synthetic genetic variation [60]. For example, the gene Fhb7 was recently cloned [61] and shown to be the resistance gene for Fusarium head blight (incited by Fusarium graminearum Schwabe) from Thinopyrum elongatum (Host) D.R. Dewey. From a breeding perspective, the development of Fhb7 bread wheat benefits from (1) the gene being definitively known; (2) the development of perfect markers from the gene sequence for tracking in breeding populations; and (3) the gene being separated from deleterious linked genes in Th. elongatum, which results in less linkage drag, as described above.
In addition to utilizing cloned genes for transgenic use in wheat breeding, gene
stacking is an important application of transgenic crop improvement through the
biosynthetic gene clustering of favorable traits [62]. The objective is to
develop linked resistance genes that segregate as one linked group, thereby
simplifying their breeding and tracking. Usually, when stacking multiple genes,
the breeder must consider the smallest complete population to include every
possible genotype, which is 4
| Population criterion | Number of segregating loci | Generation: F₂ | Generation: F₆ | Generation: DH |
| --- | --- | --- | --- | --- |
| Smallest complete population* | 1 | 4 | 3 | 2 |
| | 6 | 4,096 | 78 | 64 |
| | 21 | 4,398,046,511,104 | 4,084,924 | 2,097,152 |
| Smallest population with one plant containing all of the needed genes** | 1 | 2 | 2 | 2 |
| | 6 | 6 | 54 | 64 |
| | 21 | 421 | 1,098,973 | 2,097,152 |
The generations were chosen to represent the selection occurring in the early
generations (F
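The population sizes in the table can be reproduced with a short calculation. The sketch below is a reconstruction under stated assumptions, not the authors' own procedure: it assumes the three generation columns correspond to F₂, F₆, and DH populations, that a "complete" population is sized so that each fully homozygous genotype is expected once, that a plant "containing" a needed gene carries at least one copy of the desired allele, and that loci segregate independently. The dictionary and function names are hypothetical.

```python
from fractions import Fraction
from math import ceil

# Per-locus frequency of one specific homozygote (e.g., AA) after selfing an Aa F1.
# F2: 1/4; F6: (1 - 1/32) / 2 = 31/64; DH: 1/2.
SPECIFIC_HOMOZYGOTE = {
    "F2": Fraction(1, 4),
    "F6": Fraction(31, 64),
    "DH": Fraction(1, 2),
}

# Per-locus frequency of plants with at least one copy of the desired allele.
# F2: AA + Aa = 3/4; F6: 31/64 + 1/32 = 33/64; DH: 1/2.
CARRIES_ALLELE = {
    "F2": Fraction(3, 4),
    "F6": Fraction(33, 64),
    "DH": Fraction(1, 2),
}


def smallest_population(per_locus_frequency, n_loci):
    """Smallest population in which the target class is expected at least once.

    Assumes independent loci, so the target frequency is the per-locus
    frequency raised to the number of loci; exact fractions avoid rounding error.
    """
    return ceil((1 / per_locus_frequency) ** n_loci)


for label, freqs in (("complete population", SPECIFIC_HOMOZYGOTE),
                     ("one plant with all genes", CARRIES_ALLELE)):
    for n_loci in (1, 6, 21):
        sizes = {gen: smallest_population(f, n_loci) for gen, f in freqs.items()}
        print(label, n_loci, sizes)
```

These are expected-value population sizes; growing exactly this many plants does not guarantee recovering the target genotype, so breeders would typically plan for larger populations.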
The major difficulty with transgenic wheat is in its commercialization and market acceptance [64]. Public acceptance of new GM wheat remains very slow, even in countries facing food crises [65]. In addition, for major exporting countries, negative campaigns and differing views of GM crops among importing countries continue to be major constraints for transgenic wheat [66].
The clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein (Cas) system is a very powerful gene-editing tool [67, 68, 69] that is used for crop improvement by adding, deleting, and silencing genes [70]. The CRISPR/Cas system employs the Cas enzyme to specifically cut the double-stranded DNA in vitro, creating double-stranded breaks in the DNA for gene-editing [71]. The gene of interest can be added at any specific double-stranded DNA breakpoint by homology-directed repair (HDR), a mechanism that repairs and ligates the DNA strands. However, if the deletion of a specific gene is desired, then the Cas enzyme can target a specific DNA sequence and cut the DNA precisely where the deletion is required. Thereafter, the remaining DNA is repaired or ligated by non-homologous end joining (NHEJ). In addition, gene families can be targeted for editing [72], which is important for complex traits such as the seed storage proteins found in higher plants, including wheat, which affect end-use quality and human allergies (see below).
Gene-editing is an increasingly important tool for enhancing genetic resistance in wheat against stresses by knockdown mutations of susceptibility genes [73]. CRISPR/Cas technology has been used successfully to modify the wheat genome by deletion [74, 75]. For example, CRISPR/Cas was used to develop the non-transgenic reduced-allergen or allergen-free wheat E82 through deletions in the wheat genome [76]. One of the major differences between transgenic and gene-edited wheat is that they currently fall under different regulatory structures. In most countries, transgenic wheat is more highly regulated, making it more expensive to bring to market than gene-edited wheat. The lower level of regulation arises because gene-editing often modifies the existing wheat genome instead of adding new genes to wheat from different genera or species.
The genetic gain per cycle of GM wheat can be higher than that of conventional crossing if the transgene is inserted into a cultivar that will be released (potentially reducing the time required compared to traditional backcrossing), assuming government regulations do not delay it. Moreover, the breeding value of a GM wheat line should be the line value plus the added trait value, which is more easily predicted than the value of lines from a segregating population during inbreeding.
Breeding programs rely on inbreeding to develop homozygous lines for
cultivar release or for use as hybrid parents. In self-pollinated crops, inbreeding
(through selfing) is the normal form of generation advancement. Genetic gain
(R
A DH plant is formed by doubling the chromosomes of a haploid cell or plant. It
is an important technique for accelerating the wheat breeding cycle because 100%
homozygous plants can be developed from an F
In plants, haploid cells are from gametophytic tissues (derived from the
megaspore or microspore). However, if the megaspore is used, the haploid plants
generally come from a haploid embryo. After fertilization, one set of chromosomes
is eliminated, meaning only one set remains, which is followed by embryo rescue
[83]. The most common practice today is the pollination of a wheat ovule using
maize pollen [84, 85, 86]. Shortly after fertilization, the maize chromosomes are
eliminated, meaning only the wheat chromosomes remain. The technique of
wide-crossing (Hordeum vulgare
If the microspore is used, the haploids are developed from anther or isolated microspore cultures [92, 93]. Anther culture is another practical and cost-effective method of producing DHs [92]. In this method, immature anthers (usually containing microspores at the late uninucleate stage) are cultured [93, 94]. Anther cultures have been successfully used for decades for haploid production [95, 96]. These haploids are then developed into homozygous DHs by applying colchicine [97, 98, 99]. Many wheat varieties have been developed using this DH technique, such as “Florin” [100].
Currently, the wheat x maize DH production is more widely used because it has
less genotype specificity (some F
The importance of DHs lies in how quickly all the genetic variation is partitioned
among the lines (no genetic variation remains within a line); moreover, the lines
should be true breeding unless the DH protocols induce heterozygous genetic
variation. Furthermore, since all the genes are homozygous, there will be more
efficient selection for traits controlled by recessive genes that would be masked
in a heterozygous plant. One challenge with this approach is that if DHs are made
from F
The phases and time for each phase of plant breeding using doubled haploid (DH) lines to create new wheat cultivars. The values under years refer to the number of years for each phase. For example, in introducing genetic variation, a cross would be made in the first year and additional crosses (three-way or double crosses) in the second year. The number of years in this phase ranges from one to two to accommodate the different types of crosses that start a breeding cycle. The total number of years from cross to cultivar release ranges from 7 to 10.
The single seed descent breeding protocol achieves an outcome similar to doubled haploidy. Like DH breeding, this approach was developed to achieve homozygosity rapidly before evaluation, often without early-generation selection. Because the goal was rapid generation advancement, the inbreeding generations could be grown off-season and in environments unrepresentative for selection. Many single seed descent breeding programs advanced two to three generations per year. SB is a technique that further reduces the time required in the breeding cycle by advancing generations more quickly through the careful manipulation of light and temperature [102, 103]. In SB, seeds are sown in a controlled environment under LED lighting, with a 22-hour photoperiod and 2 hours of darkness, temperatures of 22 °C during the photoperiod and 17 °C during the dark period, and 60–70% humidity. LEDs providing red, blue, and far-red light are suitable for the SB environment. Under SB, spring wheat reaches the 2–3 leaf stage 10 days after sowing, the flowering stage after 4–6 weeks, and the early maturity stage within 8 weeks (often 52 days) of sowing; the heads are then placed in a forced-air dehydrator for 3 days at 35 °C [104]. The harvested seeds are kept at 4 °C for 3 days to break seed dormancy before sowing [105]. Hence, SB shortens the breeding cycle and is a useful tool for accelerating genetic gains in wheat [106, 107, 108]. Moreover, it enables breeders to advance 4–6 generations of spring wheat in a year by extending the photoperiod under a controlled environment [105].
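As a rough check on the 4–6 generations per year figure, the arithmetic can be sketched from the durations quoted above. The function name is hypothetical, and the calculation assumes the three steps run back to back with no handling time between them, so it gives an upper bound rather than an operational schedule.

```python
def generations_per_year(days_sowing_to_harvest, drying_days=3, cold_days=3):
    """Upper bound on seed-to-seed generations per year under speed breeding.

    Adds head drying and the cold treatment that breaks dormancy to the
    growth period, assuming no gaps between the steps.
    """
    cycle_days = days_sowing_to_harvest + drying_days + cold_days
    return 365 / cycle_days


# 52 days is the "often" figure quoted above; 56 days is the 8-week limit.
for growth_days in (52, 56):
    print(growth_days, round(generations_per_year(growth_days), 1))
```

The two cases give about 6.3 and 5.9 generations per year, respectively, so the upper end of the quoted 4–6 range is plausible once practical delays are allowed for.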
SB also supports various selection methods: single seed descent, single pod
descent, single plant selection, MAS, and clonal selection [109, 110].
SB and DHs are both tools for inbreeding and rapid approaches to homozygous
generation advancement. However, the methods differ in two important ways. The
first difference is the amount of recombination in each process. DHs derived from
F
The selection and evaluation phases are both essential yet different components
of the breeding program. Selection is best performed using tools or environments
that separate lines for the traits of interest. Since there is not a single
environment that represents all the current and future environments where a line
may be grown (often referred to as a target population of environments) [113],
selection often occurs in multiple environments, and G
In selection trials, the experimental design for early-generation material [120, 121, 122] is essential and often uses augmented, partially replicated, or alpha-lattice designs. Spatial variation in field trials must be removed or accounted for to estimate the value of a line accurately [123, 124]. Randomized complete block designs require extensive replication, so they are rarely used in early-generation selection trials, where a large number of lines must be evaluated with limited seed supplies. Given the likelihood of spatial variation within a block, ready access to improved statistical software and greater computational capabilities now allows selection trials to account for and remove spatial variation [120, 121, 122]. Many of these same experimental designs and analysis perspectives are important for evaluation trials (discussed below), but in advanced evaluation trials the seed is generally not limiting and the number of lines is smaller, so replication is often greater.
Selection processes often occur simultaneously with the inbreeding process and
are an important part of genetic gain (R = kh
There are two types of selection: natural and artificial selection. In natural selection, “nature” selects the fittest and most well-adapted plants, while artificial selection is performed by the breeder. Additionally, artificial selection can be further separated into direct and indirect selection. In direct selection, breeders select based on the trait of interest, e.g., screening or selecting material in a field or laboratory for drought stress, salt stress, or disease tolerance. In indirect selection, breeders select based either on a related trait, which is genetically correlated to the trait of interest, or on the region of the genome associated with the trait of interest. Molecular selection (see the following section on marker-assisted selection, MAS) is an example of indirect selection, whereby the selection is based on highly heritable molecular markers in the genomic region that is associated (correlated) with the trait of interest [125]. Selection and evaluation also benefit from high-throughput phenotyping and genotyping. Breeders use various techniques for selection and evaluation, e.g., phenotypic selection [20], MAS, high-throughput genomic and phenomic selection, and ML. ML is increasingly being used for phenomics and many other aspects of plant breeding; it is discussed in a general sense below, and some applications are included in the sections on specific plant breeding aspects.
Molecular markers were initially protein-based, but with advances in molecular tools, modern molecular markers are primarily small DNA fragments. Markers can detect polymorphisms between different genotypes or alleles [126]. These markers are associated with phenotypic traits via linkage disequilibrium to reveal the DNA sequence variation that causes trait variation [127]. These marker-trait associations (MTAs) are widely used for identifying the genomic regions responsible for the expression of the trait of interest in crops [128]. Marker-assisted selection uses markers and identified MTAs to select the desirable alleles for each important trait. It is an efficient way of selecting genotypes and improving essential traits owing to the high heritability of the markers, whereas the trait itself may be lower in heritability and more costly to phenotype [129]. In the general sense, MAS is a form of indirect selection. MAS has progressed from the first and second generations of markers, viz., restriction fragment length polymorphisms (RFLPs), random amplified polymorphic DNA (RAPDs), simple sequence repeats (SSRs), and amplified fragment length polymorphisms (AFLPs), to the third and fourth generations of markers, viz., single nucleotide polymorphisms (SNPs), Kompetitive allele-specific PCR (KASP), diversity array technology (DArT) assays, and genotyping by sequencing (GBS) [130, 131], or skim sequencing [101]. SNPs have the highest throughput and are the most commonly used markers in plant breeding [132].
Marker-assisted selection is used for screening genotypes for qualitative and quantitative traits. However, it is generally used for selecting one or a few loci that control a similar number of traits. For example, markers have been widely used to screen wheat genotypes for Fusarium head blight (caused by F. graminearum) [133, 134], stripe rust (caused by Puccinia striiformis Westendorp f. sp. tritici) [135, 136, 137], drought tolerance [138, 139], heat stress [140, 141], and salt tolerance [142, 143].
Marker-assisted selection is beneficial in backcross breeding [144], where the objective is to introgress one or a few genes into a recurrent parent. First, MAS can be performed at the seedling stage, while the trait is often expressed later in the life cycle of the plant (in the adult plant or after flowering). Second, the trait may be controlled by recessive alleles and therefore be hard to select for in the backcross generation, whereas co-dominant markers allow carriers to be selected easily. Third, molecular markers can identify the genetic distance between the donor and recurrent parent lines, which may reduce the number of backcrosses needed to recover the recurrent parent genotype [145]. Finally, molecular markers can be used to identify the backcross progeny that have recovered the most recurrent parent background, which further reduces the number of backcrosses needed [145, 146, 147].
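The benefit of recovering the recurrent parent background faster can be seen from the expected genome recovery without marker selection. The sketch below uses the standard expectation that, on average, each backcross halves the remaining donor genome, so the recurrent parent fraction after n backcrosses is 1 − (1/2)^(n+1); the function names are hypothetical, and linkage drag around the introgressed gene is ignored.

```python
def recurrent_parent_fraction(n_backcrosses):
    """Expected recurrent parent genome fraction in BC_n without selection."""
    # The F1 is 50% recurrent parent; each backcross halves the donor share.
    return 1.0 - 0.5 ** (n_backcrosses + 1)


def backcrosses_needed(target_fraction):
    """Smallest number of backcrosses whose expected recovery meets the target."""
    n = 0
    while recurrent_parent_fraction(n) < target_fraction:
        n += 1
    return n


for n in range(1, 7):
    print(f"BC{n}: {recurrent_parent_fraction(n):.2%}")
print("Backcrosses for 99% expected recovery:", backcrosses_needed(0.99))
```

These are population averages; individual backcross progeny vary around them, and background markers let the breeder pick the individuals above the average, which is how MAS can reduce the number of backcrosses.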
Recent advances in DNA sequencing technology provide a major advantage in that thousands of markers can be developed for individual genotypes. Hence, many genomic regions can be identified by genome-wide association studies (GWAS) and tracked using MAS from one DNA extraction and one sequencing assay [148]. These advances greatly reduce the cost of each marker and increase the ability to screen for multiple traits and a recurrent parent background simultaneously. Since the DNA of a line does not change, new markers developed and mapped by dense genotyping can be added to the historical records of existing lines, thereby increasing their applications and related knowledge.
GS is used to improve selection for complex traits affected by many loci. This breeding approach selects the best individuals based on predicted breeding values for each line, known as genomic estimated breeding values (GEBVs) [151], which are developed from a combination of genomic and phenotypic data [149, 150]. The GEBVs are calculated as the sum of the individual marker effects on the phenotypic response. GS uses both the genotypic and phenotypic data sets of a relevant, smaller population, known as the training population, to predict a larger population using only genotypic information. The training population must be relevant to the breeding population to provide accurate predictions. Accuracy is judged by the quality of the predictions, such as the correlation between the predicted and true breeding values. Multiple statistical models are used for making predictions in GS, viz., penalized regression, Bayesian methods, and the reproducing kernel Hilbert spaces method [152, 153, 154, 155]. The accuracy of these models depends on the phenotypic trait being predicted, the amount of available phenotypic and genotypic data, and the relationship between the training and breeding populations.
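The following minimal sketch illustrates the workflow just described using ridge regression, one form of penalized regression: marker effects are estimated in a training population, GEBVs for unphenotyped candidates are computed as the sum of marker effects, and accuracy is measured as the correlation with true breeding values. Everything here is simulated and hypothetical (marker number, effect sizes, penalty, population sizes), and real GS analyses use dedicated statistical software with far more markers.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def ridge_marker_effects(genotypes, phenotypes, penalty):
    """Ridge estimates: solve (X'X + penalty * I) beta = X'(y - mean(y))."""
    n_markers = len(genotypes[0])
    mean_y = sum(phenotypes) / len(phenotypes)
    xtx = [[sum(row[i] * row[j] for row in genotypes) + (penalty if i == j else 0.0)
            for j in range(n_markers)] for i in range(n_markers)]
    xty = [sum(row[i] * (y - mean_y) for row, y in zip(genotypes, phenotypes))
           for i in range(n_markers)]
    return solve_linear_system(xtx, xty)


def breeding_value(genotype, marker_effects):
    """Breeding value as the sum of marker effects carried by a line."""
    return sum(g * e for g, e in zip(genotype, marker_effects))


def correlation(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


rng = random.Random(1)
N_MARKERS = 20
true_effects = [rng.gauss(0, 1) for _ in range(N_MARKERS)]


def simulate(n_lines):
    """Simulate inbred lines (markers coded -1/+1) with noisy phenotypes."""
    genos = [[rng.choice((-1, 1)) for _ in range(N_MARKERS)] for _ in range(n_lines)]
    true_bv = [breeding_value(g, true_effects) for g in genos]
    phenos = [bv + rng.gauss(0, 2.0) for bv in true_bv]
    return genos, true_bv, phenos


# Smaller training population: genotyped and phenotyped.
train_genos, _, train_phenos = simulate(150)
# Larger candidate population: genotyped only.
cand_genos, cand_true_bv, _ = simulate(300)

effects = ridge_marker_effects(train_genos, train_phenos, penalty=5.0)
gebvs = [breeding_value(g, effects) for g in cand_genos]
accuracy = correlation(gebvs, cand_true_bv)
print(f"Prediction accuracy (correlation with true breeding value): {accuracy:.2f}")
```

Because the candidates here are simulated from the same marker effects as the training lines, the accuracy is optimistic; in practice it falls as the training and breeding populations become less related, as noted above.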
Because the improvement in selection is based on GEBVs derived from genotypic data, the approach is known as GS [156]. The key advantage of GS is that it is generally less costly to genotype lines than to phenotype them. For example, highly efficient DH breeding programs can create hundreds or thousands of DHs. Thus, it is more cost-effective to genotype all the DHs and phenotype only a subset (the training population) than to also evaluate the remaining lines (the test population) phenotypically. This process greatly reduces selection costs and facilitates the rapid selection of useful individuals, potentially shortening crop breeding cycles [157]. Another advantage of GS is the ability to select for an “average” year when the current selection or evaluation environments are not representative of typical environments (for example, because of extreme weather conditions such as early heat, drought, or hail) [158].
Compared with MAS, GS is more efficient and accurate for complex traits that are often multigenic, as it uses the data from each marker to predict the phenotype [159]. Theoretically, GS differs from linkage analysis (LA) and GWAS because these methods emphasize the detection of specific quantitative trait loci (QTLs) for specific traits. In contrast, GS emphasizes the use of each locus in the genome to predict the phenotype [160]. It accounts for correlations among single nucleotide polymorphisms (SNPs) rather than analyzing each SNP separately, as other methods do. In summary, genome-assisted breeding is a powerful tool for improving crops since it combines the data of GS, MAS, and markers for crop improvement [161]. Various wheat varieties, viz., ‘Ruth’, ‘Valiant’, and ‘Epoch’, were selected and released using GS at the University of Nebraska–Lincoln [162, 163, 164]. In wheat, GS is also used to improve various traits, viz., resistance to wheat blast [165, 166], spot blotch [167, 168], and Fusarium head blight [169, 170, 171], and yield [77, 172].
Alongside GS, repeated high-throughput phenotyping can also be used to predict traits [173]. In maize, the phenomic predictions were better than the genomic predictions for some traits. Hence, genomic predictions are likely to be augmented by phenomic predictions from high-throughput phenotyping.
In wheat breeding, some examples exist where the traditional cross, self and
inbreed, and selection do not work well, meaning recurrent selection [144] is
needed. Recurrent selection can be defined as repeated cycles of crossing and
selection to accumulate desirable alleles. An example of where recurrent
selection is needed is pyramiding multiple minor genes [174] or whenever the
number of segregating loci exceeds what can be handled conveniently in an F
Evaluation is the essential, final phase of any breeding program before cultivar release. In this step, the goal is to identify lines that succeed in one or more mega-environments, where the lines are recommended to be grown, and to determine where a line is not recommended to be grown. In the evaluation stage, the lines should all have elite performance levels and have most, if not all, of the required traits. The differences among lines are often small compared to the early-generation selection trials because only the consistently better lines have been advanced through phenotypic and genomic selection. A mega-environment is a group of environments where lines perform similarly. Mega-environments are usually identified by some form of cluster analysis to document that the environments are similar [179]. The value of knowing the mega-environments is that the breeder can select evaluation (syn. testing) sites in different mega-environments to obtain the most useful data to validate the value of a line [113, 179, 180]. Ideally, a breeder would require only one evaluation site within each mega-environment. However, in practice, the breeder needs one selection trial and multiple evaluation trials in each mega-environment. Multiple evaluation trials within a mega-environment are needed (1) to better represent the mega-environment given annual climatic variation; (2) to provide additional replication to separate line performance where every line is an elite line (better statistical power); and (3) as part of a risk avoidance strategy when an evaluation trial is lost due to catastrophic effects such as drought, heat, or hail. Finally, seed purchasers like having relevant data from nearby locations when a line is marketed. The number of evaluation sites within a mega-environment is determined by project resources and the importance of the mega-environment.
Similar to clustering environments, lines can be clustered by their response to the environment [119, 179, 181]. When breeding programs have limited resources, not every line must be tested in every location, as long as representative lines are tested in all locations. With the advent of extensive genetic/genomic data, the concept of replicating alleles rather than lines is an important addition to the evaluation/testing theory [119]. Traditionally, genotypes and traits were evaluated by studying the genotype-by-environment (G x E) interaction in multi-location yield trials [182, 183]. Understanding the G x E interaction is critical during the selection and evaluation of genotypes in multi-location yield trials [184]. Analyzing the G x E interaction enables the breeder to evaluate the genetic stability of various quantitative traits and select genotypes with higher yield potential under various growing conditions and ecological zones [185]. The desirable genotypes are selected based on their performance for the priority traits in their areas of adaptation. These traditional approaches, such as cluster analyses, were laborious and unable to handle big data or large populations with multi-trait evaluations from high throughput phenotyping [186]. Thus, the modern approaches of ML and deep learning (DL) have been applied to help breeders evaluate various aspects of crops, handle large populations, and determine the best statistical analyses [187, 188, 189, 190].
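The G x E interaction discussed above can be made concrete with a two-way table of line means. The sketch below assumes a complete lines-by-environments table and computes the interaction effect as the cell mean minus the line mean and the environment mean, plus the grand mean. It also sums the squared effects per line (Wricke's ecovalence), one common stability statistic that is not necessarily the one used in the cited studies. The line names, environments, and yields are hypothetical.

```python
from statistics import mean


def gxe_effects(table):
    """Interaction effect for each cell of a complete lines x environments
    table: y_ij - line_mean_i - env_mean_j + grand_mean."""
    lines = list(table)
    envs = list(table[lines[0]])
    grand = mean(table[l][e] for l in lines for e in envs)
    line_mean = {l: mean(table[l][e] for e in envs) for l in lines}
    env_mean = {e: mean(table[l][e] for l in lines) for e in envs}
    return {
        l: {e: table[l][e] - line_mean[l] - env_mean[e] + grand for e in envs}
        for l in lines
    }


def ecovalence(effects):
    """Sum of squared interaction effects per line; smaller values mean
    the line contributes less to the G x E interaction."""
    return {l: sum(v ** 2 for v in row.values()) for l, row in effects.items()}


# Hypothetical mean yields (t/ha) of two lines in two environments.
table = {
    "line_1": {"dry": 3.0, "wet": 6.0},
    "line_2": {"dry": 4.0, "wet": 5.0},
}
effects = gxe_effects(table)
stability = ecovalence(effects)
print(effects)
print(stability)
```

Here both lines have the same overall mean, but line_1 responds more strongly to the wet environment, which is exactly the kind of crossover pattern that makes multi-location testing necessary.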
ML is part of the broad category of cutting-edge technology called artificial intelligence, where computers can learn from experience [191]. ML allows the machine to learn without being explicitly programmed [192]. It can be used to optimize the integration of large phenomic datasets and make predictions about yield, disease, and the need for crop inputs. Conceptually, computers are given a set of training data in which they find patterns and from which they are expected to develop improved models (algorithms). DL is an advanced form of ML and a modern technique of extensive data analysis and image processing [193], which again emphasizes learning by example, often with images, as part of phenomics. DL in agriculture has often provided higher accuracy than traditional image processing tools [194]. For example, in Fig. 3a, a whole breeding field is imaged in a single photograph. By enlarging the image or by taking additional images (Fig. 3b and c), each plot can be objectively scored using algorithms. In this example, we were interested in the plot stand after the winter, which estimates winter survival. The plant stage is early tillering and represents early spring growth for additional plant development studies. The individual plot scores, analyses, and images can be stored as permanent records. The advantage is that there is no temporal variation in these measurements, which would normally occur when walking through a field and compiling notes. Additionally, with the correct protocol, an unmanned aerial vehicle (syn. drone) can acquire the images, and the data can be obtained and analyzed more efficiently than with traditional note taking, data transfer, and analysis.
Fig. 3. Uncrewed aerial vehicle (syn. drone) photographs of a wheat breeding nursery in Lincoln, NE, acquired on May 10, 2020, using an RGB camera. (a) A complete breeding nursery taken at once so that every plot is equally available for processing, evaluation, and analysis. (b) and (c) are enlarged images of specific parts of the field, which can be used for image analysis (in this case winter survival and early plot vigor). Photographs courtesy of Mason C. Lien and included with his permission.
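To illustrate what objectively scoring a plot from an RGB image could look like, the sketch below estimates plant stand as the fraction of pixels classified as green vegetation. It uses the excess-green value (2G - R - B) with a fixed threshold as a simple stand-in; the text does not specify the algorithm actually used for Fig. 3, and the pixel values, threshold, and plot names here are hypothetical.

```python
def green_cover(plot_pixels, threshold=20):
    """Fraction of pixels in a plot whose excess-green value
    (2*G - R - B) exceeds `threshold` (an illustrative cutoff)."""
    total = 0
    green = 0
    for row in plot_pixels:
        for r, g, b in row:
            total += 1
            if 2 * g - r - b > threshold:
                green += 1
    return green / total if total else 0.0


# Hypothetical RGB values (0-255) for bare soil and for a wheat leaf.
SOIL = (120, 100, 80)   # excess green = 2*100 - 120 - 80 = 0
PLANT = (60, 140, 50)   # excess green = 2*140 - 60 - 50 = 170

# Two tiny "plots" of 2 x 4 pixels cropped from a nursery image.
plot_good = [[PLANT, PLANT, SOIL, PLANT], [PLANT, PLANT, PLANT, SOIL]]
plot_poor = [[SOIL, SOIL, SOIL, PLANT], [SOIL, SOIL, SOIL, SOIL]]

scores = {
    "plot_101": green_cover(plot_good),
    "plot_102": green_cover(plot_poor),
}
print(scores)
```

Because every plot is scored from the same image with the same rule, the scores are free of the observer and time-of-day effects that accompany walking the field. A DL model would replace the fixed threshold with a classifier learned from labeled examples.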
ML and DL can be used to develop decision-support tools for many purposes in
plant breeding, including yield improvement [195]. For example, ML is helping
breeders record data, analyze data, and make predictions [196] using high
throughput phenotypic and genomic data that allows computing and bioinformatic
tools to identify associations and create inferences from patterns [197].
Therefore, ML and DL can save time and allow the use of high throughput
technology that is superior to or more efficient than the previous ways of
measuring the morphological, biochemical, and physiological traits of crop plants
[149, 186] while also helping make predictions [173]. Using ML and DL to perform
large-scale phenotyping of genotypes for multiple traits under diverse
environments enables the breeder to identify and select the genotypes and
phenotypes with desirable traits to better understand G
ML and DL are mainly used in wheat to predict yield and the response to biotic (often disease) or abiotic stresses. Applications of ML have been reported for yield prediction [200, 201, 202], fusarium head blight prediction [203, 204], stripe rust [205, 206], and salinity stress [207, 208]. ML and DL have also improved GS predictions beyond those of conventional methods by handling large populations with multi-spectral traits (morphological, biochemical, and physiological) together with genomic data and multi-location yields.
Thus, ML will become an increasingly useful approach for line evaluation. In ML, the massive data generated for each line can be recorded over time in the form of spectral images, temperature, moisture, or light radiation and used to develop a model from the training data set that captures the pattern for any important trait and for the collection of traits that constitutes the line per se. For example, these learning models can be used to identify the early, medium, and late stages of development and then identify the critical stages of any trait or disease, which will further improve predictions. To reduce error and noisy predictions, it is necessary to use a large population size (thousands of data points) to develop the models required to evaluate the elite lines. Hence, ML provides efficient ways of using big data from high throughput phenotyping and genotyping for faster and more extensive evaluations that should enhance genetic gain.
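The train-then-predict workflow described above can be sketched with the simplest possible supervised model: an ordinary least-squares line fitted on training lines and checked on held-out lines. This is a deliberately minimal stand-in for the far richer ML and DL models discussed in the text. The vegetation index, the yields, and the four-line training set are hypothetical, and a real model would need the thousands of data points noted above.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return slope, my - slope * mx


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return mean(abs(o - p) for o, p in zip(observed, predicted))


# Training set: hypothetical vegetation index per line and its yield (t/ha).
train_index = [0.2, 0.4, 0.6, 0.8]
train_yield = [3.0, 4.0, 5.0, 6.0]
slope, intercept = fit_line(train_index, train_yield)

# Validation set: lines that were not used to fit the model.
valid_index = [0.3, 0.7]
valid_yield = [3.6, 5.4]
predicted = [intercept + slope * v for v in valid_index]
error = mean_absolute_error(valid_yield, predicted)
print(slope, intercept, error)
```

Keeping the validation lines out of the fitting step is what allows the error to be read as an estimate of how well the model would predict untested lines.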
The efficiency of the crop evaluations will continue to be enhanced by coupling ML with human-based evaluations. For example, while ML may consider spatial variation, it may be difficult to know what caused it without performing human observations (soil, tillage, weed, drought-related, or other factors).
Recently, interest in hybrid wheat (the harvested seed from a cross between two parent lines) has been renewed by research institutions and seed companies owing to the need for improved abiotic and biotic stress tolerance and yield enhancement [209, 210]. Globally, the rising demand for wheat requires breeders to consider every possible way of unlocking the potential for wheat to mitigate the gap between yield and demand [211, 212]. Currently, pure-line wheat production is facing stagnant yield increases, meaning the potential of hybrid wheat is needed for food security and yield enhancement [211, 213, 214]. The second reason for this renewed interest in hybrid wheat is the new tools that are available to wheat breeders. The theory and understanding of hybrids and heterosis (defined below) continue to increase, as do the genomic and statistical tools required to predict heterosis [215, 216, 217, 218, 219, 220]. Much of the previous trial-and-error aspect of hybrid breeding has been replaced by molecular markers that can be used to identify the key genes required for hybrid seed production, improved mating systems that create heterotic pools [216], and genomic predictions.
Wheat is a naturally self-pollinated crop that could benefit from hybrid breeding to exploit the potential of heterosis [221]. The success of hybrid wheat depends on higher levels of heterosis for yield and quality traits and on the value of the grain (i.e., as the value of the harvested grain increases, the yield increase required for farmer adoption decreases) [222]. Heterosis can be defined as the increased performance of the hybrid over the average of its parents (mid-parent heterosis), the increased performance of the hybrid over the better parent (high-parent heterosis), or the higher performance of the hybrid over commercial cultivars (commercial heterosis) [223, 224]. Of the three heterosis types, geneticists are most interested in mid-parent heterosis, while growers are most interested in commercial heterosis. Heterosis has been noted in numerous studies [215, 225, 226, 227], and wheat hybrids are usually more environmentally stable than pure-line cultivars [228, 229, 230], an added advantage for reducing production risks. Hybrid wheat also has the advantage of dominant alleles, whereby either parent can contribute the needed allele, and the trait will be expressed in the hybrid. It is expected that, through careful hybrid parent selection, hybrids should have greater biotic and abiotic stress tolerance since combining the beneficial dominant alleles of each parent in a hybrid is easier than developing a pure-line cultivar that contains all the beneficial alleles. For long-term success in hybrid wheat development, heterotic groups will likely need to be developed using selective crossing plans and genomic predictions [216]. All the recent work with hybrid parents was performed by selecting parents without heterotic groups [217] and with little pre-existing information on their combining ability.
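The three definitions of heterosis above translate directly into arithmetic. The sketch below expresses each as a percentage gain of the hybrid over its reference value, a common convention that the text itself does not prescribe. It assumes a trait for which larger values are better (such as grain yield), and the yields are hypothetical.

```python
def heterosis(hybrid, parent_1, parent_2, commercial_check):
    """Percent heterosis of a hybrid relative to the mid-parent value,
    the better (high) parent, and a commercial check cultivar.
    Assumes higher trait values are better (e.g., grain yield)."""
    mid_parent = (parent_1 + parent_2) / 2
    high_parent = max(parent_1, parent_2)
    return {
        "mid_parent_pct": 100 * (hybrid - mid_parent) / mid_parent,
        "high_parent_pct": 100 * (hybrid - high_parent) / high_parent,
        "commercial_pct": 100 * (hybrid - commercial_check) / commercial_check,
    }


# Hypothetical grain yields in t/ha.
result = heterosis(hybrid=6.6, parent_1=5.0, parent_2=6.0, commercial_check=6.4)
for name, value in result.items():
    print(f"{name}: {value:.2f}%")
```

In this example the same hybrid shows 20% mid-parent heterosis but only about 3% commercial heterosis, which illustrates why geneticists and growers can judge the same hybrid very differently.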
Self-pollinating cereals, and especially those that are polyploids where there can be heterosis between genomes in pure lines, have a long history of efforts for hybrid breeding with generally low to moderate success [218]. Hybrid breeding in wheat started in the 1950s with the discovery of a cytoplasmic male-sterility (CMS) system in Japan [231].
Research on hybrid wheat has been limited to smaller-scale studies compared to cultivar (pure-line) breeding, and hybrid wheat has not become established in global markets [218]. The future success of hybrid wheat depends on the farmer’s increased profits from higher and more dependable grain yield as compared to the increased cost of hybrid seed and any additional costs for growing hybrids [232]. A hybrid wheat program must meet a few requirements to be commercially successful. In addition to the increased profitability of hybrids compared to cultivars, there must be an efficient system to produce hybrid seed, i.e., an efficient male sterility system and a redesign of female traits that produces higher out-crossing and pollination [233, 234], or apomixis of heterozygous hybrids.
Currently, there are three different approaches for hybrid seed production, viz., CMS, chemical hybridizing agent (CHA), and genetic male sterile systems, such as the blue aleurone (BLA) gene systems, which researchers in wheat breeding are using to produce hybrids [235].
CMS is based on the interaction between cytoplasmic and nuclear genetic factors
[236, 237]. An interaction between the tetraploid wheat (T. timopheevii)
cytoplasm and the nucleus of hexaploid wheat (T. aestivum) led to the
development of a cytoplasmic male sterile line [238]. A genotype capable of
restoring fertility was first discovered in ‘Nebraska 542437’ [239]. The CMS
hybrid seed production system is a three-line (ABR) system that requires
cross-pollination in two steps for hybrid seed production [236]. In the ABR
system, the A-line (sterile cytoplasm with no nuclear restorer genes) is male
sterile; the B-line is a male fertile maintainer line (it has the normal fertile
wheat cytoplasm and the same nucleus as the A-line); and the R-line, called a
fertility restorer line, carries genes that restore fertility to the male
sterile cytoplasm. Since most
restorer genes only partially restore fertility when used singly, most R-lines
have multiple restorer (R
Another approach for producing hybrid wheat is the use of a CHA. Applying a CHA creates female lines by inducing male sterility, thereby preventing the treated line from self-pollinating. The CHA-treated females are pollinated by unsprayed males, and the resulting seed is the hybrid seed [243]. Chemically induced male sterility is thus a way of forcing cross-pollination in wheat to obtain hybrid seed [222]. Compared to CMS, the CHA is a simpler approach for hybrid production. In theory, it induces male sterility in any genotype by suppressing pollen formation [244], and the treated female can be pollinated by any normal wheat plant that sheds pollen when the female is receptive. A good CHA should not harm female stigma receptivity [245] or female grain yield, although CHAs are often phytotoxic to the plant and may not sterilize every genotype [215]. A major difficulty with the extensive use of CHAs is that their application and efficacy are affected by the weather conditions at the time of application, which can cause hybrid seed production failures.
The BLA gene system is another hybrid wheat system invented by Australian researchers [235]. The blue aleurone gene was translocated into T. aestivum from a wild relative. The female (male sterile) line has a homozygous deletion (Probus deletion) on the short arm of wheat chromosome 4B (ms-1 gene) [246]. Though initially created by a deletion, a mutation in the male fertility locus can also be used [235]. This homozygous deletion or mutation of a single gene (ms-1) is a very reliable way to create male sterility. However, fertility is restored in hybrids by crossing with any normal wheat that carries the dominant fertility gene (i.e., that does not have the deletion or mutation). For large-scale male sterile (i.e., female) seed production, the male sterile line is crossed with male lines that are heterozygous for the fertility gene, which is linked to the blue aleurone gene from a wheat relative. In the progeny of this cross, any blue seed will be self-fertile, and any seed that is not blue should be homozygous for the deletion or mutation and male sterile. Under ideal conditions, the fertile and sterile seeds can be sorted using color grain sorters. In the early efforts, the male fertility and blue aleurone color genes were on an additional chromosome (2n = 42+1), but through elegant chromosome engineering, a stable translocation line with tightly linked fertility and blue aleurone genes has been developed [235]. Notably, other nuclear genetic male sterility systems have been proposed in addition to the blue aleurone production systems [247].
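The expected seed classes in the BLA system can be worked out by enumerating gametes. The sketch below assumes complete linkage between the fertility and blue aleurone genes, equal transmission of both gametes from the heterozygous male, and that a single copy of the blue aleurone gene is enough to color the seed. These are simplifying assumptions that the text does not state. The allele labels `ms`, `Ms`, and `Ms-Ba` are hypothetical names used only for this illustration.

```python
from collections import Counter
from itertools import product

# Hypothetical allele labels:
#   "ms"    = deleted or mutated fertility gene (sterile when homozygous)
#   "Ms"    = normal dominant fertility gene
#   "Ms-Ba" = dominant fertility gene linked to the blue aleurone gene
FEMALE = ("ms", "ms")
MAINTAINER = ("Ms-Ba", "ms")
NORMAL_MALE = ("Ms", "Ms")


def classify(genotype):
    """Seed color and male fertility implied by a two-allele genotype."""
    if "Ms-Ba" in genotype:
        return "blue, male fertile"
    if "Ms" in genotype:
        return "non-blue, male fertile"
    return "non-blue, male sterile"


def progeny_classes(mother, father):
    """Expected fraction of each seed class from a cross, assuming each
    parent transmits its two alleles with equal probability."""
    counts = Counter(
        classify((egg, pollen)) for egg, pollen in product(mother, father)
    )
    total = sum(counts.values())
    return {seed_class: n / total for seed_class, n in counts.items()}


# Female seed increase: sterile female x heterozygous blue male.
female_increase = progeny_classes(FEMALE, MAINTAINER)
# Hybrid seed production: sterile female x any normal wheat.
hybrid = progeny_classes(FEMALE, NORMAL_MALE)
print(female_increase)
print(hybrid)
```

Under these assumptions, half of the seed from the female increase is blue and fertile and half is non-blue and sterile, which is why a color sorter can recover the female line, while all hybrid seed is non-blue and fertile.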
Another genic male sterility system, which has been used in other crops, relies on environmentally sensitive male sterility genes [248]. In this case, female seed can be produced in environments where the sterility gene is not expressed, meaning the line is fertile. Hybrid seed can be produced in environments where the sterility gene(s) is expressed and the female line is sterile, causing it to be outcrossed by the male [249, 250]. This system is known as a two-line system.
In practical hybrid wheat research, the experimental hybrids used for testing are made using CHAs because they have less genotype dependence [244] and, in theory, can sterilize any wheat line needed for testing [243]. CHAs can be used for large-scale production; however, genetic-based systems are considered more cost-effective and reliable.
Apomixis is asexual reproduction through seeds without fertilization [251, 252, 253]. Apomixis has little potential for pure-line propagation because selfing and apomixis would both provide genetically identical progeny for a homozygous line. However, apomixis has great potential for propagating hybrids and cloning heterozygous genotypes [253, 254], thus potentially replacing the current hybrid production systems. Hybrid seed production could become as simple as pure-line seed production: seeds from the apomictic hybrid would be planted and sold exactly as seeds from a pure line are. Synthetic apomixis has been achieved in rice (Oryza sativa L.) by inducing MiMe (mitosis instead of meiosis) mutations in gamete cells and activating a parthenogenetic trigger (BBM1) in egg cells, which produces hybrid clonal seeds [255]. The MiMe mutations led to the mitotic division of gametes instead of the meiotic division, thereby producing clonal gametes (2n) [256]. Overall, apomixis may soon become a reality in cereals, including wheat.
We have discussed several new technologies and their potential impact on wheat breeding. Plant breeding is a highly flexible science, one where technology is used to fit the objective; this section will provide examples of how modern technology might be used in an evolving wheat breeding program depending upon the desired predicted outcome. Everything in plant breeding is based on resources. Therefore, we have assumed a well-funded breeding effort that can be scaled back if resources become limited. Here, we will emphasize pure-line cultivar development and then highlight a few concepts for hybrid development.
The introduction of variation will largely continue through crossing. However,
crosses will increasingly become based on genomic predictions [23]. The crosses
will continue to be performed by hand emasculation or using genetic male
sterility [177]. If the breeder wants to make numerous crosses with the same
parent line, a field crossing block can be used, whereby the single line is used
as a male, while the numerous other parents are sterilized with a CHA and used
as females. The resultant CHA hybrids will have sufficient seed to be tested in
multiple environments in replicated trials, which helps predict or validate
which crosses are the most promising to
advance and select within [24]. Growing the hybrids in the field will also
advance the generations and produce F
For elite crosses, DH technology will be used for rapid inbreeding. For more complex crosses involving more diverse parents, large breeding populations and early generation field selections are often beneficial to identify the rare lines that possess more beneficial genes. After the selection generation(s) and once the larger breeding population has been narrowed to contain lines with more beneficial genes, SB [257], single seed descent, or bulk breeding are used. Due to the segregating material and many individuals, repeated high throughput phenotyping [258] will supplement traditional selection for early generation selection. As lines become more inbred and less heterozygous, they will be genotyped for GS [259, 260] and utilized in GWAS studies [148]. Having both phenomic and genomic data is a valuable resource in removing spatial variability and averaging year effects, especially in catastrophic years when trials are lost or have become useless due to winterkilling, hail, drought, or other calamities. In the early generations, the selection is usually “one and done”; thus, single environment selection nurseries are exceptionally vulnerable to abnormal conditions. While the predicted values may not be as accurate as high-quality phenotypic data, they are certainly better than having no data for a selection year.
Evaluation will continue to require extensive testing in mega-environments. The
same tools used for the genomic and phenomic selections will be extended to the
evaluation phase of plant breeding to better account for G
As for developing hybrids, the greatest impact would come from the development of efficient and inexpensive hybrid seed production systems and the development of heterotic groups [217]. As our understanding develops of the genes that restore fertility to the sterile cytoplasm, of male sterile mutations, and of the genes that convey environmentally sensitive genetic male sterility, it may be possible to use gene editing to increase the restoration capabilities of single genes or to link multiple restorer genes. This would lead to more efficient CMS-based hybrid systems and could create environmentally sensitive genetic male sterile lines with higher fertility when the gene is not expressed and greater sterility when the gene is expressed. Increasing our knowledge of the processes involved in breeding hybrids may improve parental line selections for crosses and help breed pure-line cultivars.
Wheat remains an important crop that helps feed the world. Though plant breeders and agronomists have a proud history of developing new cultivars and tailored production practices, the predicted demand for wheat is unprecedented. Consequently, every available new breeding method and tool will be needed to meet this challenge. In this article, we have attempted to describe the latest breeding methods and tools that will make future wheat breeders more efficient and capable of meeting the challenge. The core challenge remains improving yield and productivity and reducing the risk from biotic and abiotic stresses while maintaining good end-use quality. However, the technology has continuously improved. We have attempted to illustrate how these methods and tools expand or augment the foundation for plant breeding in creating new cultivars, which will impact wheat growers and consumers. Every phase of wheat breeding will be affected, and every wheat breeder will adapt their breeding efforts to their current resources and predicted technology costs.
MA proposed, developed an outline and wrote the first draft of the manuscript. PSB and KF recommended additions to the outline and revised the draft. All authors contributed to the following revisions of the article and approved the final version. All authors have participated sufficiently in the work to take public responsibility for appropriate portions of the content and agreed to be accountable for all aspects of the work in ensuring that questions related to its accuracy or integrity. All authors contributed to editorial changes in the manuscript.
This research was partially supported by the University of Nebraska-Lincoln, the University of Nebraska Foundation, and the Nebraska Wheat Board.
This research received no external funding.
The authors declare no conflict of interest.
Publisher’s Note: IMR Press stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.