Boosting the potential of cattle breeding using molecular biology, genetics, and bioinformatics approaches – a review

Cattle are among the most important farm animals and have undergone intense selection aimed at increasing milk production and improving growth and meat properties, while reducing the generation interval to allow for a faster herd turnover. Recently, a shift from traditional breeding methods to breeding based on genetic testing has been observed. In this perspective, we review the techniques of molecular biology, genetics, and bioinformatics that are expected to further boost the agricultural potential of cattle. We discuss embryo selection based on next-generation and Nanopore sequencing, as well as in vitro embryo production, which boosts the potential of genetically superior animals. Gene editing of embryos could further speed up the selection process, essentially introducing a change in a single generation. Lastly, we discuss host-microbiome co-evolution and adaptation. For example, cattle already adapted to low-quality, low-cost fodder could be bred to achieve desired properties for the beef and dairy industry. The challenge of breeding and gene editing is to combine selection on desired consumer-oriented traits with the push for sustainability and adaptation to a changing climate while remaining economically viable. We propose that we are yet to see the limits of what modern technology can achieve for the cattle of the future; the ultimate goal will be to produce and maintain genetically elite individuals that can sustain the growing demands on production.


Traditional breeding methods
Traditional breeding relied on selecting superior individuals with desired traits, most prominently milk yield, and increasing the frequency of such traits from generation to generation. This process was hampered by the natural limitations of cattle, i.e. life-history traits such as a long generation time in males of up to 7 years (García-Ruiz et al. 2016), fertility rates, and pregnancy length (283 days), but initially also by scientific and technological limitations: a paucity of information about the heritability of specific traits and a reluctance to adopt new practices. Typically, specific phenotypes were used as a proxy for genetics and served as the basis for selection (in the beef industry these would be metrics such as intramuscular fat, eye muscle area, or dressed weight). This process worked well for genetic traits encoded by a single locus with large effects, while polygenic traits with small effects remained elusive. The predicted breeding value (PBV) was assessed using pedigrees: the performance of progeny was estimated from that of its parents. Breeding efforts have had to balance selection pressure on the one hand against inbreeding depression on the other; under extreme conditions, a single bull can sire thousands of progeny. As an example, the genetic analysis of agriculturally valuable North American Holstein cattle showed that they are descendants of only two bulls (Yue et al. 2015). Thus, almost exclusively, only two versions of the Y chromosome exist in this population. Once the natural variation is lost from the population, it cannot be reconstructed unless old, deep-frozen sperm samples are used. For this reason, methods that select economically important traits while accounting for inbreeding are being developed (Colleau et al. 2009), as well as more targeted, region-specific methods that address regions of low heterozygosity, thus narrowing the genomic region under selection (Howard et al. 2017).
Finally, the advent of genetic testing, and specifically the availability of cheap genotyping solutions, has enabled the shift from selection based on phenotypes to selection based on genotypes (Goddard and Hayes 2007), in which information about non-genotyped individuals can also be derived from genotyped individuals and the pedigree information (Legarra et al. 2009).
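The pedigree-based prediction described above is commonly summarized as the parent average: the expected breeding value of a calf is the mean of the breeding values of its sire and dam. The following is a minimal sketch of that arithmetic only; the function name and the example values are hypothetical, and it ignores the Mendelian sampling term that genotyping later made it possible to estimate.

```python
def parent_average(sire_breeding_value: float, dam_breeding_value: float) -> float:
    """Pedigree-based prediction for a calf: the mean of its parents' breeding values.

    Each parent passes on half of its genome, so each contributes half of its
    breeding value. The random (Mendelian sampling) deviation of an individual
    calf from this expectation is not modelled here.
    """
    return 0.5 * (sire_breeding_value + dam_breeding_value)


# Hypothetical breeding values for milk yield (kg), for illustration only.
calf_prediction = parent_average(sire_breeding_value=800.0, dam_breeding_value=400.0)
print(calf_prediction)
```

Full siblings receive the same parent average, which is one reason why pedigree-only predictions cannot distinguish between embryos or calves from the same mating.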

Genetic testing and genotyping cattle with genome arrays
Microarrays specifically targeted at the cattle genome (e.g. the Illumina BovineSNP50 array) have enabled the genotyping of hundreds of thousands of Holstein cows (Taylor et al. 2016), the identification of relevant variants, and thus a rapid selection of traits such as milk, protein, and fat yield as well as life-history traits: generation interval, fertility, longevity, and the onset of puberty. Nonetheless, the relationship between reproductive performance and both milk and carcass traits was found to be antagonistic (Berry et al. 2014; Carthy et al. 2016); for example, high milk production is associated with a shorter duration of oestrus (Lopez et al. 2004). Genotyping-assisted breeding has increased the annual rate of genetic gain by ~50-100%, and as much as 300-400% for lowly heritable traits that are difficult to select on an individual level but can be addressed in a population (García-Ruiz et al. 2016). Due to this success, Holsteins were named "genomic selection poster cows" (Taylor et al. 2016) and further array development followed. For example, a high-density Illumina Bovine DNA chip was used to find genomic loci associated with the breeding value of fertility (BVF) and the breeding value of beef (BVB) (Anton et al. 2018). The genome arrays typically represent relevant single nucleotide polymorphisms (SNPs) in the bovine genome and vary in their density, and the unsampled regions of the genome can be imputed (Boichard et al. 2012). This has allowed the discovery of agriculturally relevant loci (Kiser et al. 2019). Further, a stronger focus on male- and embryo-related traits might be desirable, such as sperm quality or the number of viable and total embryos, respectively (Capitan et al. 2014; Diskin et al. 2016; Fonseca et al. 2018; Taylor et al. 2018). Lower fertility can also be linked to hidden mutations, for instance variants responsible for embryonic lethality, which will never be observed in the homozygous state but should be identified and subsequently considered in the breeding process (Capitan et al. 2014; Hayes and Daetwyler 2019).
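The consequence of such hidden recessive lethal variants for mating decisions follows from simple Mendelian segregation. As a minimal sketch (the function names and the example carrier frequencies are hypothetical and are not taken from the cited studies), the expected embryo loss could be computed as follows:

```python
def lethal_homozygote_risk(sire_is_carrier: bool, dam_is_carrier: bool) -> float:
    """Probability that an embryo is homozygous for a recessive lethal variant.

    Assumes simple Mendelian inheritance and that living parents are at most
    heterozygous carriers, because homozygotes die as embryos.
    """
    return 0.25 if (sire_is_carrier and dam_is_carrier) else 0.0


def expected_loss_random_mating(sire_carrier_freq: float, dam_carrier_freq: float) -> float:
    """Expected fraction of embryos lost if carrier status is ignored when mating."""
    return 0.25 * sire_carrier_freq * dam_carrier_freq


# Hypothetical carrier frequencies of 10% in sires and dams, for illustration only.
print(lethal_homozygote_risk(True, True))
print(expected_loss_random_mating(0.10, 0.10))
```

Once carriers are identified by genotyping, carrier-by-carrier matings can be avoided, which removes this loss without discarding otherwise valuable animals.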
The selection intensity from mothers to daughters in dairy cattle is limited due to the need to replace the predominantly female herd. Thus, out of the four paths, sire(s) of bulls (SB), sire(s) of cows (SC), dam(s) of bulls (DB), and dam(s) of cows (DC), it was the last one that brought the least improvement. In contrast, selection on the SB path has been markedly successful, decreasing the generation interval from ~7 years to less than 2.5 years in bulls, followed by the DB generation interval, which decreased from 4 years to nearly 2.5 years in Holstein dairy cattle (García-Ruiz et al. 2016). Because one bull can sire thousands of progeny, the selection of elite bulls is particularly important. Fertility and economics of individual herds can be improved by the bull breeding soundness evaluation (BBSE), i.e. the evaluation of semen by third-party andrology laboratories (Chenoweth and McPherson 2016). The semen can be sexed before artificial insemination to ensure the desired bull-to-cow ratio, to accelerate the herd turnover (and produce heifers), and lastly, to save costs associated with culling. Calves from sex-sorted semen were shown to develop normally (Tubman et al. 2004). The disadvantages of sex-sorted semen insemination lie in the need for accurately timed intrauterine insemination, higher costs (Garner and Seidel 2008), and pregnancy loss (Karakaya et al. 2014), which can potentially be caused by damage introduced to the sperm (Suh et al. 2005), although new flow-sorting methods are being optimized to reduce or remove this issue (Vishwanath and Moreno 2018). The cost-benefit ratio of sexed semen alone, or in combination with genomic testing, can be calculated (Newton et al. 2018). A few caveats exist for genetic testing: genetic variants, such as SNPs, work best if they are applied in the same population from which they were derived (Seidel 2009).
This means that some of the variants derived using Holstein cows might not be relevant for Jersey cattle, which mirrors observations in humans, where genome-wide association studies (GWAS) are hard to interpret across populations (Tian et al. 2008; Martin et al. 2019). It should be noted that raising a generation of genetically superior cows does not necessarily translate into higher culling rates, as the herd's longevity could be more economical (De Vries 2017). Selection on longevity might also slow down the long-term decline in fertility (VanRaden et al. 2004). Overall, genetic testing has brought unprecedented improvement in cattle production and economics, as demonstrated by the success of Holstein cattle, and has generated enormous interest in the field (Mäki-Tanila and Webster 2019).
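The importance of the generation interval along the four selection paths named above (SB, SC, DB, DC) can be illustrated with a commonly used textbook formulation, in which the annual genetic gain equals the summed selection response across the four paths divided by the summed generation intervals. The sketch below is an illustration under stated assumptions: the SB and DB generation intervals are the ones quoted above from García-Ruiz et al. (2016), whereas all selection intensities, accuracies, and the SC and DC intervals are hypothetical placeholders.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SelectionPath:
    name: str         # SB, SC, DB, or DC
    intensity: float  # selection intensity i (standardized units)
    accuracy: float   # accuracy of selection r, between 0 and 1
    interval: float   # generation interval L in years


def annual_genetic_gain(paths, sigma_a: float = 1.0) -> float:
    """Annual gain = sum(i * r) * sigma_a / sum(L) over the four paths."""
    response = sum(p.intensity * p.accuracy for p in paths) * sigma_a
    total_interval = sum(p.interval for p in paths)
    return response / total_interval


# Hypothetical intensities and accuracies; only the SB and DB intervals come from the text.
before = [
    SelectionPath("SB", 2.0, 0.7, 7.0),
    SelectionPath("SC", 1.5, 0.7, 6.0),
    SelectionPath("DB", 1.5, 0.7, 4.0),
    SelectionPath("DC", 0.3, 0.7, 4.5),
]
shortened = {"SB": 2.5, "DB": 2.5}
after = [replace(p, interval=shortened.get(p.name, p.interval)) for p in before]

print(annual_genetic_gain(before), annual_genetic_gain(after))
```

The sketch holds intensity and accuracy fixed to isolate the effect of the shorter intervals; in practice, genomic selection changes the accuracy as well.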

In vitro embryo production
During the process of in vitro embryo production (IVP), a single cow selected for breeding is superstimulated and her oocytes are collected using transvaginal follicular aspiration. After in vitro maturation and fertilization of the oocytes, embryos are cultured in vitro until the blastocyst stage and transferred into recipients. Furthermore, oocyte collection can take place during various reproductive phases, including pregnancy (Bungartz et al. 1995). Using this increasingly popular method, a single cow can parent dozens of calves via surrogacy, increasing the selection pressure and speeding up the breeding process. A prominent example of this phenomenon is a single Australian cow named W449 that mothered 137 calves. There is an additional advantage associated with IVP: each embryo can be tested, similarly to prenatal genetic diagnostics in humans. A morula or a blastocyst can be biopsied (in the latter case, trophectoderm can be sampled to spare the cells that give rise to the embryo proper). As few as 3-8 cells provide a sufficient amount of genetic material for PCR analysis (Tutt et al. 2020), and the biopsy of bovine embryos produced in vivo and in vitro does not seem to affect pregnancy rates (de Sousa et al. 2016). Importantly, the retrieved embryos can be screened for aneuploidy (Griffin et al. 2019), genotyped, and ranked by quality for implantation purposes. The profitability of the IVP system compared to artificial insemination depends on the cost of the embryo transfer, as well as the price of the surplus heifers (Kaniyamattam et al. 2017). Altogether, IVP in cattle represents a promising direction that can significantly speed up selection for breeding purposes.
Another promising avenue is boosting the agricultural potential of cattle via the maternal lineage. This is because mitochondria are transmitted only maternally (paternal mitochondria are lost), while representing an important energy center of the cell that affects all processes in an organism. In milk production, fat, protein, and lactose are the carriers of energy in the milk, and sequence polymorphism in the mitochondrial DNA (specifically the D-loop) might shape this energy content (Schutz et al. 1994). It was estimated that as much as 2.0, 1.8, and 3.5% of phenotypic variation in milk yield, milk fat yield, and percentage of fat in milk, respectively, was explained by cytoplasmic inheritance (Bell et al. 1985). When estimating the genetic value of an animal, it is important to consider the contribution of mitochondria separately, because such advantages in mothers will not be further transmitted by their sons (Schutz et al. 1994). Additionally, for the female lineage, breeders might also consider somatic cell nuclear transfer (SCNT) or mitochondrial replacement therapy (MRT) (Wilmut et al. 2000; Herbert and Turnbull 2018). SCNT introduces a somatic cell nucleus into the oocyte, followed by nuclear reprogramming, essentially resulting in cloned animals (Ross and Cibelli 2010; Kasinathan et al. 2015). In practice, this technique is not yet employed due to its high cost and low efficiency; SCNT-derived animals have altered gene expression and epigenetic status (Urrego et al. 2014), with the occurrence of hydrops and cotyledonary placenta leading to pregnancy loss (in comparison to IVP embryos) (Lee et al. 2004). Using MRT, another experimental procedure only recently approved for limited use in human reproduction (Castro 2016), oocytes of females with an inferior nuclear genome but superior mitochondrial DNA would be enucleated and used to host a selected nuclear genome (Schutz et al. 1994).
However, it should be noted that it will likely not be sufficient to simply choose a superior mitochondrial lineage, as the compatibility between the nuclear and mitochondrial genomes might be just as important (Wolff et al. 2014; Wang et al. 2017; Hill et al. 2019; Zaidi and Makova 2019). On an inter-species level, Bos taurus oocytes with Bos indicus mitochondrial DNA had a significantly lower mitochondrial DNA copy number than those with Bos taurus mitochondrial DNA (Srirattana and John 2017). Thus, the appropriate donor should provide the most advantageous mitochondrial genome on the given nuclear background (Schutz et al. 1994), and methods for mitochondrial depletion should be considered (Srirattana and John 2017).
The predicted breeding value can be calculated directly on the embryos, which could replace evaluations based on progeny testing. This is already offered commercially by various biotechnological companies, although it is less widely adopted in Europe. Along with the genetic analysis and the identification of aneuploidy, additional services can be provided, such as identifying the genetic origin of an animal or detecting various genetic conditions, such as bovine leukocyte adhesion deficiency, MSTN mutations, or chondrodysplasia. Because the estimated value of embryos is immediately available, it can be used to guide implantation decisions.
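How such embryo-level results could guide implantation decisions can be sketched as a simple filter-and-rank step: embryos flagged as aneuploid or carrying an excluded genetic defect are set aside, and the remainder is ordered by predicted breeding value. The sketch below is hypothetical throughout; the `Embryo` record, its field names, and the example values are assumptions made for illustration and do not describe any commercial service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Embryo:
    embryo_id: str
    predicted_breeding_value: float
    aneuploid: bool = False
    defects: frozenset = frozenset()  # detected genetic defects, e.g. {"BLAD"}


def rank_for_transfer(embryos, excluded_defects=frozenset()):
    """Drop aneuploid or affected embryos, then sort the rest by PBV (highest first)."""
    eligible = [
        e for e in embryos
        if not e.aneuploid and not (e.defects & frozenset(excluded_defects))
    ]
    return sorted(eligible, key=lambda e: e.predicted_breeding_value, reverse=True)


# Hypothetical biopsy results for four embryos from one donor cow.
batch = [
    Embryo("E1", 310.0),
    Embryo("E2", 450.0, aneuploid=True),
    Embryo("E3", 420.0),
    Embryo("E4", 500.0, defects=frozenset({"BLAD"})),
]
ranking = rank_for_transfer(batch, excluded_defects={"BLAD"})
print([e.embryo_id for e in ranking])
```

In this toy batch the two embryos with the highest predicted breeding values are excluded, which shows why screening and ranking need to be combined rather than applied separately.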

Gene editing
The in vitro fertilization process provides the opportunity not only to screen but also to manipulate conceived embryos; the CRISPR gene-editing system has already revolutionized the field of molecular biology and genetics (Hsu et al. 2014). [...] requires a special authorization for the cloned animals to be imported or sold in the EU. In summary, the EU is conservative in adopting gene-editing technologies. We believe that while gene editing must be strictly controlled and regulated, scientific progress should not be impeded. The potential risks of the technology must be balanced against the advantages of more affordable and accessible production, including animal welfare and ecosystem management. Indeed, gene editing can be used to further enhance the embryos for economic gain and animal welfare. For example, dehorning in cattle is costly and distressing for the animals. The underlying genetics suggest that the modification of a single allele, POLLED, could be used to create hornless cattle (Carlson et al. 2016; Lamb et al. 2020), although unintended modifications must be screened for (Norris et al. 2020) in order not to erode the trust of the general public, which might support the use of such technology to improve animal welfare (McConnachie et al. 2019). The introduction of the POLLED allele into the US Holstein and Jersey cattle populations could provide a cost-effective solution while maintaining acceptable levels of inbreeding (Mueller et al. 2019). Another application of gene editing is the creation of disease-resistant animals, for example transgenic animals with higher resistance against cattle tuberculosis (Gao et al. 2017). These solutions are superior to conventional breeding methods that require a reduction in selection intensity to maintain genetic gain (Scheper et al. 2016).
Moreover, recent methods enable sequential multiplex editing of several genomic loci, followed by a clonal expansion of edited cells, essentially inducing a change in a single generation (McFarlane et al. 2019; McLean et al. 2020). While the possibilities of gene-editing technology are vast, it is not clear whether, and which, regulatory and ethical limits will prevent its widespread adoption (Yum et al. 2018).

Genotyping cattle with sequencing
Next-generation sequencing
The next step in genotyping is the use of next-generation sequencing. This technology does not rely on the use of pre-defined probes and instead it reads DNA "as it is". While the genotyping based on SNPs has brought major progress, understanding the full genomic information (including the gene copy number analysis, the discovery of new structural variants and rearrangements, novel isoforms, or long non-coding RNA), could likely further improve our breeding value predictions. Ultimately, we can only perform selection on the markers that are available and we can only improve traits that we measure, arguing that we likely do not yet understand the value of the regulatory sequences in the still inaccessible regions of the cattle genome (Hayes and Daetwyler 2019).
Just as the switch from small DNA chips to dense arrays improved the quality of predictions, next-generation sequencing likely represents the next frontier. This is intertwined with the development of a complete cattle reference, moving from a single reference genome (such as ARS-UCD1.2) to a pan-genome representing a comprehensive catalogue of common cattle breeds (Bickhart et al. 2020). Indeed, combining data from closely related populations was shown to increase the reliability of genomic predictions (Lund et al. 2011). Existing studies have analyzed bovine embryonic genome activation and identified the genes activated at the 8-cell, 16-cell, and blastocyst stages (Graf et al. 2014; Jiang et al. 2014), characterized imprinted genes (Chen et al. 2016), and described changes between in vivo and in vitro produced embryos for selected genes (Goossens et al. 2007).
We propose a few research directions that have the potential to inform agricultural practice. The transcriptome and methylome of bovine embryos during in vivo development could serve as a baseline for the optimization of the culture media and the conditions during cultivation (e.g. temperature, CO2, and O2 levels). This way, appropriate conditions can be selected to minimize embryonic stress. Due to the inherent similarities between early embryonic development in humans and cattle, this aspect of cattle research could conceivably be translated to human reproduction. Low-coverage next-generation sequencing is also efficient for aneuploidy detection. This is advantageous for embryo screening before the transfer, along with the morphokinetic parameters recorded during the cultivation. The rate of aneuploidy in bovine embryos depends on the timing of cleavage (77.8% in late-cleaving vs. 31.8% in early-cleaving embryos), and this is accompanied by a lower copy number of mitochondrial DNA in late-cleaving embryos (Hornak et al. 2016). Deep sequencing could in the future replace chip genotyping, covering the whole genome as opposed to selected probes and uncovering repetitive regions of the genome; for now, SNP analysis remains more accessible in terms of interpretation.
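The principle behind aneuploidy detection from low-coverage sequencing can be sketched as a read-depth comparison: the number of reads mapped to each chromosome in the embryo biopsy is compared with a euploid reference, and a chromosome whose normalized depth departs from the expected two copies is flagged. The sketch below is a deliberately simplified illustration and not the pipeline of the cited studies; the function names, the tolerance, and the read counts are hypothetical, and real analyses work on smaller bins and correct for GC content and amplification bias.

```python
from statistics import median


def estimate_copy_number(sample_counts, reference_counts, ploidy=2):
    """Estimate per-chromosome copy number from mapped read counts.

    Each sample/reference ratio is scaled by the median ratio across
    chromosomes, so that a typical chromosome is assigned the expected ploidy.
    Assumes every reference count is greater than zero.
    """
    ratios = {
        chrom: sample_counts.get(chrom, 0) / ref_count
        for chrom, ref_count in reference_counts.items()
    }
    typical = median(ratios.values())
    return {chrom: ploidy * ratio / typical for chrom, ratio in ratios.items()}


def flag_aneuploid(copy_numbers, ploidy=2, tolerance=0.5):
    """Return the chromosomes whose estimate deviates from the ploidy by >= tolerance."""
    return sorted(c for c, cn in copy_numbers.items() if abs(cn - ploidy) >= tolerance)


# Hypothetical read counts for three chromosomes; chr3 has 1.5x the expected depth.
reference = {"chr1": 1000, "chr2": 1000, "chr3": 1000}
biopsy = {"chr1": 1000, "chr2": 1000, "chr3": 1500}
copy_numbers = estimate_copy_number(biopsy, reference)
print(flag_aneuploid(copy_numbers))
```

Normalizing by the median rather than by the total read count keeps a single gained or lost chromosome from shifting the estimates of all the others.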

Third-generation sequencing
While next-generation sequencing reads DNA/RNA fragments of 100-300 base pairs per sequence, third-generation sequencing, now represented primarily by PacBio and Nanopore, can deliver hundreds of thousands of base pairs per sequence. Nanopore sequencing, in particular, is promising due to its size, affordability (MinION is a hand-held device that can be plugged into the USB port of a laptop and obtained for under 1,000 USD), and real-time operation. The sequencing library can be produced in a few hours with minimal equipment, and sequences are generated as soon as the sequencing run has started. Real-time sequencing could enable the genotyping of embryos without the freeze-thaw cycle (Wei et al. 2018). Moreover, the results for aneuploidy screening were reported to be identical to those obtained by next-generation sequencing (Wei et al. 2018).
Both next-generation and third-generation sequencing face similar challenges: first, the small amount of DNA in a single embryonic blastomere currently requires whole-genome amplification before sequencing; and second, the data analysis remains demanding. Until both the sequencing and the analysis can be performed directly on farms, delivering actionable information such as which embryo should be prioritized for implantation, it is likely that farms will instead invest in obtaining commercially available elite embryos for their herds and focus on the most effective breeding process.

Microbiome analysis and predictions
The microbiome undoubtedly influences the health and disease of an individual, as evidenced in a number of recent studies (Jami et al. 2014; Malmuthuge and Guan 2017; Young et al. 2020). The microbiome is an inseparable part of the evolution of species and has co-evolved to serve the complex requirements of cattle, and about one-third of microbial taxa could be affected by host additive genetics (h² ≥ 0.15) (Li et al. 2019). The adaptation to specific fodder is especially important in cattle because feed represents 75% of the cost in the beef industry and 40-60% in the dairy industry (Bach 2012; O'Hara et al. 2020). Thus, cattle already adapted to low-quality, low-cost fodder could be bred to achieve desired properties for the beef and dairy industry. This is in line with an upcoming "Feed Saved" project announced by the Council on Dairy Cattle Breeding, introducing feed efficiency as another trait to consider in the breeding process. Another avenue of research is the attempt to manipulate the cattle microbiome, e.g. by inoculation. For example, the introduction of corn to unadapted cattle can cause acidosis because the bacteria that digest lactic acid, such as Megasphaera elsdenii, have not yet built up in the rumen (Miller et al. 2013). One more example includes attempts to reduce methane production, as livestock accounts for a significant proportion of anthropogenic greenhouse gas emissions (Sejian et al. 2015). Such interventions might be most promising in early life, when microbiome development is still turbulent (Yáñez-Ruiz et al. 2015). In another example, the rumen microbiome (and specifically Prevotella sp.) has been associated with higher amounts of protein in the milk, and these cows also produced less methane (Xue et al. 2020). In conclusion, a better understanding of microbiome composition and function, as well as of the process of microbiome adaptation and heritability, could provide a distinct advantage for agricultural practice.

Cattle of the future
The current issues in agriculture are primarily related to economics, followed by environmental issues such as antibiotic resistance and climate change (specifically, adaptation of cattle to the changing environment). Ideally, strategies that simultaneously improve both adaptation and production should be utilized (Marshall et al. 2019; Strandén et al. 2019), as well as strategies that increase disease resistance while improving sustainability (Boichard et al. 2012). For instance, heat tolerance could be achieved by the introgression of the "slick" prolactin receptor variant (Davis et al. 2017). The value of an animal will be characterized not only by its parameters but also by its resilience. We believe that the most effective solutions are aimed towards producing elite individuals accompanied by a reduction of herd size (Plate II, Fig. 1). What additional changes will we see in the future? The health and wellness of animals will likely be continuously monitored via cattle tracking systems. These systems could monitor GPS location, rumination length, and heat/cold stress, and provide early detection of mastitis or lameness, among others (Norton and Berckmans 2017).

Conclusions
The biggest barriers to adopting novel techniques of reproductive biology into agriculture are cost and legislation. While genotyping of individuals and in vitro culture were initially associated with a higher cost, they paid off in the form of long-term benefits in longevity, health, and fertility of the animals. In the future, new statistical methods will enable a focus on longitudinal traits (such as the lactation curve) (Oliveira et al. 2019), new techniques will enable earlier pregnancy detection, and the technologies that improve both adaptation and production will be of utmost importance, as well as next- and third-generation sequencing, gene editing (coupled with higher cloning and gene editing efficiency), and microbiome research. This will lower the cost and speed up the production of individuals with desirable traits, while simultaneously promoting sustainable solutions. In the future, we will likely see a shift from herds composed of average individuals and a few elite ones to herds composed only of individuals traditionally labelled as elite. While these technological advancements will require reconciliation with existing legislation in each country, there is no doubt about their potential to lower the impact on the environment and human health.

Fig. 1. Boosting the agricultural potential of cattle using genotyping (microarrays, next-generation sequencing, Nanopore sequencing), gene editing, and microbiome research. The superior embryos will be selected, screened, and implanted into surrogate mothers. The elite individuals from this process will replace larger, traditional herds. The figure was created using BioRender.com.