Transcriptional and epi-transcriptional dynamics of SARS-CoV-2 during cellular infection

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) uses subgenomic RNA (sgRNA) to produce viral proteins for replication and immune evasion. We apply long-read RNA and cDNA sequencing to in vitro human and primate infection models to study transcriptional dynamics. Transcription-regulating sequence (TRS)-dependent sgRNA upregulates earlier in infection than TRS-independent sgRNA. An abundant class of TRS-independent sgRNA consisting of a portion of open reading frame 1ab (ORF1ab) containing nsp1 joins to ORF10, and the 3′ untranslated region (UTR) upregulates at 48 h post-infection in human cell lines. We identify double-junction sgRNA containing both TRS-dependent and -independent junctions. We find multiple sites at which the SARS-CoV-2 genome is consistently more modified than sgRNA and that sgRNA modifications are stable across transcript clusters, host cells, and time since infection. Our work highlights the dynamic nature of the SARS-CoV-2 transcriptome during its replication cycle.

In brief SARS-CoV-2 is the pathogen that is responsible for the global COVID-19 pandemic. Chang et al. demonstrate that the transcriptome of SARS-CoV-2 is dynamic and complex, with expression and relative proportions of viral mRNA changing to reflect the stage of infection in vitro. In contrast, the epi-transcriptome is stable throughout infection.

INTRODUCTION
SARS-CoV-2, a positive-strand RNA beta-coronavirus, is the causative agent of coronavirus disease 2019 (COVID-19) (Zhou et al., 2020). As with all identified coronaviruses, the replicative and infectious cycle of SARS-CoV-2 is characterized by a process termed discontinuous minus-strand extension, which occurs during replication of viral RNA by the viral replication and transcription complex (RTC) within the host cell. The RTC halts synthesis of negative sense RNA when it encounters a 6 to 8 nucleotide (nt) transcription-regulating sequence (TRS) in the body of the genome (TRS-B) and reinitiates synthesis via a template switching event with a homologous TRS present in the 5 0 leader sequence (TRS-L) (V'Kovski et al., 2021). This results in a set of nested negative-strand templates (shown in Figure S1), which are utilized for expression of subgenomic mRNA (sgRNA). Each sgRNA includes the 3 0 polyadenylated (poly(A)) untranslated region (UTR), a truncated set of 3 0 open reading frames (ORFs), and a common 5 0 leader sequence. The production of subgenomic transcripts alleviates pressure on the primary viral genome for protein synthesis and enables the translation of proteins at greater speed and concentration. Major SARS-CoV-2 TRS-dependent mRNAs have been previously described (Davidson et al., 2020;Kim et al., 2020;Taiaroa et al., 2020). However, the changes in the viral transcriptome and epi-transcriptome across the course of cellular infection have not yet been explored.
Long-read sequencing platforms can generate reads spanning the length of these sgRNA and are thus better suited to transcriptomic characterization of its highly nested transcriptome. One such platform is the MinION (Oxford Nanopore Technologies [ONT]), which can sequence either native RNA or cDNA directly without requirement for PCR amplification, therefore reducing PCR-induced biases in estimation of expression levels (Garalde et al., 2018;Batista et al., 2020). Furthermore, RNA modifications induce changes in ONT signal, which enable exploration of the epi-transcriptome using direct RNA (dRNA) sequencing (Garalde et al., 2018;Kim et al., 2020). In this manuscript, we carried out a comprehensive assessment of SARS-CoV-2 transcription. We generated more than 8 million long-read viral dRNA sequences and direct cDNA reads across multiple time points (2, 24, and 48 h post-infection [hpi]) with infected African green monkey (Vero) and human (Calu-3, Caco-2) cell lines. Our dataset was supplemented with publicly available virion (Taiaroa et al., 2020) and HCoV-229E (Viehweger et al., 2019) datasets. We have developed a site to explore the dynamic SARS-CoV-2 transcriptome in an interactive web app: http:// coinlab.mdhs.unimelb.edu.au/. Our dataset provides an expansive overview of the SARS-CoV-2 transcriptome and its changes throughout the course of infection. Our work will enable the development of new diagnostic tests for monitoring the progression of SARS-CoV-2 infectious cycle both in vitro and in vivo. This will assist in better understanding the mechanism of action of therapeutic agents and in monitoring the efficiency of the immune response to SARS-CoV-2 in vaccination studies.

RESULTS
Infection dynamics are represented by changes in proportion of sgRNA Viral RNA load was substantially higher in African green monkey Vero cells in comparison with human Caco-2 and Calu-3 cell lines, reaching a maximum of 74% of all sequenced RNA at 24 hpi. In comparison, a maximum of 4% of all sequenced RNA mapped to SARS-CoV-2 in infected human cell lines at 48 hpi ( Figure 1A). Even as early as 2 hpi, substantially more viral reads were detectable in Vero compared with Caco-2 and Calu-3 cells ( Figure S2), suggesting a faster course of infection in Vero cells.
The proportion of sgRNA (i.e., reads containing both 5 0 leader and 3 0 UTR) among all viral mapping reads peaked at around 40% in all three cell lines at 24 hpi ( Figure 1A). In the non-replicating virion sample (Taiaroa et al., 2020), as well as the 2 hpi samples, most reads were sequenced from the viral genome, because they had complete 3 0 UTR but no 5 0 leader (labeled as non5_3; Figure S2). This indicated that transcriptional activity had yet to accelerate at this early time point. Vero cells showed a greater proportion of sgRNA at 2 hpi compared with the Caco-2 and Calu-3 (Fisher's exact test p = 0.02), suggesting that transcriptional activity is able to commence earlier during infection of Vero cells.
To further investigate the relationship between production of sgRNA and progression of infection, we calculated, for each ORF, the proportion of reads spanning the ORF, which also contained the leader sequence ( Figure 1B). We observed that the RNA derived from the virion sample had the least sgRNA, followed by 2 hpi, whereas the 24 hpi samples had the highest proportion of sgRNA in Vero and Calu-3 cell lines, with maximum discrimination between the virion RNA and 24 hpi obtained for the N ORF.
We then designed primers to measure both subgenomic and total N ORF expression and used quantitative reverse transcription PCR (qRT-PCR) with primers targeting these regions. For comparison, we applied the same approach for both subgenomic and total E ORF (Wö lfel et al., 2020;Corman et al., 2020). In all three cell lines, the difference between subgenomic and total N and E ORFs was smallest at 24 hpi (1.1 and 3.1 cycle threshold [Ct] difference, respectively, in Vero) with a slight increase at the final 48 hpi time point ( Figure 1C). This suggests that SARS-CoV-2 reaches its peak rate of transcriptional activity at the 24 hpi time point. By calculating expected Ct differences between subgenomic and total E and N ORFs from sequence data, we further confirmed that qRT-PCR results captured the same dynamics ( Figure S3).
Overall, these results reveal the changing proportions of sgRNA during the SARS-CoV-2 virus infectious cycle. Our analysis using qRT-PCR to compare total and sgRNA demonstrates the potential to track viral transcriptional activity using PCR. Our data indicate that the slower rate of infection in human compared with monkey Vero cell lines may arise because of both differences in viral entry and differences in rate of early viral genome replication.
Coronaviruses produce classes of TRS-independent sgRNA, which are abundantly expressed Although all coronaviruses use a repetitive 6 nt TRS throughout the genome to generate a nested set of TRS-dependent sgRNA, the breadth of data generated in this study reveals a more detailed transcriptome that is also constituted by transcripts generated through other, unknown genome mechanisms. The depth profile of sgRNA showed sharp changes in read depth, corresponding to negative-strand disjunction mediated by TRS immediately upstream of the ORF (Figure 2A). To better quantify different classes of sgRNA, we developed a new tool, npTranscript, which assigns reads to transcript clusters (see STAR Methods). Using npTranscript, we could calculate the abundance of the sgRNA at various stages of infection. At the peak of infection in Vero cells (24 hpi), the most abundant sgRNA in terms of transcripts per million (TPMs) mapped viral reads were ORFs N (266,000), 7a/7b (63,000), M (62,000), ORF1ab,ORF10 (60,000), ORF3a (26,000), ORF8 (16,000), ORF6 (13,000), S (7,500), E (6,100), and ORF1ab,N (5,700) ( Figure 2B). Time points sequenced: 2, 24, and 48 hpi; cells infected: Caco-2, Calu-3, and Vero. (A) Bar chart (left axis) indicate classification of viral mapping reads based on whether they include 5 0 (e.g., leader), as well as 3 0 (i.e., UTR and poly(A) tail), in terms of transcripts per million (TPMs) mapped viral reads. Line graph (right axis) indicates proportion of host, viral, or sequin mapping reads. (B) Proportion of reads covering each ORF, which are sgRNA by virtue of containing 5 0 leader sequence for direct RNA sequencing datasets. Error bar indicates 95% binomial confidence interval (CI) of proportion estimate using the logistic parameterization. (C) sgRNA activity of SARS-CoV-2 measured by comparing mean differences in Ct values between subgenomic and total N and E genes across technical replicate wells of infected cells (n = 2-3) from all cell lines and across four time points (0, 2, 24, and 48 hpi), shown with ± SD error bars. The mean difference between subgenomic and total transcripts decreases over time and reaches a minimum at 24 hpi, indicating that sgRNA reaches its peak transcriptional activity at 24 hpi across all cell lines. See also Figures S1-S3 and S5 and    Figure 2C). Further inspection of TRS-independent sgRNA indicated that the majority included the first polypeptide in ORF1ab ( Figure 2D). Taking into account polypeptide boundaries, these transcripts contained leader nsp1 and a variable 3 0 trailer incorporating a segment of the genome upstream of ORF10 and continuing until the terminus. The exclusion of the ORF1ab stop codon will allow translation to continue into a portion of the 3 0 ORF downstream of the junction site before a stop codon is reached, which has the potential to produce truncated proteins of unknown function.
To investigate whether this unusual transcript is unique to the SARS-CoV-2 (which is part of the beta-coronavirus family), we re-analyzed ONT dRNA sequence data from the alpha-coronavirus HCoV-229E (Viehweger et al., 2019). We found a similar pattern of TRS-dependent and TRS-independent sgRNA (Figures 3A and 3D), in which the first polypeptide nsp1 was joined to a portion of the 3 0 UTR. Using npTranscript to quantify abundance of these transcripts, we found that TRS-independent transcripts were substantially more abundant in wild-type HCoV-229E compared with a mutant form of 229E in which the conserved 5 0 stem loop 2 (SL2) in HCoV-229E is replaced with that from SARS-CoV and B-CoV 3 0 UTR ( Figure 3B). This finding is suggestive of a role for the SL2 of the leader sequence in the creation of these transcripts in 229E, perhaps via long-range RNA-RNA interaction, and may be relevant to the similar extended leader mRNAs found in SARS-CoV-2. Inspection of the RNA secondary structure of ORF10 + 3 0 UTR indicates that ORF10 forms a bulged stem loop (BSL) structure, upstream of the hypervariable BSL region of 3 0 UTR ( Figure 3C). The BSL is a conserved feature of beta-coronavirus genomes and thought to be essential for viral replication (Madhugiri et al., 2016). Taken together, this evidence supports the role of ORF10 as part of the 3 0 UTR of SARS-CoV-2.

SARS-CoV-2 produces double-junction sgRNA
We also identified a persistent ''double-junction'' pattern in SARS-CoV-2 transcripts. This category featured sgRNA that showed two patterns of disjunction present at low concentrations across both dRNA and cDNA datasets ( Figure 4A). ORF10 was the most frequently added terminal 3 0 ORF in double-junction sgRNA ( Figure 4B). Most first disjunction events were TRS dependent, although 10% used the TRS-independent ORF1ab breakpoint as described in the previous section (Figure S4). In contrast, most second disjunctions were non-TRS dependent, and the 3 0 breakpoint mirrored the ORF1ab,ORF10 breakpoint, suggestive of shared joining mechanism controlling this second junction that differs from TRS-mediated discontinuous minus-strand extension ( Figure S4A). Double-junction sgRNAs were greatest in the Calu-3 48 hpi dataset, in which we observed leader,N,ORF10 and leader,ORF7a,ORF10 as most abundant, with 1,241 and 811 TPMs, respectively. We also observed triple-disjunction clusters at very low levels of expression, such as ORF1ab,ORF1ab,ORF1ab,ORF10, which had an estimated 60 TPMs in the Calu-3 48 hpi dataset. The majority of final junctions of these triple-junction reads includes the ORF10 breakpoint ( Figure S4B).
In comparison, HCoV-229E appeared to have a smaller proportion of double-junction reads. Nevertheless, we observe a leader,ORF1ab,3UTR double-junction cluster ( Figure 3D). This cluster was observed only in the WT HCoV-229E strain, and hence highly dependent on SL2 in the 3 0 UTR ( Figure 4C).

Viral transcript polyadenylation patterns consistent with templating from negative strand
We detected reads with no region mapping to the 3 0 end of the viral genome in both cDNA and dRNA datasets ( Figure 1A). Upon inspection with Nanopolish 'polya,' we found that no poly(A) tail is detected in most of these reads ( Figure S5A), and that more than half of these reads also lacked detectable sequencing adaptor. This observation was consistent regardless of whether the transcripts mapped to the viral 5 0 terminus and contrasted with transcript categories that mapped to the 3 0 end and possessed clearly segmented poly(A) tails. The pattern observed was also consistent with polyadenylation being produced by templating from the negative strand, rather than from host polyadenylation factors. The quantity of non-3 0 reads varied from 2% to 4% (median = 3.3%) of viral reads in all the dRNA datasets we analyzed except for Vero 24 hpi in which 10% of reads lacked the expected 3 0 viral segment (Table S1). We found a nonrandom distribution of terminal breaks for non-3 0 reads; however, the sequence composition of their end segments does not support the idea that it is driven by runs of internal poly(A) (Figures S5B and S5C). Given the requirement for poly(A) tails for ONT sequencing, we considered that these reads may arise from incorrect segmentation of a single read into multiple reads, only one of which possessed a poly(A) tail.

Viral sgRNA expression patterns change during the course of cellular infection
In order to interrogate the differential expression of SARS-CoV-2 transcriptional clusters during the time course, we analyzed ONT direct cDNA data, which were sequenced in triplicate for each time point (2, 24, and 48 hpi) and each cell line (Vero, Calu-3, Caco-2). We utilized npTranscript to generate a reference transcriptome of sgRNA produced by SARS-CoV-2 and to assign    Article ll reads to transcript clusters (see STAR Methods), followed by DESeq2 for differential expression analyses. We normalized each library by the number of viral mapping reads, rather than the total number of viral and host mapping reads in order to establish changes in relative abundance, rather than simply track increase in overall viral RNA during the infection (which can be   Article ll OPEN ACCESS seen in Figure 1). From this analysis, we could identify differentially expressed transcripts between time points, 24 versus 48 hpi (late) in all three cell lines and 2 versus 24 hpi (early to late) in the Vero cell line only (because of extremely low abundance of viral mapping reads in human cell lines at 2 hpi).
Interestingly, in addition to differential expression of transcripts, which have both 5 0 leader and 3 0 UTR, we also found differential expression of transcripts that lacked the leader (non5_3) or the 3 0 UTR (5_non3), or both (non5_non3) (Figures 5A-5D). For the main analysis, we proceeded to analyze the 5_3 subset of the differential expression results (Table 1). From our data, we estimate that the general trajectory of differential expression of SARS-CoV-2 subgenomic transcripts during an infection presents an upregulation of TRS-dependent and TRS-independent transcripts between early and late infection, and then downregulation of TRS-dependent and TRS-independent transcripts, followed by an upregulation of fragmented non5_non3 transcripts at the final stage. Of note, the transcriptional activity of TRS-independent transcripts appeared to occur faster in Vero cells compared with the human cell lines, as seen by the delayed upregulation of TRS-independent transcript in human cell lines in relation to Vero cells ( Figure 5E).
Among these results, one TRS-independent transcript, lead-er_ORF1ab,ORF10_3UTR, has been shown to be consistently differentially expressed across all cell types. This transcript was significantly upregulated (adjusted p value [p-adj] < 0.05) between 2 and 24 hpi in Vero cells and downregulated between 24 versus 48 hpi ( Figures 5A-5D). In comparison, this transcript was upregulated between 24 versus 48 hpi in Caco-2 and Calu-3 cells, mirroring the viral counts over time ( Figure 1A) as the level of these transcripts peaked at 24 hpi in Vero cells and at 48 hpi in the human cell lines. Collectively, these results suggest that the peak of TRS-independent transcriptional activity occurs earlier in Vero cells compared with human cell lines, and the presence of this TRS-independent transcript is of importance because it appears in all three cell types.
Additionally, we found that differentially expressed 5_3 transcripts (p-adj < 0.05) that were either genome mapped or transcriptome mapped revealed a positive linear correlation in log 2 FC between the two mapping methods (Table S2), with less transcripts being differentially expressed in transcriptomemapped reads.

RNA modifications vary between genomic and sgRNA, but not throughout the course of infection
We used Tombo to determine de novo modification predictions on the various mRNA transcripts of the viral genome. Using virion dRNA as baseline (Taiaroa et al., 2020), we identified changes to modification of the genome throughout the course of infection, between individual transcripts, and across the three cell lines: Vero, Calu-3, and Caco-2. The vast majority (98.2%) of reads from the virion dataset included the 3 0 UTR, but not the 5 0 leader, and thus we inferred that it was almost entirely composed of reads from the viral genome rather than transcribed mRNA. The depth of coverage of this dataset was very low at the 5 0 leader, and thus we are unable to report results of RNA modifications in the leader region ( Figure S6).
The rapid infectibility of Vero cells allows a clear analysis of modifications at 24 and 48 hpi. The 2 hpi time point failed to produce adequate subgenomic expression for the analysis, and only 311 viral reads were detected in total.
In our analysis, predicted viral modification sites on specific sgRNA clusters did not change markedly throughout the infection time course ( Figure 6). However, we saw differences on sgRNA as compared with the RNA genome. In particular, all analyzed sgRNA clusters displayed an absence of modification relative to virion genome in three regions as measured by the mean difference in methylated fraction (m DMF; see STAR Methods): 26,130-26,135 in ORF3a (m DMF = 0.62), 28,858-28,862 in ORFN (m DMF = 0.6), and 29750A in 3 0 UTR (m DMF = 0.48) (Figure 6). We also observed that these modifications in the virion genome generated an artificially high rate of base-calling error at these positions.
These same findings were repeated in data from Calu-3 and Caco-2 cells at 24 hpi, indicating that the different cell lines had little impact on modification changes ( Figure 6). These results demonstrate that the viral genome carries RNA modifications that are not detectable on expressed mRNA.
The modifications reported here all consist of changes at >0.2 DMF 2 (equivalent to DMF 0.44). A results summary including bases at >0.1 DMF 2 (equivalent to 0.31) are included in Data S1.

DISCUSSION
The use of long-read native RNA and direct cDNA sequencing allowed the identification of TRS-dependent and -independent transcripts in SARS-CoV-2. TRS-independent transcripts (sometimes referred to as non-canonical sgRNA) are formed without utilizing homologous TRS sequences and have been observed to occur in other SARS-CoV-2 transcriptome studies (Nomburg et al., 2020;Gribble et al., 2021;Kim et al., 2020;Taiaroa et al., 2020). Analysis of the time-course data presented in this manuscript shows a delayed increase of TRS-independent transcripts relative to TRS-dependent transcripts in two , and Caco-2 (D) cell lines between 24 versus 48 hpi analyzed using DESeq2. Thresholds of p-adj < 0.05 and |log 2 FC| > 0.5 were applied to the data. Orange dots indicate transcripts that have |log 2 FC| > 0.5, blue dots indicate transcripts that have p-adj < 0.05, and green dots indicate transcripts that satisfy both criteria. Positive and negative log 2 FC indicate upregulation and downregulation at the latter time point, respectively. The transcript nomenclature W_X,Y_Z indicates that the transcript consists of the segment from W to X joined with the segment from Y_Z. (E) Changes in TRS-dependent (dotted) and TRS-independent (continuous) TPM mapped viral reads across multiple time points (2, 24, and 48 hpi) in Caco-2 (orange), Calu-3 (green), and Vero (blue) cell lines. Error bars indicate 95% binomial CI of TPM estimate using the logistic parameterization. See also Figures S1 andS7 and Table S2.
The function of SARS-CoV-2 ORF10 remains unclear. Some studies have reported evidence of ORF10 translation (Finkel et al., 2021), while others have not found conclusive evidence of its existence in proteome databases (Taiaroa et al., 2020). Pancer et al. (2020) identified single-nucleotide polymorphisms (SNPs) that cause premature stop codons in ORF10 but do not impact viability in vitro or in vivo. The active transcription of the leader ORF1ab,ORF10_3UTR transcript in our data ( Figures  5A-5D) suggests a role for ORF10 distinct from its protein coding potential. This transcript contains the full-length nsp1 peptide, which is responsible for inhibiting host translation (Schubert et al., 2020), as well as the stabilizing stem loop structure from ORF10. Thus, the role of ORF10 in this context may be to stabilize the RNA molecule and enhance production of nsp1. The RNA family database RFAM (Kalvari et al., 2018) includes ORF10 in the Sarbecovirus-3UTR-annotated (RF03125) region of the SARS-CoV-2 genome.
We also observed ORF10 participating in the formation of the second junction in double-junction transcripts. These transcripts typically have a TRS-dependent first junction and a TRS-independent second junction to ORF10. The position of the second junction in the region upstream of ORF10 is variable, further supporting the notion that the ORF10 junction occurs in a homologyindependent manner.
One explanation for the mechanism of TRS-independent sgRNA formation may be long-range RNA interactions. Longrange RNA interactions have previously been demonstrated as important for TRS-mediated leader-body joining in other coronaviruses (Mateos-Gomez et al., 2013) and may also be essential for non-TRS-mediated binding as seen in SARS-CoV-2. Ziv et al. (2020) explored cis-acting RNA-RNA interactions in SARS-CoV-2 and found several long-distance interactions within ORF1ab, including one that binds position 8357 nt with the 3 0 UTR of the genome. This interaction may be responsible for promoting generation of the ORF1ab-ORF10 transcripts we describe.
Differential expression analyses were produced by mapping to the viral genome, as well as to the transcriptome (Table S2). Mapping to the genome allows novel transcripts to be found (Tombá cz et al., 2016), whereas mapping to the transcriptome ensures the identity of the transcripts by clearly defining the junctions/breakpoints (Zhao, 2014). In this context of investigating the transcriptome of a novel coronavirus, there is more merit in mapping to the genome than the transcriptome because the transcriptome has not yet been extensively investigated and is most likely to be incomplete. The issues of using an incomplete reference transcriptome have been outlined previously (Pyrkosz et al., 2013). In our data, this is exemplified in transcripts that contain ORF1ab, because the breakpoint of ORF1ab is variable and cannot simply be defined by one breakpoint coordinate (Figure S7). This may explain why some transcripts are found differentially expressed in genome-mapped, but not transcriptome-mapped, analyses.
The data generated in this study are uniquely suited to studying the dynamics of viral epi-transcriptomics. Earlier studies The differential expression results have been filtered by p-adj < 0.05 and | log 2 FC| > 0.5, and the transcript nomenclature W_X,Y_Z indicates that the transcript consists of the segment from W to X joined with the segment from Y_Z.
10 Cell Reports 35, 109108, May 11, 2021 Article ll OPEN ACCESS have gained insights on methylation of 5 0 capping for the escape of host immunity (Chen et al., 2011) and the impact of host epigenetics on disease outcome (Pinto et al., 2020). Kim et al. (2020) published the first bioinformatics analysis of base modifications on the SARS-CoV-2 viral genome in which they report 41 potential 5mC viral modification sites by contrasting signalspace information of the dRNA-sequenced viral genome against unmodified in-vitro-transcribed (IVT) sequence data. We used the virion-derived RNA as a control, which enabled us to focus on differences between modifications on the viral genome and transcriptome. We find that the genomic RNA harbors more RNA modifications than the transcribed sgRNA. In particular, we report three regions that are more modified in genomic RNA than sgRNA. The strongest of these (position 28858-28862) was reported by Kim et al. (2020), who also reported that position 28859 is more modified among longer sgRNA. We extended this finding by showing that the modified state is representative of the  Figure S6 and Data S1.
Cell Reports 35, 109108, May 11, 2021 11 Article ll genome RNA. We also report a remarkably stable pattern of modifications that showed very little change across cell lines and time points in transcribed sgRNA. This is the first evidence reported for the stability of SARS-CoV-2 epi-transcriptome throughout infection.
A deeper understanding of the SARS-CoV-2 transcriptome and how it changes during infection may lead to new avenues for therapeutic strategies. One example is development of strategies to disrupt the complex patterns of negative-strand disjunction to form sgRNA. Our work also highlights the importance of TRS-independent transcripts in the infectious cycle of SARS-CoV-2, which may also be an avenue for therapeutic development. Moreover, such knowledge also spurs the next generation of diagnostics for monitoring infection progression. The RNA genome modifications described here may also be a target for therapy, although further research is required to understand the role of the modifications described here.

Limitations of study
We used in vitro infection of mammalian cell lines, and as such our conclusions may not fully reflect in vivo SARS-CoV-2 infection. We included three time points in our study (2, 24, and 48 h), which may potentially miss key transcriptional changes in between these time points or after the final time point. We used a SARS-CoV-2 strain obtained from early (January 2020) in the pandemic, which is close to the original Wuhan strain and may not fully reflect infection dynamics of currently circulating strains. Our experimental design used replicates resulting from three separate infections of the same cell line performed at the same time and should therefore be regarded as technical rather than biological replicates. These replicates were sequenced separately for direct cDNA sequencing but were sequenced after pooling for the dRNA because of difficulties in multiplexing dRNA sequencing using ONT sequencing kits. The direct cDNA libraries were multiplexed and sequenced in batches of three infected and three control from the same time point on the same flow cell; thus, there is a possibility of batch effects in comparison between time points.

STAR+METHODS
Detailed methods are provided in the online version of this paper and include the following:

RESOURCE AVAILABILITY
Lead contact Further information and requests for resources, reagents, and code should be directed to and will be fulfilled by the lead contact, Lachlan Coin (lachlan.coin@unimelb.edu.au).

Materials availability
This study did not generate new unique reagents.