Please read this article
Created on: November 26, 2025
Answered using GPT-5.1 Thinking by Chat01
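I have finished reading the article; here is a rough outline to start with 👇 (s41477-020-0688-1.pdf)
The article focuses on Arabidopsis and sets out to answer systematically:
The authors established a Nanopore long-read "full-length chromatin-bound RNA-seq" method:
It yielded about 3.5 million high-quality full-length reads covering more than 20,000 protein-coding genes. (s41477-020-0688-1.pdf)
Conclusion: in Arabidopsis, cotranscriptional splicing does occur, but it is slow overall.
Explanation:
These intron-containing poly(A)+ transcripts are "detained" on chromatin; only after the specific introns are later spliced out can the mature mRNA be released to the nucleoplasm and cytoplasm.
The authors take constitutive introns that are still clearly retained in chromatin-bound poly(A)+ RNA but almost completely spliced out in the cytoplasm, and define them as:
pts introns (post-transcriptionally spliced introns). (s41477-020-0688-1.pdf)
The authors compiled 6,521 public RNA-seq libraries (>1,500 matched "mutant vs WT" or "treatment vs control" pairs) to search systematically for:
Mutants:
Environmental treatments:
Conclusion:
Most of the intron retention events widely observed in Arabidopsis occur on pts introns; they are a regulatable state made possible by chromatin retention plus post-transcriptional splicing, not random splicing failure.
This can be viewed as a "nuclear standby mRNA reservoir" model:
In this way, post-transcriptional splicing plus nuclear retention becomes a regulatory layer that allows a rapid release response, a mechanism that may be conserved in both plants and animals. (s41477-020-0688-1.pdf)
If you would next like to:
Just tell me which part you care about most, and I can give you a paragraph-by-paragraph close reading of that part.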
Unexpectedly, we also detected a high percentage of polyad-
enylated transcripts on chromatin, and ~30% of these transcripts
remained incompletely spliced (Figs. 1b, 2a and Fig. 3a). Both the ratio
of incompletely splicing transcripts and the ratio of unspliced introns
are highly consistent between the two Nanopore libraries (r = 0.83
for genes and r = 0.85 for introns; Supplementary Figs. 6 and 7). The
splicing results are also highly consistent between the PacBio library
and the two Nanopore libraries (r = 0.87 and r = 0.92 for genes, and
r = 0.88 and r = 0.91 for introns; Supplementary Figs. 6 and 7). We
also constructed Illumina RNA-seq libraries of chromatin-bound
polyadenylated RNA with three biological replicates (Supplementary
Table 1). The ratios of unspliced introns are also highly consistent
between Illumina RNA-seq and Nanopore data (r = 0.91), and are
also consistent across three Illumina RNA-seq biological replicates
(r = 0.94, r = 0.95 and r = 0.96; Fig. 3b and Supplementary Fig. 7).
Furthermore, most of the incompletely spliced polyadenylated
transcripts contained only one or a few unspliced introns, whereas
the other introns had been spliced efficiently (Figs. 2d, 3a and 3c),
indicating the specificity of unspliced introns. To further analyse the
differences between introns with high and low unspliced ratios in
chromatin-bound polyadenylated RNA, we separated introns into
two groups using a cut-off value of 0.1. For introns in the group
with an unspliced ratio of chromatin-bound polyadenylated RNA
of higher than 0.1, the cotranscriptional splicing rates were much
slower (Fig. 3d), consistent with the result showing that these introns
remain unspliced after PA.
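What is this long paragraph saying? My understanding is that the previous paragraph showed that some A-tailed reads in the nucleus still contain introns, and it feels like this one says the same thing.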
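Your understanding is right: the opening sentence of this paragraph does repeat the earlier conclusion:
"Quite a few transcripts on chromatin are already polyadenylated, and about 30% of them still have unspliced introns."
But the point of the whole paragraph is not to restate the phenomenon; it does two things:
The string of r = 0.xx values that follows is all cross-validation:
Between the two Nanopore libraries
Between PacBio and Nanopore
Illumina short reads vs Nanopore long reads
In other words:
"Whichever platform (Nanopore / PacBio / Illumina) and whichever replicate I use, I see the same picture of 'polyA+ but with retained introns'.
So this is not an error specific to one platform or a chance result of a single experiment."
This step is meant to head off technical doubts from readers:
"Isn't Nanopore just misaligning or miscalling introns?" They use different platforms plus multiple replicates to show it is not a technical artifact.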
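To make those r values concrete: each one is a correlation coefficient between the per-intron (or per-gene) unspliced ratios measured in two libraries. Here is a minimal sketch, assuming plain Pearson correlation over introns present in both libraries. The function name and the toy numbers are hypothetical, not taken from the paper.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of unspliced ratios."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)

# Toy example (made-up values): the same four introns measured in two libraries.
library_a = [0.02, 0.10, 0.35, 0.60]
library_b = [0.03, 0.12, 0.30, 0.55]
r = pearson_r(library_a, library_b)
```

An r close to 1 means that introns with a high unspliced ratio in one library also have a high ratio in the other, which is the argument against a platform-specific artifact.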
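The second half does two more things:
"most of the incompletely spliced polyadenylated transcripts contained only one or a few unspliced introns, whereas the other introns had been spliced efficiently … indicating the specificity of unspliced introns."
In plain language:
When a polyA+ transcript is not fully spliced, usually only one or two introns remain unspliced,
the other introns are all spliced well,
so the retained introns are "specifically singled out to be kept", not randomly missed.
This is visualized in Figs. 2d, 3a and 3c.
They set a cut-off of 0.1 to split introns into groups:
They then looked at the cotranscriptional splicing kinetics of the two groups (Fig. 3d):
Result: the group with a high unspliced ratio is spliced cotranscriptionally much more slowly,
that is, even after Pol II has travelled far past the 3′SS, these introns are still not readily spliced out.
The logic of this sentence is:
"They still carry introns after polyadenylation because they were spliced slowly during cotranscriptional splicing from the start."
This step links "not yet spliced post-transcriptionally" to "slow cotranscriptional splicing kinetics",
laying the groundwork for the later concept of "pts intron = a class of intrinsically slow-splicing introns".
So although it looks like repetition, it actually moves from "this phenomenon exists" to "this phenomenon is reliable and points to a special class of slowly spliced introns".
Later, the paper names this class pts introns and analyses how they are regulated in mutants and under environmental stress.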
We next examined whether these incompletely spliced transcripts
can be released from the chromatin. We constructed next-generation
sequencing (NGS) RNA-seq libraries using polyadenylated RNAs
from different cell fractions, and found that the introns with an
unspliced ratio of higher than 0.1 in chromatin-bound polyadenyl-
ated RNA were barely detectable in the nucleoplasm and cytoplasm
(Fig. 3e), suggesting that the intron-containing transcripts were
still bound to chromatin after PA. Moreover, the RNA accumula-
tion level (fragments per kilobase of transcript per million mapped
reads (FPKM) value) of genes with a higher incompletely spliced
ratio in chromatin-bound polyadenylated RNA was notable higher
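I still have a question: isn't the unspliced ratio something about reads? How can it belong to an intron? Second, these introns are not visible in the nucleoplasm and cytoplasm, but isn't the nucleoplasm what we traditionally mean by "inside the nucleus"?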
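Good questions; both points are easy to get tangled in at first 🤝 I'll take the two questions one at a time.
There are two levels of measurement here, and the article mentions both:
ratio of incompletely spliced transcripts
ratio of unspliced introns
That is:
"introns with an unspliced ratio of higher than 0.1"
refers to those introns that are still retained in ≥10% of the reads covering them in chromatin-bound polyA RNA. (s41477-020-0688-1.pdf)
Reads are only used for counting; what actually gets scored is each individual intron.
So the phrase "introns with a high unspliced ratio" is fine.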
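The two levels can be sketched in code. This is only an illustration under a simplifying assumption: each long read is represented as a dict mapping the introns it spans to `True` (retained) or `False` (spliced). The data layout and function names are hypothetical, not the paper's pipeline.

```python
from collections import defaultdict

def intron_unspliced_ratios(reads):
    """Intron-level score: fraction of covering reads that retain each intron."""
    retained = defaultdict(int)
    covered = defaultdict(int)
    for read in reads:
        for intron_id, is_retained in read.items():
            covered[intron_id] += 1
            if is_retained:
                retained[intron_id] += 1
    return {intron_id: retained[intron_id] / covered[intron_id] for intron_id in covered}

def incompletely_spliced_fraction(reads):
    """Read-level score: fraction of reads that retain at least one intron."""
    informative = [read for read in reads if read]  # ignore reads spanning no intron
    if not informative:
        return 0.0
    incomplete = sum(1 for read in informative if any(read.values()))
    return incomplete / len(informative)

# Toy example: four reads from one gene with two introns.
reads = [
    {"intron1": False, "intron2": True},
    {"intron1": False, "intron2": False},
    {"intron1": False, "intron2": True},
    {"intron1": False, "intron2": False},
]
per_intron = intron_unspliced_ratios(reads)      # intron1 -> 0.0, intron2 -> 0.5
per_read = incompletely_spliced_fraction(reads)  # 0.5
```

The same reads feed both numbers, but the first is a property of each intron and the second is a property of the read population.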
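In the Illumina data, they use PIR (percent intron retention) as this "unspliced ratio", and the Methods describe it clearly: E1–I and I–E2 junction reads represent the retained intron, E1–E2 junction reads represent the spliced form, and a percentage is computed from them. (s41477-020-0688-1.pdf)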
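A minimal sketch of such a percentage, assuming the two exon-intron boundary counts are averaged before being compared with the spliced junction count. This is one common way to write PIR; the exact formula in the paper's Methods may differ, and the function name and arguments are hypothetical.

```python
def percent_intron_retention(e1_i_reads, i_e2_reads, e1_e2_reads):
    """PIR for one intron from three junction read counts.

    e1_i_reads:  reads spanning the upstream exon-intron boundary (retained)
    i_e2_reads:  reads spanning the downstream intron-exon boundary (retained)
    e1_e2_reads: reads spanning the exon-exon junction (spliced)
    Returns None when there is no informative read.
    """
    retained = (e1_i_reads + i_e2_reads) / 2  # assumption: average the two boundaries
    total = retained + e1_e2_reads
    if total == 0:
        return None
    return 100.0 * retained / total

# Toy example: 10 reads at each boundary, 90 spliced junction reads -> PIR of 10%.
pir = percent_intron_retention(10, 10, 90)
```

On this scale, a PIR of 10 corresponds to the unspliced ratio cut-off of 0.1 used elsewhere in the paper.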
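Right, this comes down to stepwise subcellular fractionation: the authors split "inside the nucleus" into two parts.
Their procedure is roughly as follows (the Methods page has the detailed protocol): (s41477-020-0688-1.pdf)
So:
Their logic is therefore:
If these intron-containing polyA+ transcripts had already been released from chromatin,
we should at least see traces of them in the nucleoplasm,
before they finally get the chance to be exported to the cytoplasm.
But what they observed is (Fig. 3e):
Together with Fig. 3f, where these genes are expressed at higher levels in chromatin than in the nucleoplasm / cytoplasm, this further indicates:
These "incompletely spliced polyA+ transcripts"
stay in the chromatin layer of the nucleus, not in the nucleoplasm / cytoplasm.
So you can think of their compartments this way:
If you like, I can draw you a small schematic flow chart (text version) that maps the three layers "chromatin → nucleoplasm → cytoplasm" to splicing status, so you can see at a glance at which layer these pts introns get stuck.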
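So this paragraph says these unspliced introns are almost invisible outside the nucleus, meaning they get spliced once outside the nucleus, right?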
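Yes; your current understanding just needs a small adjustment to be more accurate 👇
introns with an unspliced ratio of higher than 0.1 in chromatin-bound polyadenylated RNA were barely detectable in the nucleoplasm and cytoplasm … suggesting that the intron-containing transcripts were still bound to chromatin after PA. (s41477-020-0688-1.pdf)
In plain language:
You can understand it that way, but note that there are two levels:
Experimental observation (what they actually saw)
The authors' inference / model
So the sentence you asked:
"Meaning they get spliced once outside the nucleus, right?"
More precisely:
That is:
👉 "Intron-containing polyA+ transcripts do not leave chromatin; those that reach the nucleoplasm/cytoplasm have essentially already had these specific introns spliced out."
If you like, I can lay out the stages a pts intron goes through as a 1→2→3→4 timeline (for example: some introns spliced cotranscriptionally → polyA added while one intron remains → chromatin retention → splicing completed → release/export).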
So, more precisely, they must be fully spliced before they can leave the nucleus?
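You can understand it that way, but "must" should be put a little more softly 😊
For transcripts carrying these pts introns, the authors' model is essentially:
As long as it still carries a pts intron, a polyA+ transcript is held on chromatin;
those that reach the nucleoplasm / cytoplasm are almost all versions with the pts introns spliced out. (s41477-020-0688-1.pdf)
The evidence comes mainly in three parts:
Stepwise fractionation:
Expression-level distribution:
NMD double mutant upf1 upf3:
So, in the statistical sense of this dataset, one can say quite plainly:
👉 For these pts introns, "transcripts basically have to be fully spliced before they can leave chromatin → enter the nucleoplasm → and then leave the nucleus".
So a more rigorous way to put it is:
Their results "strongly suggest" that
polyA+ transcripts carrying pts introns are generally not released to the nucleoplasm/cytoplasm before these introns are spliced out;
in practice, almost only the fully spliced versions are seen off chromatin (in the nucleoplasm and cytoplasm).
If you later write a review or give a report, you could summarize it like this:
"For pts introns, nuclear export essentially depends on whether these introns have been spliced out, which is a key post-transcriptional 'quality-control gate' step."
This keeps the biological model the authors want to convey without making "must be fully spliced" overly absolute.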
Furthermore, if these
intron-containing transcripts were to be released from chromatin,
they would probably be targeted by the nonsense-mediated decay
(NMD) pathway, but we found that the accumulation level of most
of these unspliced introns remained the same in the up frameshift
1 (upf1) upf3 double mutant, which disrupts the NMD pathway
(Fig. 4b), suggesting that most of these incompletely spliced tran-
scripts are not degraded by NMD, consistent with previous reports
that most transcripts with intron retention (IR) are resistant to
NMD in plants23. Thus, our results suggest that chromatin retention
of incompletely spliced transcripts represents an important step in
mRNA maturation and has a critical role in post-transcriptional
gene regulation. Misannotation of introns are frequent and would
appear as IR events when analysing the RNA-seq data. To obtain
introns that are being actively spliced, we followed a recent study
that characterized splicing kinetics in Drosophila and humans, and
we used only constitutively spliced introns for our splicing analysis
(see Methods)22. To simplify the analysis, we hereafter define pts
introns as constitutively spliced introns with an unspliced ratio of
at least 0.1 in the chromatin-bound polyadenylated RNA fraction.
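I'll break this passage into three parts and go through it sentence by sentence.
if these intron-containing transcripts were to be released from chromatin, they would probably be targeted by the nonsense-mediated decay (NMD) pathway …
This means:
By conventional reasoning, if these mRNAs that still carry introns really left the nucleus (left chromatin and reached the cytoplasm), they would most likely be degraded by NMD because they carry premature termination codons.
So they ran a test by contradiction:
… but we found that the accumulation level of most of these unspliced introns remained the same in the upf1 upf3 double mutant, which disrupts the NMD pathway …
… suggesting that most of these incompletely spliced transcripts are not degraded by NMD …
So the conclusion is:
👉 These incompletely spliced transcripts are not the kind that "leave the nucleus → get cleared by NMD";
they simply do not go down the NMD route on a large scale.
This echoes your earlier question, "must they be fully spliced before they can leave the nucleus?":
The model supported by the data is that the incompletely spliced ones basically stay stuck on chromatin, rather than leaving and being destroyed by NMD.
Thus, our results suggest that chromatin retention of incompletely spliced transcripts represents an important step in mRNA maturation and has a critical role in post-transcriptional gene regulation.
In plain language:
This can be merged with your earlier summary into one sentence:
For pts introns, spliced or not spliced largely determines whether the transcript leaves the nucleus, and this "stuck on chromatin until splicing finishes" state is used to regulate gene expression.
The next few sentences are about how to avoid mistaking annotation errors for IR:
Misannotation of introns are frequent and would appear as IR events when analysing the RNA-seq data.
Some "segments annotated as introns" are physiologically exons or UTRs and are never spliced out;
if they are carelessly included, they will "all look like IR", but that is really just an annotation error.
So they applied a filter:
we used only constitutively spliced introns for our splicing analysis …
This ensures that:
the IR events under discussion all occur on introns that are "always spliced under normal conditions",
not "permanent IR" caused by annotation errors.
To simplify the analysis, we hereafter define pts introns as constitutively spliced introns with an unspliced ratio of at least 0.1 in the chromatin-bound polyadenylated RNA fraction.
Written as a formula:
A pts intron meets two conditions: it is a constitutively spliced intron, and its unspliced ratio in the chromatin-bound polyadenylated RNA fraction is at least 0.1.
Combined with the earlier results:
So a pts intron is the picture you now have in mind:
"the class of introns that linger unspliced on chromatin as a regulatable intermediate state, and whose transcripts are released for export only after splicing is complete".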
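The definition quoted from the paper can be written as a tiny predicate. This is a sketch only: the function and argument names are hypothetical, and how "constitutively spliced" is decided is left to the paper's Methods, so here it is simply passed in as a flag.

```python
PTS_CUTOFF = 0.1  # cut-off stated in the paper's definition of pts introns

def is_pts_intron(is_constitutive, chromatin_unspliced_ratio, cutoff=PTS_CUTOFF):
    """True if an intron satisfies both conditions of the quoted definition.

    is_constitutive:           intron is constitutively spliced (decided upstream)
    chromatin_unspliced_ratio: unspliced ratio in chromatin-bound polyadenylated RNA
    """
    return bool(is_constitutive) and chromatin_unspliced_ratio >= cutoff

# Toy examples with made-up ratios.
examples = {
    "slow_constitutive": is_pts_intron(True, 0.35),   # True: constitutive and >= 0.1
    "fast_constitutive": is_pts_intron(True, 0.02),   # False: spliced efficiently
    "misannotated":      is_pts_intron(False, 0.90),  # False: filtered out as non-constitutive
}
```

The third case shows why the constitutive filter matters: a misannotated "intron" can have a very high unspliced ratio and still not count as a pts intron.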
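If you like, next we can organize the differences between "ordinary introns vs pts introns" into a small comparison table (splicing speed, compartment, sensitivity to mutants / stress) for your future reports / model figures.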
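So this whole Figure 3 part says that A-tailed reads in the nucleus retain some special introns, and that once transcripts are outside the nucleus these introns are no longer present in the reads. From there they bring in the NMD pathway: if these transcripts were being eliminated by NMD, then disabling NMD should make them rise sharply, but the experiment found nothing of the kind.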
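Yes, the overall direction of your understanding is right. Let me "correct it and add a few details" to turn it into a more rigorous version 👇 (s41477-020-0688-1.pdf)
It can be broken into three steps:
Observation: among polyA+ transcripts on chromatin, a set of introns is frequently retained
Fractionation comparison: these retained introns appear only in the chromatin layer
"polyA+ transcripts carrying these introns accumulate mainly on chromatin and almost never appear in the nucleoplasm or cytoplasm in 'intron-containing form'."
Inference: what would happen if they really did get out? This brings in the NMD line of argument
Your version was roughly:
"These A-tailed nuclear reads retain some special introns;
if these introns leave the nucleus, they are no longer present in the reads;
perhaps they are eliminated by NMD;
if NMD really eliminated them, knocking out NMD should make them rise, but the experiment did not see that."
Let me adjust the emphasis and wording slightly:
"Among polyA+ RNA in the chromatin fraction, many transcripts carry pts introns;
but in the polyA+ RNA of the nucleoplasm and cytoplasm these introns are almost undetectable."
So a more precise statement is:
These polyA+ transcripts carrying pts introns are mainly held on chromatin and do not enter the nucleoplasm and cytoplasm in "intron-containing form".
From the data:
So a good way to put it is:
The polyA+ mRNAs we capture in the nucleoplasm/cytoplasm essentially no longer carry these pts introns.
As for "whether every single molecule is fully spliced before leaving the nucleus", strictly speaking they infer this from population distributions rather than single-molecule tracking, so "basically needs to be fully spliced before it can leave" is the more appropriate wording.
This is indeed the chain of reasoning you described, but it can be laid out more clearly:
→ So:
These incompletely spliced transcripts are not mainly cleared by NMD in the cytoplasm; they hardly leave the nucleus at all (more accurately: they do not leave the chromatin layer).
From this follows the concluding sentence:
chromatin retention of incompletely spliced transcripts represents an important step in mRNA maturation and has a critical role in post-transcriptional gene regulation.
That is:
Locking polyA+ transcripts that carry pts introns on chromatin and releasing them only after post-transcriptional splicing is complete is a key step in mRNA maturation and an important layer of post-transcriptional regulation.
Fig. 3 first shows that a set of polyA+ transcripts carrying specific introns (pts introns) accumulates on chromatin, while these introns are almost undetectable in the nucleoplasm and cytoplasm;
then, together with the finding that upf1 upf3 does not make these introns rise substantially, it indicates that they are not cleared by NMD after export but exist as a "chromatin-retained intermediate state awaiting post-transcriptional splicing".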
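From now on I'd like us to call "a set of polyA+ transcripts" "A-tailed RNA reads". I understand exactly what it means, but your phrasing is too awkward and I have to stop and think every time I see it. You can add a note in parentheses after my term.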
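Got it! From now on I'll follow your wording 👇
For this article, for example, we can summarize:
On chromatin, a set of A-tailed RNA reads (polyA+ transcripts) still retain specific introns (pts introns); these intron-containing A-tailed reads are almost undetectable in the nucleoplasm and cytoplasm, indicating that they are still held on chromatin and have not been released. (s41477-020-0688-1.pdf)
If you have "easier-to-say versions" of other terms later, just tell me and I'll write them your way.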
To determine whether the splicing of pts introns is affected in
certain mutants or regulated by environment signals, we selected
6,521 high-quality, publicly available RNA-seq libraries that account
for 1,512 matched groups with at least two biological replicates of
each sample, including 713 groups of mutant versus wild type (WT)
and 799 groups of treatment versus mock/CK, to search for dif-
ferentially retained introns at genome-wide scale (Supplementary
Tables 2 and 3). From the mutant versus WT analysis, we identi-
fied ten mutant samples with an increased retention ratio (that
is, increased unspliced ratio) in more than 500 introns compared
with their matched WT controls (see Methods). The corresponding
genes for these mutants were all previously reported to be involved
in splicing, such as PROTEIN ARGININE METHYLTRANSFERASE
5 (PRMT5) and SKI-INTERACTING PROTEIN (SKIP)24–27
This passage is mainly about "how we show that splicing of pts introns is indeed regulated by splicing factors and environmental signals", and how exactly they mined these regulatory events. (s41477-020-0688-1.pdf)
I'll break it into steps:
we selected 6,521 high-quality, publicly available RNA-seq libraries … 1,512 matched groups … 713 groups of mutant versus wild type (WT) and 799 groups of treatment versus mock/CK …
This means:
You can think of it as:
"We dug up every decent Arabidopsis RNA-seq dataset in public databases and treated them as one very large stress/mutant screen to check whether pts intron splicing is affected."
From the mutant versus WT analysis, we identified ten mutant samples with an increased retention ratio (that is, increased unspliced ratio) in more than 500 introns compared with their matched WT controls …
What they did here:
In the end they singled out 10 such mutant samples.
The corresponding genes for these mutants were all previously reported to be involved in splicing, such as PRMT5 and SKIP …
In other words:
A large-scale rise in IR → essentially points to splicing-factor mutations, not to assorted unrelated genes.
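The shape of that screen can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the paper's Methods define what counts as "increased retention" (most likely with replicate-aware statistics), whereas this sketch uses a simple hypothetical difference threshold `min_delta`. All names and data are made up.

```python
def count_increased_introns(mutant_ratios, wt_ratios, min_delta=0.1):
    """Count introns whose unspliced ratio rises by at least min_delta vs WT.

    Both arguments map intron IDs to unspliced ratios; only introns measured
    in both samples are compared. min_delta is a hypothetical threshold.
    """
    shared = mutant_ratios.keys() & wt_ratios.keys()
    return sum(1 for intron_id in shared
               if mutant_ratios[intron_id] - wt_ratios[intron_id] >= min_delta)

def flag_samples(groups, min_introns=500, min_delta=0.1):
    """Return names of matched groups with more than min_introns increased introns.

    groups maps a sample name to a (mutant_ratios, wt_ratios) pair.
    """
    return [name for name, (mutant, wt) in groups.items()
            if count_increased_introns(mutant, wt, min_delta) > min_introns]

# Toy example with three introns; the real screen uses the >500 intron criterion.
mutant = {"a": 0.5, "b": 0.2, "c": 0.1}
wild_type = {"a": 0.1, "b": 0.15, "c": 0.1}
n_up = count_increased_introns(mutant, wild_type)               # only intron "a" rises enough
hits = flag_samples({"mutant1": (mutant, wild_type)}, min_introns=0)
```

With the default `min_introns=500`, a matched group is flagged only when the increase is genome-wide, which is why the hits end up being splicing-related mutants rather than arbitrary genes.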
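The passage that follows immediately (the part you did not finish pasting in your last message) goes on to say:
They used 6,521 public RNA-seq libraries for a very large screen,
found 10 splicing-factor mutant samples (e.g. PRMT5, SKIP) in which many introns are retained more readily,
then went on to show that these mutations mainly make pts introns harder to splice out,
indicating that splicing of pts introns is a layer that can be specifically regulated by splicing factors and environmental signals.
If you like, next I can help map each panel of Fig. 4a–e to this passage, to see exactly how ">500 introns with increased IR" is computed and how pts introns show up in the figures.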