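This is my current plotting script. I used to judge by eye from the heatmap whether there was an effect, but for a research paper that is not enough to draw a conclusion, so I need to...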

Question

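This is my current plotting script. I used to judge by eye from the heatmap whether there was an effect, but that is not enough to support a conclusion in a research paper, so the script needs to be improved. The changes must build on my existing plotting script, and my data format is fixed. The current script cannot determine statistically whether two samples differ, so I need a statistical criterion added, as simple and direct as possible, preferably a method commonly used in the life sciences. In the DataFrame, ['db1_ratio'] and ['db2_ratio'] are the wild type and the mutant, one each. Some mutants look evenly distributed along the diagonal by eye, and I cannot tell whether they truly have no effect or are increased or decreased relative to the wild type. I need you to add some convincing, objective criteria for the evaluation, to avoid subjective judgment: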
def plot_cleavage_ratio(data1,lable1):
    x = data1['db2_ratio']
    y = data1['db1_ratio']
    xy = np.vstack([x,y])
    z = gaussian_kde(xy)(xy)
    idx = z.argsort()
    x, y, z = x.iloc[idx], y.iloc[idx], z[idx]
    plt.xlabel(lable1)
    ax = plt.gca()
    plt.plot([0, 1], [0, 1], ls='--', color='#555555')
    #ax.plot(0.164179,0.280000, marker='*',color='w',markersize=5)
    xticks = np.array([0.00,0.25,0.5,0.75,1.00])
    plt.xticks(xticks)
    scatter = ax.scatter(x,y,marker='o',c=z,s=5,cmap="RdYlBu_r",label='LST')
    divider = make_axes_locatable(ax)
    cax = divider.append_axes("right", size="5%", pad=0.1)
    cbar = fig.colorbar(scatter, cax=cax, label='Density')
    sns.despine(top=True, right=True)
    return ax

I used to call my original plotting function with the script below. How should I change it now to call your updated function:
plt.figure(figsize=(12, 12))
plt.subplots_adjust(left=None, bottom=None, right=None, top=None, wspace=0.3,hspace=0.3)
plt.subplot(4,4,1)

plot_cleavage_ratio(fy_rep2_cleavage,'')
#plt.yticks(color='w')
#plt.xticks(color='w')

plt.subplot(4,4,2)

plot_cleavage_ratio(cpsf30_rep2_cleavage,'')
#plt.yticks(color='w')
#plt.xticks(color='w')

plt.subplot(4,4,3)

plot_cleavage_ratio(fip1_rep2_cleavage,'')
#plt.yticks(color='w')
#plt.xticks(color='w')

plt.subplot(4,4,4)

plot_cleavage_ratio(esp5_rep2_cleavage,'')
#plt.xticks(color='w')
#plt.yticks(color='w')

plt.subplot(4,4,5)

plot_cleavage_ratio(fpa_rep1_cleavage,'')
#plt.yticks(color='w')
#plt.xticks(color='w')
plt.subplot(4,4,6)

plot_cleavage_ratio(fca_rep2_cleavage,'')
#plt.yticks(color='w')
#plt.xticks(color='w')
plt.subplot(4,4,7)

plot_cleavage_ratio(paps1_rep2_cleavage,'')
#plt.yticks(color='w')
#plt.xticks(color='w')

plt.subplot(4,4,8)

plot_cleavage_ratio(paps24_rep2_cleavage,'')
#plt.yticks(color='w')
#plt.xticks(color='w')
plt.subplot(4,4,9)

plot_cleavage_ratio(cstf64_rep2_cleavage,'')

#plt.xticks(color='w')
#plt.yticks(color='w')
plt.subplot(4,4,10)

plot_cleavage_ratio(cstf77_rep2_cleavage,'')
#plt.xticks(color='w')
#plt.yticks(color='w')

plt.subplot(4,4,11)
plot_cleavage_ratio(pcfs4_rep2_cleavage,'')
#plt.yticks(color='w')
#plt.xticks(color='w')
plt.subplot(4,4,12)

plot_cleavage_ratio(bdr123_rep2_cleavage,'')
#plt.yticks(color='w')
#plt.xticks(color='w')
plt.subplot(4,4,13)

plot_cleavage_ratio(ssu72_rep2_cleavage,'')
#plt.yticks(color='w')
#plt.xticks(color='w')
plt.subplot(4,4,14)

plot_cleavage_ratio(xrn3_rep2_GB_cleavage,'')
#plt.yticks(color='w')
#plt.xticks(color='w')

sns.despine(top=True, right=True)

Answer 1

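To improve your original plotting script with a statistical analysis that tests whether the wild type (db1_ratio) and the mutant (db2_ratio) differ significantly, we can use a common statistical method such as the t-test. The version used here, scipy's ttest_ind, compares the means of two independent samples, in this case the means of db1_ratio and db2_ratio. If each row holds the two ratios for the same site, the observations are paired, and a paired test such as ttest_rel or the Wilcoxon signed-rank test is usually a closer match for this data layout.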

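We will modify your plotting function to add the t-test step and display its result in the figure. The steps are as follows:

Modified code

Add the t-test calculation: the plotting function computes the t-test for db1_ratio and db2_ratio.
Display the statistical result: the t-test p-value is added to the figure as a reference for whether a significant difference exists.
Simplify and tidy the code: the plotting and the statistical analysis are combined in one function to keep the code concise.

Here is the modified code: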

python
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import ttest_ind
from scipy.stats import gaussian_kde
from mpl_toolkits.axes_grid1 import make_axes_locatable

# Modified plotting function
def plot_cleavage_ratio(data1, label1):
    # Extract the data
    x = data1['db2_ratio']
    y = data1['db1_ratio']

    # Run a t-test to check whether the two groups differ significantly
    t_stat, p_value = ttest_ind(x, y)

    # Estimate the point density with a Gaussian KDE and sort so the densest points are drawn last
    xy = np.vstack([x, y])
    z = gaussian_kde(xy)(xy)
    idx = z.argsort()
    x, y, z = x.iloc[idx], y.iloc[idx], z[idx]

    # Set the x label and draw the diagonal reference line
    plt.xlabel(label1)
    ax = plt.gca()
    plt.plot([0, 1], [0, 1], ls='--', color='#555555')

    # Set the x-axis ticks
    xticks = np.array([0.00, 0.25, 0.5, 0.75, 1.00])
    plt.xticks(xticks)

    # Draw the scatter plot, colored by density
    scatter = ax.scatter(x, y, marker='o', c=z, s=5, cmap="RdYlBu_r", label='LST')

    # Draw the colorbar
    divider = make_axes_locatable(ax)
    cax = divider.append_axes("right", size="5%", pad=0.1)
    cbar = plt.colorbar(scatter, cax=cax, label='Density')

    # Show the t-test result; the p-value serves as a reference for significance
    ax.text(0.5, 0.95, f't-test p-value: {p_value:.4f}', ha='center', va='top', 
            transform=ax.transAxes, fontsize=12, color='black', weight='bold')

    # Remove the top and right spines
    sns.despine(top=True, right=True)

    # Return the axes object so the layout can be adjusted later
    return ax


# Calling the updated plotting function
plt.figure(figsize=(12, 12))
plt.subplots_adjust(left=None, bottom=None, right=None, top=None, wspace=0.3, hspace=0.3)

# Call the new function on each subplot
plt.subplot(4, 4, 1)
plot_cleavage_ratio(fy_rep2_cleavage, '')

plt.subplot(4, 4, 2)
plot_cleavage_ratio(cpsf30_rep2_cleavage, '')

plt.subplot(4, 4, 3)
plot_cleavage_ratio(fip1_rep2_cleavage, '')

plt.subplot(4, 4, 4)
plot_cleavage_ratio(esp5_rep2_cleavage, '')

plt.subplot(4, 4, 5)
plot_cleavage_ratio(fpa_rep1_cleavage, '')

plt.subplot(4, 4, 6)
plot_cleavage_ratio(fca_rep2_cleavage, '')

plt.subplot(4, 4, 7)
plot_cleavage_ratio(paps1_rep2_cleavage, '')

plt.subplot(4, 4, 8)
plot_cleavage_ratio(paps24_rep2_cleavage, '')

plt.subplot(4, 4, 9)
plot_cleavage_ratio(cstf64_rep2_cleavage, '')

plt.subplot(4, 4, 10)
plot_cleavage_ratio(cstf77_rep2_cleavage, '')

plt.subplot(4, 4, 11)
plot_cleavage_ratio(pcfs4_rep2_cleavage, '')

plt.subplot(4, 4, 12)
plot_cleavage_ratio(bdr123_rep2_cleavage, '')

plt.subplot(4, 4, 13)
plot_cleavage_ratio(ssu72_rep2_cleavage, '')

plt.subplot(4, 4, 14)
plot_cleavage_ratio(xrn3_rep2_GB_cleavage, '')

# Remove the top and right spines for the whole figure
sns.despine(top=True, right=True)

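Main changes:

t-test calculation: ttest_ind(x, y) was added inside the plotting function to test whether the two groups (db1_ratio and db2_ratio) differ significantly.
t-test result shown in the figure: ax.text adds p_value to each panel so that a significant difference is easy to spot. A p-value below 0.05 is conventionally taken to indicate a significant difference.
Plot output: your original scatter plot and density coloring are kept; only the statistical result is added.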
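One practical detail about the annotation: with many data points the p-value can be far below 0.0001, and the `:.4f` format then prints it as 0.0000. The sketch below shows one way to switch to scientific notation for small values. The helper name format_p_value and its threshold are illustrative assumptions, not part of the original script.

```python
def format_p_value(p, threshold=1e-4):
    """Return a short label for a p-value (hypothetical helper).

    Values below `threshold` are shown in scientific notation so that
    very small p-values are not displayed as 0.0000.
    """
    if p < threshold:
        return f"p = {p:.2e}"
    return f"p = {p:.4f}"


print(format_p_value(0.0312))   # p = 0.0312
print(format_p_value(3.2e-15))  # p = 3.20e-15
```

Inside the plotting function, the string passed to ax.text could then be built with format_p_value(p_value) instead of the fixed `:.4f` format.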

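How to call it:

You can now call the new plot_cleavage_ratio function directly for each subplot, exactly as before. Each subplot shows the t-test p-value, which gives a basis for judging whether a significant difference exists.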
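Since the fourteen calls differ only in the DataFrame they pass, the same layout could also be produced with a loop. The sketch below is one possible way to do that, under assumptions: plot_panels, demo_plot and demo_data are hypothetical names, and demo_plot is only a stand-in with the same (DataFrame, label) signature as plot_cleavage_ratio so that the example runs on its own.

```python
import math

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_panels(datasets, plot_func, ncols=4, figsize=(12, 12)):
    """Call plot_func(df, label) once per entry of `datasets`, one subplot each."""
    nrows = math.ceil(len(datasets) / ncols)
    fig = plt.figure(figsize=figsize)
    fig.subplots_adjust(wspace=0.3, hspace=0.3)
    axes = []
    for i, (label, df) in enumerate(datasets.items(), start=1):
        plt.subplot(nrows, ncols, i)  # makes this panel the current axes
        axes.append(plot_func(df, label))
    return fig, axes


def demo_plot(df, label):
    """Stand-in for plot_cleavage_ratio (assumption: same call signature)."""
    ax = plt.gca()
    ax.scatter(df["db2_ratio"], df["db1_ratio"], s=5)
    ax.set_xlabel(label)
    return ax


# Synthetic data, only to make the sketch runnable
rng = np.random.default_rng(0)
demo_data = {
    name: pd.DataFrame({"db1_ratio": rng.random(50), "db2_ratio": rng.random(50)})
    for name in ["mutant_a", "mutant_b", "mutant_c"]
}
fig, axes = plot_panels(demo_data, demo_plot)
```

With the real data, the dictionary would map a name to each DataFrame, for example 'fy' to fy_rep2_cleavage and 'cpsf30' to cpsf30_rep2_cleavage, and plot_cleavage_ratio would be passed as plot_func. Note that the dictionary keys become the x-axis labels here, whereas the original calls pass an empty string.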

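Interpreting the result:

p-value: a p-value below 0.05 is conventionally taken to indicate a significant difference between the two groups. A p-value above 0.05 means that no significant difference was detected; it does not by itself show that the mutant has no effect.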
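The p-value alone does not say whether the mutant is shifted up or down relative to the wild type. As a complement, the sketch below summarizes the per-row difference with a paired, non-parametric test (scipy's Wilcoxon signed-rank test) and reports the direction of the shift. It assumes that each row holds both ratios for the same site, that db1_ratio is the wild type and db2_ratio the mutant as read above, and that not all differences are zero. The function name paired_shift_summary and the synthetic data are illustrative, and the paired test is an alternative to the ttest_ind call used in the modified code, not a description of it.

```python
import numpy as np
import pandas as pd
from scipy.stats import wilcoxon


def paired_shift_summary(df, wt_col="db1_ratio", mut_col="db2_ratio"):
    """Summarize the per-row (mutant - wild type) shift with a paired test.

    Assumption: each row is one site measured in both genotypes.
    """
    pairs = df[[wt_col, mut_col]].dropna()
    diff = pairs[mut_col] - pairs[wt_col]
    # Two-sided Wilcoxon signed-rank test on the paired differences
    stat, p_value = wilcoxon(diff)
    return {
        "n": len(diff),
        "median_diff": float(diff.median()),              # > 0: mutant higher
        "frac_mutant_higher": float((diff > 0).mean()),   # share of rows above 0
        "p_value": float(p_value),
    }


# Synthetic example: the mutant ratio is shifted upward by about 0.05
rng = np.random.default_rng(1)
wt = rng.uniform(0.2, 0.8, 500)
mut = np.clip(wt + rng.normal(0.05, 0.05, 500), 0, 1)
demo = pd.DataFrame({"db1_ratio": wt, "db2_ratio": mut})
print(paired_shift_summary(demo))
```

A median difference near zero together with a large p-value is consistent with points scattered evenly around the diagonal, while a clearly positive or negative median difference indicates an increase or decrease in the mutant. Reporting the median difference next to the p-value helps when sample sizes are large, because very small shifts can then reach significance.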
