现在我有4个R数据框: S_GB:单性状、仅加性效应的预测准确度。 M_GB:多性状、仅加性效应的预...
Создано: 24 января 2026 г.
Отвечено с использованием GPT-5.2 Thinking от Chat01
Создано: 24 января 2026 г.
Отвечено с использованием GPT-5.2 Thinking от Chat01
现在我有4个R数据框:
S_GB:单性状、仅加性效应的预测准确度。
M_GB:多性状、仅加性效应的预测准确度。
S_GBD:单性状、加性+显性效应的预测准确度。
M_GBD:多性状、加性+显性效应的预测准确度。
其中,S_GB,S_GBD三列:k_Fold Acc Trait
M_GB ,M_GBD 一列:ACC,行名为性状:[1] "ph" "ear" "mu" "ph1" "ear1" "mu1" "ph2" "ear2" "mu2"
[10] "ph3" "ear3" "mu3" "ph4" "ear4" "mu4" "ph5" "ear5" "mu5"
[19] "ph6" "ear6" "mu6" "ph7" "ear7" "mu7" "ph8" "ear8" "mu8"
[28] "ph9" "ear9" "mu9" "ph10" "ear10" "mu10" "ph11" "ear11" "mu11"
[37] "ph12" "ear12" "mu12" "ph13" "ear13" "mu13" "ph14" "ear14" "mu14"
[46] "ph15" "ear15" "mu15" "ph16" "ear16" "mu16" "ph17" "ear17" "mu17"
[55] "ph18" "ear18" "mu18" "ph19" "ear19" "mu19" "ph20" "ear20" "mu20"
[64] "ph21" "ear21" "mu21" "ph22" "ear22" "mu22" "ph23" "ear23" "mu23"
[73] "ph24" "ear24" "mu24"
请你据此帮我通过箱线图或其他你认为更好的展示方式为我比较不同条件下的预测结果,图片质量要求美观,符合顶刊发表级别水准。使用R代码实现。
下面给你一套**“可直接跑”的 R 代码,把 4 个数据框统一整理成长表,然后输出 3 张顶刊友好**的图(你可任选其一或同时发表):
假设:
S_GB/S_GBD列名:k_Fold Acc TraitM_GB/M_GBD:只有一列ACC,行名是 Trait- Trait 命名如
ph,ph1...ph24等
r## ========================= ## 0) Packages ## ========================= suppressPackageStartupMessages({ library(dplyr) library(tidyr) library(stringr) library(tibble) library(ggplot2) }) ## 如果你希望 PDF 输出更“出版级”,建议装 showtext(可选) ## install.packages("showtext") ## library(showtext) ## showtext_auto() ## 色盲友好 Okabe-Ito 调色(顶刊常用) okabe_ito <- c( "Single (A)" = "#0072B2", "Single (A+D)" = "#D55E00", "Multi (A)" = "#009E73", "Multi (A+D)" = "#CC79A7" ) ## ========================= ## 1) Helper functions ## ========================= .get_acc_col <- function(df) { nms <- names(df) idx <- which(tolower(nms) %in% c("acc", "accuracy")) if (length(idx) == 0) stop("找不到 Acc/ACC/Accuracy 列,请检查列名。") nms[idx[1]] } prep_single <- function(df, approach = "Single", model = "A") { stopifnot(all(c("k_Fold", "Acc", "Trait") %in% names(df))) df %>% transmute( Trait = as.character(Trait), Fold = as.factor(k_Fold), Acc = as.numeric(Acc), Approach = approach, Model = model ) } prep_multi <- function(df, approach = "Multi", model = "A") { acc_col <- .get_acc_col(df) df %>% rownames_to_column("Trait") %>% transmute( Trait = as.character(Trait), Fold = NA, Acc = as.numeric(.data[[acc_col]]), Approach = approach, Model = model ) } parse_trait <- function(trait_chr) { base <- str_extract(trait_chr, "^[A-Za-z]+") tp <- str_extract(trait_chr, "\\d+$") tp_i <- suppressWarnings(as.integer(tp)) tp_i[is.na(tp_i)] <- 0L # ph/ear/mu 视为 timepoint=0 tibble(base_trait = base, timepoint = tp_i) } ## ========================= ## 2) Combine four dataframes into long format ## ========================= acc_long <- bind_rows( prep_single(S_GB, approach = "Single", model = "A"), prep_single(S_GBD, approach = "Single", model = "A+D"), prep_multi(M_GB, approach = "Multi", model = "A"), prep_multi(M_GBD, approach = "Multi", model = "A+D") ) %>% bind_cols(parse_trait(.$Trait)) %>% mutate( Condition = case_when( Approach == "Single" & Model == "A" ~ "Single (A)", Approach == "Single" & Model == "A+D" ~ "Single (A+D)", Approach == "Multi" & Model == "A" ~ "Multi (A)", Approach == "Multi" & Model == "A+D" ~ "Multi (A+D)" ), Condition = factor(Condition, levels = names(okabe_ito)), base_trait = factor(base_trait, levels = c("ph","ear","mu")) ) %>% filter(!is.na(Acc)) ## ========================= ## 3) Trait-level summary (关键:避免 k-fold 让单性状“权重更大”) ## ========================= trait_sum <- acc_long %>% group_by(Condition, Approach, Model, Trait, base_trait, timepoint) %>% summarise( mean_acc = mean(Acc, na.rm = TRUE), sd_acc = sd(Acc, na.rm = TRUE), n = sum(!is.na(Acc)), .groups = "drop" ) %>% mutate( se = ifelse(is.na(sd_acc) | n <= 1, 0, sd_acc / sqrt(n)), ci = ifelse(n <= 1, 0, qt(0.975, df = n - 1) * se) ) ## ========================= ## 4) Publication theme ## ========================= theme_pub <- theme_classic(base_size = 13) + theme( legend.title = element_blank(), legend.position = "top", axis.title.x = element_text(margin = margin(t = 8)), axis.title.y = element_text(margin = margin(r = 8)), strip.background = element_blank(), strip.text = element_text(face = "bold"), plot.title = element_text(face = "bold", hjust = 0), plot.margin = margin(8, 10, 8, 10) ) ## ========================= ## FIG 1: 推荐——“雨云/小提琴 + 箱线 + 抖动点” ## 每个性状贡献一个值(mean_acc),公平对比四条件 ## ========================= p1 <- ggplot(trait_sum, aes(x = Condition, y = mean_acc, fill = Condition)) + geom_violin( width = 0.95, trim = TRUE, alpha = 0.45, color = "grey25", linewidth = 0.35 ) + geom_boxplot( width = 0.18, outlier.shape = NA, alpha = 0.75, color = "grey15", linewidth = 0.35 ) + geom_jitter( aes(color = Condition), width = 0.10, height = 0, size = 1.0, alpha = 0.35 ) + facet_wrap(~ base_trait, nrow = 1, scales = "free_y") + scale_fill_manual(values = okabe_ito) + scale_color_manual(values = okabe_ito) + labs( title = "Prediction accuracy comparison across conditions", x = NULL, y = "Accuracy (trait-level mean)" ) + theme_pub + guides(color = "none") ## ========================= ## FIG 2: 趋势图(timepoint 0..24)+ 单性状 95%CI ## 多性状如果只有一个值,ci=0,误差条不会显著 ## ========================= p2 <- ggplot(trait_sum, aes(x = timepoint, y = mean_acc, color = Condition)) + geom_line(linewidth = 0.6) + geom_point(size = 1.6) + geom_errorbar(aes(ymin = mean_acc - ci, ymax = mean_acc + ci), width = 0.22, linewidth = 0.35, alpha = 0.8) + facet_wrap(~ base_trait, ncol = 1, scales = "free_y") + scale_color_manual(values = okabe_ito) + scale_x_continuous(breaks = sort(unique(trait_sum$timepoint))) + labs( title = "Accuracy trends across timepoints (with 95% CI for k-fold single-trait)", x = "Timepoint index (0 = baseline trait name without suffix)", y = "Accuracy (mean ± 95% CI)" ) + theme_pub ## ========================= ## FIG 3: 显性效应增益(A+D − A)差值图 ## 强烈建议用于结果表述:dominance improves accuracy by ... ## ========================= delta <- trait_sum %>% select(Approach, base_trait, timepoint, Trait, Model, mean_acc) %>% pivot_wider(names_from = Model, values_from = mean_acc) %>% mutate(delta_AD = `A+D` - `A`) %>% filter(!is.na(delta_AD)) %>% mutate(Approach = factor(Approach, levels = c("Single","Multi"))) p3 <- ggplot(delta, aes(x = Approach, y = delta_AD, fill = Approach)) + geom_hline(yintercept = 0, linetype = "dashed", linewidth = 0.4, color = "grey35") + geom_violin(width = 0.9, trim = TRUE, alpha = 0.5, color = "grey25", linewidth = 0.35) + geom_boxplot(width = 0.18, outlier.shape = NA, alpha = 0.75, color = "grey15", linewidth = 0.35) + geom_jitter(width = 0.08, height = 0, size = 1.0, alpha = 0.35) + facet_wrap(~ base_trait, nrow = 1, scales = "free_y") + scale_fill_manual(values = c("Single" = "#0072B2", "Multi" = "#009E73")) + labs( title = "Dominance gain: (A+D) − A", x = NULL, y = "Δ Accuracy" ) + theme_pub + guides(fill = "none") ## ========================= ## 5) Print plots ## ========================= p1 p2 p3 ## ========================= ## 6) Save (出版级:优先 PDF;位图用 600–900 dpi) ## ========================= ggsave("Fig1_violin_box_traitmean.pdf", p1, width = 9.2, height = 3.8, device = cairo_pdf) ggsave("Fig2_trend_timepoints.pdf", p2, width = 8.2, height = 7.5, device = cairo_pdf) ggsave("Fig3_dominance_gain.pdf", p3, width = 9.2, height = 3.8, device = cairo_pdf) ggsave("Fig1_violin_box_traitmean.png", p1, width = 9.2, height = 3.8, dpi = 800) ggsave("Fig2_trend_timepoints.png", p2, width = 8.2, height = 7.5, dpi = 800) ggsave("Fig3_dominance_gain.png", p3, width = 9.2, height = 3.8, dpi = 800)
如果你希望我把 Fig2 做成更“Nature/Science 风格”的版本(比如每个 timepoint 用 paired slopegraph 或者把三类性状拼成一个 panel 并统一 y 轴、加 A/B/C 面板编号),我也可以在你现有数据结构上再给一版更“版面级”的排版代码(patchwork/cowplot)。