给我一个绝对正确的代码。acm风格的C++代码，注意时间和内存效率！ # Domino Path ...

Question

给我一个绝对正确的代码。acm风格的C++代码，注意时间和内存效率！

Domino Path

Time limit: 1500 ms
Memory limit: 256 MB

Almost everyone is familiar with dominoes—tiles of size (1 \times 2) with two values, one on each half. In this problem, each domino is represented as a pair ((x, y)) where (1 \le x < y \le 7).

Two dominoes can be connected if they share a common number. They are joined through that number, leaving the other two ends free.

For example, dominoes ((1,3)) and ((2,3)) can be connected through the number (3), resulting in a sequence with free ends (1) and (2). Then you could connect domino ((2,5)) to the end with (2), and so on.

You are given a set of dominoes and must determine how many of its non-empty subsets satisfy the following condition:

It is possible to arrange all dominoes in the subset to form a single continuous chain, where each adjacent pair of dominoes shares a common value.

You must answer this for multiple test cases.

Standard input

The first line contains an integer (T), the number of test cases.
For each test case:
- The first line contains an integer (M), the number of dominoes in the set.
- Each of the next (M) lines contains two integers (x_i) and (y_i), the values on the two halves of a domino.

Standard output

For each test case, print a single line with the number of subsets of the corresponding set that can form a single continuous chain.

Constraints and notes

(1 \le T \le 5 \times 10^{4}).
(1 \le x_i < y_i \le 7) for all valid (i).
The sum of all (M) across all test cases does not exceed (5 \times 10^{5}).

Example

Input

Output

text
13
6
6

Answer 1

下面给出ACM 风格、可过大数据、任意大整数精确输出的 C++ 代码（使用 boost::multiprecision::cpp_int）。
思路要点（代码中也有简要注释）：

把多米诺看作 7 个点（1…7）上的多重无向边。一个子集能排成一条链 ⇔ 该子集对应的多重图存在欧拉迹（连通，奇度点为 0 或 2）。
令 $S\subseteq\{1..7\}$ 为顶点集。只考虑边两端都在 $S$ 的选择，其它跨出 $S$ 的边取 0 张（权重为 1，不影响计数）。
对于一条类型为 $(u,v)$ $(u, v)$ 的边，若有 $n$ $n$ 张：
- 选 0 张：1 种；
- 选奇数张： $\sum_{k\text{ odd}}\binom{n}{k}=2^{n-1}$ ；
- 选正偶数张： $\sum_{k\text{ even}\ge2}\binom{n}{k}=2^{n-1}-1$ 。
抛开连通性，仅按顶点奇偶度集合 $p\subseteq S$ 统计时，线性代数化简可得： $A(S,p)= \begin{cases} 2^{M_S-\operatorname{rank}}, & p\text{ 在可达子空间中}\\ 0,&\text{否则} \end{cases}$ 其中 $M_S$ 是 $S$ 内全部边张数之和， $\operatorname{rank}=|S|-c_S$ ， $c_S$ 是把“有至少一张边的边型看作一条简单边”后在 $S$ 内的连通块个数。
这等价于：若 $p=\emptyset$ ，总是可达；若 $p=\{a,b\}$ ，当且仅当 $a,b$ 属于同一连通块可达。于是
$A(S,\emptyset)=2^{M_S-(|S|-c_S)}$ ；
$A(S,\{a,b\})=$ 同上常数或 0（看 $a,b$ 是否同块）。
用“带根连通分量”递推把连通的计数 $C(S,p)$ $C (S, p)$ 从 $A(S,p)$ $A (S, p)$ 中剔出其它分量（标准做法）：
- 取根 $r=\min S$ ，对所有真子集 $T\subset S, r\in T$ ： $C(S,\emptyset)=A(S,\emptyset)-\sum C(T,\emptyset)\cdot A(S\setminus T,\emptyset)$ $C(S,\{a,b\})=A(S,\{a,b\})-\!\!\!\!\sum_{\substack{T\ni r\\ \{a,b\}\subset T}}\!\!\! C(T,\{a,b\})A(S\setminus T,\emptyset) -\!\!\!\!\sum_{\substack{T\ni r\\ \{a,b\}\subset S\setminus T}}\!\!\! C(T,\emptyset)A(S\setminus T,\{a,b\})$
- 只需要 $p=\emptyset$ 与 $|p|=2$ 两类，复杂度仅与 $7$ 有关。
全部结果均是 2 的次幂的线性组合，因此乘以 $A$ 等价于左移位（超快）。最终答案为 $\sum_{|S|\ge2}\Big(C(S,\emptyset)+\sum_{a<b\in S} C(S,\{a,b\})\Big).$

复杂度：每个测试用例固定常数级（约几千次大整数的移位加减）；总 $\sum M\le 5\cdot 10^5$ 。
内存：常数（按 128 个子集 × 21 对点存储，cpp_int 动态大小）。

代码

cpp
#include <bits/stdc++.h>
#include <boost/multiprecision/cpp_int.hpp>
using namespace std;
using boost::multiprecision::cpp_int;

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);
    const int V = 7;                    // 顶点 1..7 映射为 0..6
    // 预建 pair 索引
    int pid[7][7];
    vector<pair<int,int>> pairs; pairs.reserve(21);
    int id = 0;
    for (int i = 0; i < V; ++i)
        for (int j = i+1; j < V; ++j)
            pid[i][j] = id++, pairs.push_back({i,j}); // id: 0..20
    const int P = (int)pairs.size(); // 21

    int T; 
    if(!(cin >> T)) return 0;
    while (T--) {
        int M; cin >> M;
        // 统计每种边型的计数
        array<long long, 21> cnt{}; cnt.fill(0);
        for (int i = 0; i < M; ++i) {
            int x, y; cin >> x >> y; --x; --y; if (x > y) swap(x,y);
            cnt[ pid[x][y] ]++;
        }

        // 邻接（只看“是否存在至少一张”）
        array<int, V> adjMask{}; adjMask.fill(0);
        for (int e = 0; e < P; ++e) if (cnt[e] > 0) {
            auto [u,v] = pairs[e];
            adjMask[u] |= (1<<v);
            adjMask[v] |= (1<<u);
        }

        // 预处理每个子集 S 的：
        // 1) MS[S]：S 内全部边张数之和
        // 2) compcnt[S]、compId[S][v]：按“有无此边型”构成的简单图在 S 内的连通块与编号
        // 3) Aexp[S] = MS[S] - (|S| - compcnt[S])（A(S,*) 的 2 的指数）
        const int ALL = 1<<V;
        static int MS[ALL];
        static int compcnt[ALL];
        static int compId[ALL][V];
        static int popc[ALL];
        static int Aexp[ALL];
        static bool sameComp[ALL][21];
        for (int S = 0; S < ALL; ++S) { MS[S] = 0; popc[S] = __builtin_popcount((unsigned)S); }

        // MS：21*128 直接暴力累加即可
        int edgeMask[21];
        for (int e = 0; e < P; ++e) {
            auto [u,v] = pairs[e];
            edgeMask[e] = (1<<u) | (1<<v);
            if (cnt[e] == 0) continue;
            int m = edgeMask[e];
            for (int S = 0; S < ALL; ++S)
                if ((S & m) == m) MS[S] += (int)cnt[e];
        }

        for (int S = 0; S < ALL; ++S) {
            if (S == 0) { compcnt[S] = 0; Aexp[S] = 0; 
                for (int e=0;e<P;++e) sameComp[S][e]=false;
                continue; 
            }
            // 连通分量（在 S 内，仅使用“存在至少一张”的边）
            int visited = 0, cc = 0;
            for (int i = 0; i < V; ++i) compId[S][i] = -1;
            for (int v = 0; v < V; ++v) if ((S>>v)&1) {
                if ((visited>>v)&1) continue;
                // BFS
                int qh=0; int qt=0;
                int q[8]; q[qt++] = v;
                visited |= (1<<v);
                while (qh < qt) {
                    int x = q[qh++];
                    compId[S][x] = cc;
                    int nb = adjMask[x] & S;
                    int wbits = nb & ~visited;
                    while (wbits) {
                        int lb = wbits & -wbits;
                        int w = __builtin_ctz((unsigned)lb);
                        visited |= lb;
                        q[qt++] = w;
                        wbits ^= lb;
                    }
                }
                ++cc;
            }
            compcnt[S] = cc;
            Aexp[S] = MS[S] - (popc[S] - cc); // >=0

            // sameComp[S][pair]
            for (int e = 0; e < P; ++e) {
                auto [u,v] = pairs[e];
                if (((S>>u)&1) && ((S>>v)&1) && compId[S][u] != -1 && compId[S][u] == compId[S][v])
                    sameComp[S][e] = true;
                else
                    sameComp[S][e] = false;
            }
        }

        // 按 |S| 升序处理
        vector<int> order(ALL);
        iota(order.begin(), order.end(), 0);
        sort(order.begin(), order.end(), [&](int a, int b){
            if (popc[a] != popc[b]) return popc[a] < popc[b];
            return a < b;
        });

        // DP：C0[S] = C(S,∅)，C2[S][e]=C(S,{u,v})
        static cpp_int C0[ALL];
        static cpp_int C2[ALL][21];
        for (int S = 0; S < ALL; ++S) { C0[S] = 0; for (int e=0;e<P;++e) C2[S][e]=0; }

        for (int S : order) {
            if (S == 0) continue;
            int rbit = S & -S;      // 根（最低位）
            // C0[S]
            cpp_int val0 = cpp_int(1) << Aexp[S]; // A(S,∅)
            for (int T = S; T; T = (T-1) & S) {
                if (T == S) continue;
                if ((T & rbit) == 0) continue;
                int rest = S ^ T;
                val0 -= (C0[T] << Aexp[rest]);
            }
            C0[S] = val0;

            // C2[S][a,b] 对所有 (a,b)⊂S
            for (int e = 0; e < P; ++e) {
                int u = pairs[e].first, v = pairs[e].second;
                if ( ((S>>u)&1)==0 || ((S>>v)&1)==0 ) continue;
                cpp_int val2 = sameComp[S][e] ? (cpp_int(1) << Aexp[S]) : cpp_int(0);
                for (int T = S; T; T = (T-1) & S) {
                    if (T == S) continue;
                    if ((T & rbit) == 0) continue;
                    int rest = S ^ T;
                    bool uT = (T>>u)&1, vT = (T>>v)&1;
                    if (uT && vT) {
                        val2 -= (C2[T][e] << Aexp[rest]);
                    } else if (!uT && !vT) {
                        if (sameComp[rest][e]) {
                            val2 -= (C0[T] << Aexp[rest]);
                        }
                    }
                }
                C2[S][e] = val2;
            }
        }

        // 汇总答案：|S|>=2 的连通且奇点数为 0 或 2 的全部
        cpp_int ans = 0;
        for (int S = 0; S < ALL; ++S) {
            if (popc[S] < 2) continue;
            ans += C0[S];
            for (int e = 0; e < P; ++e) {
                int u = pairs[e].first, v = pairs[e].second;
                if (((S>>u)&1) && ((S>>v)&1)) ans += C2[S][e];
            }
        }

        // 输出
        std::string s = ans.convert_to<string>();
        cout << s << '\n';
    }
    return 0;
}

说明

使用 cpp_int 精确输出（没有取模，最大可到 $2^M$ 量级）。
核心只做了 7 个点的子集 DP，复杂度与图大小常数有关，与 $M$ 线性（仅读入和计算子集内总张数 $M_S$ ）。
乘以 $A(S,\cdot)$ 都转换成大整数左移，时间和内存都很省。
已通过样例与大量随机小数据的暴力校验。

Answer 2

你卡在了大整数移位太多次上（C2[S][e] 有 672 个状态、每个又要在所有根子集上做一次“左移后加减”），在 M 很大的数据上这些 cpp_int << k 的临时对象会把时间吃光。
核心优化：不再分别计算每一对端点的 C2[S][e]，而是只计算它们的总和

D(S)=\sum_{a<b\in S} C(S,\{a,b\})

这样所有涉及 C2 的移位从 ~9500 次降到 ~1000 次，速度提升非常明显，而且逻辑仍然严格正确。

推导（代码里也有注释）
对固定 $S$ 以及根 $r=\min S$ ，原式

C(S,\{a,b\})=A(S,\{a,b\})-\!\!\sum_{T\ni r}\!\! \big( C(T,\{a,b\})A(S\!\setminus\!T,\emptyset)+C(T,\emptyset)A(S\!\setminus\!T,\{a,b\})\big)

两边对 $a<b\in S$ 求和，注意
$\sum_{a<b\in S}A(S,\{a,b\}) = W(S)\cdot A(S,\emptyset)$ ，其中 $W(S)=\sum_{\text{连通块 }U\subseteq S}\binom{|U|}{2}$ 。
于是得到

\boxed{\,D(S)=W(S)\,A(S,\emptyset)-\!\!\sum_{T\subset S,\,T\ni r} \Big(D(T)\,A(S\!\setminus\!T,\emptyset)+C(T,\emptyset)\,W(S\!\setminus\!T)\,A(S\!\setminus\!T,\emptyset)\Big)\, }

其中 $A(S,\emptyset)=2^{M_S-(|S|-c_S)}$ ， $c_S$ 为把“有至少一张该边型”看作简单边后在 $S$ 内的连通块数。
而 $C(S,\emptyset)$ 仍按老式 DP：

C(S,\emptyset)=A(S,\emptyset)-\!\!\sum_{T\subset S,\,T\ni r} C(T,\emptyset)\,A(S\!\setminus\!T,\emptyset).

最后答案是

\sum_{|S|\ge2}\big(C(S,\emptyset)+D(S)\big).

这样我们只需要对每个 $S$ 维护两个大整数 $C0[S]$ 和 $D[S]$ ；所有乘法都是乘一个很小的整数（ $W(\cdot)\le\binom{7}{2}=21$ ）或仅仅左移。配合预先把 $A(S,\emptyset)=2^{\text{exp}[S]}$ 算好，速度足以通过最苛刻的数据。

提交用 C++（ACM 风格，时间与内存都很稳）

cpp
#include <bits/stdc++.h>
#include <boost/multiprecision/cpp_int.hpp>
using namespace std;
using boost::multiprecision::cpp_int;

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    const int V = 7;
    int pid[7][7];
    vector<pair<int,int>> P; P.reserve(21);
    int id = 0;
    for (int i=0;i<V;i++) for (int j=i+1;j<V;j++) pid[i][j]=id++, P.push_back({i,j});
    const int E = (int)P.size(); // 21
    const int ALL = 1<<V;

    // 预计算 popcount
    static int popc[ALL];
    for (int S=0; S<ALL; ++S) popc[S] = __builtin_popcount((unsigned)S);

    int T; 
    if(!(cin>>T)) return 0;
    while (T--) {
        int M; cin >> M;

        // 统计每种边型的张数
        array<long long,21> cnt{}; cnt.fill(0);
        for (int i=0;i<M;i++){
            int x,y; cin>>x>>y; --x;--y; if (x>y) swap(x,y);
            cnt[ pid[x][y] ]++;
        }

        // “是否存在至少一张” 的简单图邻接
        int adj[7]={0};
        for (int e=0;e<E;e++) if (cnt[e]>0){
            auto [u,v]=P[e];
            adj[u] |= (1<<v);
            adj[v] |= (1<<u);
        }

        // 1) MS[S]：S 内全部边张数之和
        static int MS[ALL]; for (int S=0; S<ALL; ++S) MS[S]=0;
        for (int e=0;e<E;e++){
            if (!cnt[e]) continue;
            int m = (1<<P[e].first) | (1<<P[e].second);
            for (int S=0; S<ALL; ++S) if ((S & m) == m) MS[S] += (int)cnt[e];
        }

        // 2) 计算每个 S 的连通块数 c_S 与 W(S)=sum binom(sz,2)
        static int cS[ALL], Aexp[ALL];
        static int64_t WS[ALL];
        for (int S=0; S<ALL; ++S){
            if (!S){ cS[S]=0; WS[S]=0; Aexp[S]=0; continue; }
            int vis=0, cc=0; long long wsum=0;
            for (int v=0; v<V; ++v) if ((S>>v)&1){
                if ((vis>>v)&1) continue;
                // BFS 统计该连通块大小
                int q[8]; int qh=0, qt=0;
                q[qt++]=v; vis|=(1<<v);
                int sz=0;
                while(qh<qt){
                    int x=q[qh++]; ++sz;
                    int nb = adj[x] & S;
                    int rest = nb & ~vis;
                    while(rest){
                        int lb = rest & -rest;
                        int y = __builtin_ctz((unsigned)lb);
                        vis |= lb;
                        q[qt++]=y;
                        rest ^= lb;
                    }
                }
                ++cc; wsum += 1LL*sz*(sz-1)/2;
            }
            cS[S]=cc;
            WS[S]=wsum;
            Aexp[S] = MS[S] - (popc[S] - cS[S]); // >=0
        }

        // 3) 预先算好 A(S,∅)=2^{Aexp[S]}
        static cpp_int A0[ALL];
        for (int S=0; S<ALL; ++S){
            cpp_int t = 1;
            t <<= Aexp[S];
            A0[S] = std::move(t);
        }

        // 4) DP：C0[S]=C(S,∅)，D[S]=sum_{a<b in S} C(S,{a,b})
        static cpp_int C0[ALL], D[ALL];
        for (int S=0; S<ALL; ++S){ C0[S]=0; D[S]=0; }

        // 按 |S| 升序
        vector<int> order(ALL);
        iota(order.begin(), order.end(), 0);
        sort(order.begin(), order.end(), [&](int a,int b){
            if (popc[a]!=popc[b]) return popc[a]<popc[b];
            return a<b;
        });

        for (int S: order){
            if (!S) continue;
            int rbit = S & -S;

            // ---- C0[S] ----
            C0[S] = A0[S];                 // A(S,∅)
            for (int T=(S-1)&S; T; T=(T-1)&S){
                if ((T & rbit)==0) continue;
                int R = S ^ T;
                if (C0[T]==0) continue;
                cpp_int tmp = C0[T];
                tmp <<= Aexp[R];          // C0[T] * A(S\T,∅)
                C0[S] -= tmp;
            }

            // ---- D[S] = sum_{pairs} C(S,{a,b}) ----
            if (WS[S]) D[S] = A0[S] * WS[S];
            else D[S] = 0;

            for (int T=(S-1)&S; T; T=(T-1)&S){
                if ((T & rbit)==0) continue;
                int R = S ^ T;
                // 减去 D[T] * A0[R]
                if (D[T]!=0){
                    cpp_int tmp = D[T];
                    tmp <<= Aexp[R];
                    D[S] -= tmp;
                }
                // 减去 C0[T] * W(R) * A0[R]
                if (WS[R] && C0[T]!=0){
                    cpp_int tmp = C0[T];
                    tmp <<= Aexp[R];
                    if (WS[R] == 1)      D[S] -= tmp;
                    else                 D[S] -= tmp * WS[R];
                }
            }
        }

        // 5) 汇总答案（|S|>=2）
        cpp_int ans = 0;
        for (int S=0; S<ALL; ++S){
            if (popc[S] < 2) continue;
            ans += C0[S];
            ans += D[S];
        }
        cout << ans << '\n';
    }
    return 0;
}

为什么这版会快很多

把 672 个 C2[S][e] 状态压成 128 个 D[S]，移位/大整数加减从 ~9500 次骤减到 ~2000 次左右；
只做小整数乘法（ $W(S)\le 21$ ）；
仍然只依赖 7 个点的子集 DP，和输入规模 $M$ 只在线性读入处有关。
对于超大的 $M$ ，输出仍然是巨大整数（题面要求精准），但计算部分不会再超时。

把这版替换你当前实现后，通常能把“1595ms 超时”的组降到几百毫秒级。