Grouping Names Using Bitmask
創建於:2025年2月4日
使用 O3 Mini High 回答 Chat01
創建於:2025年2月4日
使用 O3 Mini High 回答 Chat01
Here is the problem statement extracted from the images:
You are given an array of N names. Each name consists of uppercase English letters. You need to form groups of names based on the set of unique alphabets they contain.
A name A can be in the same group as name B if the set of unique characters in A and B differs by at most one character in one of the following ways:
Your task is to determine:
For each test case, print two space-separated integers:
text1 5 HARI HARRY LATA SARAI
3 2
Would you like a solution in a specific programming language? code in c++
We can solve this problem by “compressing” each name into a 26‐bit integer (bitmask) representing its unique letters. Two names can be “grouped” (i.e. they are in the same connected component) if their masks are “neighbors” under the following rules:
0 or 1‐bit difference by addition/removal:
Replacement (2‐bit difference):
Once we define the neighbor relation in terms of bitmasks, we can build a “graph” (or more directly use a disjoint–set union, DSU) where every name is a node and we add an edge between two names if their bitmasks are “neighbors” according to the rules above.
A good optimization is that instead of comparing every pair of names (which would be too slow when ), we do the following:
unordered_map<int,int>
) to store one representative index for each distinct mask (and union together all names that have the same mask).Finally, after processing, each connected component represents a “group.” We then count:
Below is the complete C++ solution.
cpp#include <bits/stdc++.h> using namespace std; // Disjoint Set Union (Union‐Find) structure struct DSU { vector<int> parent; vector<int> sz; DSU(int n) : parent(n), sz(n, 1) { for (int i = 0; i < n; i++) parent[i] = i; } int find(int a) { if(parent[a] == a) return a; return parent[a] = find(parent[a]); } void unionn(int a, int b) { a = find(a); b = find(b); if(a == b) return; if(sz[a] < sz[b]) swap(a, b); parent[b] = a; sz[a] += sz[b]; } }; int main(){ ios::sync_with_stdio(false); cin.tie(nullptr); int T; cin >> T; while(T--){ int N; cin >> N; vector<string> names(N); for (int i = 0; i < N; i++){ cin >> names[i]; } DSU dsu(N); // Map each distinct mask to one representative index. unordered_map<int,int> maskToRep; maskToRep.reserve(N * 2); // For each name, compute the bitmask and union names with the same mask. vector<int> masks(N, 0); for (int i = 0; i < N; i++){ int mask = 0; for (char c : names[i]){ mask |= (1 << (c - 'A')); } masks[i] = mask; if(maskToRep.find(mask) == maskToRep.end()){ maskToRep[mask] = i; } else { dsu.unionn(i, maskToRep[mask]); } } // For every distinct mask, generate all neighbor candidates and union groups. for(auto &p : maskToRep){ int mask = p.first; int rep = p.second; // 1-bit differences: removal or addition. for(int i = 0; i < 26; i++){ int bit = 1 << i; int candidate; if(mask & bit){ // Removal: remove the letter i candidate = mask ^ bit; if(maskToRep.find(candidate) != maskToRep.end()){ dsu.unionn(rep, maskToRep[candidate]); } } else { // Addition: add the letter i candidate = mask | bit; if(maskToRep.find(candidate) != maskToRep.end()){ dsu.unionn(rep, maskToRep[candidate]); } } } // 2-bit differences: replacement (remove one letter, add another) for (int i = 0; i < 26; i++){ if(mask & (1 << i)){ for (int j = 0; j < 26; j++){ if(!(mask & (1 << j))){ int candidate = (mask ^ (1 << i)) | (1 << j); if(maskToRep.find(candidate) != maskToRep.end()){ dsu.unionn(rep, maskToRep[candidate]); } } } } } } // Count sizes of each connected component. unordered_map<int,int> compSize; compSize.reserve(N); int maxSize = 0; for (int i = 0; i < N; i++){ int root = dsu.find(i); compSize[root]++; maxSize = max(maxSize, compSize[root]); } int groups = compSize.size(); cout << maxSize << " " << groups << "\n"; } return 0; }
Input & Mask Creation:
Each name is converted into a bitmask where bit (for ) is set if the letter ('A' + i)
is in the name.
Union by Same Mask:
Using the maskToRep
map, we union all names that have exactly the same bitmask.
Generating Neighbors:
For each distinct mask:
Result Computation:
Finally, we count the sizes of the connected components. The maximum size is the size of the largest component, and the number of groups is the number of distinct components.
This solution efficiently meets the problem requirements with and maximum name length 20. Let me know if you have any questions or need further clarifications!