Hi. Write a Python program that finds identical...

Question

Hi. Write a Python program that finds identical houses and other structures in the photo folder. All matches found are then saved to info.json. After that, a tree of local HTML files is built (HTML5 plus minimal JS, nothing server-side; there is no need to set up a web server yet). Each generated page shows one photo, say number 12, with three links below it: back to photo 11, forward to photo 13, and to the table of contents, index.html. Clicking a part of the photo opens another photo that contains the same object.

Answer 1

Below is a ready-to-run Python script that:

- takes all images from the photo/ folder
- finds identical or similar structures (clusters matching objects by image region)
- saves all matches to site/info.json
- generates a static HTML site (no server) in the site/ folder:
  - site/index.html: table of contents plus the list of matches found (clusters)
  - site/pages/<photo>.html: the page for a single photo
  - links at the bottom of each photo page: back / forward / table of contents
  - clickable rectangles over the detected objects on top of the photo: a click opens another photo with the same object and highlights the cluster
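
The clickable rectangles can be positioned in percentages of the image size, so they stay aligned when the browser scales the photo. Here is a minimal sketch of that conversion, assuming pixel boxes in `(x1, y1, x2, y2)` form. The helper name `bbox_to_css` is hypothetical and used only for this illustration:

```python
from typing import Tuple


def bbox_to_css(bbox: Tuple[int, int, int, int], img_w: int, img_h: int) -> str:
    """Convert a pixel box into a CSS style string with percentage offsets."""
    x1, y1, x2, y2 = bbox
    left = 100.0 * x1 / img_w
    top = 100.0 * y1 / img_h
    width = 100.0 * (x2 - x1) / img_w
    height = 100.0 * (y2 - y1) / img_h
    return f"left:{left:.2f}%; top:{top:.2f}%; width:{width:.2f}%; height:{height:.2f}%;"


# A 200x200 px box at (100, 50) inside a 400x500 px photo.
print(bbox_to_css((100, 50, 300, 250), 400, 500))
# left:25.00%; top:10.00%; width:50.00%; height:40.00%;
```

The resulting string is meant for an absolutely positioned element inside a `position: relative` wrapper around the image.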

Important: "building recognition" is done without a dedicated building model. The script generates candidate regions, ranks them with CLIP against text prompts ("house/building/facade…"), and then matches them across photos by their embeddings. This usually works on real "identical buildings", but the threshold and the prompts may need tuning.
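
The matching step can be sketched in isolation. The snippet below is a simplified illustration, not the full script: it uses hand-made toy vectors in place of CLIP embeddings, links any two regions whose cosine similarity reaches a threshold, and groups linked regions with a small union-find. The function name `link_similar` and the vectors are assumptions made for the example.

```python
from typing import List

import numpy as np


def link_similar(feats: np.ndarray, threshold: float) -> List[int]:
    """Return one cluster label per row; rows are assumed to be L2-normalized."""
    n = feats.shape[0]
    parent = list(range(n))

    def find(a: int) -> int:
        # follow parent pointers to the root, compressing the path on the way
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    sims = feats @ feats.T  # cosine similarity, since rows have unit length
    for i in range(n):
        for j in range(i + 1, n):
            if sims[i, j] >= threshold:
                parent[find(j)] = find(i)  # merge the two groups
    return [find(k) for k in range(n)]


# Toy "embeddings": the first two point almost the same way, the third does not.
feats = np.array([[1.0, 0.0, 0.0], [0.99, 0.14, 0.0], [0.0, 1.0, 0.0]])
feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)
print(link_similar(feats, threshold=0.92))
```

Because groups are merged transitively, a chain of pairwise-similar regions ends up in one cluster even if its two ends are not similar to each other. That is one reason a stricter threshold reduces false matches.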

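Installing dependencies

bash
pip install torch torchvision open_clip_torch pillow numpy tqdm

Without a GPU the script runs on the CPU, just more slowly.
On the first run, the CLIP weights may be downloaded automatically.
Optional: if opencv-contrib-python is installed, the script uses selective search for region candidates; otherwise it falls back to a sliding-window grid.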

Running

Folder layout (example):

text
project/
  photo/
    0012.jpg
    0013.jpg
    ...
  build_site.py

Run:

bash
python build_site.py

Open in a browser:

site/index.html

The `build_site.py` script

python
import re
import json
import argparse
from dataclasses import dataclass
from pathlib import Path
from typing import List, Tuple, Dict, Optional
from urllib.parse import quote

import numpy as np
from PIL import Image, ImageOps
from tqdm import tqdm

# --- Optional: OpenCV selective search (usually yields better region candidates)
try:
    import cv2  # type: ignore
    _HAS_CV2 = True
    _HAS_XIMGPROC = hasattr(cv2, "ximgproc") and hasattr(cv2.ximgproc, "segmentation")
except Exception:
    _HAS_CV2 = False
    _HAS_XIMGPROC = False

# --- CLIP
import torch
import open_clip


IMG_EXTS = {".jpg", ".jpeg", ".png", ".webp", ".bmp", ".tif", ".tiff"}


@dataclass
class Obj:
    obj_id: int
    img_idx: int
    file: str
    w: int
    h: int
    bbox: Tuple[int, int, int, int]  # x1,y1,x2,y2
    score: float                     # building likeness (CLIP text similarity)
    feat: Optional[np.ndarray] = None
    cluster: int = -1


def natural_sort_key(p: Path):
    """
    Sort by the first number in the name (so that 12 comes before 100, etc.)
    """
    s = p.stem
    m = re.search(r"\d+", s)
    if m:
        return (int(m.group()), s.lower())
    return (10**18, s.lower())


def list_images(photo_dir: Path) -> List[Path]:
    files = [p for p in photo_dir.iterdir() if p.is_file() and p.suffix.lower() in IMG_EXTS]
    files.sort(key=natural_sort_key)
    return files


def exif_open(path: Path) -> Image.Image:
    im = Image.open(path)
    # apply the EXIF orientation first, while the metadata is still attached
    im = ImageOps.exif_transpose(im)
    return im.convert("RGB")


def iou(a: Tuple[int, int, int, int], b: Tuple[int, int, int, int]) -> float:
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    iw, ih = max(0, ix2 - ix1), max(0, iy2 - iy1)
    inter = iw * ih
    if inter <= 0:
        return 0.0
    area_a = max(0, ax2 - ax1) * max(0, ay2 - ay1)
    area_b = max(0, bx2 - bx1) * max(0, by2 - by1)
    denom = area_a + area_b - inter
    return float(inter) / float(denom) if denom > 0 else 0.0


def nms_by_score(cands: List[Tuple[Tuple[int, int, int, int], float]], iou_thr: float, top_k: int) -> List[Tuple[Tuple[int, int, int, int], float]]:
    """
    Simple greedy NMS: keep up to top_k highest-scoring boxes whose IoU with
    every already kept box stays below iou_thr.
    """
    cands = sorted(cands, key=lambda x: x[1], reverse=True)
    kept: List[Tuple[Tuple[int, int, int, int], float]] = []
    for box, sc in cands:
        ok = True
        for kbox, _ in kept:
            if iou(box, kbox) >= iou_thr:
                ok = False
                break
        if ok:
            kept.append((box, sc))
        if len(kept) >= top_k:
            break
    return kept


def grid_proposals(w: int, h: int, max_boxes: int = 450) -> List[Tuple[int, int, int, int]]:
    """
    Generate region candidates without OpenCV:
    multi-scale sliding windows with several aspect ratios.
    """
    boxes = set()
    boxes.add((0, 0, w, h))

    min_dim = min(w, h)
    scales = [0.28, 0.38, 0.50, 0.65, 0.80]
    aspect = [0.7, 1.0, 1.35, 1.8]  # buildings are often wide

    for s in scales:
        base = int(min_dim * s)
        for ar in aspect:
            bw = int(base * ar)
            bh = int(base)
            if bw <= 0 or bh <= 0:
                continue
            bw = min(bw, w)
            bh = min(bh, h)

            step_x = max(8, int(bw * 0.33))
            step_y = max(8, int(bh * 0.33))

            for y1 in range(0, max(1, h - bh + 1), step_y):
                for x1 in range(0, max(1, w - bw + 1), step_x):
                    x2, y2 = x1 + bw, y1 + bh
                    boxes.add((x1, y1, x2, y2))
                    if len(boxes) >= max_boxes:
                        return list(boxes)

    return list(boxes)[:max_boxes]


def selective_search_proposals(pil_im: Image.Image, max_boxes: int = 700) -> List[Tuple[int, int, int, int]]:
    """
    If opencv-contrib (ximgproc) is available, use selective search (usually gives better candidates).
    """
    if not (_HAS_CV2 and _HAS_XIMGPROC):
        return []

    im = np.array(pil_im)[:, :, ::-1]  # RGB->BGR
    ss = cv2.ximgproc.segmentation.createSelectiveSearchSegmentation()
    ss.setBaseImage(im)
    ss.switchToSelectiveSearchFast()
    rects = ss.process()  # (x,y,w,h)
    boxes = []
    for (x, y, w, h) in rects[:max_boxes]:
        x1, y1 = int(x), int(y)
        x2, y2 = int(x + w), int(y + h)
        if x2 - x1 < 32 or y2 - y1 < 32:
            continue
        boxes.append((x1, y1, x2, y2))
    # always include the full image, keeping the total within max_boxes
    W, H = pil_im.size
    return boxes[:max_boxes - 1] + [(0, 0, W, H)]


@torch.inference_mode()
def build_clip(device: str, model_name: str, pretrained: str):
    model, _, preprocess = open_clip.create_model_and_transforms(model_name, pretrained=pretrained)
    tokenizer = open_clip.get_tokenizer(model_name)
    model = model.to(device).eval()
    return model, preprocess, tokenizer


@torch.inference_mode()
def encode_image(model, preprocess, device: str, pil_im: Image.Image) -> torch.Tensor:
    x = preprocess(pil_im).unsqueeze(0).to(device)
    feat = model.encode_image(x)
    feat = feat / feat.norm(dim=-1, keepdim=True).clamp_min(1e-12)
    return feat[0]


@torch.inference_mode()
def encode_text(model, tokenizer, device: str, texts: List[str]) -> torch.Tensor:
    tokens = tokenizer(texts).to(device)
    feat = model.encode_text(tokens)
    feat = feat / feat.norm(dim=-1, keepdim=True).clamp_min(1e-12)
    # average over the prompts
    feat = feat.mean(dim=0)
    feat = feat / feat.norm().clamp_min(1e-12)
    return feat


def clamp_box(box: Tuple[int, int, int, int], w: int, h: int) -> Tuple[int, int, int, int]:
    x1, y1, x2, y2 = box
    x1 = max(0, min(x1, w - 1))
    y1 = max(0, min(y1, h - 1))
    x2 = max(x1 + 1, min(x2, w))
    y2 = max(y1 + 1, min(y2, h))
    return (x1, y1, x2, y2)


def area_frac(box: Tuple[int, int, int, int], w: int, h: int) -> float:
    x1, y1, x2, y2 = box
    return ((x2 - x1) * (y2 - y1)) / float(w * h)


def pick_object_boxes_for_image(
    pil_im: Image.Image,
    model,
    preprocess,
    text_feat: torch.Tensor,
    device: str,
    top_candidates: int,
    final_topk: int,
    min_area_frac: float,
    max_area_frac: float,
    nms_iou: float,
) -> List[Tuple[Tuple[int, int, int, int], float, np.ndarray]]:
    """
    1) generate candidate regions
    2) compute the CLIP similarity to the "building" text (score)
    3) run NMS and keep the top_k
    """
    w, h = pil_im.size

    boxes = []
    # prefer selective search when it is available
    ss = selective_search_proposals(pil_im, max_boxes=900)
    if ss:
        boxes = ss
    else:
        boxes = grid_proposals(w, h, max_boxes=550)

    # filter by area
    filtered = []
    for b in boxes:
        b = clamp_box(b, w, h)
        af = area_frac(b, w, h)
        if af < min_area_frac or af > max_area_frac:
            continue
        filtered.append(b)

    # cap the number of candidates
    if len(filtered) > 1000:
        filtered = filtered[:1000]

    scored: List[Tuple[Tuple[int, int, int, int], float, np.ndarray]] = []

    # Score at most top_candidates regions: sample the candidate list at a
    # uniform stride so that CLIP is not run on 1000+ regions,
    # then sort by score.
    step = max(1, len(filtered) // max(1, top_candidates))
    sampled = filtered[::step][:top_candidates]

    for b in sampled:
        x1, y1, x2, y2 = b
        crop = pil_im.crop((x1, y1, x2, y2))
        feat = encode_image(model, preprocess, device, crop)  # torch
        sc = float(torch.dot(feat, text_feat).item())
        scored.append((b, sc, feat.detach().cpu().numpy().astype(np.float32)))

    # NMS on the scores
    for_nms = [(b, sc) for (b, sc, _) in scored]
    kept = nms_by_score(for_nms, iou_thr=nms_iou, top_k=final_topk)

    kept_set = set([k[0] for k in kept])
    out = [(b, sc, f) for (b, sc, f) in scored if b in kept_set]
    out.sort(key=lambda x: x[1], reverse=True)
    return out


class DSU:
    def __init__(self, n: int):
        self.p = list(range(n))
        self.r = [0] * n

    def find(self, a: int) -> int:
        while self.p[a] != a:
            self.p[a] = self.p[self.p[a]]
            a = self.p[a]
        return a

    def union(self, a: int, b: int):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.r[ra] < self.r[rb]:
            self.p[ra] = rb
        elif self.r[ra] > self.r[rb]:
            self.p[rb] = ra
        else:
            self.p[rb] = ra
            self.r[ra] += 1


def ensure_dir(p: Path):
    p.mkdir(parents=True, exist_ok=True)


def write_text(path: Path, s: str):
    path.write_text(s, encoding="utf-8")


def pct(v: float) -> str:
    return f"{v:.4f}%"


def gen_assets(out_dir: Path):
    css = """
:root { font-family: system-ui, -apple-system, Segoe UI, Roboto, Arial, sans-serif; }
body { margin: 16px; line-height: 1.35; }
a { color: #0645ad; text-decoration: none; }
a:hover { text-decoration: underline; }
.header { display:flex; gap:16px; align-items:baseline; flex-wrap:wrap; }
small { color: #555; }
.photo-wrap { position: relative; display: inline-block; max-width: 100%; }
#mainPhoto { max-width: 100%; height: auto; display:block; }
.box {
  position:absolute;
  border: 2px solid rgba(255, 165, 0, 0.55);
  background: rgba(255, 165, 0, 0.10);
  box-sizing: border-box;
  border-radius: 4px;
  cursor: pointer;
}
.box:hover { border-color: rgba(0, 128, 255, 0.85); background: rgba(0, 128, 255, 0.10); }
.box.hl { border-color: rgba(255, 0, 0, 0.90); background: rgba(255, 0, 0, 0.12); }
.nav { margin-top: 12px; display:flex; gap:12px; flex-wrap:wrap; }
.hr { margin: 16px 0; border-top: 1px solid #ddd; }
.grid { display:grid; grid-template-columns: repeat(auto-fill, minmax(240px, 1fr)); gap:12px; }
.card { border: 1px solid #ddd; border-radius: 8px; padding: 10px; }
.thumb { width: 100%; height: auto; border-radius: 6px; border:1px solid #eee; }
.mono { font-family: ui-monospace, SFMono-Regular, Menlo, Consolas, monospace; }
"""
    js = r"""
(function () {
  function highlightByHash() {
    const h = (location.hash || "").replace("#", "").trim();
    const boxes = document.querySelectorAll(".box");
    boxes.forEach(b => b.classList.remove("hl"));
    if (!h) return;
    if (h.startsWith("c")) {
      const cid = h.slice(1);
      boxes.forEach(b => {
        if (b.dataset.cluster === cid) b.classList.add("hl");
      });
    }
  }
  window.addEventListener("hashchange", highlightByHash);
  window.addEventListener("load", highlightByHash);
})();
"""
    ensure_dir(out_dir / "assets")
    write_text(out_dir / "assets" / "style.css", css.strip() + "\n")
    write_text(out_dir / "assets" / "script.js", js.strip() + "\n")


def rel_photo_path_from_page() -> str:
    # site/pages/*.html -> ../../photo/...
    return "../../photo"


def rel_photo_path_from_index() -> str:
    # site/index.html -> ../photo/...
    return "../photo"


def safe_page_name(photo_file: str) -> str:
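    # build a stable name for the HTML page from the photo's file name
    # note: files that differ only in extension, or only in characters
    # replaced below, map to the same page name
    name = photo_file
    # photo links keep the extension; the page itself gets .html
    stem = Path(name).stem
    # basic sanitization: replace spaces and awkward characters with "_"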
    stem = re.sub(r"[^0-9A-Za-zА-Яа-я_\-]+", "_", stem)
    return stem + ".html"


def save_thumb(pil_im: Image.Image, box: Tuple[int, int, int, int], out_path: Path, max_side: int = 420):
    crop = pil_im.crop(box)
    w, h = crop.size
    scale = max_side / float(max(w, h))
    if scale < 1.0:
        crop = crop.resize((int(w * scale), int(h * scale)), Image.Resampling.LANCZOS)
    crop.save(out_path, quality=85)


def build_site(
    photo_dir: Path,
    out_dir: Path,
    device: str,
    model_name: str,
    pretrained: str,
    top_candidates: int,
    objects_per_image: int,
    sim_threshold: float,
    min_area_frac: float,
    max_area_frac: float,
    nms_iou: float,
):
    images = list_images(photo_dir)
    if not images:
        raise SystemExit(f"No images found in folder {photo_dir}.")

    ensure_dir(out_dir)
    ensure_dir(out_dir / "pages")
    ensure_dir(out_dir / "thumbs")
    gen_assets(out_dir)

    # CLIP
    model, preprocess, tokenizer = build_clip(device, model_name, pretrained)

    prompts = [
        "a photo of a house",
        "a photo of a building",
        "a photo of a facade",
        "a photo of an apartment building",
        "a photo of a cottage",
        "a photo of a barn",
        "a photo of a townhouse",
        "a photo of a storefront",
    ]
    text_feat = encode_text(model, tokenizer, device, prompts)

    # 1) Extract objects (candidate building regions)
    all_objs: List[Obj] = []
    per_img_objs: Dict[int, List[int]] = {}

    obj_id = 0
    for img_idx, img_path in enumerate(tqdm(images, desc="Scanning photos")):
        pil_im = exif_open(img_path)
        w, h = pil_im.size
        chosen = pick_object_boxes_for_image(
            pil_im=pil_im,
            model=model,
            preprocess=preprocess,
            text_feat=text_feat,
            device=device,
            top_candidates=top_candidates,
            final_topk=objects_per_image,
            min_area_frac=min_area_frac,
            max_area_frac=max_area_frac,
            nms_iou=nms_iou,
        )

        per_img_objs[img_idx] = []
        for box, sc, feat in chosen:
            o = Obj(
                obj_id=obj_id,
                img_idx=img_idx,
                file=img_path.name,
                w=w,
                h=h,
                bbox=box,
                score=float(sc),
                feat=feat,
            )
            all_objs.append(o)
            per_img_objs[img_idx].append(obj_id)
            obj_id += 1

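    # 2) Match & cluster by cosine similarity of object embeddings
    if not all_objs:
        # np.stack fails on an empty list, so stop with a readable message
        raise SystemExit("No candidate regions were found; try relaxing --min_area_frac / --max_area_frac.")
    feats = np.stack([o.feat for o in all_objs], axis=0)  # (N, D)
    # re-normalize, just in case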
    norms = np.linalg.norm(feats, axis=1, keepdims=True)
    feats = feats / np.clip(norms, 1e-12, None)

    N = feats.shape[0]
    dsu = DSU(N)

    # Full O(N^2) comparison, where N = images * objects_per_image
    # (e.g. 200 photos * 12 = 2400 regions, which is already heavy).
    # Pairs from the same photo are skipped, so links only form across photos.
    # With many photos, lower objects_per_image or raise sim_threshold.
    sim_thr = float(sim_threshold)

    for i in tqdm(range(N), desc="Finding matches (clustering)"):
        fi = feats[i]
        # similarity of region i to every region
        sims = feats @ fi
        # link only sufficiently similar pairs
        js = np.where(sims >= sim_thr)[0]
        for j in js:
            if j <= i:
                continue
            # skip pairs inside the same photo
            if all_objs[i].img_idx == all_objs[j].img_idx:
                continue
            dsu.union(i, j)

    # assign clusters
    root_to_cluster: Dict[int, int] = {}
    cluster_members: Dict[int, List[int]] = {}
    next_cluster = 0

    for k in range(N):
        r = dsu.find(k)
        if r not in root_to_cluster:
            root_to_cluster[r] = next_cluster
            next_cluster += 1
        cid = root_to_cluster[r]
        cluster_members.setdefault(cid, []).append(k)

    # clusters with size >=2 (real matches)
    valid_clusters = {cid for cid, mem in cluster_members.items() if len(mem) >= 2}
    for idx, o in enumerate(all_objs):
        o.cluster = root_to_cluster[dsu.find(idx)] if root_to_cluster[dsu.find(idx)] in valid_clusters else -1

    # 3) Build info.json
    img_pages = []
    for i, p in enumerate(images):
        page = safe_page_name(p.name)
        img_pages.append(page)

    clusters_json = []
    for cid in sorted(valid_clusters):
        mem = cluster_members[cid]
        members = []
        for k in mem:
            o = all_objs[k]
            members.append({
                "file": o.file,
                "img_index": o.img_idx,
                "bbox": list(map(int, o.bbox)),
                "score": float(o.score),
                "page": f"pages/{safe_page_name(o.file)}",
            })
        clusters_json.append({
            "cluster": cid,
            "size": len(mem),
            "members": members,
        })

    images_json = []
    for img_idx, img_path in enumerate(images):
        objs = []
        for oid in per_img_objs.get(img_idx, []):
            o = all_objs[oid]
            if o.cluster < 0:
                continue
            objs.append({
                "obj_id": o.obj_id,
                "cluster": o.cluster,
                "bbox": list(map(int, o.bbox)),
                "score": float(o.score),
            })
        images_json.append({
            "img_index": img_idx,
            "file": img_path.name,
            "page": f"pages/{safe_page_name(img_path.name)}",
            "objects": objs,
        })

    info = {
        "photo_dir": str(photo_dir),
        "out_dir": str(out_dir),
        "model": {"name": model_name, "pretrained": pretrained},
        "params": {
            "top_candidates": top_candidates,
            "objects_per_image": objects_per_image,
            "sim_threshold": sim_threshold,
            "min_area_frac": min_area_frac,
            "max_area_frac": max_area_frac,
            "nms_iou": nms_iou,
            "prompts": prompts,
        },
        "images": images_json,
        "clusters": clusters_json,
    }
    write_text(out_dir / "info.json", json.dumps(info, ensure_ascii=False, indent=2))

    # 4) Thumbnails for clusters
    # use the first object of each cluster as its preview
    for cid in tqdm(sorted(valid_clusters), desc="Previews (thumbs)"):
        k0 = cluster_members[cid][0]
        o0 = all_objs[k0]
        pil0 = exif_open(photo_dir / o0.file)
        thumb_path = out_dir / "thumbs" / f"c{cid}.jpg"
        save_thumb(pil0, o0.bbox, thumb_path)

    # 5) Generate HTML pages
    # helper maps: cluster -> list of member pages
    cluster_to_members: Dict[int, List[Obj]] = {}
    for o in all_objs:
        if o.cluster >= 0:
            cluster_to_members.setdefault(o.cluster, []).append(o)

    # sort members for stable navigation
    for cid in cluster_to_members:
        cluster_to_members[cid].sort(key=lambda x: (x.img_idx, x.obj_id))

    # index.html
    idx_cards = []
    idx_cards.append("<h2>Matches found (clusters of identical structures)</h2>")
    idx_cards.append('<div class="grid">')
    for cid in sorted(valid_clusters):
        thumb = f"thumbs/c{cid}.jpg"
        mem = cluster_members[cid]
        # link to the first member's page; quote only the path, because
        # quote() would turn "#" into "%23" and break the fragment
        first_obj = all_objs[mem[0]]
        first_page = quote(f"pages/{safe_page_name(first_obj.file)}") + f"#c{cid}"
        idx_cards.append(
            f"""
<div class="card">
  <div><b>Cluster c{cid}</b> <small>({len(mem)} matches)</small></div>
  <div style="margin-top:8px;">
    <a href="{first_page}"><img class="thumb" src="{quote(thumb)}" alt="c{cid}"></a>
  </div>
  <div style="margin-top:8px;"><a href="{first_page}">Open the first match</a></div>
</div>
""".strip()
        )
    idx_cards.append("</div>")

    # list all photos
    all_list = ["<h2>All photos</h2>", "<ul>"]
    for img_path in images:
        page = f"pages/{safe_page_name(img_path.name)}"
        all_list.append(f'<li><a class="mono" href="{quote(page)}">{img_path.name}</a></li>')
    all_list.append("</ul>")

    index_html = f"""<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>Photo index</title>
  <link rel="stylesheet" href="assets/style.css">
</head>
<body>
  <div class="header">
    <h1 style="margin:0;">Table of contents</h1>
    <small>Open locally; no server needed</small>
    <small>info.json: <span class="mono">site/info.json</span></small>
  </div>

  <div class="hr"></div>

  {"".join(idx_cards)}

  <div class="hr"></div>

  {"".join(all_list)}

</body>
</html>
"""
    write_text(out_dir / "index.html", index_html)

    # per image pages
    for img_idx, img_path in enumerate(tqdm(images, desc="Generating pages")):
        page_name = safe_page_name(img_path.name)
        prev_page = f"{safe_page_name(images[img_idx - 1].name)}" if img_idx > 0 else None
        next_page = f"{safe_page_name(images[img_idx + 1].name)}" if img_idx < len(images) - 1 else None

        # open once for dims
        pil_im = exif_open(img_path)
        w, h = pil_im.size

        # build boxes overlay
        boxes_html = []
        # also list matches by cluster
        clusters_here: Dict[int, List[Obj]] = {}

        for oid in per_img_objs.get(img_idx, []):
            o = all_objs[oid]
            if o.cluster < 0:
                continue
            clusters_here.setdefault(o.cluster, []).append(o)

        for cid, objs_here in clusters_here.items():
            # pick the target: the next cluster member (in photo order) on another photo,
            # wrapping around to the first one, so repeated clicks cycle through the cluster
            members = cluster_to_members[cid]
            targets = [m for m in members if m.img_idx != img_idx]
            if not targets:
                continue
            later = [m for m in targets if m.img_idx > img_idx]
            target = (later or targets)[0]
            # quote only the file name: quote() would turn "#" into "%23" and break the fragment
            target_href = quote(safe_page_name(target.file)) + f"#c{cid}"

            for o in objs_here:
                x1, y1, x2, y2 = o.bbox
                left = 100.0 * x1 / w
                top = 100.0 * y1 / h
                ww = 100.0 * (x2 - x1) / w
                hh = 100.0 * (y2 - y1) / h
                title = f"Cluster c{cid} (score {o.score:.3f})"
                boxes_html.append(
                    f'<a class="box" data-cluster="{cid}" href="{target_href}" '
                    f'style="left:{left:.4f}%; top:{top:.4f}%; width:{ww:.4f}%; height:{hh:.4f}%;" '
                    f'title="{title}"></a>'
                )

        # list section under image
        matches_lines = []
        if clusters_here:
            matches_lines.append("<h3>Matches on this photo</h3>")
            matches_lines.append("<ul>")
            for cid in sorted(clusters_here.keys()):
                others = [m for m in cluster_to_members[cid] if m.img_idx != img_idx]
                links = []
                for m in others[:12]:
                    # quote only the file name so the "#c..." fragment is not percent-encoded
                    href = quote(safe_page_name(m.file)) + f"#c{cid}"
                    links.append(f'<a class="mono" href="{href}">{m.file}</a>')
                more = ""
                if len(others) > 12:
                    more = f" … +{len(others)-12}"
                matches_lines.append(f"<li>Cluster <b>c{cid}</b>: {', '.join(links)}{more}</li>")
            matches_lines.append("</ul>")
        else:
            matches_lines.append("<p><small>No matches (clusters of size ≥ 2) were found on this photo.</small></p>")

        # nav
        nav = ['<div class="nav">']
        if prev_page:
            nav.append(f'<a href="{quote(prev_page)}">← Back</a>')
        else:
            nav.append('<span><small>← Back</small></span>')
        nav.append('<a href="../index.html">Table of contents</a>')
        if next_page:
            nav.append(f'<a href="{quote(next_page)}">Forward →</a>')
        else:
            nav.append('<span><small>Forward →</small></span>')
        nav.append("</div>")

        photo_src = f"{rel_photo_path_from_page()}/{img_path.name}"
        page_html = f"""<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>{img_path.name}</title>
  <link rel="stylesheet" href="../assets/style.css">
  <script defer src="../assets/script.js"></script>
</head>
<body>
  <div class="header">
    <h1 style="margin:0;" class="mono">{img_path.name}</h1>
    <small>Click a highlighted region to go to a photo with the same object</small>
  </div>

  <div class="photo-wrap" style="margin-top:12px;">
    <img id="mainPhoto" src="{quote(photo_src)}" alt="{img_path.name}">
    {"".join(boxes_html)}
  </div>

  {"".join(nav)}

  <div class="hr"></div>

  {"".join(matches_lines)}

  <div class="hr"></div>
  <p><small>Tip: to highlight a cluster, open a link of the form <span class="mono">...#c12</span></small></p>
</body>
</html>
"""
        write_text(out_dir / "pages" / page_name, page_html)

    print(f"\nDone! Open: {out_dir / 'index.html'}")
    print(f"Matches in JSON: {out_dir / 'info.json'}")


def main():
    parser = argparse.ArgumentParser(description="Find identical structures in photos and generate a static HTML site.")
    parser.add_argument("--photo_dir", default="photo", help="Folder with source photos (default: photo)")
    parser.add_argument("--out_dir", default="site", help="Output folder for the site (default: site)")

    parser.add_argument("--device", default="cuda" if torch.cuda.is_available() else "cpu",
                        help="cuda or cpu (default: auto)")
    parser.add_argument("--model", default="ViT-B-32", help="open_clip CLIP model (default: ViT-B-32)")
    parser.add_argument("--pretrained", default="laion2b_s34b_b79k", help="pretrained weights (default: laion2b_s34b_b79k)")

    parser.add_argument("--top_candidates", type=int, default=220, help="How many region candidates to score with CLIP per photo")
    parser.add_argument("--objects_per_image", type=int, default=12, help="How many objects (regions) to keep per photo")
    parser.add_argument("--sim_threshold", type=float, default=0.92, help="Cosine-similarity threshold for 'same object' (0..1)")

    parser.add_argument("--min_area_frac", type=float, default=0.06, help="Min. region area as a fraction of the photo")
    parser.add_argument("--max_area_frac", type=float, default=0.90, help="Max. region area as a fraction of the photo")
    parser.add_argument("--nms_iou", type=float, default=0.40, help="NMS IoU threshold for merging overlapping regions")
    args = parser.parse_args()

    photo_dir = Path(args.photo_dir)
    out_dir = Path(args.out_dir)

    build_site(
        photo_dir=photo_dir,
        out_dir=out_dir,
        device=args.device,
        model_name=args.model,
        pretrained=args.pretrained,
        top_candidates=args.top_candidates,
        objects_per_image=args.objects_per_image,
        sim_threshold=args.sim_threshold,
        min_area_frac=args.min_area_frac,
        max_area_frac=args.max_area_frac,
        nms_iou=args.nms_iou,
    )


if __name__ == "__main__":
    main()
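
One limitation is worth noting: the script hardcodes the image path as `../../photo`, which holds only when `photo/` and `site/` sit side by side. A possible adjustment, shown here as a sketch and not as part of the script above, is to compute the relative path with `os.path.relpath`. The helper name `rel_photo_dir` is hypothetical.

```python
import os
from pathlib import Path


def rel_photo_dir(photo_dir: Path, page_dir: Path) -> str:
    """Path from the folder holding an HTML page to the photo folder, with '/' separators."""
    rel = os.path.relpath(photo_dir.resolve(), start=page_dir.resolve())
    return Path(rel).as_posix()


# Default layout: pages live in site/pages, photos in photo/.
print(rel_photo_dir(Path("photo"), Path("site/pages")))  # ../../photo
# A different layout, e.g. --photo_dir data/photo, seen from site/index.html.
print(rel_photo_dir(Path("data/photo"), Path("site")))   # ../data/photo
```

The result could replace the fixed string returned by `rel_photo_path_from_page()`. On Windows, `os.path.relpath` raises `ValueError` when the two folders are on different drives, so that case would need separate handling.
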

How to tune it if there are too few or too many matches

Too few matches: lower the threshold:
```bash
python build_site.py --sim_threshold 0.90
```
Too many false matches: raise the threshold:
```bash
python build_site.py --sim_threshold 0.94
```
If a photo contains many objects and the house you need is not picked up:
```bash
python build_site.py --objects_per_image 18 --top_candidates 350
```
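
When tuning the threshold, it helps to compare cluster sizes between runs. Here is a minimal sketch, assuming the `info.json` layout that the script writes (`clusters`, each with `cluster`, `size`, and `members` carrying a `file`). The helper names `load_info` and `summarize_clusters` and the sample data are made up for illustration.

```python
import json
from pathlib import Path
from typing import Dict, List


def load_info(path: Path) -> Dict:
    """Read the JSON file written by the script."""
    return json.loads(path.read_text(encoding="utf-8"))


def summarize_clusters(info: Dict) -> List[str]:
    """One line per cluster, largest first: region count and the photos involved."""
    lines = []
    for c in sorted(info["clusters"], key=lambda c: c["size"], reverse=True):
        files = sorted({m["file"] for m in c["members"]})
        lines.append(
            f"c{c['cluster']}: {c['size']} regions in {len(files)} photos ({', '.join(files)})"
        )
    return lines


# Made-up sample in the same shape as info.json (extra fields omitted).
sample = {
    "clusters": [
        {"cluster": 0, "size": 2,
         "members": [{"file": "0012.jpg"}, {"file": "0013.jpg"}]},
        {"cluster": 3, "size": 3,
         "members": [{"file": "0012.jpg"}, {"file": "0014.jpg"}, {"file": "0014.jpg"}]},
    ]
}
for line in summarize_clusters(sample):
    print(line)
```

For a real run, pass `load_info(Path("site/info.json"))` instead of `sample`. A few very large clusters usually suggest the threshold is too low.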

If you like, I can add a second mode, "fast search for duplicate whole frames" (pHash), plus a separate "buildings only" mode with heavier open-vocabulary detection (GroundingDINO/SAM). The current version already covers all of your points: JSON, the HTML tree, navigation, and clicks on regions.
