中級者向け No.068

Webスクレイピングツール

URLを入力してBeautifulSoupでページのテーブル・リンク・テキストを抽出しCSV保存するスクレイピングGUI。

🎯 難易度: ★★★ 📦 ライブラリ: tkinter(標準ライブラリ), requests, beautifulsoup4 ⏱️ 制作時間: 30〜90分

1. アプリ概要

URLを入力してBeautifulSoupでページのテーブル・リンク・テキストを抽出しCSV保存するスクレイピングGUI。

このアプリはnetworkカテゴリの実践的なPythonアプリです。使用ライブラリは tkinter(標準ライブラリ)・requests・beautifulsoup4、難易度は ★★★ です。

Pythonの豊富なライブラリを活用することで、実用的なアプリを短いコードで実装できます。ソースコードをコピーして実行し、仕組みを理解したうえでカスタマイズに挑戦してみてください。

GUIアプリ開発はプログラミングの楽しさを実感できる最も効果的な学習方法のひとつです。変数・関数・クラス・イベント処理などの重要な概念が自然と身につきます。

2. 機能一覧

  • Webスクレイピングツールのメイン機能
  • 直感的なGUIインターフェース
  • 入力値のバリデーション
  • エラーハンドリング
  • 結果の見やすい表示
  • クリア機能付き

3. 事前準備・環境

ℹ️
動作確認環境

Python 3.10 以上 / Windows・Mac・Linux すべて対応

以下の環境で動作確認しています。

  • Python 3.10 以上
  • OS: Windows 10/11・macOS 12+・Ubuntu 20.04+

インストールが必要なライブラリ

pip install requests

4. 完全なソースコード

💡
コードのコピー方法

右上の「コピー」ボタンをクリックするとコードをクリップボードにコピーできます。

app068.py
import tkinter as tk
from tkinter import ttk, messagebox, filedialog
import urllib.request
import re
import html
import csv
import threading
import json

try:
    from html.parser import HTMLParser
    HTML_AVAILABLE = True
except ImportError:
    HTML_AVAILABLE = False


class App068:
    """Webスクレイピングツール"""

    def __init__(self, root):
        self.root = root
        self.root.title("Webスクレイピングツール")
        self.root.geometry("1100x680")
        self.root.configure(bg="#0d1117")
        self._results = []
        self._build_ui()

    def _build_ui(self):
        header = tk.Frame(self.root, bg="#161b22", pady=6)
        header.pack(fill=tk.X)
        tk.Label(header, text="🕷 Webスクレイピングツール",
                 font=("Noto Sans JP", 12, "bold"),
                 bg="#161b22", fg="#f0f6fc").pack(side=tk.LEFT, padx=12)

        # URL入力
        url_f = tk.Frame(self.root, bg="#0d1117", pady=4)
        url_f.pack(fill=tk.X, padx=8)
        tk.Label(url_f, text="URL:", bg="#0d1117", fg="#c9d1d9",
                 font=("Arial", 9)).pack(side=tk.LEFT)
        self.url_var = tk.StringVar(
            value="https://quotes.toscrape.com/")
        ttk.Entry(url_f, textvariable=self.url_var,
                  width=55).pack(side=tk.LEFT, padx=4)
        ttk.Button(url_f, text="▶ 取得",
                   command=self._fetch).pack(side=tk.LEFT)

        # 抽出設定
        extract_f = tk.LabelFrame(self.root, text="抽出設定",
                                   bg="#161b22", fg="#c9d1d9",
                                   font=("Arial", 9), padx=8, pady=4)
        extract_f.pack(fill=tk.X, padx=8, pady=4)

        tk.Label(extract_f, text="モード:", bg="#161b22", fg="#c9d1d9",
                 font=("Arial", 9)).grid(row=0, column=0, sticky="w")
        self.mode_var = tk.StringVar(value="links")
        modes = [("リンク", "links"), ("テキスト", "text"),
                 ("見出し", "headings"), ("テーブル", "tables"),
                 ("メタタグ", "meta"), ("CSSセレクタ", "css")]
        for i, (label, val) in enumerate(modes):
            tk.Radiobutton(extract_f, text=label, variable=self.mode_var,
                           value=val, bg="#161b22", fg="#c9d1d9",
                           selectcolor="#0d1117",
                           activebackground="#161b22").grid(
                row=0, column=i+1, padx=4)

        tk.Label(extract_f, text="CSSセレクタ:", bg="#161b22", fg="#c9d1d9",
                 font=("Arial", 9)).grid(row=1, column=0, sticky="w", pady=2)
        self.css_var = tk.StringVar(value=".quote")
        ttk.Entry(extract_f, textvariable=self.css_var,
                  width=30).grid(row=1, column=1, columnspan=3, sticky="w", padx=4)
        tk.Label(extract_f, text="属性:", bg="#161b22", fg="#c9d1d9",
                 font=("Arial", 9)).grid(row=1, column=4, sticky="w")
        self.attr_var = tk.StringVar(value="href")
        ttk.Entry(extract_f, textvariable=self.attr_var,
                  width=10).grid(row=1, column=5, sticky="w")
        ttk.Button(extract_f, text="🔍 抽出",
                   command=self._extract).grid(row=1, column=6, padx=8)

        # 結果エリア
        paned = ttk.PanedWindow(self.root, orient=tk.HORIZONTAL)
        paned.pack(fill=tk.BOTH, expand=True, padx=4, pady=4)

        # 左: 結果リスト
        left = tk.Frame(paned, bg="#0d1117")
        paned.add(left, weight=1)
        tk.Label(left, text="抽出結果", bg="#0d1117", fg="#8b949e",
                 font=("Arial", 9)).pack(anchor="w")
        self.result_list = tk.Listbox(
            left, bg="#161b22", fg="#c9d1d9",
            selectbackground="#1f6feb", font=("Arial", 9),
            relief=tk.FLAT, activestyle="none")
        lsb = ttk.Scrollbar(left, command=self.result_list.yview)
        self.result_list.configure(yscrollcommand=lsb.set)
        lsb.pack(side=tk.RIGHT, fill=tk.Y)
        self.result_list.pack(fill=tk.BOTH, expand=True)
        self.result_list.bind("<<ListboxSelect>>", self._on_result_select)

        btn_f = tk.Frame(left, bg="#0d1117", pady=2)
        btn_f.pack(fill=tk.X)
        ttk.Button(btn_f, text="📋 コピー",
                   command=self._copy).pack(side=tk.LEFT, padx=2)
        ttk.Button(btn_f, text="💾 CSV保存",
                   command=self._save_csv).pack(side=tk.LEFT, padx=2)
        ttk.Button(btn_f, text="JSON保存",
                   command=self._save_json).pack(side=tk.LEFT, padx=2)

        # 右: HTMLプレビュー
        right = tk.Frame(paned, bg="#0d1117")
        paned.add(right, weight=1)
        tk.Label(right, text="HTML ソース", bg="#0d1117", fg="#8b949e",
                 font=("Arial", 9)).pack(anchor="w")
        self.html_text = tk.Text(right, bg="#161b22", fg="#6e7681",
                                  font=("Courier New", 8), relief=tk.FLAT,
                                  wrap=tk.NONE, state=tk.DISABLED)
        hxsb = ttk.Scrollbar(right, orient=tk.HORIZONTAL,
                              command=self.html_text.xview)
        hysb = ttk.Scrollbar(right, command=self.html_text.yview)
        self.html_text.configure(xscrollcommand=hxsb.set,
                                  yscrollcommand=hysb.set)
        hysb.pack(side=tk.RIGHT, fill=tk.Y)
        self.html_text.pack(fill=tk.BOTH, expand=True)
        hxsb.pack(fill=tk.X)

        self.progress = ttk.Progressbar(self.root, mode="indeterminate")
        self.progress.pack(fill=tk.X, padx=8)
        self.status_var = tk.StringVar(value="URLを入力して取得してください")
        tk.Label(self.root, textvariable=self.status_var,
                 bg="#21262d", fg="#8b949e", font=("Arial", 9),
                 anchor="w", padx=8).pack(fill=tk.X, side=tk.BOTTOM)

        self._raw_html = ""

    def _fetch(self):
        url = self.url_var.get().strip()
        if not url.startswith("http"):
            messagebox.showerror("エラー", "有効なURLを入力してください")
            return
        self.progress.start()
        self.status_var.set("取得中...")
        threading.Thread(target=self._do_fetch, args=(url,),
                          daemon=True).start()

    def _do_fetch(self, url):
        try:
            req = urllib.request.Request(
                url, headers={"User-Agent": "Mozilla/5.0"})
            with urllib.request.urlopen(req, timeout=10) as resp:
                raw = resp.read()
            # エンコード検出
            for enc in ("utf-8", "shift-jis", "euc-jp", "latin-1"):
                try:
                    text = raw.decode(enc)
                    break
                except Exception:
                    text = raw.decode("utf-8", errors="replace")
            self.root.after(0, self._on_fetched, text)
        except Exception as e:
            self.root.after(0, self.status_var.set, f"エラー: {e}")
            self.root.after(0, self.progress.stop)

    def _on_fetched(self, html_text):
        self.progress.stop()
        self._raw_html = html_text
        self.html_text.configure(state=tk.NORMAL)
        self.html_text.delete("1.0", tk.END)
        self.html_text.insert("1.0", html_text[:100000])
        self.html_text.configure(state=tk.DISABLED)
        self.status_var.set(f"取得完了: {len(html_text):,} 文字")
        self._extract()

    def _extract(self):
        if not self._raw_html:
            messagebox.showwarning("警告", "先にHTMLを取得してください")
            return
        mode = self.mode_var.get()
        results = []
        h = self._raw_html

        if mode == "links":
            results = re.findall(r'href=["\']([^"\']+)["\']', h)
            results = [r for r in results if r.startswith("http")]
        elif mode == "text":
            clean = re.sub(r"<[^>]+>", " ", h)
            clean = re.sub(r"\s+", " ", html.unescape(clean)).strip()
            results = [clean[:5000]]
        elif mode == "headings":
            for tag in ("h1", "h2", "h3", "h4", "h5", "h6"):
                for m in re.finditer(
                        f"<{tag}[^>]*>(.*?)</{tag}>", h,
                        re.IGNORECASE | re.DOTALL):
                    text = re.sub(r"<[^>]+>", "", m.group(1)).strip()
                    if text:
                        results.append(f"[{tag.upper()}] {text[:100]}")
        elif mode == "tables":
            for m in re.finditer(r"<table[^>]*>(.*?)</table>", h,
                                  re.IGNORECASE | re.DOTALL):
                rows = re.findall(r"<tr[^>]*>(.*?)</tr>", m.group(1),
                                   re.IGNORECASE | re.DOTALL)
                for row in rows:
                    cells = re.findall(r"<t[dh][^>]*>(.*?)</t[dh]>", row,
                                       re.IGNORECASE | re.DOTALL)
                    row_text = " | ".join(
                        re.sub(r"<[^>]+>", "", c).strip() for c in cells)
                    if row_text.strip():
                        results.append(row_text)
        elif mode == "meta":
            for m in re.finditer(r"<meta[^>]+>", h, re.IGNORECASE):
                results.append(m.group(0)[:120])
        elif mode == "css":
            # 簡易CSSセレクタ (tag.class / #id / タグのみ)
            selector = self.css_var.get().strip()
            attr = self.attr_var.get().strip()
            tag_match = re.match(r"^([a-z]+)?(?:\.([a-z0-9_-]+))?$",
                                  selector, re.IGNORECASE)
            if tag_match:
                tag = tag_match.group(1) or r"[a-z]+"
                cls = tag_match.group(2)
                if cls:
                    pattern = (f'<{tag}[^>]*class=["\'][^"\']*{cls}[^"\']*["\']'
                                f'[^>]*>(.*?)</{tag}>')
                else:
                    pattern = f"<{tag}[^>]*>(.*?)</{tag}>"
                for m in re.finditer(pattern, h,
                                      re.IGNORECASE | re.DOTALL):
                    inner = re.sub(r"<[^>]+>", "", m.group(1)).strip()
                    inner = re.sub(r"\s+", " ", inner)
                    if attr:
                        # 属性値を取得
                        attr_m = re.search(
                            f'{attr}=["\']([^"\']+)["\']', m.group(0))
                        if attr_m:
                            results.append(attr_m.group(1))
                    else:
                        if inner:
                            results.append(inner[:150])
            else:
                results.append("セレクタの形式が非対応です")

        self._results = results
        self.result_list.delete(0, tk.END)
        for r in results:
            self.result_list.insert(tk.END, r[:120])
        self.status_var.set(f"抽出完了: {len(results)} 件")

    def _on_result_select(self, event=None):
        sel = self.result_list.curselection()
        if not sel or sel[0] >= len(self._results):
            return

    def _copy(self):
        text = "\n".join(self._results)
        self.root.clipboard_clear()
        self.root.clipboard_append(text)
        self.status_var.set("コピーしました")

    def _save_csv(self):
        path = filedialog.asksaveasfilename(
            defaultextension=".csv", filetypes=[("CSV", "*.csv")])
        if not path:
            return
        try:
            with open(path, "w", newline="", encoding="utf-8-sig") as f:
                writer = csv.writer(f)
                writer.writerow(["結果"])
                for r in self._results:
                    writer.writerow([r])
            messagebox.showinfo("完了", f"保存: {path}")
        except Exception as e:
            messagebox.showerror("エラー", str(e))

    def _save_json(self):
        path = filedialog.asksaveasfilename(
            defaultextension=".json", filetypes=[("JSON", "*.json")])
        if not path:
            return
        try:
            with open(path, "w", encoding="utf-8") as f:
                json.dump({"url": self.url_var.get(),
                           "results": self._results},
                           f, ensure_ascii=False, indent=2)
            messagebox.showinfo("完了", f"保存: {path}")
        except Exception as e:
            messagebox.showerror("エラー", str(e))


if __name__ == "__main__":
    root = tk.Tk()
    app = App068(root)
    root.mainloop()

5. コード解説

Webスクレイピングツールのコードを詳しく解説します。クラスベースの設計で各機能を整理して実装しています。

クラス設計とコンストラクタ

App068クラスにアプリの全機能をまとめています。__init__でウィンドウ設定、_build_ui()でUI構築、process()でメイン処理を担当します。責任の分離により、コードが読みやすくなります。

import tkinter as tk
from tkinter import ttk, messagebox, filedialog
import urllib.request
import re
import html
import csv
import threading
import json

try:
    from html.parser import HTMLParser
    HTML_AVAILABLE = True
except ImportError:
    HTML_AVAILABLE = False


class App068:
    """Webスクレイピングツール"""

    def __init__(self, root):
        self.root = root
        self.root.title("Webスクレイピングツール")
        self.root.geometry("1100x680")
        self.root.configure(bg="#0d1117")
        self._results = []
        self._build_ui()

    def _build_ui(self):
        header = tk.Frame(self.root, bg="#161b22", pady=6)
        header.pack(fill=tk.X)
        tk.Label(header, text="🕷 Webスクレイピングツール",
                 font=("Noto Sans JP", 12, "bold"),
                 bg="#161b22", fg="#f0f6fc").pack(side=tk.LEFT, padx=12)

        # URL入力
        url_f = tk.Frame(self.root, bg="#0d1117", pady=4)
        url_f.pack(fill=tk.X, padx=8)
        tk.Label(url_f, text="URL:", bg="#0d1117", fg="#c9d1d9",
                 font=("Arial", 9)).pack(side=tk.LEFT)
        self.url_var = tk.StringVar(
            value="https://quotes.toscrape.com/")
        ttk.Entry(url_f, textvariable=self.url_var,
                  width=55).pack(side=tk.LEFT, padx=4)
        ttk.Button(url_f, text="▶ 取得",
                   command=self._fetch).pack(side=tk.LEFT)

        # 抽出設定
        extract_f = tk.LabelFrame(self.root, text="抽出設定",
                                   bg="#161b22", fg="#c9d1d9",
                                   font=("Arial", 9), padx=8, pady=4)
        extract_f.pack(fill=tk.X, padx=8, pady=4)

        tk.Label(extract_f, text="モード:", bg="#161b22", fg="#c9d1d9",
                 font=("Arial", 9)).grid(row=0, column=0, sticky="w")
        self.mode_var = tk.StringVar(value="links")
        modes = [("リンク", "links"), ("テキスト", "text"),
                 ("見出し", "headings"), ("テーブル", "tables"),
                 ("メタタグ", "meta"), ("CSSセレクタ", "css")]
        for i, (label, val) in enumerate(modes):
            tk.Radiobutton(extract_f, text=label, variable=self.mode_var,
                           value=val, bg="#161b22", fg="#c9d1d9",
                           selectcolor="#0d1117",
                           activebackground="#161b22").grid(
                row=0, column=i+1, padx=4)

        tk.Label(extract_f, text="CSSセレクタ:", bg="#161b22", fg="#c9d1d9",
                 font=("Arial", 9)).grid(row=1, column=0, sticky="w", pady=2)
        self.css_var = tk.StringVar(value=".quote")
        ttk.Entry(extract_f, textvariable=self.css_var,
                  width=30).grid(row=1, column=1, columnspan=3, sticky="w", padx=4)
        tk.Label(extract_f, text="属性:", bg="#161b22", fg="#c9d1d9",
                 font=("Arial", 9)).grid(row=1, column=4, sticky="w")
        self.attr_var = tk.StringVar(value="href")
        ttk.Entry(extract_f, textvariable=self.attr_var,
                  width=10).grid(row=1, column=5, sticky="w")
        ttk.Button(extract_f, text="🔍 抽出",
                   command=self._extract).grid(row=1, column=6, padx=8)

        # 結果エリア
        paned = ttk.PanedWindow(self.root, orient=tk.HORIZONTAL)
        paned.pack(fill=tk.BOTH, expand=True, padx=4, pady=4)

        # 左: 結果リスト
        left = tk.Frame(paned, bg="#0d1117")
        paned.add(left, weight=1)
        tk.Label(left, text="抽出結果", bg="#0d1117", fg="#8b949e",
                 font=("Arial", 9)).pack(anchor="w")
        self.result_list = tk.Listbox(
            left, bg="#161b22", fg="#c9d1d9",
            selectbackground="#1f6feb", font=("Arial", 9),
            relief=tk.FLAT, activestyle="none")
        lsb = ttk.Scrollbar(left, command=self.result_list.yview)
        self.result_list.configure(yscrollcommand=lsb.set)
        lsb.pack(side=tk.RIGHT, fill=tk.Y)
        self.result_list.pack(fill=tk.BOTH, expand=True)
        self.result_list.bind("<<ListboxSelect>>", self._on_result_select)

        btn_f = tk.Frame(left, bg="#0d1117", pady=2)
        btn_f.pack(fill=tk.X)
        ttk.Button(btn_f, text="📋 コピー",
                   command=self._copy).pack(side=tk.LEFT, padx=2)
        ttk.Button(btn_f, text="💾 CSV保存",
                   command=self._save_csv).pack(side=tk.LEFT, padx=2)
        ttk.Button(btn_f, text="JSON保存",
                   command=self._save_json).pack(side=tk.LEFT, padx=2)

        # 右: HTMLプレビュー
        right = tk.Frame(paned, bg="#0d1117")
        paned.add(right, weight=1)
        tk.Label(right, text="HTML ソース", bg="#0d1117", fg="#8b949e",
                 font=("Arial", 9)).pack(anchor="w")
        self.html_text = tk.Text(right, bg="#161b22", fg="#6e7681",
                                  font=("Courier New", 8), relief=tk.FLAT,
                                  wrap=tk.NONE, state=tk.DISABLED)
        hxsb = ttk.Scrollbar(right, orient=tk.HORIZONTAL,
                              command=self.html_text.xview)
        hysb = ttk.Scrollbar(right, command=self.html_text.yview)
        self.html_text.configure(xscrollcommand=hxsb.set,
                                  yscrollcommand=hysb.set)
        hysb.pack(side=tk.RIGHT, fill=tk.Y)
        self.html_text.pack(fill=tk.BOTH, expand=True)
        hxsb.pack(fill=tk.X)

        self.progress = ttk.Progressbar(self.root, mode="indeterminate")
        self.progress.pack(fill=tk.X, padx=8)
        self.status_var = tk.StringVar(value="URLを入力して取得してください")
        tk.Label(self.root, textvariable=self.status_var,
                 bg="#21262d", fg="#8b949e", font=("Arial", 9),
                 anchor="w", padx=8).pack(fill=tk.X, side=tk.BOTTOM)

        self._raw_html = ""

    def _fetch(self):
        url = self.url_var.get().strip()
        if not url.startswith("http"):
            messagebox.showerror("エラー", "有効なURLを入力してください")
            return
        self.progress.start()
        self.status_var.set("取得中...")
        threading.Thread(target=self._do_fetch, args=(url,),
                          daemon=True).start()

    def _do_fetch(self, url):
        try:
            req = urllib.request.Request(
                url, headers={"User-Agent": "Mozilla/5.0"})
            with urllib.request.urlopen(req, timeout=10) as resp:
                raw = resp.read()
            # エンコード検出
            for enc in ("utf-8", "shift-jis", "euc-jp", "latin-1"):
                try:
                    text = raw.decode(enc)
                    break
                except Exception:
                    text = raw.decode("utf-8", errors="replace")
            self.root.after(0, self._on_fetched, text)
        except Exception as e:
            self.root.after(0, self.status_var.set, f"エラー: {e}")
            self.root.after(0, self.progress.stop)

    def _on_fetched(self, html_text):
        self.progress.stop()
        self._raw_html = html_text
        self.html_text.configure(state=tk.NORMAL)
        self.html_text.delete("1.0", tk.END)
        self.html_text.insert("1.0", html_text[:100000])
        self.html_text.configure(state=tk.DISABLED)
        self.status_var.set(f"取得完了: {len(html_text):,} 文字")
        self._extract()

    def _extract(self):
        if not self._raw_html:
            messagebox.showwarning("警告", "先にHTMLを取得してください")
            return
        mode = self.mode_var.get()
        results = []
        h = self._raw_html

        if mode == "links":
            results = re.findall(r'href=["\']([^"\']+)["\']', h)
            results = [r for r in results if r.startswith("http")]
        elif mode == "text":
            clean = re.sub(r"<[^>]+>", " ", h)
            clean = re.sub(r"\s+", " ", html.unescape(clean)).strip()
            results = [clean[:5000]]
        elif mode == "headings":
            for tag in ("h1", "h2", "h3", "h4", "h5", "h6"):
                for m in re.finditer(
                        f"<{tag}[^>]*>(.*?)</{tag}>", h,
                        re.IGNORECASE | re.DOTALL):
                    text = re.sub(r"<[^>]+>", "", m.group(1)).strip()
                    if text:
                        results.append(f"[{tag.upper()}] {text[:100]}")
        elif mode == "tables":
            for m in re.finditer(r"<table[^>]*>(.*?)</table>", h,
                                  re.IGNORECASE | re.DOTALL):
                rows = re.findall(r"<tr[^>]*>(.*?)</tr>", m.group(1),
                                   re.IGNORECASE | re.DOTALL)
                for row in rows:
                    cells = re.findall(r"<t[dh][^>]*>(.*?)</t[dh]>", row,
                                       re.IGNORECASE | re.DOTALL)
                    row_text = " | ".join(
                        re.sub(r"<[^>]+>", "", c).strip() for c in cells)
                    if row_text.strip():
                        results.append(row_text)
        elif mode == "meta":
            for m in re.finditer(r"<meta[^>]+>", h, re.IGNORECASE):
                results.append(m.group(0)[:120])
        elif mode == "css":
            # 簡易CSSセレクタ (tag.class / #id / タグのみ)
            selector = self.css_var.get().strip()
            attr = self.attr_var.get().strip()
            tag_match = re.match(r"^([a-z]+)?(?:\.([a-z0-9_-]+))?$",
                                  selector, re.IGNORECASE)
            if tag_match:
                tag = tag_match.group(1) or r"[a-z]+"
                cls = tag_match.group(2)
                if cls:
                    pattern = (f'<{tag}[^>]*class=["\'][^"\']*{cls}[^"\']*["\']'
                                f'[^>]*>(.*?)</{tag}>')
                else:
                    pattern = f"<{tag}[^>]*>(.*?)</{tag}>"
                for m in re.finditer(pattern, h,
                                      re.IGNORECASE | re.DOTALL):
                    inner = re.sub(r"<[^>]+>", "", m.group(1)).strip()
                    inner = re.sub(r"\s+", " ", inner)
                    if attr:
                        # 属性値を取得
                        attr_m = re.search(
                            f'{attr}=["\']([^"\']+)["\']', m.group(0))
                        if attr_m:
                            results.append(attr_m.group(1))
                    else:
                        if inner:
                            results.append(inner[:150])
            else:
                results.append("セレクタの形式が非対応です")

        self._results = results
        self.result_list.delete(0, tk.END)
        for r in results:
            self.result_list.insert(tk.END, r[:120])
        self.status_var.set(f"抽出完了: {len(results)} 件")

    def _on_result_select(self, event=None):
        sel = self.result_list.curselection()
        if not sel or sel[0] >= len(self._results):
            return

    def _copy(self):
        text = "\n".join(self._results)
        self.root.clipboard_clear()
        self.root.clipboard_append(text)
        self.status_var.set("コピーしました")

    def _save_csv(self):
        path = filedialog.asksaveasfilename(
            defaultextension=".csv", filetypes=[("CSV", "*.csv")])
        if not path:
            return
        try:
            with open(path, "w", newline="", encoding="utf-8-sig") as f:
                writer = csv.writer(f)
                writer.writerow(["結果"])
                for r in self._results:
                    writer.writerow([r])
            messagebox.showinfo("完了", f"保存: {path}")
        except Exception as e:
            messagebox.showerror("エラー", str(e))

    def _save_json(self):
        path = filedialog.asksaveasfilename(
            defaultextension=".json", filetypes=[("JSON", "*.json")])
        if not path:
            return
        try:
            with open(path, "w", encoding="utf-8") as f:
                json.dump({"url": self.url_var.get(),
                           "results": self._results},
                           f, ensure_ascii=False, indent=2)
            messagebox.showinfo("完了", f"保存: {path}")
        except Exception as e:
            messagebox.showerror("エラー", str(e))


if __name__ == "__main__":
    root = tk.Tk()
    app = App068(root)
    root.mainloop()

UIレイアウトの構築

LabelFrameで入力エリアと結果エリアを視覚的に分けています。pack()で縦に並べ、expand=Trueで結果エリアが画面いっぱいに広がるよう設定しています。

import tkinter as tk
from tkinter import ttk, messagebox, filedialog
import urllib.request
import re
import html
import csv
import threading
import json

try:
    from html.parser import HTMLParser
    HTML_AVAILABLE = True
except ImportError:
    HTML_AVAILABLE = False


class App068:
    """Webスクレイピングツール"""

    def __init__(self, root):
        self.root = root
        self.root.title("Webスクレイピングツール")
        self.root.geometry("1100x680")
        self.root.configure(bg="#0d1117")
        self._results = []
        self._build_ui()

    def _build_ui(self):
        header = tk.Frame(self.root, bg="#161b22", pady=6)
        header.pack(fill=tk.X)
        tk.Label(header, text="🕷 Webスクレイピングツール",
                 font=("Noto Sans JP", 12, "bold"),
                 bg="#161b22", fg="#f0f6fc").pack(side=tk.LEFT, padx=12)

        # URL入力
        url_f = tk.Frame(self.root, bg="#0d1117", pady=4)
        url_f.pack(fill=tk.X, padx=8)
        tk.Label(url_f, text="URL:", bg="#0d1117", fg="#c9d1d9",
                 font=("Arial", 9)).pack(side=tk.LEFT)
        self.url_var = tk.StringVar(
            value="https://quotes.toscrape.com/")
        ttk.Entry(url_f, textvariable=self.url_var,
                  width=55).pack(side=tk.LEFT, padx=4)
        ttk.Button(url_f, text="▶ 取得",
                   command=self._fetch).pack(side=tk.LEFT)

        # 抽出設定
        extract_f = tk.LabelFrame(self.root, text="抽出設定",
                                   bg="#161b22", fg="#c9d1d9",
                                   font=("Arial", 9), padx=8, pady=4)
        extract_f.pack(fill=tk.X, padx=8, pady=4)

        tk.Label(extract_f, text="モード:", bg="#161b22", fg="#c9d1d9",
                 font=("Arial", 9)).grid(row=0, column=0, sticky="w")
        self.mode_var = tk.StringVar(value="links")
        modes = [("リンク", "links"), ("テキスト", "text"),
                 ("見出し", "headings"), ("テーブル", "tables"),
                 ("メタタグ", "meta"), ("CSSセレクタ", "css")]
        for i, (label, val) in enumerate(modes):
            tk.Radiobutton(extract_f, text=label, variable=self.mode_var,
                           value=val, bg="#161b22", fg="#c9d1d9",
                           selectcolor="#0d1117",
                           activebackground="#161b22").grid(
                row=0, column=i+1, padx=4)

        tk.Label(extract_f, text="CSSセレクタ:", bg="#161b22", fg="#c9d1d9",
                 font=("Arial", 9)).grid(row=1, column=0, sticky="w", pady=2)
        self.css_var = tk.StringVar(value=".quote")
        ttk.Entry(extract_f, textvariable=self.css_var,
                  width=30).grid(row=1, column=1, columnspan=3, sticky="w", padx=4)
        tk.Label(extract_f, text="属性:", bg="#161b22", fg="#c9d1d9",
                 font=("Arial", 9)).grid(row=1, column=4, sticky="w")
        self.attr_var = tk.StringVar(value="href")
        ttk.Entry(extract_f, textvariable=self.attr_var,
                  width=10).grid(row=1, column=5, sticky="w")
        ttk.Button(extract_f, text="🔍 抽出",
                   command=self._extract).grid(row=1, column=6, padx=8)

        # 結果エリア
        paned = ttk.PanedWindow(self.root, orient=tk.HORIZONTAL)
        paned.pack(fill=tk.BOTH, expand=True, padx=4, pady=4)

        # 左: 結果リスト
        left = tk.Frame(paned, bg="#0d1117")
        paned.add(left, weight=1)
        tk.Label(left, text="抽出結果", bg="#0d1117", fg="#8b949e",
                 font=("Arial", 9)).pack(anchor="w")
        self.result_list = tk.Listbox(
            left, bg="#161b22", fg="#c9d1d9",
            selectbackground="#1f6feb", font=("Arial", 9),
            relief=tk.FLAT, activestyle="none")
        lsb = ttk.Scrollbar(left, command=self.result_list.yview)
        self.result_list.configure(yscrollcommand=lsb.set)
        lsb.pack(side=tk.RIGHT, fill=tk.Y)
        self.result_list.pack(fill=tk.BOTH, expand=True)
        self.result_list.bind("<<ListboxSelect>>", self._on_result_select)

        btn_f = tk.Frame(left, bg="#0d1117", pady=2)
        btn_f.pack(fill=tk.X)
        ttk.Button(btn_f, text="📋 コピー",
                   command=self._copy).pack(side=tk.LEFT, padx=2)
        ttk.Button(btn_f, text="💾 CSV保存",
                   command=self._save_csv).pack(side=tk.LEFT, padx=2)
        ttk.Button(btn_f, text="JSON保存",
                   command=self._save_json).pack(side=tk.LEFT, padx=2)

        # 右: HTMLプレビュー
        right = tk.Frame(paned, bg="#0d1117")
        paned.add(right, weight=1)
        tk.Label(right, text="HTML ソース", bg="#0d1117", fg="#8b949e",
                 font=("Arial", 9)).pack(anchor="w")
        self.html_text = tk.Text(right, bg="#161b22", fg="#6e7681",
                                  font=("Courier New", 8), relief=tk.FLAT,
                                  wrap=tk.NONE, state=tk.DISABLED)
        hxsb = ttk.Scrollbar(right, orient=tk.HORIZONTAL,
                              command=self.html_text.xview)
        hysb = ttk.Scrollbar(right, command=self.html_text.yview)
        self.html_text.configure(xscrollcommand=hxsb.set,
                                  yscrollcommand=hysb.set)
        hysb.pack(side=tk.RIGHT, fill=tk.Y)
        self.html_text.pack(fill=tk.BOTH, expand=True)
        hxsb.pack(fill=tk.X)

        self.progress = ttk.Progressbar(self.root, mode="indeterminate")
        self.progress.pack(fill=tk.X, padx=8)
        self.status_var = tk.StringVar(value="URLを入力して取得してください")
        tk.Label(self.root, textvariable=self.status_var,
                 bg="#21262d", fg="#8b949e", font=("Arial", 9),
                 anchor="w", padx=8).pack(fill=tk.X, side=tk.BOTTOM)

        self._raw_html = ""

    def _fetch(self):
        url = self.url_var.get().strip()
        if not url.startswith("http"):
            messagebox.showerror("エラー", "有効なURLを入力してください")
            return
        self.progress.start()
        self.status_var.set("取得中...")
        threading.Thread(target=self._do_fetch, args=(url,),
                          daemon=True).start()

    def _do_fetch(self, url):
        try:
            req = urllib.request.Request(
                url, headers={"User-Agent": "Mozilla/5.0"})
            with urllib.request.urlopen(req, timeout=10) as resp:
                raw = resp.read()
            # エンコード検出
            for enc in ("utf-8", "shift-jis", "euc-jp", "latin-1"):
                try:
                    text = raw.decode(enc)
                    break
                except Exception:
                    text = raw.decode("utf-8", errors="replace")
            self.root.after(0, self._on_fetched, text)
        except Exception as e:
            self.root.after(0, self.status_var.set, f"エラー: {e}")
            self.root.after(0, self.progress.stop)

    def _on_fetched(self, html_text):
        self.progress.stop()
        self._raw_html = html_text
        self.html_text.configure(state=tk.NORMAL)
        self.html_text.delete("1.0", tk.END)
        self.html_text.insert("1.0", html_text[:100000])
        self.html_text.configure(state=tk.DISABLED)
        self.status_var.set(f"取得完了: {len(html_text):,} 文字")
        self._extract()

    def _extract(self):
        if not self._raw_html:
            messagebox.showwarning("警告", "先にHTMLを取得してください")
            return
        mode = self.mode_var.get()
        results = []
        h = self._raw_html

        if mode == "links":
            results = re.findall(r'href=["\']([^"\']+)["\']', h)
            results = [r for r in results if r.startswith("http")]
        elif mode == "text":
            clean = re.sub(r"<[^>]+>", " ", h)
            clean = re.sub(r"\s+", " ", html.unescape(clean)).strip()
            results = [clean[:5000]]
        elif mode == "headings":
            for tag in ("h1", "h2", "h3", "h4", "h5", "h6"):
                for m in re.finditer(
                        f"<{tag}[^>]*>(.*?)</{tag}>", h,
                        re.IGNORECASE | re.DOTALL):
                    text = re.sub(r"<[^>]+>", "", m.group(1)).strip()
                    if text:
                        results.append(f"[{tag.upper()}] {text[:100]}")
        elif mode == "tables":
            for m in re.finditer(r"<table[^>]*>(.*?)</table>", h,
                                  re.IGNORECASE | re.DOTALL):
                rows = re.findall(r"<tr[^>]*>(.*?)</tr>", m.group(1),
                                   re.IGNORECASE | re.DOTALL)
                for row in rows:
                    cells = re.findall(r"<t[dh][^>]*>(.*?)</t[dh]>", row,
                                       re.IGNORECASE | re.DOTALL)
                    row_text = " | ".join(
                        re.sub(r"<[^>]+>", "", c).strip() for c in cells)
                    if row_text.strip():
                        results.append(row_text)
        elif mode == "meta":
            for m in re.finditer(r"<meta[^>]+>", h, re.IGNORECASE):
                results.append(m.group(0)[:120])
        elif mode == "css":
            # 簡易CSSセレクタ (tag.class / #id / タグのみ)
            selector = self.css_var.get().strip()
            attr = self.attr_var.get().strip()
            tag_match = re.match(r"^([a-z]+)?(?:\.([a-z0-9_-]+))?$",
                                  selector, re.IGNORECASE)
            if tag_match:
                tag = tag_match.group(1) or r"[a-z]+"
                cls = tag_match.group(2)
                if cls:
                    pattern = (f'<{tag}[^>]*class=["\'][^"\']*{cls}[^"\']*["\']'
                                f'[^>]*>(.*?)</{tag}>')
                else:
                    pattern = f"<{tag}[^>]*>(.*?)</{tag}>"
                for m in re.finditer(pattern, h,
                                      re.IGNORECASE | re.DOTALL):
                    inner = re.sub(r"<[^>]+>", "", m.group(1)).strip()
                    inner = re.sub(r"\s+", " ", inner)
                    if attr:
                        # 属性値を取得
                        attr_m = re.search(
                            f'{attr}=["\']([^"\']+)["\']', m.group(0))
                        if attr_m:
                            results.append(attr_m.group(1))
                    else:
                        if inner:
                            results.append(inner[:150])
            else:
                results.append("セレクタの形式が非対応です")

        self._results = results
        self.result_list.delete(0, tk.END)
        for r in results:
            self.result_list.insert(tk.END, r[:120])
        self.status_var.set(f"抽出完了: {len(results)} 件")

    def _on_result_select(self, event=None):
        sel = self.result_list.curselection()
        if not sel or sel[0] >= len(self._results):
            return

    def _copy(self):
        text = "\n".join(self._results)
        self.root.clipboard_clear()
        self.root.clipboard_append(text)
        self.status_var.set("コピーしました")

    def _save_csv(self):
        path = filedialog.asksaveasfilename(
            defaultextension=".csv", filetypes=[("CSV", "*.csv")])
        if not path:
            return
        try:
            with open(path, "w", newline="", encoding="utf-8-sig") as f:
                writer = csv.writer(f)
                writer.writerow(["結果"])
                for r in self._results:
                    writer.writerow([r])
            messagebox.showinfo("完了", f"保存: {path}")
        except Exception as e:
            messagebox.showerror("エラー", str(e))

    def _save_json(self):
        path = filedialog.asksaveasfilename(
            defaultextension=".json", filetypes=[("JSON", "*.json")])
        if not path:
            return
        try:
            with open(path, "w", encoding="utf-8") as f:
                json.dump({"url": self.url_var.get(),
                           "results": self._results},
                           f, ensure_ascii=False, indent=2)
            messagebox.showinfo("完了", f"保存: {path}")
        except Exception as e:
            messagebox.showerror("エラー", str(e))


if __name__ == "__main__":
    root = tk.Tk()
    app = App068(root)
    root.mainloop()

イベント処理

ボタンのcommand引数でクリックイベントを、bind('')でEnterキーイベントを処理します。どちらの操作でも同じprocess()が呼ばれ、コードの重複を避けられます。

import tkinter as tk
from tkinter import ttk, messagebox, filedialog
import urllib.request
import re
import html
import csv
import threading
import json

try:
    from html.parser import HTMLParser
    HTML_AVAILABLE = True
except ImportError:
    HTML_AVAILABLE = False


class App068:
    """Webスクレイピングツール"""

    def __init__(self, root):
        self.root = root
        self.root.title("Webスクレイピングツール")
        self.root.geometry("1100x680")
        self.root.configure(bg="#0d1117")
        self._results = []
        self._build_ui()

    def _build_ui(self):
        header = tk.Frame(self.root, bg="#161b22", pady=6)
        header.pack(fill=tk.X)
        tk.Label(header, text="🕷 Webスクレイピングツール",
                 font=("Noto Sans JP", 12, "bold"),
                 bg="#161b22", fg="#f0f6fc").pack(side=tk.LEFT, padx=12)

        # URL入力
        url_f = tk.Frame(self.root, bg="#0d1117", pady=4)
        url_f.pack(fill=tk.X, padx=8)
        tk.Label(url_f, text="URL:", bg="#0d1117", fg="#c9d1d9",
                 font=("Arial", 9)).pack(side=tk.LEFT)
        self.url_var = tk.StringVar(
            value="https://quotes.toscrape.com/")
        ttk.Entry(url_f, textvariable=self.url_var,
                  width=55).pack(side=tk.LEFT, padx=4)
        ttk.Button(url_f, text="▶ 取得",
                   command=self._fetch).pack(side=tk.LEFT)

        # 抽出設定
        extract_f = tk.LabelFrame(self.root, text="抽出設定",
                                   bg="#161b22", fg="#c9d1d9",
                                   font=("Arial", 9), padx=8, pady=4)
        extract_f.pack(fill=tk.X, padx=8, pady=4)

        tk.Label(extract_f, text="モード:", bg="#161b22", fg="#c9d1d9",
                 font=("Arial", 9)).grid(row=0, column=0, sticky="w")
        self.mode_var = tk.StringVar(value="links")
        modes = [("リンク", "links"), ("テキスト", "text"),
                 ("見出し", "headings"), ("テーブル", "tables"),
                 ("メタタグ", "meta"), ("CSSセレクタ", "css")]
        for i, (label, val) in enumerate(modes):
            tk.Radiobutton(extract_f, text=label, variable=self.mode_var,
                           value=val, bg="#161b22", fg="#c9d1d9",
                           selectcolor="#0d1117",
                           activebackground="#161b22").grid(
                row=0, column=i+1, padx=4)

        tk.Label(extract_f, text="CSSセレクタ:", bg="#161b22", fg="#c9d1d9",
                 font=("Arial", 9)).grid(row=1, column=0, sticky="w", pady=2)
        self.css_var = tk.StringVar(value=".quote")
        ttk.Entry(extract_f, textvariable=self.css_var,
                  width=30).grid(row=1, column=1, columnspan=3, sticky="w", padx=4)
        tk.Label(extract_f, text="属性:", bg="#161b22", fg="#c9d1d9",
                 font=("Arial", 9)).grid(row=1, column=4, sticky="w")
        self.attr_var = tk.StringVar(value="href")
        ttk.Entry(extract_f, textvariable=self.attr_var,
                  width=10).grid(row=1, column=5, sticky="w")
        ttk.Button(extract_f, text="🔍 抽出",
                   command=self._extract).grid(row=1, column=6, padx=8)

        # 結果エリア
        paned = ttk.PanedWindow(self.root, orient=tk.HORIZONTAL)
        paned.pack(fill=tk.BOTH, expand=True, padx=4, pady=4)

        # 左: 結果リスト
        left = tk.Frame(paned, bg="#0d1117")
        paned.add(left, weight=1)
        tk.Label(left, text="抽出結果", bg="#0d1117", fg="#8b949e",
                 font=("Arial", 9)).pack(anchor="w")
        self.result_list = tk.Listbox(
            left, bg="#161b22", fg="#c9d1d9",
            selectbackground="#1f6feb", font=("Arial", 9),
            relief=tk.FLAT, activestyle="none")
        lsb = ttk.Scrollbar(left, command=self.result_list.yview)
        self.result_list.configure(yscrollcommand=lsb.set)
        lsb.pack(side=tk.RIGHT, fill=tk.Y)
        self.result_list.pack(fill=tk.BOTH, expand=True)
        self.result_list.bind("<<ListboxSelect>>", self._on_result_select)

        btn_f = tk.Frame(left, bg="#0d1117", pady=2)
        btn_f.pack(fill=tk.X)
        ttk.Button(btn_f, text="📋 コピー",
                   command=self._copy).pack(side=tk.LEFT, padx=2)
        ttk.Button(btn_f, text="💾 CSV保存",
                   command=self._save_csv).pack(side=tk.LEFT, padx=2)
        ttk.Button(btn_f, text="JSON保存",
                   command=self._save_json).pack(side=tk.LEFT, padx=2)

        # 右: HTMLプレビュー
        right = tk.Frame(paned, bg="#0d1117")
        paned.add(right, weight=1)
        tk.Label(right, text="HTML ソース", bg="#0d1117", fg="#8b949e",
                 font=("Arial", 9)).pack(anchor="w")
        self.html_text = tk.Text(right, bg="#161b22", fg="#6e7681",
                                  font=("Courier New", 8), relief=tk.FLAT,
                                  wrap=tk.NONE, state=tk.DISABLED)
        hxsb = ttk.Scrollbar(right, orient=tk.HORIZONTAL,
                              command=self.html_text.xview)
        hysb = ttk.Scrollbar(right, command=self.html_text.yview)
        self.html_text.configure(xscrollcommand=hxsb.set,
                                  yscrollcommand=hysb.set)
        hysb.pack(side=tk.RIGHT, fill=tk.Y)
        self.html_text.pack(fill=tk.BOTH, expand=True)
        hxsb.pack(fill=tk.X)

        self.progress = ttk.Progressbar(self.root, mode="indeterminate")
        self.progress.pack(fill=tk.X, padx=8)
        self.status_var = tk.StringVar(value="URLを入力して取得してください")
        tk.Label(self.root, textvariable=self.status_var,
                 bg="#21262d", fg="#8b949e", font=("Arial", 9),
                 anchor="w", padx=8).pack(fill=tk.X, side=tk.BOTTOM)

        self._raw_html = ""

    def _fetch(self):
        url = self.url_var.get().strip()
        if not url.startswith("http"):
            messagebox.showerror("エラー", "有効なURLを入力してください")
            return
        self.progress.start()
        self.status_var.set("取得中...")
        threading.Thread(target=self._do_fetch, args=(url,),
                          daemon=True).start()

    def _do_fetch(self, url):
        try:
            req = urllib.request.Request(
                url, headers={"User-Agent": "Mozilla/5.0"})
            with urllib.request.urlopen(req, timeout=10) as resp:
                raw = resp.read()
            # エンコード検出
            for enc in ("utf-8", "shift-jis", "euc-jp", "latin-1"):
                try:
                    text = raw.decode(enc)
                    break
                except Exception:
                    text = raw.decode("utf-8", errors="replace")
            self.root.after(0, self._on_fetched, text)
        except Exception as e:
            self.root.after(0, self.status_var.set, f"エラー: {e}")
            self.root.after(0, self.progress.stop)

    def _on_fetched(self, html_text):
        self.progress.stop()
        self._raw_html = html_text
        self.html_text.configure(state=tk.NORMAL)
        self.html_text.delete("1.0", tk.END)
        self.html_text.insert("1.0", html_text[:100000])
        self.html_text.configure(state=tk.DISABLED)
        self.status_var.set(f"取得完了: {len(html_text):,} 文字")
        self._extract()

    def _extract(self):
        if not self._raw_html:
            messagebox.showwarning("警告", "先にHTMLを取得してください")
            return
        mode = self.mode_var.get()
        results = []
        h = self._raw_html

        if mode == "links":
            results = re.findall(r'href=["\']([^"\']+)["\']', h)
            results = [r for r in results if r.startswith("http")]
        elif mode == "text":
            clean = re.sub(r"<[^>]+>", " ", h)
            clean = re.sub(r"\s+", " ", html.unescape(clean)).strip()
            results = [clean[:5000]]
        elif mode == "headings":
            for tag in ("h1", "h2", "h3", "h4", "h5", "h6"):
                for m in re.finditer(
                        f"<{tag}[^>]*>(.*?)</{tag}>", h,
                        re.IGNORECASE | re.DOTALL):
                    text = re.sub(r"<[^>]+>", "", m.group(1)).strip()
                    if text:
                        results.append(f"[{tag.upper()}] {text[:100]}")
        elif mode == "tables":
            for m in re.finditer(r"<table[^>]*>(.*?)</table>", h,
                                  re.IGNORECASE | re.DOTALL):
                rows = re.findall(r"<tr[^>]*>(.*?)</tr>", m.group(1),
                                   re.IGNORECASE | re.DOTALL)
                for row in rows:
                    cells = re.findall(r"<t[dh][^>]*>(.*?)</t[dh]>", row,
                                       re.IGNORECASE | re.DOTALL)
                    row_text = " | ".join(
                        re.sub(r"<[^>]+>", "", c).strip() for c in cells)
                    if row_text.strip():
                        results.append(row_text)
        elif mode == "meta":
            for m in re.finditer(r"<meta[^>]+>", h, re.IGNORECASE):
                results.append(m.group(0)[:120])
        elif mode == "css":
            # 簡易CSSセレクタ (tag.class / #id / タグのみ)
            selector = self.css_var.get().strip()
            attr = self.attr_var.get().strip()
            tag_match = re.match(r"^([a-z]+)?(?:\.([a-z0-9_-]+))?$",
                                  selector, re.IGNORECASE)
            if tag_match:
                tag = tag_match.group(1) or r"[a-z]+"
                cls = tag_match.group(2)
                if cls:
                    pattern = (f'<{tag}[^>]*class=["\'][^"\']*{cls}[^"\']*["\']'
                                f'[^>]*>(.*?)</{tag}>')
                else:
                    pattern = f"<{tag}[^>]*>(.*?)</{tag}>"
                for m in re.finditer(pattern, h,
                                      re.IGNORECASE | re.DOTALL):
                    inner = re.sub(r"<[^>]+>", "", m.group(1)).strip()
                    inner = re.sub(r"\s+", " ", inner)
                    if attr:
                        # 属性値を取得
                        attr_m = re.search(
                            f'{attr}=["\']([^"\']+)["\']', m.group(0))
                        if attr_m:
                            results.append(attr_m.group(1))
                    else:
                        if inner:
                            results.append(inner[:150])
            else:
                results.append("セレクタの形式が非対応です")

        self._results = results
        self.result_list.delete(0, tk.END)
        for r in results:
            self.result_list.insert(tk.END, r[:120])
        self.status_var.set(f"抽出完了: {len(results)} 件")

    def _on_result_select(self, event=None):
        sel = self.result_list.curselection()
        if not sel or sel[0] >= len(self._results):
            return

    def _copy(self):
        text = "\n".join(self._results)
        self.root.clipboard_clear()
        self.root.clipboard_append(text)
        self.status_var.set("コピーしました")

    def _save_csv(self):
        path = filedialog.asksaveasfilename(
            defaultextension=".csv", filetypes=[("CSV", "*.csv")])
        if not path:
            return
        try:
            with open(path, "w", newline="", encoding="utf-8-sig") as f:
                writer = csv.writer(f)
                writer.writerow(["結果"])
                for r in self._results:
                    writer.writerow([r])
            messagebox.showinfo("完了", f"保存: {path}")
        except Exception as e:
            messagebox.showerror("エラー", str(e))

    def _save_json(self):
        path = filedialog.asksaveasfilename(
            defaultextension=".json", filetypes=[("JSON", "*.json")])
        if not path:
            return
        try:
            with open(path, "w", encoding="utf-8") as f:
                json.dump({"url": self.url_var.get(),
                           "results": self._results},
                           f, ensure_ascii=False, indent=2)
            messagebox.showinfo("完了", f"保存: {path}")
        except Exception as e:
            messagebox.showerror("エラー", str(e))


if __name__ == "__main__":
    root = tk.Tk()
    app = App068(root)
    root.mainloop()

Textウィジェットでの結果表示

tk.Textウィジェットをstate=DISABLED(読み取り専用)で作成し、更新時はNORMALに変更してinsert()で内容を書き込み、再びDISABLEDに戻します。

import tkinter as tk
from tkinter import ttk, messagebox, filedialog
import urllib.request
import re
import html
import csv
import threading
import json

try:
    from html.parser import HTMLParser
    HTML_AVAILABLE = True
except ImportError:
    HTML_AVAILABLE = False


class App068:
    """Webスクレイピングツール"""

    def __init__(self, root):
        self.root = root
        self.root.title("Webスクレイピングツール")
        self.root.geometry("1100x680")
        self.root.configure(bg="#0d1117")
        self._results = []
        self._build_ui()

    def _build_ui(self):
        header = tk.Frame(self.root, bg="#161b22", pady=6)
        header.pack(fill=tk.X)
        tk.Label(header, text="🕷 Webスクレイピングツール",
                 font=("Noto Sans JP", 12, "bold"),
                 bg="#161b22", fg="#f0f6fc").pack(side=tk.LEFT, padx=12)

        # URL入力
        url_f = tk.Frame(self.root, bg="#0d1117", pady=4)
        url_f.pack(fill=tk.X, padx=8)
        tk.Label(url_f, text="URL:", bg="#0d1117", fg="#c9d1d9",
                 font=("Arial", 9)).pack(side=tk.LEFT)
        self.url_var = tk.StringVar(
            value="https://quotes.toscrape.com/")
        ttk.Entry(url_f, textvariable=self.url_var,
                  width=55).pack(side=tk.LEFT, padx=4)
        ttk.Button(url_f, text="▶ 取得",
                   command=self._fetch).pack(side=tk.LEFT)

        # 抽出設定
        extract_f = tk.LabelFrame(self.root, text="抽出設定",
                                   bg="#161b22", fg="#c9d1d9",
                                   font=("Arial", 9), padx=8, pady=4)
        extract_f.pack(fill=tk.X, padx=8, pady=4)

        tk.Label(extract_f, text="モード:", bg="#161b22", fg="#c9d1d9",
                 font=("Arial", 9)).grid(row=0, column=0, sticky="w")
        self.mode_var = tk.StringVar(value="links")
        modes = [("リンク", "links"), ("テキスト", "text"),
                 ("見出し", "headings"), ("テーブル", "tables"),
                 ("メタタグ", "meta"), ("CSSセレクタ", "css")]
        for i, (label, val) in enumerate(modes):
            tk.Radiobutton(extract_f, text=label, variable=self.mode_var,
                           value=val, bg="#161b22", fg="#c9d1d9",
                           selectcolor="#0d1117",
                           activebackground="#161b22").grid(
                row=0, column=i+1, padx=4)

        tk.Label(extract_f, text="CSSセレクタ:", bg="#161b22", fg="#c9d1d9",
                 font=("Arial", 9)).grid(row=1, column=0, sticky="w", pady=2)
        self.css_var = tk.StringVar(value=".quote")
        ttk.Entry(extract_f, textvariable=self.css_var,
                  width=30).grid(row=1, column=1, columnspan=3, sticky="w", padx=4)
        tk.Label(extract_f, text="属性:", bg="#161b22", fg="#c9d1d9",
                 font=("Arial", 9)).grid(row=1, column=4, sticky="w")
        self.attr_var = tk.StringVar(value="href")
        ttk.Entry(extract_f, textvariable=self.attr_var,
                  width=10).grid(row=1, column=5, sticky="w")
        ttk.Button(extract_f, text="🔍 抽出",
                   command=self._extract).grid(row=1, column=6, padx=8)

        # 結果エリア
        paned = ttk.PanedWindow(self.root, orient=tk.HORIZONTAL)
        paned.pack(fill=tk.BOTH, expand=True, padx=4, pady=4)

        # 左: 結果リスト
        left = tk.Frame(paned, bg="#0d1117")
        paned.add(left, weight=1)
        tk.Label(left, text="抽出結果", bg="#0d1117", fg="#8b949e",
                 font=("Arial", 9)).pack(anchor="w")
        self.result_list = tk.Listbox(
            left, bg="#161b22", fg="#c9d1d9",
            selectbackground="#1f6feb", font=("Arial", 9),
            relief=tk.FLAT, activestyle="none")
        lsb = ttk.Scrollbar(left, command=self.result_list.yview)
        self.result_list.configure(yscrollcommand=lsb.set)
        lsb.pack(side=tk.RIGHT, fill=tk.Y)
        self.result_list.pack(fill=tk.BOTH, expand=True)
        self.result_list.bind("<<ListboxSelect>>", self._on_result_select)

        btn_f = tk.Frame(left, bg="#0d1117", pady=2)
        btn_f.pack(fill=tk.X)
        ttk.Button(btn_f, text="📋 コピー",
                   command=self._copy).pack(side=tk.LEFT, padx=2)
        ttk.Button(btn_f, text="💾 CSV保存",
                   command=self._save_csv).pack(side=tk.LEFT, padx=2)
        ttk.Button(btn_f, text="JSON保存",
                   command=self._save_json).pack(side=tk.LEFT, padx=2)

        # 右: HTMLプレビュー
        right = tk.Frame(paned, bg="#0d1117")
        paned.add(right, weight=1)
        tk.Label(right, text="HTML ソース", bg="#0d1117", fg="#8b949e",
                 font=("Arial", 9)).pack(anchor="w")
        self.html_text = tk.Text(right, bg="#161b22", fg="#6e7681",
                                  font=("Courier New", 8), relief=tk.FLAT,
                                  wrap=tk.NONE, state=tk.DISABLED)
        hxsb = ttk.Scrollbar(right, orient=tk.HORIZONTAL,
                              command=self.html_text.xview)
        hysb = ttk.Scrollbar(right, command=self.html_text.yview)
        self.html_text.configure(xscrollcommand=hxsb.set,
                                  yscrollcommand=hysb.set)
        hysb.pack(side=tk.RIGHT, fill=tk.Y)
        self.html_text.pack(fill=tk.BOTH, expand=True)
        hxsb.pack(fill=tk.X)

        self.progress = ttk.Progressbar(self.root, mode="indeterminate")
        self.progress.pack(fill=tk.X, padx=8)
        self.status_var = tk.StringVar(value="URLを入力して取得してください")
        tk.Label(self.root, textvariable=self.status_var,
                 bg="#21262d", fg="#8b949e", font=("Arial", 9),
                 anchor="w", padx=8).pack(fill=tk.X, side=tk.BOTTOM)

        self._raw_html = ""

    def _fetch(self):
        url = self.url_var.get().strip()
        if not url.startswith("http"):
            messagebox.showerror("エラー", "有効なURLを入力してください")
            return
        self.progress.start()
        self.status_var.set("取得中...")
        threading.Thread(target=self._do_fetch, args=(url,),
                          daemon=True).start()

    def _do_fetch(self, url):
        try:
            req = urllib.request.Request(
                url, headers={"User-Agent": "Mozilla/5.0"})
            with urllib.request.urlopen(req, timeout=10) as resp:
                raw = resp.read()
            # エンコード検出
            for enc in ("utf-8", "shift-jis", "euc-jp", "latin-1"):
                try:
                    text = raw.decode(enc)
                    break
                except Exception:
                    text = raw.decode("utf-8", errors="replace")
            self.root.after(0, self._on_fetched, text)
        except Exception as e:
            self.root.after(0, self.status_var.set, f"エラー: {e}")
            self.root.after(0, self.progress.stop)

    def _on_fetched(self, html_text):
        self.progress.stop()
        self._raw_html = html_text
        self.html_text.configure(state=tk.NORMAL)
        self.html_text.delete("1.0", tk.END)
        self.html_text.insert("1.0", html_text[:100000])
        self.html_text.configure(state=tk.DISABLED)
        self.status_var.set(f"取得完了: {len(html_text):,} 文字")
        self._extract()

    def _extract(self):
        if not self._raw_html:
            messagebox.showwarning("警告", "先にHTMLを取得してください")
            return
        mode = self.mode_var.get()
        results = []
        h = self._raw_html

        if mode == "links":
            results = re.findall(r'href=["\']([^"\']+)["\']', h)
            results = [r for r in results if r.startswith("http")]
        elif mode == "text":
            clean = re.sub(r"<[^>]+>", " ", h)
            clean = re.sub(r"\s+", " ", html.unescape(clean)).strip()
            results = [clean[:5000]]
        elif mode == "headings":
            for tag in ("h1", "h2", "h3", "h4", "h5", "h6"):
                for m in re.finditer(
                        f"<{tag}[^>]*>(.*?)</{tag}>", h,
                        re.IGNORECASE | re.DOTALL):
                    text = re.sub(r"<[^>]+>", "", m.group(1)).strip()
                    if text:
                        results.append(f"[{tag.upper()}] {text[:100]}")
        elif mode == "tables":
            for m in re.finditer(r"<table[^>]*>(.*?)</table>", h,
                                  re.IGNORECASE | re.DOTALL):
                rows = re.findall(r"<tr[^>]*>(.*?)</tr>", m.group(1),
                                   re.IGNORECASE | re.DOTALL)
                for row in rows:
                    cells = re.findall(r"<t[dh][^>]*>(.*?)</t[dh]>", row,
                                       re.IGNORECASE | re.DOTALL)
                    row_text = " | ".join(
                        re.sub(r"<[^>]+>", "", c).strip() for c in cells)
                    if row_text.strip():
                        results.append(row_text)
        elif mode == "meta":
            for m in re.finditer(r"<meta[^>]+>", h, re.IGNORECASE):
                results.append(m.group(0)[:120])
        elif mode == "css":
            # 簡易CSSセレクタ (tag.class / #id / タグのみ)
            selector = self.css_var.get().strip()
            attr = self.attr_var.get().strip()
            tag_match = re.match(r"^([a-z]+)?(?:\.([a-z0-9_-]+))?$",
                                  selector, re.IGNORECASE)
            if tag_match:
                tag = tag_match.group(1) or r"[a-z]+"
                cls = tag_match.group(2)
                if cls:
                    pattern = (f'<{tag}[^>]*class=["\'][^"\']*{cls}[^"\']*["\']'
                                f'[^>]*>(.*?)</{tag}>')
                else:
                    pattern = f"<{tag}[^>]*>(.*?)</{tag}>"
                for m in re.finditer(pattern, h,
                                      re.IGNORECASE | re.DOTALL):
                    inner = re.sub(r"<[^>]+>", "", m.group(1)).strip()
                    inner = re.sub(r"\s+", " ", inner)
                    if attr:
                        # 属性値を取得
                        attr_m = re.search(
                            f'{attr}=["\']([^"\']+)["\']', m.group(0))
                        if attr_m:
                            results.append(attr_m.group(1))
                    else:
                        if inner:
                            results.append(inner[:150])
            else:
                results.append("セレクタの形式が非対応です")

        self._results = results
        self.result_list.delete(0, tk.END)
        for r in results:
            self.result_list.insert(tk.END, r[:120])
        self.status_var.set(f"抽出完了: {len(results)} 件")

    def _on_result_select(self, event=None):
        sel = self.result_list.curselection()
        if not sel or sel[0] >= len(self._results):
            return

    def _copy(self):
        text = "\n".join(self._results)
        self.root.clipboard_clear()
        self.root.clipboard_append(text)
        self.status_var.set("コピーしました")

    def _save_csv(self):
        path = filedialog.asksaveasfilename(
            defaultextension=".csv", filetypes=[("CSV", "*.csv")])
        if not path:
            return
        try:
            with open(path, "w", newline="", encoding="utf-8-sig") as f:
                writer = csv.writer(f)
                writer.writerow(["結果"])
                for r in self._results:
                    writer.writerow([r])
            messagebox.showinfo("完了", f"保存: {path}")
        except Exception as e:
            messagebox.showerror("エラー", str(e))

    def _save_json(self):
        path = filedialog.asksaveasfilename(
            defaultextension=".json", filetypes=[("JSON", "*.json")])
        if not path:
            return
        try:
            with open(path, "w", encoding="utf-8") as f:
                json.dump({"url": self.url_var.get(),
                           "results": self._results},
                           f, ensure_ascii=False, indent=2)
            messagebox.showinfo("完了", f"保存: {path}")
        except Exception as e:
            messagebox.showerror("エラー", str(e))


if __name__ == "__main__":
    root = tk.Tk()
    app = App068(root)
    root.mainloop()

例外処理とエラーハンドリング

try-exceptでValueErrorとExceptionを捕捉し、messagebox.showerror()でエラーメッセージを表示します。予期しないエラーも処理することで、アプリの堅牢性が向上します。

import tkinter as tk
from tkinter import ttk, messagebox, filedialog
import urllib.request
import re
import html
import csv
import threading
import json

try:
    from html.parser import HTMLParser
    HTML_AVAILABLE = True
except ImportError:
    HTML_AVAILABLE = False


class App068:
    """Webスクレイピングツール"""

    def __init__(self, root):
        self.root = root
        self.root.title("Webスクレイピングツール")
        self.root.geometry("1100x680")
        self.root.configure(bg="#0d1117")
        self._results = []
        self._build_ui()

    def _build_ui(self):
        header = tk.Frame(self.root, bg="#161b22", pady=6)
        header.pack(fill=tk.X)
        tk.Label(header, text="🕷 Webスクレイピングツール",
                 font=("Noto Sans JP", 12, "bold"),
                 bg="#161b22", fg="#f0f6fc").pack(side=tk.LEFT, padx=12)

        # URL入力
        url_f = tk.Frame(self.root, bg="#0d1117", pady=4)
        url_f.pack(fill=tk.X, padx=8)
        tk.Label(url_f, text="URL:", bg="#0d1117", fg="#c9d1d9",
                 font=("Arial", 9)).pack(side=tk.LEFT)
        self.url_var = tk.StringVar(
            value="https://quotes.toscrape.com/")
        ttk.Entry(url_f, textvariable=self.url_var,
                  width=55).pack(side=tk.LEFT, padx=4)
        ttk.Button(url_f, text="▶ 取得",
                   command=self._fetch).pack(side=tk.LEFT)

        # 抽出設定
        extract_f = tk.LabelFrame(self.root, text="抽出設定",
                                   bg="#161b22", fg="#c9d1d9",
                                   font=("Arial", 9), padx=8, pady=4)
        extract_f.pack(fill=tk.X, padx=8, pady=4)

        tk.Label(extract_f, text="モード:", bg="#161b22", fg="#c9d1d9",
                 font=("Arial", 9)).grid(row=0, column=0, sticky="w")
        self.mode_var = tk.StringVar(value="links")
        modes = [("リンク", "links"), ("テキスト", "text"),
                 ("見出し", "headings"), ("テーブル", "tables"),
                 ("メタタグ", "meta"), ("CSSセレクタ", "css")]
        for i, (label, val) in enumerate(modes):
            tk.Radiobutton(extract_f, text=label, variable=self.mode_var,
                           value=val, bg="#161b22", fg="#c9d1d9",
                           selectcolor="#0d1117",
                           activebackground="#161b22").grid(
                row=0, column=i+1, padx=4)

        tk.Label(extract_f, text="CSSセレクタ:", bg="#161b22", fg="#c9d1d9",
                 font=("Arial", 9)).grid(row=1, column=0, sticky="w", pady=2)
        self.css_var = tk.StringVar(value=".quote")
        ttk.Entry(extract_f, textvariable=self.css_var,
                  width=30).grid(row=1, column=1, columnspan=3, sticky="w", padx=4)
        tk.Label(extract_f, text="属性:", bg="#161b22", fg="#c9d1d9",
                 font=("Arial", 9)).grid(row=1, column=4, sticky="w")
        self.attr_var = tk.StringVar(value="href")
        ttk.Entry(extract_f, textvariable=self.attr_var,
                  width=10).grid(row=1, column=5, sticky="w")
        ttk.Button(extract_f, text="🔍 抽出",
                   command=self._extract).grid(row=1, column=6, padx=8)

        # 結果エリア
        paned = ttk.PanedWindow(self.root, orient=tk.HORIZONTAL)
        paned.pack(fill=tk.BOTH, expand=True, padx=4, pady=4)

        # 左: 結果リスト
        left = tk.Frame(paned, bg="#0d1117")
        paned.add(left, weight=1)
        tk.Label(left, text="抽出結果", bg="#0d1117", fg="#8b949e",
                 font=("Arial", 9)).pack(anchor="w")
        self.result_list = tk.Listbox(
            left, bg="#161b22", fg="#c9d1d9",
            selectbackground="#1f6feb", font=("Arial", 9),
            relief=tk.FLAT, activestyle="none")
        lsb = ttk.Scrollbar(left, command=self.result_list.yview)
        self.result_list.configure(yscrollcommand=lsb.set)
        lsb.pack(side=tk.RIGHT, fill=tk.Y)
        self.result_list.pack(fill=tk.BOTH, expand=True)
        self.result_list.bind("<<ListboxSelect>>", self._on_result_select)

        btn_f = tk.Frame(left, bg="#0d1117", pady=2)
        btn_f.pack(fill=tk.X)
        ttk.Button(btn_f, text="📋 コピー",
                   command=self._copy).pack(side=tk.LEFT, padx=2)
        ttk.Button(btn_f, text="💾 CSV保存",
                   command=self._save_csv).pack(side=tk.LEFT, padx=2)
        ttk.Button(btn_f, text="JSON保存",
                   command=self._save_json).pack(side=tk.LEFT, padx=2)

        # 右: HTMLプレビュー
        right = tk.Frame(paned, bg="#0d1117")
        paned.add(right, weight=1)
        tk.Label(right, text="HTML ソース", bg="#0d1117", fg="#8b949e",
                 font=("Arial", 9)).pack(anchor="w")
        self.html_text = tk.Text(right, bg="#161b22", fg="#6e7681",
                                  font=("Courier New", 8), relief=tk.FLAT,
                                  wrap=tk.NONE, state=tk.DISABLED)
        hxsb = ttk.Scrollbar(right, orient=tk.HORIZONTAL,
                              command=self.html_text.xview)
        hysb = ttk.Scrollbar(right, command=self.html_text.yview)
        self.html_text.configure(xscrollcommand=hxsb.set,
                                  yscrollcommand=hysb.set)
        hysb.pack(side=tk.RIGHT, fill=tk.Y)
        self.html_text.pack(fill=tk.BOTH, expand=True)
        hxsb.pack(fill=tk.X)

        self.progress = ttk.Progressbar(self.root, mode="indeterminate")
        self.progress.pack(fill=tk.X, padx=8)
        self.status_var = tk.StringVar(value="URLを入力して取得してください")
        tk.Label(self.root, textvariable=self.status_var,
                 bg="#21262d", fg="#8b949e", font=("Arial", 9),
                 anchor="w", padx=8).pack(fill=tk.X, side=tk.BOTTOM)

        self._raw_html = ""

    def _fetch(self):
        url = self.url_var.get().strip()
        if not url.startswith("http"):
            messagebox.showerror("エラー", "有効なURLを入力してください")
            return
        self.progress.start()
        self.status_var.set("取得中...")
        threading.Thread(target=self._do_fetch, args=(url,),
                          daemon=True).start()

    def _do_fetch(self, url):
        try:
            req = urllib.request.Request(
                url, headers={"User-Agent": "Mozilla/5.0"})
            with urllib.request.urlopen(req, timeout=10) as resp:
                raw = resp.read()
            # エンコード検出
            for enc in ("utf-8", "shift-jis", "euc-jp", "latin-1"):
                try:
                    text = raw.decode(enc)
                    break
                except Exception:
                    text = raw.decode("utf-8", errors="replace")
            self.root.after(0, self._on_fetched, text)
        except Exception as e:
            self.root.after(0, self.status_var.set, f"エラー: {e}")
            self.root.after(0, self.progress.stop)

    def _on_fetched(self, html_text):
        self.progress.stop()
        self._raw_html = html_text
        self.html_text.configure(state=tk.NORMAL)
        self.html_text.delete("1.0", tk.END)
        self.html_text.insert("1.0", html_text[:100000])
        self.html_text.configure(state=tk.DISABLED)
        self.status_var.set(f"取得完了: {len(html_text):,} 文字")
        self._extract()

    def _extract(self):
        if not self._raw_html:
            messagebox.showwarning("警告", "先にHTMLを取得してください")
            return
        mode = self.mode_var.get()
        results = []
        h = self._raw_html

        if mode == "links":
            results = re.findall(r'href=["\']([^"\']+)["\']', h)
            results = [r for r in results if r.startswith("http")]
        elif mode == "text":
            clean = re.sub(r"<[^>]+>", " ", h)
            clean = re.sub(r"\s+", " ", html.unescape(clean)).strip()
            results = [clean[:5000]]
        elif mode == "headings":
            for tag in ("h1", "h2", "h3", "h4", "h5", "h6"):
                for m in re.finditer(
                        f"<{tag}[^>]*>(.*?)</{tag}>", h,
                        re.IGNORECASE | re.DOTALL):
                    text = re.sub(r"<[^>]+>", "", m.group(1)).strip()
                    if text:
                        results.append(f"[{tag.upper()}] {text[:100]}")
        elif mode == "tables":
            for m in re.finditer(r"<table[^>]*>(.*?)</table>", h,
                                  re.IGNORECASE | re.DOTALL):
                rows = re.findall(r"<tr[^>]*>(.*?)</tr>", m.group(1),
                                   re.IGNORECASE | re.DOTALL)
                for row in rows:
                    cells = re.findall(r"<t[dh][^>]*>(.*?)</t[dh]>", row,
                                       re.IGNORECASE | re.DOTALL)
                    row_text = " | ".join(
                        re.sub(r"<[^>]+>", "", c).strip() for c in cells)
                    if row_text.strip():
                        results.append(row_text)
        elif mode == "meta":
            for m in re.finditer(r"<meta[^>]+>", h, re.IGNORECASE):
                results.append(m.group(0)[:120])
        elif mode == "css":
            # 簡易CSSセレクタ (tag.class / #id / タグのみ)
            selector = self.css_var.get().strip()
            attr = self.attr_var.get().strip()
            tag_match = re.match(r"^([a-z]+)?(?:\.([a-z0-9_-]+))?$",
                                  selector, re.IGNORECASE)
            if tag_match:
                tag = tag_match.group(1) or r"[a-z]+"
                cls = tag_match.group(2)
                if cls:
                    pattern = (f'<{tag}[^>]*class=["\'][^"\']*{cls}[^"\']*["\']'
                                f'[^>]*>(.*?)</{tag}>')
                else:
                    pattern = f"<{tag}[^>]*>(.*?)</{tag}>"
                for m in re.finditer(pattern, h,
                                      re.IGNORECASE | re.DOTALL):
                    inner = re.sub(r"<[^>]+>", "", m.group(1)).strip()
                    inner = re.sub(r"\s+", " ", inner)
                    if attr:
                        # 属性値を取得
                        attr_m = re.search(
                            f'{attr}=["\']([^"\']+)["\']', m.group(0))
                        if attr_m:
                            results.append(attr_m.group(1))
                    else:
                        if inner:
                            results.append(inner[:150])
            else:
                results.append("セレクタの形式が非対応です")

        self._results = results
        self.result_list.delete(0, tk.END)
        for r in results:
            self.result_list.insert(tk.END, r[:120])
        self.status_var.set(f"抽出完了: {len(results)} 件")

    def _on_result_select(self, event=None):
        sel = self.result_list.curselection()
        if not sel or sel[0] >= len(self._results):
            return

    def _copy(self):
        text = "\n".join(self._results)
        self.root.clipboard_clear()
        self.root.clipboard_append(text)
        self.status_var.set("コピーしました")

    def _save_csv(self):
        path = filedialog.asksaveasfilename(
            defaultextension=".csv", filetypes=[("CSV", "*.csv")])
        if not path:
            return
        try:
            with open(path, "w", newline="", encoding="utf-8-sig") as f:
                writer = csv.writer(f)
                writer.writerow(["結果"])
                for r in self._results:
                    writer.writerow([r])
            messagebox.showinfo("完了", f"保存: {path}")
        except Exception as e:
            messagebox.showerror("エラー", str(e))

    def _save_json(self):
        path = filedialog.asksaveasfilename(
            defaultextension=".json", filetypes=[("JSON", "*.json")])
        if not path:
            return
        try:
            with open(path, "w", encoding="utf-8") as f:
                json.dump({"url": self.url_var.get(),
                           "results": self._results},
                           f, ensure_ascii=False, indent=2)
            messagebox.showinfo("完了", f"保存: {path}")
        except Exception as e:
            messagebox.showerror("エラー", str(e))


if __name__ == "__main__":
    root = tk.Tk()
    app = App068(root)
    root.mainloop()

6. ステップバイステップガイド

このアプリをゼロから自分で作る手順を解説します。コードをコピーするだけでなく、実際に手順を追って自分で書いてみましょう。

  1. 1
    ファイルを作成する

    新しいファイルを作成して app068.py と保存します。

  2. 2
    クラスの骨格を作る

    App068クラスを定義し、__init__とmainloop()の最小構成を作ります。

  3. 3
    タイトルバーを作る

    Frameを使ってカラーバー付きのタイトルエリアを作ります。

  4. 4
    入力フォームを実装する

    LabelFrameとEntryウィジェットで入力エリアを作ります。

  5. 5
    処理ロジックを実装する

    _execute()メソッドにメインロジックを実装します。

  6. 6
    結果表示を実装する

    TextウィジェットかLabelに結果を表示する_show_result()を実装します。

  7. 7
    エラー処理を追加する

    try-exceptとmessageboxでエラーハンドリングを追加します。

7. カスタマイズアイデア

基本機能を習得したら、以下のカスタマイズに挑戦してみましょう。

💡 ダークモードを追加する

bg色・fg色を辞書で管理し、ボタン1つでダークモード・ライトモードを切り替えられるようにしましょう。

💡 データの保存機能

処理結果をCSV・TXTファイルに保存する機能を追加しましょう。filedialog.asksaveasfilename()でファイル保存ダイアログが使えます。

💡 設定ダイアログ

フォントサイズや色などの設定をユーザーが変更できるオプションダイアログを追加しましょう。

8. よくある問題と解決法

❌ 日本語フォントが表示されない

原因:システムに日本語フォントが見つからない場合があります。

解決法:font引数を省略するかシステムに合ったフォントを指定してください。

❌ ライブラリのインポートエラー

原因:必要なライブラリがインストールされていません。

解決法:pip install コマンドで必要なライブラリをインストールしてください。 (pip install requests)

❌ ウィンドウサイズが合わない

原因:画面解像度や表示スケールによって異なる場合があります。

解決法:root.geometry()で適切なサイズに調整してください。

9. 練習問題

アプリの理解を深めるための練習問題です。

  1. 課題1:機能拡張

    Webスクレイピングツールに新しい機能を1つ追加してみましょう。

  2. 課題2:UIの改善

    色・フォント・レイアウトを変更して、より使いやすいUIにカスタマイズしましょう。

  3. 課題3:保存機能の追加

    処理結果をファイルに保存する機能を追加しましょう。

🚀
次に挑戦するアプリ

このアプリをマスターしたら、次のアプリに挑戦しましょう。