对抗搜索和博弈 - 查表和缓存

作者: 西部小笼包 | 来源:发表于2023-09-26 22:11 被阅读0次

查表

我们的AI算法在开局会有搜索空间过大的情况，所以不能及时去发现比较好的解。这个时候我们可以用人类的经验去辅助它。因为开局的格式比较小，我们可以很容易枚举完开局的前3步的可能性。这样可以很好的提高它在开局时的速度。
我们为了可以减少枚举的状态，可以做一个对旋转，翻转，平移的映射。这样可以使得本来需要枚举的情况可以极大的减少。比如说开局下斜着的4个角是等价。开局下水平和竖直也是等价。因为他们都可以通过旋转，平移，和翻转操作变化得到。

public enum PositionTranslator {
    CLOCKWISE_0{
        @Override
        Position translate(int y, int x) {
            return new Position(y, x);
        }

        @Override
        Position deTranslate(int y, int x) {
            return CLOCKWISE_0.translate(y, x);
        }
    }, CLOCKWISE_90{
        @Override
        Position translate(int y, int x) {
            return new Position(-x, y);
        }

        @Override
        Position deTranslate(int y, int x) {
            return CLOCKWISE_270.translate(y, x);
        }
    }, CLOCKWISE_180{
        @Override
        Position translate(int y, int x) {
            return new Position(-y,-x);
        }

        @Override
        Position deTranslate(int y, int x) {
            return CLOCKWISE_180.translate(y, x);
        }
    }, CLOCKWISE_270{
        @Override
        Position translate(int y, int x) {
            return new Position(x, -y);
        }

        @Override
        Position deTranslate(int y, int x) {
            return CLOCKWISE_90.translate(y, x);
        }
    };

    abstract Position translate(int y, int x);
    abstract Position deTranslate(int y, int x);

    public static int toDeltaX(int x) {
        return x - 7;
    }
    public static int toDeltaY(int y) {
        return 7 - y;
    }
    public static int toOriginX(int dx) {
        return dx + 7;
    }
    public static int toOriginY(int dy) {
        return 7 - dy;
    }

    public Position originDeTranslateThenOrigin(Position p) {
        Position res = deTranslate(toDeltaY(p.y), toDeltaX(p.x));
        return new Position(toOriginY(res.y), toOriginX(res.x), p.winning);
    }

    public Position originTranslateThenOrigin(int y, int x) {
        Position res = translate(toDeltaY(y), toDeltaX(x));
        return new Position(toOriginY(res.y), toOriginX(res.x));
    }

    public Piece originTranslateThenOrigin(Piece p) {
        int y = p.getY(), x = p.getX();
        Position res = originTranslateThenOrigin(y, x);
        return new Piece(res.x, res.y, p.getPlayer());
    }

    public String translateToString(int dy, int dx) {
        Position pos = translate(dy, dx);
        return toString(toOriginY(pos.y) , toOriginX(pos.x));
    }

    public static String toString(int y, int x) {
        assert y >= 0 && y < 15;
        assert x >= 0 && x < 15;
        String ys = (y + 1) + "";
        char xs = (char) ('a' + x);
        return xs + ys;
    }

    public static PositionTranslator select(int dy, int dx) {
        if (dx > 0 && dy <= 0) return CLOCKWISE_0;
        else if (dx >= 0 && dy > 0) return CLOCKWISE_90;
        else if (dx < 0 && dy >= 0) return CLOCKWISE_180;
        else if (dx <= 0 && dy < 0) return CLOCKWISE_270;
        return CLOCKWISE_0;
    }

    public static PositionTranslator selectByOrigin(int y, int x) {
        return select(toDeltaY(y), toDeltaX(x));
    }
}

下面我们可以在开局的时候避免大量的搜索。这里我们可以去查一些基本棋谱，找到比较好的开局下法。然后这需要用一个坐标象限的枚举，然后算法会根据旋转，平移，翻转等操作找到正确的落子位置。
这个时候我们必须要维护2套棋盘，一个是内部计算用的棋盘，我们统一翻转到右下角的象限。另一个是展现给用户看的棋盘。是回归到用户下棋位置对应的AI落子。
举个例子，如果用户开始下在，中心点的左上。我们会在实际棋盘里，下在右下，然后去查棋谱，应该怎么下。如果查到应该下在正下，在实际棋盘里下正下。但是我们呈现给用户看的就是正上。

public class ChessboardByteArrayAlgoConsistent extends ChessboardByteArrayAlgo implements IConsistentAlgo {
    @Getter
    private PositionTranslator positionTranslator;
    private Deque<Position> originAllSteps = new ArrayDeque<>();
    public ChessboardByteArrayAlgoConsistent(int size) {
        super(size);
    }

    @Override
    public void setPiece(Piece piece) {
        originAllSteps.offerLast(new Position(piece.getY(), piece.getX()));
        if (allSteps.isEmpty()) {
            if (piece.getY() != 7 || piece.getX() != 7) {
                positionTranslator = PositionTranslator.selectByOrigin(piece.getY(), piece.getX());
            } else {
                super.setPiece(piece);
                return;
            }
        }
        if (positionTranslator == null && allSteps.size() == 1) {
            positionTranslator = PositionTranslator.selectByOrigin(piece.getY(), piece.getX());
        }
        Piece np = positionTranslator.originTranslateThenOrigin(piece);
        allSteps.offerLast(new Position(np.getY(), np.getX()));
        location[getIdx(np.getY(),np.getX())] = (byte) piece.getPlayer().getId();
    }

    @Override
    public boolean isTerminal(Piece piece) {
        if (positionTranslator == null) return false;
        Piece np = positionTranslator.originTranslateThenOrigin(piece);
        return isTerminal(np.getX(), np.getY());
    }

    @Override
    public boolean isLegalMove(Piece piece) {
        if (positionTranslator == null) return true;
        Piece np = positionTranslator.originTranslateThenOrigin(piece);
        return isLegalMove(np.getX(), np.getY());
    }

    @Override
    public ChessboardByteArrayAlgoConsistent clone() {
        ChessboardByteArrayAlgoConsistent cloned = new ChessboardByteArrayAlgoConsistent(size);
        for (int i = 0; i < size; i++) {
            for (int j = 0; j < size; j++) {
                cloned.setPiece(j, i, getValInBoard(j, i));
            }
        }
        cloned.allSteps = new ArrayDeque<>(allSteps);
        cloned.originAllSteps = new ArrayDeque<>(originAllSteps);
        cloned.positionTranslator = positionTranslator;
        return cloned;
    }

    @Override
    public String generateStepsCode() {
        return BoardToString.serialize(originAllSteps);
    }
}

同时我们写一个算法，要求做到，只要满足表里的棋子样式（就是经过平移，翻转或旋转都是一个图案的情况下），找到对应的落子。
举个例子，比如表里的落子为：

1695821880828.png

当用户下成下图，我们应该可以查表找到对应红圈的位置

1695821942914.png
下面是算法实现

public class ConsistentPattern {

    public static Position findBestMove(List<Position> shape, Map<String, Position> patchedKey) {
        return findBestMoveInternal(shape.stream().map(i -> new int[]{i.x, i.y}).collect(Collectors.toList()), patchedKey);
    }

    private static Position findBestMoveInternal(List<int[]> shape, Map<String, Position> patchedKey) {
        int[] blackXs = new int[(shape.size() + 1) >> 1];
        int[] blackYs = new int[(shape.size() + 1) >> 1];
        int[] whiteXs = new int[shape.size() >> 1];
        int[] whiteYs = new int[shape.size() >> 1];
        int[] blackOuts = new int[blackXs.length];
        int[] whiteOuts = new int[whiteXs.length];

        for (int rotateType = 0; rotateType < 8; ++rotateType) {
            // 先旋转
            fulfillXYs(shape, rotateType, blackXs, blackYs, whiteXs, whiteYs);
            // 再位移
            int[] myx = fulfillOuts(blackXs, blackYs, whiteXs, whiteYs, blackOuts, whiteOuts);

            String candidate = getString(blackOuts, whiteOuts);
            if (!patchedKey.containsKey(candidate)) continue;
            Position bestRawMovePos = patchedKey.get(candidate);
            int[] bestRawMove = new int[]{bestRawMovePos.x, bestRawMovePos.y};
            // 先位移
            bestRawMove[0] += myx[0];
            bestRawMove[1] += myx[1];
            // 再旋转
            deRotate(rotateType, bestRawMove);

            return new Position(bestRawMove[1], bestRawMove[0]);
        }
        return null;
    }

    private static void rotate(int rotateType, int[] bestRawMove) {
        //x y, x -y, -x y, -x -y, y x, y -x, -y x, -y -x
        int x = bestRawMove[0], y = bestRawMove[1];
        int nx = rotateType <=1 ? x : rotateType <=3 ? -x : rotateType <=5 ? y : -y;
        int ny = rotateType <=3 ? (rotateType %2==0 ? y : -y) : (rotateType %2==0 ? x : -x);
        bestRawMove[0] = nx;
        bestRawMove[1] = ny;
    }

    private static void deRotate(int rotateType, int[] bestRawMove) {
        //after: x y, x -y, -x y, -x -y, y x, y -x, -y x, -y -x
        //apply: x y, x -y, -x y, -x -y, y x, -y x, y -x, -y -x
        //res  : x y, x  y,  x y,  x  y, x y,  x y, x  y, x  y
        int x = bestRawMove[0], y = bestRawMove[1];
        int nx = rotateType <=1 ? x : rotateType <=3 ? -x : rotateType %2 == 0 ? y : -y;
        int ny = rotateType <=3 ? (rotateType %2==0 ? y : -y) : (rotateType <= 5 ? x : -x);
        bestRawMove[0] = nx;
        bestRawMove[1] = ny;
    }

    public static String canonical(List<Position> shape) {
        return canonicalInternal(shape.stream().map(i -> new int[]{i.x, i.y}).collect(Collectors.toList()), new int[2]);
    }

    public static String canonical(List<Position> shape, int[] bestXandY) {
        return canonicalInternal(shape.stream().map(i -> new int[]{i.x, i.y}).collect(Collectors.toList()), bestXandY);
    }

    private static String canonicalInternal(List<int[]> shape, int[] bestXandY) {
        String ans = "";
        int[] blackXs = new int[(shape.size() + 1) >> 1];
        int[] blackYs = new int[(shape.size() + 1) >> 1];
        int[] whiteXs = new int[shape.size() >> 1];
        int[] whiteYs = new int[shape.size() >> 1];
        int[] blackOuts = new int[blackXs.length];
        int[] whiteOuts = new int[whiteXs.length];
        int[] tmpBest = bestXandY.clone();
        for (int rotateType = 0; rotateType < 8; ++rotateType) {
            fulfillXYs(shape, rotateType, blackXs, blackYs, whiteXs, whiteYs);

            int[] myx = fulfillOuts(blackXs, blackYs, whiteXs, whiteYs, blackOuts, whiteOuts);

            String candidate = getString(blackOuts, whiteOuts);
            if (ans.compareTo(candidate) < 0) {
                ans = candidate;
                int[] tmp = bestXandY.clone();
                rotate(rotateType, tmp);
                tmp[0] -= myx[0];
                tmp[1] -= myx[1];
                tmpBest = tmp;
            }
        }
        bestXandY[0] = tmpBest[0];
        bestXandY[1] = tmpBest[1];
        return ans;
    }

    private static String getString(int[] blackOuts, int[] whiteOuts) {
        String candidate = new StringBuilder(Arrays.toString(blackOuts)).append(":").append(Arrays.toString(whiteOuts)).toString() ;
        return candidate;
    }

    private static int[] fulfillOuts(int[] blackXs, int[] blackYs, int[] whiteXs, int[] whiteYs, int[] blackOuts, int[] whiteOuts) {
        int mx = blackXs[0], my = blackYs[0];
        for (int x: blackXs) mx = Math.min(mx, x);
        for (int x: whiteXs) mx = Math.min(mx, x);

        for (int y: blackYs) my = Math.min(my, y);
        for (int y: whiteYs) my = Math.min(my, y);

        for (int j = 0; j < blackOuts.length; j++) {
            blackOuts[j] = (blackYs[j] - my) * 15 + (blackXs[j] - mx);
        }
        for (int j = 0; j < whiteOuts.length; j++) {
            whiteOuts[j] = (whiteYs[j] - my) * 15 + (whiteXs[j] - mx);
        }
        Arrays.sort(blackOuts);
        Arrays.sort(whiteOuts);
        return new int[]{mx, my};
    }


    private static void fulfillXYs(List<int[]> shape, int c, int[] blackXs,  int[] blackYs, int[] whiteXs, int[] whiteYs) {
        int t = 0;
        for (int i = 0; i < shape.size(); i += 2) {
            int[] z = shape.get(i);
            int x = z[0], y = z[1];
            //x y, x -y, -x y, -x -y, y x, y -x, -y x, -y -x
            blackXs[t] = c <=1 ? x : c <=3 ? -x : c <=5 ? y : -y;
            blackYs[t++] = c <=3 ? (c %2==0 ? y : -y) : (c %2==0 ? x : -x);
        }
        t = 0;
        for (int i = 1; i < shape.size(); i += 2) {
            int[] z = shape.get(i);
            int x = z[0], y = z[1];
            //x y, x -y, -x y, -x -y, y x, y -x, -y x, -y -x
            whiteXs[t] = c <=1 ? x : c <=3 ? -x : c <=5 ? y : -y;
            whiteYs[t++] = c <=3 ? (c %2==0 ? y : -y) : (c %2==0 ? x : -x);
        }
    }
}

有了这个我们可以快速的把一些开局棋谱，运用到各种维度上了。

Zobrist缓存

最后说一下缓存的优化。棋盘类里有个非常有名的缓存叫zobrist

Zobrist缓存（Zobrist Hashing）是一种在计算机博弈和搜索算法中常用的技术，用于高效地存储和检索游戏局面的信息，以提高搜索算法的性能。它的名称来自于其发明者Albert L. Zobrist，他是计算机博弈领域的先驱之一。

Zobrist缓存的主要思想是使用随机生成的哈希键（称为Zobrist键）来表示棋盘上每个局部位置的状态。这个哈希键通常是一个固定位数的二进制数或整数，它可以唯一地标识棋盘上的每种可能局面。Zobrist键的生成是随机的，但在程序的每次运行中都是固定的，以确保相同的局面生成相同的键。

Zobrist缓存的工作流程如下：

初始化：在程序启动时，为每个可能的棋盘局面生成随机的Zobrist键，通常是一个固定位数的二进制数。这些键被存储在一个称为Zobrist表的数据结构中。

计算局面哈希：对于给定的棋盘局面，通过将每个局部位置的Zobrist键按位异或（XOR）在一起来计算整个局面的哈希键。这个哈希键可以用来唯一地标识该局面。

缓存和检索：在搜索算法中，每次评估一个新的局面时，可以使用局面的哈希键来检查Zobrist表，看是否已经计算过这个局面的评估值。如果已经计算过，就可以直接从缓存中获取评估值，而不必重新计算。这大大提高了搜索算法的效率，特别是在深度搜索中。

更新缓存：如果在搜索过程中发现了一个新的局面，可以计算其评估值并将局面的哈希键与评估值存储在Zobrist表中，以供后续检索使用。

Zobrist缓存在博弈树搜索算法（如博弈树搜索、Alpha-Beta剪枝、Minimax等）中广泛应用，因为它能够有效地减少了冗余计算，提高了搜索的速度。然而，需要注意的是，Zobrist缓存可能存在哈希冲突，即不同的局面生成相同的哈希键，因此需要一种方法来处理冲突，通常是通过使用开放寻址法或链表来解决。

下面是代码实现：

public class Zobrist {
    public static int DISABLE_CACHE_MASK = 100_000_000;
    private final long[] zobristTable;
    @Getter
    private Map<Long, ResultAndDepth> transpositionTable;

    @Getter
    private long hash = 0;

    @Getter
    private int cacheMatch = 0;

    private int size;
    public Zobrist(IChessboardAIAlgo chessBoardAlgo) {
        size = chessBoardAlgo.getSize();
        zobristTable = new long[size * size * 2];
        transpositionTable = new HashMap<>();
        SecureRandom secureRandom = new SecureRandom();
        for (int i = 0; i < zobristTable.length; i++) {
            zobristTable[i] = secureRandom.nextLong();
        }
        calculateInitialHash(chessBoardAlgo);
    }

    public void updateHash(int y, int x, int role) {
        if (role < 1 || role > 2) {
            throw new IllegalArgumentException("Invalid move");
        }
        hash ^= zobristTable[(y * size + x) * 2 + (role - 1)];
    }

    private void calculateInitialHash(IChessboardAIAlgo chessBoardAlgo) {
        for (int y = 0; y < size; y++) {
            for (int x = 0; x < size; x++) {
                int i = y * size + x;
                int val = chessBoardAlgo.getValInBoard(x, y);
                if (val != 0) {
                    hash ^= zobristTable[(i << 1) + (val - 1)];
                }
            }
        }
    }

    public Optional<Integer> tryGet(int depth) {
        if (!transpositionTable.containsKey(hash)) return Optional.empty();
        ResultAndDepth scoreAndDepth = transpositionTable.get(hash);
        if (scoreAndDepth.depth >= depth) {
            cacheMatch++;
            return Optional.of(scoreAndDepth.result);
        }
        return Optional.empty();
    }

    public int setAndReturnScore(int result, int depth) {
        if (Math.abs(result) != DISABLE_CACHE_MASK)
            transpositionTable.put(hash, new ResultAndDepth(result, depth));
        return result;
    }

    private class ResultAndDepth {
        int result;
        int depth;

        public ResultAndDepth(int result, int depth) {
            this.result = result;
            this.depth = depth;
        }
    }

}

优化算杀深度

要做出五子棋强力AI，算杀模块真的非常重要；如果他能在规定时间算出杀解，那么基本就很早可以胜券在握了。五子棋无禁手黑棋有着很大优势。所以我们要利用好查表和算杀，就可以基本做出一个黑棋先手必胜的AI了。
那么我们上一版设计的算杀AI会有2个缺陷，他可以算出杀解，但是我们指定深度，对手可以通过不断冲四，去消耗掉我们的深度，最终会让我们的算法无法判断是否可以杀棋，而返回不能算杀的结论。
那么要优化这一个缺陷的技术，叫单步拓展。也就是如果对手步数是冲四，我们就不消耗递归的层数。但是这样就会造成时间上的极大退化。
原因是因为，假设对手有2步冲四，算杀过程中，这2步冲四，可以插进其他任意的算杀步子里。这样排列组合的结果非常大。假设算杀需要15步棋，那么2个冲四，可以任意使用，大概会多17 * 18 的复杂度的提高。让原来1秒的计算，变成5分钟。当然冲四更多，则计算让费更大。
这时ZOBRIST缓存就发挥了很好的效果。因为他只记录棋盘状态，不管你算杀是发生在哪一步，如果这个局面之前计算过，他就可以提早剪枝。
下面我们来看下带ZOBRIST缓存的算杀代码。

public class VCX extends RecursiveBaseAIAlgo implements IWinningAlgo {

    private final int TIME_LIMIT_MS = 55_000;
    private int nextPoint = -1;
    private final long[] zobristTable;
    private long hash = 0;
    private long startTime = 0;
    private Map<Long, Boolean> zobristCache;
    private int firstDepth;
    private int timeFactor;
    private VCXCachedScoreManager vcxCachedScoreManager;

    @Setter
    private DebugContext debugContext = DebugContext.DISABLE;

    public VCX(IChessboardAIAlgo chessBoardAlgo, int humanColor) {
        this(chessBoardAlgo, humanColor, 23, VCXOptimization.FAST);
    }
    public VCX(IChessboardAIAlgo chessBoardAlgo, int humanColor, int firstDepth) {
        this(chessBoardAlgo, humanColor, firstDepth, VCXOptimization.FAST);
    }
    public VCX(IChessboardAIAlgo chessBoardAlgo, int humanColor, int firstDepth, VCXOptimization killOptimization) {
        super(chessBoardAlgo, new VCXCachedScoreManager(chessBoardAlgo, humanColor, killOptimization), humanColor);
        this.vcxCachedScoreManager = (VCXCachedScoreManager) scoreManager;
        this.firstDepth = firstDepth;
        this.timeFactor = killOptimization.factor;
        zobristTable = new long[size * size * 2];
        SecureRandom secureRandom = new SecureRandom();
        for (int i = 0; i < zobristTable.length; i++)
            zobristTable[i] = secureRandom.nextLong();
    }

    @Override
    public boolean isAbsoluteForcedWin() {
        return true;
    }

    @Override
    public Position aiFindPos() {
        nextPoint = -1;
        int oriFirstDepth = firstDepth;
        startTime = System.currentTimeMillis();
        // 迭代加深
        for (int i = Math.max(oriFirstDepth - 16, 5); i <= oriFirstDepth; i += 4) {
            firstDepth = i;
            zobristCache = new HashMap<>();
            if (aiWin(aiColor, i, -1)) break;
            if ((System.currentTimeMillis() - startTime) * timeFactor > TIME_LIMIT_MS) break;
        }
        firstDepth = oriFirstDepth;
        if (nextPoint == -1)
            return Position.EMPTY;
        return new Position(getY(nextPoint), getX(nextPoint), true);
    }

    // lastMaxPoint, 为了优化性能, 方便剪枝
    boolean aiWin(int role, int depth, int lastMaxPoint) {
        // 因为会根据lastMaxPoint 进行进攻剪枝，所以CACHE里的输不一定是输
        Boolean cached = zobristCache.get(hash);
        if (cached != null) return cached;

        if (depth <= 0) return setAndReturn(false);

        List<Long> candidates = vcxCachedScoreManager.findAIKillSteps(lastMaxPoint);
        if (!candidates.isEmpty() && score(candidates.get(0)) >= Score.FOUR.value) {
            if (depth == firstDepth)
                nextPoint = pos(candidates.get(0));
            return setAndReturn(true);
        }

        if (candidates.isEmpty()) return setAndReturn(false);

        debugContext.debugStartInfo(depth, firstDepth);

        int maxPoint = -1;
        for (long p : candidates) {
            if (Thread.currentThread().isInterrupted()) return false;
            if (System.currentTimeMillis() - startTime > TIME_LIMIT_MS) return false;
            int pos = pos(p), score = score(p);

            if (!debugContext.isInDebugStep(depth, firstDepth, pos)) continue;

            int y = getY(pos), x = getX(pos);
            addPiece(y, x, pos,true);
            if (score > -Score.FIVE.value) {
                maxPoint = pos;
            }
            boolean humanLose = humanLose(Player.enemyColor(role), depth - 1, maxPoint);

            debugContext.debugResultInfo(depth, y, x, firstDepth, String.format("human lose: %s", humanLose));

            removePiece(y, x, pos, true);
            if (humanLose) {
                if (depth == firstDepth)
                    nextPoint = pos;
                return setAndReturn(true);
            }
        }
        return setAndReturn(false);
    }

    private boolean humanLose(int role, int depth, int lastMaxPoint) {
        Boolean cached = zobristCache.get(hash);
        if (cached != null) return cached;
        // 超过回合数，代表防守成功
        if (depth <= 0) return setAndReturn(false);
        List<Long> candidates = vcxCachedScoreManager.findHumanDefendSteps();
        // 如果发现对面没有进攻手段（活三，冲四，活四），则代表防守成功
        if (candidates.isEmpty()) return setAndReturn(false);
        // 如果对面能成五，发现自己有成五；
        // 如果对面不能成五，发现自己有活四；
        if (-1 * score(candidates.get(0)) >= Score.FOUR.value)
            return setAndReturn(false);
        debugContext.debugStartInfo(depth, firstDepth);
        for (long p : candidates) {
            int pos = pos(p);
            if (!debugContext.isInDebugStep(depth, firstDepth, pos)) continue;
            int y = getY(pos), x = getX(pos);
            addPiece(y, x, pos, false);
            boolean aiWin = aiWin(Player.enemyColor(role), depth - 1, lastMaxPoint);
            debugContext.debugResultInfo(depth, y, x, firstDepth, String.format("ai Win:%s",aiWin));
            removePiece(y, x, pos, false);
            if (!aiWin) return setAndReturn(false);
        }
        return setAndReturn(true);
    }


    protected void addPiece(int y, int x, int nextStep, boolean isAI) {
        int color = isAI ? aiColor : humanColor;
        hash ^= zobristTable[(nextStep << 1) | (color - 1)];
        super.addPiece(y, x, isAI);

    }

    protected void removePiece(int y, int x, int nextStep, boolean isAI) {
        int color = isAI ? aiColor : humanColor;
        hash ^= zobristTable[(nextStep << 1) | (color - 1)];
        super.removePiece(y, x, isAI);
    }

    private boolean setAndReturn(boolean res) {
        zobristCache.put(hash, res);
        return res;
    }
}

迭代加深

上面算杀过程中还运用了一个提速的技巧，如果我们7步可以算出杀解，要是一开始走错了路子。他会搜的非常深，其实那个正确的路子只要7步。这时可以用迭代加深的搜索来解决这个问题。

迭代加深算法（Iterative Deepening Search，简称IDS）是一种用于解决搜索问题的深度优先搜索（Depth-First Search，DFS）算法的改进版本。它的主要思想是通过不断增加搜索深度来逐渐扩展搜索范围，直到找到解决方案或达到最大搜索深度为止。IDS结合了DFS的简单性和广度优先搜索（Breadth-First Search，BFS）的逐层扩展特性，因此通常用于在有限搜索空间中找到解决方案。

IDS的基本工作流程如下：

从初始节点开始，设置初始搜索深度为1。

使用深度优先搜索（DFS）来探索搜索树，但限制搜索深度不超过当前设定的深度。

如果在当前深度下找到了解决方案，就停止搜索并返回解决方案。

如果在当前深度下没有找到解决方案，增加搜索深度，并回到步骤2，继续搜索。

重复步骤2到步骤4，直到找到解决方案或达到最大搜索深度为止。

IDS的优点包括：

完备性：IDS保证会找到解决方案，如果解存在于搜索空间中。这是因为它逐渐增加搜索深度，最终会探索整个搜索树。

空间效率：与BFS不同，IDS不需要在内存中存储整个搜索树，只需要存储当前深度的节点，因此对内存的要求较低。

可控制的搜索深度：IDS允许你在搜索过程中控制搜索的深度，这在处理搜索空间大小未知或不确定的情况下很有用。

然而，IDS的主要缺点是其多次重复搜索相同的节点，因为每次增加深度时都会重新搜索之前的部分。这可能会导致性能问题，特别是在搜索空间庞大的情况下。

但是这个额外性能开销不算高。原因是因为对于许多状态空间，大多数节点位于底层。所以上层的重复其实并不重要。经过我自己测试，如果搜不到解，使用了迭代加深的时间开销只多15%，但是在有解的情况下，他能极快的返回结果。

对抗搜索和博弈 - 查表和缓存

查表

Zobrist缓存

优化算杀深度

迭代加深

相关文章

网友评论

延伸阅读

深度阅读

栏目导航

热点阅读