1P5: Word Changer

This was written as part of the First Periodic Premier Programming Puzzle Push.

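The Game

You are given a starting and an ending word of the same length. The objective of the game is to change one letter of the starting word to form a different valid word, and to repeat this step until the ending word is reached, using the fewest possible steps. For example, given the words TREE and FLED, the output would be: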

TREE
FREE
FLEE
FLED
2

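Specifications

The Wikipedia articles for the OWL or SOWPODS may be a useful starting point as far as word lists go.
The program should support two ways of selecting the start and end words:
1. Specified by the user via the command line, stdin, or whatever is suitable for your language of choice (just mention what you're doing).
2. Selecting 2 words at random from the file.
The starting and ending words, as well as all interim words, should be the same length.
Each step should be printed on its own line.
The final line of your output should be the number of interim steps that were required to get between the starting and ending words.
If a match cannot be found between the starting and ending words, the output should consist of 3 lines: the beginning word, the ending word, and the word OY.
Include the Big O notation for your solution in your answer.
Please include 10 unique pairs of starting and ending words (with their output, of course) to show the steps your program produces. (To save space, while your program should output these on individual lines, you can consolidate them into a single line for posting, replacing newlines with spaces and putting a comma between each run.)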

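Goals / Winning Criteria

The fastest/best Big O solution producing the shortest interim steps after one week will win.
If a tie results from the Big O criteria, the shortest code will win.
If there is still a tie, the first solution to reach its fastest and shortest revision will win.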

Tests / Sample Output

DIVE
DIME
DAME
NAME
2

PEACE
PLACE
PLATE
SLATE
2

HOUSE
HORSE
GORSE
GORGE
2

POLE
POSE
POST
PAST
FAST
3

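Verification

I am working on a script that can be used to validate the output.

It will:

Ensure that each word is valid.
Ensure that each word differs from the previous word by exactly 1 letter.

It will not:

Check that the shortest number of steps was used.

Once I get that written, I will of course update this post. (: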
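
The two checks listed above can be sketched in a few lines of Java. This is only an illustration, not the validation script mentioned in the post: the class name LadderCheck, its method names, and the tiny in-memory dictionary are all assumptions made for the example.

```java
import java.util.*;

// Hypothetical validator sketch: names and the in-memory dictionary are illustrative only.
public class LadderCheck {
    // Number of positions at which two equal-length words differ.
    static int diff(String a, String b) {
        int d = 0;
        for (int i = 0; i < a.length(); i++) if (a.charAt(i) != b.charAt(i)) d++;
        return d;
    }

    // True if every word is in the dictionary and each step changes exactly one letter.
    // Deliberately does NOT check that the ladder is the shortest possible.
    static boolean isValidLadder(List<String> ladder, Set<String> dict) {
        for (int i = 0; i < ladder.size(); i++) {
            String w = ladder.get(i);
            if (!dict.contains(w)) return false;
            if (i > 0) {
                String prev = ladder.get(i - 1);
                if (prev.length() != w.length() || diff(prev, w) != 1) return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> dict = new HashSet<>(Arrays.asList("TREE", "FREE", "FLEE", "FLED"));
        System.out.println(isValidLadder(Arrays.asList("TREE", "FREE", "FLEE", "FLED"), dict)); // true
        System.out.println(isValidLadder(Arrays.asList("TREE", "FLEE", "FLED"), dict));         // false: TREE -> FLEE changes two letters
    }
}
```

A real checker would load the word list from a file and parse the program's output, including the trailing step count.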

code-challenge word-puzzle 1p5

— Rebecca Chernoff

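It seems odd to me that getting from HOUSE to GORGE, which takes 3 operations, is reported as 2. I realize there are 2 intermediate words, so it does make sense, but the number of operations would be more intuitive.

— Matthew Read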

@Peter, according to the SOWPODS Wikipedia page there are about 15k words longer than 13 letters.

— gnibbler

I don't mean to be a know-it-all, but the puzzle actually has a name; it was invented by Lewis Carroll: en.wikipedia.org/wiki/Word_ladder

— st0le

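You have an undecidable goal in the question: "The fastest/best Big O solution producing the shortest interim steps after one week will win." Since you can't guarantee that the fastest solution is also the one that uses the fewest steps, you should state a preference for the case where one solution uses fewer steps but reaches the goal later.

— user unknown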

I just want to confirm: BAT and CAT will have 0 steps, right?

— st0le

Answers:

Since length is listed as a criterion, here's the golfed version at 1681 characters (it could probably still be improved by 10%):

import java.io.*;import java.util.*;public class W{public static void main(String[]
a)throws Exception{int n=a.length<1?5:a[0].length(),p,q;String f,t,l;S w=new S();Scanner
s=new Scanner(new
File("sowpods"));while(s.hasNext()){f=s.next();if(f.length()==n)w.add(f);}if(a.length<1){String[]x=w.toArray(new
String[0]);Random
r=new Random();q=x.length;p=r.nextInt(q);q=r.nextInt(q-1);f=x[p];t=x[p>q?q:q+1];}else{f=a[0];t=a[1];}H<S>
A=new H(),B=new H(),C=new H();for(String W:w){A.put(W,new
S());for(p=0;p<n;p++){char[]c=W.toCharArray();c[p]='.';l=new
String(c);A.get(W).add(l);S z=B.get(l);if(z==null)B.put(l,z=new
S());z.add(W);}}for(String W:A.keySet()){C.put(W,w=new S());for(String
L:A.get(W))for(String b:B.get(L))if(b!=W)w.add(b);}N m,o,ñ;H<N> N=new H();N.put(f,m=new
N(f,t));N.put(t,o=new N(t,t));m.k=0;N[]H=new
N[3];H[0]=m;p=H[0].h;while(0<1){if(H[0]==null){if(H[1]==H[2])break;H[0]=H[1];H[1]=H[2];H[2]=null;p++;continue;}if(p>=o.k-1)break;m=H[0];H[0]=m.x();if(H[0]==m)H[0]=null;for(String
v:C.get(m.s)){ñ=N.get(v);if(ñ==null)N.put(v,ñ=new N(v,t));if(m.k+1<ñ.k){if(ñ.k<ñ.I){q=ñ.k+ñ.h-p;N
Ñ=ñ.x();if(H[q]==ñ)H[q]=Ñ==ñ?null:Ñ;}ñ.b=m;ñ.k=m.k+1;q=ñ.k+ñ.h-p;if(H[q]==null)H[q]=ñ;else{ñ.n=H[q];ñ.p=ñ.n.p;ñ.n.p=ñ.p.n=ñ;}}}}if(o.b==null)System.out.println(f+"\n"+t+"\nOY");else{String[]P=new
String[o.k+2];P[o.k+1]=o.k-1+"";m=o;for(q=m.k;q>=0;q--){P[q]=m.s;m=m.b;}for(String
W:P)System.out.println(W);}}}class N{String s;int k,h,I=(1<<30)-1;N b,p,n;N(String S,String
d){s=S;for(k=0;k<d.length();k++)if(d.charAt(k)!=S.charAt(k))h++;k=I;p=n=this;}N
x(){N r=n;n.p=p;p.n=n;n=p=this;return r;}}class S extends HashSet<String>{}class H<V>extends
HashMap<String,V>{}

The ungolfed version, which uses package names and methods, gives no warnings, and doesn't extend classes just to alias them, is:

package com.akshor.pjt33;

import java.io.*;
import java.util.*;

// WordLadder partially golfed and with reduced dependencies
//
// Variables used in complexity analysis:
// n is the word length
// V is the number of words (vertex count of the graph)
// E is the number of edges
// hash is the cost of a hash insert / lookup - I will assume it's constant, but without completely brushing it under the carpet
public class WordLadder2
{
    private Map<String, Set<String>> wordsToWords = new HashMap<String, Set<String>>();

    // Initialisation cost: O(V * n * (n + hash) + E * hash)
    private WordLadder2(Set<String> words)
    {
        Map<String, Set<String>> wordsToLinks = new HashMap<String, Set<String>>();
        Map<String, Set<String>> linksToWords = new HashMap<String, Set<String>>();

        // Cost: O(Vn * (n + hash))
        for (String word : words)
        {
            // Cost: O(n*(n + hash))
            for (int i = 0; i < word.length(); i++)
            {
                // Cost: O(n + hash)
                char[] ch = word.toCharArray();
                ch[i] = '.';
                String link = new String(ch).intern();
                add(wordsToLinks, word, link);
                add(linksToWords, link, word);
            }
        }

        // Cost: O(V * n * hash + E * hash)
        for (Map.Entry<String, Set<String>> from : wordsToLinks.entrySet()) {
            String src = from.getKey();
            wordsToWords.put(src, new HashSet<String>());
            for (String link : from.getValue()) {
                Set<String> to = linksToWords.get(link);
                for (String snk : to) {
                    // Note: equality test is safe here. Cost is O(hash)
                    if (snk != src) add(wordsToWords, src, snk);
                }
            }
        }
    }

    public static void main(String[] args) throws IOException
    {
        // Cost: O(filelength + num_words * hash)
        Map<Integer, Set<String>> wordsByLength = new HashMap<Integer, Set<String>>();
        BufferedReader br = new BufferedReader(new FileReader("sowpods"), 8192);
        String line;
        while ((line = br.readLine()) != null) add(wordsByLength, line.length(), line);

        if (args.length == 2) {
            String from = args[0].toUpperCase();
            String to = args[1].toUpperCase();
            new WordLadder2(wordsByLength.get(from.length())).findPath(from, to);
        }
        else {
            // 5-letter words are the most interesting.
            String[] _5 = wordsByLength.get(5).toArray(new String[0]);
            Random rnd = new Random();
            int f = rnd.nextInt(_5.length), g = rnd.nextInt(_5.length - 1);
            if (g >= f) g++;
            new WordLadder2(wordsByLength.get(5)).findPath(_5[f], _5[g]);
        }
    }

    // O(E * hash)
    private void findPath(String start, String dest) {
        Node startNode = new Node(start, dest);
        startNode.cost = 0; startNode.backpointer = startNode;

        Node endNode = new Node(dest, dest);

        // Node lookup
        Map<String, Node> nodes = new HashMap<String, Node>();
        nodes.put(start, startNode);
        nodes.put(dest, endNode);

        // Heap
        Node[] heap = new Node[3];
        heap[0] = startNode;
        int base = heap[0].heuristic;

        // O(E * hash)
        while (true) {
            if (heap[0] == null) {
                if (heap[1] == heap[2]) break;
                heap[0] = heap[1]; heap[1] = heap[2]; heap[2] = null; base++;
                continue;
            }

            // If the lowest cost isn't at least 1 less than the current cost for the destination,
            // it can't improve the best path to the destination.
            if (base >= endNode.cost - 1) break;

            // Get the cheapest node from the heap.
            Node v0 = heap[0];
            heap[0] = v0.remove();
            if (heap[0] == v0) heap[0] = null;

            // Relax the edges from v0.
            int g_v0 = v0.cost;
            // O(hash * #neighbours)
            for (String v1Str : wordsToWords.get(v0.key))
            {
                Node v1 = nodes.get(v1Str);
                if (v1 == null) {
                    v1 = new Node(v1Str, dest);
                    nodes.put(v1Str, v1);
                }

                // If it's an improvement, use it.
                if (g_v0 + 1 < v1.cost)
                {
                    // Update the heap.
                    if (v1.cost < Node.INFINITY)
                    {
                        int bucket = v1.cost + v1.heuristic - base;
                        Node t = v1.remove();
                        if (heap[bucket] == v1) heap[bucket] = t == v1 ? null : t;
                    }

                    // Next update the backpointer and the costs map.
                    v1.backpointer = v0;
                    v1.cost = g_v0 + 1;

                    int bucket = v1.cost + v1.heuristic - base;
                    if (heap[bucket] == null) {
                        heap[bucket] = v1;
                    }
                    else {
                        v1.next = heap[bucket];
                        v1.prev = v1.next.prev;
                        v1.next.prev = v1.prev.next = v1;
                    }
                }
            }
        }

        if (endNode.backpointer == null) {
            System.out.println(start);
            System.out.println(dest);
            System.out.println("OY");
        }
        else {
            String[] path = new String[endNode.cost + 1];
            Node t = endNode;
            for (int i = t.cost; i >= 0; i--) {
                path[i] = t.key;
                t = t.backpointer;
            }
            for (String str : path) System.out.println(str);
            System.out.println(path.length - 2);
        }
    }

    private static <K, V> void add(Map<K, Set<V>> map, K key, V value) {
        Set<V> vals = map.get(key);
        if (vals == null) map.put(key, vals = new HashSet<V>());
        vals.add(value);
    }

    private static class Node
    {
        public static int INFINITY = Integer.MAX_VALUE >> 1;

        public String key;
        public int cost;
        public int heuristic;
        public Node backpointer;

        public Node prev = this;
        public Node next = this;

        public Node(String key, String dest) {
            this.key = key;
            cost = INFINITY;
            for (int i = 0; i < dest.length(); i++) if (dest.charAt(i) != key.charAt(i)) heuristic++;
        }

        public Node remove() {
            Node rv = next;
            next.prev = prev;
            prev.next = next;
            next = prev = this;
            return rv;
        }
    }
}

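As you can see, the running cost analysis is O(filelength + num_words * hash + V * n * (n + hash) + E * hash). If you accept my assumption that a hash table insertion/lookup is constant time, that's O(filelength + V n^2 + E). The particular statistics of the graphs in SOWPODS mean that O(V n^2) really dominates O(E) for most n.

Sample outputs: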

IDOLA, IDOLS, IDYLS, ODYLS, ODALS, OVALS, OVELS, OVENS, EVENS, ETENS, STENS, SKENS, SKINS, SPINS, SPINE, 13

WICCA, PROSY, OY

BRINY, BRINS, TRINS, TAINS, TARNS, YARNS, YAWNS, YAWPS, YAPPS, 7

GALES, GASES, GASTS, GESTS, GESTE, GESSE, DESSE, 5

SURES, DURES, DUNES, DINES, DINGS, DINGY, 4

LICHT, LIGHT, BIGHT, BIGOT, BIGOS, BIROS, GIROS, GIRNS, GURNS, GUANS, GUANA, RUANA, 10

SARGE, SERGE, SERRE, SERRS, SEERS, DEERS, DYERS, OYERS, OVERS, OVELS, OVALS, ODALS, ODYLS, IDYLS, 12

KEIRS, SEIRS, SEERS, BEERS, BRERS, BRERE, BREME, CREME, CREPE, 7

This is one of the 6 pairs with the longest shortest path:

GAINEST, FAINEST, FAIREST, SAIREST, SAIDEST, SADDEST, MADDEST, MIDDEST, MILDEST, WILDEST, WILIEST, WALIEST, WANIEST, CANIEST, CANTEST, CONTEST, CONFEST, CONFESS, CONFERS, CONKERS, COOKERS, COOPERS, COPPERS, POPPERS, POPPETS, POPPITS, POPPIES, POPSIES, MOPSIES, MOUSIES, MOUSSES, POUSSES, PLUSSES, PLISSES, PRISSES, PRESSES, PREASES, UREASES, UNEASES, UNCASES, UNCASED, UNBASED, UNBATED, UNMATED, UNMETED, UNMEWED, ENMEWED, ENDEWED, INDEWED, INDEXED, INDEXES, INDENES, INDENTS, INCENTS, INCESTS, INFESTS, INFECTS, INJECTS, 56

And one of the worst-case soluble 8-letter pairs:

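ENROBING, UNROBING, UNROPING, UNCOPING, UNCAPING, UNCAGING, ENCAGING, ENRAGING, ENRACING, ENLACING, UNLACING, UNLAYING, UPLAYING, SPLAYING, SPRAYING, STRAYING, STROYING, STROKING, STOOKING, STOOPING, STOMPING, STUMPING, SLUMPING, CLUMPING, CRUMPING, CRIMPING, CRISPING, CRISPINS, CRISPENS, CRISPERS, CRIMPERS, CRAMPERS, CLAMPERS, CLASPERS, CLASHERS, SLASHERS, SLATHERS, SLITHERS, SMITHERS, SMOTHERS, SOOTHERS, SOUTHERS, MOUTHERS, MOUCHERS, COUCHERS, COACHERS, POACHERS, POTCHERS, PUTCHERS, PUNCHERS, LUNCHERS, LYNCHERS, LYNCHETS, LINCHETS, 52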

I think that takes care of all the requirements of the question, so now for some discussion.

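For a CompSci the question obviously reduces to shortest path in a graph G whose vertices are words and whose edges connect words differing in one letter. Generating the graph efficiently isn't trivial; I actually have an idea I need to revisit to reduce the complexity to O(V n hash + E). The way I do it involves creating a graph which inserts extra vertices (corresponding to words with one wildcard character) and is homeomorphic to the graph in question. I did consider using that graph rather than reducing to G (and I suppose that from a golfing point of view I should have done), on the basis that a wildcard node with more than 3 edges reduces the number of edges in the graph, and the standard worst-case running time of shortest path algorithms is O(V heap-op + E).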

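However, the first thing I did was to run some analyses of the graphs G for different word lengths, and I discovered that they're extremely sparse for words of 5 or more letters. The 5-letter graph has 12478 vertices and 40759 edges; adding link nodes makes the graph worse. By the time you're up to 8 letters there are fewer edges than nodes, and 3/7 of the words are "aloof". So I rejected that optimisation idea as not really helpful.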
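
As a rough check of how sparse the 5-letter graph is, here is the arithmetic using only the figures quoted above. It assumes E counts each undirected edge once, which the answer does not state explicitly.

```latex
\[
V n^2 = 12478 \times 5^2 = 311950, \qquad E = 40759, \qquad \frac{V n^2}{E} \approx 7.65
\]
\[
\text{average degree} = \frac{2E}{V} = \frac{81518}{12478} \approx 6.53
\]
```

Under that assumption the V n^2 term is already several times larger than E at n = 5, which is consistent with the claim that graph construction dominates the running time.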

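The idea which did prove helpful was to examine the heap. I can honestly say that I've implemented some moderately exotic heaps in the past, but none as exotic as this. I use A-star (since C provides no benefit given the heap I'm using) with the obvious heuristic of the number of letters different from the target, and a bit of analysis shows that at any time there are no more than 3 different priorities in the heap. When I pop a node whose priority is (cost + heuristic) and look at its neighbours, there are three cases to consider: 1) the neighbour's cost is cost+1 and its heuristic is heuristic-1 (because the letter it changes becomes "correct"); 2) cost+1 and heuristic+0 (because the letter it changes goes from "wrong" to "still wrong"); 3) cost+1 and heuristic+1 (because the letter it changes goes from "correct" to "wrong"). So if I relax the neighbour I'm going to insert it at the same priority, priority+1, or priority+2. As a result I can use a 3-element array of linked lists for the heap.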
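
The three cases can be checked with a small Java sketch. The class BucketDemo and the method priorityDelta are hypothetical names introduced only for this illustration; they are not part of the answer's code, and the example words need not come from any particular word list.

```java
// Illustrative only: shows why a neighbour's priority is never more than 2 above the popped node's.
public class BucketDemo {
    // Heuristic described above: number of letters that differ from the destination.
    static int h(String w, String dest) {
        int d = 0;
        for (int i = 0; i < w.length(); i++) if (w.charAt(i) != dest.charAt(i)) d++;
        return d;
    }

    // Change in priority (cost + heuristic) when stepping from word to a one-letter neighbour.
    // The cost always grows by 1; the heuristic changes by -1, 0 or +1.
    static int priorityDelta(String word, String neighbour, String dest) {
        return 1 + h(neighbour, dest) - h(word, dest);
    }

    public static void main(String[] args) {
        String dest = "FLED";
        System.out.println(priorityDelta("FREE", "FLEE", dest)); // 0: a wrong letter becomes correct
        System.out.println(priorityDelta("FREE", "FRET", dest)); // 1: a wrong letter stays wrong
        System.out.println(priorityDelta("FREE", "TREE", dest)); // 2: a correct letter becomes wrong
    }
}
```

Because the delta is always 0, 1 or 2, a bucket index of (priority - base) fits in a 3-element array, which is what the `heap` array in the ungolfed code relies on.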

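I should add a note about my assumption that hash lookups are constant. Very well, you may say, but what about hash computations? The answer is that I'm amortising them away: java.lang.String caches its hashCode(), so the total time spent computing hashes is O(V n^2) (in generating the graph).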

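There is another change which affects the complexity, but the question of whether or not it is an optimisation depends on your assumptions about statistics. (IMO putting "the best Big O solution" as a criterion is a mistake, because there isn't a best complexity, for a simple reason: there isn't a single variable.) This change affects the graph generation step. In the code above, it's: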

        Map<String, Set<String>> wordsToLinks = new HashMap<String, Set<String>>();
        Map<String, Set<String>> linksToWords = new HashMap<String, Set<String>>();

        // Cost: O(Vn * (n + hash))
        for (String word : words)
        {
            // Cost: O(n*(n + hash))
            for (int i = 0; i < word.length(); i++)
            {
                // Cost: O(n + hash)
                char[] ch = word.toCharArray();
                ch[i] = '.';
                String link = new String(ch).intern();
                add(wordsToLinks, word, link);
                add(linksToWords, link, word);
            }
        }

        // Cost: O(V * n * hash + E * hash)
        for (Map.Entry<String, Set<String>> from : wordsToLinks.entrySet()) {
            String src = from.getKey();
            wordsToWords.put(src, new HashSet<String>());
            for (String link : from.getValue()) {
                Set<String> to = linksToWords.get(link);
                for (String snk : to) {
                    // Note: equality test is safe here. Cost is O(hash)
                    if (snk != src) add(wordsToWords, src, snk);
                }
            }
        }

That's O(V * n * (n + hash) + E * hash). But the O(V * n^2) part comes from generating a new n-character string for each link and then computing its hash code. This can be avoided with a helper class:

    private static class Link
    {
        private String str;
        private int hash;
        private int missingIdx;

        public Link(String str, int hash, int missingIdx) {
            this.str = str;
            this.hash = hash;
            this.missingIdx = missingIdx;
        }

        @Override
        public int hashCode() { return hash; }

        @Override
        public boolean equals(Object obj) {
            Link l = (Link)obj; // Unsafe, but I know the contexts where I'm using this class...
            if (this == l) return true; // Essential
            if (hash != l.hash || missingIdx != l.missingIdx) return false;
            for (int i = 0; i < str.length(); i++) {
                if (i != missingIdx && str.charAt(i) != l.str.charAt(i)) return false;
            }
            return true;
        }
    }

Then the first half of the graph generation becomes

        Map<String, Set<Link>> wordsToLinks = new HashMap<String, Set<Link>>();
        Map<Link, Set<String>> linksToWords = new HashMap<Link, Set<String>>();

        // Cost: O(V * n * hash)
        for (String word : words)
        {
            // apidoc: The hash code for a String object is computed as
            // s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
            // Cost: O(n * hash)
            int hashCode = word.hashCode();
            int pow = 1;
            for (int j = word.length() - 1; j >= 0; j--) {
                Link link = new Link(word, hashCode - word.charAt(j) * pow, j);
                add(wordsToLinks, word, link);
                add(linksToWords, link, word);
                pow *= 31;
            }
        }

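By using the structure of the hash code we can generate the links in O(V * n). However, this has a knock-on effect. Inherent in my assumption that hash lookups are constant time is an assumption that comparing objects for equality is cheap. However, the equality test of Link is O(n) in the worst case. The worst case is when we have a hash collision between two equal links generated from different words, i.e. it occurs O(E) times in the second half of the graph generation. Other than that, except in the unlikely event of a hash collision between non-equal links, we're good. So we've traded in O(V * n^2) for O(E * n * hash). See my previous point about statistics.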
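
The hash trick can be sanity-checked in isolation. The sketch below is not part of the answer; LinkHashDemo, linkHash and agreesWithRebuild are hypothetical names. It relies only on the documented definition of String.hashCode() quoted in the code comment above, and it compares the O(1) link hash against the hash of a rebuilt string whose blanked position holds a zero character.

```java
// Illustrative check of the "subtract one character's contribution" identity.
public class LinkHashDemo {
    // Hash of `word` with position j blanked out, computed in O(1) from the word's own hash.
    // pow must equal 31^(word.length() - 1 - j) in (wrapping) int arithmetic.
    static int linkHash(String word, int j, int pow) {
        return word.hashCode() - word.charAt(j) * pow;
    }

    // Compares the shortcut with the hash of a freshly built string for every position.
    static boolean agreesWithRebuild(String word) {
        int pow = 1;
        for (int j = word.length() - 1; j >= 0; j--) {
            char[] ch = word.toCharArray();
            ch[j] = '\0'; // a zero char contributes nothing to String.hashCode()
            if (linkHash(word, j, pow) != new String(ch).hashCode()) return false;
            pow *= 31;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreesWithRebuild("HOUSE")); // true
        // HOUSE and HORSE share the link with index 2 blanked, so their link hashes match.
        System.out.println(linkHash("HOUSE", 2, 31 * 31) == linkHash("HORSE", 2, 31 * 31)); // true
    }
}
```

Integer overflow is harmless here because both sides wrap the same way in 32-bit arithmetic.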

— Peter Taylor

I believe 8192 is the default buffer size for BufferedReader (on the Sun VM).

— st0le

@st0le, I omitted that parameter in the golfed version, and it does no harm in the ungolfed one.

— Peter Taylor

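Java

Complexity: ?? (I don't have a CompSci degree, so I'd appreciate help with this.)

Input: Provide a pair of words (or more than one pair if you wish) on the command line. If no command-line arguments are given, 2 distinct random words are chosen.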

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.Set;

public class M {

    // for memoization
    private static Map<String, List<String>> memoEdits = new HashMap<String, List<String>>(); 
    private static Set<String> dict;

    private static List<String> edits(String word, Set<String> dict) {
        if(memoEdits.containsKey(word))
            return memoEdits.get(word);

        List<String> editsList = new LinkedList<String>();
        char[] letters = word.toCharArray();
        for(int i = 0; i < letters.length; i++) {
            char hold = letters[i];
            for(char ch = 'A'; ch <= 'Z'; ch++) {
                if(ch != hold) {
                    letters[i] = ch;
                    String nWord = new String(letters);
                    if(dict.contains(nWord)) {
                        editsList.add(nWord);
                    }
                }
            }
            letters[i] = hold;
        }
        memoEdits.put(word, editsList);
        return editsList;
    }

    private static Map<String, String> bfs(String wordFrom, String wordTo,
                                           Set<String> dict) {
        Set<String> visited = new HashSet<String>();
        List<String> queue = new LinkedList<String>();
        Map<String, String> pred = new HashMap<String, String>();
        queue.add(wordFrom);
        while(!queue.isEmpty()) {
            String word = queue.remove(0);
            if(word.equals(wordTo))
                break;

            for(String nWord: edits(word, dict)) {
                if(!visited.contains(nWord)) {
                    queue.add(nWord);
                    visited.add(nWord);
                    pred.put(nWord, word);
                }
            }
        }
        return pred;
    }

    public static void printPath(String wordTo, String wordFrom) {
        int c = 0;
        Map<String, String> pred = bfs(wordFrom, wordTo, dict);
        do {
            System.out.println(wordTo);
            c++;
            wordTo = pred.get(wordTo);
        }
        while(wordTo != null && !wordFrom.equals(wordTo));
        System.out.println(wordFrom);
        if(wordTo != null)
            System.out.println(c - 1);
        else
            System.out.println("OY");
        System.out.println();
    }

    public static void main(String[] args) throws Exception {
        BufferedReader scan = new BufferedReader(new FileReader(new File("c:\\332609\\dict.txt")),
                                                 40 * 1024);
        String line;
        dict = new HashSet<String>(); //the dictionary (1 word per line)
        while((line = scan.readLine()) != null) {
            dict.add(line);
        }
        scan.close();
        if(args.length == 0) { // No Command line Arguments? Pick 2 random
                               // words.
            Random r = new Random(System.currentTimeMillis());
            String[] words = dict.toArray(new String[dict.size()]);
            int x = r.nextInt(words.length), y = r.nextInt(words.length);
            while(x == y) //same word? that's not fun...
                y = r.nextInt(words.length);
            printPath(words[x], words[y]);
        }
        else { // Arguments provided, search for path pairwise
            for(int i = 0; i < args.length; i += 2) {
                if(i + 1 < args.length)
                    printPath(args[i], args[i + 1]);
            }
        }
    }
}

— st0le

I've used memoization for quicker results. The dictionary path is hard-coded.

— st0le

@Joey, it used to be, but not any longer. It now has a static field which it increments each time and adds to System.nanoTime().

— Peter Taylor

@Joey, aah, got it, but I'll leave it as it is for now rather than increase my revision count.

Oh, btw, I'm at work and those Scrabble-ish websites are apparently blocked, so I have no access to the dictionaries... I'll generate those 10 unique word pairs by tomorrow morning at best. Cheers!

— st0le

You can reduce the (computational) complexity by doing a bidirectional BFS, i.e. search from both sides and stop when you encounter a node visited from the other side.

— Nabb
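
A minimal sketch of that suggestion, assuming upper-case A-Z words held in a Set. The class BiBfs and the method shortestChanges are hypothetical names, and this is not a modification of the answer above; it only returns the number of single-letter changes rather than the full path.

```java
import java.util.*;

// Bidirectional BFS sketch: grows the smaller frontier one full level at a time.
public class BiBfs {
    // Returns the number of single-letter changes on a shortest ladder, or -1 if none exists.
    // The number of interim words is this value minus 1.
    static int shortestChanges(String from, String to, Set<String> dict) {
        if (from.equals(to)) return 0;
        Set<String> front = new HashSet<>(Collections.singleton(from));
        Set<String> back = new HashSet<>(Collections.singleton(to));
        Set<String> seen = new HashSet<>(Arrays.asList(from, to));
        int changes = 0;
        while (!front.isEmpty() && !back.isEmpty()) {
            // Always expand the smaller frontier to keep the work balanced.
            if (front.size() > back.size()) { Set<String> t = front; front = back; back = t; }
            changes++;
            Set<String> next = new HashSet<>();
            for (String w : front) {
                char[] ch = w.toCharArray();
                for (int i = 0; i < ch.length; i++) {
                    char hold = ch[i];
                    for (char c = 'A'; c <= 'Z'; c++) {
                        if (c == hold) continue;
                        ch[i] = c;
                        String n = new String(ch);
                        if (back.contains(n)) return changes;      // the two searches met
                        if (dict.contains(n) && seen.add(n)) next.add(n);
                    }
                    ch[i] = hold;
                }
            }
            front = next;
        }
        return -1;
    }

    public static void main(String[] args) {
        Set<String> dict = new HashSet<>(Arrays.asList("TREE", "FREE", "FLEE", "FLED"));
        System.out.println(shortestChanges("TREE", "FLED", dict) - 1); // 2 interim words
    }
}
```

Recovering the actual ladder would additionally need predecessor maps for both directions, joined at the meeting word.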

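C on Unix

Using Dijkstra's algorithm.

A large fraction of the code is a custom n-ary tree implementation, which serves to hold

The word list (thus minimizing the number of times the input file is read: twice for no arguments, once for other cases), on the assumption that file IO is slow.
The partial trees as we build them.
The final path.

Anyone interested in seeing how it works should probably read findPath, process, and processOne (and their associated comments), and maybe buildPath and buildPartialPath. The rest is bookkeeping and scaffolding. Several routines used during testing and development, but not in the "production" version, have been left in place.

I'm using /usr/share/dict/words on my Mac OS 10.5 box, which has so many long, esoteric entries that letting it run completely at random generates a lot of OYs.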

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <getline.h>
#include <time.h>
#include <unistd.h>
#include <ctype.h>

const char*wordfile="/usr/share/dict/words";
/* const char*wordfile="./testwords.txt"; */
const long double RANDOM_MAX = (1LL<<31)-1; /* random() returns values in [0, 2^31 - 1] */

typedef struct node_t {
  char*word;
  struct node_t*kids;
  struct node_t*next;
} node;


/* Return a pointer to a newly allocated node. If word is non-NULL, 
 * call setWordNode;
 */
node*newNode(char*word){
  node*n=malloc(sizeof(node));
  n->word=NULL;
  n->kids=NULL;
  n->next=NULL;
  if (word) n->word = strdup(word);
  return n;
}
/* We can use the "next" links to treat these as a simple linked list,
 * and further can make it a stack or queue by
 *
 * * pop()/deQueu() from the head
 * * push() onto the head
 * * enQueue at the back
 */
void push(node*n, node**list){
  if (list==NULL){
    fprintf(stderr,"Active operation on a NULL list! Exiting\n");
    exit(5);
  }
  n->next = (*list);
  (*list) = n;
}
void enQueue(node*n, node**list){
  if (list==NULL){
    fprintf(stderr,"Active operation on a NULL list! Exiting\n");
    exit(5);
  }
  if ( *list==NULL ) {
    *list=n;
  } else {
    enQueue(n,&((*list)->next));
  }
}
node*pop(node**list){
  node*temp=NULL;
  if (list==NULL){
    fprintf(stderr,"Active operation on a NULL list! Exiting\n");
    exit(5);
  }
  temp = *list;
  if (temp != NULL) {
    (*list) = temp->next;
    temp->next=NULL;
  }
  return temp;
}
node*deQueue(node**list){ /* Alias for pop */
  return pop(list);
}

/* return a pointer to a node in tree matching word or NULL if none */
node* isInTree(char*word, node*tree){
  node*isInNext=NULL;
  node*isInKids=NULL;
  if (tree==NULL || word==NULL) return NULL;
  if (tree->word && (0 == strcasecmp(word,tree->word))) return tree;
  /* prefer to find the target at shallow levels so check the siblings
     before the kids */
  if (tree->next && (isInNext=isInTree(word,tree->next))) return isInNext;
  if (tree->kids && (isInKids=isInTree(word,tree->kids))) return isInKids;
  return NULL;
}

node* freeTree(node*t){
  if (t==NULL) return NULL;
  if (t->word) {free(t->word); t->word=NULL;}
  if (t->next) t->next=freeTree(t->next);
  if (t->kids) t->kids=freeTree(t->kids);
  free(t);
  return NULL;
}

void printTree(node*t, int indent){
  int i;
  if (t==NULL) return;
  for (i=0; i<indent; i++) printf("\t"); printf("%s\n",t->word);
  printTree(t->kids,indent+1);
  printTree(t->next,indent);
}

/* count the letters of difference between two strings */
int countDiff(const char*w1, const char*w2){
  int count=0;
  if (w1==NULL || w2==NULL) return -1;
  while ( (*w1)!='\0' && (*w2)!='\0' ) {
    if ( (*w1)!=(*w2) ) count++;
    w1++;
    w2++;
  }
  return count;
}

node*buildPartialPath(char*stop, node*tree){
  node*list=NULL;
  while ( (tree != NULL) && 
      (tree->word != NULL) && 
      (0 != strcasecmp(tree->word,stop)) ) {
    node*kid=tree->kids;
    node*newN = newNode(tree->word);
    push(newN,&list);
    newN=NULL;
    /* walk over all kids not leading to stop */
    while ( kid && 
        (strcasecmp(kid->word,stop)!=0) &&
        !isInTree(stop,kid->kids) ) {
      kid=kid->next;
    }
    if (kid==NULL) {
      /* Assuming a preconditions where isInTree(stop,tree), we should
       * not be able to get here...
       */
      fprintf(stderr,"Unpossible!\n");
      exit(7);
    } 
    /* Here we've found a node that either *is* the target or leads to it */
    if (strcasecmp(stop,kid->word) == 0) {
      break;
    }
    tree = kid;
  }
  return list; 
}
/* build a node list path 
 *
 * We can walk down each tree, identifying nodes as we go
 */
node*buildPath(char*pivot,node*frontTree,node*backTree){
  node*front=buildPartialPath(pivot,frontTree);
  node*back=buildPartialPath(pivot,backTree);
  /* weld them together with pivot in between 
  *
  * The front list is in reverse order, the back list in order
  */
  node*thePath=NULL;
  while (front != NULL) {
    node*n=pop(&front);
    push(n,&thePath);
  }
  if (pivot != NULL) {
    node*n=newNode(pivot);
    enQueue(n,&thePath);
  }
  while (back != NULL) {
    node*n=pop(&back);
    enQueue(n,&thePath);
  }
  return thePath;
}

/* Add new child nodes to the single node in ts named by word. Also
 * queue these new word in q
 * 
 * Find node N matching word in ts
 * For tword in wordList
 *    if (tword is one change from word) AND (tword not in ts)
 *        add tword to N.kids
 *        add tword to q
 *        if tword in to
 *           return tword
 * return NULL
 */
char* processOne(char *word, node**q, node**ts, node**to, node*wordList){
  if ( word==NULL || q==NULL || ts==NULL || to==NULL || wordList==NULL ) {
    fprintf(stderr,"ProcessOne called with NULL argument! Exiting.\n");
    exit(9);
  }
  char*result=NULL;
  /* There should be a node in ts matching the leading node of q, find it */
  node*here = isInTree(word,*ts);
  /* Now test each word in the list as a possible child of HERE */
  while (wordList != NULL) {
    char *tword=wordList->word;
    if ((1==countDiff(word,tword)) && !isInTree(tword,*ts)) {
      /* Queue this up as a child AND for further processing */
      node*newN=newNode(tword);
      enQueue(newN,&(here->kids));
      newN=newNode(tword);
      enQueue(newN,q);
      /* This might be our pivot */
      if ( isInTree(tword,*to) ) {
    /* we have found a node that is in both trees */
    result=strdup(tword);
    return result;
      }
    }
    wordList=wordList->next;
  }
  return result;
}

/* Add new child nodes to ts for all the words in q */
char* process(node**q, node**ts, node**to, node*wordList){
  node*tq=NULL;
  char*pivot=NULL;
  if ( q==NULL || ts==NULL || to==NULL || wordList==NULL ) {
    fprintf(stderr,"Process called with NULL argument! Exiting.\n");
    exit(9);
  }
  while (*q && (pivot=processOne((*q)->word,&tq,ts,to,wordList))==NULL) {
    freeTree(deQueue(q));
  }
  freeTree(*q); 
  *q=tq;
  return pivot;
}

/* Find a path between w1 and w2 using wordList by dijkstra's
 * algorithm
 *
 * Use a breadth-first extensions of the trees alternating between
 * trees.
 */
node* findPath(char*w1, char*w2, node*wordList){
  node*thePath=NULL; /* our resulting path */
  char*pivot=NULL; /* The node we find that matches */
  /* trees of existing nodes */
  node*t1=newNode(w1); 
  node*t2=newNode(w2);
  /* queues of nodes to work on */
  node*q1=newNode(w1);
  node*q2=newNode(w2);

  /* work each queue all the way through alternating until a word is
     found in both lists */
  while( (q1!=NULL) && ((pivot = process(&q1,&t1,&t2,wordList)) == NULL) &&
     (q2!=NULL) && ((pivot = process(&q2,&t2,&t1,wordList)) == NULL) )
    /* no loop body */ ;


  /* one way or another we are done with the queues here */
  q1=freeTree(q1);
  q2=freeTree(q2);
  /* now construct the path */
  if (pivot!=NULL) thePath=buildPath(pivot,t1,t2);
  /* clean up after ourselves */
  t1=freeTree(t1);
  t2=freeTree(t2);

  return thePath;
}

/* Convert a non-const string to UPPERCASE in place */
void upcase(char *s){
  while (s && *s) {
    *s = toupper(*s);
    s++;
  }
}

/* Walks the input file stuffing lines of the given length into a list */
node*getListWithLength(const char*fname, int len){
  int l=-1;
  size_t n=0;
  node*list=NULL;
  char *line=NULL;
  /* open the word file */
  FILE*f = fopen(fname,"r");
  if (NULL==f){
    fprintf(stderr,"Could not open word file '%s'. Exiting.\n",fname);
    exit(3);
  }
  /* walk the file, trying each word in turn */
  while ( !feof(f) && ((l = getline(&line,&n,f)) != -1) ) {
    /* strip trailing whitespace */
    char*temp=line;
    strsep(&temp," \t\n");
    if (strlen(line) == len) {
      node*newN = newNode(line);
      upcase(newN->word);
      push(newN,&list);
    }
  }
  fclose(f);
  return list;
}

/* Assumes that filename points to a file containing exactly one
 * word per line with no other whitespace.
 * It will return a randomly selected word from filename.
 *
 * If veto is non-NULL, only non-matching words of the same length
 * will be considered.
 */
char*getRandomWordFile(const char*fname, const char*veto){
  int l=-1, count=1;
  size_t n=0;
  char *word=NULL;
  char *line=NULL;
  /* open the word file */
  FILE*f = fopen(fname,"r");
  if (NULL==f){
    fprintf(stderr,"Could not open word file '%s'. Exiting.\n",fname);
    exit(3);
  }
  /* walk the file, trying each word in turn */
  while ( !feof(f) && ((l = getline(&line,&n,f)) != -1) ) {
    /* strip trailing whitespace */
    char*temp=line;
    strsep(&temp," \t\n");
    if (strlen(line) < 2) continue; /* Single letters are too easy! */
    if ( (veto==NULL) || /* no veto means chose from all */ 
     ( 
      ( strlen(line) == strlen(veto) )  && /* veto means match length */
      ( 0 != strcasecmp(veto,line) )       /* but don't match word */ 
       ) ) { 
      /* This word is worthy of consideration. Select it with random
         chance (1/count) then increment count */
      if ( (word==NULL) || (random() < RANDOM_MAX/count) ) {
    if (word) free(word);
    word=strdup(line);
      }
      count++;
    }
  }
  fclose(f);
  upcase(word);
  return word;
}

void usage(int argc, char**argv){
  fprintf(stderr,"%s [ <startWord> [ <endWord> ]]:\n\n",argv[0]);
  fprintf(stderr,
      "\tFind the shortest transformation from one word to another\n");
  fprintf(stderr,
      "\tchanging only one letter at a time and always maintaining a\n");
  fprintf(stderr,
      "\tword that exists in the word file.\n\n");
  fprintf(stderr,
      "\tIf startWord is not passed, chose at random from '%s'\n",
      wordfile);
  fprintf(stderr,
      "\tIf endWord is not passed, chose at random from '%s'\n",
      wordfile);
  fprintf(stderr,
      "\tconsistent with the length of startWord\n");
  exit(2);
}

int main(int argc, char**argv){
  char *startWord=NULL;
  char *endWord=NULL;

  /* initialize OS services */
  srandom(time(0)+getpid());
  /* process command line */
  switch (argc) {
  case 3:
    endWord = strdup(argv[2]);
    upcase(endWord);
  case 2:
    startWord = strdup(argv[1]);
    upcase(startWord);
  case 1:
    if (NULL==startWord) startWord = getRandomWordFile(wordfile,NULL);
    if (NULL==endWord)   endWord   = getRandomWordFile(wordfile,startWord);
    break;
  default:
    usage(argc,argv);
    break;
  }
  /* need to check this in case the user screwed up */
  if ( !startWord || ! endWord || strlen(startWord) != strlen(endWord) ) {
    fprintf(stderr,"Words '%s' and '%s' are not the same length! Exiting\n",
        startWord,endWord);
    exit(1);
  }
  /* Get a list of all the words having the right length */
  node*wordList=getListWithLength(wordfile,strlen(startWord));
  /* Launch into the path finder*/
  node *theList=findPath(startWord,endWord,wordList);
  /* Print the resulting path */
  if (theList) {
    int count=-2;
    while (theList) {
      printf("%s\n",theList->word);
      theList=theList->next;
      count++;
    }
    printf("%d\n",count);
  } else {
    /* No path found case */
    printf("%s %s OY\n",startWord,endWord);
  }
  return 0;
}

Some output:

$ ./changeword dive name
DIVE
DIME
DAME
NAME
2
$ ./changeword house gorge
HOUSE
HORSE
GORSE
GORGE
2
$ ./changeword stop read
STOP
STEP
SEEP
SEED
REED
READ
4
$ ./changeword peace slate
PEACE
PLACE
PLATE
SLATE
2
$ ./changeword pole fast  
POLE
POSE
POST
PAST
FAST
3
$ ./changeword          
QUINTIPED LINEARITY OY
$ ./changeword sneaky   
SNEAKY WAXILY OY
$ ./changeword TRICKY
TRICKY
PRICKY
PRINKY
PRANKY
TRANKY
TWANKY
SWANKY
SWANNY
SHANNY
SHANTY
SCANTY
SCATTY
SCOTTY
SPOTTY
SPOUTY
STOUTY
STOUTH
STOUSH
SLOUSH
SLOOSH
SWOOSH
19
$ ./changeword router outlet
ROUTER
ROTTER
RUTTER
RUTHER
OUTHER
OUTLER
OUTLET
5
$ ./changeword 
IDIOM
IDISM
IDIST
ODIST
OVIST
OVEST
OVERT
AVERT
APERT
APART
SPART
SPARY
SEARY
DEARY
DECRY
DECAY
DECAN
DEDAN
SEDAN
17

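The complexity analysis is non-trivial. The search is a two-sided iterative deepening.

For each node examined I walk the entire word list (though limited to words of the right length). Call the length of the list W.
The minimum number of steps is S_min = (<number of different letters> - 1), because if the words are only one letter apart we score the change at 0 intermediate steps. The maximum is hard to quantify; see the TRICKY to SWOOSH run above. Each half of the tree will be S/2-1 to S/2 levels deep.
I have not done an analysis of the branching behavior of the tree, but call it B.

So the minimum number of operations is around 2 * (S/2)^B * W, which is not really good.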
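
The lower bound S_min can be computed directly from the two words. The sketch below is in Java for consistency with the other answers rather than in C; the class LowerBound and the method minInterim are hypothetical names, while countDiff mirrors the idea of the C function of the same name.

```java
// Illustrative lower bound on the number of interim words in any ladder.
public class LowerBound {
    // Letters that differ between two equal-length words.
    static int countDiff(String a, String b) {
        int d = 0;
        for (int i = 0; i < a.length(); i++) if (a.charAt(i) != b.charAt(i)) d++;
        return d;
    }

    // S_min: every differing letter needs at least one change, and the last change lands on the target.
    // Only meaningful for two different words (identical words would give -1).
    static int minInterim(String a, String b) {
        return countDiff(a, b) - 1;
    }

    public static void main(String[] args) {
        System.out.println(minInterim("DIVE", "NAME"));     // 2, which the DIVE run above achieves
        System.out.println(minInterim("TRICKY", "SWOOSH")); // 5, although the run above needed 19
    }
}
```

The gap between 5 and 19 for TRICKY and SWOOSH shows how loose the bound can be once the dictionary forces detours.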

— dmckee

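Maybe this is naive of me, but I see nothing in your design or implementation that requires edge weights. Dijkstra's does indeed work for unweighted graphs (where every edge weight is "1"), but wouldn't a simple breadth-first search apply here and improve your bounds to O(|V|+|E|) instead of O(|E|+|V| log |V|)?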

— MrGomez