13

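The task is simply to see how much faster you can calculate n choose n/2 (for even n) than the builtin function in Python. Of course, for large n this is a rather large number, so rather than output the whole number you should output the sum of its digits. For example, for n = 100000 the answer is 135702. For n = 1000000 it is 1354815.

Here is the Python code: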

from scipy.misc import comb
def sum_digits(n):
   r = 0
   while n:
       r, n = r + n % 10, n / 10
   return r
sum_digits(comb(n,n/2,exact=True))
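For readers on Python 3, the reference above can be approximated as follows. This is only a sketch for checking answers, not the original scoring code: it assumes Python 3.8 or later and swaps in `math.comb` for `scipy.misc.comb` (in current SciPy releases `comb` lives in `scipy.special`). The helper name `central_digit_sum` is made up for illustration. Because it relies on a builtin binomial, it would not be a valid entry.

```python
from math import comb  # assumption: Python 3.8+; stands in for scipy.misc.comb


def sum_digits(n):
    """Sum the decimal digits of a non-negative integer."""
    r = 0
    while n:
        n, d = divmod(n, 10)
        r += d
    return r


def central_digit_sum(n):
    """Digit sum of C(n, n/2) for even n (hypothetical helper name)."""
    return sum_digits(comb(n, n // 2))


if __name__ == "__main__":
    print(central_digit_sum(100))
```

Small cases are easy to check by hand: C(10, 5) = 252, whose digits sum to 9.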

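Your score is (highest n on your machine using your code)/(highest n on your machine using my code). Your code must terminate in 60 seconds or less.

Your program must give the correct output for all even n: 2 <= n <= (your highest n)

You can't use any builtin code or libraries which calculate binomial coefficients, or values which can quickly be transformed into binomial coefficients.

You can use any language of your choice.

Leading answer: The current leading answer, with an amazing 680.09, is by justhalf.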

fastest-code number-theory

2

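Do we have to submit solutions in Python, or in a language of our choice?

It's possible to write a routine that does this on a modern computer and takes n well into the millions, whereas I doubt the Python function would handle anything bigger than n = 1e5 without choking.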

— COTO

@Alessandro You can use any language of your choice. The only restriction is that you can't use builtin functions to calculate the coefficients.

2

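Are factorial functions allowed? I assumed not, since they "can quickly be transformed into binomial coefficients" (the whole thing is just one factorial divided by another factorial squared), but since an answer is using one now, some clarity would be good.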

— Geobits

1

@Comintern: I've successfully replicated that reference point, with 287mil in 1 minute, or 169mil in 35 seconds! :)

— justhalf

9

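C++ (GMP) - (287,000,000 / 422,000) = 680.09

Shamelessly combines Kummer's Theorem from xnor's answer with GMP from qwr's answer. ~~Still not even close to the Go solution, not sure why.~~

Edit: Thanks to Keith Randall for the reminder that multiplication is faster when the numbers are similar in size. I implemented multi-level multiplication, similar to the memory coalescing concept in memory management. And the result is impressive. What used to take 51s now takes only 0.5s (i.e., a 100-fold improvement!!)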
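The multi-level multiplication described in the edit can be sketched in Python roughly as follows. This is an illustrative reading of the `tmp_prods` slots in the C++ code, not a translation of it: the function names are hypothetical, and the slot list grows on demand instead of being fixed at 32 entries. Slot `j` holds a product of `2**j` inserted values, so each multiplication combines two partial products built from the same number of factors.

```python
def insert_factor(slots, value):
    """Add `value` to a binary-counter style list of partial products.

    slots[j] is either None or a product of 2**j inserted values.
    """
    for j in range(len(slots)):
        if slots[j] is None:
            slots[j] = value
            return
        value *= slots[j]   # carry: merge two products of the same level
        slots[j] = None
    slots.append(value)     # every level was occupied: open a new one


def collapse(slots):
    """Multiply the remaining partial products together."""
    result = 1
    for s in slots:
        if s is not None:
            result *= s
    return result


def product(values):
    """Product of an iterable of integers using the slot scheme above."""
    slots = []
    for v in values:
        insert_factor(slots, v)
    return collapse(slots)
```

The scheme behaves like incrementing a binary counter: most insertions touch only the lowest slots, and large operands meet only other large operands.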

Old code (n=14,000,000)
Done sieving in 0.343s
Done calculating binom in 51.929s
Done summing in 0.901s
14000000: 18954729

real 0m53.194s
user 0m53.116s
sys 0m0.060s

New code (n=14,000,000)
Done sieving in 0.343s
Done calculating binom in 0.552s
Done summing in 0.902s
14000000: 18954729

real 0m1.804s
user 0m1.776s
sys 0m0.023s

The run for n=287,000,000

Done sieving in 4.211s
Done calculating binom in 17.934s
Done summing in 37.677s
287000000: 388788354

real 0m59.928s
user 0m58.759s
sys 0m1.116s

The code. Compile with -lgmp -lgmpxx -O3

#include <gmpxx.h>
#include <iostream>
#include <time.h>
#include <cstdio>
#include <cstdlib>  // atoi

const int MAX=287000000;
const int PRIME_COUNT=15700000;

int primes[PRIME_COUNT], factors[PRIME_COUNT], count;
bool sieve[MAX];
int max_idx=0;

void run_sieve(){
    sieve[2] = true;
    primes[0] = 2;
    count = 1;
    for(int i=3; i<MAX; i+=2){
        sieve[i] = true;
    }
    for(int i=3; i<17000; i+=2){
        if(!sieve[i]) continue;
        for(int j = i*i; j<MAX; j+=i){
            sieve[j] = false;
        }
    }
    for(int i=3; i<MAX; i+=2){
        if(sieve[i]) primes[count++] = i;
    }
}

mpz_class sum_digits(mpz_class n){
    clock_t t = clock();
    char* str = mpz_get_str(NULL, 10, n.get_mpz_t());
    int result = 0;
    for(int i=0;str[i]>0;i++){
        result+=str[i]-48;
    }
    printf("Done summing in %.3fs\n", ((float)(clock()-t))/CLOCKS_PER_SEC);
    return result;
}

mpz_class nc2_fast(const mpz_class &x){
    clock_t t = clock();
    int prime;
    const unsigned int n = mpz_get_ui(x.get_mpz_t());
    const unsigned int n2 = n/2;
    unsigned int m;
    unsigned int digit;
    unsigned int carry=0;
    unsigned int carries=0;
    mpz_class result = 1;
    mpz_class prime_prods = 1;
    mpz_class tmp;
    mpz_class tmp_prods[32], tmp_prime_prods[32];
    for(int i=0; i<32; i++){
        tmp_prods[i] = (mpz_class)NULL;
        tmp_prime_prods[i] = (mpz_class)NULL;
    }
    for(int i=0; i< count; i++){
        prime = primes[i];
        carry=0;
        carries=0;
        if(prime > n) break;
        if(prime > n2){
            tmp = prime;
            for(int j=0; j<32; j++){
                if(tmp_prime_prods[j] == NULL){
                    tmp_prime_prods[j] = tmp;
                    break;
                } else {
                    mpz_mul(tmp.get_mpz_t(), tmp.get_mpz_t(), tmp_prime_prods[j].get_mpz_t());
                    tmp_prime_prods[j] = (mpz_class)NULL;
                }
            }
            continue;
        }
        m=n2;
        while(m>0){
            digit = m%prime;
            carry = (2*digit + carry >= prime) ? 1 : 0;
            carries += carry;
            m/=prime;
        }
        if(carries>0){
            tmp = 0;
            mpz_ui_pow_ui(tmp.get_mpz_t(), prime, carries);
            for(int j=0; j<32; j++){
                if(tmp_prods[j] == NULL){
                    tmp_prods[j] = tmp;
                    break;
                } else {
                    mpz_mul(tmp.get_mpz_t(), tmp.get_mpz_t(), tmp_prods[j].get_mpz_t());
                    tmp_prods[j] = (mpz_class)NULL;
                }
            }
        }
    }
    result = 1;
    prime_prods = 1;
    for(int j=0; j<32; j++){
        if(tmp_prods[j] != NULL){
            mpz_mul(result.get_mpz_t(), result.get_mpz_t(), tmp_prods[j].get_mpz_t());
        }
        if(tmp_prime_prods[j] != NULL){
            mpz_mul(prime_prods.get_mpz_t(), prime_prods.get_mpz_t(), tmp_prime_prods[j].get_mpz_t());
        }
    }
    mpz_mul(result.get_mpz_t(), result.get_mpz_t(), prime_prods.get_mpz_t());
    printf("Done calculating binom in %.3fs\n", ((float)(clock()-t))/CLOCKS_PER_SEC);
    return result;
}

int main(int argc, char* argv[]){
    const mpz_class n = atoi(argv[1]);
    clock_t t = clock();
    run_sieve();
    printf("Done sieving in %.3fs\n", ((float)(clock()-t))/CLOCKS_PER_SEC);
    std::cout << n << ": " << sum_digits(nc2_fast(n)) << std::endl;
    return 0;
}

— justhalf

2

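Multiplications are more efficient if both operands are about the same size. You're always multiplying large numbers by small numbers. If you repeatedly combine the small numbers in pairs, it might be faster (but take more memory).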

— Keith Randall

Wow, that makes a lot of difference. It's exponentially faster. I can now reach 169mil in 35 seconds.

— justhalf

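Wow indeed! What's the breakdown of time for the different parts of your code?

I've already put that in my answer: 4s to generate the primes up to n, 18s to calculate the central binomial coefficient, and the remaining 37s to convert the result to a string and sum the digits.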

— justhalf

1

I feel this answer should be contributed to any open source library that computes binomial coefficients. I can't believe anyone else has written code this fast!

7

Go, 33.96 = (16300000/480000)

package main

import "math/big"

const n = 16300000

var (
    sieve     [n + 1]bool
    remaining [n + 1]int
    count     [n + 1]int
)

func main() {
    println("finding primes")
    for p := 2; p <= n; p++ {
        if sieve[p] {
            continue
        }
        for i := p * p; i <= n; i += p {
            sieve[i] = true
        }
    }

    // count net number of times each prime appears in the result.
    println("counting factors")
    for i := 2; i <= n; i++ {
        remaining[i] = i
    }
    for p := 2; p <= n; p++ {
        if sieve[p] {
            continue
        }

        for i := p; i <= n; i += p {
            for remaining[i]%p == 0 { // may have multiple factors of p
                remaining[i] /= p

                // count positive for n!
                count[p]++
                // count negative twice for ((n/2)!)^2
                if i <= n/2 {
                    count[p] -= 2
                }
            }
        }
    }

    // ignore all the trailing zeros
    count[2] -= count[5]
    count[5] = 0

    println("listing factors")
    var m []uint64
    for i := 0; i <= n; i++ {
        for count[i] > 0 {
            m = append(m, uint64(i))
            count[i]--
        }
    }

    println("grouping factors")
    m = group(m)

    println("multiplying")
    x := mul(m)

    println("converting to base 10")
    d := 0
    for _, c := range x.String() {
        d += int(c - '0')
    }
    println("sum of digits:", d)
}

// Return product of elements in a.
func mul(a []uint64) *big.Int {
    if len(a) == 1 {
        x := big.NewInt(0)
        x.SetUint64(a[0])
        return x
    }
    m := len(a) / 2
    x := mul(a[:m])
    y := mul(a[m:])
    x.Mul(x, y) // fast because x and y are about the same length
    return x
}

// return a slice whose members have the same product
// as the input slice, but hopefully shorter.
func group(a []uint64) []uint64 {
    var g []uint64
    r := uint64(1)
    b := 1
    for _, x := range a {
        c := bits(x)
        if b+c <= 64 {
            r *= x
            b += c
        } else {
            g = append(g, r)
            r = x
            b = c
        }
    }
    g = append(g, r)
    return g
}

// bits returns the number of bits in the representation of x
func bits(x uint64) int {
    n := 0
    for x != 0 {
        n++
        x >>= 1
    }
    return n
}

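It works by counting all the prime factors in the numerator and the denominator and canceling the matching factors. It then multiplies the leftovers to get the result.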
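The same net exponents can also be obtained without stripping factors from every integer. The sketch below is an alternative, not what the Go code does: it uses Legendre's formula for the exponent of a prime in a factorial and subtracts the denominator's contribution twice. All function names are hypothetical.

```python
def legendre(n, p):
    """Exponent of the prime p in n! (Legendre's formula)."""
    e = 0
    while n:
        n //= p
        e += n
    return e


def primes_upto(n):
    """Simple sieve of Eratosthenes returning all primes <= n."""
    is_comp = bytearray(n + 1)
    out = []
    for p in range(2, n + 1):
        if not is_comp[p]:
            out.append(p)
            for q in range(p * p, n + 1, p):
                is_comp[q] = 1
    return out


def central_binomial(n):
    """C(n, n/2) for even n, rebuilt from its prime factorization."""
    result = 1
    for p in primes_upto(n):
        # exponent in n! minus twice the exponent in (n/2)!
        e = legendre(n, p) - 2 * legendre(n // 2, p)
        result *= p ** e
    return result
```

For n = 10 the exponents are 2, 2, 0 and 1 for the primes 2, 3, 5 and 7, giving 4 * 9 * 7 = 252.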

More than 80% of the time is spent converting to base 10. There has to be a better way to do that...

— Keith Randall

For problems which require printing large numbers in base 10, I usually find it helpful to write my own BigInteger class which stores numbers in base 1E9 ~ 2^30.

— Peter Taylor
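A minimal sketch of the base 1E9 idea from the comment above, assuming little-endian lists of "limbs" and only multiplication by small integers. The names `BASE`, `mul_small`, `digit_sum` and `factorial_limbs` are illustrative. Since each limb holds exactly nine decimal digits, the digit sum can be read off limb by limb without any large base conversion.

```python
BASE = 10 ** 9  # nine decimal digits per limb


def mul_small(limbs, m):
    """Multiply a little-endian base-10**9 number by a small non-negative int."""
    out, carry = [], 0
    for limb in limbs:
        carry, digit = divmod(limb * m + carry, BASE)
        out.append(digit)
    while carry:
        carry, digit = divmod(carry, BASE)
        out.append(digit)
    return out


def digit_sum(limbs):
    """Decimal digit sum, computed one limb at a time."""
    return sum(int(ch) for limb in limbs for ch in str(limb))


def factorial_limbs(n):
    """n! as a list of base-10**9 limbs, least significant first."""
    limbs = [1]
    for i in range(2, n + 1):
        limbs = mul_small(limbs, i)
    return limbs
```

The trade-off raised in the reply still applies: every multiplication step needs a division by 10**9, which binary limbs avoid.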

You are currently winning by a country mile.

@PeterTaylor: I tried that, but it requires lots of % 1e9 in the multiply code, which makes the multiplication slow.

— Keith Randall

6

Python 3 (8.8 = 2.2 million / 0.25 million)

This is in Python, which isn't known for speed, so you can probably do better by porting it to another language.

The prime generator is taken from this StackOverflow contest.

import numpy
import time

def primesfrom2to(n):
    """ Input n>=6, Returns a array of primes, 2 <= p < n """
    sieve = numpy.ones(n//3 + (n%6==2), dtype=bool)
    for i in range(1,int(n**0.5)//3+1):
        if sieve[i]:
            k=3*i+1|1
            # Slice indices must be integers in Python 3, hence // instead of /
            sieve[       k*k//3     ::2*k] = False
            sieve[k*(k-2*(i&1)+4)//3::2*k] = False
    return numpy.r_[2,3,((3*numpy.nonzero(sieve)[0][1:]+1)|1)]

# time.clock() was removed in Python 3.8; process_time() also measures CPU time.
t0 = time.process_time()

N=220*10**4
n=N//2

print("N = %d" % N)
print()

print("Generating primes.")
primes = primesfrom2to(N)

t1 = time.process_time()
print ("Time taken: %f" % (t1-t0))

print("Computing product.")
product = 1

for p in primes:
    p=int(p)
    carries = 0 
    carry = 0

    if p>n:
        product*=p
        continue

    m=n

    #Count carries of n+n in base p as per Kummer's Theorem
    while m:
        digit = m%p
        carry = (2*digit + carry >= p)
        carries += carry
        m//=p

    if carries >0:
        for _ in range(carries):
            product *= p

    #print(p,carries,product)

t2 = time.process_time()
print ("Time taken: %f" % (t2-t1))

print("Converting number to string.")

# digit_sum = 0
# result=product

# while result:
    # digit_sum+=result%10
    # result//=10

digit_sum = 0
digit_string = str(product)

t3 = time.process_time()
print ("Time taken: %f" % (t3-t2))

print("Summing digits.")
for d in str(digit_string):digit_sum+=int(d)

t4 = time.process_time()
print ("Time taken: %f" % (t4-t3))
print ()

print ("Total time: %f" % (t4-t0))
print()
print("Sum of digits = %d" % digit_sum)

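The main idea of the algorithm is to use Kummer's Theorem to get the prime factorization of the binomial. For each prime, we find the highest power of it that divides the answer, and multiply the running product by that power of the prime. In this way, we only need to multiply once for each prime in the prime factorization of the answer.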
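As a standalone illustration of the carry counting, here is a small sketch (the function name `kummer_exponent` is made up for this example). Kummer's theorem says that the exponent of a prime p in C(2m, m) equals the number of carries produced when adding m + m in base p.

```python
def kummer_exponent(half, p):
    """Exponent of the prime p in C(2*half, half).

    By Kummer's theorem this is the number of carries that occur
    when adding half + half in base p.
    """
    carries = carry = 0
    while half:
        half, digit = divmod(half, p)
        carry = 1 if 2 * digit + carry >= p else 0
        carries += carry
    return carries
```

For example, C(10, 5) = 252 = 2^2 * 3^2 * 7: adding 5 + 5 produces two carries in base 2, two in base 3, none in base 5 and one in base 7.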

Output showing the time breakdown:

N = 2200000
Generating primes.
Time taken: 0.046408
Computing product.
Time taken: 17.931472
Converting number to string.
Time taken: 39.083390
Summing digits.
Time taken: 1.502393

Total time: 58.563664

Sum of digits = 2980107

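Surprisingly, most of the time is spent converting the number to a string in order to sum its digits. Also surprisingly, converting to a string was much faster than getting the digits from repeated %10 and //10, even though the whole string presumably has to be kept in memory.

Generating the primes takes negligible time (and hence I don't feel it's unfair to copy existing code). Summing the digits is fast. The actual multiplication takes about one third of the time.

Given that the decimal conversion behind the digit sum seems to be the limiting factor, an algorithm that multiplies numbers in a decimal representation might save time overall by shortcutting the binary/decimal conversion.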

— xnor

This is very impressive, and makes you wonder why CPython doesn't use your implementation!

3

Java (score 22500/365000 = 0.062)

I don't have Python on this computer, so I would appreciate it if someone could score this. If not, it will have to wait.

$\binom{2n}{n} = \sum_{k=0}^n \binom{n}{k}^2$

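The bottleneck is the addition needed to compute the relevant section of Pascal's triangle (90% of the running time), so using a better multiplication algorithm wouldn't really help.

Note that what the question calls n is what I call 2n. The command-line argument is what the question calls n.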
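The identity is easy to try out on small inputs. The following sketch builds one row of Pascal's triangle using additions only and sums the squares of its entries. Unlike the Java program it keeps the whole row rather than exploiting symmetry, and the function names are illustrative.

```python
def pascal_row(n):
    """Row n of Pascal's triangle, built with additions only."""
    row = [1]
    for _ in range(n):
        row = [1] + [a + b for a, b in zip(row, row[1:])] + [1]
    return row


def central_from_squares(n):
    """C(2n, n) computed as the sum of the squared entries of row n."""
    return sum(c * c for c in pascal_row(n))
```

For n = 5 the row is 1, 5, 10, 10, 5, 1 and the squares sum to 1 + 25 + 100 + 100 + 25 + 1 = 252 = C(10, 5).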

public class CodeGolf37270 {
    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("Usage: java CodeGolf37270 <n>");
            System.exit(1);
        }

        int two_n = Integer.parseInt(args[0]);
        // \binom{2n}{n} = \sum_{k=0}^n \binom{n}{k}^2
        // Two cases:
        //   n = 2m: \binom{4m}{2m} = \binom{2m}{m}^2 + 2\sum_{k=0}^{m-1} \binom{2m}{k}^2
        //   n = 2m+1: \binom{4m+2}{2m+1} = 2\sum_{k=0}^{m} \binom{2m+1}{k}^2
        int n = two_n / 2;
        BigInt[] nCk = new BigInt[n/2 + 1];
        nCk[0] = new BigInt(1);
        for (int k = 1; k < nCk.length; k++) nCk[k] = nCk[0];
        for (int row = 2; row <= n; row++) {
            BigInt tmp = nCk[0];
            for (int col = 1; col < row && col < nCk.length; col++) {
                BigInt replacement = tmp.add(nCk[col]);
                tmp = nCk[col];
                nCk[col] = replacement;
            }
        }

        BigInt central = nCk[0]; // 1^2 = 1
        int lim = (n & 1) == 1 ? nCk.length : (nCk.length - 1);
        for (int k = 1; k < lim; k++) central = central.add(nCk[k].sq());
        central = central.add(central);
        if ((n & 1) == 0) central = central.add(nCk[nCk.length - 1].sq());

        System.out.println(central.digsum());
    }

    private static class BigInt {
        static final int B = 1000000000;
        private int[] val;

        public BigInt(int x) {
            val = new int[] { x };
        }

        private BigInt(int[] val) {
            this.val = val;
        }

        public BigInt add(BigInt that) {
            int[] left, right;
            if (val.length < that.val.length) {
                left = that.val;
                right = val;
            }
            else {
                left = val;
                right = that.val;
            }

            int[] sum = left.clone();
            int carry = 0, k = 0;
            for (; k < right.length; k++) {
                int a = sum[k] + right[k] + carry;
                sum[k] = a % B;
                carry = a / B;
            }
            while (carry > 0 && k < sum.length) {
                int a = sum[k] + carry;
                sum[k] = a % B;
                carry = a / B;
                k++;
            }
            if (carry > 0) {
                int[] wider = new int[sum.length + 1];
                System.arraycopy(sum, 0, wider, 0, sum.length);
                wider[sum.length] = carry;
                sum = wider;
            }

            return new BigInt(sum);
        }

        public BigInt sq() {
            int[] rv = new int[2 * val.length];
            // Naive multiplication
            for (int i = 0; i < val.length; i++) {
                for (int j = i; j < val.length; j++) {
                    int k = i+j;
                    long c = val[i] * (long)val[j];
                    if (j > i) c <<= 1;
                    while (c > 0) {
                        c += rv[k];
                        rv[k] = (int)(c % B);
                        c /= B;
                        k++;
                    }
                }
            }

            int len = rv.length;
            while (len > 1 && rv[len - 1] == 0) len--;
            if (len < rv.length) {
                int[] rv2 = new int[len];
                System.arraycopy(rv, 0, rv2, 0, len);
                rv = rv2;
            }

            return new BigInt(rv);
        }

        public long digsum() {
            long rv = 0;
            for (int i = 0; i < val.length; i++) {
                int x = val[i];
                while (x > 0) {
                    rv += x % 10;
                    x /= 10;
                }
            }
            return rv;
        }
    }
}

— Peter Taylor

Your program gets 29,500 and the reference program gets 440,000, so that would be a score of 0.067. This was compiled with Java 1.7 (javac CodeGolf37270.java) and executed with Java 1.8 (java CodeGolf37270 n). I'm not sure if there are any optimization options I'm unaware of. I can't try compiling with Java 1.8, since it isn't installed with my Java package.

— Dennis

Interesting approach. Why do you think calculating it iteratively could be faster than using the simple formula?

— justhalf

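@justhalf, I didn't have an intuition for whether it would be faster or not, and I didn't try to do complexity calculations. I looked through lists of identities for central binomial coefficients to try to find formulae which would be simple to implement with a custom big integer class optimised for extracting base-10 digits. And having discovered that it's not very efficient, I may as well post it and save anyone else from repeating the experiment. (FWIW I'm working on Toom multiplication, but I'm not sure when I'll have it tested and debugged.)

— Peter Taylor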

2

GMP - 1500000 / 300000 = 5.0

Although this answer won't compete against the sieve-based ones, sometimes short code can still get results.

#include <gmpxx.h>
#include <iostream>

mpz_class sum_digits(mpz_class n)
{
    char* str = mpz_get_str(NULL, 10, n.get_mpz_t());
    int result = 0;
    for(int i=0; str[i]>0; i++)
        result += str[i] - 48;

    return result;
}


mpz_class comb_2(const mpz_class &x)
{
    const unsigned int k = mpz_get_ui(x.get_mpz_t()) / 2;
    mpz_class result = k + 1;

    for(int i=2; i<=k; i++)
    {
        result *= k + i;
        mpz_divexact_ui(result.get_mpz_t(), result.get_mpz_t(), i);
    }

    return result;
}

int main()
{
    const mpz_class n = 1500000;
    std::cout << sum_digits(comb_2(n)) << std::endl;

    return 0;
}
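The loop in `comb_2` can be mirrored in a few lines of Python to see why the exact division is always safe. This is a sketch with a hypothetical function name. After step i the running value equals C(k + i, i), which is an integer, so dividing by i never leaves a remainder.

```python
def central_binomial_incremental(n):
    """C(n, n/2) for even n via a running product with exact divisions."""
    k = n // 2
    result = k + 1  # equals C(k + 1, 1)
    for i in range(2, k + 1):
        # result was C(k + i - 1, i - 1); after this step it is C(k + i, i)
        result = result * (k + i) // i
    return result
```

With n = 10 the intermediate values are 6, 21, 56, 126 and finally 252.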

— qwr

2

Java, custom big integer class: 32.9 (120000000/365000)

The main class is pretty straightforward:

import java.util.*;

public class PPCG37270 {
    public static void main(String[] args) {
        long start = System.nanoTime();

        int n = 12000000;
        if (args.length == 1) n = Integer.parseInt(args[0]);

        boolean[] sieve = new boolean[n + 1];
        int[] remaining = new int[n + 1];
        int[] count = new int[n + 1];

        for (int p = 2; p <= n; p++) {
            if (sieve[p]) continue;
            long p2 = p * (long)p;
            if (p2 > n) continue;
            for (int i = (int)p2; i <= n; i += p) sieve[i] = true;
        }

        for (int i = 2; i <= n; i++) remaining[i] = i;
        for (int p = 2; p <= n; p++) {
            if (sieve[p]) continue;
            for (int i = p; i <= n; i += p) {
                while (remaining[i] % p == 0) {
                    remaining[i] /= p;
                    count[p]++;
                    if (i <= n/2) count[p] -= 2;
                }
            }
        }

        count[2] -= count[5];
        count[5] = 0;

        List<BigInt> partialProd = new ArrayList<BigInt>();
        long accum = 1;
        for (int i = 2; i <= n; i++) {
            for (int j = count[i]; j > 0; j--) {
                long tmp = accum * i;
                if (tmp < 1000000000L) accum = tmp;
                else {
                    partialProd.add(new BigInt((int)accum));
                    accum = i;
                }
            }
        }
        partialProd.add(new BigInt((int)accum));
        System.out.println(prod(partialProd).digsum());
        System.out.println((System.nanoTime() - start) / 1000000 + "ms");
    }

    private static BigInt prod(List<BigInt> vals) {
        while (vals.size() > 1) {
            int n = vals.size();
            List<BigInt> next = new ArrayList<BigInt>();
            for (int i = 0; i < n; i += 2) {
                if (i == n - 1) next.add(vals.get(i));
                else next.add(vals.get(i).mul(vals.get(i+1)));
            }
            vals = next;
        }
        return vals.get(0);
    }
}

It relies on a big integer class which is optimised for multiplication and toString(), both of which are significant bottlenecks in an implementation using java.math.BigInteger.

/**
 * A big integer class which is optimised for conversion to decimal.
 * For use in simple applications where BigInteger.toString() is a bottleneck.
 */
public class BigInt {
    // The base of the representation.
    private static final int B = 1000000000;
    // The number of decimal digits per digit of the representation.
    private static final int LOG10_B = 9;

    public static final BigInt ZERO = new BigInt(0);
    public static final BigInt ONE = new BigInt(1);

    // We use sign-magnitude representation.
    private final boolean negative;

    // Least significant digit is at val[off]; most significant is at val[off + len - 1]
    // Unless len == 1 we guarantee that val[off + len - 1] is non-zero.
    private final int[] val;
    private final int off;
    private final int len;

    // Toom-style multiplication parameters from
    // Zuras, D. (1994). More on squaring and multiplying large integers. IEEE Transactions on Computers, 43(8), 899-908.
    private static final int[][][] Q = new int[][][]{
        {},
        {},
        {{1, -1}},
        {{4, 2, 1}, {1, 1, 1}, {1, 2, 4}},
        {{8, 4, 2, 1}, {-8, 4, -2, 1}, {1, 1, 1, 1}, {1, -2, 4, -8}, {1, 2, 4, 8}}
    };
    private static final int[][][] R = new int[][][]{
        {},
        {},
        {{1, -1, 1}},
        {{-21, 2, -12, 1, -6}, {7, -1, 10, -1, 7}, {-6, 1, -12, 2, -21}},
        {{-180, 6, 2, -80, 1, 3, -180}, {-510, 4, 4, 0, -1, -1, 120}, {1530, -27, -7, 680, -7, -27, 1530}, {120, -1, -1, 0, 4, 4, -510}, {-180, 3, 1, -80, 2, 6, -180}}
    };
    private static final int[][] S = new int[][]{
        {},
        {},
        {1, 1, 1},
        {1, 6, 2, 6, 1},
        {1, 180, 120, 360, 120, 180, 1}
    };

    /**
     * Constructs a big version of an integer value.
     * @param x The value to represent.
     */
    public BigInt(int x) {
        this(Integer.toString(x));
    }

    /**
     * Constructs a big version of a long value.
     * @param x The value to represent.
     */
    public BigInt(long x) {
        this(Long.toString(x));
    }

    /**
     * Parses a decimal representation of an integer.
     * @param str The value to represent.
     */
    public BigInt(String str) {
        this(str.charAt(0) == '-', split(str));
    }

    /**
     * Constructs a sign-magnitude representation taking the entire span of the array as the range of interest.
     * @param neg Is the value negative?
     * @param val The base-B digits, least significant first.
     */
    private BigInt(boolean neg, int[] val) {
        this(neg, val, 0, val.length);
    }

    /**
     * Constructs a sign-magnitude representation taking a range of an array as the magnitude.
     * @param neg Is the value negative?
     * @param val The base-B digits, least significant at offset off, most significant at off + len - 1.
     * @param off The offset within the array.
     * @param len The number of base-B digits.
     */
    private BigInt(boolean neg, int[] val, int off, int len) {
        // Bounds checks
        if (val == null) throw new IllegalArgumentException("val");
        if (off < 0 || off >= val.length) throw new IllegalArgumentException("off");
        if (len < 1 || off + len > val.length) throw new IllegalArgumentException("len");

        this.negative = neg;
        this.val = val;
        this.off = off;
        // Enforce the invariant that this.len is 1 or val[off + len - 1] is non-zero.
        while (len > 1 && val[off + len - 1] == 0) len--;
        this.len = len;

        // Sanity check
        for (int i = 0; i < len; i++) {
            if (val[off + i] < 0) throw new IllegalArgumentException("val contains negative digits");
        }
    }

    /**
     * Splits a string into base-B digits.
     * @param str The string to parse.
     * @return An array which can be passed to the (boolean, int[]) constructor.
     */
    private static int[] split(String str) {
        if (str.charAt(0) == '-') str = str.substring(1);

        int[] arr = new int[(str.length() + LOG10_B - 1) / LOG10_B];
        int i, off;
        // Each element of arr represents LOG10_B characters except (probably) the last one.
        for (i = 0, off = str.length() - LOG10_B; off > 0; off -= LOG10_B) {
            arr[i++] = Integer.parseInt(str.substring(off, off + LOG10_B));
        }
        arr[i] = Integer.parseInt(str.substring(0, off + LOG10_B));
        return arr;
    }

    public boolean isZero() {
        return len == 1 && val[off] == 0;
    }

    public BigInt negate() {
        return new BigInt(!negative, val, off, len);
    }

    public BigInt add(BigInt that) {
        // If the signs differ, then since we use sign-magnitude representation we want to do a subtraction.
        boolean isSubtraction = negative ^ that.negative;

        BigInt left, right;
        if (len < that.len) {
            left = that;
            right = this;
        }
        else {
            left = this;
            right = that;

            // For addition I just care about the lengths of the arrays.
            // For subtraction I want the largest absolute value on the left.
            if (isSubtraction && len == that.len) {
                int cmp = compareAbsolute(that);
                if (cmp == 0) return ZERO; // Cheap special case
                if (cmp < 0) {
                    left = that;
                    right = this;
                }
            }
        }

        if (right.isZero()) return left;

        BigInt result;
        if (!isSubtraction) {
            int[] sum = new int[left.len + 1];
            // A copy here rather than using left.val in the main loops and copying remaining values
            // at the end gives a small performance boost, probably due to cache locality.
            System.arraycopy(left.val, left.off, sum, 0, left.len);

            int carry = 0, k = 0;
            for (; k < right.len; k++) {
                int a = sum[k] + right.val[right.off + k] + carry;
                sum[k] = a % B;
                carry = a / B;
            }
            for (; carry > 0 && k < left.len; k++) {
                int a = sum[k] + carry;
                sum[k] = a % B;
                carry = a / B;
            }
            sum[left.len] = carry;

            result = new BigInt(negative, sum);
        }
        else {
            int[] diff = new int[left.len];
            System.arraycopy(left.val, left.off, diff, 0, left.len);

            int carry = 0, k = 0;
            for (; k < right.len; k++) {
                int a = diff[k] - right.val[right.off + k] + carry;
                // Why did anyone ever think that rounding positive and negative divisions differently made sense?
                if (a < 0) {
                    diff[k] = a + B;
                    carry = -1;
                }
                else {
                    diff[k] = a % B;
                    carry = a / B;
                }
            }
            for (; carry != 0 && k < left.len; k++) {
                int a = diff[k] + carry;
                if (a < 0) {
                    diff[k] = a + B;
                    carry = -1;
                }
                else {
                    diff[k] = a % B;
                    carry = a / B;
                }
            }

            result = new BigInt(left.negative, diff, 0, k > left.len ? k : left.len);
        }

        return result;
    }

    private int compareAbsolute(BigInt that) {
        if (len > that.len) return 1;
        if (len < that.len) return -1;

        for (int i = len - 1; i >= 0; i--) {
            if (val[off + i] > that.val[that.off + i]) return 1;
            if (val[off + i] < that.val[that.off + i]) return -1;
        }

        return 0;
    }

    public BigInt mul(BigInt that) {
        if (isZero() || that.isZero()) return ZERO;

        if (len == 1) return that.mulSmall(negative ? -val[off] : val[off]);
        if (that.len == 1) return mulSmall(that.negative ? -that.val[that.off] : that.val[that.off]);

        int shorter = len < that.len ? len : that.len;
        BigInt result;
        // Cutoffs have been hand-tuned.
        if (shorter > 300) result = mulToom(3, that);
        else if (shorter > 28) result = mulToom(2, that);
        else result = mulNaive(that);

        return result;
    }

    BigInt mulSmall(int m) {
        if (m == 0) return ZERO;
        if (m == 1) return this;
        if (m == -1) return negate();

        // We want to do the magnitude calculation with a positive multiplicand.
        boolean neg = negative;
        if (m < 0) {
            neg = !neg;
            m = -m;
        }

        int[] pr = new int[len + 1];
        int carry = 0;
        for (int i = 0; i < len; i++) {
            long t = val[off + i] * (long)m + carry;
            pr[i] = (int)(t % B);
            carry = (int)(t / B);
        }
        pr[len] = carry;
        return new BigInt(neg, pr);
    }

    // NB This truncates.
    BigInt divSmall(int d) {
        if (d == 0) throw new ArithmeticException();
        if (d == 1) return this;
        if (d == -1) return negate();

        // We want to do the magnitude calculation with a positive divisor.
        boolean neg = negative;
        if (d < 0) {
            neg = !neg;
            d = -d;
        }

        int[] div = new int[len];
        int rem = 0;
        for (int i = len - 1; i >= 0; i--) {
            long t = val[off + i] + rem * (long)B;
            div[i] = (int)(t / d);
            rem = (int)(t % d);
        }

        return new BigInt(neg, div);
    }

    BigInt mulNaive(BigInt that) {
        int[] rv = new int[len + that.len];
        // Naive multiplication
        for (int i = 0; i < len; i++) {
            for (int j = 0; j < that.len; j++) {
                int k = i + j;
                long c = val[off + i] * (long)that.val[that.off + j];
                while (c > 0) {
                    c += rv[k];
                    rv[k] = (int)(c % B);
                    c /= B;
                    k++;
                }
            }
        }

        return new BigInt(this.negative ^ that.negative, rv);
    }

    private BigInt mulToom(int k, BigInt that) {
        // We split each number into k parts of m base-B digits each.
        // m = ceil(longer / k)
        int m = ((len > that.len ? len : that.len) + k - 1) / k;

        // Perform the splitting and evaluation steps of Toom-Cook.
        BigInt[] f1 = this.toomFwd(k, m);
        BigInt[] f2 = that.toomFwd(k, m);

        // Pointwise multiplication.
        for (int i = 0; i < f1.length; i++) f1[i] = f1[i].mul(f2[i]);

        // Inverse (or interpolation) and recomposition.
        return toomBk(k, m, f1, negative ^ that.negative, val[off], that.val[that.off]);
    }

    // Splits a number into k parts of m base-B digits each and does the polynomial evaluation.
    private BigInt[] toomFwd(int k, int m) {
        // Split.
        BigInt[] a = new BigInt[k];
        for (int i = 0; i < k; i++) {
            int o = i * m;
            if (o >= len) a[i] = ZERO;
            else {
                int l = m;
                if (o + l > len) l = len - o;
                // Ignore signs for now.
                a[i] = new BigInt(false, val, off + o, l);
            }
        }

        // Evaluate
        return transform(Q[k], a);
    }

    private BigInt toomBk(int k, int m, BigInt[] f, boolean neg, int lsd1, int lsd2) {
        // Inverse (or interpolation).
        BigInt[] b = transform(R[k], f);

        // Recomposition: add at suitable offsets, dividing by the normalisation factors
        BigInt prod = ZERO;
        int[] s = S[k];
        for (int i = 0; i < b.length; i++) {
            int[] shifted = new int[i * m + b[i].len];
            System.arraycopy(b[i].val, b[i].off, shifted, i * m, b[i].len);
            prod = prod.add(new BigInt(neg ^ b[i].negative, shifted).divSmall(s[i]));
        }

        // Handle the remainders.
        // In the worst case the absolute value of the sum of the remainders is s.length, so pretty small.
        // It should be easy enough to work out whether to go up or down.
        int lsd = (int)((lsd1 * (long)lsd2) % B);
        int err = lsd - prod.val[prod.off];
        if (err > B / 2) err -= B / 2;
        if (err < -B / 2) err += B / 2;
        return prod.add(new BigInt(err));
    }

    /**
     * Multiplies a matrix of small integers and a vector of big ones.
     * The matrix has a implicit leading row [1 0 ... 0] and an implicit trailing row [0 ... 0 1].
     * @param m The matrix.
     * @param v The vector.
     * @return m v
     */
    private BigInt[] transform(int[][] m, BigInt[] v) {
        BigInt[] b = new BigInt[m.length + 2];
        b[0] = v[0];
        for (int i = 0; i < m.length; i++) {
            BigInt s = ZERO;
            for (int j = 0; j < m[i].length; j++) s = s.add(v[j].mulSmall(m[i][j]));
            b[i + 1] = s;
        }
        b[b.length - 1] = v[v.length - 1];

        return b;
    }

    /**
     * Sums the digits of this integer.
     * @return The sum of the digits of this integer.
     */
    public long digsum() {
        long rv = 0;
        for (int i = 0; i < len; i++) {
            int x = val[off + i];
            while (x > 0) {
                rv += x % 10;
                x /= 10;
            }
        }
        return rv;
    }
}

The big bottlenecks are the naive multiplication (60%), followed by the other multiplication (37%) and the sieving (3%). The digsum() call is insignificant.
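For orientation, the two-way split used by `mulToom(2, ...)` is the familiar Karatsuba scheme in its subtractive form (the `{1, -1}` row of `Q` evaluates the difference of the two halves). Below is a minimal sketch on Python integers. It splits on bit boundaries rather than base-10^9 digits, and the cutoff of 64 bits is an arbitrary assumption, not a tuned value.

```python
def karatsuba(x, y, cutoff_bits=64):
    """Multiply non-negative ints with the subtractive Karatsuba scheme."""
    if x.bit_length() <= cutoff_bits or y.bit_length() <= cutoff_bits:
        return x * y  # small operand: fall back to the builtin product
    m = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << m) - 1
    x1, x0 = x >> m, x & mask  # x = x1 * 2**m + x0
    y1, y0 = y >> m, y & mask  # y = y1 * 2**m + y0
    low = karatsuba(x0, y0, cutoff_bits)
    high = karatsuba(x1, y1, cutoff_bits)
    dx, dy = x0 - x1, y0 - y1
    cross = karatsuba(abs(dx), abs(dy), cutoff_bits)
    if (dx < 0) != (dy < 0):
        cross = -cross  # restore the sign of (x0 - x1) * (y0 - y1)
    mid = low + high - cross  # equals x0*y1 + x1*y0
    return (high << (2 * m)) + (mid << m) + low
```

Three recursive products replace the four of the schoolbook method, which is where the gain over `mulNaive` comes from once the operands are long enough.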

Performance measured with OpenJDK 7 (64 bit).

— Peter Taylor

Very nice. Thank you.

1

Python 2 (PyPy), 1,134,000 / 486,000 = 2.32

#!/usr/bin/pypy
n=input(); a, b, c=1, 1, 2**((n+2)/4)
for i in range(n-1, n/2, -2): a*=i
for i in range(2, n/4+1): b*=i
print sum(map(int, str(a*c/b)))

Result: 1,537,506

Fun fact: The bottleneck of the code is adding the digits, not computing the binomial coefficient.

— Dennis

Why is Python so slow at adding digits? Both you and xnor say it is. It made me curious, so I clocked mine. It came in at less than a second for the summing part (Java).

— Geobits

@Geobits Hmm, curious. Is Java also able to do binary-to-decimal conversions similarly quickly? It represents integers in binary, right?

— xnor

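That's a good question. For int/Integer/long/Long I know it's binary. I'm not exactly sure what the internal representation of a BigInteger is. If it's decimal, that would definitely explain why it's slow at math but fast to convert to a string. I may look that up tomorrow.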

— Geobits

@Geobits, the internal representation of BigInteger is in base 2.

— Peter Taylor

I always assumed so, but this made me wonder. It looks like it breaks the number into long-sized chunks and converts it that way, at least in OpenJDK.

— Geobits

1

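Java (2,020,000/491,000) = 4.11

Updated, previously 2.24

Java BigInteger isn't the fastest number cruncher, but it's better than nothing.

The basic formula for this seems to be n! / ((n/2)!^2), but that involves a lot of redundant multiplication.

You can get a significant speedup by eliminating all the prime factors found in both the numerator and the denominator. To do this, I first run a simple prime sieve. Then, for each prime, I keep a count of the power it needs to be raised to: increment each time I see a factor in the numerator, decrement for the denominator.

I handle the twos separately (and first), since it's easy to count and eliminate them before factoring.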
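As a side note on the twos, their net exponent has a closed form. This is not how the Java code below counts them; it is a shortcut that follows from Kummer's theorem as used in other answers here: the exponent of 2 in C(n, n/2) equals the number of 1 bits in n/2. A one-line sketch with a hypothetical function name:

```python
def twos_in_central_binomial(n):
    """Exponent of 2 in C(n, n/2) for even n: the number of 1 bits in n/2."""
    return bin(n // 2).count("1")
```

For example, C(10, 5) = 252 = 4 * 63, and 5 is 101 in binary, which has two 1 bits.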

Once that's done, you have the minimum number of multiplications necessary, which is good because BigInteger multiplication is slow.

import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

public class CentBiCo {
    public static void main(String[] args) {
        int n = 2020000;
        long time = System.currentTimeMillis();
        sieve(n);
        System.out.println(sumDigits(cbc(n)));
        System.out.println(System.currentTimeMillis()-time);
    }

    static boolean[] sieve;
    static List<Integer> primes;
    static void sieve(int n){
        primes = new ArrayList<Integer>((int)(Math.sqrt(n)));
        sieve = new boolean[n];
        sieve[2]=true;
        for(int i=3;i<sieve.length;i+=2)
            if(i%2==1)
                sieve[i] = true;
        for(int i=3;i<sieve.length;i+=2){
            if(!sieve[i])
                continue;
            for(int j=i*2;j<sieve.length;j+=i)
                sieve[j] = false;
        }
        for(int i=2;i<sieve.length;i++)
            if(sieve[i])
                primes.add(i);
    }

    static int[] factors;
    static void addFactors(int n, int flip){
        for(int k=0;primes.get(k)<=n;){
            int i = primes.get(k);
            if(n%i==0){
                factors[i] += flip;
                n /= i;
            } else {
                if(++k == primes.size())
                    break;
            }
        }
        factors[n]++;
    }

    static BigInteger cbc(int n){
        factors = new int[n+1];
        int x = n/2;
        for(int i=x%2<1?x+1:x+2;i<n;i+=2)
            addFactors(i,1);
        factors[2] = x;
        for(int i=1;i<=x/2;i++){
            int j=i;
            while(j%2<1 && factors[2] > 1){
                j=j/2;
                factors[2]--;
            }
            addFactors(j,-1);
            factors[2]--;
        }
        BigInteger cbc = BigInteger.ONE;
        for(int i=3;i<factors.length;i++){
            if(factors[i]>0)
                cbc = cbc.multiply(BigInteger.valueOf(i).pow(factors[i]));
        }
        return cbc.shiftLeft(factors[2]);
    }

    static long sumDigits(BigInteger in){
        long sum = 0;
        String str = in.toString();
        for(int i=0;i<str.length();i++)
            sum += str.charAt(i)-'0';
        return sum;
    }
}

Oh, and for verification purposes, the output sum for n = 2020000 is 2735298.

— Geobits

Digit sum of central binomial coefficients
