Using JavaScript's atob to decode base64 doesn't properly decode UTF-8 strings

106

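I'm using the JavaScript window.atob() function to decode a base64-encoded string (specifically the base64-encoded content from the GitHub API). The problem is that I'm getting ASCII-encoded characters back (like â¢ instead of ™). How can I properly handle the incoming base64-encoded stream so that it's decoded as UTF-8?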

javascript encoding utf-8

— brandonscript
Source

3

The MDN page you linked has a paragraph that starts with the phrase "For use with Unicode or UTF-8 strings".

— Pointy

1

Are you on Node? There are better solutions than atob.

— Bergi

269

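There's a great article on Mozilla's MDN docs that describes exactly this issue:

"The Unicode Problem": Since DOMStrings are 16-bit-encoded strings, in most browsers calling window.btoa on a Unicode string will cause a Character Out Of Range exception if a character exceeds the range of an 8-bit byte (0x00~0xFF). There are two possible methods to solve this problem:

The first one is to escape the whole string (with UTF-8, see encodeURIComponent) and then encode it.

The second one is to convert the UTF-16 DOMString to a UTF-8 array of characters and then encode it.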
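As a rough illustration of the second approach, here is a minimal sketch that goes through real UTF-8 bytes. It assumes an environment that provides TextEncoder, TextDecoder, atob and btoa as globals (current browsers, or a recent Node.js). The helper names utf8ToBase64 and base64ToUtf8 are made up for this example and do not come from MDN.

```javascript
// Hypothetical helper names; assumes TextEncoder, TextDecoder, atob and btoa are globals.
function utf8ToBase64(str) {
    // UTF-16 string -> UTF-8 bytes -> "binary string" (one char per byte) -> base64
    var bytes = new TextEncoder().encode(str);
    var binary = '';
    for (var i = 0; i < bytes.length; i++) {
        binary += String.fromCharCode(bytes[i]);
    }
    return btoa(binary);
}

function base64ToUtf8(b64) {
    // base64 -> binary string -> bytes -> decoded as UTF-8
    var binary = atob(b64);
    var bytes = Uint8Array.from(binary, function (c) { return c.charCodeAt(0); });
    return new TextDecoder().decode(bytes);
}

// atob alone hands back one character per byte, which is where the mojibake comes from:
console.log(atob('4oSi') === '\u00e2\u0084\u00a2'); // true: the three UTF-8 bytes of the trademark sign
console.log(base64ToUtf8('4oSi') === '\u2122');      // true: decoded as UTF-8 it is the trademark sign
console.log(utf8ToBase64('✓ à la mode'));             // "4pyTIMOgIGxhIG1vZGU="
```

The first log line reproduces the symptom from the question: the bytes are correct, they are just being shown as if each byte were its own character.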

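A note on previous solutions: the MDN article originally suggested using unescape and escape to solve the Character Out Of Range exception problem, but they have since been deprecated. Some other answers here have suggested working around this with decodeURIComponent and encodeURIComponent; this has proven to be unreliable and unpredictable. The most recent update to this answer uses modern JavaScript functions to improve speed and modernize the code.

If you're trying to save yourself some time, you could also consider using a library:

js-base64 (NPM, great for Node.js)
base64-js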

Encoding UTF8 ⇢ base64

function b64EncodeUnicode(str) {
    // first we use encodeURIComponent to get percent-encoded UTF-8,
    // then we convert the percent encodings into raw bytes which
    // can be fed into btoa.
    return btoa(encodeURIComponent(str).replace(/%([0-9A-F]{2})/g,
        function toSolidBytes(match, p1) {
            return String.fromCharCode('0x' + p1);
    }));
}

b64EncodeUnicode('✓ à la mode'); // "4pyTIMOgIGxhIG1vZGU="
b64EncodeUnicode('\n'); // "Cg=="

Decoding base64 ⇢ UTF8

function b64DecodeUnicode(str) {
    // Going backwards: from bytestream, to percent-encoding, to original string.
    return decodeURIComponent(atob(str).split('').map(function(c) {
        return '%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2);
    }).join(''));
}

b64DecodeUnicode('4pyTIMOgIGxhIG1vZGU='); // "✓ à la mode"
b64DecodeUnicode('Cg=='); // "\n"
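One thing worth knowing about the decoder above: decodeURIComponent throws a URIError when the decoded bytes are not valid UTF-8. The following is only a sketch of one way to deal with that. The wrapper name tryB64DecodeUnicode and the choice to return null are assumptions made for this example.

```javascript
// Same decoder as above, repeated so this sketch is self-contained.
function b64DecodeUnicode(str) {
    return decodeURIComponent(atob(str).split('').map(function (c) {
        return '%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2);
    }).join(''));
}

// Hypothetical wrapper: returns null instead of throwing when the payload is not valid UTF-8.
function tryB64DecodeUnicode(str) {
    try {
        return b64DecodeUnicode(str);
    } catch (e) {
        if (e && e.name === 'URIError') return null;
        throw e; // anything else (for example atob rejecting malformed base64) is passed on
    }
}

console.log(tryB64DecodeUnicode('4pyTIMOgIGxhIG1vZGU=')); // "✓ à la mode"
console.log(tryB64DecodeUnicode('/w=='));                 // null ("/w==" is the single byte 0xFF)
```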

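The pre-2018 solution (functional, and while it likely has better support for older browsers, it is not up to date)

Here is the current recommendation, direct from MDN, with some additional TypeScript compatibility via @MA-Maddin: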

// Encoding UTF8 ⇢ base64

function b64EncodeUnicode(str) {
    return btoa(encodeURIComponent(str).replace(/%([0-9A-F]{2})/g, function(match, p1) {
        return String.fromCharCode(parseInt(p1, 16))
    }))
}

b64EncodeUnicode('✓ à la mode') // "4pyTIMOgIGxhIG1vZGU="
b64EncodeUnicode('\n') // "Cg=="

// Decoding base64 ⇢ UTF8

function b64DecodeUnicode(str) {
    return decodeURIComponent(Array.prototype.map.call(atob(str), function(c) {
        return '%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2)
    }).join(''))
}

b64DecodeUnicode('4pyTIMOgIGxhIG1vZGU=') // "✓ à la mode"
b64DecodeUnicode('Cg==') // "\n"

The original solution (deprecated)

This used escape and unescape (which are now deprecated, though this still works in all modern browsers):

function utf8_to_b64( str ) {
    return window.btoa(unescape(encodeURIComponent( str )));
}

function b64_to_utf8( str ) {
    return decodeURIComponent(escape(window.atob( str )));
}

// Usage:
utf8_to_b64('✓ à la mode'); // "4pyTIMOgIGxhIG1vZGU="
b64_to_utf8('4pyTIMOgIGxhIG1vZGU='); // "✓ à la mode"

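One last thing: I first encountered this problem when calling the GitHub API. To get this to work properly on (Mobile) Safari, I actually had to strip all whitespace from the base64 source before I could even decode it. Whether or not this is still relevant in 2017, I don't know: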

function b64_to_utf8( str ) {
    str = str.replace(/\s/g, '');    
    return decodeURIComponent(escape(window.atob( str )));
}
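For comparison, the same whitespace stripping can be combined with a byte-based decode that avoids escape. This is a sketch only: decodeBase64Content is a hypothetical name, the sample payload is made up, and atob and TextDecoder are assumed to be available as globals.

```javascript
// Hypothetical helper; assumes atob and TextDecoder are available as globals.
function decodeBase64Content(content) {
    var clean = content.replace(/\s/g, '');  // drop line breaks and spaces
    var binary = atob(clean);                 // one character per byte
    var bytes = new Uint8Array(binary.length);
    for (var i = 0; i < binary.length; i++) {
        bytes[i] = binary.charCodeAt(i);
    }
    return new TextDecoder('utf-8').decode(bytes);
}

// Made-up payload: the sample string used above, with line breaks inserted.
var wrapped = '4pyTIMOg\nIGxhIG1v\nZGU=\n';
console.log(decodeBase64Content(wrapped)); // "✓ à la mode"
```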

— brandonscript
Source

1

w3schools.com/jsref/jsref_unescape.asp "The unescape() function was deprecated in JavaScript version 1.5. Use decodeURI() or decodeURIComponent() instead."

— Tedd Hansen

1

You saved my day, bro.

— Mr Neo

2

Update: Solution #1 of MDN's "The Unicode Problem" has been fixed; b64DecodeUnicode('4pyTIMOgIGxhIG1vZGU='); now correctly outputs "✓ à la mode".

— weeix

2

Another way to decode would be decodeURIComponent(atob('4pyTIMOgIGxhIG1vZGU=').split('').map(x => '%' + x.charCodeAt(0).toString(16)).join('')). Not the most performant code, but it is what it is.

— daniel.gindi

2

Use return String.fromCharCode(parseInt(p1, 16)); for TypeScript compatibility.

— Martin Schneider

20

Things change. The escape/unescape methods have been deprecated.

You can URI-encode the string before you Base64-encode it. Note that this does not produce Base64-encoded UTF8, but rather Base64-encoded URL-encoded data. Both sides must agree on the same encoding.

See a working example here: http://codepen.io/anon/pen/PZgbPW

// encode string
var base64 = window.btoa(encodeURIComponent('€ 你好 æøåÆØÅ'));
// decode string
var str = decodeURIComponent(window.atob(base64));
// str is now === '€ 你好 æøåÆØÅ'
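To make the difference between the two encodings concrete, here is a small sketch comparing them for a single character. It assumes TextEncoder and btoa are available as globals; the variable names are only for illustration.

```javascript
var text = '€';

// Base64 of the percent-encoded text: the bytes being encoded are "%E2%82%AC".
var uriThenBase64 = btoa(encodeURIComponent(text));

// Base64 of the UTF-8 bytes themselves (TextEncoder assumed to be available).
var utf8Bytes = new TextEncoder().encode(text);
var utf8Base64 = btoa(String.fromCharCode.apply(null, utf8Bytes));

console.log(uriThenBase64); // "JUUyJTgyJUFD"
console.log(utf8Base64);    // "4oKs"

// A decoder expecting plain Base64-encoded UTF-8 sees the percent-escapes, not the text.
console.log(atob(uriThenBase64)); // "%E2%82%AC"
```

The two outputs are not interchangeable, which is why both sides have to agree on which one is in use.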

For the OP's problem, a third-party library such as js-base64 should solve it.

— Tedd Hansen
Source

1

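I'd like to point out that you're not producing the base64 of the input string, but of its URI-encoded form. So if you send it away, the other party cannot decode it as "base64" and get the original string back.

— Riccardo Galli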

3

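You are correct, I have updated the text to point that out. Thanks. The alternative seems to be implementing base64 yourself, using a third-party library (such as js-base64), or receiving "Error: Failed to execute 'btoa' on 'Window': The string to be encoded contains characters outside of the Latin1 range."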

— Tedd Hansen

14

If treating strings as bytes is more your thing, you can use the following functions:

function u_atob(ascii) {
    return Uint8Array.from(atob(ascii), c => c.charCodeAt(0));
}

function u_btoa(buffer) {
    var binary = [];
    var bytes = new Uint8Array(buffer);
    for (var i = 0, il = bytes.byteLength; i < il; i++) {
        binary.push(String.fromCharCode(bytes[i]));
    }
    return btoa(binary.join(''));
}


// example, it works also with astral plane characters such as '𝒞'
var encodedString = new TextEncoder().encode('✓');
var base64String = u_btoa(encodedString);
console.log('✓' === new TextDecoder().decode(u_atob(base64String)))
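For large buffers, one possible variation is to convert the bytes in chunks rather than one character at a time. This is a sketch under assumptions: bytesToBase64 is a hypothetical name and the chunk size of 0x8000 is an arbitrary choice, not something taken from the functions above.

```javascript
// Hypothetical variant of u_btoa; the chunk size is an arbitrary choice for this sketch.
function bytesToBase64(buffer) {
    var bytes = new Uint8Array(buffer);
    var CHUNK = 0x8000; // keeps each fromCharCode.apply call to a modest argument count
    var parts = [];
    for (var i = 0; i < bytes.length; i += CHUNK) {
        parts.push(String.fromCharCode.apply(null, bytes.subarray(i, i + CHUNK)));
    }
    return btoa(parts.join(''));
}

var big = new TextEncoder().encode('✓'.repeat(100000)); // 300000 bytes of UTF-8
var b64 = bytesToBase64(big);
console.log(b64.length); // 400000
```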

— Riccardo Galli
Source

1

Thanks. Your answer was crucial in helping me get this working, which took me many hours over multiple days. +1. stackoverflow.com/a/51814273/470749

— Ryan

For a much faster and more cross-browser solution (but essentially the same output), see stackoverflow.com/a/53433503/5601591

— Jack Giffin

u_atob and u_btoa use functions available in every browser since IE10 (2012). That looks solid to me (if you are referring to TextEncoder, that's just an example).

— Riccardo Galli

5

Here is the 2018 updated solution as described in the Mozilla development resources.

To encode from Unicode to B64

function b64EncodeUnicode(str) {
    // first we use encodeURIComponent to get percent-encoded UTF-8,
    // then we convert the percent encodings into raw bytes which
    // can be fed into btoa.
    return btoa(encodeURIComponent(str).replace(/%([0-9A-F]{2})/g,
        function toSolidBytes(match, p1) {
            return String.fromCharCode('0x' + p1);
    }));
}

b64EncodeUnicode('✓ à la mode'); // "4pyTIMOgIGxhIG1vZGU="
b64EncodeUnicode('\n'); // "Cg=="

To decode from B64 to Unicode

function b64DecodeUnicode(str) {
    // Going backwards: from bytestream, to percent-encoding, to original string.
    return decodeURIComponent(atob(str).split('').map(function(c) {
        return '%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2);
    }).join(''));
}

b64DecodeUnicode('4pyTIMOgIGxhIG1vZGU='); // "✓ à la mode"
b64DecodeUnicode('Cg=='); // "\n"

— Manuel G
Source

4

The complete article that works for me: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Base64_encoding_and_decoding

The part where we encode from Unicode/UTF-8 is

function utf8_to_b64( str ) {
   return window.btoa(unescape(encodeURIComponent( str )));
}

function b64_to_utf8( str ) {
   return decodeURIComponent(escape(window.atob( str )));
}

// Usage:
utf8_to_b64('✓ à la mode'); // "4pyTIMOgIGxhIG1vZGU="
b64_to_utf8('4pyTIMOgIGxhIG1vZGU='); // "✓ à la mode"

This is one of the most used methods nowadays.

— Rika
Source

That's the same link as in the accepted answer.

— brandonscript

3

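Assuming that one might want a solution that produces a widely usable base64 URI: visit data:text/plain;charset=utf-8;base64,4pi44pi54pi64pi74pi84pi+4pi/ to see a demonstration (copy the data URI, open a new tab, paste the data URI into the address bar, then press Enter to go to the page). Despite the fact that this URI is base64-encoded, the browser is still able to recognize the high code points and decode them properly. The minified encoder + decoder is 1058 bytes (+ Gzip → 589 bytes).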

!function(e){"use strict";function h(b){var a=b.charCodeAt(0);if(55296<=a&&56319>=a)if(b=b.charCodeAt(1),b===b&&56320<=b&&57343>=b){if(a=1024*(a-55296)+b-56320+65536,65535<a)return d(240|a>>>18,128|a>>>12&63,128|a>>>6&63,128|a&63)}else return d(239,191,189);return 127>=a?inputString:2047>=a?d(192|a>>>6,128|a&63):d(224|a>>>12,128|a>>>6&63,128|a&63)}function k(b){var a=b.charCodeAt(0)<<24,f=l(~a),c=0,e=b.length,g="";if(5>f&&e>=f){a=a<<f>>>24+f;for(c=1;c<f;++c)a=a<<6|b.charCodeAt(c)&63;65535>=a?g+=d(a):1114111>=a?(a-=65536,g+=d((a>>10)+55296,(a&1023)+56320)):c=0}for(;c<e;++c)g+="\ufffd";return g}var m=Math.log,n=Math.LN2,l=Math.clz32||function(b){return 31-m(b>>>0)/n|0},d=String.fromCharCode,p=atob,q=btoa;e.btoaUTF8=function(b,a){return q((a?"\u00ef\u00bb\u00bf":"")+b.replace(/[\x80-\uD7ff\uDC00-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF]?/g,h))};e.atobUTF8=function(b,a){a||"\u00ef\u00bb\u00bf"!==b.substring(0,3)||(b=b.substring(3));return p(b).replace(/[\xc0-\xff][\x80-\xbf]*/g,k)}}(""+void 0==typeof global?""+void 0==typeof self?this:self:global)

Below is the source code used to generate it.

var fromCharCode = String.fromCharCode;
var btoaUTF8 = (function(btoa, replacer){"use strict";
    return function(inputString, BOMit){
        return btoa((BOMit ? "\xEF\xBB\xBF" : "") + inputString.replace(
            /[\x80-\uD7ff\uDC00-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF]?/g, replacer
        ));
    }
})(btoa, function(nonAsciiChars){"use strict";
    // make the UTF string into a binary UTF-8 encoded string
    var point = nonAsciiChars.charCodeAt(0);
    if (point >= 0xD800 && point <= 0xDBFF) {
        var nextcode = nonAsciiChars.charCodeAt(1);
        if (nextcode !== nextcode) // NaN because string is 1 code point long
            return fromCharCode(0xef/*11101111*/, 0xbf/*10111111*/, 0xbd/*10111101*/);
        // https://mathiasbynens.be/notes/javascript-encoding#surrogate-formulae
        if (nextcode >= 0xDC00 && nextcode <= 0xDFFF) {
            point = (point - 0xD800) * 0x400 + nextcode - 0xDC00 + 0x10000;
            if (point > 0xffff)
                return fromCharCode(
                    (0x1e/*0b11110*/<<3) | (point>>>18),
                    (0x2/*0b10*/<<6) | ((point>>>12)&0x3f/*0b00111111*/),
                    (0x2/*0b10*/<<6) | ((point>>>6)&0x3f/*0b00111111*/),
                    (0x2/*0b10*/<<6) | (point&0x3f/*0b00111111*/)
                );
        } else return fromCharCode(0xef, 0xbf, 0xbd);
    }
    if (point <= 0x007f) return nonAsciiChars;
    else if (point <= 0x07ff) {
        return fromCharCode((0x6<<5)|(point>>>6), (0x2<<6)|(point&0x3f));
    } else return fromCharCode(
        (0xe/*0b1110*/<<4) | (point>>>12),
        (0x2/*0b10*/<<6) | ((point>>>6)&0x3f/*0b00111111*/),
        (0x2/*0b10*/<<6) | (point&0x3f/*0b00111111*/)
    );
});

Then, to decode the base64 data, either HTTP GET the data as a data URI or use the function below.

var clz32 = Math.clz32 || (function(log, LN2){"use strict";
    return function(x) {return 31 - log(x >>> 0) / LN2 | 0};
})(Math.log, Math.LN2);
var fromCharCode = String.fromCharCode;
var atobUTF8 = (function(atob, replacer){"use strict";
    return function(inputString, keepBOM){
        inputString = atob(inputString);
        if (!keepBOM && inputString.substring(0,3) === "\xEF\xBB\xBF")
            inputString = inputString.substring(3); // eradicate UTF-8 BOM
        // 0xc0 => 0b11000000; 0xff => 0b11111111; 0xc0-0xff => 0b11xxxxxx
        // 0x80 => 0b10000000; 0xbf => 0b10111111; 0x80-0xbf => 0b10xxxxxx
        return inputString.replace(/[\xc0-\xff][\x80-\xbf]*/g, replacer);
    }
})(atob, function(encoded){"use strict";
    var codePoint = encoded.charCodeAt(0) << 24;
    var leadingOnes = clz32(~codePoint);
    var endPos = 0, stringLen = encoded.length;
    var result = "";
    if (leadingOnes < 5 && stringLen >= leadingOnes) {
        codePoint = (codePoint<<leadingOnes)>>>(24+leadingOnes);
        for (endPos = 1; endPos < leadingOnes; ++endPos)
            codePoint = (codePoint<<6) | (encoded.charCodeAt(endPos)&0x3f/*0b00111111*/);
        if (codePoint <= 0xFFFF) { // BMP code point
          result += fromCharCode(codePoint);
        } else if (codePoint <= 0x10FFFF) {
          // https://mathiasbynens.be/notes/javascript-encoding#surrogate-formulae
          codePoint -= 0x10000;
          result += fromCharCode(
            (codePoint >> 10) + 0xD800,  // highSurrogate
            (codePoint & 0x3ff) + 0xDC00 // lowSurrogate
          );
        } else endPos = 0; // to fill it in with INVALIDs
    }
    for (; endPos < stringLen; ++endPos) result += "\ufffd"; // replacement character
    return result;
});
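The decoder above leans on one trick that is easy to miss: shifting the lead byte into the top 8 bits of a 32-bit integer, inverting it, and counting leading zeros yields the number of leading 1 bits in that byte. In UTF-8 that count is the length of the sequence (2, 3 or 4), and it is 0 for plain ASCII. A small sketch of just that step, where leadingOnes is a hypothetical helper name:

```javascript
// Fallback in the same spirit as the decoder above, for engines without Math.clz32.
var clz32 = Math.clz32 || function (x) {
    return 31 - Math.log(x >>> 0) / Math.LN2 | 0;
};

// Hypothetical helper: number of leading 1 bits in a byte value (0-255).
function leadingOnes(byte) {
    return clz32(~(byte << 24));
}

console.log(leadingOnes(0x41)); // 0  (0b01000001, ASCII "A")
console.log(leadingOnes(0xC3)); // 2  (0b11000011, lead byte of a 2-byte sequence)
console.log(leadingOnes(0xE2)); // 3  (0b11100010, lead byte of a 3-byte sequence)
console.log(leadingOnes(0xF0)); // 4  (0b11110000, lead byte of a 4-byte sequence)
```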

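The advantage of being more standard is that this encoder and this decoder are more widely applicable, because their output can be used as a valid URL that displays correctly. Observe.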

(function(window){
    "use strict";
    var sourceEle = document.getElementById("source");
    var urlBarEle = document.getElementById("urlBar");
    var mainFrameEle = document.getElementById("mainframe");
    var gotoButton = document.getElementById("gotoButton");
    var parseInt = window.parseInt;
    var fromCodePoint = String.fromCodePoint;
    var parse = JSON.parse;
    
    function unescape(str){
        return str.replace(/\\u[\da-f]{0,4}|\\x[\da-f]{0,2}|\\u{[^}]*}|\\[bfnrtv"'\\]|\\0[0-7]{1,3}|\\\d{1,3}/g, function(match){
          try{
            if (match.startsWith("\\u{"))
              return fromCodePoint(parseInt(match.slice(2,-1),16));
            if (match.startsWith("\\u") || match.startsWith("\\x"))
              return fromCodePoint(parseInt(match.substring(2),16));
            if (match.startsWith("\\0") && match.length > 2)
              return fromCodePoint(parseInt(match.substring(2),8));
            if (/^\\\d/.test(match)) return fromCodePoint(+match.slice(1));
          }catch(e){return "\ufffd".repeat(match.length)}
          return parse('"' + match + '"');
        });
    }
    
    function whenChange(){
      try{ urlBarEle.value = "data:text/plain;charset=UTF-8;base64," + btoaUTF8(unescape(sourceEle.value), true);
      } finally{ gotoURL(); }
    }
    sourceEle.addEventListener("change",whenChange,{passive:1});
    sourceEle.addEventListener("input",whenChange,{passive:1});
    
    // IFrame Setup:
    function gotoURL(){mainFrameEle.src = urlBarEle.value}
    gotoButton.addEventListener("click", gotoURL, {passive: 1});
    function urlChanged(){urlBarEle.value = mainFrameEle.src}
    mainFrameEle.addEventListener("load", urlChanged, {passive: 1});
    urlBarEle.addEventListener("keypress", function(evt){
      if (evt.key === "Enter") evt.preventDefault(), urlChanged();
    }, {passive: 1});
    
        
    var fromCharCode = String.fromCharCode;
    var btoaUTF8 = (function(btoa, replacer){
		    "use strict";
        return function(inputString, BOMit){
        	return btoa((BOMit?"\xEF\xBB\xBF":"") + inputString.replace(
        		/[\x80-\uD7ff\uDC00-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF]?/g, replacer
    		));
    	}
    })(btoa, function(nonAsciiChars){
		"use strict";
    	// make the UTF string into a binary UTF-8 encoded string
    	var point = nonAsciiChars.charCodeAt(0);
    	if (point >= 0xD800 && point <= 0xDBFF) {
    		var nextcode = nonAsciiChars.charCodeAt(1);
    		if (nextcode !== nextcode) { // NaN because string is 1code point long
    			return fromCharCode(0xef/*11101111*/, 0xbf/*10111111*/, 0xbd/*10111101*/);
    		}
    		// https://mathiasbynens.be/notes/javascript-encoding#surrogate-formulae
    		if (nextcode >= 0xDC00 && nextcode <= 0xDFFF) {
    			point = (point - 0xD800) * 0x400 + nextcode - 0xDC00 + 0x10000;
    			if (point > 0xffff) {
    				return fromCharCode(
    					(0x1e/*0b11110*/<<3) | (point>>>18),
    					(0x2/*0b10*/<<6) | ((point>>>12)&0x3f/*0b00111111*/),
    					(0x2/*0b10*/<<6) | ((point>>>6)&0x3f/*0b00111111*/),
    					(0x2/*0b10*/<<6) | (point&0x3f/*0b00111111*/)
    				);
    			}
    		} else {
    			return fromCharCode(0xef, 0xbf, 0xbd);
    		}
    	}
    	if (point <= 0x007f) { return nonAsciiChars; }
    	else if (point <= 0x07ff) {
    		return fromCharCode((0x6<<5)|(point>>>6), (0x2<<6)|(point&0x3f/*00111111*/));
    	} else {
    		return fromCharCode(
    			(0xe/*0b1110*/<<4) | (point>>>12),
    			(0x2/*0b10*/<<6) | ((point>>>6)&0x3f/*0b00111111*/),
    			(0x2/*0b10*/<<6) | (point&0x3f/*0b00111111*/)
    		);
    	}
    });
    setTimeout(whenChange, 0);
})(window);

img:active{opacity:0.8}

<center>
<textarea id="source" style="width:66.7vw">Hello \u1234 W\186\0256ld!
Enter text into the top box. Then the URL will update automatically.
</textarea><br />
<div style="width:66.7vw;display:inline-block;height:calc(25vw + 1em + 6px);border:2px solid;text-align:left;line-height:1em">
<input id="urlBar" style="width:calc(100% - 1em - 13px)" /><img id="gotoButton" src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABsAAAAeCAMAAADqx5XUAAAAclBMVEX///9NczZ8e32ko6fDxsU/fBoSQgdFtwA5pAHVxt+7vLzq5ex23y4SXABLiiTm0+/c2N6DhoQ6WSxSyweVlZVvdG/Uz9aF5kYlbwElkwAggACxs7Jl3hX07/cQbQCar5SU9lRntEWGum+C9zIDHwCGnH5IvZAOAAABmUlEQVQoz7WS25acIBBFkRLkIgKKtOCttbv//xdDmTGZzHv2S63ltuBQQP4rdRiRUP8UK4wh6nVddQwj/NtDQTvac8577zTQb72zj65/876qqt7wykU6/1U6vFEgjE1mt/5LRqrpu7oVsn0sjZejMfxR3W/yLikqAFcUx93YxLmZGOtElmEu6Ufd9xV3ZDTGcEvGLbMk0mHHlUSvS5svCwS+hVL8loQQyfpI1Ay8RF/xlNxcsTchGjGDIuBG3Ik7TMyNxn8m0TSnBAK6Z8UZfp3IbAonmJvmsEACum6aNv7B0CnvpezDcNhw9XWsuAr7qnRg6dABmeM4dTgn/DZdXWs3LMspZ1KDMt1kcPJ6S1icWNp2qaEmjq6myx7jbQK3VKItLJaW5FR+cuYlRhYNKzGa9vF4vM5roLW3OSVjkmiGJrPhUq301/16pVKZRGFYWjTP50spTxBN5Z4EKnSonruk+n4tUokv1aJSEl/MLZU90S3L6/U6o0J142iQVp3HcZxKSo8LfkNRCtJaKYFSRX7iaoAAUDty8wvWYR6HJEepdwAAAABJRU5ErkJggg==" style="width:calc(1em + 4px);line-height:1em;vertical-align:-40%;cursor:pointer" />
<iframe id="mainframe" style="width:66.7vw;height:25vw" frameBorder="0"></iframe>
</div>
</center>


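In addition to being very standardized, the above code snippets are also very fast. Instead of an indirect chain of succession in which the data has to be converted several times between various forms (such as in Riccardo Galli's answer), the above code snippet is as direct as possible. It uses only one simple, fast String.prototype.replace call to process the data when encoding, and only one to decode the data when decoding. Another plus is that (especially for big strings) String.prototype.replace lets the browser handle the underlying memory management of resizing the string automatically, which leads to a significant performance boost, especially in evergreen browsers like Chrome and Firefox that heavily optimize String.prototype.replace. Finally, the icing on the cake is that for Latin-script-only users, strings that don't contain any code points above 0x7f are extra fast to process, because the string remains unmodified by the replacement algorithm.

I have created a GitHub repository for this solution at https://github.com/anonyco/BestBase64EncoderDecoder/

— Jack Giffin
Source

Can you elaborate on what you mean by "user-created way" vs. "interpretable by the browser"? What is the value-add of using this solution over, say, what Mozilla recommends?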

— brandonscript

@brandonscript Mozilla is different from MDN. MDN is user-created content. The page on MDN that recommends your solution is user-created content, not content created by a browser vendor.

— Jack Giffin

Is your solution vendor-created? If so, I'd suggest giving credit to its origin. If not, then it's also user-created, and no different from MDN's answer?

— brandonscript

@brandonscript Good point. You are correct. I removed that piece of text. Also, check out the demo I added.

— Jack Giffin

0

Small correction: unescape and escape are deprecated, so:

function utf8_to_b64( str ) {
    return window.btoa(decodeURIComponent(encodeURIComponent(str)));
}

function b64_to_utf8( str ) {
     return decodeURIComponent(encodeURIComponent(window.atob(str)));
}


function b64_to_utf8( str ) {
    str = str.replace(/\s/g, '');    
    return decodeURIComponent(encodeURIComponent(window.atob(str)));
}

— Darkves
Source

2

Looks like the doc link is now different even from this, suggesting a regex solution to manage it.

— brandonscript

2

This will not work, because encodeURIComponent is the inverse of decodeURIComponent, i.e. it will just undo the conversion. See stackoverflow.com/a/31412163/1534459 for a great explanation of what is happening with escape and unescape.

— bodo

1

@canaaerus I don't understand your comment. escape and unescape are deprecated; I just swapped them for the [decode|encode]URIComponent functions :-) Everything works just fine. Read the question first.

— Darkves

1

@Darkves: The reason encodeURIComponent is used is to correctly handle (the entire range of) Unicode strings. So, for example, window.btoa(decodeURIComponent(encodeURIComponent('€'))) gives Error: String contains an invalid character, because it is the same as window.btoa('€'), and btoa cannot encode €.

— bodo

2

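@Darkves: Yes, that's correct. But you can't swap escape for encodeURIComponent and unescape for decodeURIComponent, because the encode and the escape methods don't do the same thing. The same goes for decode and unescape. I originally made the same mistake, btw. You should notice that if you take a string, URI-encode it, then URI-decode it, you get back the same string you put in. So doing that would be nonsense. When you unescape a string encoded with encodeURIComponent, you don't get back the same string you put in, which is why it works with escape/unescape, but not with your version.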

— Stefan Steiger
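The point made in the comments above can be checked directly. This is a small sketch, assuming an environment where btoa and the deprecated unescape still exist as globals:

```javascript
var euro = '€';

// encodeURIComponent followed by decodeURIComponent is a round trip: nothing changes.
console.log(decodeURIComponent(encodeURIComponent(euro)) === euro); // true

// unescape turns each %XX into a single character with that byte value.
var binary = unescape(encodeURIComponent(euro));
console.log(binary.length);                   // 3
console.log(binary === '\u00e2\u0082\u00ac'); // true
console.log(btoa(binary));                    // "4oKs"

// btoa on the untouched string fails because '€' (U+20AC) is outside 0x00-0xFF.
var threw = false;
try { btoa(euro); } catch (e) { threw = true; }
console.log(threw); // true
```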

0

Here's some future-proof code for browsers that may lack escape/unescape(). Note that IE 9 and older don't support atob/btoa(), so you'd need to use custom base64 functions for them.

// Polyfill for escape/unescape
if( !window.unescape ){
    window.unescape = function( s ){
        return s.replace( /%([0-9A-F]{2})/g, function( m, p ) {
            return String.fromCharCode( '0x' + p );
        } );
    };
}
if( !window.escape ){
    window.escape = function( s ){
        var chr, hex, i = 0, l = s.length, out = '';
        for( ; i < l; i ++ ){
            chr = s.charAt( i );
            if( chr.search( /[A-Za-z0-9\@\*\_\+\-\.\/]/ ) > -1 ){
                out += chr; continue; }
            hex = s.charCodeAt( i ).toString( 16 );
            out += '%' + ( hex.length % 2 != 0 ? '0' : '' ) + hex;
        }
        return out;
    };
}

// Base64 encoding of UTF-8 strings
var utf8ToB64 = function( s ){
    return btoa( unescape( encodeURIComponent( s ) ) );
};
var b64ToUtf8 = function( s ){
    return decodeURIComponent( escape( atob( s ) ) );
};

A more comprehensive example of UTF-8 encoding and decoding can be found here: http://jsfiddle.net/47zwb41o/

— Beejor
Source

-1

If you are still facing issues even with the solutions above, try the following. It covers the case where escape is not supported in TS.

const blob = new Blob(["\ufeff", csv_content]); // the leading BOM makes the symbols appear correctly in Excel

For csv_content you can try something like the following.

function b64DecodeUnicode(str: string): string {
    // base64 -> byte string -> percent-encoding -> original string
    return decodeURIComponent(atob(str).split('').map((c: string) => {
        return '%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2);
    }).join(''));
}

— Diwakar
Source