jquery - Javascript, convert unicode string to Javascript escape? - Stack Overflow

I have a variable that contains a string of Japanese characters, for instance:

"みどりいろ"

How would I go about converting this to its JavaScript escape form?

The result I am after for this example specifically is:

"\u307f\u3069\u308a\u3044\u308d"

I'd prefer a jQuery approach if there is one.

Asked Jan 9, 2014 at 7:51 by Jamus
  • 1 @SergeiZahharenko - escape("abc") //"abc"... – Derek 朕會功夫 Commented Jan 9, 2014 at 8:06

6 Answers

"み".charCodeAt(0).toString(16);

This gives you the character's UTF-16 code unit (which is its code point for any character below 0x10000) in hexadecimal. You can run it over the whole string in a loop:

String.prototype.toUnicode = function(){
    var result = "";
    for(var i = 0; i < this.length; i++){
        // Assumption: all characters are < 0xffff
        result += "\\u" + ("000" + this[i].charCodeAt(0).toString(16)).substr(-4);
    }
    return result;
};

"みどりいろ".toUnicode();       //"\u307f\u3069\u308a\u3044\u308d"
"Mi Do Ri I Ro".toUnicode();  //"\u004d\u0069\u0020\u0044\u006f\u0020\u0052\u0069\u0020\u0049\u0020\u0052\u006f"
"Green".toUnicode();          //"\u0047\u0072\u0065\u0065\u006e"

Demo: http://jsfiddle.net/DerekL/X7MCy/

More on: .charCodeAt
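
The loop above works on UTF-16 code units, so a character outside the Basic Multilingual Plane comes out as two escapes (a surrogate pair), which JavaScript reads back as the original character. Below is a minimal sketch of the same idea as a standalone function instead of a String.prototype extension. The name `toUnicodeEscapes` is hypothetical, and `padStart` assumes an ES2017 or later runtime.

```javascript
// Hypothetical helper: escape every UTF-16 code unit as \uXXXX.
function toUnicodeEscapes(str) {
    let result = "";
    for (let i = 0; i < str.length; i++) {
        // charCodeAt returns a code unit in the range 0..0xffff,
        // so four hex digits are always enough.
        result += "\\u" + str.charCodeAt(i).toString(16).padStart(4, "0");
    }
    return result;
}

console.log(toUnicodeEscapes("みどりいろ")); // \u307f\u3069\u308a\u3044\u308d
console.log(toUnicodeEscapes("😀"));         // \ud83d\ude00 (surrogate pair)
```

Keeping the helper off String.prototype avoids changing a built-in that other scripts on the page may share.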

The answer above is reasonable. A slight space and performance optimization that leaves printable ASCII untouched and escapes only the remaining characters:

function escapeUnicode(str) {
    return str.replace(/[^\0-~]/g, function(ch) {
        return "\\u" + ("000" + ch.charCodeAt().toString(16)).slice(-4);
    });
}
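
One way to check an escaper like this is to round-trip its output through `JSON.parse`, which understands `\uXXXX` escapes. This is only a sketch: it assumes the input contains no double quotes, backslashes, or control characters, since those would need their own escaping before being wrapped in quotes. `escapeNonAscii` and `roundTrips` are hypothetical names; the first is a copy of the function above so that the block is self-contained.

```javascript
function escapeNonAscii(str) {
    // Escape everything outside U+0000..U+007E; printable ASCII is left as-is.
    return str.replace(/[^\0-~]/g, function (ch) {
        return "\\u" + ("000" + ch.charCodeAt(0).toString(16)).slice(-4);
    });
}

// Assumption: str has no quotes, backslashes, or control characters.
function roundTrips(str) {
    const escaped = escapeNonAscii(str);
    return JSON.parse('"' + escaped + '"') === str;
}

console.log(escapeNonAscii("Green: みどりいろ")); // Green: \u307f\u3069\u308a\u3044\u308d
console.log(roundTrips("Green: みどりいろ"));     // true
```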

Just

escape("みどりいろ")

should meet the needs for most cases, but if you need it in the form of "\u" instead of "%xx" / "%uxxxx", then you might want to use regular expressions:

escape("みどりいろ").replace(/%/g, '\\').toLowerCase()

escape("みどりいろ").replace(/%u([A-F0-9]{4})|%([A-F0-9]{2})/g, function(_, u, x) { return "\\u" + (u || '00' + x).toLowerCase() });

(toLowerCase is optional to make it look exactly like in the first post)

In most cases it doesn't escape characters that don't need escaping, which may be a plus for you; if not, see Derek's answer, or use my version:

'\\u' + "みどりいろ".split('').map(function(t) { return ('000' + t.charCodeAt(0).toString(16)).substr(-4) }).join('\\u');

My version of the code, based on the previous answers. I use it to escape non-ASCII characters in JSON.stringify().

const toUTF8 = string =>
    string.split('').map(
        ch => !ch.match(/^[^a-z0-9\s\t\r\n_|\\+()!@#$%^&*=?/~`:;'"\[\]\-]+$/i)
            ? ch
            // Pad the hex code to exactly four digits
            : '\\u' + ('000' + ch.charCodeAt(0).toString(16)).slice(-4)
    ).join('');

Usage:

JSON.stringify({key: 'Категория дли импорта'}, (key, value) => {
    if (typeof value === "string") {
        return toUTF8(value);
    }

    return value;
});

Returns JSON (JSON.stringify escapes the backslash itself, hence the doubled "\\"):

{"key":"\\u041a\\u0430\\u0442\\u0435\\u0433\\u043e\\u0440\\u0438\\u044f \\u0434\\u043b\\u0438 \\u0438\\u043c\\u043f\\u043e\\u0440\\u0442\\u0430"}
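
Since the replacer's return value is escaped again by `JSON.stringify`, the output above parses back to the literal text `\u041a...` and not to the original characters. If the goal is ASCII-only JSON that still decodes to the original string, one option is to escape after serializing instead. Below is a sketch under that assumption; `stringifyAscii` is a hypothetical name, and the value is assumed to be JSON-serializable.

```javascript
// Hypothetical helper: serialize first, then escape non-ASCII code units.
// Assumption: value is serializable (JSON.stringify does not return undefined).
function stringifyAscii(value) {
    // Non-ASCII characters can only appear inside JSON string literals,
    // where a \uXXXX escape is an equivalent spelling of the same character.
    return JSON.stringify(value).replace(/[\u007f-\uffff]/g, function (ch) {
        return "\\u" + ("000" + ch.charCodeAt(0).toString(16)).slice(-4);
    });
}

const json = stringifyAscii({ key: "Категория" });
console.log(json);                 // {"key":"\u041a\u0430\u0442\u0435\u0433\u043e\u0440\u0438\u044f"}
console.log(JSON.parse(json).key); // Категория
```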

Just use the encodeURI function (note that it produces percent-encoded UTF-8 bytes, not "\u" escapes):

encodeURI("みどりいろ")
"%E3%81%BF%E3%81%A9%E3%82%8A%E3%81%84%E3%82%8D"

And on the other side, decode it back:

decodeURI("%E3%81%BF%E3%81%A9%E3%82%8A%E3%81%84%E3%82%8D")
"みどりいろ"

This function worked for me. It escapes every character except ASCII letters:

function toUnicode(word) {
    // Escape every character that is not an ASCII letter
    return word.split("").map((character) => {
        if (/[^a-zA-Z]/.test(character)) {
            // Pad the hex code to exactly four digits
            const conversion = ("000" + character.charCodeAt(0).toString(16)).slice(-4);
            return "\\u" + conversion;
        }
        return character;
    }).join("");
}