
UTF-8 vs CP949: Why Korean Text Gets Garbled

5 min read · Last updated: 2026-05-08

What is character encoding?

Computers store characters as numbers (bytes). A character encoding is the mapping that defines which number corresponds to which character. The same bytes will display as completely different characters if read with the wrong encoding.
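As a small illustration (a Python sketch, not part of the original article; the sample word is an arbitrary choice), the same two characters map to different byte sequences under each encoding, and the bytes decode back correctly only with the encoding that produced them:

```python
# Illustrative sample text (an arbitrary choice for this sketch).
word = "한글"

# The same characters become different byte sequences under each encoding.
utf8_bytes = word.encode("utf-8")
cp949_bytes = word.encode("cp949")

print(utf8_bytes.hex(" "))   # ed 95 9c ea b8 80
print(cp949_bytes.hex(" "))  # c7 d1 b1 db

# Decoding with the encoding that produced the bytes recovers the text.
assert utf8_bytes.decode("utf-8") == word
assert cp949_bytes.decode("cp949") == word
```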

UTF-8 vs CP949

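Attribute           | UTF-8                            | CP949 (extended EUC-KR)
--------------------+----------------------------------+-----------------------------------------------
Coverage            | All Unicode characters worldwide | Korean plus ASCII and a limited set of others
Korean bytes/char   | 3 bytes                          | 2 bytes
Standard            | Unicode (international)          | Microsoft extension
Typical environment | Web, Linux, macOS                | Legacy Windows apps
BOM                 | Optional                         | None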
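The byte-size and BOM rows can be checked with a short Python sketch. The sample character is an arbitrary choice, and "utf-8-sig" is the name of Python's codec that writes and strips the optional UTF-8 BOM; other tools may label this variant differently.

```python
import codecs

sample = "가"  # one Hangul syllable (illustrative)

utf8_len = len(sample.encode("utf-8"))    # 3 bytes per Hangul syllable
cp949_len = len(sample.encode("cp949"))   # 2 bytes per Hangul syllable

# The "utf-8-sig" codec prepends the optional UTF-8 BOM (EF BB BF) ...
with_bom = sample.encode("utf-8-sig")
has_bom = with_bom.startswith(codecs.BOM_UTF8)

# ... and strips it again on decode, so the text itself is unchanged.
roundtrip = with_bom.decode("utf-8-sig")

print(utf8_len, cp949_len, has_bom, roundtrip == sample)  # 3 2 True True
```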

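Why does mojibake (garbled text) occur?

When a CP949-encoded file is opened as UTF-8, the 2-byte Korean sequences are not valid UTF-8, so the decoder substitutes replacement characters (shown as ? or �). This phenomenon is called mojibake (文字化け). The reverse mismatch, a UTF-8 file opened as CP949, typically produces unrelated Hangul or Hanja characters instead.

Example: "가" in CP949 is 0xB0 0xA1. In UTF-8, both bytes fall in the continuation-byte range (0x80 to 0xBF) and can only follow a lead byte, so neither can start a character and a decoder typically shows a replacement character for each.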
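The misread can be reproduced in a few lines of Python. This is a minimal sketch of the case described above, using Python's built-in codecs; other decoders may report or substitute invalid bytes differently.

```python
original = "가"
cp949_bytes = original.encode("cp949")  # b'\xb0\xa1'

# Strict UTF-8 decoding rejects the bytes outright.
try:
    cp949_bytes.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# Lenient decoding substitutes U+FFFD for each invalid byte instead.
garbled = cp949_bytes.decode("utf-8", errors="replace")

print(strict_failed)                   # True
print([hex(ord(c)) for c in garbled])  # ['0xfffd', '0xfffd']
```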

Detecting and converting encodings

  1. Check encoding: Most text editors show the current encoding in the status bar. On Linux/macOS, the file command or Python's chardet library can guess it programmatically; both rely on heuristics, so treat the result as a hint rather than a guarantee.
  2. Convert: Editors typically offer "Save with encoding" options. On the command line: iconv -f CP949 -t UTF-8 input.txt > output.txt.
  3. Web development: The <meta charset="UTF-8"> declaration must match the actual file encoding, otherwise browsers will misinterpret the file.
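The check-and-convert steps above can be sketched with the Python standard library alone. The function name convert_to_utf8 and the file names are hypothetical, and the sketch assumes the input is either UTF-8 or CP949: it tries strict UTF-8 first and falls back to CP949, which is a simple heuristic rather than real detection.

```python
from pathlib import Path
import tempfile


def convert_to_utf8(src: Path, dst: Path) -> str:
    """Re-encode src as UTF-8 and return the encoding it was read with.

    Assumption for this sketch: the input is either UTF-8 or CP949.
    """
    raw = src.read_bytes()
    try:
        text = raw.decode("utf-8")   # strict: fails on typical CP949 Korean bytes
        detected = "utf-8"
    except UnicodeDecodeError:
        text = raw.decode("cp949")   # fallback; raises if the file is neither
        detected = "cp949"
    dst.write_bytes(text.encode("utf-8"))
    return detected


# Usage: write a CP949 file, convert it, and read the result back as UTF-8.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "input.txt"
    dst = Path(tmp) / "output.txt"
    src.write_bytes("한글 테스트".encode("cp949"))
    detected = convert_to_utf8(src, dst)
    converted = dst.read_text(encoding="utf-8")
    print(detected, converted)  # cp949 한글 테스트
```

Trying UTF-8 first works because Korean text encoded as CP949 rarely happens to form valid UTF-8. A pure-ASCII file is reported as UTF-8, which is harmless since the bytes are identical in both encodings.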

Key takeaways

  • Encoding mismatch is the root cause of garbled Korean text.
  • Always use UTF-8 for new projects — it is the international standard with the broadest compatibility.
  • Develop a habit of checking the encoding before opening files.
  • Convert CP949 ↔ UTF-8 with iconv or your editor's encoding conversion feature.
