Alert

This post was written with the help of Claude Code.

TL;DR

  • Serialization converts an in-memory object into a form that can be stored or transmitted
  • JSON is a human-readable, language-neutral text format; pickle is a Python-only binary format
  • Use JSON to talk to external systems, pickle for Python-internal storage, and Avro/Protobuf for large-scale data pipelines


1. What Is Serialization

Why It Is Needed

An object created in Python lives in memory. To save it to a file, send it over a network, or hand it to another process, you have to get it out of memory. The problem is that an in-memory object is a tangle of pointers, references, and internal structures, so it cannot be stored or transmitted as it is.

Serialization is the process of converting that in-memory object into a contiguous sequence of bytes or a string. The reverse, restoring an object from bytes or a string, is called deserialization.

In-memory object → [serialize] → bytes/string → [deserialize] → in-memory object

Other Names for Serialization

The same concept goes by different names depending on context.

  • Serialization / Deserialization: the most general terms
  • Marshalling / Unmarshalling: mostly used for RPC and network communication
  • Pickling / Unpickling: Python-specific terms
  • Encoding / Decoding: mostly used for text formats such as JSON

What Happens Without Serialization

user = {"name": "alice", "age": 30, "scores": [95, 87, 92]}
 
# What if we want to save it to a file?
with open("user.txt", "w") as f:
    f.write(str(user))  # "{'name': 'alice', 'age': 30, 'scores': [95, 87, 92]}"
 
# And when we read it back?
with open("user.txt", "r") as f:
    data = f.read()
    print(type(data))  # <class 'str'>: a string, not a dict
    # eval(data) could rebuild it, but never do that: it executes arbitrary code

Converting with str() only produces text that looks like the dict; the type information is gone when you read it back. Serialization solves this by preserving both type and structure.
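As a minimal sketch of that difference, the snippet below contrasts str() with a round trip through the standard json module. It reuses the `user` dict from the example above; the variable names `as_text`, `as_json`, and `restored` are only illustrative.

```python
import json

user = {"name": "alice", "age": 30, "scores": [95, 87, 92]}

as_text = str(user)          # looks like a dict, but is only text
as_json = json.dumps(user)   # a real serialization format

restored = json.loads(as_json)
print(type(as_text).__name__)   # str
print(type(restored).__name__)  # dict
print(restored == user)         # True
```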


2. Where Serialization Is Used

Serialization shows up almost everywhere data leaves memory.

| Situation | What gets serialized | Common formats |
|---|---|---|
| REST API request/response | dict, list → HTTP body | JSON |
| File storage (config, state) | object → file | JSON, YAML, pickle |
| Message queues (Kafka, RabbitMQ) | message object → bytes | JSON, Avro, Protobuf |
| Caches (Redis, Memcached) | object → bytes | pickle, JSON, msgpack |
| Database storage (BLOB, JSON columns) | object → bytes/string | JSON, pickle |
| RPC (gRPC, Thrift) | function arguments/return values → bytes | Protobuf, Thrift |
| Inter-process communication (IPC) | object → bytes | pickle |
| ML model storage | trained model → file | pickle, joblib, ONNX |

Serialization from a Data Engineering (DE) Perspective

In a data pipeline, the choice of serialization format directly affects performance and compatibility. JSON messages in Kafka are easy for people to read, but they are larger and slower to parse. Avro and Protobuf enforce a schema and are faster because they are binary, but they are harder to debug.
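To make the size trade-off concrete, here is a rough sketch that encodes the same record as JSON text and as a fixed binary layout using the standard struct module. The record and the layout `<Id?` are hypothetical, and struct only stands in for the idea of a schema-driven binary format; it is not how Avro or Protobuf actually encode data.

```python
import json
import struct

record = {"user_id": 12345, "score": 98.5, "active": True}

# Text encoding: the field names travel with every message.
json_bytes = json.dumps(record, separators=(",", ":")).encode("utf-8")

# Binary encoding with a fixed, agreed-upon layout (hypothetical schema):
# little-endian unsigned 32-bit int, 64-bit float, 1-byte bool.
LAYOUT = struct.Struct("<Id?")
binary = LAYOUT.pack(record["user_id"], record["score"], record["active"])

print(len(json_bytes), len(binary))

# The reader needs the same layout to make sense of the bytes.
user_id, score, active = LAYOUT.unpack(binary)
print(user_id, score, active)
```

The binary form is smaller because the field names and types live in the shared layout rather than in each message, which is also why the raw bytes are hard to inspect by eye.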


3. Comparing Serialization Formats

Text Formats

| Format | Characteristics | Pros | Cons |
|---|---|---|---|
| JSON | key-value structure, web standard | human-readable, language-neutral | no native binary data, slow |
| XML | tag-based, enterprise | schema validation (XSD), namespaces | verbose, slow to parse |
| YAML | indentation-based | highly readable, suits config files | slow to parse, complex spec |
| CSV | row/column structure | simple, spreadsheet-compatible | no nesting, no type information |
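The "no type information" drawback of CSV can be seen with a short round trip through the standard csv module. This is only an illustrative sketch; the row contents are made up.

```python
import csv
import io

rows = [{"name": "alice", "age": 30, "active": True}]

# Write to an in-memory buffer that stands in for a .csv file.
buffer = io.StringIO(newline="")
writer = csv.DictWriter(buffer, fieldnames=["name", "age", "active"])
writer.writeheader()
writer.writerows(rows)

# Read it back: every value comes back as a string.
buffer.seek(0)
restored = list(csv.DictReader(buffer))
print(restored)  # [{'name': 'alice', 'age': '30', 'active': 'True'}]
```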

Binary Formats

| Format | Characteristics | Pros | Cons |
|---|---|---|---|
| pickle | Python-only | supports almost every Python object | usable only from Python, insecure |
| Protobuf | Google, schema required (.proto) | fast, language-neutral, type-safe | needs a schema definition, not human-readable |
| Avro | Apache, embedded schema | schema evolution, Hadoop ecosystem | Java-centric ecosystem |
| MessagePack | JSON-like structure | faster and smaller than JSON | no schema |

Selection Criteria

  • People need to read it → JSON, YAML
  • Used only inside Python → pickle
  • Cross-language compatibility plus performance → Protobuf, MessagePack
  • Large-scale data pipelines → Avro, Protobuf

4. The Python json Module

JSON is the most widely used text serialization format. Python's json module is part of the standard library.

Basic Usage

import json
 
# Serialize (Python → JSON string)
data = {"name": "alice", "age": 30, "scores": [95, 87, 92]}
json_str = json.dumps(data)
print(json_str)  # {"name": "alice", "age": 30, "scores": [95, 87, 92]}
 
# Deserialize (JSON string → Python)
restored = json.loads(json_str)
print(restored["name"])  # alice
print(type(restored))    # <class 'dict'>

File I/O

# Save to a file (explicit UTF-8, since ensure_ascii=False may write non-ASCII text)
with open("data.json", "w", encoding="utf-8") as f:
    json.dump(data, f, indent=2, ensure_ascii=False)
 
# Read from a file
with open("data.json", "r", encoding="utf-8") as f:
    loaded = json.load(f)

dump/load take a file object, while dumps/loads work with strings. The s stands for string.
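The relationship between the two pairs can be checked with an in-memory file. In this sketch, `io.StringIO` stands in for an open text file so that nothing is written to disk.

```python
import io
import json

data = {"name": "alice", "age": 30}

text = json.dumps(data)   # returns a str

buffer = io.StringIO()    # stands in for an open text file
json.dump(data, buffer)   # writes to the file-like object and returns None

print(buffer.getvalue() == text)  # True

buffer.seek(0)
print(json.load(buffer) == json.loads(text))  # True
```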

Python ↔ JSON Type Conversion Rules

| Python | JSON | Notes |
|---|---|---|
| dict | object | |
| list, tuple | array | a tuple becomes a list (indistinguishable after restoring) |
| str | string | |
| int, float | number | |
| True / False | true / false | |
| None | null | |

Other types (datetime, set, custom classes, and so on) cannot be serialized by default.

from datetime import datetime
 
json.dumps({"now": datetime.now()})
# TypeError: Object of type datetime is not JSON serializable
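Types that do serialize can still change on the way back, as the conversion table above notes for tuples. The sketch below shows that round trip. It also shows one behavior the table does not list: non-string dict keys such as integers are written as JSON strings and stay strings after loading.

```python
import json

original = {"point": (1, 2), 1: "int key", "ratio": 1.0}
restored = json.loads(json.dumps(original))

print(restored)
# {'point': [1, 2], '1': 'int key', 'ratio': 1.0}

print(restored == original)  # False: the tuple became a list, the key 1 became "1"
```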

Serializing Custom Objects: the default Parameter

Pass a conversion function as the default parameter to serialize types that are not supported out of the box.

from datetime import datetime, date
 
def json_default(obj):
    if isinstance(obj, (datetime, date)):
        return obj.isoformat()
    if isinstance(obj, set):
        return list(obj)
    raise TypeError(f"Not JSON serializable: {type(obj)}")
 
data = {
    "created": datetime(2026, 4, 12, 15, 30),
    "tags": {"python", "serialization"},
}
 
json_str = json.dumps(data, default=json_default, ensure_ascii=False)
print(json_str)
# {"created": "2026-04-12T15:30:00", "tags": ["python", "serialization"]}
# (a set is unordered, so the order inside "tags" may vary)
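The same conversion logic can also be packaged as a `json.JSONEncoder` subclass and passed through the `cls` argument, which is convenient when several call sites need it. The class name `ExtendedEncoder` is hypothetical, and sorting the set is a choice made here to keep the output stable.

```python
import json
from datetime import date, datetime

class ExtendedEncoder(json.JSONEncoder):
    """Hypothetical encoder that also handles datetime, date, and set."""

    def default(self, obj):
        if isinstance(obj, (datetime, date)):
            return obj.isoformat()
        if isinstance(obj, set):
            return sorted(obj)        # sorted so the output order is stable
        return super().default(obj)   # raises TypeError for unknown types

payload = {
    "created": datetime(2026, 4, 12, 15, 30),
    "tags": {"python", "serialization"},
}
encoded = json.dumps(payload, cls=ExtendedEncoder)
print(encoded)
# {"created": "2026-04-12T15:30:00", "tags": ["python", "serialization"]}
```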

Custom Deserialization: object_hook

When reading JSON, each decoded dict can be converted automatically into the object you want.

from dataclasses import dataclass
 
@dataclass
class User:
    name: str
    age: int
 
def as_user(dct):
    if "name" in dct and "age" in dct:
        return User(**dct)
    return dct
 
json_str = '{"name": "alice", "age": 30}'
user = json.loads(json_str, object_hook=as_user)
print(user)        # User(name='alice', age=30)
print(type(user))  # <class '__main__.User'> (when run as a script)
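object_hook is called for every JSON object in the document, including nested ones, so it pairs well with an explicit type tag. The sketch below restores a datetime that was written with a `default` function. The `"__type__"` key is a convention assumed for this example, not something the json module defines.

```python
import json
from datetime import datetime

def encode(obj):
    # Tag the value so the decoder can recognise it later.
    if isinstance(obj, datetime):
        return {"__type__": "datetime", "value": obj.isoformat()}
    raise TypeError(f"Not JSON serializable: {type(obj).__name__}")

def decode(dct):
    # Called once for every decoded JSON object, innermost first.
    if dct.get("__type__") == "datetime":
        return datetime.fromisoformat(dct["value"])
    return dct

original = {"event": {"name": "deploy", "at": datetime(2026, 4, 12, 15, 30)}}
text = json.dumps(original, default=encode)
restored = json.loads(text, object_hook=decode)

print(restored == original)  # True
```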

Pretty-Print Options

data = {"users": [{"name": "alice", "age": 30}, {"name": "bob", "age": 25}]}
 
# Default: a single line
json.dumps(data)
 
# Easier to read with indent
print(json.dumps(data, indent=2, ensure_ascii=False))
# {
#   "users": [
#     {
#       "name": "alice",
#       "age": 30
#     },
#     ...
#   ]
# }
 
# Strip whitespace with separators (saves bytes on the wire)
json.dumps(data, separators=(",", ":"))
# {"users":[{"name":"alice","age":30},{"name":"bob","age":25}]}
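A quick way to see what these options change is to compare output sizes. The sketch below also uses `sort_keys`, which the examples above do not cover; it is a standard `json.dumps` parameter that produces a stable key order.

```python
import json

data = {"users": [{"name": "alice", "age": 30}, {"name": "bob", "age": 25}]}

default = json.dumps(data)
pretty = json.dumps(data, indent=2)
compact = json.dumps(data, separators=(",", ":"))

print(len(compact) < len(default) < len(pretty))  # True

# All three decode to the same object; only the whitespace differs.
print(json.loads(pretty) == json.loads(compact) == data)  # True

# sort_keys=True gives a stable key order, handy for diffs and cache keys.
print(json.dumps({"b": 1, "a": 2}, sort_keys=True))  # {"a": 2, "b": 1}
```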

5. The Python pickle Module

pickle is a Python-only binary serialization format. Unlike JSON, it can store and restore Python objects almost exactly as they are.

Basic Usage

import pickle
 
data = {"name": "alice", "scores": [95, 87, 92], "active": True}
 
# Serialize (Python → bytes)
pickled = pickle.dumps(data)
print(type(pickled))  # <class 'bytes'>
 
# Deserialize (bytes → Python)
restored = pickle.loads(pickled)
print(restored)  # {'name': 'alice', 'scores': [95, 87, 92], 'active': True}

File I/O

# Save to a file (binary mode)
with open("data.pkl", "wb") as f:
    pickle.dump(data, f)
 
# Read from a file
with open("data.pkl", "rb") as f:
    loaded = pickle.load(f)

Where pickle Is More Powerful Than JSON

pickle preserves the type and structure of Python objects as they are.

from datetime import datetime
from collections import defaultdict
from dataclasses import dataclass
 
@dataclass
class User:
    name: str
    age: int
 
data = {
    "user": User("alice", 30),
    "created": datetime(2026, 4, 12),
    "counter": defaultdict(int, {"a": 1, "b": 2}),
    "numbers": {1, 2, 3},
}
 
# pickle can serialize and deserialize all of these as they are
pickled = pickle.dumps(data)
restored = pickle.loads(pickled)
 
print(type(restored["user"]))     # <class '__main__.User'>: still the dataclass
print(type(restored["created"]))  # <class 'datetime.datetime'>
print(type(restored["counter"]))  # <class 'collections.defaultdict'>
print(type(restored["numbers"]))  # <class 'set'>

With plain json.dumps(), User, datetime, and set would each raise TypeError. Even with a default function they would come back as a dict, a string, and a list rather than as their original types.

What Can and Cannot Be Pickled

# ✅ Picklable
pickle.dumps(42)                        # number
pickle.dumps("hello")                   # string
pickle.dumps([1, 2, 3])                 # list
pickle.dumps({"key": "value"})          # dict
pickle.dumps(User("alice", 30))         # class instance
pickle.dumps(len)                       # built-in function
 
# ❌ Not picklable
pickle.dumps(lambda x: x + 1)          # pickle.PicklingError: lambda
pickle.dumps(open("test.txt"))          # TypeError: file object
 
# ⚠️ Note: functions and classes are stored by name
# They must be importable from the same module when unpickling
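The "stored by name" behavior can be observed directly. In this sketch the pickled bytes for the built-in `len` contain its name, and loading returns the very same function object. A lambda has no importable name, so pickling it fails; the exact exception type is caught broadly here because it is an implementation detail.

```python
import pickle

payload = pickle.dumps(len)
print(b"len" in payload)             # True: the name is stored, not the code
print(pickle.loads(payload) is len)  # True: looked up again by name on load

try:
    pickle.dumps(lambda x: x + 1)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError, TypeError) as exc:
    lambda_pickled = False
    print(f"cannot pickle lambda: {type(exc).__name__}")
```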

Security Considerations

Never use pickle on untrusted data

pickle can execute arbitrary code during deserialization. Calling loads() on maliciously crafted pickle data can run system commands.

# An attack like this is possible (harmless payload shown; a real one could run any command)
import pickle
malicious = b"cos\nsystem\n(S'echo hello world'\ntR."
pickle.loads(malicious)  # runs os.system('echo hello world')

  • Never unpickle data received from outside
  • Use JSON for API traffic, user input, and similar data
  • pickle is safe only when you load data that you produced yourself
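If pickle data of uncertain origin has to be read anyway, one mitigation is to restrict which globals the unpickler may load by overriding `Unpickler.find_class`. The sketch below follows that pattern. The allow-list `SAFE_BUILTINS` and the helper name `restricted_loads` are choices made for this example, and this approach reduces the risk without making pickle a safe format for untrusted input.

```python
import builtins
import io
import pickle

# Assumed allow-list for this example; adjust to what your data really needs.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Only a handful of harmless built-ins may be loaded by name.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")

def restricted_loads(data: bytes):
    """Like pickle.loads(), but refuses to import arbitrary globals."""
    return RestrictedUnpickler(io.BytesIO(data)).load()

# Plain containers need no globals, so they load normally.
print(restricted_loads(pickle.dumps([1, 2, {"a": "b"}])))  # [1, 2, {'a': 'b'}]

# A payload that tries to reach os.system is rejected before anything runs.
try:
    restricted_loads(b"cos\nsystem\n(S'echo hello world'\ntR.")
    blocked = False
except pickle.UnpicklingError as exc:
    blocked = True
    print(f"blocked: {exc}")
```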

Protocol Versions

pickle has several protocol versions. Higher versions are generally more efficient.

# Use the newest protocol (recommended)
pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)
 
# Check the default protocol
print(pickle.DEFAULT_PROTOCOL)  # 5 as of Python 3.14 (4 on Python 3.8 to 3.13)
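
| Version | Python | Notes |
|---|---|---|
| 0 | original protocol | human-readable text protocol, useful for debugging |
| 2 | 2.3+ | support for new-style classes |
| 4 | 3.4+ | support for very large objects |
| 5 | 3.8+ | out-of-band buffers; the current default |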
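The efficiency difference between protocols can be measured on your own interpreter. This sketch pickles the same list with every available protocol and prints the sizes. The exact numbers depend on the Python version, so none are quoted here.

```python
import pickle

data = list(range(1000))

# Size of the same object under each protocol this interpreter supports.
sizes = {
    protocol: len(pickle.dumps(data, protocol=protocol))
    for protocol in range(pickle.HIGHEST_PROTOCOL + 1)
}
for protocol, size in sizes.items():
    print(f"protocol {protocol}: {size} bytes")

# Every protocol restores the same object.
round_trips_ok = all(
    pickle.loads(pickle.dumps(data, protocol=p)) == data for p in sizes
)
print(round_trips_ok)  # True
```

For this input, the text-based protocol 0 should come out clearly larger than the binary protocols.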

6. Binary Serialization: Avro and Protobuf

Unlike JSON or pickle, Avro and Protobuf require you to define a schema first and then serialize according to it. They are used when you want to enforce a message format in a data pipeline.

Avro

Apache Avro is a serialization format that came out of the Hadoop ecosystem. Because the schema is embedded in the data file, data can be read and written without a separate code-generation step.

pip install avro

The schema is defined in JSON. (user.avsc)

{
  "namespace": "example.avro",
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "name", "type": "string"},
    {"name": "age", "type": "int"},
    {"name": "email", "type": ["string", "null"]}
  ]
}

import avro.schema
from avro.datafile import DataFileWriter, DataFileReader
from avro.io import DatumWriter, DatumReader
 
# Load the schema
schema = avro.schema.parse(open("user.avsc", "rb").read())
 
# Serialize: write to an Avro file
writer = DataFileWriter(open("users.avro", "wb"), DatumWriter(), schema)
writer.append({"name": "alice", "age": 30, "email": "alice@example.com"})
writer.append({"name": "bob", "age": 25, "email": None})
writer.close()
 
# Deserialize: read from the Avro file
# DatumReader() needs no schema argument because the schema is in the file header
reader = DataFileReader(open("users.avro", "rb"), DatumReader())
for user in reader:
    print(user)
# {'name': 'alice', 'age': 30, 'email': 'alice@example.com'}
# {'name': 'bob', 'age': 25, 'email': None}
reader.close()

Avro Embeds the Schema in the File

An Avro file (.avro) is laid out like this:

[File header: schema (JSON) + metadata]
[Data block 1]
[Data block 2]
...

When you specify a schema while writing, that schema is stored in the file header. The reader can therefore pull the schema out of the file and deserialize with nothing but the file itself. In other words, anyone can read the data from the .avro file alone, without a separate schema file (.avsc) or generated code.

Protobuf is the opposite. The binary carries no schema, so the reader must also have the code generated from the .proto file (user_pb2.py) in order to deserialize.

Other Avro Characteristics

  • Supports schema evolution: compatibility can be maintained when fields are added or removed
  • Widely used as the message format in Kafka + Schema Registry setups
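The idea behind schema evolution can be sketched in plain Python. This is a deliberately simplified illustration of the rule, not the avro library's resolver: a field the reader expects but the data lacks is filled from a default, and a field the reader does not know is dropped. The names `READER_SCHEMA`, `REQUIRED`, and `resolve` are hypothetical.

```python
# Simplified illustration only; this is not the avro library's resolver.
REQUIRED = object()  # sentinel: the reader schema gives this field no default

# Hypothetical reader schema: "email" was added later with a default of None.
READER_SCHEMA = {"name": REQUIRED, "age": REQUIRED, "email": None}

def resolve(record: dict, reader_schema: dict) -> dict:
    """Project a record written with another schema onto the reader schema."""
    resolved = {}
    for field_name, default in reader_schema.items():
        if field_name in record:
            resolved[field_name] = record[field_name]
        elif default is not REQUIRED:
            resolved[field_name] = default  # field added after the data was written
        else:
            raise ValueError(f"missing field without default: {field_name}")
    return resolved  # fields the reader does not know are dropped

old_record = {"name": "alice", "age": 30}                        # written before "email" existed
other_record = {"name": "bob", "age": 25, "nickname": "bobby"}   # carries a field the reader lacks

print(resolve(old_record, READER_SCHEMA))    # {'name': 'alice', 'age': 30, 'email': None}
print(resolve(other_record, READER_SCHEMA))  # {'name': 'bob', 'age': 25, 'email': None}
```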

Protobuf (Protocol Buffers)

A serialization format created by Google. You define the schema in a .proto file and use the compiler (protoc) to generate Python code.

pip install protobuf
# The protoc compiler must be installed separately (https://protobuf.dev)

The schema is defined in a .proto file. (user.proto)

syntax = "proto3";
 
message User {
  string name = 1;
  int32 age = 2;
  string email = 3;
}

Generate the Python code:

protoc --python_out=. user.proto
# This generates user_pb2.py

import user_pb2
 
# Serialize
user = user_pb2.User()
user.name = "alice"
user.age = 30
user.email = "alice@example.com"
 
binary = user.SerializeToString()  # bytes
print(len(binary))  # ~30 bytes (~60 bytes as JSON)
 
# Deserialize
restored = user_pb2.User()
restored.ParseFromString(binary)
print(restored.name)   # alice
print(restored.age)    # 30
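To see where a figure of about 30 bytes comes from, the message can be encoded by hand following the Protobuf wire format: each field is a tag (field number and wire type) followed by a varint or by a length-prefixed payload. The helper functions below are written for this illustration and cover only non-negative integers and strings; real code should use the generated classes.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def encode_string_field(field_number: int, text: str) -> bytes:
    data = text.encode("utf-8")
    tag = (field_number << 3) | 2  # wire type 2: length-delimited
    return encode_varint(tag) + encode_varint(len(data)) + data

def encode_int_field(field_number: int, value: int) -> bytes:
    tag = (field_number << 3) | 0  # wire type 0: varint
    return encode_varint(tag) + encode_varint(value)

# Same fields and field numbers as user.proto: name = 1, age = 2, email = 3.
message = (
    encode_string_field(1, "alice")
    + encode_int_field(2, 30)
    + encode_string_field(3, "alice@example.com")
)
print(len(message))  # 28
```

Only field numbers and values are written. The field names stay in the .proto file, which is why the bytes are compact and also why they cannot be interpreted without the schema.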

Converting to and from JSON is also possible:

from google.protobuf import json_format
 
# Protobuf → JSON
json_str = json_format.MessageToJson(user)
print(json_str)  # pretty-printed JSON with "name": "alice", "age": 30, "email": "alice@example.com"
 
# JSON → Protobuf
user2 = user_pb2.User()
json_format.Parse(json_str, user2)

Protobuf Characteristics

  • Requires a .proto → protoc → generated Python code step (the key difference from Avro)
  • The binary is typically much smaller than the equivalent JSON (about half in the example above), and serialization/deserialization is fast
  • It is the default serialization format of gRPC

Avro vs Protobuf

| Criterion | Avro | Protobuf |
|---|---|---|
| Schema location | embedded in the data file | managed separately in .proto files |
| Code generation | not required (handled dynamically) | required (compiled with protoc) |
| Schema evolution | built in (separate reader/writer schemas) | supported (based on field numbers) |
| Main uses | Kafka, Hadoop, data pipelines | gRPC, microservice communication |
| Ecosystem | Java/Hadoop-centric | language-neutral, Google ecosystem |

7. dataclass + Serialization

dataclasses and serialization are often used together. Converting structured data to JSON and mapping API responses onto dataclasses are both common patterns.

dataclass → JSON (using asdict)

from dataclasses import dataclass, asdict
import json
 
@dataclass
class Address:
    city: str
    zipcode: str
 
@dataclass
class User:
    name: str
    age: int
    address: Address
 
user = User("alice", 30, Address("서울", "06000"))  # non-ASCII city name (Seoul) to show ensure_ascii=False
 
# Convert to a dict with asdict(), then json.dumps()
json_str = json.dumps(asdict(user), ensure_ascii=False, indent=2)
print(json_str)
# {
#   "name": "alice",
#   "age": 30,
#   "address": {
#     "city": "서울",
#     "zipcode": "06000"
#   }
# }

asdict() converts nested dataclasses to dicts recursively, so the result can go straight into json.dumps().

JSON → dataclass

There is no automatic conversion in the reverse direction, so you have to map the data yourself.

# Method 1: the simple case, dict unpacking
json_str = '{"name": "alice", "age": 30}'
data = json.loads(json_str)
user = User(**data, address=Address("서울", "06000"))

# Method 2: nested structure, converting the nested dict by hand
def from_json(json_str: str) -> User:
    data = json.loads(json_str)
    data["address"] = Address(**data["address"])
    return User(**data)
 
json_str = '{"name": "alice", "age": 30, "address": {"city": "서울", "zipcode": "06000"}}'
user = from_json(json_str)
print(user)  # User(name='alice', age=30, address=Address(city='서울', zipcode='06000'))
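When there are many nested dataclasses, the manual mapping can be generalized with `dataclasses.fields()`. The helper `from_dict` below is a hypothetical sketch. It assumes field annotations are real classes (no string annotations or `from __future__ import annotations`) and does not handle containers such as `list[Address]` or optional fields.

```python
import json
from dataclasses import dataclass, fields, is_dataclass

@dataclass
class Address:
    city: str
    zipcode: str

@dataclass
class User:
    name: str
    age: int
    address: Address

def from_dict(cls, data: dict):
    """Build a dataclass from a dict, recursing into nested dataclass fields."""
    kwargs = {}
    for f in fields(cls):
        value = data[f.name]
        if is_dataclass(f.type) and isinstance(value, dict):
            value = from_dict(f.type, value)  # nested dataclass
        kwargs[f.name] = value
    return cls(**kwargs)

text = '{"name": "alice", "age": 30, "address": {"city": "서울", "zipcode": "06000"}}'
user = from_dict(User, json.loads(text))
print(user)
# User(name='alice', age=30, address=Address(city='서울', zipcode='06000'))
```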

A dataclass That Contains datetime

from dataclasses import dataclass, field, asdict
from datetime import datetime
import json
 
@dataclass
class Event:
    title: str
    start: datetime
    end: datetime
    attendees: list[str] = field(default_factory=list)
 
def json_default(obj):
    if isinstance(obj, datetime):
        return obj.isoformat()
    raise TypeError(f"Not JSON serializable: {type(obj)}")
 
event = Event(
    title="스프린트 리뷰",  # "Sprint review"; non-ASCII on purpose to show ensure_ascii=False
    start=datetime(2026, 4, 12, 14, 0),
    end=datetime(2026, 4, 12, 15, 0),
    attendees=["alice", "bob"],
)
 
json_str = json.dumps(asdict(event), default=json_default, ensure_ascii=False, indent=2)
print(json_str)
# {
#   "title": "스프린트 리뷰",
#   "start": "2026-04-12T14:00:00",
#   "end": "2026-04-12T15:00:00",
#   "attendees": [
#     "alice",
#     "bob"
#   ]
# }
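Going the other way, the ISO 8601 strings have to be parsed back into datetime objects explicitly, because JSON has no date type. The function name `event_from_json` is hypothetical, and the sketch assumes the timestamps were produced by `isoformat()` as in the example above.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class Event:
    title: str
    start: datetime
    end: datetime
    attendees: list[str] = field(default_factory=list)

def event_from_json(text: str) -> Event:
    data = json.loads(text)
    # Strings written by isoformat() are converted back to datetime by hand.
    data["start"] = datetime.fromisoformat(data["start"])
    data["end"] = datetime.fromisoformat(data["end"])
    return Event(**data)

text = (
    '{"title": "Sprint review", "start": "2026-04-12T14:00:00", '
    '"end": "2026-04-12T15:00:00", "attendees": ["alice", "bob"]}'
)
event = event_from_json(text)
print(event.end - event.start)  # 1:00:00
```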

dataclass + pickle

pickle can serialize a dataclass directly, with no conversion step.

import pickle
 
pickled = pickle.dumps(user)
restored = pickle.loads(pickled)
print(restored)         # User(name='alice', age=30, address=Address(city='서울', zipcode='06000'))
print(type(restored))   # <class '__main__.User'>
print(restored == user) # True

Unlike JSON, the type is preserved as is, so no conversion code is needed. The pickle security considerations above still apply, however.


8. When to Use What

| Criterion | JSON | pickle | Protobuf / Avro |
|---|---|---|---|
| Human-readable | Yes | No | No |
| Cross-language | Yes | No (Python only) | Yes |
| Preserves Python objects as is | No (types are lost) | Yes | No (must map to a schema) |
| Security | safe | dangerous (arbitrary code execution) | safe |
| Speed | moderate | fast | very fast |
| Size | large (text) | medium | small (binary) |
| Schema enforcement | No | No | Yes |

Decision Flow

Communicating with an external system?
├── Yes → JSON (REST API, web)
│         Protobuf (gRPC, high performance)
│         Avro (Kafka, data pipelines)
└── No → Used only inside Python?
          ├── Yes → pickle (model storage, cache, IPC)
          └── Config files → YAML, JSON

Practical Combinations

  • Web API: JSON for requests/responses, pickle for the internal cache
  • Data pipeline: Avro for Kafka messages, pickle for caching intermediate results
  • ML serving: pickle/joblib for model storage, JSON for API responses