
Codable migration patterns: schema evolution that doesn't lose user data

Adding a field to a Codable struct in Swift looks innocent. Done wrong, it eats your users' state. Here are five patterns that handle every kind of schema change cleanly.

Honam Kang · 4 min read

A user opens mq-dir 0.1.0 with state saved by 0.1.0-alpha.7. The app crashes on launch. They blame us. We blame ourselves.

This is the story of what we learned, the five Codable patterns we converged on, and the test pattern that prevents this from ever happening again.

The shape of the problem

mq-dir persists a WorkspaceState to ~/Library/Application Support/com.mqdir.app/state.json. The state has roughly:

struct WorkspaceState: Codable {
  let projects: [Project]
  let favorites: [Favorite]
  let activeProjectID: UUID
}

Each Project has a layout, four panes, tabs, etc. Total schema across the model is ~40 fields.

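When you ship v0.1.0 and v0.1.1 adds a new non-optional field (say, paneIndex to track focus), the user's existing state.json doesn't have that key. Default Codable behavior: the synthesized init(from:) throws DecodingError.keyNotFound, and if the load path doesn't catch it, the app crashes on launch.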
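To make the failure concrete, here is a minimal sketch of a synthesized decoder meeting an old payload. The Pane type and its two fields are hypothetical stand-ins, not mq-dir's real model.

```swift
import Foundation

// Hypothetical model: `paneIndex` is the field added in the new release.
struct Pane: Codable {
  let id: Int
  var paneIndex: Int
}

// Payload written by the previous release, before `paneIndex` existed.
let oldPayload = Data(#"{"id": 1}"#.utf8)

var missingKey: String? = nil
do {
  _ = try JSONDecoder().decode(Pane.self, from: oldPayload)
} catch DecodingError.keyNotFound(let key, _) {
  // The synthesized init(from:) reports the first required key it cannot find.
  missingKey = key.stringValue
} catch {
  // Any other decoding failure; not expected for this payload.
}
```

If this decode runs behind a try! or an unhandled throw on the launch path, the thrown error becomes the crash described above.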

Five patterns to never crash on user state again:

Pattern 1: decodeIfPresent with defaults

The 80% case. New field, sensible default:

struct PaneState: Codable {
  let id: UUID
  var folderBookmark: Data
  var viewMode: PaneViewMode
  var focusedTabIndex: Int  // ← added in v0.1.1
  
  init(from decoder: Decoder) throws {
    let c = try decoder.container(keyedBy: CodingKeys.self)
    id = try c.decode(UUID.self, forKey: .id)
    folderBookmark = try c.decode(Data.self, forKey: .folderBookmark)
    viewMode = try c.decodeIfPresent(PaneViewMode.self, forKey: .viewMode) ?? .list
    focusedTabIndex = try c.decodeIfPresent(Int.self, forKey: .focusedTabIndex) ?? 0
  }
}

Three rules:

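  1. Hand-roll init(from:). The synthesized initializer only uses decodeIfPresent for Optional properties; for non-optional ones it calls decode and ignores any default value on the property.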
  2. Always provide a default. ?? .list, ?? 0, ?? [].
  3. Field's type stays non-optional in the model — the migration concern is decoding only.

This pattern handles 80% of schema bumps in practice.
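If the repeated decodeIfPresent(...) ?? default lines get noisy, one option is a small helper on KeyedDecodingContainer. This is a sketch only: the decode(_:forKey:default:) overload, the ViewMode enum, and the Pane type are hypothetical names, not part of Foundation or mq-dir.

```swift
import Foundation

extension KeyedDecodingContainer {
  // Hypothetical convenience: use `fallback` when the key is missing or null.
  // A present value of the wrong type still throws, which is what we want.
  func decode<T: Decodable>(
    _ type: T.Type, forKey key: Key, default fallback: @autoclosure () -> T
  ) throws -> T {
    try decodeIfPresent(type, forKey: key) ?? fallback()
  }
}

// Stand-in for PaneViewMode.
enum ViewMode: String, Codable { case list, icons }

struct Pane: Codable {
  let id: Int
  var viewMode: ViewMode
  var focusedTabIndex: Int

  init(from decoder: Decoder) throws {
    let c = try decoder.container(keyedBy: CodingKeys.self)
    id = try c.decode(Int.self, forKey: .id)
    viewMode = try c.decode(ViewMode.self, forKey: .viewMode, default: .list)
    focusedTabIndex = try c.decode(Int.self, forKey: .focusedTabIndex, default: 0)
  }
}

// An old payload that predates both optional-on-disk fields.
let pane = try JSONDecoder().decode(Pane.self, from: Data(#"{"id": 7}"#.utf8))
```

The model keeps non-optional properties, and the defaults live in exactly one place: the decoder.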

Pattern 2: Version tag for non-backward-compatible changes

When the change is structural — renaming a field, changing a type, splitting one field into two — version explicitly:

struct WorkspaceState: Codable {
  static let currentVersion = 3
  let version: Int
  let projects: [Project]
  
  init(from decoder: Decoder) throws {
    let c = try decoder.container(keyedBy: CodingKeys.self)
    let v = try c.decodeIfPresent(Int.self, forKey: .version) ?? 1
    
    self.version = WorkspaceState.currentVersion
    
    switch v {
    case 1:
      self.projects = try Self.decodeV1Projects(from: c)
    case 2:
      self.projects = try Self.decodeV2Projects(from: c)
    case 3:
      self.projects = try c.decode([Project].self, forKey: .projects)
    default:
      throw DecodingError.dataCorruptedError(
        forKey: .version, in: c,
        debugDescription: "Unknown version \(v)"
      )
    }
  }
}

The version field is a hard contract. When you bump it, you migrate explicitly.
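What a per-version helper such as decodeV1Projects might look like depends entirely on what changed. As an illustration only, assume a hypothetical v1 layout where "projects" was a bare array of names and v2 turned each entry into an object; the Workspace and Project types here are simplified stand-ins.

```swift
import Foundation

struct Project: Codable, Equatable {
  var name: String
}

struct Workspace: Codable {
  static let currentVersion = 2
  let version: Int
  let projects: [Project]

  // Hypothetical v1 layout: "projects" was an array of plain strings.
  private static func decodeV1Projects(
    from c: KeyedDecodingContainer<CodingKeys>
  ) throws -> [Project] {
    let names = try c.decode([String].self, forKey: .projects)
    return names.map { Project(name: $0) }
  }

  init(from decoder: Decoder) throws {
    let c = try decoder.container(keyedBy: CodingKeys.self)
    // Payloads written before the version field existed count as v1.
    let stored = try c.decodeIfPresent(Int.self, forKey: .version) ?? 1
    version = Self.currentVersion
    switch stored {
    case 1:
      projects = try Self.decodeV1Projects(from: c)
    case 2:
      projects = try c.decode([Project].self, forKey: .projects)
    default:
      throw DecodingError.dataCorruptedError(
        forKey: .version, in: c,
        debugDescription: "Unknown version \(stored)"
      )
    }
  }
}

let v1Payload = Data(#"{"projects": ["Default", "Work"]}"#.utf8)
let migrated = try JSONDecoder().decode(Workspace.self, from: v1Payload)

// Re-encoding writes the current layout, so the next launch takes the v2 path.
let reloaded = try JSONDecoder().decode(
  Workspace.self, from: JSONEncoder().encode(migrated))
```

Because the decoder always sets version to currentVersion, the migration runs once: the first save after an upgrade rewrites the file in the new shape.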

Pattern 3: Forward-compat with unknown fields

Sometimes the user installs a newer version, then downgrades. The newer version wrote fields the older version doesn't know about. Your old code shouldn't crash — and shouldn't drop the unknown fields.

struct PaneState: Codable {
  let id: UUID
  var folderBookmark: Data
  
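  // Preserve forward-compat fields verbatim.
  // AnyCodable and AnyCodingKey are not part of the standard library; they
  // must come from a helper library or be defined by hand.
  private var unknownFields: [String: AnyCodable]? = nil
  
  init(from decoder: Decoder) throws {
    let c = try decoder.container(keyedBy: AnyCodingKey.self)
    
    let known: Set<String> = ["id", "folderBookmark"]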
    
    self.id = try c.decode(UUID.self, forKey: AnyCodingKey("id"))
    self.folderBookmark = try c.decode(Data.self, forKey: AnyCodingKey("folderBookmark"))
    
    var unknown: [String: AnyCodable] = [:]
    for key in c.allKeys where !known.contains(key.stringValue) {
      unknown[key.stringValue] = try c.decode(AnyCodable.self, forKey: key)
    }
    self.unknownFields = unknown.isEmpty ? nil : unknown
  }
  
  func encode(to encoder: Encoder) throws {
    var c = encoder.container(keyedBy: AnyCodingKey.self)
    try c.encode(id, forKey: AnyCodingKey("id"))
    try c.encode(folderBookmark, forKey: AnyCodingKey("folderBookmark"))
    if let unknown = unknownFields {
      for (k, v) in unknown {
        try c.encode(v, forKey: AnyCodingKey(k))
      }
    }
  }
}

This is heavy machinery. Use it for root persisted types, not every nested struct.
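The two helper types that pattern leans on can be hand-rolled in a few dozen lines. The sketch below is one possible shape, under stated assumptions: JSONValue is a hypothetical stand-in for AnyCodable, numbers are kept as Double (so integers beyond 2^53 would lose precision), and Pane is reduced to a single known field.

```swift
import Foundation

// String-backed key that can represent any JSON object key.
struct AnyCodingKey: CodingKey {
  let stringValue: String
  var intValue: Int? { nil }
  init(_ string: String) { stringValue = string }
  init?(stringValue: String) { self.init(stringValue) }
  init?(intValue: Int) { return nil }
}

// Minimal JSON tree used to carry values we do not understand.
enum JSONValue: Codable, Equatable {
  case null, bool(Bool), number(Double), string(String)
  case array([JSONValue]), object([String: JSONValue])

  init(from decoder: Decoder) throws {
    let c = try decoder.singleValueContainer()
    if c.decodeNil() { self = .null }
    else if let v = try? c.decode(Bool.self) { self = .bool(v) }
    else if let v = try? c.decode(Double.self) { self = .number(v) }
    else if let v = try? c.decode(String.self) { self = .string(v) }
    else if let v = try? c.decode([JSONValue].self) { self = .array(v) }
    else { self = .object(try c.decode([String: JSONValue].self)) }
  }

  func encode(to encoder: Encoder) throws {
    var c = encoder.singleValueContainer()
    switch self {
    case .null: try c.encodeNil()
    case .bool(let v): try c.encode(v)
    case .number(let v): try c.encode(v)
    case .string(let v): try c.encode(v)
    case .array(let v): try c.encode(v)
    case .object(let v): try c.encode(v)
    }
  }
}

struct Pane: Codable {
  let id: Int
  private(set) var unknownFields: [String: JSONValue] = [:]

  init(from decoder: Decoder) throws {
    let c = try decoder.container(keyedBy: AnyCodingKey.self)
    id = try c.decode(Int.self, forKey: AnyCodingKey("id"))
    // Everything that is not a known key is kept as an opaque JSON value.
    for key in c.allKeys where key.stringValue != "id" {
      unknownFields[key.stringValue] = try c.decode(JSONValue.self, forKey: key)
    }
  }

  func encode(to encoder: Encoder) throws {
    var c = encoder.container(keyedBy: AnyCodingKey.self)
    try c.encode(id, forKey: AnyCodingKey("id"))
    for (key, value) in unknownFields {
      try c.encode(value, forKey: AnyCodingKey(key))
    }
  }
}

// A payload written by a hypothetical newer release with two extra fields.
let newer = Data(#"{"id": 1, "splitRatio": 0.5, "tags": ["a"]}"#.utf8)
let pane = try JSONDecoder().decode(Pane.self, from: newer)
let again = try JSONDecoder().decode(Pane.self, from: JSONEncoder().encode(pane))
```

The older build never interprets splitRatio or tags; it only promises to write them back unchanged.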

Pattern 4: Reset-on-corrupt with backup

When decoding fails outright — corruption, totally invalid JSON — don't crash. Don't silently overwrite. Back up and reset:

final class PersistenceService {
  func loadState() -> WorkspaceState {
    do {
      let data = try Data(contentsOf: stateURL)
      return try JSONDecoder().decode(WorkspaceState.self, from: data)
    } catch {
      Self.logger.error("Failed to decode state.json: \(error)")
      backupCorruptFile()
      return WorkspaceState.default
    }
  }
  
  private func backupCorruptFile() {
    let backupURL = stateURL.deletingLastPathComponent()
      .appendingPathComponent("state.corrupt-\(Int(Date().timeIntervalSince1970)).json")
    try? FileManager.default.copyItem(at: stateURL, to: backupURL)
  }
}

Two reasons for the backup:

  1. Diagnosis: when a user reports lost state, you ask for the backup, you reproduce the bug.
  2. Recovery: if the user notices state loss before the backup is purged, you can manually merge fields.

Without the backup, "the app reset my workspace" is a one-way trip.
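One refinement worth considering: a missing file on first launch is not corruption, and treating it as such produces a misleading error log and a pointless copy attempt. The sketch below separates the two cases. State, LoadResult, and the loadState(at:now:) signature are hypothetical names for illustration, not mq-dir's API.

```swift
import Foundation

struct State: Codable, Equatable {
  var names: [String] = []
}

enum LoadResult: Equatable {
  case loaded(State)
  case firstLaunch                 // no file yet: nothing to back up
  case resetAfterBackup(URL)       // unreadable file, copied aside first
}

func loadState(at url: URL, now: Date = Date()) -> LoadResult {
  let fm = FileManager.default
  guard fm.fileExists(atPath: url.path) else { return .firstLaunch }
  do {
    let data = try Data(contentsOf: url)
    return .loaded(try JSONDecoder().decode(State.self, from: data))
  } catch {
    // Keep the bad file next to the original, stamped with the current time.
    let backup = url.deletingLastPathComponent()
      .appendingPathComponent("state.corrupt-\(Int(now.timeIntervalSince1970)).json")
    try? fm.copyItem(at: url, to: backup)
    return .resetAfterBackup(backup)
  }
}

// Exercise both paths in a scratch directory.
let dir = FileManager.default.temporaryDirectory
  .appendingPathComponent(UUID().uuidString)
try FileManager.default.createDirectory(at: dir, withIntermediateDirectories: true)
let stateURL = dir.appendingPathComponent("state.json")

let first = loadState(at: stateURL)
try Data("{ not json".utf8).write(to: stateURL)
let second = loadState(at: stateURL)
```

Returning the backup URL also gives the UI something concrete to show the user when a reset happens.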

Pattern 5: @MainActor-safe synchronous flush on terminate

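When the user quits with unsaved state, a debounced async save may not get the chance to complete. You need a synchronous flush hook on the normal termination path. A true force-quit or a crash kills the process without posting willTerminateNotification, so this hook cannot cover those cases; only state already written to disk survives them.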

@MainActor
final class WorkspaceManager: ObservableObject {
  func saveSynchronously() {
    saveTask?.cancel()
    do {
      let data = try JSONEncoder().encode(workspace)
      try data.write(to: persistenceService.stateURL, options: .atomic)
    } catch {
      Self.logger.error("Sync save failed: \(error)")
    }
  }
}

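// In your AppDelegate / @main
NotificationCenter.default.addObserver(
  forName: NSApplication.willTerminateNotification,
  object: nil, queue: .main
) { _ in
  // queue: .main delivers this on the main thread, so asserting main-actor
  // isolation is safe here (MainActor.assumeIsolated needs macOS 14+).
  MainActor.assumeIsolated {
    WorkspaceManager.shared.saveSynchronously()
  }
}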

The process exits shortly after termination handling finishes, so pending async tasks are not guaranteed to run. This sync hook is the safety net.
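The interplay between the debounced save and the synchronous flush can be sketched without AppKit. Everything here is an assumption made for illustration: the Saver class, its 0.5 s delay, and the demo function are hypothetical, and MainActor.assumeIsolated requires macOS 14 or later.

```swift
import Foundation

@MainActor
final class Saver {
  private var pending: Task<Void, Never>?
  private let url: URL
  var text = ""

  init(url: URL) { self.url = url }

  // Debounced async save: wait briefly, then write unless cancelled meanwhile.
  func scheduleSave() {
    pending?.cancel()
    pending = Task { [weak self] in
      try? await Task.sleep(nanoseconds: 500_000_000)
      guard !Task.isCancelled else { return }
      self?.write()
    }
  }

  // Terminate path: cancel the debounce and write immediately, on this thread.
  func saveSynchronously() {
    pending?.cancel()
    pending = nil
    write()
  }

  private func write() {
    try? Data(text.utf8).write(to: url, options: .atomic)
  }
}

@MainActor
func demo() throws -> String {
  let url = FileManager.default.temporaryDirectory
    .appendingPathComponent("flush-demo-\(UUID().uuidString).txt")
  let saver = Saver(url: url)
  saver.text = "unsaved edit"
  saver.scheduleSave()        // would only fire after the delay
  saver.saveSynchronously()   // the file is on disk when this returns
  return try String(contentsOf: url, encoding: .utf8)
}

// Top-level code runs on the main thread, so asserting isolation is safe.
let saved = try MainActor.assumeIsolated { try demo() }
```

The point of the sketch is the ordering: the flush does not wait for the pending task, it replaces it.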

The test pattern that ties it together

Every schema bump in mq-dir has a corresponding test:

func testMigration_v2_to_v3_preservesAllFields() throws {
  // A v2 payload, hand-rolled
  let v2Json = """
  {
    "version": 2,
    "projects": [
      {
        "id": "...",
        "name": "Default",
        "layout": "four",
        "panes": [...]
      }
    ]
  }
  """.data(using: .utf8)!
  
  let state = try JSONDecoder().decode(WorkspaceState.self, from: v2Json)
  
  XCTAssertEqual(state.version, 3)
  XCTAssertEqual(state.projects.count, 1)
  XCTAssertEqual(state.projects[0].layout, .four)
  // every field that was in v2 must round-trip
  // every field added in v3 must have its default
  
  let reEncoded = try JSONEncoder().encode(state)
  let reDecoded = try JSONDecoder().decode(WorkspaceState.self, from: reEncoded)
  XCTAssertEqual(state, reDecoded)
}

Three properties this test enforces:

  1. Old data decodes. v2 payloads still load on v3 code.
  2. New fields have defaults. No crash on missing data.
  3. Round-trip stability. Encoding then decoding produces identical state.
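
Properties 1 and 3 are the same for every type, so they can be factored into a helper. This is a sketch, assuming the persisted type is both Codable and Equatable; decodeAndRoundTrip and the Settings type are hypothetical names.

```swift
import Foundation

// Hypothetical helper: decode a legacy payload, then check that encoding and
// decoding the result again yields an equal value.
func decodeAndRoundTrip<T: Codable & Equatable>(
  _ type: T.Type, from legacy: Data
) throws -> (value: T, stable: Bool) {
  let value = try JSONDecoder().decode(type, from: legacy)
  let again = try JSONDecoder().decode(type, from: JSONEncoder().encode(value))
  return (value, value == again)
}

// Small stand-in model: `fontSize` was added after the payload below was saved.
struct Settings: Codable, Equatable {
  var theme: String
  var fontSize: Int

  init(from decoder: Decoder) throws {
    let c = try decoder.container(keyedBy: CodingKeys.self)
    theme = try c.decode(String.self, forKey: .theme)
    fontSize = try c.decodeIfPresent(Int.self, forKey: .fontSize) ?? 13
  }
}

let legacy = Data(#"{"theme": "dark"}"#.utf8)
let result = try decodeAndRoundTrip(Settings.self, from: legacy)
```

Inside an XCTest case, result.stable would go through XCTAssertTrue and the per-field defaults through XCTAssertEqual, leaving each migration test to spell out only what is specific to that schema bump.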

CONTRIBUTING.md in mq-dir says: every PR that changes a Codable struct must include a testMigration_vN_to_vN+1_* test. Reviewers reject PRs without one.

Common mistakes we made and recovered from

For honesty:

  1. Renamed a field, no version bump. Old saves had the old key. New decoder didn't find it. Fix: re-added the old key as decodeIfPresent, mapped to the new name.
  2. Used Date() as default for a missing date field. Future-you sees a "creation date" of "yesterday" for a project the user made last year. Fix: explicit .distantPast for unknown.
  3. Switched a stored enum's raw value. Old data had "list", new code expects 0. Fix: never change a Codable enum's raw representation. Add new cases; never reorder; never re-key.

The third one is a wound that doesn't heal. Once you ship a Codable enum, the raw values are public API.
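
The fixes for the first two mistakes fit in one decoder. The sketch below assumes a hypothetical rename from "title" to "name" and a createdAt field that older saves lack; the Project type is a stand-in, not mq-dir's.

```swift
import Foundation

struct Project: Codable {
  var name: String        // stored as "title" in older saves (hypothetical rename)
  var createdAt: Date

  enum CodingKeys: String, CodingKey {
    case name, createdAt
    case legacyTitle = "title"   // read-only: never written back
  }

  init(from decoder: Decoder) throws {
    let c = try decoder.container(keyedBy: CodingKeys.self)
    // Prefer the new key, fall back to the old one.
    if let name = try c.decodeIfPresent(String.self, forKey: .name) {
      self.name = name
    } else {
      self.name = try c.decode(String.self, forKey: .legacyTitle)
    }
    // Unknown creation time is marked as such, not stamped with "now".
    createdAt = try c.decodeIfPresent(Date.self, forKey: .createdAt) ?? .distantPast
  }

  func encode(to encoder: Encoder) throws {
    var c = encoder.container(keyedBy: CodingKeys.self)
    try c.encode(name, forKey: .name)
    try c.encode(createdAt, forKey: .createdAt)
  }
}

let old = try JSONDecoder().decode(
  Project.self, from: Data(#"{"title": "Default"}"#.utf8))

// Inspect the re-encoded keys: only the new name should be written.
let rewritten = try JSONSerialization.jsonObject(
  with: JSONEncoder().encode(old)) as? [String: Any]
```

encode(to:) has to be written by hand here because CodingKeys contains a case with no matching property, which rules out the synthesized version.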

What we ship now

Every persisted type in mq-dir's mqdirCore module:

  • Has a hand-rolled init(from:) with decodeIfPresent for new fields.
  • Carries a version field if it is a root persisted type.
  • Has a corresponding migration test for every schema bump.
  • Uses synchronous flush on terminate.

The user's state.json from v0.1.0-alpha.1 still loads on the latest build. That's the contract.

If you're shipping any non-trivial Codable persistence in Swift, the patterns generalize. The failure mode you're guarding against — silent data loss on schema change — is shockingly common in the wild. A handful of Codable habits eliminates it.

Open source

mq-dir is fully open source.

MIT licensed, zero telemetry. Read the source, file an issue, send a PR.


Frequently asked questions

When do I need to bump the version?

When the change isn't backward-compatible. Adding a new field with a default? No version bump needed. Renaming a field, changing its type, removing it, or restructuring an object? Version bump and explicit migration.
