Networking in modern iOS: typed endpoints, retries/backoff, and observability without bloat
A practical URLSession setup that scales: typed endpoints and decoding, retry rules that do not create duplicate side effects, and lightweight logging/metrics so you can measure reliability and latency.
Networking code tends to start simple and then quietly become your app’s least testable, least observable subsystem.
You do not need a “networking layer framework” to fix that.
What you need is:
- a typed way to describe requests and decode responses
- explicit rules for retries and backoff
- enough instrumentation to answer: “what failed, for whom, and how often?”
This post outlines a small set of patterns that stay readable in a product codebase.
1) Model the API as typed endpoints
A typed endpoint is a request description that can build a URLRequest and knows the expected response type.
Keep it boring:
- no global mutable state
- no magic stringly paths spread across features
- no decoding hidden inside view models
A minimal endpoint definition:
```swift
import Foundation

enum HTTPMethod: String {
    case get = "GET"
    case post = "POST"
    case put = "PUT"
    case delete = "DELETE"
}

struct Endpoint<Response: Decodable> {
    var method: HTTPMethod
    var path: String
    var query: [URLQueryItem] = []
    var headers: [String: String] = [:]
    var body: Data? = nil

    func makeRequest(baseURL: URL) throws -> URLRequest {
        var components = URLComponents(url: baseURL.appendingPathComponent(path), resolvingAgainstBaseURL: false)
        components?.queryItems = query.isEmpty ? nil : query
        guard let url = components?.url else {
            throw URLError(.badURL)
        }
        var request = URLRequest(url: url)
        request.httpMethod = method.rawValue
        request.httpBody = body
        headers.forEach { request.setValue($1, forHTTPHeaderField: $0) }
        return request
    }
}
```
Usage stays straightforward:
```swift
struct UserDTO: Decodable {
    let id: String
    let email: String
}

extension Endpoint where Response == UserDTO {
    static func user(id: String) -> Self {
        Endpoint(method: .get, path: "/v1/users/\(id)")
    }
}
```
The point is not purity. The point is to centralize the request shape and response type.
2) One API client: decoding, errors, and cancellation
A good client does three things well:
- executes a `URLRequest`
- decodes success responses
- produces an error that is useful to log and to show
Start with an error type you can reason about:
```swift
enum APIError: Error {
    case transport(URLError)
    case server(status: Int, body: Data?)
    case decoding(Error)
    case invalidResponse
}
```
Client implementation:
```swift
import Foundation

final class APIClient {
    private let baseURL: URL
    private let session: URLSession
    private let decoder: JSONDecoder

    init(baseURL: URL, session: URLSession = .shared, decoder: JSONDecoder = JSONDecoder()) {
        self.baseURL = baseURL
        self.session = session
        self.decoder = decoder
    }

    func send<Response: Decodable>(_ endpoint: Endpoint<Response>) async throws -> Response {
        let request = try endpoint.makeRequest(baseURL: baseURL)
        do {
            let (data, response) = try await session.data(for: request)
            guard let http = response as? HTTPURLResponse else {
                throw APIError.invalidResponse
            }
            guard (200...299).contains(http.statusCode) else {
                throw APIError.server(status: http.statusCode, body: data)
            }
            do {
                return try decoder.decode(Response.self, from: data)
            } catch {
                throw APIError.decoding(error)
            }
        } catch let urlError as URLError {
            throw APIError.transport(urlError)
        }
    }
}
```
This buys you:
- consistent error mapping
- correct cancellation behavior via `async/await`
- a single place to add headers like auth and request IDs
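One concrete version of that "single place" is a helper that stamps every attempt with a fresh correlation header. This is a sketch: the `stampRequestID` helper name is my own, while `X-Request-ID` is the header convention used later in this post.

```swift
import Foundation

// Stamp an outgoing request with a unique X-Request-ID so client and
// server logs can be correlated. Returning the ID lets the caller attach
// it to error logs and bug reports.
extension URLRequest {
    @discardableResult
    mutating func stampRequestID() -> String {
        let id = UUID().uuidString
        setValue(id, forHTTPHeaderField: "X-Request-ID")
        return id
    }
}
```

In `APIClient.send`, you would call `request.stampRequestID()` right before `session.data(for:)`, so every retry attempt gets its own ID.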
3) Retries and backoff: decide what is safe
Retries are not an “on/off” feature.
The only correct retry policy is one that encodes which failures are transient and which requests are safe to repeat.
A practical policy:
- retry on transport errors like `.timedOut` and `.networkConnectionLost`
- retry on 502/503/504 with exponential backoff and jitter
- do not retry requests with side effects unless they are idempotent
A simple retry wrapper:
```swift
import Foundation // for pow

struct RetryPolicy {
    var maxAttempts: Int = 3
    var baseDelaySeconds: Double = 0.4

    func shouldRetry(error: Error, attempt: Int, request: URLRequest) -> Bool {
        guard attempt < maxAttempts else { return false }

        // Only retry safe methods by default.
        let method = request.httpMethod?.uppercased()
        let isSafeMethod = (method == "GET" || method == "HEAD")
        if isSafeMethod == false {
            return false
        }

        if let api = error as? APIError {
            switch api {
            case .transport(let urlError):
                switch urlError.code {
                case .timedOut, .networkConnectionLost, .notConnectedToInternet, .cannotFindHost, .cannotConnectToHost:
                    return true
                default:
                    return false
                }
            case .server(let status, _):
                return status == 502 || status == 503 || status == 504
            default:
                return false
            }
        }
        return false
    }

    func delaySeconds(attempt: Int) -> Double {
        // Exponential backoff with jitter.
        let exp = baseDelaySeconds * pow(2.0, Double(attempt - 1))
        let jitter = Double.random(in: 0...0.2)
        return exp + jitter
    }
}
```
And an API client method that uses it:
```swift
extension APIClient {
    func send<Response: Decodable>(
        _ endpoint: Endpoint<Response>,
        retryPolicy: RetryPolicy
    ) async throws -> Response {
        let request = try endpoint.makeRequest(baseURL: baseURL)
        var attempt = 1
        while true {
            do {
                return try await send(endpoint)
            } catch {
                if retryPolicy.shouldRetry(error: error, attempt: attempt, request: request) {
                    let delay = retryPolicy.delaySeconds(attempt: attempt)
                    try await Task.sleep(nanoseconds: UInt64(delay * 1_000_000_000))
                    attempt += 1
                    continue
                }
                throw error
            }
        }
    }
}
```
Concrete failure mode: duplicate side effects caused by retries
A common production incident:
- You introduce retries globally.
- A `POST /purchase` times out on a slow cellular network.
- The client retries.
- The server processes both requests.
Users see duplicate receipts or duplicate credits.
Diagnosis path that works:
- Add a client-generated request ID header (for example `X-Request-ID`) on every request.
- Log it on the server alongside the operation identifier (order id, purchase id).
- When an incident happens, search logs for the same user and two different `X-Request-ID` values that map to the same operation time window.
Fix:
- for side-effecting endpoints, require an idempotency key (for example `Idempotency-Key`) and have the server deduplicate
- if you cannot guarantee idempotency, do not auto-retry that endpoint
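A minimal sketch of the client side of that fix, assuming the server deduplicates on an `Idempotency-Key` header. The `PurchaseOperation` type and the `/v1/purchases` path here are hypothetical, not part of the client above. The key point is that the key is generated once per logical operation and reused across retries, not regenerated per attempt.

```swift
import Foundation

// One idempotency key per logical purchase: if the request is retried,
// the same key is sent again and a deduplicating server processes the
// purchase at most once.
struct PurchaseOperation {
    let idempotencyKey = UUID().uuidString // fixed for the operation's lifetime

    func makeRequest(baseURL: URL, body: Data) -> URLRequest {
        var request = URLRequest(url: baseURL.appendingPathComponent("/v1/purchases"))
        request.httpMethod = "POST"
        request.httpBody = body
        request.setValue(idempotencyKey, forHTTPHeaderField: "Idempotency-Key")
        return request
    }
}
```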
4) Observability without bloat: log what you need, not everything
Two goals:
- developers can debug individual failures
- the team can measure trends (error rate, latency, retry rate)
Add request correlation
At minimum:
- `X-Request-ID`: unique per attempt
- `X-Session-ID` or user identifier: only if your privacy model allows it
Keep request IDs in logs, crash reports, and bug reports.
Record per-request metrics
URLSessionTaskMetrics gives you timing and network details for each task. You can use it to answer:
- are we DNS bound?
- is TLS handshaking the bottleneck?
- did we hit HTTP/2 connection reuse?
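A minimal sketch of a delegate that surfaces those numbers. The delegate method and the metrics fields are real `URLSessionTaskMetrics` API; the log format is illustrative.

```swift
import Foundation

// Debug-only task delegate that logs coarse per-transaction timing.
final class MetricsLogger: NSObject, URLSessionTaskDelegate {
    func urlSession(_ session: URLSession,
                    task: URLSessionTask,
                    didFinishCollecting metrics: URLSessionTaskMetrics) {
        for transaction in metrics.transactionMetrics {
            guard let start = transaction.fetchStartDate,
                  let end = transaction.responseEndDate else { continue }
            let duration = end.timeIntervalSince(start)
            let path = task.originalRequest?.url?.path ?? "?"
            print("\(path) took \(duration)s, reused connection: \(transaction.isReusedConnection)")
        }
    }
}
```

Install it with `URLSession(configuration:delegate:delegateQueue:)` or, on iOS 15+, as a per-task delegate.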
A lightweight verification step you can do this week:
- In a debug build, attach a `URLSessionTaskDelegate` and log `taskMetrics.transactionMetrics.first?.fetchStartDate` and `responseEndDate` for a representative endpoint.
- Run the same flow 10 times on Wi‑Fi and 10 times on cellular.
- Compute p50 and p95 durations.
- After introducing a retry policy or caching change, repeat and compare.
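The p50/p95 step is just a percentile over the measured durations. A small sketch using the nearest-rank method (the `percentile` helper is my own, not a library function):

```swift
// Nearest-rank percentile: sort, then take the value whose rank is
// ceil(p/100 * n), clamped to the valid index range.
func percentile(_ values: [Double], _ p: Double) -> Double? {
    guard !values.isEmpty else { return nil }
    let sorted = values.sorted()
    let rank = Int((p / 100.0 * Double(sorted.count)).rounded(.up)) - 1
    return sorted[max(0, min(rank, sorted.count - 1))]
}
```

With 10 samples per network condition, `percentile(durations, 50)` and `percentile(durations, 95)` give you comparable numbers before and after a change.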
If you cannot reproduce the difference locally, you likely need server-side instrumentation too.
5) Keep feature code clean: inject the client, do not singleton it
A typed endpoint setup pays off when features stay dumb:
- feature owns its endpoint definitions
- app layer owns baseURL, auth, session configuration
- tests can inject a stub session or a fake client
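One way to sketch that seam: features depend on a minimal protocol, the app wires in the real client, and tests wire in a stub that never touches the network. The `APITransport` and `StubTransport` names here are suggestions, not types from the client shown earlier.

```swift
import Foundation

// The seam feature code depends on.
protocol APITransport {
    func send<Response: Decodable>(_ path: String, as type: Response.Type) async throws -> Response
}

// Test double: decodes a canned JSON payload instead of hitting the network.
struct StubTransport: APITransport {
    var json: Data

    func send<Response: Decodable>(_ path: String, as type: Response.Type) async throws -> Response {
        try JSONDecoder().decode(Response.self, from: json)
    }
}
```

In production, a thin adapter over `APIClient` conforms to the same protocol.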
If you want a single rule to keep the layer from expanding forever:
- endpoints describe requests
- client executes and maps errors
- features decide what to do with the result
That separation is what keeps networking maintainable six months later.