iOS CI that stays fast as the app grows

Slow iOS CI does not arrive dramatically.

It creeps in politely.

One more UI test. One more Swift package. One more target. One more build phase that runs a script from 2019 because nobody wants to ask what it does. Six months later, every pull request waits forty minutes for a pipeline that still misses the bug.

That is not a tooling problem. That is operational debt with a progress bar.

Fast CI is not about worshipping at the altar of caching. Caches help. Better machines help. Parallelism helps. But none of them rescue a pipeline that does too much work, runs checks in the wrong order, and treats every change like it might have rewritten the rendering engine.

A good iOS CI system has one job:

Give engineers the shortest trustworthy path from change to confidence.

Not the shortest path to a green badge. Confidence.

There is a difference. Many teams learn it after shipping a green build that should have been red.

1. Split the pipeline by question

A slow pipeline usually asks every question at once.

Does the code compile?
Did formatting change?
Did unit behavior regress?
Does navigation still work?
Did the app archive?
Did signing explode again because certificates are a lifestyle disease?

These are different questions. They do not deserve the same lane.

A practical iOS CI setup has layers:

PR gate: the fastest checks required before review or merge.
Merge gate: broader checks before code lands on main.
Release gate: archive, signing, TestFlight, and slow confidence checks.
Nightly or scheduled checks: expensive suites that should not punish every small PR.

The PR gate should be boring and sharp:

formatting and linting
package resolution validation
build for the main app scheme
unit tests for affected or core modules
a tiny set of smoke UI tests only if they are genuinely stable
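
Writing the gates down as data keeps the split reviewable. A minimal sketch in Python, assuming hypothetical trigger labels and check names; none of them come from a particular CI provider.

```python
# Hypothetical trigger labels and check names, for illustration only.
GATE_FOR_EVENT = {
    "pull_request": "pr",
    "merge_queue": "merge",
    "tag": "release",
    "schedule": "nightly",
}

CHECKS = {
    "pr": ["lint", "package-resolution", "build-app", "unit-tests", "smoke-ui-tests"],
    "merge": ["lint", "package-resolution", "build-app", "unit-tests", "integration-tests"],
    "release": ["archive", "signing", "testflight-upload"],
    "nightly": ["full-ui-tests", "slow-confidence-checks"],
}


def checks_for(event: str) -> list[str]:
    """Return the checks for a trigger; unknown triggers fall back to the PR gate."""
    gate = GATE_FOR_EVENT.get(event, "pr")
    return CHECKS[gate]
```

A typo fix that arrives as a pull request gets the first list and nothing else.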

Do not run the whole universe on every typo fix. That is not discipline. That is a tax.

2. Make schemes small enough to be useful

Many iOS CI problems start inside Xcode before CI gets involved.

If the only reliable command is “build the entire app,” then every check inherits the full app’s cost. The pipeline cannot be smarter than the project structure allows.

Useful schemes are narrow:

App for the main application
CoreTests for pure logic
NetworkingTests for API behavior with stubs
PersistenceTests for storage and migration logic
DesignSystemTests for components that can be tested without launching the full app
AppUITests for the small number of flows that must run through the simulator

This is not architecture cosplay. It changes feedback speed.

If a pricing rule changes, CI should not need to boot a simulator, install the app, and wait for a tab bar to settle like it is negotiating a hostage release.

That test belongs in a fast target.
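
Narrow schemes also make it possible to pick tests from the changed paths. A rough sketch, assuming a hypothetical Sources/ directory layout mapped onto the scheme names above; a real project would derive this from its own module graph.

```python
# Hypothetical directory prefixes mapped to the narrow test schemes that cover them.
SCHEME_FOR_PREFIX = {
    "Sources/Core/": "CoreTests",
    "Sources/Networking/": "NetworkingTests",
    "Sources/Persistence/": "PersistenceTests",
    "Sources/DesignSystem/": "DesignSystemTests",
}


def schemes_for(changed_paths: list[str]) -> set[str]:
    """Pick the test schemes touched by a change.

    Any path outside the mapping makes the function fall back to every mapped scheme.
    """
    schemes = set()
    for path in changed_paths:
        matches = [s for prefix, s in SCHEME_FOR_PREFIX.items() if path.startswith(prefix)]
        if not matches:
            # Unknown territory: be conservative and run everything.
            return set(SCHEME_FOR_PREFIX.values())
        schemes.update(matches)
    return schemes
```

With this mapping, a change to a pricing rule under Sources/Core/ selects CoreTests only, and no simulator needs to boot.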

The useful rule:

Every expensive test should justify why a cheaper test cannot catch the same failure.

Most cannot.

3. Put the fastest failure first

CI should fail in the order that wastes the least time.

Run cheap deterministic checks before expensive ones:

validate generated files are up to date
run formatter and linter checks
resolve packages
build the affected scheme
run unit tests
run integration tests
run UI tests
run archive and signing checks
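
The ordering can be enforced with a runner that stops at the first failure and reports how long that failure took to surface. A minimal sketch: each check is a hypothetical callable returning True on success, which in a real pipeline would wrap the actual command.

```python
import time


def run_stages(stages):
    """Run (name, check) pairs in order and stop at the first failing check.

    Returns (name_of_failed_stage_or_None, seconds_elapsed).
    """
    start = time.monotonic()
    for name, check in stages:
        if not check():
            # Later, more expensive stages never start.
            return name, time.monotonic() - start
    return None, time.monotonic() - start
```

Listing the stages cheapest first means the elapsed time returned on failure is the time to first failure, which is worth recording.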

This sounds obvious. Many pipelines still start by booting a simulator while a formatting error waits patiently to ruin the day fifteen minutes later.

That is negligence wearing YAML.

The first stage should answer: “Is this change obviously not mergeable?”

If yes, fail quickly and loudly.

Developers tolerate CI when it gives them a useful answer before they have mentally moved to another task. Once the feedback loop crosses the coffee-break threshold, people start batching changes, ignoring failures, and treating CI as weather.

Weather is not a quality strategy.

4. Cache the right things, not random folders with hope

Caching can make iOS CI much faster. It can also create a haunted pipeline where failures depend on which stale artifact the runner found under the floorboards.

Cache only things with clear invalidation rules.

Good candidates:

Swift Package Manager checkouts and resolved artifacts keyed by a hash of Package.resolved
CocoaPods or Carthage artifacts if the project still uses them
DerivedData keyed by Xcode version, SDK, dependency lockfile, and relevant build settings
simulator runtimes if your CI provider supports caching them safely

Bad candidates:

broad DerivedData reuse across unrelated branches
build products keyed only by branch name
generated files that should be recreated deterministically
anything that makes a clean build behave differently from a cached build

The cache key matters more than the number of cache steps in the blog post you copied.

For Xcode builds, include at least:

Xcode version
destination SDK
dependency lockfile hash
package manager version when relevant
build configuration
scheme or target when the cache is not universal
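
One way to honor that list is to derive the key from every input, so a change to any of them misses the cache instead of reusing a stale one. A sketch under assumptions: the function name, parameter names, and the deriveddata- prefix are illustrative.

```python
import hashlib


def cache_key(xcode_version, sdk, lockfile_bytes, configuration, scheme, tool_version=""):
    """Build a deterministic cache key from every input that should invalidate the cache."""
    lock_hash = hashlib.sha256(lockfile_bytes).hexdigest()
    parts = [xcode_version, sdk, lock_hash, tool_version, configuration, scheme]
    # Join with a separator that cannot appear in the parts, so ("a", "bc") != ("ab", "c").
    digest = hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()
    return f"deriveddata-{digest[:16]}"
```

The same inputs always produce the same key, and bumping the Xcode version alone is enough to produce a new one.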

Then measure cache hit rate and actual time saved.

A cache that saves thirty seconds but causes one mysterious red build per week is not optimization. It is a prank with invoices.

5. Treat UI tests as scarce, not sacred

UI tests are useful. They are also expensive, fragile, and frequently overused by teams trying to compensate for missing lower-level tests.

A fast iOS CI pipeline does not eliminate UI tests. It makes them deliberate.

Use UI tests for flows where the integration matters:

app launch and first screen readiness
login or session restore
one critical purchase or upgrade path
one navigation smoke test through the main shell
a deep link path that has historically broken

Do not use UI tests to validate every text field, formatter, and disabled button. That is how a suite becomes a slot machine.

Stability basics are non-negotiable:

fixed locale
fixed time zone
deterministic seed data
animations disabled in test mode
network calls routed through stubs
accessibility identifiers for meaningful states
screenshots and UI hierarchy captured on failure

If a UI test cannot run ten times in a row on the same CI executor, it should not gate merges.
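
That rule can be checked against recorded results. A small sketch, assuming a hypothetical history of pass/fail outcomes per test, oldest first, collected on the same kind of executor.

```python
def gate_eligible(history, required_streak=10):
    """Return the tests whose most recent required_streak runs all passed.

    history maps a test name to a list of booleans, oldest run first.
    Tests with fewer recorded runs than required_streak are not eligible yet.
    """
    eligible = set()
    for test, outcomes in history.items():
        recent = outcomes[-required_streak:]
        if len(recent) == required_streak and all(recent):
            eligible.add(test)
    return eligible
```

Anything outside the returned set can still run and report. It just does not block a merge.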

A flaky gate is worse than no gate. No gate is honest. A flaky gate trains people to rerun until morale improves.

6. Shard by duration, not by file count

Parallelism helps only when the work is balanced.

Splitting tests by file count is a beginner mistake. One UI test file can take twelve minutes while ten unit test files finish before Xcode has finished clearing its throat.

Shard by historical duration.

Keep a simple record of test runtime per test case or test bundle. Then build shards that target roughly equal duration:

shard 1: 8 minutes
shard 2: 8 minutes
shard 3: 8 minutes
shard 4: 8 minutes

Not:

shard 1: 3 minutes
shard 2: 4 minutes
shard 3: 26 minutes
shard 4: done instantly and now judging everyone

Most CI providers expose enough metadata to do this. If not, start simple: record xcodebuild result bundles, parse durations, and rebalance periodically.

The goal is not perfect scheduling. The goal is avoiding one long tail that makes all the parallel runners cosmetic.
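
A greedy longest-first pass is usually enough to avoid the long tail. This sketch assumes a hypothetical dictionary of historical durations in seconds per test bundle; it is a heuristic, not an optimal scheduler.

```python
import heapq


def build_shards(durations, shard_count):
    """Assign tests to shards, longest first, always into the currently lightest shard.

    Returns (shards, totals): the test names per shard and the summed duration per shard.
    """
    heap = [(0.0, index) for index in range(shard_count)]
    heapq.heapify(heap)
    shards = [[] for _ in range(shard_count)]
    for test, seconds in sorted(durations.items(), key=lambda item: item[1], reverse=True):
        total, index = heapq.heappop(heap)
        shards[index].append(test)
        heapq.heappush(heap, (total + seconds, index))
    totals = [0.0] * shard_count
    for total, index in heap:
        totals[index] = total
    return shards, totals
```

Rebuilding the shards from fresh durations on a schedule keeps them balanced as the suite changes.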

7. Make result bundles first-class artifacts

When CI fails, the useful question is not “red or green?”

It is:

what failed?
on which simulator?
on which OS?
with which seed data?
what did the UI look like?
did this test fail before?
how long did it take compared with the baseline?

For iOS, .xcresult bundles are not optional decoration. They are the evidence.

Store them. Link them from the CI job. Extract summaries. Keep screenshots and logs. If a UI test fails, attach the screenshot and accessibility hierarchy. If a build fails, surface the actual compiler error instead of asking engineers to spelunk through a thousand-line log like it is a cursed treasure map.

Good CI reduces the distance between failure and diagnosis.

Bad CI says “exit code 65” and goes to lunch.
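
Surfacing the compiler error can start as a few lines of log parsing. A sketch that assumes diagnostics in the common file:line:column: error: message form; real logs vary, and the result bundle remains the better source when it is available.

```python
import re

# Matches diagnostics shaped like "path/File.swift:12:5: error: message".
DIAGNOSTIC = re.compile(
    r"^(?P<file>[^:\n]+):(?P<line>\d+):(?P<column>\d+): error: (?P<message>.+)$"
)


def compiler_errors(log_text):
    """Return unique (file, line, message) tuples for compiler errors, in log order."""
    seen = []
    for raw in log_text.splitlines():
        match = DIAGNOSTIC.match(raw.strip())
        if match:
            entry = (match["file"], int(match["line"]), match["message"])
            if entry not in seen:
                seen.append(entry)
    return seen
```

Posting that short list on the pull request turns a thousand-line log into the one line that matters.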

8. Keep dependency work out of the hot path

Dependency resolution can quietly dominate iOS CI.

Swift Package Manager has improved, but “improved” does not mean “free.” If every PR resolves packages from scratch, checks out the world, and rebuilds every dependency, the pipeline will feel slow even when the app code is fine.

Practical rules:

commit and review Package.resolved
fail CI if dependency resolution changes unexpectedly
cache package artifacts with strict keys
separate dependency update PRs from feature PRs
run heavier compatibility checks only on dependency PRs
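
Detecting an unexpected resolution change can be as plain as hashing the lockfile before and after the resolve step. A minimal sketch: resolve is a hypothetical callable standing in for whatever command the project uses to resolve packages.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """SHA-256 of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def resolution_changed(lockfile_path, resolve):
    """Run resolve() and report whether it rewrote the lockfile."""
    before = file_digest(lockfile_path)
    resolve()
    return file_digest(lockfile_path) != before
```

On a feature PR, a True result is a reason to fail the job and point the author at a separate dependency PR.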

A normal feature PR should not discover that a transitive package moved, broke, or now requires a different Swift tools version. That is not feedback. That is a drive-by.

If dependency updates are automated, batch them on a schedule and make their CI broader. Keep daily product work focused.

9. Track pipeline time like a product metric

Teams improve what they measure.

Track at least:

median PR gate duration
p90 PR gate duration
time to first useful failure
cache hit rate
test duration by target
flake rate by test
queue wait time before a runner starts

The most important number is often time to first useful failure.

A twenty-minute pipeline that reports a compile error at minute two is less harmful than a fifteen-minute pipeline that hides the obvious failure until the end.

Set budgets:

lint and generated-file checks: under 2 minutes
build-only PR gate: under 8 minutes
unit test PR gate: under 12 minutes
smoke UI test gate: under 15 minutes
release archive: slower is acceptable, but visible

Budgets are not wishes. If the pipeline exceeds them, someone owns the regression.

Otherwise CI time becomes a commons, and every team adds “just one more check” until everyone is paying rent to a loading spinner.
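
Budgets are easier to own when a script checks them. A sketch using the PR-gate budgets above, with two assumptions: durations are recorded in minutes per stage, and the budget is compared against the p90 rather than the median.

```python
import statistics

# Budgets in minutes, from the list above; the stage keys are hypothetical labels.
BUDGETS = {"lint": 2, "build": 8, "unit": 12, "smoke_ui": 15}


def p90(values):
    """Nearest-rank 90th percentile of a non-empty list."""
    ordered = sorted(values)
    rank = (9 * len(ordered) + 9) // 10  # integer form of ceil(0.9 * n)
    return ordered[rank - 1]


def over_budget(durations_by_stage):
    """Return {stage: (median, p90)} for stages whose p90 exceeds the budget."""
    report = {}
    for stage, minutes in durations_by_stage.items():
        if minutes and p90(minutes) > BUDGETS[stage]:
            report[stage] = (statistics.median(minutes), p90(minutes))
    return report
```

A non-empty report is the signal that a regression needs an owner.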

10. Do not let release checks poison PR feedback

Release automation has different goals from PR validation.

A release lane may need:

archive validation
signing
provisioning profiles
export options
symbol upload
TestFlight upload
screenshot generation
release notes
notarization for macOS targets

That work is important. It is also slow and failure-prone because Apple tooling occasionally behaves like a vending machine that took your money and returned a provisioning error.

Keep it out of the normal PR gate unless the PR specifically touches release infrastructure.

Run release checks:

on main
on release branches
on tags
manually before a scheduled release
when signing or build settings change

Do not make every copy change prove that App Store Connect still exists.
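
The trigger rules fit in one function. A sketch, assuming hypothetical event names, a release/ branch prefix, and illustrative path patterns for what counts as release infrastructure.

```python
import fnmatch

# Hypothetical patterns for files that count as release infrastructure.
RELEASE_PATHS = ["fastlane/*", "*.xcconfig", "*.entitlements", "*ExportOptions.plist"]


def should_run_release_checks(event, ref, changed_paths, manual=False):
    """Decide whether the release lane runs for this trigger."""
    if manual or event == "tag":
        return True
    if ref == "main" or ref.startswith("release/"):
        return True
    # Otherwise run only when the change touches signing or build settings.
    return any(
        fnmatch.fnmatchcase(path, pattern)
        for path in changed_paths
        for pattern in RELEASE_PATHS
    )
```

A copy change on a feature branch returns False and stays in the fast lane.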

11. The boring architecture wins

Fast CI is mostly boring architecture applied consistently:

small modules
narrow schemes
deterministic tests
clear ownership of slow checks
reliable artifacts
strict dependency control
measured flake rate
measured duration

There is no magic YAML incantation that compensates for a codebase where everything depends on everything and the only test strategy is “launch the app and hope.”

CI performance is a design constraint. Treat it that way early.

A healthy iOS pipeline should make the right behavior easy:

small PRs get fast answers
risky changes get deeper verification
release work has its own lane
failures are explainable
slowdowns are noticed before they become culture

That is the standard.

Not heroic. Not fancy. Just grown-up engineering with fewer minutes sacrificed to the spinner gods.