How Modern Apparel Brands Approach Sizing & Replenishment

Replenishment is a lagging optimization — it cannot recover a structurally wrong initial size curve. This guide synthesizes findings from senior apparel planning leaders on the upstream/downstream interplay between buying, sizing, and in-season replenishment, including the 50/50 allocation benchmark, size-level masking, and the signal hygiene rules that separate real demand from noise.

Key Takeaways

Replenishment is a lagging mechanism. It optimizes at the margin — it does not recover a structurally wrong initial size curve because buy lead times exceed the size-level signal window.
Size-level masking is the hidden risk. A style with 70% product-level sell-through can have core sizes that stocked out on day two and a residual tail of fringe sizes — the real revenue ceiling is invisible in product-level reporting.
For multi-store chains, the 50/50 benchmark outperforms heavy upfront allocation. Allocating ≤50% of the buy up front and holding ≥50% for AI-assisted replenishment cuts inter-store transfer cost and lifts full-price sell-through.
Stable demand signals beat noisy ones. Size-level sell-through, size-level stockout timing, weeks of supply, e-commerce viewed availability, and back-in-stock signups are the signals that move decisions; climate, promotions, CRM return reason codes, and social spikes add noise before they add value.
AI-surfaced size-level inefficiencies require shared ownership. Buying, planning, supply chain, and merchandising own the outcome jointly; one-function ownership fails because the root cause usually spans upstream buy decisions and downstream replenishment parameters.

What this guide is

Sizing and replenishment are usually taught as two separate workflows: buyers build the initial size curve; allocators and supply chain run replenishment in-season. In practice, the handoff between them is where apparel brands lose the most full-price revenue — because the hard limit of what replenishment can recover is rarely made explicit.

This guide synthesizes findings from a panel of senior apparel planning leaders across premium, fast-fashion, and athletic DTC. It covers six positions that reshape how buy planning and replenishment should connect: the structural limit of replenishment, size-level masking, the 50/50 allocation benchmark, demand-signal hygiene, shared ownership of AI-surfaced inefficiencies, and disciplined test-and-learn iteration.

It is written for VPs of Merchandising, Planning Directors, and Allocation leads at mid-market apparel brands. It is aimed at teams that are evaluating AI-based replenishment, redesigning the buy-to-replenishment handoff, or trying to explain to a finance partner why more replenishment agility does not close an underbuy.

1. Replenishment is a lagging mechanism, not a recovery tool

The single most important reframe is this: replenishment optimizes; it does not rebuild. It is a velocity-following mechanism that reallocates existing warehouse inventory based on demand signals that have already fired.

Three structural realities drive this:

Lead times exceed the signal window. By the time a size stocks out in store or online, the customer who wanted it has already left. For seasonal product, a replacement production run cannot be made and delivered before the selling window closes. Replenishment can move existing units from the distribution center (DC) to the door; it cannot create units that were never bought.

The demand signal is downstream of the buy. A sell-through reading at week three is a lagging indicator of a size curve decision made months earlier. The signal tells you what the curve got wrong — it does not create the capacity to fix it.

Good replenishment magnifies a correct buy; it does not rescue an incorrect one. AI-assisted, size-level replenishment at its best extends the full-price revenue ceiling of a well-constructed buy by pushing available stock to the doors that can still sell it. Against a structurally wrong initial size curve, the same algorithm redistributes the same insufficient pool of core sizes and accelerates fringe-size residual.

The operational conclusion: sizing discipline at the buy stage is not optional, and no downstream tool removes the requirement. Buying, merchandising, and planning teams must align on the size curve before the PO is locked — because after that, the ceiling is set.
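The ceiling described above can be sketched in a few lines. This is a minimal illustration, not a production replenishment algorithm: the `replenish` function, the door names, and the idea that per-door demand for a core size is known in advance are all assumptions made for the example. It fills the highest-demand doors first and shows that unmet demand is fixed by the size of the DC pool, whatever the allocation order.

```python
def replenish(dc_pool, door_demand):
    """Ship from a fixed DC pool to doors, highest demand first.

    dc_pool: units of one size still held at the DC (set by the buy).
    door_demand: hypothetical mapping of door -> units that door could still sell.
    Returns (shipments per door, total unmet demand).
    """
    shipments = {}
    remaining = dc_pool
    ranked = sorted(door_demand.items(), key=lambda kv: kv[1], reverse=True)
    for door, need in ranked:
        ship = min(need, remaining)  # can never ship more than the pool holds
        shipments[door] = ship
        remaining -= ship
    unmet = sum(door_demand.values()) - sum(shipments.values())
    return shipments, unmet


shipments, unmet = replenish(100, {"door_a": 80, "door_b": 60, "door_c": 40})
print(shipments)  # {'door_a': 80, 'door_b': 20, 'door_c': 0}
print(unmet)      # 80
```

Changing the ranking rule changes which doors are short, but not the 80 units of unmet demand: that number is total demand minus the pool, and the pool was fixed when the PO was locked.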

2. Size-level masking: when a 70% sell-through style is actually broken

One of the most valuable findings is also one of the least visible in product-level reporting. A style can hit a healthy product-level sell-through rate and still be broken at the size level.

The pattern: a style shows 70% sell-through at the end of the selling window. Product-level dashboards flag it as a winner. Underneath the aggregate, the core sizes (the S/M/L that typically drive 60–80% of category demand) stocked out on day two or day three. What sold through at 70% was a mix of early full-price core-size sales and slow, margin-compressing fringe-size residual. The true revenue ceiling — what the style would have sold through with correct size depth — was much higher, and it is now invisible because the stockout hid the demand signal.

The diagnostic tells are specific:

Day-2 or day-3 core-size stockouts on a new launch.
A style whose residual inventory is 80%+ concentrated in the top and bottom sizes of the range.
E-commerce viewed-availability data that shows disproportionate interest in sold-out sizes on the size selector.
Back-in-stock email signups that cluster on a subset of sizes.

When these tells line up, the style is not a winner — it is a style whose planned size curve was inadequate and whose true ceiling is obscured by the stockout. Size-level reporting — not product-level reporting — is the only layer where this surfaces. This is why AI-based replenishment frequently exposes problems that planning teams "did not have" before: the inefficiency was always there; the reporting layer hid it.
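A rough sketch of how two of these tells could be checked against size-level data follows. The `SizeLine` record, the `is_masked` function, and the 65% "healthy" threshold are assumptions introduced for illustration; the day-3 and 80% thresholds echo the tells listed above. Fringe sizes are treated here as every size outside the core set, which is a simplification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SizeLine:
    size: str
    units_bought: int
    units_sold: int
    stockout_day: Optional[int]  # day the size first hit zero; None if it never did


def product_sell_through(lines):
    """Aggregate sell-through across all sizes of one style."""
    bought = sum(l.units_bought for l in lines)
    sold = sum(l.units_sold for l in lines)
    return sold / bought if bought else 0.0


def is_masked(lines, core_sizes, healthy_st=0.65, early_day=3, fringe_share=0.80):
    """Flag a style that looks healthy in aggregate but shows size-level tells.

    healthy_st is an illustrative assumption; early_day and fringe_share
    mirror the day-2/3 stockout and 80% residual tells.
    """
    looks_healthy = product_sell_through(lines) >= healthy_st
    early_core_stockout = any(
        l.size in core_sizes
        and l.stockout_day is not None
        and l.stockout_day <= early_day
        for l in lines
    )
    residual = {l.size: l.units_bought - l.units_sold for l in lines}
    total_residual = sum(residual.values())
    fringe_residual = sum(u for s, u in residual.items() if s not in core_sizes)
    fringe_heavy = total_residual > 0 and fringe_residual / total_residual >= fringe_share
    return looks_healthy and early_core_stockout and fringe_heavy


style = [
    SizeLine("XS", 100, 40, None),
    SizeLine("S", 100, 100, 2),
    SizeLine("M", 100, 100, 2),
    SizeLine("L", 100, 100, 3),
    SizeLine("XL", 100, 10, None),
]
print(product_sell_through(style))         # 0.7
print(is_masked(style, {"S", "M", "L"}))   # True
```

The made-up style above sells through at 70% in aggregate, yet every core size was gone by day three and all remaining stock sits in XS and XL. A product-level dashboard would report only the first number.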

Why it matters for buy planning. If you do not correct the size curve for next season's buy, you will repeat the same ceiling — and re-experience the same "healthy" product-level sell-through mixed with hidden demand loss. The next buy has to be informed by the size-level tells, not by the product-level summary.

3. The 50/50 benchmark for multi-store chains

A frequent reflex — especially for fast-moving brands under retail pressure — is to push as much stock into stores as possible on the initial allocation, and to fix imbalances with inter-store transfers in-season. This looks efficient. It is not.

The field-tested benchmark for multi-store chains with continuity-heavy assortments: allocate no more than 50% of the buy up front; hold the rest for AI-assisted replenishment against real size-level demand.

The reasoning is operational, not theoretical:

Inter-store transfers have real cost. When 80% of the buy is already in stores, a wrong initial allocation can only be corrected by physically moving units between doors. Picking, packing, transit, receiving, and shelving all consume labour; the units are unsellable while in transit; and every transfer is rework caused by an avoidable allocation error.

Replenishment against live demand beats upfront forecasts. A warehouse-held pool of inventory can be deployed against actual size-level sell-through data by store cluster. An already-distributed pool cannot — it can only be reshuffled.

Initial allocation is a forecast; replenishment is a fact. The more of the buy that remains flexible when real demand data arrives, the higher the revenue ceiling.

The appropriate holdback range depends on fleet size and product mix:

| Business profile | Holdback | Why |
|---|---|---|
| Emerging DTC, 1–10 doors, seasonal product | 15–25% | Short in-season window, low inter-door transfer cost, small fleet means upfront allocation is naturally less risky |
| Mid-market omni, 10–50 doors | 30–40% | Mix of continuity and seasonal product; some transfer capability but meaningful friction at scale |
| Multi-store chain, 50+ doors, continuity-heavy | 40–60% | High transfer cost if initial allocation is wrong; AI-assisted replenishment can absorb the back half against live signals |

Apply the benchmark before the first receipt, not after. The holdback decision is a receipt-plan decision — not an in-season decision. If the receipt calendar is built assuming a 15% holdback and the business actually needs 45%, there is nowhere to put the additional inventory when it lands at the DC.
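The arithmetic behind the holdback decision is simple enough to sketch. The function names below are hypothetical, the bands are keyed on door count only (product mix is ignored for brevity), and the handling of the overlapping boundaries at 10 and 50 doors is an assumption, since the ranges above do not say which band those fleets belong to.

```python
def holdback_range(doors):
    """Return the (low, high) holdback share for a fleet size.

    Bands follow the holdback ranges above. Sending 10 doors to the lower
    band and 50 doors to the upper band is an assumption made here.
    """
    if doors < 1:
        raise ValueError("doors must be at least 1")
    if doors <= 10:
        return (0.15, 0.25)
    if doors < 50:
        return (0.30, 0.40)
    return (0.40, 0.60)


def split_buy(total_units, holdback_share):
    """Split a buy into (upfront allocation, DC holdback) in whole units."""
    if not 0.0 <= holdback_share <= 1.0:
        raise ValueError("holdback_share must be between 0 and 1")
    held = round(total_units * holdback_share)
    return total_units - held, held


def dc_space_gap(total_units, planned_share, needed_share):
    """Extra units the DC must house if the receipt plan assumed too little holdback."""
    return split_buy(total_units, needed_share)[1] - split_buy(total_units, planned_share)[1]


low, high = holdback_range(60)
upfront, held = split_buy(10_000, (low + high) / 2)  # midpoint of the band
print(upfront, held)                      # 5000 5000
print(dc_space_gap(10_000, 0.15, 0.45))   # 3000
```

On a hypothetical 10,000-unit buy, a receipt calendar built for 15% holdback when the business needs 45% leaves 3,000 units with no planned DC space, which is why the decision belongs in the receipt plan.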

4. Demand signal hygiene: stable signals beat noisy ones

Modern planning stacks expose more inputs than any team can act on: POS sell-through, e-commerce viewed availability, back-in-stock email signups, return rates, promotional response curves, weather correlations, influencer-driven spike signals, competitive pricing. More inputs do not produce better decisions. The panel's working rule: use the smallest set of stable signals, and be explicit about which noisy signals are being deliberately excluded.

Stable signals — these should drive replenishment and size-curve revision decisions:

Full-price sell-through rate at the size level
Size-level stockout timing (specifically, how quickly core sizes went out)
Weeks of supply, recalculated weekly against trailing velocity
E-commerce viewed availability on out-of-stock (OOS) size selectors
Back-in-stock email signups, segmented by size

Noisy signals — useful context, dangerous as primary inputs:

Climate and weather (high variance, weak correlation at the style level)
Promotional response curves (confounds price and demand)
CRM return reason codes (useful only when the signal is large and categorical, e.g., a fit defect)
Social/influencer spike data (one-off, not a base-rate input)

The discipline is not to ignore the noisy signals — it is to refuse to let them override the stable ones. A climate-driven adjustment to a replenishment parameter can cancel out two weeks of clean size-level demand data. The goal is fewer, cleaner inputs acted on consistently, not a comprehensive dashboard acted on reactively.
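Of the stable signals, weeks of supply is the most mechanical to compute. A minimal sketch, assuming a four-week trailing window and a simple average as the velocity measure (both are illustrative choices, and `weeks_of_supply` is a hypothetical helper):

```python
def weeks_of_supply(on_hand, weekly_sales, trailing_weeks=4):
    """On-hand units divided by average weekly sales over the trailing window.

    weekly_sales: units sold per week, oldest first.
    Returns float('inf') when nothing sold in the window.
    """
    if trailing_weeks < 1:
        raise ValueError("trailing_weeks must be at least 1")
    recent = weekly_sales[-trailing_weeks:]
    if not recent:
        raise ValueError("weekly_sales is empty")
    velocity = sum(recent) / len(recent)
    if velocity == 0:
        return float("inf")
    return on_hand / velocity


print(weeks_of_supply(120, [10, 20, 30, 40, 50, 40]))  # 3.0
```

One caveat follows from the masking problem in section 2: weeks in which a size was out of stock record low or zero sales, so a trailing average that includes them understates true velocity and overstates weeks of supply. The figure is most trustworthy when computed at the size level over weeks the size was actually available.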

5. Shared ownership of AI-surfaced inefficiencies

When AI-assisted replenishment surfaces a new class of inefficiency — and it almost always does — the organizational question is: who owns fixing it? The instinct is to assign it to whichever team runs the tool. This fails predictably.

The reason is that AI does not create the inefficiency; it exposes one that already existed. And the root cause usually spans functions:

A size-level demand gap is partly a buying decision (the initial curve was wrong).
It is partly a planning decision (the receipt calendar held too little flexibility).
It is partly a supply chain decision (the replenishment parameters and lead times compounded the gap).
It is partly a merchandising decision (how the product was assorted across doors).

Assigning the fix to one function leaves the other three contributing causes in place, and the inefficiency recurs next season in a slightly different form.

The working model is shared ownership with a single convening cadence. Planning convenes a weekly size-level performance review; buying, merchandising, supply chain, and allocation all bring their levers to the same table; decisions are made against a shared three-KPI scorecard (sell-through, lost sales, stock coverage). The governance is lightweight; the KPI set is small; the meeting is short. What makes it work is that every function shows up with the authority to adjust its own parameters.

6. Test and learn — don't over-correct

When size-level masking is exposed and lost-sales estimates begin to appear, the temptation is to rebuild the size curve at scale. This is where businesses create new imbalances — they over-correct a real signal in one dimension and introduce a larger residual problem in another.

The field-tested pattern is narrow and iterative:

Isolate one parameter change. Do not simultaneously re-weight the size curve, reset replenishment triggers, and reallocate across stores. If the result is good, you will not know which lever produced it. If it is bad, you will not know which to revert.

Test on a cluster, not the full fleet. Pick a small set of doors or a single product cluster. Run the change there. Keep the rest of the fleet as a live control.

Track three KPIs. Sell-through rate, estimated lost sales (using stockout timing and viewed-availability deltas), and stock coverage. Three is enough; more is signal dilution.

Give the test four to six weeks. Shorter windows are dominated by noise; longer windows delay the next iteration past the selling window.

Iterate. The first change will be roughly right and partially wrong. The second change corrects the residual. Discipline here is the compounding input — each cycle improves the curve faster than a single large rebuild ever does.

The risk profile of over-correction is not symmetric with under-correction. Under-correcting is a small persistent tax. Over-correcting creates its own residual problem, and the correction of the correction takes another full selling window to unwind.
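The three-KPI comparison between a test cluster and its control can be sketched as follows. The `Reading` record and `scorecard` function are hypothetical, and the lost-sales estimate is a deliberately simple one: it extrapolates the in-stock daily velocity across the out-of-stock days. It uses stockout timing only; viewed-availability deltas are not modelled here.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One tracked item (for example, a core size in a door cluster) over the test window."""
    units_received: int
    units_sold: int
    units_on_hand: int
    days_in_stock: int      # days with stock available
    days_out_of_stock: int  # days at zero stock


def scorecard(r):
    """Compute sell-through, estimated lost units, and stock coverage in days."""
    daily_velocity = r.units_sold / r.days_in_stock if r.days_in_stock else 0.0
    return {
        "sell_through": r.units_sold / r.units_received if r.units_received else 0.0,
        # Assumption: demand on out-of-stock days matched the in-stock velocity.
        "est_lost_units": daily_velocity * r.days_out_of_stock,
        "coverage_days": r.units_on_hand / daily_velocity if daily_velocity else float("inf"),
    }


control = Reading(units_received=500, units_sold=500, units_on_hand=0,
                  days_in_stock=20, days_out_of_stock=8)
test = Reading(units_received=700, units_sold=630, units_on_hand=70,
               days_in_stock=28, days_out_of_stock=0)

for name, reading in (("control", control), ("test", test)):
    print(name, scorecard(reading))
```

With these made-up numbers, the control shows the higher sell-through (100% against 90%) but an estimated 200 lost units, while the test cluster sold 130 more units and lost none. Reading the three KPIs together is what keeps a high sell-through figure from being mistaken for the better outcome.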

How RetailNorthstar applies these principles

RetailNorthstar connects the buy plan, the size curve library, the allocation workflow, and the in-season replenishment engine in a single data model — so that size-level demand signals flow back to the size curves that inform the next buy, rather than being lost at the handoff.

Specifically:

Size curves are stored by category, channel, and demographic segment and recalculated from trailing size-level sell-through — not from production ratios or vendor defaults.
Receipt plans are built against a configurable holdback percentage that flexes by product profile (continuity vs. seasonal) and fleet size.
Size-level masking flags surface automatically when a style's product-level sell-through is healthy but day-2/3 core-size stockouts, fringe-size residual, or viewed-availability deltas indicate a hidden ceiling.
Replenishment triggers recompute weekly against trailing velocity and decay signals, and the allocator sees shared-ownership KPIs (sell-through, lost sales, stock coverage) in a single view.

The goal is not to automate away the planning judgment — it is to make the size-level reality visible early enough, and at a granular enough resolution, that the judgment can act on it while the selling window is still open.