Heat Pump Retrofit Pricing Model: Cost Driver Variables & Data Sources
Background
There was a prior plan to build our pricing intelligence by scraping unstructured cost data out of the RFPs flowing through our platform. That's not happening: the volume isn't there and won't be for some time. But I want to explain why I think that approach was actually the wrong tool even if we had the volume, and why an alternative approach is a better fit for this domain.
Most ML/AI pricing applications work by accumulating transactions, finding statistical correlations, and inferring patterns you didn't know in advance. That makes sense when the underlying system is opaque — you don't know why outcomes look the way they do, so you let the data tell you. But retrofit pricing for NYC multifamily buildings is not an opaque system. It's governed by two things that are fully legible from the outside: physics and regulatory code.
The National Electrical Code tells you exactly what service upgrade is triggered by a given panel condition. The physics of heat loss tells you exactly how many heat pumps a given apartment needs based on its floor area, envelope, and climate. These relationships don't need to be learned from data — they can be derived from first principles and encoded directly into a model. That's what my NEC analysis was: a systematic derivation of the causal structure of electrical cost drivers for plug-in heat pump installations.
The practical consequence for how you should think about data architecture: we are not trying to discover patterns we don't know. We already know the structure. We need targeted data to populate the inputs of that known structure. That's a fundamentally different and smaller data problem. Instead of needing thousands of completed projects to train on, we need a handful of well-chosen data sources that give us the specific observable conditions our model branches on.
Think of it this way: a physics-based model says "if the feeder breaker is below 50 amps, a compliance upgrade is required, at a cost of approximately $X per apartment." You don't need 10,000 buildings to know that — you need the code and a phone call to a few electricians to calibrate the cost. An ML model trying to learn the same thing from project data would need to observe the pattern statistically across hundreds of similar projects. We don't have those projects.
This document maps out a preliminary set of cost driver variables for a room heat pump pricing model: what each variable is, what conditions it can take, what data it needs, and where that data could come from. The list is no doubt both somewhat wrong and incomplete, but it indicates the nature and breadth of cost driver inputs for this type of scope, and I hope it is useful for thinking about a data architecture framework. Note also that the relative significance of these cost drivers is not addressed in this document. In practice, the range in some variables will be close to a rounding error, while the range in others could decide whether a project moves forward.
Summary for Potential Data Architecture
The key implications for what we're building:
The most important near-term infrastructure investment is the structured field tech form, a data entry interface that captures the right variables in structured fields (not free text) so that field observations flow directly into the cost model. Priority fields: feeder breaker rating (amps), conductor gauge, available panel slots, outlet condition by room type, existing sleeve presence by facade, service entrance rating.
Rental listing images are an underexplored automated source that TailorBird has been mining for a while. StreetEasy and similar platforms have listing photos for most NYC multifamily buildings. Some will show outlet conditions near windows, existing HVAC equipment, and apartment layouts. The question is whether we process these with computer vision to extract structured signals, or use them as human-reviewed context during pre-visit scoping.
Google/Bing Street View and satellite (if permitted by user agreements…) could be a standard automated spear-fishing first pass on the subset of addresses that are actively evaluating a retrofit — fire escape detection, facade condition, sleeve presence, street-level accessibility. Alternatively, if user agreements are still a barrier, I assume that there are other image data sources that we could buy. This is fast, cheap, and eliminates a class of site-visit questions before anyone physically goes to the building.
Conduit ($265/user/month plus an iPad Pro) gives us ACCA-certified Manual J load calculations and dimensioned 2D/3D floor plans from LiDAR scanning. It might be the right kind of tool for the heating load and floor plan data needs — window dimensions, room areas, load calcs. It does not touch electrical, which remains the harder data problem and requires the structured field form.
Bid data architecture: when we get to active bidding, the line items we present to contractors need to map to our variable structure. This has to be designed intentionally — if we just ask for a lump sum bid, we learn nothing about which variables are driving the cost.
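As a minimal sketch of what "structured fields, not free text" could look like, the apartment-level priority fields listed above might be captured in a typed record like the one below. The class names, field names, and enum values are hypothetical illustrations, not an existing schema; building-level fields (sleeve presence by facade, service entrance rating) would presumably live in a separate record.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class OutletCondition(Enum):
    """Hypothetical fixed choices for outlet condition (no free text)."""
    NONE = "none"
    DEDICATED_120V_SIMPLEX = "dedicated_120v_simplex"
    DEDICATED_120V_DUPLEX = "dedicated_120v_duplex"
    DEDICATED_240V = "dedicated_240v"


@dataclass
class ApartmentSurvey:
    """One apartment's structured field observations (illustrative only)."""
    unit_id: str
    feeder_breaker_amps: Optional[int] = None    # read off the physical breaker
    conductor_gauge_awg: Optional[int] = None    # AWG size of the feeder conductors
    available_panel_slots: Optional[int] = None
    outlet_by_room: Dict[str, OutletCondition] = field(default_factory=dict)

    def missing_fields(self) -> List[str]:
        """Names of priority fields the tech has not filled in yet."""
        required = ("feeder_breaker_amps", "conductor_gauge_awg", "available_panel_slots")
        missing = [name for name in required if getattr(self, name) is None]
        if not self.outlet_by_room:
            missing.append("outlet_by_room")
        return missing


survey = ApartmentSurvey(unit_id="3B", feeder_breaker_amps=40)
survey.outlet_by_room["bedroom_1"] = OutletCondition.DEDICATED_120V_DUPLEX
print(survey.missing_fields())
```

The point of the `missing_fields` check is that an incomplete form can be flagged while the tech is still on site, rather than discovered later when the cost model has no input for a branch.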
Potential Data Collection Phases
Here's how the variables in this document map onto a tiered collection process, which matters for product flow design:
Phase 0 — Remote, Pre-Commitment
What we can determine before an owner makes any financial commitment, using public data, owner questionnaire, and automated sources.
Covers: unit mix, building age/type/size, rough HP count, facade conditions (fire escapes, sleeves, window types) via Street View and satellite, existing heating system type from questionnaire, rough building service size estimate from building cohort. Rental listing photos as a first-pass signal on apartment interior conditions.
Output: a range estimate with identified high-stakes variables still to confirm, and a clear explanation of what the site visit resolves.
Phase 1 — Light Field Survey
A field tech spends one to two hours at the building with a structured data collection protocol.
Covers: iPad-based Conduit LiDAR scan of representative apartments (one of each unit type) for floor plans, window dimensions, and Manual J load calcs; structured form capturing panel photos, feeder breaker ratings, outlet conditions, available slots, and service entrance observations for a sample of apartments; exterior walkthrough documenting sleeve presence on all elevations.
Output: narrows the estimate materially; surfaces whether high-stakes variables (sub-50A feeders, service adequacy) are in play.
Phase 2 — Full Electrical Site Visit
Electrician visits all or a statistically significant sample of apartments. CT logger installed on service entrance for 220.87 analysis if service adequacy is uncertain.
Output: project-ready electrical scope for pricing and filing.
Phase 3 — Bid Data Collection
As contractor bids come in, we need to capture them at the line-item level that maps to our variable structure. "Electrical work" as a lump sum tells us nothing. We need: dedicated circuit additions, panel breaker upgrades, feeder upgrades, service upgrades — each as separate line items with unit costs and conditions.
Output: over time, calibrates the cost assumptions embedded in the model per variable state. This is where the statistical layer eventually gets built — but on top of the causal structure, not instead of it.
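To illustrate why line-item bids matter for Phase 3, here is a rough sketch of how bids captured per category could be rolled up into per-variable unit costs. The category keys, contractor names, and dollar figures are placeholders invented for the example, not real bid data.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class BidLineItem:
    contractor: str
    category: str      # hypothetical keys, e.g. "dedicated_circuit", "feeder_upgrade"
    quantity: int
    unit_cost: float   # dollars per unit of work


def unit_costs_by_category(items):
    """Group bid unit costs by line-item category and average them."""
    grouped = defaultdict(list)
    for item in items:
        grouped[item.category].append(item.unit_cost)
    return {category: mean(costs) for category, costs in grouped.items()}


# Placeholder numbers for illustration only.
bids = [
    BidLineItem("contractor_a", "dedicated_circuit", 40, 900.0),
    BidLineItem("contractor_b", "dedicated_circuit", 40, 1100.0),
    BidLineItem("contractor_a", "feeder_upgrade", 10, 5000.0),
]
print(unit_costs_by_category(bids))
```

A lump-sum "electrical work" bid could not be split this way, which is the practical reason the bid template has to mirror the variable structure.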
The model below is built around a reference building: post-war construction, 100 apartments (50 one-bedrooms, 40 two-bedrooms, 10 three-bedrooms), totaling approximately 260 heat pumps. At the base case — no significant new electrical work required — total scope lands at roughly $1.0M for window heat pumps (Gradient/Midea at ~$4K per unit installed) or $1.8M for packaged wall units (Innova at ~$7K per unit). The variables below explain why the real number could be materially different from either of those figures.
A few design principles built into this model that you should understand:
Contractor pricing reality. Contractors don't price the way our model is structured. They think in "person-days" and they adjust based on how busy they are. Our model may be more granular than how they actually estimate. That granularity is still worth maintaining on our side — it lets us validate their quotes, understand where risk lives, and calibrate our model against real bids over time. As bid data accumulates, we'll learn how to translate between our structured variables and how contractors package their estimates.
Lead time as a cost driver. Innova offers a unit price discount for owners who can tolerate longer equipment lead times. We call this out specifically because it illustrates a general principle: it is possible that other line items could have a timing/flexibility dimension, not just a fixed cost. Our data model should eventually capture/surface this to the extent that it's a lever that building owners can pull if their timeline allows.
Site visit as a conditional trigger, not a prerequisite. Not all inputs require someone to physically enter the building. The question — which matters a lot for the product flow — is what can be answered remotely (derisking an owner's initial commitment), what must be confirmed during a site visit, and what only becomes clear once contractors are actively bidding. The cost driver map below notes the timing tier for each variable.
Contract provisions for field surprises. Some conditions can only be sampled, not verified exhaustively across all apartments, before work begins. For these, the right approach is not to estimate with false precision — it's to establish contract structures with unit-cost provisions for exceptions. For example: "pricing assumes no new dedicated circuits are required; if field conditions require a new circuit, the per-circuit cost is $X." Our model should identify where these provisions are warranted and what the right exception unit costs are.
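The contract-provision idea can be expressed as simple arithmetic: a base price plus a unit-cost adder for each exception found in the field. The sketch below assumes hypothetical exception names and a placeholder per-circuit cost standing in for the "$X" above.

```python
def contract_price(base_price, exception_unit_costs, exceptions_found):
    """Base price plus unit-cost adders for each exception found in the field."""
    adders = sum(
        exception_unit_costs[name] * count
        for name, count in exceptions_found.items()
    )
    return base_price + adders


# Placeholder values: the real exception unit costs are what the model should supply.
price = contract_price(
    base_price=1_040_000,
    exception_unit_costs={"new_dedicated_circuit": 1_000},
    exceptions_found={"new_dedicated_circuit": 12},
)
print(price)
```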
Variable 1: Heat pump count
This is the primary volume driver for equipment and labor cost. We assume one heat pump per habitable room: each bedroom and the living room.
| Apartment type | Heat pumps |
| --- | --- |
| Studio | 1 |
| 1 Bedroom | 2 |
| 2 Bedroom | 3 |
| 3 Bedroom | 4 |
Data needed: Unit mix (count of each apartment type in the building).
Data sources:
Owner questionnaire — simplest and most reliable for a building they own
NYC HPD/DOB records — PLUTO dataset and related NYC open data include unit counts and sometimes bedroom breakdowns by building; useful for pre-questionnaire estimation and verification
Rental listing platforms (StreetEasy, Zillow, Apartments.com) — listings for a given building often include unit mix and sometimes unit layouts; scrapeable for initial estimation before owner contact
Timing: Phase 0 (remote, pre-commitment)
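As a quick check on the reference building figures, the heat pump count and base-case equipment cost can be reproduced from the unit mix and the per-apartment counts in this section. The dictionary keys and function name are illustrative; the per-unit installed costs are the approximate ~$4K and ~$7K figures quoted for the reference building.

```python
# Heat pumps per apartment type: one per bedroom plus one for the living room.
HEAT_PUMPS_PER_UNIT = {"studio": 1, "1br": 2, "2br": 3, "3br": 4}

# Approximate installed cost per heat pump, in dollars.
INSTALLED_COST = {"window": 4_000, "packaged_wall": 7_000}


def heat_pump_count(unit_mix):
    """Total heat pumps for a building given its count of each apartment type."""
    return sum(HEAT_PUMPS_PER_UNIT[unit_type] * n for unit_type, n in unit_mix.items())


# Reference building: 50 one-bedrooms, 40 two-bedrooms, 10 three-bedrooms.
reference_mix = {"1br": 50, "2br": 40, "3br": 10}
count = heat_pump_count(reference_mix)
base_case = {hp_type: count * cost for hp_type, cost in INSTALLED_COST.items()}

print(count)       # 260
print(base_case)   # {'window': 1040000, 'packaged_wall': 1820000}
```

These match the roughly $1.0M and $1.8M base-case totals, before any of the electrical or other variables are applied.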
Variable 2: Heat pump type — window vs. packaged wall unit
Window heat pumps (Gradient, Midea) are cheaper per unit (~$4K installed) and simpler to install. Packaged through-wall units (Innova) cost more (~$7K installed) and require either existing wall sleeves or new masonry penetrations. Some buildings will require a mix — for instance, if fire escapes block certain windows on certain facades, those rooms need packaged units even if window units are the desired choice for other conditions.
The owner's aesthetic and cost preference also plays a role here. My hypothesis is that layperson owners can't form a real preference until they can see what each option looks like in their specific building. A rendering tool showing typical interior and exterior appearances for each unit type, adapted to their building's specific context, would be one way of addressing this.
Observable conditions:
Fire escapes present on which facades and at which floors?
Existing through-wall sleeves on which facades?
Window dimensions sufficient to accommodate a packaged unit without new masonry penetration?
Data sources:
Google Street View — fire escape detection, facade conditions, street-level building access; automated first pass on every building
Aerial/satellite imagery — roof and upper-floor facade conditions; sleeve presence visible on some building types
Field tech exterior walkthrough with photo capture — structured photo capture of all facades noting fire escapes and sleeve presence by elevation and floor; this is irreplaceable for definitive determination
Floor plans — window dimensions; often not available for older NYC buildings (a key insight from the Conduit demo: existing blueprints have frequently disappeared)
Conduit LiDAR scan (iPad Pro-based, ~$265/user/month) — as a byproduct of scanning apartments for Manual J load calculations, Conduit captures window dimensions and locations; it doesn't specifically output a "sleeve present/not present" field but provides dimensioned floor plans that inform the determination
Timing: Phase 0 for rough cut (Street View, satellite); Phase 1 for confirmation (exterior walkthrough)
Variable 3: Installation labor cost per heat pump
Labor cost per unit varies with heat pump type, building accessibility, and building size. Window units (Gradient vs. Midea) may also have different labor costs due to relative ease of installation — this should be explicitly priced out for each unit type. For packaged units, whether existing sleeves are present determines whether new masonry holes need to be drilled, which is a significant additional cost that varies with wall construction type.
Observable conditions:
Building has elevator (Y/N)?
Staging and storage areas available (Y/N)?
For packaged units: existing sleeves present on relevant elevations?
If no sleeves: wall construction type (brick, CMU, frame) — affects drilling cost
Building height/stories
Data needed: Above conditions plus baseline labor rate quotes from relevant contractors for each scenario.
Data sources:
Owner questionnaire — elevator, building stories, basic building facts
Google Street View — street-level loading zone assessment, dumpster/staging area potential
Field tech exterior and interior walkthrough — sleeve presence on all elevations; interior staging area confirmation
Contractor interviews — unit labor rates by scenario; this is where the "phone call to four contractors" approach applies; periodic recalibration on a quarterly basis is more valuable than any backward-looking dataset
Timing: Phase 0 for building basics; Phase 1 for sleeve confirmation; ongoing contractor calls for rate calibration
Variable 4: Resident door access
This is a real cost driver that most contractors handle qualitatively ("getting a feel from the site visit") rather than quantitatively. I'd argue it should be handled contractually rather than estimated — pricing should be based on a defined access protocol (e.g., two scheduled knocks per apartment; if not available after two attempts, a revisit fee of $X applies). This converts a soft cost factor into a contractual provision, removing it from the estimation problem.
Data source: Contract structure, not data collection. No field data needed if handled this way.
Variable 5: Attic stock (spare units)
A percentage of heat pumps purchased upfront as on-site spares for future failures. Default assumption: approximately 5% of total unit count. May need calibration against available storage space.
Data needed: Building storage capacity.
Data source: Site visit or building photos of common areas and basement.
Timing: Phase 1
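A small sketch of the attic stock calculation, assuming the roughly 5% default and assuming (my choice, not stated above) that fractional units round up and that available storage caps the result:

```python
import math


def attic_stock_units(total_heat_pumps, spare_percent=5, storage_capacity=None):
    """Spare units to buy upfront, optionally capped by on-site storage space."""
    spares = math.ceil(total_heat_pumps * spare_percent / 100)  # round up (assumption)
    if storage_capacity is not None:
        spares = min(spares, storage_capacity)
    return spares


print(attic_stock_units(260))                       # 13
print(attic_stock_units(260, storage_capacity=8))   # 8
```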
Variable 6: Common area supplemental heating
After the central boiler is retired, basement-adjacent spaces and ground-floor lobbies may need supplemental heating. Residents on the first floor above an unheated or marginally heated basement are the most likely source of comfort complaints. Cost scales roughly with basement perimeter (for in-slab or perimeter heat) plus a per-zone cost for lobby mini-splits.
Data needed: Building footprint perimeter; number of common areas requiring supplemental heat.
Data sources:
Aerial/satellite imagery — building footprint dimensions for perimeter estimate
Floor plans or LiDAR scan — lobby dimensions
Timing: Phase 0 rough estimate from satellite; Phase 1 for confirmation
Variable 7: Apartment-level supplemental heating
Rare, but theoretically possible for large apartments with significant exposed exterior kitchen or bathroom areas where heat pump coverage may be insufficient. To date, completed projects have not required this. The right approach is to treat it as a known risk that can be provisioned contractually rather than estimated per project.
Data needed: Floor plan compactness; comparable building data from existing projects.
Timing: Phase 1 (Conduit LiDAR would surface this if extreme floor plan configurations exist)
Variable 8: Existing dedicated outlet
NYC code (NYCEC 210.52(J) and 210.11(C)(5)) requires a dedicated 15A circuit for each AC outlet in each covered room. Some apartments will already have dedicated outlets, because they had window ACs before or were wired to code at some point. If a dedicated outlet exists, no new circuit is required. But even existing dedicated outlets may need minor modification:
A 240V outlet (some older buildings wired this way for large AC units) needs to be rewired to 120V — relatively minor electrical work
A duplex outlet needs to be converted to simplex so that the heat pump – but no other appliance – can be plugged into it.
Observable conditions: Does a dedicated outlet exist near each window/HP location? Is it 120V or 240V? Is it duplex or simplex?
Data sources:
Rental listing interior photos — sometimes capture visible outlet conditions near windows; worth processing systematically for a first-pass signal; uncertain reliability
Structured field tech form during site visit — the reliable path; a tech visits a representative sample of apartments (say, one of each unit type) and records outlet conditions in a structured form, not free-text notes
Contract provisions handle the sampling risk: pricing assumes X condition based on inspection sample; unit-cost exceptions apply if field conditions differ
Timing: Phase 0 rough signal from listing photos; Phase 1 confirmation via structured site visit form
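The outlet logic in this section is a small decision rule, sketched below. The function name and the work-item strings are hypothetical labels; the branches follow the conditions described above (no dedicated outlet, 240V outlet, duplex outlet).

```python
def outlet_work(has_dedicated_outlet, voltage=None, is_duplex=None):
    """Return the list of electrical work items implied by one outlet's condition."""
    if not has_dedicated_outlet:
        # No dedicated outlet near the heat pump location: a new circuit is needed.
        return ["new_dedicated_circuit"]
    work = []
    if voltage == 240:
        work.append("rewire_240v_to_120v")
    if is_duplex:
        work.append("convert_duplex_to_simplex")
    return work


print(outlet_work(False))                               # ['new_dedicated_circuit']
print(outlet_work(True, voltage=120, is_duplex=True))   # ['convert_duplex_to_simplex']
print(outlet_work(True, voltage=120, is_duplex=False))  # []
```

Each returned work item would then map to a unit cost, and the "no work" case is the assumption a contract provision would be written around.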
Variable 9: New dedicated circuit required
If no dedicated outlet exists near the HP location, a new 15A branch circuit must be run from the apartment panel. This requires an electrician and an available breaker slot in the panel.
Data needed: Whether dedicated outlet is absent (flows from Variable 8 above); available panel slots (flows from Variable 11 below).
Data sources: Same as Variable 8 for outlet presence; Variable 11 for available panel slots.
Variable 10: Apartment feeder capacity
This is the single highest-stakes electrical variable at the apartment level. NYC code (NYCEC 215.2(A)(1)) requires a minimum 50A feeder to every dwelling unit. Buildings with sub-50A feeders have a pre-existing code violation that becomes a required remediation whenever permitted electrical work is done, and adding dedicated circuits for heat pumps counts as permitted work.
There are three possible conditions, each with a different cost consequence:
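| Condition | Cost consequence |
| --- | --- |
| Feeder breaker is 50A or higher | No feeder work required |
| Feeder breaker < 50A, but conductor is rated for 50A | Breaker can be upsized without rewiring; relatively low cost |
| Feeder breaker < 50A and conductor cannot support upsize | Full feeder/riser replacement required; significant cost |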
The distinction between the panel nameplate rating and the actual available capacity (determined by the feeder breaker or upstream overcurrent device) is critical and counterintuitive. A panel labeled "60A" can have a 40A feeder breaker — actual available capacity is 40A. This can only be confirmed by looking at the physical breaker, not the panel label.
Data needed: Feeder breaker rating (amps); wire gauge of the conductors feeding the apartment (to determine if a breaker upsize is feasible without rewiring).
Data sources:
This is the hardest variable to get remotely. It definitionally requires eyes on the panel. A structured field tech form capturing feeder breaker rating and conductor gauge is the path.
NYC DOB permit records — buildings that have had electrical work permitted in the last 10–20 years may have permit descriptions indicating feeder upgrades; potentially useful as a probabilistic pre-visit signal, though the data is inconsistent
ConEd — ConEd knows service capacity and, in some cases, feeder information; there may be a path through owner authorization to get this programmatically, but the track record of getting useful data from ConEd in a timely way is poor. Worth exploring but not a dependency.
Building age cohort modeling — as a rough prior, pre-war and early post-war buildings tend to start with lower service; a probabilistic model by building age, type, and unit count could provide a "likely scenario" estimate before the site visit
Timing: Phase 1 (electrical site visit) for definitive answer; Phase 0 probabilistic estimate from building age/type
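The feeder conditions above can be encoded as a direct rule, which is the sense in which this model is derived rather than learned. In this sketch the enum and function names are hypothetical, and whether the conductor can support 50A is taken as an input from the electrician rather than computed, since that depends on conductor material and insulation rating.

```python
from enum import Enum

MIN_FEEDER_AMPS = 50  # minimum dwelling-unit feeder cited in this section


class FeederAction(Enum):
    NONE = "no feeder work"
    UPSIZE_BREAKER = "upsize breaker, no rewiring"
    REPLACE_FEEDER = "full feeder/riser replacement"


def classify_feeder(feeder_breaker_amps, conductor_supports_50a):
    """Classify by the physical feeder breaker rating, not the panel nameplate."""
    if feeder_breaker_amps >= MIN_FEEDER_AMPS:
        return FeederAction.NONE
    if conductor_supports_50a:
        return FeederAction.UPSIZE_BREAKER
    return FeederAction.REPLACE_FEEDER


# A panel labeled "60A" behind a 40A feeder breaker is still a 40A case.
print(classify_feeder(40, conductor_supports_50a=True))
print(classify_feeder(40, conductor_supports_50a=False))
print(classify_feeder(60, conductor_supports_50a=True))
```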
Variable 11: Available panel slots
Even with adequate feeder capacity, a panel may have no unused breaker positions for new dedicated circuits. The resolution options are to use tandem (twin) breakers, if the panel manufacturer allows them, or to replace the panel entirely. Tandem breakers are a low-cost solution when permitted; panel replacement is more expensive.
Data needed: Count of used vs. available panel slots; panel manufacturer (determines tandem breaker compatibility).
Data sources:
Structured field tech form — panel photos with slot count; panel brand/model visible on the door label
Panel photos can potentially be analyzed with image recognition to auto-count slots, though this is a secondary optimization
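A simplified sketch of the panel slot decision follows. It assumes, for illustration only, that tandem compatibility is a single yes/no per panel; in practice the number of positions that accept tandem breakers varies by panel model, so a real rule would need that count as well.

```python
def panel_resolution(new_circuits_needed, open_slots, tandem_allowed):
    """Pick the lowest-cost way to fit the new dedicated circuits in a panel."""
    if new_circuits_needed <= open_slots:
        return "use_open_slots"
    if tandem_allowed:
        return "tandem_breakers"
    return "replace_panel"


print(panel_resolution(2, open_slots=3, tandem_allowed=False))  # use_open_slots
print(panel_resolution(2, open_slots=0, tandem_allowed=True))   # tandem_breakers
print(panel_resolution(2, open_slots=0, tandem_allowed=False))  # replace_panel
```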
Variable 12: Building service adequacy
Even if every apartment has a 50A feeder, the total electrical service entering the building may be inadequate for the added load of all heat pumps operating simultaneously. The new (post heat pump) needs of a building are evaluated using one of two NEC methodologies:
NEC 220.51 (nameplate approach): the calculation can be done relatively quickly from the number of new heat pumps added, but it is inherently very conservative and is likely to trigger the need for an electrical service upgrade.
NEC 220.87 (measured load approach): substitutes the actual measured existing building load (via a CT logger on the service entrance) for the theoretical nameplate calculation; produces a materially lower required service ampacity; for a 50-unit building, typically reduces required service from ~1,000A to ~650A, which an 800A service handles
The 220.87 approach is the preferred path because it reflects reality rather than a theoretical worst case, but it requires a data logger to be installed for 30 days during winter conditions, before any heat pumps are installed. A service upgrade is necessary if the new (post heat pump) service requirement exceeds the existing service. Note that it may not be possible to identify the existing service rating with a basic site visit: a specialized visit by an electrician who removes the cover of the main building service, a data request to Con Edison, or both may be required. If a building requires a service upgrade, an additional process with the utility is triggered that affects both project cost and schedule.
Data needed: Existing service size (from utility or electrician); CT logger reading (one-time field data collection event, typically January–February).
Data sources:
ConEd or owner utility records — existing service rating; this is typically knowable from the utility account
Field electrician inspection — confirms service rating and physical service entrance condition
CT logger installation (one logger per building, ~30 days of monitoring) — this is a targeted data collection event that produces the key input for the 220.87 calculation; the cost of the logger and monitoring is small relative to the value of avoiding an unnecessary service upgrade determination
Timing: Phase 1 for service rating; CT logger ideally installed in advance during a winter window before the project is formally scoped (which implies we should be thinking about this earlier in the pipeline than the formal site visit)
Note: NYC's amended version of 220.87 excludes buildings with solar PV or peak load shaving systems from the 30-day exception. For any building with rooftop solar, the full 12-month demand history is required.
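The core arithmetic of the NEC 220.87 measured-load method is that 125% of the measured maximum demand plus the new load must not exceed the service rating. The sketch below works in amps at the service voltage and ignores phase balancing and any NYC-specific amendments, so treat it as an illustration of the check rather than a filing-grade calculation. The input values are placeholders, not measurements from any building.

```python
NEC_220_87_DEMAND_FACTOR = 1.25  # existing maximum demand is taken at 125 percent


def required_service_amps(max_measured_demand_amps, new_load_amps):
    """125% of the measured maximum demand plus the added heat pump load."""
    return NEC_220_87_DEMAND_FACTOR * max_measured_demand_amps + new_load_amps


def service_upgrade_needed(max_measured_demand_amps, new_load_amps, existing_service_amps):
    """True when the 220.87 result exceeds the existing service rating."""
    required = required_service_amps(max_measured_demand_amps, new_load_amps)
    return required > existing_service_amps


# Placeholder inputs chosen only to illustrate the check.
required = required_service_amps(max_measured_demand_amps=300, new_load_amps=275)
print(required)                               # 650.0
print(service_upgrade_needed(300, 275, 800))  # False
```

The CT logger reading supplies `max_measured_demand_amps`, which is why the logger is the key data collection event for this variable.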
E. Mechanical, Electrical, and Plumbing (MEP) Engineer Filing
Variable 13: Engineering filing cost
NYC requires a licensed engineer to file permit documents with DOB for the electrical scope. Cost varies with the complexity of the scope — a clean building where all apartments pass the feeder check and no service upgrade is required is straightforward to file; a building requiring panel upgrades and a 220.87 service analysis is more involved.
The key leverage point here: the NEC analysis work we've already done essentially pre-packages the analytical deliverable for the filing engineer. If we can hand the filing engineer a building-specific version of our code analysis — showing which NEC methodologies apply, what the calculated service requirements are, and what the NYC-specific code provisions support — we meaningfully reduce their billable hours and improve the accuracy of the filing. This should be built into our product flow: our analysis output should be engineered to be the filing engineer's input.
In addition, Manual J calculations need to be performed to verify that heat pump output is appropriate for the needs of each room. Off-the-shelf packages like Conduit are designed to streamline collection of the required input data (apartment floor plans, etc.) and to automate these calculations, but the results will still have to be signed off by an MEP engineer. Finally, any through-wall masonry penetrations may also require some form of MEP engineer stamp.
Data needed: Full electrical scope (flows from Variables 9–12 above); building address and basic characteristics; floor plans, window areas and assumptions on existing wall/roof insulation levels.
Variable 14: Existing heater demolition and removal
Whether old heating equipment can be left in place or must be removed and disposed of. Varies significantly by system type — steam radiators (heavy, require pipe disconnection), fan coil units (easier), electric baseboard heaters (simplest) — and by building/tenant requirements.
Data needed: Existing heating system type; owner/lease requirements on heater removal.
Data sources:
Owner questionnaire — system type is typically known to the owner or super
Site visit photos — visible heating equipment in representative apartments
Variable 15: Roof insulation and air sealing
The project creates an opportunity to add insulation and air sealing to accessible roof cavities, reducing heat pump load and improving building performance. This is only relevant where the roof construction has an accessible cavity. Con Edison actually requires this measure for heat pumps.
Data needed: Roof/attic construction type; accessible cavity (Y/N); current insulation condition.
Data sources:
Building age and construction type — provides a probabilistic prior (flat roofs with accessible cavities are common in NYC post-war stock)
Site visit — definitive
Timing: Phase 1
Variable 16: New windows coordinated with heat pump installation
Where packaged through-wall heat pumps are specified, new window installation can be coordinated to share mobilization costs and potentially provide larger openings that eliminate the need for new masonry penetrations. Window dimensions also determine whether existing openings can accommodate an Innova unit without new drilling — a significant cost driver if masonry work is required.
Data needed: Existing window dimensions; owner interest in window replacement; building's window replacement eligibility under city programs.
Data sources:
Facade photos — window condition visible; general size estimable
Conduit LiDAR scan — captures window dimensions as part of the apartment scan; this is one of the more directly useful applications of Conduit for our specific needs, since window dimensions affect both Manual J calculations and HP type determination
Field measurement — definitive for packaged unit fit determination
Timing: Phase 0 rough (facade photos); Phase 1 definitive (LiDAR or field measurement)
Variable 17: Con Edison service entrance consolidation (Daisy Chain)
Some buildings or complexes have multiple Con Edison service entrances — a complexity factor for electrical work and potentially an opportunity for service consolidation that connects to Daisy Chain cost estimating. Number and configuration of service entrances affects the scope of any service-level electrical work.
Data needed: Number of Con Edison service entrances; their locations in the building.
Data sources:
Owner/super interview — typically known at the building level
Electrician site visit — confirms
ConEd account records — if accessible through owner authorization
Heat Pump Retrofit Pricing Model: Cost Driver Variables & Data Sources
Background #
There was a prior plan to build our pricing intelligence by scraping unstructured cost data out of the RFPs flowing through our platform. That's not happening — the volume isn't there and won't be for some time. But I want to explain why I think that approach was actually the wrong tool even if we had the volume, and why an alternative approach is a better fit for this domain.
Most ML/AI pricing applications work by accumulating transactions, finding statistical correlations, and inferring patterns you didn't know in advance. That makes sense when the underlying system is opaque — you don't know why outcomes look the way they do, so you let the data tell you. But retrofit pricing for NYC multifamily buildings is not an opaque system. It's governed by two things that are fully legible from the outside: physics and regulatory code.
The National Electrical Code tells you exactly what service upgrade is triggered by a given panel condition. The physics of heat loss tells you exactly how many heat pumps a given apartment needs based on its floor area, envelope, and climate. These relationships don't need to be learned from data — they can be derived from first principles and encoded directly into a model. That's what my NEC analysis was: a systematic derivation of the causal structure of electrical cost drivers for plug-in heat pump installations.
The practical consequence for how you should think about data architecture: we are not trying to discover patterns we don't know. We already know the structure. We need targeted data to populate the inputs of that known structure. That's a fundamentally different and smaller data problem. Instead of needing thousands of completed projects to train on, we need a handful of well-chosen data sources that give us the specific observable conditions our model branches on.
Think of it this way: a physics-based model says "if the feeder breaker is below 50 amps, a compliance upgrade is required, at a cost of approximately $X per apartment." You don't need 10,000 buildings to know that — you need the code and a phone call to a few electricians to calibrate the cost. An ML model trying to learn the same thing from project data would need to observe the pattern statistically across hundreds of similar projects. We don't have those projects.
This document is a guide to the full map of a preliminary attempt to define cost driver variables in a pricing model for room heat pumps — what each variable is, what conditions it can take, what data it needs, and where that data could come from. This list is no doubt both somewhat wrong and incomplete – but it is indicative of the nature and breadth of cost driver inputs for this type of scope that is hopefully useful to inform thinking around a data architecture framework. Also important to note that the relative significance of these cost drivers are not addressed in this document. In practice, the range in some variables will be closer to rounding errors while the range in other variables could be the difference between a project moving forward.
Summary for Potential Data Architecture #
The key implications for what we're building:
The most important near-term infrastructure investment is the structured field tech form — a data entry interface that captures the right variables in structured fields (not free text) so that field observations flow directly into the cost model. Priority fields: feeder breaker rating (amps), conductor gauge, available panel slots, outlet condition by room type, existing sleeve presence by facade, service entrance rating.
Rental listing images are an underexplored automated source that TailorBird has been mining for a while. StreetEasy and similar platforms have listing photos for most NYC multifamily buildings. Some will show outlet conditions near windows, existing HVAC equipment, and apartment layouts. The question is whether we process these with computer vision to extract structured signals, or use them as human-reviewed context during pre-visit scoping.
Google/Bing Street View and satellite imagery (if permitted by user agreements) could serve as a standard automated first pass, targeted at the subset of addresses that are actively evaluating a retrofit: fire escape detection, facade condition, sleeve presence, street-level accessibility. If user agreements remain a barrier, I assume there are other image data sources we could buy. This is fast, cheap, and eliminates a class of site-visit questions before anyone physically goes to the building.
Conduit ($265/user/month plus an iPad Pro) gives us ACCA-certified Manual J load calculations and dimensioned 2D/3D floor plans from LiDAR scanning. It might be the right kind of tool for the heating load and floor plan data needs — window dimensions, room areas, load calcs. It does not touch electrical, which remains the harder data problem and requires the structured field form.
Bid data architecture: when we get to active bidding, the line items we present to contractors need to map to our variable structure. This has to be designed intentionally — if we just ask for a lump sum bid, we learn nothing about which variables are driving the cost.
Potential Data Collection Phases #
Here's how the variables in the cost driver map correspond to a tiered collection process, which matters for product flow design:
Phase 0: Remote, Pre-Commitment
What we can determine before an owner makes any financial commitment, using public data, an owner questionnaire, and automated sources.
Covers: unit mix, building age/type/size, rough HP count, facade conditions (fire escapes, sleeves, window types) via Street View and satellite, existing heating system type from questionnaire, rough building service size estimate from building cohort. Rental listing photos as a first-pass signal on apartment interior conditions.
Output: a range estimate with identified high-stakes variables still to confirm, and a clear explanation of what the site visit resolves.
Phase 1: Light Field Survey
A field tech spends one to two hours at the building with a structured data collection protocol.
Covers: iPad-based Conduit LiDAR scan of representative apartments (one of each unit type) for floor plans, window dimensions, and Manual J load calcs; structured form capturing panel photos, feeder breaker ratings, outlet conditions, available slots, and service entrance observations for a sample of apartments; exterior walkthrough documenting sleeve presence on all elevations.
Output: narrows the estimate materially; surfaces whether high-stakes variables (sub-50A feeders, service adequacy) are in play.
Phase 2: Full Electrical Site Visit
An electrician visits all apartments or a statistically significant sample. If service adequacy is uncertain, a CT logger is installed on the service entrance for the 220.87 analysis.
Output: project-ready electrical scope for pricing and filing.
Phase 3: Bid Data Collection
As contractor bids come in, we need to capture them at a line-item level that maps to our variable structure. "Electrical work" as a lump sum tells us nothing. We need dedicated circuit additions, panel breaker upgrades, feeder upgrades, and service upgrades, each as a separate line item with unit costs and conditions.
Output: over time, calibrates the cost assumptions embedded in the model per variable state. This is where the statistical layer eventually gets built — but on top of the causal structure, not instead of it.
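To make the line-item idea concrete, here is a minimal Python sketch of a bid record keyed to the four electrical categories named above, plus a simple per-category calibration (median unit cost across bids). The identifiers, the use of a median, and all dollar figures are illustrative assumptions, not real bid data.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import median
from typing import Dict, Iterable

# The four line-item categories named in the text; identifiers are illustrative.
CATEGORIES = (
    "dedicated_circuit",
    "panel_breaker_upgrade",
    "feeder_upgrade",
    "service_upgrade",
)


@dataclass(frozen=True)
class BidLineItem:
    contractor: str
    category: str
    quantity: int
    unit_cost: float
    condition: str = ""  # the variable state the price assumes


def median_unit_cost(items: Iterable[BidLineItem]) -> Dict[str, float]:
    """Median unit cost per category across all submitted bids."""
    by_category = defaultdict(list)
    for item in items:
        if item.category not in CATEGORIES:
            raise ValueError(f"unmapped line item: {item.category}")
        by_category[item.category].append(item.unit_cost)
    return {cat: median(costs) for cat, costs in by_category.items()}


# Placeholder numbers only, to show the shape of the data.
bids = [
    BidLineItem("Contractor A", "dedicated_circuit", 40, 900.0),
    BidLineItem("Contractor B", "dedicated_circuit", 40, 1100.0),
    BidLineItem("Contractor A", "feeder_upgrade", 12, 3000.0),
]
print(median_unit_cost(bids))
```

Rejecting unmapped categories at intake is the point of the design: a lump-sum "electrical work" line fails validation instead of silently entering the dataset.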
Base Scenario and Key Design Notes #
The model below is built around a reference building: post-war construction, 100 apartments (50 one-bedrooms, 40 two-bedrooms, 10 three-bedrooms), totaling approximately 260 heat pumps. At the base case — no significant new electrical work required — total scope lands at roughly $1.0M for window heat pumps (Gradient/Midea at ~$4K per unit installed) or $1.8M for packaged wall units (Innova at ~$7K per unit). The variables below explain why the real number could be materially different from either of those figures.
A few design principles built into this model that you should understand:
Contractor pricing reality. Contractors don't price the way our model is structured. They think in "person-days" and they adjust based on how busy they are. Our model may be more granular than how they actually estimate. That granularity is still worth maintaining on our side — it lets us validate their quotes, understand where risk lives, and calibrate our model against real bids over time. As bid data accumulates, we'll learn how to translate between our structured variables and how contractors package their estimates.
Lead time as a cost driver. Innova offers a unit price discount for owners who can tolerate longer equipment lead times. We call this out specifically because it illustrates a general principle: other line items may also have a timing or flexibility dimension rather than a single fixed cost. Our data model should eventually capture and surface this wherever it is a lever that building owners can pull if their timeline allows.
Site visit as a conditional trigger, not a prerequisite. Not all inputs require someone to physically enter the building. The question — which matters a lot for the product flow — is what can be answered remotely (derisking an owner's initial commitment), what must be confirmed during a site visit, and what only becomes clear once contractors are actively bidding. The cost driver map below notes the timing tier for each variable.
Contract provisions for field surprises. Some conditions can only be sampled, not verified exhaustively across all apartments, before work begins. For these, the right approach is not to estimate with false precision — it's to establish contract structures with unit-cost provisions for exceptions. For example: "pricing assumes no new dedicated circuits are required; if field conditions require a new circuit, the per-circuit cost is $X." Our model should identify where these provisions are warranted and what the right exception unit costs are.
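The unit-cost provision can be sketched as a small calculation: a base price that assumes no exceptions, plus a contractually fixed unit cost for each exception actually found in the field. This is a Python sketch under stated assumptions; the function name, the exception key, and the dollar amounts are hypothetical placeholders (the source leaves the per-circuit cost as $X).

```python
from typing import Dict


def contract_total(
    base_price: float,
    exception_unit_costs: Dict[str, float],
    exceptions_found: Dict[str, int],
) -> float:
    """Base price plus contractually agreed unit costs for field exceptions."""
    extra = 0.0
    for name, count in exceptions_found.items():
        if name not in exception_unit_costs:
            # A surprise with no provision is a contract gap, not a pricing input.
            raise KeyError(f"no contract provision for exception: {name}")
        extra += exception_unit_costs[name] * count
    return base_price + extra


# Placeholder figures only.
provisions = {"new_dedicated_circuit": 1200.0}
print(contract_total(1_000_000.0, provisions, {"new_dedicated_circuit": 10}))
```

The estimate stays honest about what it assumes, and the cost of being wrong is bounded per occurrence rather than hidden in contingency.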
Room Heat Pump Cost Driver Map #
A. Source of Heat #
Variable 1: Heat pump count
This is the primary volume driver for equipment and labor cost. We assume one heat pump per habitable room — each bedroom and the living room.
Data needed: Unit mix (count of each apartment type in the building).
Data sources:
Timing: Phase 0 (remote, pre-commitment)
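The count rule is simple enough to write down directly. The Python sketch below applies it to the reference building's unit mix and reproduces the base-case totals using the approximate installed prices quoted in this document. The function and variable names are illustrative, and studio apartments are not handled because the source does not say how to count them.

```python
from typing import Dict

# Bedrooms per apartment type. The rule from the text: one heat pump per
# bedroom plus one for the living room.
BEDROOMS = {"1BR": 1, "2BR": 2, "3BR": 3}


def heat_pump_count(unit_mix: Dict[str, int]) -> int:
    """Total heat pumps for a building, given the count of each apartment type."""
    return sum(n * (BEDROOMS[apt_type] + 1) for apt_type, n in unit_mix.items())


reference_mix = {"1BR": 50, "2BR": 40, "3BR": 10}
count = heat_pump_count(reference_mix)

print(count)           # 260
print(count * 4_000)   # 1040000 (window units at ~$4K installed)
print(count * 7_000)   # 1820000 (packaged wall units at ~$7K installed)
```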
Variable 2: Heat pump type — window vs. packaged wall unit
Window heat pumps (Gradient, Midea) are cheaper per unit (~$4K installed) and simpler to install. Packaged through-wall units (Innova) cost more (~$7K installed) and require either existing wall sleeves or new masonry penetrations. Some buildings will require a mix — for instance, if fire escapes block certain windows on certain facades, those rooms need packaged units even if window units are the desired choice for other conditions.
The owner's aesthetic and cost preference also plays a role here. My hypothesis is that layperson owners can't form a real preference until they can see what each option looks like in their specific building. A rendering tool showing typical interior and exterior appearances for each unit type, adapted to their building's specific context, would be one way of addressing this.
Observable conditions:
Data sources:
Timing: Phase 0 for rough cut (Street View, satellite); Phase 1 for confirmation (exterior walkthrough)
Variable 3: Installation labor cost per heat pump
Labor cost per unit varies with heat pump type, building accessibility, and building size. Window units (Gradient vs. Midea) may also have different labor costs due to relative ease of installation — this should be explicitly priced out for each unit type. For packaged units, whether existing sleeves are present determines whether new masonry holes need to be drilled, which is a significant additional cost that varies with wall construction type.
Observable conditions:
Data needed: Above conditions plus baseline labor rate quotes from relevant contractors for each scenario.
Data sources:
Timing: Phase 0 for building basics; Phase 1 for sleeve confirmation; ongoing contractor calls for rate calibration
Variable 4: Resident door access
This is a real cost driver that most contractors handle qualitatively ("getting a feel from the site visit") rather than quantitatively. I'd argue it should be handled contractually rather than estimated — pricing should be based on a defined access protocol (e.g., two scheduled knocks per apartment; if not available after two attempts, a revisit fee of $X applies). This converts a soft cost factor into a contractual provision, removing it from the estimation problem.
Data source: Contract structure, not data collection. No field data needed if handled this way.
Variable 5: Attic stock (spare units)
A percentage of heat pumps purchased upfront as on-site spares for future failures. Default assumption: approximately 5% of total unit count. May need calibration against available storage space.
Data needed: Building storage capacity.
Data source: Site visit or building photos of common areas and basement.
Timing: Phase 1
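A minimal Python sketch of the attic stock default. Rounding up to a whole unit and capping the result at available storage are my assumptions about how the "calibration against storage space" might work, not something the source specifies; the function name is illustrative.

```python
from typing import Optional


def attic_stock_units(
    total_heat_pumps: int,
    spare_percent: int = 5,
    storage_capacity_units: Optional[int] = None,
) -> int:
    """Spare units to buy upfront, optionally capped by on-site storage."""
    # Integer ceiling division avoids floating-point surprises.
    spares = -(-total_heat_pumps * spare_percent // 100)
    if storage_capacity_units is not None:
        spares = min(spares, storage_capacity_units)
    return spares


print(attic_stock_units(260))                            # 13
print(attic_stock_units(260, storage_capacity_units=8))  # 8
```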
Variable 6: Common area supplemental heating
After the central boiler is retired, basement-adjacent spaces and ground-floor lobbies may need supplemental heating. Residents on the first floor above an unheated or marginally heated basement are the most likely source of comfort complaints. Cost scales roughly with basement perimeter (for in-slab or perimeter heat) plus a per-zone cost for lobby mini-splits.
Data needed: Building footprint perimeter; number of common areas requiring supplemental heat.
Data sources:
Timing: Phase 0 rough estimate from satellite; Phase 1 for confirmation
Variable 7: Apartment-level supplemental heating
Rare, but theoretically possible for large apartments with significant exposed exterior kitchen or bathroom areas where heat pump coverage may be insufficient. To date, completed projects have not required this. The right approach is to treat it as a known risk that can be provisioned contractually rather than estimated per project.
Data needed: Floor plan compactness; comparable building data from existing projects.
Timing: Phase 1 (Conduit LiDAR would surface this if extreme floor plan configurations exist)
B. Source of Power — Room Level #
Variable 8: Existing dedicated outlet condition
NYC code (NYCEC 210.52(J) and 210.11(C)(5)) requires a dedicated 15A circuit for each AC outlet in each covered room. Some apartments will already have dedicated outlets — because they had window ACs before, or were wired to code at some point. If a dedicated outlet exists, no new circuit is required. But even existing dedicated outlets may need minor modification:
Observable conditions: Does a dedicated outlet exist near each window/HP location? Is it 120V or 240V? Is it duplex or simplex?
Data sources:
Timing: Phase 0 rough signal from listing photos; Phase 1 confirmation via structured site visit form
Variable 9: New dedicated circuit required
If no dedicated outlet exists near the HP location, a new 15A branch circuit must be run from the apartment panel. This requires an electrician and an available breaker slot in the panel.
Data needed: Whether dedicated outlet is absent (flows from Variable 8 above); available panel slots (flows from Variable 10 below).
Data sources: Same as Variable 8 for outlet presence; Variable 10 for panel capacity.
Timing: Phase 1
C. Source of Power — Apartment Level #
Variable 10: Apartment feeder capacity
This is the single highest-stakes electrical variable at the apartment level. NYC code (NYCEC 215.2(A)(1)) requires a minimum 50A feeder to every dwelling unit. Buildings with sub-50A feeders have a pre-existing code violation that becomes a required remediation whenever permitted electrical work is done — and adding dedicated circuits for heat pumps counts as permitted work.
There are three possible conditions, each with a different cost consequence:
The distinction between the panel nameplate rating and the actual available capacity (determined by the feeder breaker or upstream overcurrent device) is critical and counterintuitive. A panel labeled "60A" can have a 40A feeder breaker — actual available capacity is 40A. This can only be confirmed by looking at the physical breaker, not the panel label.
Data needed: Feeder breaker rating (amps); wire gauge of the conductors feeding the apartment (to determine if a breaker upsize is feasible without rewiring).
Data sources:
Timing: Phase 1 (electrical site visit) for definitive answer; Phase 0 probabilistic estimate from building age/type
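The nameplate-versus-breaker point can be expressed as a small check. This Python sketch treats the lower of the two ratings as the effective capacity, which is my simplifying assumption; the text only says that the feeder breaker or upstream overcurrent device governs. The 50A threshold is the minimum cited above, and the function names are illustrative.

```python
MIN_FEEDER_AMPS = 50  # minimum dwelling-unit feeder cited in the text


def effective_capacity_amps(panel_nameplate_amps: int, feeder_breaker_amps: int) -> int:
    """Available capacity is limited by the upstream breaker, not the panel label."""
    return min(panel_nameplate_amps, feeder_breaker_amps)


def feeder_needs_remediation(panel_nameplate_amps: int, feeder_breaker_amps: int) -> bool:
    """True when the effective feeder capacity is below the 50A minimum."""
    return effective_capacity_amps(panel_nameplate_amps, feeder_breaker_amps) < MIN_FEEDER_AMPS


# The example from the text: a panel labeled 60A behind a 40A feeder breaker.
print(effective_capacity_amps(60, 40))   # 40
print(feeder_needs_remediation(60, 40))  # True
```

Whether remediation means a breaker swap or new conductors depends on the wire gauge, which this sketch deliberately leaves out.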
Variable 11: Apartment panel breaker slot availability
Even with adequate feeder capacity, a panel may have no unused breaker positions for new dedicated circuits. The resolution options are to use tandem (twin) breakers, which fit two circuits into one slot, if the panel manufacturer allows them, or to replace the panel entirely. Tandem breakers are a low-cost solution when permitted; panel replacement is more expensive.
Data needed: Count of used vs. available panel slots; panel manufacturer (determines tandem breaker compatibility).
Data sources:
Timing: Phase 1
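The slot logic above amounts to a three-way branch, sketched here in Python. The function name and the returned labels are hypothetical, and the sketch assumes that tandem breakers, when permitted, can always free enough positions, which a real panel may not guarantee.

```python
def slot_resolution(available_slots: int, circuits_needed: int, tandem_allowed: bool) -> str:
    """Pick the lowest-cost way to land the new dedicated circuits in the panel."""
    if circuits_needed <= available_slots:
        return "use_open_slots"
    if tandem_allowed:
        # Simplification: assumes tandems can free enough positions.
        return "tandem_breakers"
    return "panel_replacement"


print(slot_resolution(available_slots=3, circuits_needed=2, tandem_allowed=False))
print(slot_resolution(available_slots=0, circuits_needed=2, tandem_allowed=True))
print(slot_resolution(available_slots=0, circuits_needed=2, tandem_allowed=False))
```

The two inputs map directly to the data needed for this variable: the slot count and the panel manufacturer's tandem compatibility.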
D. Source of Power — Building Service #
Variable 12: Building electrical service adequacy
Even if every apartment has a 50A feeder, the total electrical service entering the building may be inadequate for the added load of all heat pumps operating simultaneously. The building's new (post heat pump) needs are evaluated using one of two NEC methodologies:
The 220.87 approach is the preferred path because it reflects reality rather than a theoretical worst case, but it requires a data logger to be installed for 30 days during winter conditions, before any heat pumps are installed. A service upgrade is necessary if the new (post heat pump) service requirement exceeds the existing available service. Note that a basic site visit may not be enough to identify the existing service; it may take a specialized visit by an electrician who removes the cover of the main building service, a data request to Con Edison, or both. If a building does require a service upgrade, that triggers an additional process with the utility that affects both project cost and schedule.
Observable conditions: Existing service rating (amps); measured winter baseline load (from CT logger).
Data needed: Existing service size (from utility or electrician); CT logger reading (one-time field data collection event, typically January–February).
Data sources:
Timing: Phase 1 for service rating; CT logger ideally installed in advance during a winter window before the project is formally scoped (which implies we should be thinking about this earlier in the pipeline than the formal site visit)
Note: NYC's amended version of 220.87 excludes buildings with solar PV or peak load shaving systems from the 30-day exception. For any building with rooftop solar, the full 12-month demand history is required.
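The 220.87 comparison reduces to a short calculation. The Python sketch below follows the general NEC 220.87 form (125% of the measured maximum demand, plus the new load, compared against the service rating). NYC's amended text should be checked before relying on it, and all loads are assumed to be in amps on one consistent basis. The function names, return labels, and example figures are hypothetical.

```python
def required_service_amps(peak_demand_amps: float, new_load_amps: float) -> float:
    """125% of the measured maximum demand plus the added heat pump load."""
    return 1.25 * peak_demand_amps + new_load_amps


def service_upgrade_needed(
    peak_demand_amps: float, new_load_amps: float, service_rating_amps: float
) -> bool:
    """True when the post heat pump requirement exceeds the existing service."""
    return required_service_amps(peak_demand_amps, new_load_amps) > service_rating_amps


def demand_history_required(has_solar_pv_or_peak_shaving: bool) -> str:
    """Per the note above, solar PV or peak shaving rules out the 30-day path."""
    return "12_month_history" if has_solar_pv_or_peak_shaving else "30_day_recording"


# Placeholder figures only.
print(required_service_amps(400, 250))        # 750.0
print(service_upgrade_needed(400, 250, 800))  # False
print(service_upgrade_needed(500, 250, 800))  # True
```

The only field-collected input is the peak demand from the CT logger, which is why the timing of that logger installation matters so much to the pipeline.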
E. Mechanical Electrical Plumbing (MEP) Engineer Filing #
Variable 13: Engineering filing cost
NYC requires a licensed engineer to file permit documents with DOB for the electrical scope. Cost varies with the complexity of the scope — a clean building where all apartments pass the feeder check and no service upgrade is required is straightforward to file; a building requiring panel upgrades and a 220.87 service analysis is more involved.
The key leverage point here: the NEC analysis work we've already done essentially pre-packages the analytical deliverable for the filing engineer. If we can hand the filing engineer a building-specific version of our code analysis — showing which NEC methodologies apply, what the calculated service requirements are, and what the NYC-specific code provisions support — we meaningfully reduce their billable hours and improve the accuracy of the filing. This should be built into our product flow: our analysis output should be engineered to be the filing engineer's input.
In addition, Manual J calculations need to be performed to verify that heat pump output is appropriate for the needs of each room. Off-the-shelf packages like Conduit are designed to streamline collection of the required input data (apartment floor plans, etc.) and to automate these calculations, but the results will still have to be signed off by an MEP engineer. Finally, any through-wall masonry penetrations may also require some sort of MEP stamp.
Data needed: Full electrical scope (flows from Variables 9–12 above); building address and basic characteristics; floor plans, window areas and assumptions on existing wall/roof insulation levels.
Timing: Phase 1 scope determination feeds Phase 2 filing
F. Complementary Work Scopes #
Variable 14: Existing heater demolition and removal
Whether old heating equipment can be left in place or must be removed and disposed of. Varies significantly by system type — steam radiators (heavy, require pipe disconnection), fan coil units (easier), electric baseboard heaters (simplest) — and by building/tenant requirements.
Data needed: Existing heating system type; owner/lease requirements on heater removal.
Data sources:
Timing: Phase 0 (questionnaire); Phase 1 (confirmation)
Variable 15: Attic/roof insulation
Opportunity to add insulation and air sealing to accessible roof cavities at time of project, reducing heat pump load and improving building performance. Only relevant where the roof construction has an accessible cavity. Con Edison actually requires this measure for heat pumps.
Data needed: Roof/attic construction type; accessible cavity (Y/N); current insulation condition.
Data sources:
Timing: Phase 1
Variable 16: New windows coordinated with heat pump installation
Where packaged through-wall heat pumps are specified, new window installation can be coordinated to share mobilization costs and potentially provide larger openings that eliminate the need for new masonry penetrations. Window dimensions also determine whether existing openings can accommodate an Innova unit without new drilling — a significant cost driver if masonry work is required.
Data needed: Existing window dimensions; owner interest in window replacement; building's window replacement eligibility under city programs.
Data sources:
Timing: Phase 0 rough (facade photos); Phase 1 definitive (LiDAR or field measurement)
Variable 17: Con Edison service entrance consolidation (Daisy Chain)
Some buildings or complexes have multiple Con Edison service entrances. This is a complexity factor for electrical work and potentially an opportunity for service consolidation that connects to Daisy Chain cost estimating. The number and configuration of service entrances affect the scope of any service-level electrical work.
Data needed: Number of Con Edison service entrances; their locations in the building.
Data sources:
Timing: Phase 1