Data cleansing before migration: what to fix in the abstract layer first
The fastest way to spread a lease data problem is to move it into a new system. The old errors do not vanish during the move. They live on in the new system. Now they are harder to spot. They look more trustworthy just because they survived the move.
Data cleansing before the move stops this. The goal is not a perfect abstract. (An abstract is the short summary of a lease's key terms.) A perfect abstract would mean re-reading every lease against the source documents. That may not be possible. The real goal is simpler. Find out which problems exist. Fix the ones you can fix before the move. Write down the ones you cannot fix. Then your team knows what they are getting.
The three kinds of cleansing work
Cleansing work before a move falls into three kinds. There is format cleanup. There is consistency checking. There is fixing wrong values.
Format cleanup fixes things that block a clean import. It does not matter if the value is right or wrong. Dates in the wrong format are one example. So are percent signs versus decimals. So are dollar amounts with or without a dollar sign. So is text that runs past a field limit. You can often fix these with scripts. You do not need the source documents. The content is fine. Only the format needs to change.
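Here is a minimal sketch of what such a script might look like. It is an illustration, not a finished tool. The date layouts (US month-first) and the function names are assumptions. So is the rule that a bare number above 1 goes to a person instead of being guessed at.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation

# Assumed date layouts in the old export (US month-first); adjust to your data.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%m-%d-%Y")


def normalize_date(raw):
    """Return an ISO date string, or None if no assumed layout matches."""
    text = raw.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def normalize_percent(raw):
    """Return a decimal fraction, or None if the value needs a person."""
    text = raw.strip()
    has_sign = text.endswith("%")
    try:
        number = Decimal(text.rstrip("%").strip())
    except InvalidOperation:
        return None
    if has_sign:
        return number / Decimal(100)
    # A bare number above 1 could be a percent or a typo, so do not guess.
    return number if number <= 1 else None


def normalize_money(raw):
    """Strip the dollar sign and thousands separators from an amount."""
    try:
        return Decimal(raw.strip().replace("$", "").replace(",", ""))
    except InvalidOperation:
        return None


def exceeds_limit(text, limit):
    """Report text that would be cut off by the target field length."""
    return len(text) > limit
```

Each function returns None when it cannot decide. That keeps the script from turning a format problem into a wrong value.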
Consistency checking looks at whether related fields agree. Say the end date does not equal the start date plus the stated term. That points to an error. Say the pro rata share percent does not match the numerator over the denominator. (Pro rata share is the slice of building costs a tenant pays. The numerator is the tenant's space. The denominator is the total space.) Then one of the three numbers is wrong. Say a rent step does not follow the stated rent formula. That points to a typo or a missed amendment. These checks usually need a person to find the right value.
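The first two checks can be sketched in a few lines. This is an illustration under stated assumptions. The function names are hypothetical. The end-date convention is also an assumption: the term is taken to end the day before the anniversary, so a 60-month term starting 2020-01-01 ends 2024-12-31. Your leases may count differently. The tolerance on the share is a made-up rounding allowance.

```python
import calendar
from datetime import date, timedelta


def add_months(start, months):
    """Shift a date forward by whole months, clamping to the month's last day."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def expected_end(start, term_months):
    """Assumed convention: the term ends the day before the anniversary."""
    return add_months(start, term_months) - timedelta(days=1)


def term_agrees(start, end, term_months):
    """True when the recorded end date matches start plus stated term."""
    return end == expected_end(start, term_months)


def share_agrees(share, numerator, denominator, tolerance=0.0001):
    """True when the recorded share matches numerator over denominator."""
    if denominator <= 0:
        return False  # a zero or negative denominator is itself an error
    return abs(share - numerator / denominator) <= tolerance
```

A failed check only says the fields disagree. It does not say which field is wrong. That still takes a person.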
Fixing wrong values handles fields that are just plain wrong. A wrong start date is one. An operating expense rule that copies the base lease, not the controlling rider, is another. A CAM cap percent with no carve-out list is a third. (CAM is common area maintenance, the shared costs of a property. A CAM cap limits how much those costs can rise. A carve-out is a cost left out of that cap.) You cannot fix these without the source documents. You may need to choose which leases to review first.
What to cleanse first
Not all cleansing work matters the same. Rank the work by two things. First, how much the field affects billing and daily work. Second, whether the new system requires the field.
Start with required fields. The new system will not import a blank required field. Find every required field in the new system. Check your old data for blanks in those fields. Fix the blanks before the move. Pull the value from the source documents. Or use a set placeholder and flag it for follow-up.
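A blank scan like this is easy to script. The sketch below assumes each record is a plain dictionary with a `lease_id` key. The field names and the placeholder text are hypothetical. Use whatever your new system accepts.

```python
# Hypothetical placeholder; pick a value the new system will accept.
PLACEHOLDER = "UNKNOWN-PENDING-REVIEW"


def is_blank(value):
    """Treat None, empty strings, and whitespace-only strings as blank."""
    return value is None or str(value).strip() == ""


def find_blank_required(records, required_fields):
    """Return (lease_id, field) pairs for every blank required field."""
    return [
        (record.get("lease_id"), field)
        for record in records
        for field in required_fields
        if is_blank(record.get(field))
    ]


def fill_with_placeholder(record, field, follow_ups):
    """Set the placeholder and log the field for follow-up."""
    record[field] = PLACEHOLDER
    follow_ups.append((record.get("lease_id"), field))
```

The follow-up list matters as much as the placeholder. Without it, the placeholder is just another value no one checks.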
Next come high-impact fields. These drive billing math, deadlines, and CAM review. Cleanse them with the most care. They include pro rata share and denominator logic. They include base year and gross-up rules. (A base year sets a starting cost level for later comparison. A gross-up adjusts shared costs as if the building were full.) They include operating expense rules and exclusion lists. They include audit-rights terms. (Audit rights let a tenant review the landlord's charges.) Errors here flow into every reconciliation period until you fix them. (A reconciliation is the yearly true-up of estimated charges against real costs.)
Low-impact fields come last. Contact names and notice addresses fit here. You can often check and fix these after the move with no money at risk.
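The ranking above can be sketched as a simple sort. The field names in both sets are hypothetical. Swap in the fields your own systems use.

```python
# Hypothetical field names; replace with the fields in your own systems.
REQUIRED_FIELDS = {"commencement_date", "expiration_date"}
HIGH_IMPACT_FIELDS = {
    "pro_rata_share", "denominator", "base_year",
    "gross_up", "opex_exclusions", "audit_rights",
}


def priority(field):
    """0 for required fields, 1 for high-impact fields, 2 for the rest."""
    if field in REQUIRED_FIELDS:
        return 0
    if field in HIGH_IMPACT_FIELDS:
        return 1
    return 2


def rank_issues(issues):
    """Sort (lease_id, field) issues so the most urgent come first."""
    return sorted(issues, key=lambda issue: priority(issue[1]))
```

Python's sort is stable. Issues in the same tier keep the order they came in.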
When source documents are gone
You will often find that some source documents are missing or incomplete. Some leases were read years ago. Some originals sit with the old broker or landlord. Some were lost in an earlier system move.
When you cannot get the source documents back before the move, your choices are few. But the choice you make matters.
Say the old value is the best you have and looks about right. You can load it with a flag. The flag says it was not checked against the source documents. Your team knows the field is unsure. They can chase the document down.
Say the old value is wrong but you cannot find the right one without the source document. Then load the field blank with an exception note. A blank field is more honest than a wrong one. A blank triggers follow-up. A wrong value that looks right triggers nothing.
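The two choices can be written down as a small loading rule. This is a sketch. The status names, flag text, and output shape are all assumptions, not features of any particular system.

```python
def prepare_for_load(value, status):
    """Decide what to load for one field.

    status is one of (hypothetical labels):
      'verified'    - checked against the source documents
      'unverified'  - best available, but not checked
      'known_wrong' - wrong, and the right value needs the source document
    """
    if status == "verified":
        return {"value": value, "flag": None, "note": None}
    if status == "unverified":
        return {"value": value, "flag": "NOT_VERIFIED_AGAINST_SOURCE", "note": None}
    if status == "known_wrong":
        note = f"Old value {value!r} is known to be wrong; source document needed."
        # Load blank on purpose: a blank triggers follow-up, a wrong value does not.
        return {"value": None, "flag": "EXCEPTION", "note": note}
    raise ValueError(f"unknown status: {status}")
```

The exception note keeps the old value on record without loading it into the field.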
Push hard to find missing documents before the move, not after. It is worth the effort for high-value and high-risk leases. Finding a missing amendment before the move costs less. Finding its billing damage after the move costs far more.
Use consistency checks as a cleansing tool
A full consistency check before the move is one of the best things you can do. It shows you, across the whole portfolio, which related fields do not agree. That reveals patterns where the abstraction went wrong.
Say a check finds 40% of leases where the pro rata share percent does not match the numerator and denominator. That points to a denominator that was not captured well during the first abstraction. The fix is to pull the denominator from source documents for those leases. Do not just patch the percent field.
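A portfolio-wide rate like that can come from a short script. The sketch below assumes each lease is a dictionary with `lease_id`, `share`, `numerator`, and `denominator` keys. Those names and the tolerance are hypothetical.

```python
def share_mismatches(leases, tolerance=0.0001):
    """Return the lease IDs whose share does not match numerator / denominator."""
    flagged = []
    for lease in leases:
        denominator = lease["denominator"]
        if denominator <= 0:
            flagged.append(lease["lease_id"])  # unusable denominator
            continue
        computed = lease["numerator"] / denominator
        if abs(lease["share"] - computed) > tolerance:
            flagged.append(lease["lease_id"])
    return flagged


def mismatch_rate(leases):
    """Fraction of the portfolio that fails the share check."""
    if not leases:
        return 0.0
    return len(share_mismatches(leases)) / len(leases)
```

The list of flagged leases is the work queue for source document review. The rate tells you whether you have a few typos or a pattern.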
Say a check finds option notice deadlines that do not match the end date and notice period. That points to deadlines typed in by hand, not figured from the end date. Some were typed in wrong.
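Recomputing the deadline is simple arithmetic. The sketch below assumes the notice period is stored in days and counts back from the end date. Both are assumptions. Many leases state the period in months, or count from a different date. The key names are hypothetical.

```python
from datetime import date, timedelta


def notice_deadline(end_date, notice_days):
    """Count the notice period back from the end date (assumed rule)."""
    return end_date - timedelta(days=notice_days)


def find_bad_deadlines(leases):
    """Return (lease_id, recorded, recomputed) where the two deadlines differ."""
    bad = []
    for lease in leases:
        recomputed = notice_deadline(lease["end_date"], lease["notice_days"])
        if lease["recorded_deadline"] != recomputed:
            bad.append((lease["lease_id"], lease["recorded_deadline"], recomputed))
    return bad
```

Check the lease language before you overwrite a hand-typed deadline. A mismatch may mean the rule assumed here does not fit that lease.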
Each pattern has its own fix. Format patterns can be fixed with scripts. Math patterns can be fixed by recomputing from the source fields. Wrong values need source document review.
Run the check before the move. Give yourself time to study each pattern and apply the right fix. That gives a far cleaner import than checking records one by one.
The cleansing freeze
Once cleansing starts, set a cleansing freeze. No one updates the old system during the cleansing window unless that change is also tracked in the cleansing workflow. Say someone adds amendments to the old system after you cleansed those leases. Then your cleansed data is stale again before the move.
A freeze does not mean you stop lease administration. It means you track every change made to the old system during the move window. You write it in the migration worksheet. Then you can put it in the moved data too.
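The worksheet can be as simple as a list of change records. Here is a sketch. The `Change` shape and the `cleansed_on` lookup are hypothetical. A spreadsheet with the same columns would do the same job.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Change:
    """One tracked change made to the old system during the freeze."""
    lease_id: str
    field: str
    new_value: str
    changed_on: date


def changes_to_reapply(worksheet, cleansed_on):
    """Return changes made on or after the day each lease was cleansed.

    cleansed_on maps lease_id to the date that lease was cleansed.
    Leases not yet cleansed are skipped; cleansing will pick up their changes.
    """
    return [
        change
        for change in worksheet
        if change.lease_id in cleansed_on
        and change.changed_on >= cleansed_on[change.lease_id]
    ]
```

Anything this returns must go into the cleansed data before the move.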
The abstract-to-audit trigger framework ties these ideas to a clear workflow. It helps abstraction firms add expense-recovery services.
Frequently Asked Questions
What is data cleansing in the context of lease abstraction migration?
Data cleansing for lease abstraction migration is the process of reviewing and correcting the existing abstract data before loading it into a new system. It includes: standardizing field formats to match the target system requirements, resolving known field value errors, filling blank required fields or explicitly flagging them as unknown, reconciling inconsistencies between related fields, and removing or consolidating duplicate records.
What abstract fields most often require cleansing before migration?
The fields most often requiring cleansing are: date fields with inconsistent formatting, pro rata share fields where the percentage does not match the recorded numerator and denominator values, base year fields missing associated gross-up assumptions, CAM cap fields with no carve-out documentation, operating expense fields where the exclusion list was recorded in a generic notes field, amendment-related fields not updated after amendments were filed, and any field where the prior system allowed free-text entry that will not map cleanly to a structured field in the target system.
How do you handle abstract records where the source documents are no longer available?
If the existing value is the best available information, load it into the new system with a flag indicating it could not be verified against source documents. If the existing value is known to be wrong but the correct value cannot be determined without the source document, load the field as blank with an exception note. Loading a known-wrong value without flagging it is the worst option because it creates a false sense of reliability.
What is field-level consistency validation and how does it help cleansing?
Field-level consistency validation checks whether related fields in the same record agree with each other: whether the expiration date equals commencement plus stated term, whether the pro rata share percentage matches numerator divided by denominator, and whether the CAM cap carve-out list includes categories also listed as controllable expenses. Inconsistencies typically indicate an error in at least one field, a field updated by an amendment without updating related fields, or a data entry error. Running the checks across the whole portfolio also shows patterns, which point to where the original abstraction went wrong.
Should data cleansing happen before or after system implementation?
Data cleansing should happen before the full migration. Cleansing after migration is harder because the data is in a new system with a different interface and potentially different field configurations. The most efficient sequence is: run field-level consistency validation, decide which issues need source document review and which need only format normalization, resolve format issues through automated normalization, resolve substantive field errors through document review, run pre-migration testing on the cleansed data, then migrate.