Hong Kong's Land Registry holds title records for more than 2.8 million properties. The Companies Registry maintains filings for hundreds of thousands of registered entities. Both institutions spent years digitising their paper archives — and both ended up with the same problem: thousands of scanned document images filed more than once, creating reference conflicts, slowing searches, and quietly undermining the reliability of records that underpin billions of dollars in daily transactions.
The issue matters now because the Hong Kong government's push to deepen Greater Bay Area economic integration has made clean, interoperable data a precondition for cross-border property ownership schemes and corporate registration harmonisation with Mainland authorities. Duplicate or conflicting image records do not merely inconvenience a solicitor on Des Voeux Road Central — they can stall cross-border due diligence entirely, a point that has been raised repeatedly at Legislative Council panels reviewing the iAM Smart digital identity rollout.
How the Duplication Problem Built Up Over Years
The roots run back to a mid-2000s policy shift. The Land Registry launched its IRIS (Integrated Registration Information System) platform in phases between 2007 and 2010, scanning tens of thousands of pre-computer-era paper deeds held in the basement vaults at Queensway Government Offices. Scanning was contracted in batches, with different vendors using different resolution standards and file-naming conventions. When batches overlapped — as they did at boundary dates in the migration schedule — the same physical document was sometimes scanned twice and uploaded under slightly different metadata tags. Neither upload was automatically flagged as a duplicate because the validation logic checked document reference numbers, not image content.
The Companies Registry ran into analogous difficulties during its own eDR (Electronic Document Repository) build-out, which accelerated from 2012 onward. Filers submitting annual returns at Queensway Plaza sometimes resubmitted documents when the online portal timed out mid-upload, generating second copies in the queue. Registry staff processed both. By 2019, internal audits had identified the category as a known data-quality issue, but remediation was deprioritised as the registry coped first with a surge in new incorporations and then, from early 2020, with the operational disruptions of the pandemic.
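A resubmission triggered by a portal timeout is usually a byte-for-byte copy of the first upload, so it can in principle be caught at intake by comparing content digests instead of reference numbers. The sketch below illustrates that idea only; the `SubmissionQueue` class and its method names are hypothetical and do not describe how the registry's portal actually works.

```python
import hashlib
from typing import Dict, Optional


def content_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of an uploaded file's raw bytes."""
    return hashlib.sha256(data).hexdigest()


class SubmissionQueue:
    """Hypothetical intake queue that flags byte-identical resubmissions."""

    def __init__(self) -> None:
        # Maps a content digest to the id of the first submission that carried it.
        self._seen: Dict[str, str] = {}

    def submit(self, submission_id: str, data: bytes) -> Optional[str]:
        """Record the file; return the earlier submission id if it is a repeat."""
        digest = content_digest(data)
        earlier = self._seen.get(digest)
        if earlier is not None:
            return earlier  # same bytes already queued under another id
        self._seen[digest] = submission_id
        return None


queue = SubmissionQueue()
first = queue.submit("A-1", b"annual return, scanned")
retry = queue.submit("A-2", b"annual return, scanned")  # filer retried after a timeout
```

Here `first` is `None` and `retry` is `"A-1"`, so the second copy can be linked to the first before staff process it. An exact digest only catches identical files; a rescan of the same paper page produces different bytes and would pass this check.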
The pandemic period added another layer. Several law firms in Central and Sheung Wan accelerated bulk digitisation of client title files to allow remote working, uploading documents to the registry's e-submission portal under time pressure. Quality-control steps that would normally have caught duplicate submissions were compressed. The period between March 2020 and June 2021 is now cited in government working papers reviewed by this newspaper as the single largest contributor to the duplicate image backlog in Land Registry holdings.
The Scale of the Problem and What Comes Next
The government has not published a comprehensive public count of confirmed duplicates, but the Digital Policy Office, established formally in July 2024 under the Innovation, Technology and Industry Bureau, has been coordinating a cross-agency deduplication exercise since the fourth quarter of 2024. That exercise applies perceptual hashing algorithms to compare the image files themselves rather than relying on metadata alone, a methodology first piloted on a subset of Companies Registry filings covering the period 2018 to 2022.
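Perceptual hashing reduces each image to a short fingerprint that changes little when the same page is rescanned at a different resolution or brightness. The government has not said which algorithm it uses. The sketch below shows one common variant, the difference hash, on plain grayscale pixel grids; the function names and the threshold of 10 bits are illustrative assumptions, not details of the official exercise.

```python
def shrink(pixels, width, height):
    """Downsample a grayscale grid (list of rows) by averaging blocks."""
    src_h, src_w = len(pixels), len(pixels[0])
    out = []
    for y in range(height):
        y0 = y * src_h // height
        y1 = max((y + 1) * src_h // height, y0 + 1)
        row = []
        for x in range(width):
            x0 = x * src_w // width
            x1 = max((x + 1) * src_w // width, x0 + 1)
            block = [pixels[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def dhash(pixels, size=8):
    """Difference hash: one bit per pair of horizontally adjacent cells."""
    small = shrink(pixels, size + 1, size)  # (size + 1) columns give size comparisons per row
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a, b):
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def likely_duplicates(a, b, threshold=10):
    """Treat two images as probable duplicates if their hashes are close.

    The threshold is an illustrative assumption and would need tuning on real scans.
    """
    return hamming(dhash(a), dhash(b)) <= threshold


# Synthetic stand-ins for scans: the same gradient "page" at two resolutions,
# the second slightly brighter, plus an unrelated page.
scan_a = [[255 * x / 90 for x in range(90)] for _ in range(80)]
scan_b = [[255 * x / 180 + 5 for x in range(180)] for _ in range(160)]
other = [[255 - 255 * x / 90 for x in range(90)] for _ in range(80)]
```

With these inputs, `likely_duplicates(scan_a, scan_b)` is true even though the two grids differ in size and brightness, while `likely_duplicates(scan_a, other)` is false. Real deed scans are far noisier, so any match of this kind would still need human review before a record is merged or removed.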
For property buyers and their solicitors, the practical advice from registry guidance updated in May 2026 is to request a fresh certified copy of any title document dated before 2015 when conducting due diligence on transactions above HK$5 million, rather than relying solely on the digital image already in the system. Conveyancing firms on Wyndham Street and around Pacific Place have started building this step into standard instruction letters. It adds roughly two to three working days and a modest disbursement cost, but it eliminates the risk of a transaction being queried at the eleventh hour over an image conflict. The Digital Policy Office has indicated that a public-facing status dashboard for the deduplication project is planned for release before the end of 2026, which would at least let practitioners track progress in real time.