Hong Kong's government digitisation programme has a problem it has been reluctant to discuss publicly: duplicate images (identical or near-identical photographs stored multiple times across different departmental servers) are estimated to consume a significant share of the roughly 2.8 petabytes of unstructured data that the Hong Kong Government Cloud Platform manages as of mid-2026. The scale of the redundancy, documented in internal procurement notices published on the GovHK eTendering portal earlier this year, prompted the Office of the Government Chief Information Officer to begin a formal duplicate-image replacement exercise, aimed at a problem whose roots go back more than a decade.
The timing matters because Hong Kong is midway through its Smart City Blueprint 2.0, a framework that commits public bureaux to consolidated, interoperable data infrastructure by 2027. Bloated image repositories slow retrieval, inflate cloud-storage contracts, and complicate the cross-agency data-sharing that Greater Bay Area integration increasingly demands. Every duplicate left in place is not just a storage cost; it is also a latency problem for the emergency services dispatch systems, the planning department's aerial-survey databases, and the digital archives held at the Public Records Office in Kwun Tong.
How the Backlog Built Up
The duplication problem did not arrive suddenly. It accumulated through three distinct waves of digitisation activity. The first began around 2003, when the then-Information Technology and Broadcasting Bureau pushed individual departments to scan paper records independently, with no central deduplication standard. The Lands Department, whose Caine Road offices manage hundreds of thousands of cadastral survey images, ran its own ingestion pipeline entirely separate from the Planning Department's GIS photo library in North Point.
The second wave came after Typhoon Mangkhut in September 2018, when multiple agencies simultaneously archived drone-survey imagery of damaged infrastructure. Because no single authority owned the master repository, the Civil Engineering and Development Department, the Drainage Services Department, and the Fire Services Department each retained full copies of overlapping flight-path photographs. Procurement documents reviewed by The Daily Hong Kong show that at least one set of post-Mangkhut aerial images exists in no fewer than four separate departmental stores.
The third and most consequential wave followed the acceleration of remote-work infrastructure after 2020. As bureaux rapidly expanded cloud storage under the whole-of-government Microsoft Azure and local data-centre contracts, migration scripts frequently copied rather than moved image assets, compounding existing redundancy. By January 2024, the OGCIO's own internal audit, referenced in a Legislative Council Panel on Information Technology and Broadcasting paper tabled that month, had identified duplicate-image density as one of three priority inefficiencies, alongside legacy application code and orphaned user accounts.
What the Cleanup Involves — and What It Will Cost
The current remediation programme uses perceptual-hashing algorithms to flag visually identical or near-identical images across repositories, replacing confirmed duplicates with a single canonical file and a pointer record. The Technology Applied Research Fund, administered through the Hong Kong Applied Science and Technology Research Institute in Pak Shek Kok, has co-funded pilot work on the hashing tools since late 2024.
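For readers unfamiliar with the technique, the following is a minimal sketch of how perceptual hashing can flag near-duplicates and map them to a canonical file. It is illustrative only: the tender documents do not say which hashing algorithm the programme uses, so a simple "average hash" stands in here, and the function names, file paths, and distance threshold are all hypothetical.

```python
def average_hash(pixels, hash_size=8):
    """Return a 64-bit average hash of a grayscale image.

    `pixels` is a list of rows of 0-255 integers. The image is assumed to be
    at least hash_size x hash_size so that every averaging block is non-empty.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    # Downsample to hash_size x hash_size by averaging rectangular blocks.
    for r in range(hash_size):
        r0, r1 = r * height // hash_size, (r + 1) * height // hash_size
        for c in range(hash_size):
            c0, c1 = c * width // hash_size, (c + 1) * width // hash_size
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    # One bit per cell: 1 if the cell is brighter than the overall mean.
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def plan_dedup(images, max_distance=4):
    """Group images whose hashes differ by at most `max_distance` bits.

    `images` maps a path to its pixel grid. Returns the list of canonical
    paths and a pointer map {duplicate_path: canonical_path}. Nothing is
    deleted; this only produces a plan for human review.
    """
    canonicals = []  # (path, hash) pairs, in the order they were first seen
    pointers = {}
    for path in sorted(images):
        h = average_hash(images[path])
        match = next(
            (p for p, ch in canonicals if hamming(h, ch) <= max_distance), None
        )
        if match is None:
            canonicals.append((path, h))
        else:
            pointers[path] = match
    return [p for p, _ in canonicals], pointers


# Hypothetical example: a gradient image, a slightly brightened copy held by
# another department, and an unrelated (mirrored) image.
original = [[x * 16 for x in range(16)] for _ in range(16)]
brighter = [[v + 3 for v in row] for row in original]
unrelated = [[240 - v for v in row] for row in original]

canonical_paths, pointer_records = plan_dedup({
    "cedd/flight_017.tif": original,
    "dsd/flight_017_copy.tif": brighter,
    "lands/survey_001.tif": unrelated,
})
print(canonical_paths)
print(pointer_records)
```

In this sketch the brightened copy hashes identically to the original and is pointed at it, while the mirrored image becomes its own canonical entry. The sketch deliberately stops at producing a plan: any actual removal would still pass through the controlled deletion workflow the programme describes.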
Cost estimates in the eTendering portal put the first phase of the deduplication contract — covering the Planning Department, Lands Department, and the Digital Office of the Home and Youth Affairs Bureau — at between HK$12 million and HK$18 million, with completion targeted for the third quarter of 2027. That figure covers software licensing, staff retraining at the Civil Service Training and Development Institute in Kowloon Tong, and the controlled deletion workflow that requires dual-bureau sign-off before any image is permanently removed.
Departments that have not yet been scheduled for the programme — including the Agriculture, Fisheries and Conservation Department, whose biodiversity photo archive in Tsim Bei Tsui spans more than two decades of fieldwork — are being advised to freeze new image uploads pending a methodology review. Procurement officers in those bureaux have been told to document their current storage volumes before October 2026, giving the OGCIO a baseline from which to sequence the remaining cleanup tranches. For agencies with imminent archiving deadlines, the practical advice from the OGCIO's circular issued in May 2026 is straightforward: hold, audit, then migrate — in that order.