The Daily Hong Kong

Hong Kong news, every day

How Hong Kong's Digital Archives Ended Up Riddled With Duplicate Images — And What It Costs To Fix Them

A slow accumulation of legacy data, rushed digitisation drives, and siloed government systems explains why Hong Kong's public-sector image libraries are now facing a costly cleanup.

By Hong Kong News Desk · Published 5 July 2026 at 5:00 am

Updated 5 July 2026 at 1:14 pm

How we reported this

This article was generated by AI from the linked public sources. The Daily Hong Kong is independently owned and covers Hong Kong news free from advertiser or sponsor influence. Read our editorial standards →

Photo: Michael Wambangco on Pexels

Hong Kong's government digitisation programme has a problem it has been reluctant to discuss publicly: duplicate images. Identical or near-identical photographs, stored multiple times across different departmental servers, are estimated to take up a significant share of the roughly 2.8 petabytes of unstructured data that the Hong Kong Government Cloud Platform manages as of mid-2026. The scale of the redundancy, documented in internal procurement notices published on the GovHK eTendering portal earlier this year, prompted the Office of the Government Chief Information Officer (OGCIO) to begin a formal duplicate-image replacement exercise, targeting a problem whose roots go back more than a decade.

The timing matters because Hong Kong is midway through its Smart City Blueprint 2.0, a framework that commits public bureaux to consolidated, interoperable data infrastructure by 2027. Bloated image repositories slow retrieval times, inflate cloud-storage contracts, and complicate the cross-agency data-sharing that Greater Bay Area integration increasingly demands. Every duplicate left in place is not just a storage cost — it is a latency problem for the emergency services dispatch systems, the planning department's aerial-survey databases, and the digital archives held at the Public Records Office in Kwun Tong.

How the Backlog Built Up

The duplication problem did not arrive suddenly. It accumulated through three distinct waves of digitisation activity. The first began around 2003, when the then-Information Technology and Broadcasting Bureau pushed individual departments to scan paper records independently, with no central deduplication standard. The Lands Department, whose Caine Road offices manage hundreds of thousands of cadastral survey images, ran its own ingestion pipeline entirely separate from the Planning Department's GIS photo library in North Point.

The second wave came after Typhoon Mangkhut in September 2018, when multiple agencies simultaneously archived drone-survey imagery of damaged infrastructure. Because no single authority owned the master repository, the Civil Engineering and Development Department, the Drainage Services Department, and the Fire Services Department each retained full copies of overlapping flight-path photographs. Procurement documents reviewed by The Daily Hong Kong show that at least one set of post-Mangkhut aerial images exists in no fewer than four separate departmental stores.

The third and most consequential wave followed the acceleration of remote-work infrastructure after 2020. As bureaux rapidly expanded cloud storage under the whole-of-government Microsoft Azure and local data-centre contracts, migration scripts frequently copied rather than moved image assets, compounding existing redundancy. By January 2024, the OGCIO's own internal audit — referenced in a Legislative Council Panel on Information Technology and Broadcasting paper tabled that month — identified duplicate-image density as one of three priority inefficiencies alongside legacy application code and orphaned user accounts.

What the Cleanup Involves — and What It Will Cost

The current remediation programme uses perceptual-hashing algorithms to flag visually identical or near-identical images across repositories, replacing confirmed duplicates with a single canonical file and a pointer record. The Technology Applied Research Fund, administered through the Hong Kong Applied Science and Technology Research Institute in Pak Shek Kok, has co-funded pilot work on the hashing tools since late 2024.
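To illustrate how perceptual hashing can flag near-identical images, here is a minimal sketch in Python. The tender documents do not say which algorithm the programme uses, so the choice of an "average hash" is an assumption, as are the function names, the 8x8 grid, and the distance threshold of 5. Images are represented as plain grids of grayscale values so that the example needs no imaging library.

```python
from typing import Dict, List

Image = List[List[int]]  # grayscale pixels (0-255), one inner list per row


def average_hash(pixels: Image, size: int = 8) -> int:
    """Return a size*size-bit average hash (hypothetical helper, not the OGCIO tool)."""
    height, width = len(pixels), len(pixels[0])
    if height < size or width < size:
        raise ValueError("image is smaller than the hash grid")
    cells = []
    for r in range(size):
        r0, r1 = r * height // size, (r + 1) * height // size
        for c in range(size):
            c0, c1 = c * width // size, (c + 1) * width // size
            # Shrink the image by averaging each block down to one cell.
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        # One bit per cell: 1 if the cell is brighter than the overall mean.
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two hashes differ."""
    return bin(a ^ b).count("1")


def group_near_duplicates(hashes: Dict[str, int], max_distance: int = 5) -> List[List[str]]:
    """Greedy grouping: the first image in each group acts as its canonical file."""
    groups: List[List[str]] = []
    for name in sorted(hashes):
        for group in groups:
            if hamming_distance(hashes[group[0]], hashes[name]) <= max_distance:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups
```

Because the hash depends on relative brightness rather than exact bytes, a copy that has been slightly brightened or re-encoded lands close to the original, while a genuinely different photograph lands far away. A production system would likely need a more robust hash and a human review step before anything is treated as a confirmed duplicate.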

Cost estimates in the eTendering portal put the first phase of the deduplication contract — covering the Planning Department, Lands Department, and the Digital Office of the Home and Youth Affairs Bureau — at between HK$12 million and HK$18 million, with completion targeted for the third quarter of 2027. That figure covers software licensing, staff retraining at the Civil Service Training and Development Institute in Kowloon Tong, and the controlled deletion workflow that requires dual-bureau sign-off before any image is permanently removed.
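The dual-bureau sign-off described above can be pictured as a simple gate in front of the deletion step. The sketch below is illustrative only: the class, field, and function names are hypothetical, the file paths are invented, and the in-memory dictionary stands in for whatever storage system the programme actually uses.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class DeletionRequest:
    """A request to replace one duplicate with a pointer (hypothetical structure)."""
    duplicate_path: str
    canonical_path: str
    owning_bureau: str
    reviewing_bureau: str
    approvals: Set[str] = field(default_factory=set)

    def __post_init__(self) -> None:
        # Sign-off only counts as "dual" if two different bureaux are involved.
        if self.owning_bureau == self.reviewing_bureau:
            raise ValueError("sign-off requires two distinct bureaux")

    def approve(self, bureau: str) -> None:
        if bureau not in (self.owning_bureau, self.reviewing_bureau):
            raise PermissionError(f"{bureau} is not a signatory for this request")
        self.approvals.add(bureau)

    def is_cleared(self) -> bool:
        return self.approvals == {self.owning_bureau, self.reviewing_bureau}


def replace_with_pointer(request: DeletionRequest, store: dict) -> dict:
    """Swap the duplicate for a pointer record, but only after both approvals."""
    if not request.is_cleared():
        raise PermissionError("dual-bureau sign-off incomplete")
    store[request.duplicate_path] = {"pointer_to": request.canonical_path}
    return store[request.duplicate_path]
```

With a single approval recorded, `replace_with_pointer` raises an error and leaves the duplicate untouched; once the second bureau approves, the duplicate's entry becomes a pointer to the canonical file, which itself is never modified.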

Departments that have not yet been scheduled for the programme — including the Agriculture, Fisheries and Conservation Department, whose biodiversity photo archive in Tsim Bei Tsui spans more than two decades of fieldwork — are being advised to freeze new image uploads pending a methodology review. Procurement officers in those bureaux have been told to document their current storage volumes before October 2026, giving the OGCIO a baseline from which to sequence the remaining cleanup tranches. For agencies with imminent archiving deadlines, the practical advice from the OGCIO's circular issued in May 2026 is straightforward: hold, audit, then migrate — in that order.

About this article

Published by The Daily Hong Kong

Covering news in Hong Kong. This article was generated by AI from the linked sources and was not reviewed by a human editor before publishing. See our editorial standards.
