The Daily Hong Kong

Hong Kong's Digital Archives Contain Thousands of Duplicate Photos, Cleanup Stalls

Decades of fragmented digitisation projects, shifting government contractors and rapid platform migrations left the city's public image repositories riddled with redundant files — and cleaning them up is proving harder than anyone anticipated.

By Hong Kong News Desk · Published 5 July 2026 at 6:17 am

Updated 5 July 2026 at 1:51 pm

How we reported this

This article was generated by AI from the linked public sources. The Daily Hong Kong is independently owned and covers Hong Kong news free from advertiser or sponsor influence.

Photo: Thomas Chisholm Anstey / Public domain (Wikimedia Commons)

Hong Kong's public-sector digital image libraries contain what is estimated to be tens of thousands of duplicate photograph entries. The problem has quietly accumulated since the first major government digitisation push in the late 1990s, and it now threatens the credibility of several high-profile archival projects slated to go live before the end of 2026. The Hong Kong Public Records Office, housed in the government complex off Kwun Tong Road in Kowloon, acknowledged the issue in internal procurement documents circulated earlier this year, which called for specialist vendors to audit and reconcile image metadata across at least four legacy content management systems.

The timing matters. With the Greater Bay Area integration agenda accelerating cross-border data-sharing between Hong Kong institutions and Mainland counterparts in Shenzhen and Guangzhou, duplicated or mislabelled image assets are no longer just a housekeeping embarrassment. They represent a concrete barrier to interoperability. A photograph of the Star Ferry Pier timestamped incorrectly and stored under three separate file names in three separate databases becomes a liability the moment that record is expected to sync with a unified regional cultural heritage portal.

How the Duplication Accumulated

The roots of the problem stretch back to 1997 and the years immediately following the handover, when multiple government bureaux — working without a unified digital asset management standard — each commissioned their own scanning and cataloguing workflows. The Information Services Department on Edinburgh Place in Central ran one programme. The Leisure and Cultural Services Department operated another, covering photographs from city museums including the Hong Kong Museum of History in Tsim Sha Tsui. When the Government Records Service migrated to a new platform in 2007, batch imports pulled files from both systems without deduplication checks, according to the procurement tender documents.

The problem compounded again between 2015 and 2019, when a series of platform upgrades pushed by the Office of the Government Chief Information Officer encouraged departments to upload image collections to the centralised cloud infrastructure under the Digital Government Blueprint. Departments often uploaded their own local copies alongside the migrated central copies, which in effect doubled the number of redundant files. By the time the Smart City Blueprint 2.0 was published in 2020, the duplication issue was documented internally but deprioritised against higher-visibility projects like the iAM Smart digital identity rollout.

Private-sector archives face parallel pressures. The South China Morning Post, which maintains one of the largest commercial photographic archives in the region, undertook its own internal deduplication exercise between 2022 and 2023 after migrating to a new digital asset management platform. Industry observers who work with multiple Hong Kong media clients describe the problem as sector-wide, noting that wire-service photograph feeds received over decades were routinely ingested multiple times across editorial and archive systems without automated matching.

The Technical and Commercial Stakes

Duplicate image replacement — the process of identifying a canonical version of an image, retiring redundant copies, and updating all internal links to point to the single authoritative file — is labour-intensive and technically complex. File hashes can catch exact duplicates, but near-duplicate images, such as cropped or colour-adjusted variants of the same original frame, require perceptual hashing algorithms or manual review. Vendors bidding on the Public Records Office contract quoted day rates of between HK$3,500 and HK$7,000 per specialist reviewer, according to figures in the publicly posted tender, with project timelines running from six to eighteen months depending on scope.
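To illustrate why near-duplicates need a different test from exact ones, here is a minimal sketch of one simple perceptual hash, the "average hash". It is not drawn from the tender documents or any vendor's method. It assumes each image has already been decoded and downscaled to an 8x8 grid of grayscale values by an imaging library, and the function names and the `max_distance` threshold are hypothetical choices made for illustration.

```python
from typing import List

# Assumption: an 8x8 grid of grayscale values (0-255), already decoded
# and downscaled by some imaging library outside this sketch.
Grid = List[List[int]]


def average_hash(grid: Grid) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the grid mean."""
    pixels = [p for row in grid for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Grid, b: Grid, max_distance: int = 5) -> bool:
    """Treat two images as near-duplicates when their hashes differ in few bits.

    The threshold of 5 is an illustrative assumption, not a recommended value.
    """
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance
```

Because each bit records only whether a pixel is brighter than the image's own mean, a uniformly brightened copy tends to produce the same hash as the original, while a byte-level file hash of the two files would differ completely. Borderline pairs would still need the manual review described above.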

For Hong Kong's ambitions as a regional data hub, the stakes are practical. The city's Data Strategy published in 2024 explicitly positions Hong Kong as a governance model for structured data exchange within the Greater Bay Area. Unresolved duplication in foundational public archives undercuts that positioning, particularly when set against Singapore, which completed a comparable National Archives deduplication programme by 2022.

Institutions managing image collections should act now on a few clear steps. Conduct a file-hash audit first — it costs relatively little and eliminates exact duplicates immediately. Allocate budget in the current financial year for perceptual hashing software licences before the next platform migration cycle begins. And build deduplication requirements explicitly into any new vendor contracts, a lesson that multiple government bureaux learned only after the damage was done.
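As a rough illustration of the first step, a file-hash audit can be sketched in a few lines of Python using only the standard library. This is a minimal sketch under stated assumptions, not any institution's actual procedure: the function names are hypothetical, and picking the first path in sorted order as the canonical copy is an arbitrary placeholder for whatever rule an archive would really apply.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large scans do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root: Path) -> Dict[str, List[Path]]:
    """Group every file under root by content hash; keep groups of two or more."""
    groups: Dict[str, List[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[sha256_of(path)].append(path)
    return {h: paths for h, paths in groups.items() if len(paths) > 1}


def plan_replacements(duplicates: Dict[str, List[Path]]) -> Dict[Path, Path]:
    """Map each redundant copy to its canonical file.

    Assumption: the canonical copy is simply the first path in sorted order.
    """
    plan: Dict[Path, Path] = {}
    for paths in duplicates.values():
        canonical, *redundant = paths
        for copy in redundant:
            plan[copy] = canonical
    return plan
```

The resulting mapping only lists which copies could be retired and which file their links should point to. It deletes nothing, and it will not catch cropped or colour-adjusted variants, which is where perceptual hashing comes in.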

About this article

Published by The Daily Hong Kong

Covering news in Hong Kong. This article was generated by AI from the linked sources and was not reviewed by a human editor before publishing. See our editorial standards.
