Hong Kong's public-sector digital archives contain millions of redundant image files, a sprawling inheritance from more than a decade of uncoordinated digitisation drives that saw departments scan, upload and re-upload documents with little cross-referencing or deduplication oversight. The Office of the Government Chief Information Officer — based in Wan Chai's Harbour Centre — confirmed earlier this year that a territory-wide audit of government image repositories was underway, part of a broader push to bring digital asset management up to international standards before the city's next round of smart city benchmarking.
The timing is not accidental. Hong Kong has been under mounting pressure to defend its credentials as a financial and technology hub, particularly as Singapore has aggressively expanded its own digital infrastructure. The Greater Bay Area integration agenda has added urgency: as agencies on both sides of the border begin linking databases, duplicated or mislabelled image files create real friction in data-sharing workflows. A duplicated identity photograph or a mismatched property record image can delay a transaction, trigger a compliance flag or, in sensitive cases, produce a false match in verification systems.
How the Backlog Built Up
The roots of the problem go back to around 2009 and 2010, when multiple bureaux launched parallel digitisation programmes with separate contractors and incompatible metadata standards. The Land Registry in Queensway, the Companies Registry in Wan Chai, and district offices across Kowloon and the New Territories all built their own image stores. By 2015, several of these had been partially migrated to the government's centralised cloud infrastructure, but the migrations added files without deduplicating them: old file versions were retained alongside new ones as a precaution against data loss.
The problem compounded through the early 2020s. The disruptions of 2019 and 2020, combined with a surge in remote work, pushed more staff to upload documents digitally rather than rely on shared physical files. Each scan created a new file. With no functioning checksum comparison in the upload pipelines to catch byte-identical copies, and no perceptual hashing layer to catch near-duplicates, the same document image could exist in three or four slightly different versions, varying in resolution, rotation, or compression, none of which would be caught by a simple filename check. The Digital 21 Strategy, last substantially updated in 2017, had not anticipated the scale of this accumulation.
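To illustrate why the two kinds of check differ, here is a minimal Python sketch, not a description of any government pipeline. It compares an exact SHA-256 fingerprint with a simple "average hash", one common form of perceptual hashing, on small synthetic greyscale images. The function names, the 16x16 test images and the `NEAR_DUPLICATE_BITS` threshold are all assumptions made for illustration.

```python
import hashlib

def sha256_of_pixels(pixels):
    """Exact fingerprint: differs if any single pixel value differs."""
    raw = bytes(v for row in pixels for v in row)
    return hashlib.sha256(raw).hexdigest()

def average_hash(pixels, size=8):
    """Perceptual fingerprint (average hash) of a greyscale image.

    `pixels` is a list of rows of 0-255 ints. Width and height are assumed
    to be multiples of `size` so that block averaging stays simple.
    """
    height, width = len(pixels), len(pixels[0])
    bh, bw = height // size, width // size
    # Shrink to size x size cells by averaging each block of pixels.
    cells = [
        sum(pixels[y][x]
            for y in range(by * bh, (by + 1) * bh)
            for x in range(bx * bw, (bx + 1) * bw)) / (bh * bw)
        for by in range(size)
        for bx in range(size)
    ]
    mean = sum(cells) / len(cells)
    # One bit per cell: is the cell brighter than the overall mean?
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")

# Hypothetical threshold: hashes this close are treated as the same image.
NEAR_DUPLICATE_BITS = 5

# Synthetic 16x16 "scan": a horizontal brightness gradient.
original = [[x * 16 for x in range(16)] for y in range(16)]
# The "same" scan after lossy re-saving: each pixel nudged by 0 to 2 levels.
resaved = [[min(255, v + (x * 7 + y * 3) % 3) for x, v in enumerate(row)]
           for y, row in enumerate(original)]
# A genuinely different image: a vertical gradient.
different = [[y * 16 for x in range(16)] for y in range(16)]

exact_match = sha256_of_pixels(original) == sha256_of_pixels(resaved)
near_distance = hamming(average_hash(original), average_hash(resaved))
far_distance = hamming(average_hash(original), average_hash(different))

print("exact hashes match:", exact_match)
print("perceptual distance, re-saved copy:", near_distance)
print("perceptual distance, different image:", far_distance)
```

In this sketch the exact hash treats the re-saved copy as a brand new file, while the perceptual hash keeps it within the threshold and places the unrelated image far away. An average hash of this kind tolerates rescaling and recompression reasonably well but is not robust to rotation, so rotated scans would need extra handling, such as hashing several orientations.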
Cleaning the Archive
The current remediation effort is being coordinated through the OGCIO alongside the Innovation, Technology and Industry Bureau, which relocated its operational teams to the Hong Kong Science and Technology Parks Corporation's Pak Shek Kok campus in 2023 as part of a broader consolidation. Tender documents published in late 2025 show the government seeking contractors with experience in perceptual hashing, vector-based image similarity search, and automated metadata reconciliation — technologies well-established in commercial content platforms but relatively new to public-sector workflows in the city.
The scale involved is significant. Government procurement notices from the period indicate the initial scope covers repositories holding more than 40 million image assets across participating bureaux, with a project timeline running through mid-2027. Commercial property databases and licensed professional registries are expected to follow in a second phase.
For businesses operating in Central, Sheung Wan and Kwun Tong that interface with government portals for licensing or compliance submissions, the practical implication is that some stored document images may be flagged, cross-checked or temporarily unavailable during rolling remediation windows. The OGCIO has advised portal administrators to retain local copies of any submissions made between 2018 and 2023 as a precaution.
The longer-term dividend, officials argue, is a leaner, faster-querying archive that can plug more cleanly into the cross-boundary data corridors being built under the Guangdong-Hong Kong data cooperation framework. For now, the city is doing the unglamorous work of sorting through its own digital attic — one duplicate image at a time.