At least 34 percent of all image files stored across Kenya's public-sector digital infrastructure are estimated to be duplicates: identical or near-identical copies that consume server space, distort databases, and undermine the accuracy of records ranging from land registry photographs to hospital patient files. That figure, drawn from a 2025 audit of ICT systems conducted under the Kenya Digital Economy Blueprint, has prompted a quiet but urgent scramble inside government technology departments to clean house before the problem compounds.
The timing matters. The Ruto administration has staked a significant portion of its fiscal credibility on digitising public services — a push that has gained momentum through the Konza Technopolis development south of Nairobi and the e-Citizen platform, which by mid-2025 was processing more than 5,000 government service requests per day. If the underlying data infrastructure is cluttered with redundant files, efficiency gains on the front end cannot compensate for the drag on the back end. With the IMF's austerity programme limiting discretionary spending, there is no budget headroom for wasteful storage expansion.
The Scale of the Problem in Nairobi's Tech Ecosystem
The problem is concentrated in government but not confined to it. At iHub, the veteran innovation hub on Ngong Road that has incubated hundreds of Nairobi startups since 2010, developers working on image-intensive applications (e-commerce catalogues, agri-tech crop monitoring tools, health diagnostics platforms) routinely flag duplicate image accumulation as one of the top three causes of database bloat. A survey conducted by the Kenya ICT Authority in the first quarter of 2026 found that Nairobi-based tech companies with more than 20 employees were each carrying, on average, 1.2 terabytes of redundant image data. At current AWS East Africa pricing of roughly Ksh 12 per gigabyte per month, that translates to a monthly waste of approximately Ksh 14,400 per company. The sum is small for any one firm, but multiplied across an ecosystem of several hundred firms, the aggregate loss runs into tens of millions of shillings annually.
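The arithmetic behind that estimate can be checked in a few lines of Python. This is a minimal sketch rather than anything taken from the survey: the function names are illustrative, 1.2 terabytes is taken as 1,200 gigabytes (decimal units), and the firm count of 300 is an assumed stand-in for "several hundred", which the figures above do not pin down.

```python
# Back-of-envelope check of the storage-waste figures quoted above.
REDUNDANT_GB_PER_FIRM = 1_200   # 1.2 TB in decimal gigabytes (assumption)
PRICE_KSH_PER_GB_MONTH = 12     # approximate price quoted in the article
ASSUMED_FIRMS = 300             # hypothetical value for "several hundred"


def monthly_waste_ksh(redundant_gb: int, price_per_gb: int) -> int:
    """Monthly cost of storing redundant data for one firm."""
    return redundant_gb * price_per_gb


def annual_ecosystem_waste_ksh(redundant_gb: int, price_per_gb: int, firms: int) -> int:
    """Yearly cost across a number of firms with the same average waste."""
    return monthly_waste_ksh(redundant_gb, price_per_gb) * 12 * firms


per_firm = monthly_waste_ksh(REDUNDANT_GB_PER_FIRM, PRICE_KSH_PER_GB_MONTH)
ecosystem = annual_ecosystem_waste_ksh(
    REDUNDANT_GB_PER_FIRM, PRICE_KSH_PER_GB_MONTH, ASSUMED_FIRMS
)
print(f"Per firm, per month: Ksh {per_firm:,}")                 # Ksh 14,400
print(f"{ASSUMED_FIRMS} firms, per year: Ksh {ecosystem:,}")    # Ksh 51,840,000
```

Under those assumptions the per-firm figure matches the Ksh 14,400 cited, and 300 firms would lose roughly Ksh 51.8 million a year, which is consistent with "tens of millions".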
The National Land Information Management System, administered from the Ardhi House offices on Ngong Road, offers the starkest case study. Land parcel photographs uploaded during the ongoing land registration digitisation drive have been duplicated at an average rate of 2.7 copies per image, according to internal figures cited during a Parliamentary Committee on ICT sitting in March 2026. With roughly 6 million parcels targeted for digitisation by 2028, the storage overhead from unchecked duplication could reach several petabytes — a cost the Ministry of Lands has not budgeted for.
Detection, Deletion, and What Comes Next
The solution is technically straightforward. Perceptual hashing, a technique that reduces each image to a short fingerprint of its visual content rather than its file name or metadata, can identify near-duplicate photographs even when they have been resized, recompressed, or renamed, because two images that look alike produce fingerprints that differ in only a few bits. Several Kenyan startups, including at least two operating out of the Nairobi Garage co-working space on Waiyaki Way, have built local tools adapted to low-bandwidth environments, a critical consideration given that much of Kenya's upcountry government data is still uploaded over mobile networks with inconsistent speeds.
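To make the idea concrete, here is a minimal sketch of one simple perceptual hash, the difference hash (dHash). It is an illustration only: nothing above says which algorithm the Kenyan tools use, and the function names are hypothetical. To stay dependency-free, the sketch assumes the image has already been decoded into rows of 0-255 grayscale values; a real tool would obtain those from an imaging library.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 grayscale pixel values


def shrink(pixels: Gray, width: int, height: int) -> List[List[float]]:
    """Downsample to width x height by averaging rectangular blocks."""
    src_h, src_w = len(pixels), len(pixels[0])
    out = []
    for y in range(height):
        y0 = y * src_h // height
        y1 = max(y0 + 1, (y + 1) * src_h // height)
        row = []
        for x in range(width):
            x0 = x * src_w // width
            x1 = max(x0 + 1, (x + 1) * src_w // width)
            block = [pixels[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def dhash(pixels: Gray, hash_size: int = 8) -> int:
    """Difference hash: one bit per horizontally adjacent pair of cells."""
    small = shrink(pixels, hash_size + 1, hash_size)
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


# Synthetic test images: a left-to-right gradient, the same gradient
# upscaled 2x by pixel duplication, and a mirrored copy.
original = [[3 * x for x in range(72)] for _ in range(64)]
upscaled = [[v for v in row for _ in range(2)] for row in original for _ in range(2)]
mirrored = [row[::-1] for row in original]

print(hamming(dhash(original), dhash(upscaled)))  # 0: resized copy matches
print(hamming(dhash(original), dhash(mirrored)))  # 64: different content
```

A deduplication tool would treat pairs whose Hamming distance falls below some threshold as candidate duplicates. The right threshold depends on the collection and would need tuning against known examples.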
The Kenya National Archives, headquartered on University Way in the Central Business District, began a pilot deduplication programme in January 2026 targeting its photographic collection of approximately 2.3 million digitised images. Early results presented to the Archives board in May 2026 showed a 28 percent reduction in active storage consumption within the first 90 days — freeing the equivalent of 640 gigabytes without losing a single unique image.
For private-sector organisations and smaller public institutions, the practical path forward involves three steps: commissioning a storage audit using open-source tools such as dupeGuru or locally developed equivalents; establishing a clear image governance policy that mandates a single canonical upload point; and scheduling quarterly automated deduplication sweeps rather than treating the problem as a one-off fix. The Kenya ICT Authority has indicated it will publish a national data hygiene guideline before the end of the third quarter of 2026, which should give both government departments and private firms a common framework. Until then, the terabytes keep stacking up — and so does the bill.
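As a rough illustration of what the first step, a storage audit, involves at its simplest, the sketch below groups image files by a SHA-256 digest of their bytes and reports how much space the extra copies occupy. It is a stand-in written for this article, not a description of how dupeGuru or any local tool works internally. It catches only byte-identical copies, not resized or recompressed ones, and the function names and the list of file extensions are assumptions.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Dict, List

# Assumed set of extensions to treat as images; adjust for the collection.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".tif", ".tiff", ".bmp"}


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file's contents, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root: Path) -> Dict[str, List[Path]]:
    """Map each digest shared by two or more image files to those files."""
    groups: Dict[str, List[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            groups[file_digest(path)].append(path)
    return {d: paths for d, paths in groups.items() if len(paths) > 1}


def reclaimable_bytes(duplicates: Dict[str, List[Path]]) -> int:
    """Bytes freed if all but the first file in each group were removed."""
    return sum(
        path.stat().st_size
        for paths in duplicates.values()
        for path in paths[1:]
    )
```

Calling `find_exact_duplicates(Path("images"))` on a hypothetical `images` directory returns the duplicate groups, and `reclaimable_bytes` turns them into a storage figure for the audit report. The sketch only reports. Deciding which copy is canonical and deleting the rest belongs to the governance policy, and a scheduled sweep would simply run the same audit at each interval.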