Smarter Search for Law Enforcement: Evaluating Context-Aware and Traditional Approaches


Article Preview

Authors: Archan Dutta & Thaís D. Poi

Issue: Summer Issue, 2026


Abstract

Despite rapid and widespread advances in natural language processing, most law enforcement case and report management systems still rely on keyword-based search engines. Information retrieval within law enforcement agencies depends on vast amounts of unstructured text, including case reports, incident reports, and written statements, which investigators must sift through and analyze to find meaningful leads. Keyword-based search engines struggle with the synonyms, equivalent terms, abbreviations, slang, paraphrases, and narrative variation common in police reports and case files. As a result, investigators may overlook relevant cases unless they manually craft multiple keyword searches. Using a synthetic dataset generated for the purpose, this quantitative study evaluated whether context-aware models (also known as semantic or embedding models) achieve higher retrieval accuracy than traditional keyword-based search on law enforcement reports. The semantic models performed significantly better than traditional keyword-based search, an improvement of 56 percentage points. Beyond investigators, inspectors, and analysts, smarter search may benefit other key parties such as patrol officers, crime analysts, and prosecutors by enabling faster discovery of related incidents, improved case linkage, and richer text analysis. These results highlight the potential to improve operational efficiency, reduce administrative and management burden, and strengthen public safety outcomes.
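
The vocabulary-mismatch problem described in the abstract can be illustrated with a minimal sketch. It is not the study's method or dataset: the report snippets, the `keyword_search` and `multi_query_search` helpers, and the all-terms-must-match rule are assumptions made up for illustration. It shows only how an exact-match keyword search can miss a paraphrased report until a second query is crafted by hand.

```python
import re

# Hypothetical, invented report snippets (not from the study's dataset).
REPORTS = {
    "R1": "Suspect stole a vehicle from the parking lot.",
    "R2": "Victim states her car was taken from the driveway overnight.",
    "R3": "Officer responded to a noise complaint at an apartment.",
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def keyword_search(query, reports):
    """Return IDs of reports that contain every query term as an exact token."""
    terms = tokenize(query)
    return sorted(rid for rid, text in reports.items() if terms <= tokenize(text))


def multi_query_search(queries, reports):
    """Union the hits of several hand-written keyword queries."""
    hits = set()
    for query in queries:
        hits.update(keyword_search(query, reports))
    return sorted(hits)


# One query misses the paraphrased report R2 ("car was taken").
single = keyword_search("stole vehicle", REPORTS)

# A second, manually crafted query is needed to recover it.
combined = multi_query_search(["stole vehicle", "car taken"], REPORTS)

print(single)
print(combined)
```

A semantic (embedding) model aims to close this gap by comparing meaning rather than exact tokens, so that a single query could surface both reports without manual rewording.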