Unicus helps a Bulgarian financial data provider build the region's largest financial database, accelerating processing by over 50%

Published: May 26, 2025

The Problem:

A financial data provider based in Bulgaria set out to build the largest comprehensive database of company financials in the region. Several factors threatened this goal. Several OCR vendors had failed to meet the provider's accuracy and scalability requirements, leaving 40% of important financial data trapped in low-quality documents.

Main technical challenges:

- Existing solutions extracted information inaccurately.

- Low-quality PDFs significantly reduced the effectiveness of data extraction.

- Financial data was dispersed across various types of statements, each with different data representations.

- Missing data points required manual intervention and adjustments.

- The system rejected documents that did not fully adhere to the established templates, and the resulting silent data loss from incomplete documents hindered the overall growth of the database.

The client's existing system created a significant bottleneck: it labeled any document with missing information as "data deficient" and rejected it outright, because it was not built to handle files with incomplete data. As a result, the client found significant information gaps that it had not previously recognized. The client asked us to remove this obstacle and, in line with its standard procedure for manual processing, to fill the template with derived data.

Our Solution

As a proof of concept (POC), we worked on around 50 documents of varying difficulty. Our financial data extraction agent extracted the information with 95% accuracy, which the client found highly impressive. This led to scaled-up delivery, with 1,000 documents processed in batches.

Getting Past Implementation Obstacles: The Problem of Template Adherence

Addressing the client's requirement for template adherence produced the biggest breakthrough. Working closely with our in-house subject matter experts, we built a workaround that significantly expanded the client team's data processing capabilities.

Reconstructing Data Intelligently:

To guarantee that every document satisfies the precise template structure needed by the client's system, we developed automated systems that find and recreate missing data points.
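As a rough illustration of the idea, the sketch below fits an extracted record to a fixed template and records which fields are missing, so the document can move on to reconstruction rather than being rejected. The field names, `TemplateRecord`, and `fit_to_template` are hypothetical and do not reflect the client's actual schema or our production code.

```python
from dataclasses import dataclass, field

# Hypothetical template; the real template fields are not given in this post.
TEMPLATE_FIELDS = ["revenue", "cost_of_sales", "gross_profit", "net_income"]


@dataclass
class TemplateRecord:
    values: dict                                  # one entry per template field
    missing: list = field(default_factory=list)   # fields still to reconstruct


def fit_to_template(extracted: dict) -> TemplateRecord:
    """Return a record that always has every template field.

    Fields absent from the extraction are set to None and listed in
    `missing`, so a later step can try to reconstruct them. Keys that
    are not part of the template are dropped.
    """
    values = {name: extracted.get(name) for name in TEMPLATE_FIELDS}
    missing = [name for name, value in values.items() if value is None]
    return TemplateRecord(values=values, missing=missing)


record = fit_to_template({"revenue": 100.0, "net_income": 10.0})
print(record.missing)
```

Keeping an explicit `missing` list is one way to make data loss visible instead of silent.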

Advanced Document Processing:

We applied AI-driven features specifically designed to manage low-quality documents while maintaining 95% data quality standards across previously inaccessible data sources.

Intelligent Mapping System:

We built rule-based extraction that treats different terms for the same data item as equivalent, unifies different ways of presenting data, and combines information from various types of financial statements.
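A minimal sketch of what rule-based mapping of this kind can look like is shown below. The synonym table, the number format handled by `parse_amount` (a dot as decimal separator, commas or spaces as thousands separators, parentheses for negatives), and all function names are assumptions made for illustration only.

```python
import re

# Hypothetical synonym rules: canonical field -> labels seen in statements.
SYNONYMS = {
    "revenue": {"revenue", "sales", "net sales", "turnover"},
    "net_income": {"net income", "net profit", "profit for the year"},
}
LABEL_TO_FIELD = {
    label: name for name, labels in SYNONYMS.items() for label in labels
}


def normalize_label(raw: str) -> str:
    """Lowercase a label and collapse punctuation and whitespace."""
    return " ".join(re.sub(r"[^a-z]+", " ", raw.lower()).split())


def parse_amount(text: str) -> float:
    """Parse amounts such as '1,000', '2 000' or '(250.5)' (negative)."""
    cleaned = text.strip()
    negative = cleaned.startswith("(") and cleaned.endswith(")")
    cleaned = cleaned.strip("()").replace(",", "").replace(" ", "")
    value = float(cleaned)
    return -value if negative else value


def merge_statements(statements: list) -> dict:
    """Combine (label, amount) rows from several statements into one record.

    The first statement that supplies a field wins; unknown labels are skipped.
    """
    merged = {}
    for rows in statements:
        for raw_label, raw_value in rows:
            name = LABEL_TO_FIELD.get(normalize_label(raw_label))
            if name is not None and name not in merged:
                merged[name] = parse_amount(raw_value)
    return merged


income_statement = [("Turnover", "1,000"), ("Unknown line", "5")]
notes = [("Net Profit:", "(250.5)")]
print(merge_statements([income_statement, notes]))
```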

Automated Value Derivation:

We built new systems that fill in missing information by automatically converting and calculating values from notes and indirect references while keeping the data consistent and accurate.
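One simple way to picture value derivation is through standard accounting identities, for example gross profit = revenue - cost of sales. The sketch below applies a small, hypothetical rule table until no further fields can be filled, and tracks which values were derived rather than extracted. The rule set and the `derive_missing` name are illustrative assumptions, not the production logic.

```python
# Hypothetical derivation rules: target field -> (input fields, formula).
DERIVATION_RULES = {
    "gross_profit": (("revenue", "cost_of_sales"), lambda r, c: r - c),
    "cost_of_sales": (("revenue", "gross_profit"), lambda r, g: r - g),
    "total_assets": (("total_liabilities", "total_equity"), lambda l, e: l + e),
}


def derive_missing(values: dict) -> tuple:
    """Fill missing fields from known ones.

    Returns (filled, derived): a copy of `values` with derivable gaps
    filled, and the set of field names that were computed rather than
    extracted. Rules are applied repeatedly so that one derived value
    can feed another rule.
    """
    filled = dict(values)
    derived = set()
    changed = True
    while changed:
        changed = False
        for target, (inputs, formula) in DERIVATION_RULES.items():
            if filled.get(target) is not None:
                continue  # already extracted or derived
            args = [filled.get(name) for name in inputs]
            if all(arg is not None for arg in args):
                filled[target] = formula(*args)
                derived.add(target)
                changed = True
    return filled, derived


filled, derived = derive_missing(
    {"revenue": 1000.0, "cost_of_sales": 600.0, "gross_profit": None}
)
print(filled["gross_profit"], derived)
```

Recording which fields were derived keeps computed values distinguishable from extracted ones during later review.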

Business Impact:

- Eliminated almost all of the data gaps that previously restricted the database's coverage, increasing coverage from 60% to 99%.

- Increased language coverage from 8–10 languages to 40+ languages, enabling comprehensive regional coverage and establishing the client as the go-to source for financial information across various markets.

- Improved data accuracy from 70–75% to 98%, providing a solid basis for financial analysis and decision-making.

- By removing silent data loss and unlocking previously inaccessible information, we successfully built the largest comprehensive financial database in the region.


Strategic Business Results

Through this collaboration, the client moved from being a regional player with data gaps to being the leading source of financial data for all of its target markets. By resolving the template adherence issue and putting intelligent data reconstruction into practice, we helped the client realize its ambitious goal and set new benchmarks for data quality and coverage in the financial information industry.

Information coverage increased to 99%

Data accuracy increased to 93-95%

One-month processes now completed in 1-2 weeks