ICPR 2024 Competition on Word Image Recognition from Indic Scene Images

Competition Updates

Registration Opens : April 28, 2024

Training Data Release : June 3, 2024 (Revised)

Test Data Release : July 25, 2024

Registration Close : August 5, 2024

Results Submission Deadline : August 5, 2024

Winner Announcement : August 15, 2024

Recents Updates

Registration Open : April 28, 2024

Training Data Release : June 3, 2024

Introduction

Image 1
Fig. 1. Geographical Coverage of Targeted Languages: The figure illustrates the distribution of languages covered by the competition, spanning across various regions of India. With the competition taking place in India, this presents a significant opportunity to engage the local research community and witness enthusiastic participation.


Rich textual content found in natural settings contains valuable information that greatly enhances understanding of the environment in contemporary times. Such text is utilized for tasks like image search, translation, transliteration, assistive technologies (especially for the visually impaired), autonomous navigation, and more.

Scene text recognition is increasingly crucial, with its solution promising advancements in various downstream tasks like above. Despite significant progress in this field, there are areas for further enhancement. These include developing models that can handle diverse languages, fonts, layouts, and styles, as well as creating solutions resilient to text-related image imperfections such as blurriness, occlusion, and uneven illumination. Researchers have sought to tackle these challenges by curating datasets tailored to specific problems, each highlighting distinct features and representing subsets of real-world challenges.

Because Indian language scripts are visually more complex, and their output space is much larger than English languages, not all Latin STR models can mimic performance in Indian STR solutions. STR solutions have not progressed in the case of Indian languages, which are spoken by 17% of the world population, due to a lack of real datasets and models that are better equipped to handle the inherent complexities of the languages. Non-Latin languages have made less progress, and existing Latin STR models need to generalize better to different languages. Through this competition we aim to tackle this challenge within the scene text domain of Document Analysis and Recognition community.

In line with the goal of achieving accuracy in scene text recognition across diverse languages, this competition targets nearly all Indian languages. These languages, constituting approximately 17% of the world’s population, share syntactic and semantic similarities and serve as crucial means of communication.

The challenge comprises only one task - cropped word image recognition. The participants have to predict the text in all the cropped word images of the test set in a total of 10 targeted Indian languages.