In the last several years, AI/ML technologies have become pervasive in academia and industry, finding utility in new and challenging applications. While much effort has gone into building better, smarter, and more automated AI pipelines, little work has been done to systematically understand the challenges of determining whether data is ready to be fed into these pipelines. Given a business problem, several questions remain open: How does one select the right data from a data source? Is the collected data of appropriate quality? If not, which cleaning techniques should be applied, and how does one determine whether the goals of data cleaning have been achieved? Researchers and practitioners alike have increasingly come to realize that the real-world utility of an ML model is only as good as the data it was trained on. Developing techniques and frameworks that help determine the readiness of data for training and deploying machine learning models is therefore of utmost importance.
The goal of this workshop is to bring together researchers working in data acquisition, data labeling, data quality, data preparation, and AutoML to understand how data issues, and their detection and remediation, contribute to building better models. The workshop invites researchers from academia and industry to submit novel proposals for systematically identifying and mitigating data issues in order to make data AI-ready. Because methods of data assessment can change depending on the modality of the data, we invite submissions on data readiness for different modalities: structured (or tabular) data, unstructured data (such as text), graph-structured (relational, network) data, time series data, etc. We would also like to explore state-of-the-art deep learning and AI concepts such as deep reinforcement learning, graph neural networks, self-supervised learning, capsule networks, and adversarial learning to address the problems of data assessment and readiness.
Authors are invited to submit original, previously unpublished research papers of up to 10 pages describing novel research work, including research results and evaluations. Papers must not have been published elsewhere or be concurrently submitted for publication elsewhere.
Papers should be written in English and follow the Springer LNCS style for all text, references, appendices, and figures. Since the review process is single-blind, please include author names and affiliations. For formatting instructions and templates, see the Springer Web page: http://www.springer.de/comp/lncs/authors.html (LNCS Template Overleaf). Submitted papers will be evaluated by at least three members of the international program committee. At least one author of each accepted paper must register for and participate in the workshop to present the paper. The workshop papers will be included in the LNCS/LNAI post-proceedings of the PAKDD Workshops, published by Springer.
Submissions should be made via the EasyChair system through the submission page: https://easychair.org/my/conference?conf=datareadiness2021
The submitted papers must not be previously published anywhere and must not be under consideration by any other conference or journal during the data-datareadiness2021 review process.
To be announced.
For any queries, reach out to us at firstname.lastname@example.org