Machines Learn Better with Better Data Ontology: Lessons from Philosophy of Induction and Machine Learning Practice

Minds and Machines 33 (3):429-450 (2023)

Abstract

As scientists begin to adopt machine learning (ML) as a research tool, the security of ML and of the knowledge it generates becomes a concern. In this paper, I explain how supervised ML can be improved with better data ontology, that is, the way we make categories and turn information into data. More specifically, we should design data ontology so that it is consistent with the knowledge we already have about the target phenomenon, so that the ontology can help us make the inductive leap. I do so by thinking through a thought experiment, Goodman’s New Riddle of Induction (Fact, Fiction, and Forecast, Harvard University Press, 1955). Goodman’s riddle helps flesh out three problems of induction: (1) the problem of equal goodies: there are often too many equally good inductive results given the same data; (2) the problem of diverging performance: these equally good results can give opposite predictions in the future; and (3) the problem of mediocrity: when averaged across all equally possible datasets and tasks, no inductive algorithm outperforms any other. I show that all three problems are manifested as real obstacles in ML practice, namely the Rashomon effect (Breiman in Stat Sci 16(3):199–231, 2001), the problem of underspecification (D’Amour et al. in J Mach Learn Res, 2020, https://doi.org/10.48550/arXiv.2011.03395), and the No Free Lunch theorem (Wolpert in Neural Comput 8(7):1341–90, 1996, https://doi.org/10.1162/neco.1996.8.7.1341). Lastly, I argue that proper data ontology can help mitigate these problems, and I demonstrate how using concrete examples from climate science. This research highlights the links between philosophers’ discussions of induction and their implications for ML practice.
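The grue-style divergence the abstract describes can be sketched in a few lines of Python. This is a hypothetical illustration, not code from the paper: the cutoff `T` and the two predicate functions are invented here to show how two hypotheses can agree on every observation yet make opposite predictions.

```python
T = 10  # hypothetical cutoff: all observations so far occur at times t < T

def green(t):
    """Hypothesis 1: emeralds are green at every time."""
    return "green"

def grue(t):
    """Hypothesis 2: emeralds are green if examined before T, blue otherwise."""
    return "green" if t < T else "blue"

# Both hypotheses fit the observed data equally well
# (the problem of equal goodies) ...
observations = range(T)
assert all(green(t) == grue(t) for t in observations)

# ... but they diverge on the first unobserved case
# (the problem of diverging performance).
print(green(T), grue(T))  # prints: green blue
```

Nothing in the observed data favors `green` over `grue`; only background knowledge about which categories are projectible (the paper's "data ontology") can break the tie.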


Links

PhilArchive



Author's Profile

Dan Li
Baruch College (CUNY)

Citations of this work

On trusting chatbots. P. D. Magnus - forthcoming - Episteme.
Searching for Features with Artificial Neural Networks in Science: The Problem of Non-Uniqueness. Siyu Yao & Amit Hagar - 2024 - International Studies in the Philosophy of Science 37 (1):51-67.