Dataset (machine learning)
A collection of raw data, commonly (but not exclusively) organized in one of the following formats:
- a spreadsheet
- a file in CSV format
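As a minimal sketch of what a CSV-formatted dataset looks like in practice, the snippet below parses a small piece of CSV text into one record per example. The column names (`height_cm`, `weight_kg`, `label`), the sample values, and the helper `load_examples` are hypothetical and chosen purely for illustration.

```python
import csv
import io

# Hypothetical CSV content: the first line is the header,
# and each following line is one example.
CSV_TEXT = """height_cm,weight_kg,label
170,65,adult
120,25,child
182,80,adult
"""


def load_examples(text):
    """Parse CSV text into a list of dicts, one dict per example."""
    return list(csv.DictReader(io.StringIO(text)))


examples = load_examples(CSV_TEXT)
```

Each element of `examples` maps column names to string values, so numeric columns would still need converting before use in a model.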
Characteristics
A dataset is characterized by its size and diversity. Good datasets are both large and highly diverse:
- Size indicates the number of examples.
- Diversity indicates the range those examples cover.
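The two characteristics above can be sketched roughly in code. Size is simply the number of examples. Diversity has no single agreed measure, so the snippet assumes two crude proxies: the min-to-max range of a numeric feature and the set of distinct values of a categorical one. The sample data and all function names are hypothetical.

```python
# Hypothetical dataset: each dict is one example.
examples = [
    {"height_cm": 170, "label": "adult"},
    {"height_cm": 120, "label": "child"},
    {"height_cm": 182, "label": "adult"},
]


def dataset_size(rows):
    """Size: the number of examples in the dataset."""
    return len(rows)


def numeric_range(rows, feature):
    """Crude diversity proxy (assumption): min and max of a numeric feature."""
    values = [row[feature] for row in rows]
    return min(values), max(values)


def distinct_values(rows, feature):
    """Crude diversity proxy (assumption): distinct values of a feature."""
    return {row[feature] for row in rows}


size = dataset_size(examples)
height_span = numeric_range(examples, "height_cm")
labels_seen = distinct_values(examples, "label")
```

A dataset can be large yet not diverse, for instance many examples that all fall in a narrow height range, which is why the two properties are checked separately here.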