Bucketization

Converting a single feature into multiple binary features called buckets or bins, typically based on a value range. The chopped feature is typically a continuous feature.1

For example, instead of representing temperature as a single continuous floating-point feature, you could chop ranges of temperatures into discrete buckets, such as:

  • <=10<= 10 degrees Celsius would be the “cold” bucket.
  • 112411 - 24 degrees Celsius would be the “temperate” bucket.
  • >=25> = 25 degrees Celsius would be the “warm” bucket.

The model will treat every value in the same bucket identically. For example, the values 13 and 22 are both in the temperate bucket, so the model treats the two values identically.

Footnotes

  1. developers.google.com/machine-learning/glossary#bucketing

2024 © ak