Data processing
Aggregation (combining attributes)
For example, weight and height
can be combined into a single attribute, BMI
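As a rough illustration of aggregation, the sketch below combines weight and height into BMI (weight in kilograms divided by height in metres squared). The records and field names (`weight_kg`, `height_m`) are made up for this example.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Aggregate two attributes (weight, height) into one: BMI = kg / m^2."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / (height_m ** 2)


# Hypothetical records, made up for illustration only.
people = [
    {"name": "A", "weight_kg": 70.0, "height_m": 1.75},
    {"name": "B", "weight_kg": 54.0, "height_m": 1.60},
]

# Replace the two original attributes with the single aggregated one.
aggregated = [
    {"name": p["name"], "bmi": round(bmi(p["weight_kg"], p["height_m"]), 1)}
    for p in people
]
```

After aggregation each record carries one attribute instead of two, which is the point of the step: fewer attributes, at the cost of losing the original values.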
Sampling
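Simple random sampling: every item has the same chance of being picked
Sampling without replacement: a picked item is not put back
Sampling with replacement: a picked item is put back and may be picked again
Stratified sampling: partition the data into groups, sample from each group separately, then merge the results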
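The sampling variants above can be sketched with Python's standard `random` module. The helper `stratified_sample`, the toy data, and the 50% sampling fraction are assumptions made for this example, not part of the original notes.

```python
import random
from collections import defaultdict

rng = random.Random(0)  # fixed seed so the sketch is repeatable
data = list(range(10))

# Simple random sampling without replacement: each item appears at most once.
without_replacement = rng.sample(data, k=5)

# Simple random sampling with replacement: the same item may be drawn again.
with_replacement = rng.choices(data, k=5)


def stratified_sample(records, key, fraction, rng):
    """Partition records by key, sample each partition, then merge."""
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)
    merged = []
    for group in strata.values():
        k = max(1, round(len(group) * fraction))  # at least one per group
        merged.extend(rng.sample(group, k))
    return merged


# Hypothetical labelled records: 8 in group "a", 4 in group "b".
records = [("a", i) for i in range(8)] + [("b", i) for i in range(4)]
sample = stratified_sample(records, key=lambda r: r[0], fraction=0.5, rng=rng)
```

Sampling each group separately keeps the group proportions of the original data (here 2:1), which plain random sampling does not guarantee on small samples.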
Dimensionality Reduction
Used when the number of attributes (dimensions) becomes large
PCA (available in Python libraries)
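A minimal PCA sketch using only NumPy, assuming the usual formulation: center the data, then take the top right singular vectors as principal directions. Libraries such as scikit-learn ship a ready-made PCA class; the function `pca` and the synthetic data below are made up for illustration.

```python
import numpy as np


def pca(X, n_components):
    """Project X (n_samples x n_features) onto its top principal components."""
    X_centered = X - X.mean(axis=0)
    # SVD of the centered data: the rows of Vt are the principal directions.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Variance captured along each direction (sample variance, ddof=1).
    explained_variance = (S ** 2) / (X.shape[0] - 1)
    projected = X_centered @ components.T
    return projected, components, explained_variance[:n_components]


# Synthetic data: the second column is almost exactly twice the first,
# so one component should capture nearly all of the variance.
rng = np.random.default_rng(0)
x = rng.normal(size=100)
X = np.column_stack([x, 2 * x + rng.normal(scale=0.01, size=100)])

Z, components, variance = pca(X, n_components=1)
```

Here two attributes are reduced to one with almost no loss, because the second attribute carried little information beyond the first.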
redundant features
Irrelevant features
Techniques:
brute force
embedded
filter
wrapper
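Of the techniques listed, the filter approach is the easiest to sketch: features are scored before any model is trained, dropping those that look irrelevant to the target or redundant with a feature already kept. The scoring rule (absolute Pearson correlation), the thresholds, and the toy columns below are all assumptions chosen for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant column."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def filter_features(columns, target, min_relevance=0.3, max_redundancy=0.95):
    """Keep features correlated with the target but not with each other."""
    relevance = {name: abs(pearson(col, target)) for name, col in columns.items()}
    kept = []
    for name in sorted(columns, key=relevance.get, reverse=True):
        if relevance[name] < min_relevance:
            continue  # irrelevant: says little about the target
        if any(abs(pearson(columns[name], columns[k])) > max_redundancy for k in kept):
            continue  # redundant: duplicates a feature already kept
        kept.append(name)
    return kept


# Hypothetical columns, made up for illustration only.
columns = {
    "height_cm": [150, 160, 170, 180, 190],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],  # same information as height_cm
    "noise": [2, 5, 1, 5, 2],                     # unrelated to the target
}
weight_kg = [50, 58, 66, 77, 85]

kept = filter_features(columns, weight_kg)
```

A wrapper approach would instead train a model on candidate subsets and compare results, and brute force would try every subset, which is why the filter approach is usually the cheapest of the three.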
Feature Engineering
Feature subset selection
Feature creation
Discretization and binarization
feature extraction
mapping data to new space
feature construction
Attribute transformation
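Normalization and standardization
Normalization (rescaling to a fixed range) seems to be the more meaningful of the two
Standardization rescales the data to zero mean and unit variance (it changes the scale, not the shape, so it does not by itself make the data normally distributed)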
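To make the two transformations concrete, here is a small sketch assuming the common definitions: min-max normalization maps values into [0, 1], and z-score standardization shifts them to mean 0 and standard deviation 1. The function names and the sample values are made up for this example.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: nothing to rescale
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Z-score: subtract the mean, divide by the (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


values = [2.0, 4.0, 6.0, 8.0]  # made-up attribute values
normalized = min_max_normalize(values)
standardized = standardize(values)
```

Both are linear rescalings, so the relative spacing between values is preserved; only the origin and the unit change.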
Similarity measures
Similarity and dissimilarity
Euclidean Distance
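Euclidean distance is a dissimilarity measure: the square root of the summed squared differences between two points, attribute by attribute. A minimal sketch follows; the function name and the sample points are chosen for this example.

```python
from math import sqrt


def euclidean_distance(p, q):
    """Straight-line distance between two points with the same number of attributes."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of attributes")
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


d = euclidean_distance([0, 0], [3, 4])        # the 3-4-5 right triangle
same = euclidean_distance([1, 2, 3], [1, 2, 3])  # identical points
```

A distance of 0 means the two objects are identical on every attribute, and larger values mean they are more dissimilar. Because the result depends on each attribute's scale, attributes are often normalized or standardized first.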