HW3
Part I written assignment:
1. Cluster the following 8 points into three clusters:
A1(2,10), A2(2,5), A3(8,4), A4(5,8), A5(7,5), A6(6,4), A7(1,2), A8(4,9).
The distance function is Euclidean distance. Suppose initially we assign A1, A4, A7 as the center of each cluster, respectively. Use the K-Means algorithm to show only:
(1) The three cluster centers after the first round of execution
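The distance from point A2 to A1 is 5, to A4 is 3√2, and to A7 is √10. The remaining distances are computed the same way, giving the table below:

      A2     A3     A5     A6     A8
A1    5      6√2    5√2    2√13   √5
A4    3√2    5      √13    √17    √2
A7    √10    √53    3√5    √29    √58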
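The table above can be rechecked with a few lines of Python. This is only a sketch of the first assignment step: the names `POINTS`, `SEEDS`, `distance_table`, and `nearest_seed` are illustrative and not part of the assignment.

```python
from math import dist

# The eight points from Question 1.
POINTS = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4), "A4": (5, 8),
    "A5": (7, 5), "A6": (6, 4), "A7": (1, 2), "A8": (4, 9),
}
SEEDS = ("A1", "A4", "A7")  # initial centers given in the question

# distance_table[seed][point] = Euclidean distance between the two.
distance_table = {
    seed: {
        name: dist(POINTS[seed], xy)
        for name, xy in POINTS.items()
        if name not in SEEDS
    }
    for seed in SEEDS
}

# Each non-seed point is assigned to the seed with the smallest distance.
nearest_seed = {
    name: min(SEEDS, key=lambda seed: distance_table[seed][name])
    for name in distance_table["A1"]
}

for seed in SEEDS:
    row = "  ".join(f"{name}={d:.3f}" for name, d in distance_table[seed].items())
    print(seed, row)
print(nearest_seed)
```

The printed decimals correspond to the radicals in the table, for example 3√2 ≈ 4.243 for the A4 to A2 entry.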
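From the table above, the three clusters after the first round are:
1: {A1}; 2: {A3, A4, A5, A6, A8}; 3: {A2, A7}
Their centers are 1: (2, 10); 2: (6, 6); 3: (1.5, 3.5).
(2) The final three clusters
Round 2:

             A1      A2     A3      A4      A5      A6      A7     A8
(2, 10)      0       5      6√2     √13     5√2     2√13    √65    √5
(6, 6)       4√2     √17    2√2     √5      √2      2       √41    √13
(1.5, 3.5)   √42.5   √2.5   √42.5   √32.5   √32.5   √20.5   √2.5   √36.5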
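After the second round, the three clusters are:
1: {A1, A8}; 2: {A3, A4, A5, A6}; 3: {A2, A7}
Their updated centers are 1: (3, 9.5); 2: (6.5, 5.25); 3: (1.5, 3.5).
Round 3:

              A1         A2         A3        A4        A5        A6        A7         A8
(3, 9.5)      √1.25      √21.25     √55.25    √6.25     √36.25    √39.25    √60.25     √1.25
(6.5, 5.25)   √42.8125   √20.3125   √3.8125   √9.8125   √0.3125   √1.8125   √40.8125   √20.3125
(1.5, 3.5)    √42.5      √2.5       √42.5     √32.5     √32.5     √20.5     √2.5       √36.5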
After the third round, the three clusters are:
1: {A1, A8, A4}; 2: {A3, A5, A6}; 3: {A2, A7}
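The rounds above can be replayed with a small K-Means sketch. It assumes the usual loop (assign every point to its nearest center, then recompute each center as the mean of its cluster) and stops once the assignment no longer changes. The function names `assign`, `centroid`, and `kmeans` are illustrative, and the sketch does not handle empty clusters, which do not occur with this data.

```python
from math import dist

POINTS = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4), "A4": (5, 8),
    "A5": (7, 5), "A6": (6, 4), "A7": (1, 2), "A8": (4, 9),
}

def assign(centers):
    """Put each point into the cluster of its nearest center."""
    clusters = [[] for _ in centers]
    for name, xy in POINTS.items():
        nearest = min(range(len(centers)), key=lambda i: dist(xy, centers[i]))
        clusters[nearest].append(name)
    return clusters

def centroid(names):
    """Mean of the coordinates of the named points."""
    xs = [POINTS[n][0] for n in names]
    ys = [POINTS[n][1] for n in names]
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def kmeans(centers, max_rounds=10):
    """Return one (clusters, new_centers) pair per round until nothing changes."""
    rounds = []
    previous = None
    for _ in range(max_rounds):
        clusters = assign(centers)
        if clusters == previous:  # assignment is stable: converged
            break
        centers = [centroid(c) for c in clusters]
        rounds.append((clusters, centers))
        previous = clusters
    return rounds

# Initial centers A1, A4, A7, as stated in the question.
rounds = kmeans([POINTS["A1"], POINTS["A4"], POINTS["A7"]])
for number, (clusters, centers) in enumerate(rounds, start=1):
    print(f"round {number}: clusters={clusters} centers={centers}")
```

With these inputs the sketch records three rounds. A fourth assignment pass leaves the round 3 clusters unchanged, so they are the final clusters.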
2. Perform AGNES clustering on the dataset in Question 1. Show your results by drawing a dendrogram; each step merges two clusters with the minimum distance. The dendrogram should clearly show the order in which the points are merged.
A1(2,10), A2(2,5), A3(8,4), A4(5,8), A5(7,5), A6(6,4), A7(1,2), A8(4,9).
[Dendrogram: Step 0 through Step 5, starting from the single points A1(2,10), A2(2,5), A3(8,4), A4(5,8), A5(7,5), A6(6,4), A7(1,2), A8(4,9)]
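The merge order asked for in this question can be sketched in Python. The sketch assumes single linkage (the distance between two clusters is the smallest distance between their members), which is one reading of "minimum distance", and it compares squared distances so that ties are exact. The names `sq_dist`, `single_link`, and `agnes` are illustrative. Several pairs tie at √2, so the order among those first merges depends on the tie-break, and a dendrogram may show them as a single step.

```python
from itertools import combinations

POINTS = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4), "A4": (5, 8),
    "A5": (7, 5), "A6": (6, 4), "A7": (1, 2), "A8": (4, 9),
}

def sq_dist(p, q):
    """Squared Euclidean distance (exact for integer coordinates)."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def single_link(c1, c2):
    """Smallest squared distance between a member of c1 and a member of c2."""
    return min(sq_dist(POINTS[a], POINTS[b]) for a in c1 for b in c2)

def agnes(names):
    """Return the merge history as (squared_distance, merged_cluster) tuples."""
    clusters = [(n,) for n in sorted(names)]
    history = []
    while len(clusters) > 1:
        # Closest pair first; ties go to the pair that sorts first by name.
        d, a, b = min(
            (single_link(a, b), a, b) for a, b in combinations(clusters, 2)
        )
        merged = tuple(sorted(a + b))
        clusters = [c for c in clusters if c not in (a, b)] + [merged]
        history.append((d, merged))
    return history

history = agnes(POINTS)
for step, (d, merged) in enumerate(history, start=1):
    print(f"merge {step}: distance sqrt({d}) -> {merged}")
```

Under these assumptions the merge distances are √2, √2, √2, √5, √10, √13, √17. Stopping after the fifth merge leaves {A1, A4, A8}, {A3, A5, A6}, and {A2, A7}.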
Part II Lab assignment
1. Production Recommendation
Using the “bank estimation data”, estimate the decision tree that predicts pep as a function of the other variables. Select “Expert” and set “pruning severity” at 75. Set the “Type” of pep as “Flag” and the “Direction” as “out”. Build decision trees using three options for the “Minimum records per child branch” value, being (a) 56, (b) 15 and (c) 10, not selecting “use global pruning”.
(1) Hand in the confusion matrix for a, b and c on the “validation data”.
(a) 56
(b) 15
(c) 10
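For readers without the modelling tool at hand, the following sketch shows how a confusion matrix and the resulting accuracy can be computed for a two-valued (flag) target. The labels "YES" and "NO" and the six toy records are hypothetical and are not the assignment's validation data. The function names are illustrative.

```python
from collections import Counter

def confusion_matrix(actual, predicted, labels=("YES", "NO")):
    """Rows are actual labels, columns are predicted labels."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]

def accuracy(matrix):
    """Share of records on the diagonal, i.e. predicted correctly."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Hypothetical toy labels, NOT the assignment's validation data.
actual = ["YES", "YES", "NO", "NO", "NO", "YES"]
predicted = ["YES", "NO", "NO", "NO", "YES", "YES"]

matrix = confusion_matrix(actual, predicted)
print(matrix)
print(f"accuracy = {accuracy(matrix):.2%}")
```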
(2) Hand in: Which of the three trees will you use to score the data in a holdout data list, and why? (2-3 lines)
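The reason for choosing 15: as the figures above show, the first tree's recognition rate is 73.95%, the second's is 86.97%, and the third's is 88.37%. The rate for 15 is clearly higher than that for 56 and about the same as the third's, but the third tree has too many subtrees, which would make it slow to run. Overall, 15 gives the better result.
(3) Hand in: For the following data (Appendix 1), using the rules from the best decision tree, fill in the recommendation.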
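The reasoning in the answer to (2) can be written as a simple rule: among the trees whose accuracy is close to the best, prefer the one with the largest "Minimum records per child branch", since a larger minimum permits fewer splits. The sketch below uses the three rates reported in that answer, read as percentages. The 2-point tolerance and the name `pick_tree` are assumptions made for illustration.

```python
# Recognition rates (%) reported in the answer, keyed by the
# "Minimum records per child branch" setting of each tree.
REPORTED_ACCURACY = {56: 73.95, 15: 86.97, 10: 88.37}

def pick_tree(accuracy_by_min_records, tolerance=2.0):
    """Pick the simplest tree within `tolerance` points of the best accuracy.

    The tolerance is an assumed threshold, not a value from the assignment.
    """
    best = max(accuracy_by_min_records.values())
    close_enough = [
        min_records
        for min_records, acc in accuracy_by_min_records.items()
        if best - acc <= tolerance
    ]
    # A larger minimum per child branch means a smaller tree, so prefer it.
    return max(close_enough)

print(pick_tree(REPORTED_ACCURACY))
```

With a tolerance of zero the rule falls back to the most accurate tree, the one built with a minimum of 10.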

Churn Management
The goal of this assignment is to r…