SQL: keeping only one row from duplicate data. How do you remove rows with a duplicated field in a SQL query?
How do you remove rows with a duplicated field in a SQL query?
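This can be optimized. In our recent work we used three virtual machines with 30 GB of memory and 12 cores. On 614 million rows (130 GB in MySQL, 30 GB in Parquet format), a multi-field group-by aggregation with count(distinct) on one field takes about 30 seconds, while ordinary queries finish within 2 seconds. So I think there is still a lot of room for optimization in your case.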
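For readers unsure what a "multi-field group-by aggregation with count(distinct)" looks like, here is a minimal sketch. The table `visits` and its columns are hypothetical and not from the answer, and an in-memory SQLite database stands in for the real engine, which the answer does not name, so the timings quoted above do not apply to it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and columns, for illustration only.
conn.execute("create table visits (region text, day text, userId integer)")
conn.executemany(
    "insert into visits values (?, ?, ?)",
    [
        ("east", "mon", 1),
        ("east", "mon", 1),  # same user again: counted once by count(distinct)
        ("east", "mon", 2),
        ("east", "tue", 1),
        ("west", "mon", 3),
        ("west", "mon", 3),
    ],
)

# Group by two fields and count the distinct values of a third field.
rows = conn.execute(
    """
    select region, day, count(distinct userId) as users
    from visits
    group by region, day
    order by region, day
    """
).fetchall()

for row in rows:
    print(row)
```

Each output row holds one (region, day) pair with the number of different users seen in it, so repeated rows for the same user do not inflate the count.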
How do you filter out duplicate data with a SQL statement?
1. Find the redundant duplicate records in a table, where duplicates are identified by a single field (peopleId):

select * from people
where peopleId in (select peopleId from people group by peopleId having count(peopleId) > 1)

2. Delete the redundant duplicate records in a table, where duplicates are identified by a single field (peopleId), keeping only the record with the smallest rowid:

delete from people
where peopleId in (select peopleId from people group by peopleId having count(peopleId) > 1)
and rowid not in (select min(rowid) from people group by peopleId having count(peopleId) > 1)
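The two statements can be tried end to end with a small script. This is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module because SQLite also exposes a rowid for ordinary tables, and the `name` column and sample rows are made up for illustration. Other databases may need a different row identifier or a rewritten subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: peopleId 1 and 2 each appear twice.
conn.execute("create table people (peopleId integer, name text)")
conn.executemany(
    "insert into people (peopleId, name) values (?, ?)",
    [(1, "Ann"), (1, "Ann B."), (2, "Bob"), (3, "Cid"), (2, "Bobby")],
)

# Step 1: list every row whose peopleId occurs more than once.
find_duplicates = """
    select rowid, peopleId, name from people
    where peopleId in (
        select peopleId from people group by peopleId having count(peopleId) > 1
    )
    order by rowid
"""
duplicates = conn.execute(find_duplicates).fetchall()

# Step 2: delete the duplicates, keeping the smallest rowid per peopleId.
delete_duplicates = """
    delete from people
    where peopleId in (
        select peopleId from people group by peopleId having count(peopleId) > 1
    )
    and rowid not in (
        select min(rowid) from people group by peopleId having count(peopleId) > 1
    )
"""
conn.execute(delete_duplicates)

remaining = conn.execute(
    "select rowid, peopleId, name from people order by rowid"
).fetchall()

print(duplicates)
print(remaining)
```

After the delete, each peopleId is left with exactly one row, the one that was inserted first. Running step 1 before step 2 is a cheap way to check which rows are affected before anything is removed.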