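Today we'll go through ten essential techniques for selecting rows and columns in Pandas. You will need them constantly when working with Pandas, because you probably want to manipulate a Python DataFrame as freely as you would a spreadsheet in Excel.
First, load the demo data. You can copy and run the code below as-is.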
# Import the pandas module
import pandas as pd
# Create an example dataframe about a fictional army
raw_data={
'regiment':['Nighthawks','Nighthawks','Nighthawks','Nighthawks','Dragoons',
'Dragoons','Dragoons','Dragoons','Scouts','Scouts','Scouts','Scouts'],
'company':['1st','1st','2nd','2nd','1st','1st','2nd','2nd','1st','1st','2nd','2nd'],
'deaths':[523,52,25,616,43,234,523,62,62,73,37,35],
'battles':[5,42,2,2,4,7,8,3,4,7,8,9],
'size':[1045,957,1099,1400,1592,1006,987,849,973,1005,1099,1523],
'veterans':[1,5,62,26,73,37,949,48,48,435,63,345],
'readiness':[1,2,3,3,2,1,2,3,2,1,2,3],
'armored':[1,0,1,1,0,1,0,1,0,0,1,1],
'deserters':[4,24,31,2,3,4,24,31,2,3,2,3],
'origin':['Arizona','California','Texas','Florida','Maine','Iowa','Alaska','Washington','Oregon','Wyoming','Louisana','Georgia']
}
df=pd.DataFrame(
raw_data,
columns=['regiment','company','deaths','battles','size','veterans','readiness','armored','deserters','origin']
)
df=df.set_index('origin')
df.head()
Skill 1: Select a single column
df['size']
Skill 2: Select multiple columns
df[['size','veterans']]
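To make the difference between these two forms concrete, here is a minimal sketch. It assumes a made-up three-row frame called demo (a stand-in, not the army data above): a single label gives back a Series, while a list of labels gives back a DataFrame, even when the list has only one item.

```python
import pandas as pd

# Hypothetical three-row stand-in for the army DataFrame
demo = pd.DataFrame(
    {'size': [1045, 957, 1099], 'veterans': [1, 5, 62], 'battles': [5, 42, 2]},
    index=['Arizona', 'California', 'Texas'],
)

one_col = demo['size']                  # single label -> Series
two_cols = demo[['size', 'veterans']]   # list of labels -> DataFrame
one_col_frame = demo[['size']]          # one-item list -> one-column DataFrame

print(type(one_col).__name__)
print(type(two_cols).__name__, list(two_cols.columns))
print(one_col_frame.shape)
```

The double brackets are just a Python list inside the indexing brackets, which is why the column order in the result follows the order of the list.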
Skill 3: Select a row by its index label
# Select the row with the index label "Arizona" (the list keeps the result a DataFrame)
df.loc[['Arizona']]
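Label-based selection with .loc behaves a little differently depending on what you pass in, so a small sketch may help. It uses a hypothetical three-row frame called demo rather than the army data: a single label returns a Series, a list of labels returns a DataFrame, and a label slice such as :'California' returns every row from the start through that label, with the endpoint included.

```python
import pandas as pd

# Hypothetical three-row stand-in, indexed by state name
demo = pd.DataFrame(
    {'size': [1045, 957, 1099], 'deaths': [523, 52, 25]},
    index=['Arizona', 'California', 'Texas'],
)

row_series = demo.loc['California']     # single label -> Series
row_frame = demo.loc[['California']]    # list of labels -> one-row DataFrame
label_slice = demo.loc[:'California']   # label slice -> endpoint is included

print(row_series['size'])
print(row_frame.shape)
print(list(label_slice.index))
```

Note that a label slice that ends on the first label of the index returns exactly one row, which can make it look like a single-row lookup.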
Skill 4: Select all rows from the start up to a given row position
# Select the first 2 rows (positions 0 and 1; the end position is excluded)
df.iloc[:2]
Skill 5: Select the rows from one row position up to (but not including) another
df.iloc[1:2]
Skill 6: Select all rows from a given row position to the end
df.iloc[2:]
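The three positional slices above follow the usual Python rule: positions start at 0 and the end of a slice is excluded. The sketch below checks this on a hypothetical four-row frame called demo (a stand-in for the army data), so you can see exactly which rows each slice returns.

```python
import pandas as pd

# Hypothetical four-row stand-in, indexed by state name
demo = pd.DataFrame(
    {'size': [1045, 957, 1099, 1400]},
    index=['Arizona', 'California', 'Texas', 'Florida'],
)

first_two = demo.iloc[:2]     # positions 0 and 1
only_second = demo.iloc[1:2]  # position 1 only, because the end is excluded
from_third = demo.iloc[2:]    # position 2 through the last row

print(list(first_two.index))
print(list(only_second.index))
print(list(from_third.index))
```

This is the opposite of label slicing with .loc, where the end label is included.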
Skill 7: Select all columns from the first up to a given column position
# Select the first 2 columns
df.iloc[:,:2]
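In .iloc[rows, columns] the part before the comma selects rows and the part after it selects columns, so a bare : means "all rows". As a rough sketch on a hypothetical three-column frame called demo, the same columns can also be taken by label with .loc, and both axes can be sliced at once.

```python
import pandas as pd

# Hypothetical stand-in with three columns
demo = pd.DataFrame(
    {
        'regiment': ['Nighthawks', 'Nighthawks', 'Dragoons'],
        'company': ['1st', '2nd', '1st'],
        'deaths': [523, 25, 43],
    },
    index=['Arizona', 'Texas', 'Maine'],
)

first_two_cols = demo.iloc[:, :2]                  # all rows, column positions 0 and 1
same_by_label = demo.loc[:, 'regiment':'company']  # label slice, endpoint included
top_left = demo.iloc[:2, :2]                       # first 2 rows and first 2 columns

print(list(first_two_cols.columns))
print(first_two_cols.equals(same_by_label))
print(top_left.shape)
```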
Skill 8: Filter rows by a condition
# Select rows where df.deaths is greater than 50
df[df['deaths'] > 50]
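Conditional filtering works in two steps: a comparison on a column produces a boolean Series (often called a mask), and passing that mask to the DataFrame keeps only the rows where it is True. Here is a minimal sketch on a hypothetical four-row frame called demo. The second filter, which combines two conditions with &, is an extra illustration and not part of the original example.

```python
import pandas as pd

# Hypothetical four-row stand-in for the army data
demo = pd.DataFrame(
    {'deaths': [523, 52, 25, 616], 'battles': [5, 42, 2, 2]},
    index=['Arizona', 'California', 'Texas', 'Florida'],
)

mask = demo['deaths'] > 50   # boolean Series, one value per row
high_deaths = demo[mask]     # keep only rows where the mask is True

# Combine conditions with & (and) or | (or); each condition needs its own parentheses
both = demo[(demo['deaths'] > 50) & (demo['battles'] < 10)]

print(mask.tolist())
print(list(high_deaths.index))
print(list(both.index))
```

The parentheses matter because & binds more tightly than > in Python, so leaving them out raises an error or gives the wrong result.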