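We can access the elements of a Pandas DataFrame in many different ways. In general, we can access rows, columns, or individual elements of a DataFrame by using the row and column labels. Let's see some examples: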
# We print the store_items DataFrame
print(store_items)
# We access rows, columns and elements using labels
print()
print('How many bikes are in each store:\n', store_items[['bikes']])
print()
print('How many bikes and pants are in each store:\n', store_items[['bikes', 'pants']])
print()
print('What items are in Store 1:\n', store_items.loc[['store 1']])
print()
print('How many bikes are in Store 2:', store_items['bikes']['store 2'])
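Notice that when accessing an individual element of a DataFrame, as we did in the last example above, the labels must always be provided with the column label first, in the form dataframe[column][row]. For example, to retrieve the number of bikes in store 2 we first used the column label bikes and then the row label store 2. If you provide the row label first you will get an error (a KeyError), because dataframe[label] looks the label up among the columns.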
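The lookup order described above can be checked with a small sketch. The sample data here is an assumption, reconstructed from the values shown later in this section. The .loc[row, column] form is a standard pandas alternative that takes the row label first; the original text does not use it for single elements.

```python
import pandas as pd

# Assumed sample data, reconstructed from the values shown in this section
items = [{'bikes': 20, 'pants': 30, 'watches': 35},
         {'watches': 10, 'glasses': 50, 'bikes': 15, 'pants': 5}]
store_items = pd.DataFrame(items, index=['store 1', 'store 2'])

# Column label first, then row label
bikes_chained = store_items['bikes']['store 2']

# The same element with .loc, which takes the row label first
bikes_loc = store_items.loc['store 2', 'bikes']

# Row label first with plain brackets fails, because the label is
# looked up among the columns
try:
    store_items['store 2']['bikes']
    row_first_failed = False
except KeyError:
    row_first_failed = True

print(bikes_chained, bikes_loc, row_first_failed)
```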
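We can also modify a DataFrame by adding rows or columns to it. Let's start by learning how to add new columns. Suppose we want to record the stock of shirts at each store. To do this, we add a new column to our store_items DataFrame that holds the number of shirts in stock at each store. Let's write the code: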
# We add a new column named shirts to our store_items DataFrame indicating the number of shirts in stock at each store. We
# will put 15 shirts in store 1 and 2 shirts in store 2
store_items['shirts'] = [15,2]
# We display the modified DataFrame
store_items
We can see that when we add a new column, it is placed at the end of the DataFrame.
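As a minimal sketch of this behavior, the following adds a column from a list and checks where it lands. The sample data is an assumption based on the values in this section, and the hats column is a hypothetical name used only to show what happens when the list length does not match the number of rows.

```python
import pandas as pd

# Assumed sample data, reconstructed from the values used in this section
store_items = pd.DataFrame(
    {'bikes': [20, 15], 'pants': [30, 5], 'watches': [35, 10]},
    index=['store 1', 'store 2'])

# One value per row: 15 shirts in store 1 and 2 shirts in store 2
store_items['shirts'] = [15, 2]
last_column = store_items.columns[-1]

# A list with the wrong number of values is rejected
try:
    store_items['hats'] = [1, 2, 3]  # hypothetical column, 3 values for 2 rows
    length_error = False
except ValueError:
    length_error = True

print(last_column, length_error)
```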
We can also create a new column by applying arithmetic operators to other columns of the DataFrame. Let's see an example:
# We make a new column called suits by adding the number of shirts and pants
store_items['suits'] = store_items['pants'] + store_items['shirts']
# We display the modified DataFrame
store_items
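Column arithmetic like the addition above works element by element, and pandas matches the values by row label rather than by position. The sketch below illustrates this under assumed sample data; the reversed_shirts name is introduced only for illustration.

```python
import pandas as pd

# Assumed sample data, reconstructed from the values used in this section
store_items = pd.DataFrame(
    {'pants': [30, 5], 'shirts': [15, 2]},
    index=['store 1', 'store 2'])

# Element-wise sum of two columns
store_items['suits'] = store_items['pants'] + store_items['shirts']

# Even if one operand is in a different row order, values are matched by label
reversed_shirts = store_items['shirts'].iloc[::-1]
aligned = store_items['pants'] + reversed_shirts

print(store_items)
print(aligned)
```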
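Suppose now that you opened a new store and need to add its stock of items to the DataFrame. We can do this by adding a new row to the store_items DataFrame. To add rows to a DataFrame, we first create a new DataFrame and then attach it to the original DataFrame. Let's see how this works: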
# We create a list containing one Python dictionary that holds the number of items at the new store
new_items = [{'bikes': 20, 'pants': 30, 'watches': 35, 'glasses': 4}]
# We create a new DataFrame with new_items and provide an index labeled store 3
new_store = pd.DataFrame(new_items, index = ['store 3'])
# We display the items at the new store
new_store
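Now we add this row to our store_items DataFrame. Older versions of pandas provided an .append() method for this, but it was removed in pandas 2.0, so we use pd.concat() instead.
# We add store 3 to our store_items DataFrame. sort = True sorts the columns alphabetically
store_items = pd.concat([store_items, new_store], sort = True)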
# We display the modified DataFrame
store_items
Notice that after the new row has been added, the columns of the DataFrame are in alphabetical order.
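Here is a self-contained sketch of adding a row with pd.concat(), which works on current pandas versions. The sample data is an assumption reconstructed from the values in this section. Columns that the new store does not have (shirts and suits) are filled with NaN, and sort=True requests the alphabetical column order.

```python
import pandas as pd

# Assumed sample data, reconstructed from the values used in this section
store_items = pd.DataFrame(
    {'bikes': [20, 15], 'pants': [30, 5], 'watches': [35, 10],
     'glasses': [float('nan'), 50], 'shirts': [15, 2], 'suits': [45, 7]},
    index=['store 1', 'store 2'])

new_items = [{'bikes': 20, 'pants': 30, 'watches': 35, 'glasses': 4}]
new_store = pd.DataFrame(new_items, index=['store 3'])

# Stack the two DataFrames row-wise; sort=True sorts the column labels
combined = pd.concat([store_items, new_store], sort=True)

print(combined)
```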
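We can also add a new column to a DataFrame by using only the data from particular rows of a particular column. For example, suppose you want to stock stores 2 and 3 with new watches, and you want the quantity of new watches to equal the watches already in stock at those stores. Let's see how we can do this: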
# We add a new column using data from particular rows in the watches column
store_items['new watches'] = store_items['watches'][1:]
# We display the modified DataFrame
store_items
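The reason store 1 ends up with NaN in the new column is that the assignment matches values by row label, and the slice contains no entry for store 1. A minimal sketch under assumed sample data follows. It uses .iloc[1:] to make the positional slice explicit, which is a choice made here and not the form used in the original code.

```python
import pandas as pd

# Assumed sample data, reconstructed from the values used in this section
store_items = pd.DataFrame(
    {'watches': [35, 10, 35]},
    index=['store 1', 'store 2', 'store 3'])

# Select every row except the first one, by position
partial = store_items['watches'].iloc[1:]

# Assignment aligns on the row labels; store 1 has no match and gets NaN
store_items['new watches'] = partial

print(store_items)
```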
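We can also insert a new column at any position in a DataFrame. The dataframe.insert(loc, label, data) method lets us insert a new column with the given column label and the given data into dataframe at position loc. Let's insert a new column named shoes right before the suits column. Since suits has a numerical index value of 4, we use this value as loc. Let's see how this works: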
# We insert a new column with label shoes right before the column with numerical index 4
store_items.insert(4, 'shoes', [8,5,0])
# we display the modified DataFrame
store_items
| | bikes | glasses | pants | shirts | shoes | suits | watches | new watches |
|---|---|---|---|---|---|---|---|---|
| store 1 | 20 | NaN | 30 | 15.0 | 8 | 45.0 | 35 | NaN |
| store 2 | 15 | 50.0 | 5 | 2.0 | 5 | 7.0 | 10 | 10.0 |
| store 3 | 20 | 4.0 | 30 | NaN | 0 | NaN | 35 | 35.0 |
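Instead of counting columns by hand to find loc, the position can be looked up from the column label. The sketch below does this with columns.get_loc(), using a reduced set of assumed sample columns. Note that .insert() modifies the DataFrame in place.

```python
import pandas as pd

# Assumed sample data with fewer columns than the table above
store_items = pd.DataFrame(
    {'bikes': [20, 15, 20], 'pants': [30, 5, 30],
     'suits': [45.0, 7.0, float('nan')], 'watches': [35, 10, 35]},
    index=['store 1', 'store 2', 'store 3'])

# Look up the numerical position of the suits column
suits_position = store_items.columns.get_loc('suits')

# Insert shoes at that position, which pushes suits one place to the right
store_items.insert(suits_position, 'shoes', [8, 5, 0])

print(store_items)
```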
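Just as we can add rows and columns, we can also delete them. To delete rows and columns from a DataFrame we use the .pop() and .drop() methods. The .pop() method only lets us delete columns, while the .drop() method can delete both rows and columns through the axis keyword. Let's see some examples: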
# We remove the new watches column
store_items.pop('new watches')
# we display the modified DataFrame
store_items
| | bikes | glasses | pants | shirts | shoes | suits | watches |
|---|---|---|---|---|---|---|---|
| store 1 | 20 | NaN | 30 | 15.0 | 8 | 45.0 | 35 |
| store 2 | 15 | 50.0 | 5 | 2.0 | 5 | 7.0 | 10 |
| store 3 | 20 | 4.0 | 30 | NaN | 0 | NaN | 35 |
# We remove the watches and shoes columns
store_items = store_items.drop(['watches', 'shoes'], axis = 1)
# we display the modified DataFrame
store_items
# We remove the store 2 and store 1 rows
store_items = store_items.drop(['store 2', 'store 1'], axis = 0)
# we display the modified DataFrame
store_items
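One difference between the two methods is worth a small sketch: .pop() changes the DataFrame in place and returns the removed column, while .drop() returns a new DataFrame and leaves the original untouched, which is why the examples above assign its result back to store_items. The sample data and the result variable names are assumptions made for illustration.

```python
import pandas as pd

# Assumed sample data with a reduced set of columns
store_items = pd.DataFrame(
    {'bikes': [20, 15, 20], 'shoes': [8, 5, 0], 'watches': [35, 10, 35],
     'new watches': [float('nan'), 10, 35]},
    index=['store 1', 'store 2', 'store 3'])

# pop removes the column in place and returns it as a Series
removed = store_items.pop('new watches')

# drop returns a new DataFrame; axis=1 refers to columns
without_columns = store_items.drop(['watches', 'shoes'], axis=1)

# axis=0 refers to rows
without_rows = without_columns.drop(['store 2', 'store 1'], axis=0)

print(without_rows)
```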
Sometimes we may need to change the row and column labels. Let's use the .rename() method to change the bikes column label to hats:
# We change the column label bikes to hats
store_items = store_items.rename(columns = {'bikes': 'hats'})
# we display the modified DataFrame
store_items
Now let's use the .rename() method again, this time to change a row label.
# We change the row label from store 3 to last store
store_items = store_items.rename(index = {'store 3': 'last store'})
# we display the modified DataFrame
store_items
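The two renaming steps above can also be combined, since .rename() accepts the columns and index mappings in the same call. This is a sketch under assumed sample data. Like .drop(), .rename() returns a new DataFrame by default.

```python
import pandas as pd

# Assumed sample data matching the state of the DataFrame at this point
store_items = pd.DataFrame(
    {'bikes': [20], 'glasses': [4.0]},
    index=['store 3'])

# Rename a column label and a row label in one call
renamed = store_items.rename(columns={'bikes': 'hats'},
                             index={'store 3': 'last store'})

print(renamed)
```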
You can also change the index to be one of the columns in the DataFrame.
# We change the row index to be the data in the pants column
store_items = store_items.set_index('pants')
# we display the modified DataFrame
store_items
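When a column becomes the index, it is removed from the regular columns and the previous row labels are discarded. The sketch below shows this, together with .reset_index(), which turns the index back into a column. The sample data is an assumption, and .reset_index() is not covered in the original text.

```python
import pandas as pd

# Assumed sample data matching the state of the DataFrame at this point
store_items = pd.DataFrame(
    {'hats': [20], 'glasses': [4.0], 'pants': [30]},
    index=['last store'])

# The pants column becomes the row index
by_pants = store_items.set_index('pants')

# reset_index moves pants back into the columns and numbers the rows from 0
restored = by_pants.reset_index()

print(by_pants)
print(restored)
```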