公众号:尤而小屋
作者:Peter
编辑:Peter
大家好,我是Peter~
本文主题:对比SQL,学习Pandas操作!
在SQL中查询数据的时候我们所有各种操作,主要是通过select、where、group by等多个关键词的组合查询来实现的。本文中介绍的如何在相同的需求下,通过pandas来实现取数操作。
data:image/s3,"s3://crabby-images/35409/35409f0a34e74263fa757776aab2f9566d9d5938" alt=""
比较方向
- 查询全部数据
- 前N条
- 后N条
- 中间段数据
- 部分字段
- 指定等式条件
- 指定不等式条件
- 取反操作
- 指定多个条件
- 指定计算等式
- 模糊查询
- 排序
- 分组统计
- 取别名
参考资料
因为本文主要介绍的是如何通过pandas来获取我们想要的数据,也是pandas的各种取数技巧,参考之前介绍的3篇文章:
模拟数据
在数据库中,我们先模拟了3份数据:
1、学生信息表
-- 学生信息
mysql> select * from Student;
+------+--------+------------+-------+
| s_id | s_name | s_birth | s_sex |
+------+--------+------------+-------+
| 01 | 赵雷 | 1990-01-01 | 男 |
| 02 | 钱电 | 1990-12-21 | 男 |
| 03 | 孙风 | 1990-05-20 | 男 |
| 04 | 李云 | 1990-08-06 | 男 |
| 05 | 周梅 | 1991-12-01 | 女 |
| 06 | 吴兰 | 1992-03-01 | 女 |
| 07 | 郑竹 | 1989-07-02 | 女 |
| 08 | 王菊 | 1990-01-20 | 女 |
+------+--------+------------+-------+
8 rows in set (0.00 sec)
2、一份用户表
data:image/s3,"s3://crabby-images/d367c/d367cf918a2c5f80be32dad25c05fb6712109a59" alt=""
3、一份水果商品价格表
data:image/s3,"s3://crabby-images/487ab/487abdeddeefe9e5ca6db22a6d52220cdf133f25" alt=""
下面开始介绍不同需求下基于pandas和SQL的取数实现
取出全部数据
SQL实现
select * from Student;
data:image/s3,"s3://crabby-images/1220b/1220b952abcccc6195aae09c9605c105c657ab67" alt=""
Pandas实现
data:image/s3,"s3://crabby-images/9c5c0/9c5c0b9bdca73b20ced911668e7e3fac2de4e452" alt=""
前N条数据
SQL实现
查看前5条数据:
data:image/s3,"s3://crabby-images/44ec2/44ec28cce8480a153dab0ca50c4b3f6042cf3ea9" alt=""
data:image/s3,"s3://crabby-images/3bc53/3bc5345a845ee32bb2bbb93f6b88c3661f8267d4" alt=""
Pandas实现
head方法默认是前5条:
data:image/s3,"s3://crabby-images/58e8f/58e8fb06b90fa5a56f33b679acf40d9e2beaa8e8" alt=""
指定查看前7条数据:
data:image/s3,"s3://crabby-images/ca233/ca233280c6998204b3bae4f1ebdab0ba01e5315e" alt=""
后N条数据
select *
from (select * from Student
order by s_id desc
limit 5)t -- 临时结果表:倒序输出的最后5条
order by s_id; -- 再使用一次排序,将顺序还原
data:image/s3,"s3://crabby-images/97a7e/97a7ea70a6ec349a85a50060151ae7cd29b05855" alt=""
Pandas实现
tail方法默认是后5条:
data:image/s3,"s3://crabby-images/dae7e/dae7edc5dfd5367178ed58b89e8d2f7a730afd90" alt=""
指定查看4条
data:image/s3,"s3://crabby-images/9fbfe/9fbfe875dce60e2f90425e68142bfe25d192f1c1" alt=""
切片数据
SQL实现
data:image/s3,"s3://crabby-images/9f27d/9f27dc6a8775f835504dbebd6c419160752ccf98" alt=""
Pandas实现
使用pandas中的切片来查看某个连续区间内的数据:
data:image/s3,"s3://crabby-images/8ef32/8ef3210831753e62629941d40ce829ea9a0becb3" alt=""
取出部分字段
SQL实现
data:image/s3,"s3://crabby-images/e2508/e2508477437941aaa5c166a0bb8677a4665b6802" alt=""
Pandas实现
df1[["id","name","sex"]] # 方式1
df2.filter(items=["id","age","createtime"]) # 方式2
data:image/s3,"s3://crabby-images/664cb/664cb5361179e67f3a5239dbb60968bee94eb969" alt=""
指定等式条件
SQL实现
data:image/s3,"s3://crabby-images/df898/df898d4c96cd7fee8fb82d29ed1f63ee2a3c062c" alt=""
data:image/s3,"s3://crabby-images/1566d/1566de8c81d67bc37b59057483f91b50c48b5b40" alt=""
Pandas实现
df1[df1["sex"] == "男"] # 方式1
df1.query('sex=="男"') # 方式2
data:image/s3,"s3://crabby-images/0eec5/0eec578be8f66d278b19854754285f55e1944a78" alt=""
指定id号或者年龄age:
data:image/s3,"s3://crabby-images/e842b/e842b0a2bc7754ea7b103fb42658967eef06f370" alt=""
指定不等式条件
SQL实现
select * from Student where s_sex!= "男";
select * from user where age > 18;
select * from user where id <= 3;
data:image/s3,"s3://crabby-images/a5a48/a5a48a97afc09be587523d102fbaefac2cdbcb8e" alt=""
Pandas实现
data:image/s3,"s3://crabby-images/212d1/212d1115391d1debdb4a9095bea2fbba8970fe55" alt=""
data:image/s3,"s3://crabby-images/40099/40099a8f63ce68d35a2340429343655e836131b3" alt=""
data:image/s3,"s3://crabby-images/48a9a/48a9ad67ceb164cde4aa6f89246f6391a3564439" alt=""
取反操作
SQL实现
mysql> select * from Student where s_sex != "男";
data:image/s3,"s3://crabby-images/1a80c/1a80c211cad7591ea42136ce76ae48347669300a" alt=""
Pandas实现
data:image/s3,"s3://crabby-images/f3e83/f3e83398e71cda2e165310b19e0cde733939bd64" alt=""
data:image/s3,"s3://crabby-images/d64b4/d64b4a69c183202537cc248a7e4166da59a6c068" alt=""
data:image/s3,"s3://crabby-images/a8d2e/a8d2ed7aae6d785414ca821ef09910caee009f35" alt=""
指定多个条件
SQL实现
select * from Student where s_birth <="1991-01-01" and s_sex= "男";
select * from user where age < 20 and fee > 60;
select * from user where age < 20 and fee > 60;
Pandas实现
data:image/s3,"s3://crabby-images/fc633/fc6338701e81f97951598aafd0ec87d2491741b4" alt=""
指定计算等式
SQL实现
select * from user where age % 3 = 0; -- 年龄分别是3或者2的倍数
select * from user where age % 2 = 0;
data:image/s3,"s3://crabby-images/a13eb/a13eb0cb51175fab738dc823e35289f43a57505d" alt=""
Pandas实现
data:image/s3,"s3://crabby-images/6f587/6f58718c7d63619e44993fc5076ffa2b8a4d9bbd" alt=""
data:image/s3,"s3://crabby-images/3ff8f/3ff8ff6cb224bb2bdef092bb040fd802aef150b4" alt=""
模糊查询
SQL实现
SQL的关键词是like:
- 左匹配
- 右匹配
- 全匹配
data:image/s3,"s3://crabby-images/72251/72251288039af5f366fc570ae163cf4df377afc5" alt=""
Pandas实现
data:image/s3,"s3://crabby-images/9b872/9b872fb14360810af4b2c79ce7dc8cf5fdf765f9" alt=""
data:image/s3,"s3://crabby-images/e487a/e487a94cb6152e2794403ac16c9450e265e46f90" alt=""
排序
默认是升序,可以指定为降序
SQL实现
1、单个字段
data:image/s3,"s3://crabby-images/5238e/5238ec3a1f12782211e4dffb904cdc8882ae3939" alt=""
select * from Student order by s_birth desc; -- 改成升序
2、多个字段的排序
data:image/s3,"s3://crabby-images/f3786/f37864b00756b9a8aa8b84ad90f932715e0b64b3" alt=""
Pandas实现
1、单个字段
data:image/s3,"s3://crabby-images/b8285/b828547bce825a71ad42acb83b08159f218c119a" alt=""
data:image/s3,"s3://crabby-images/bb428/bb428fdc65e429c35ef9e0b846e5050a6ae4510b" alt=""
2、多个字段
data:image/s3,"s3://crabby-images/71897/71897e9a7c6f370d601fa5a2f85310830b40e8e4" alt=""
分组统计
SQL实现
通过group by 来进行分组统计:
data:image/s3,"s3://crabby-images/4926e/4926ed5f6861ec472d73f1fa4c9815d9544cee77" alt=""
Pandas实现
先看看df3的数据,一个水果会对应多个价格,我们水果的名称对价格汇总:
data:image/s3,"s3://crabby-images/67af0/67af038e9bb79fe640999d72c8252d6fee2ef53c" alt=""
df3.groupby("name").agg({"price":"sum"}).reset_index() # 方式1
df3.groupby("name")["price"].sum().reset_index() # 方式2
data:image/s3,"s3://crabby-images/bd0d1/bd0d1d1c3f0eeacbc17d780db72efe532ce1c9ec" alt=""
取别名
SQL实现
通过使用as 关键词:
select name as 水果, sum(price) as 价格 from products group by name;
data:image/s3,"s3://crabby-images/1f4c7/1f4c75ba96f06260de91b18681d94161d674cba3" alt=""
Pandas实现
Pandas是通过rename函数来实现的:
df3.groupby("name").agg({"price":"sum"}).reset_index().rename(columns={"name":"水果","price":"价格"})
data:image/s3,"s3://crabby-images/40e51/40e51531f57dce0c026ecfa3115b970df6865c83" alt=""
网友评论